CN103366750B - A sound codec device and method - Google Patents

A sound codec device and method

Info

Publication number
CN103366750B
CN103366750B (grant) · CN201210085183.2A (application)
Authority
CN
China
Prior art keywords
frequency spectrum
low frequency
domain
mdct
mdft
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210085183.2A
Other languages
Chinese (zh)
Other versions
CN103366750A (en)
Inventor
潘兴德
吴超刚
李靓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING TIANLAI CHUANYIN DIGITAL TECHNOLOGY Co Ltd
Original Assignee
BEIJING TIANLAI CHUANYIN DIGITAL TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING TIANLAI CHUANYIN DIGITAL TECHNOLOGY Co Ltd
Priority to CN201210085183.2A
Publication of CN103366750A
Application granted
Publication of CN103366750B

Abstract

The present invention relates to a sound codec device and method, and in particular to monophonic and stereo sound encoding/decoding devices and methods. The invention maps a digital audio signal from the time domain to the MDCT domain, converts the low-frequency and high-frequency spectra in the MDCT domain to the MDFT domain, applies waveform coding to the low-frequency spectrum in the MDCT domain and parametric coding to the low-frequency and high-frequency spectra in the MDFT domain, and finally multiplexes the waveform-coded and parameter-coded data into an encoded sound bitstream. This reduces computational complexity and further improves the coding quality for music signals at lower bit rates.

Description

A sound codec device and method
Technical field
The present invention relates to a sound codec device and method, and in particular to monophonic and stereo sound encoding/decoding devices and methods.
Background art
Patent ZL200610087481.X discloses a sound encoder and method, comprising:
a time-varying prediction analysis module, for performing time-varying prediction analysis on a digital audio signal to obtain a time-domain excitation signal;
a time-frequency mapping module, for mapping the time-domain excitation signal to a transform domain to obtain an excitation signal in the transform domain;
an encoding module, for quantizing and encoding the low-frequency and mid-frequency spectra of the transform-domain excitation signal to obtain low-frequency and mid-frequency waveform-coded data, and for calculating, from the low-, mid-, and high-frequency spectra of the transform-domain excitation signal, high-frequency parameters used to recover the high-frequency spectrum from the low- and mid-frequency spectra, and quantizing and encoding the high-frequency parameters to obtain high-frequency parameter coded data; and
a bitstream multiplexing module, for multiplexing the low-frequency waveform-coded data, the mid-frequency waveform-coded data, and the high-frequency parameter coded data to output an encoded sound bitstream.
This sound encoder and method introduce a new coding framework that fully combines the strengths of waveform coding and parametric coding, so that both speech and music can be encoded with high quality under low bit-rate and computational-complexity constraints.
Given the method proposed in ZL200610087481.X, the open problem facing this technical direction is how to further improve the coding quality for music signals at lower bit rates while reducing, or at least maintaining, computational complexity.
Summary of the invention
Other features and advantages of exemplary embodiments of the present invention will become apparent from the detailed description below, the accompanying drawings, and the claims.
According to a first aspect of the invention, a monophonic sound encoding device is provided, comprising: a Modified Discrete Cosine Transform (MDCT) module, for mapping a digital audio signal from the time domain to the MDCT domain to obtain a sound signal in the MDCT domain, and dividing the MDCT-domain sound signal into a low-frequency spectrum and a high-frequency spectrum; a low-frequency waveform coding module, for quantizing and encoding the low-frequency spectrum of the MDCT-domain sound signal to obtain low-frequency waveform-coded data; an MDCT-to-Modified-Discrete-Fourier-Transform (MDFT) conversion module, for converting the low-frequency and high-frequency spectra of the MDCT-domain sound signal into the low-frequency and high-frequency spectra of the sound signal in the MDFT domain; a high-frequency parameter coding module, for calculating, from the low-frequency and high-frequency spectra of the MDFT-domain sound signal, high-frequency parameters used at the decoder to recover the high-frequency spectrum from the low-frequency spectrum, and quantizing and encoding the high-frequency parameters to obtain high-frequency parameter coded data; and a bitstream multiplexing module, for multiplexing the low-frequency waveform-coded data and the high-frequency parameter coded data to output an encoded sound bitstream.
According to a second aspect of the invention, a monophonic sound encoding method is provided, comprising: mapping a digital audio signal from the time domain to the Modified Discrete Cosine Transform (MDCT) domain to obtain a sound signal in the MDCT domain, and dividing the MDCT-domain sound signal into a low-frequency spectrum and a high-frequency spectrum; quantizing and encoding the low-frequency spectrum of the MDCT-domain sound signal to obtain low-frequency waveform-coded data; converting the low-frequency and high-frequency spectra of the MDCT-domain sound signal into the low-frequency and high-frequency spectra of the sound signal in the Modified Discrete Fourier Transform (MDFT) domain; calculating, from the low-frequency and high-frequency spectra of the MDFT-domain sound signal, high-frequency parameters used at the decoder to recover the high-frequency spectrum from the low-frequency spectrum, and quantizing and encoding the high-frequency parameters to obtain high-frequency parameter coded data; and multiplexing the low-frequency waveform-coded data and the high-frequency parameter coded data to output an encoded sound bitstream.
According to a third aspect of the invention, a monophonic sound decoding device is provided, comprising: a bitstream demultiplexing module, for demultiplexing an encoded sound bitstream to obtain low-frequency waveform-coded data and high-frequency parameter coded data; a low-frequency waveform decoding module, for decoding the low-frequency waveform-coded data to obtain decoded low-frequency spectrum data of the sound signal in the Modified Discrete Cosine Transform (MDCT) domain; an MDCT-to-Modified-Discrete-Fourier-Transform (MDFT) conversion module, for converting the decoded low-frequency spectrum data from the MDCT domain to the MDFT domain; a high-frequency parameter decoding module, for mapping part of the spectrum data from the MDFT-domain low-frequency spectrum to the high-frequency part to obtain mapped high-frequency spectrum data, and then performing parametric decoding on the mapped high-frequency spectrum data according to the high-frequency parameter coded data to obtain decoded high-frequency spectrum data; and an inverse MDFT (IMDFT) module, for combining the decoded low-frequency and high-frequency spectrum data and applying the IMDFT to obtain decoded sound data in the time domain.
According to a fourth aspect of the invention, a monophonic sound decoding method is provided, comprising: demultiplexing an encoded sound bitstream to obtain low-frequency waveform-coded data and high-frequency parameter coded data; decoding the low-frequency waveform-coded data to obtain decoded low-frequency spectrum data of the sound signal in the Modified Discrete Cosine Transform (MDCT) domain; converting the decoded low-frequency spectrum data from the MDCT domain to the Modified Discrete Fourier Transform (MDFT) domain to obtain decoded low-frequency spectrum data in the MDFT domain; performing parametric decoding of the high-frequency parameter coded data according to the decoded MDFT-domain low-frequency spectrum data to obtain decoded high-frequency spectrum data in the MDFT domain; and combining the decoded MDFT-domain low-frequency and high-frequency spectrum data and applying the inverse MDFT (IMDFT) to obtain a decoded digital audio signal in the time domain.
According to a fifth aspect of the invention, a stereo encoding device is provided, comprising: a Modified Discrete Cosine Transform (MDCT) module, for mapping the digital audio signals of the left and right channels from the time domain to the MDCT domain to obtain the two channels' signals in the MDCT domain, and dividing each channel's MDCT-domain signal into a low-frequency spectrum and a high-frequency spectrum; a low-frequency stereo coding module, for applying stereo coding to the low-frequency spectra of the two channels in the MDCT domain to obtain low-frequency stereo-coded data; an MDCT-to-Modified-Discrete-Fourier-Transform (MDFT) conversion module, for converting the low-frequency and high-frequency spectra of the two channels from the MDCT domain to the MDFT domain; a high-frequency parameter coding module, for calculating, from the MDFT-domain low-frequency and high-frequency spectra of each channel, the high-frequency parameters used at the decoder to recover each channel's high-frequency spectrum from its low-frequency spectrum, and quantizing and encoding these parameters to obtain the high-frequency parameter coded data of the two channels; and a bitstream multiplexing module, for multiplexing the low-frequency stereo-coded data and the high-frequency parameter coded data of the two channels to output an encoded sound bitstream.
According to a sixth aspect of the invention, a stereo encoding method is provided, comprising: mapping the digital audio signals of the left and right channels from the time domain to the Modified Discrete Cosine Transform (MDCT) domain to obtain the two channels' signals in the MDCT domain, and dividing each channel's MDCT-domain signal into a low-frequency spectrum and a high-frequency spectrum; applying stereo coding to the low-frequency spectra of the two channels in the MDCT domain to obtain low-frequency stereo-coded data; converting the low-frequency and high-frequency spectra of the two channels from the MDCT domain to the Modified Discrete Fourier Transform (MDFT) domain; calculating, from the MDFT-domain low-frequency and high-frequency spectra of each channel, the high-frequency parameters used at the decoder to recover each channel's high-frequency spectrum from its low-frequency spectrum, and quantizing and encoding these parameters to obtain the high-frequency parameter coded data of the two channels; and multiplexing the low-frequency stereo-coded data and the high-frequency parameter coded data of the two channels to output an encoded sound bitstream.
According to a seventh aspect of the invention, a stereo decoding device is provided, comprising: a bitstream demultiplexing module, for demultiplexing an encoded sound bitstream to obtain low-frequency stereo-coded data and the high-frequency parameter coded data of the left and right channels; a low-frequency stereo decoding module, for applying stereo decoding to the low-frequency stereo-coded data to obtain the decoded low-frequency spectrum data of the two channels in the Modified Discrete Cosine Transform (MDCT) domain; an MDCT-to-Modified-Discrete-Fourier-Transform (MDFT) conversion module, for converting the decoded low-frequency spectrum data of the two channels from the MDCT domain to the MDFT domain; a high-frequency parameter decoding module, for mapping part of the spectrum data from the MDFT-domain low-frequency spectra of the two channels to the high-frequency part to obtain the mapped high-frequency spectrum data of the two channels, and then performing parametric decoding on the mapped high-frequency spectrum data according to the high-frequency parameter coded data of the two channels to obtain the decoded high-frequency spectrum data of the two channels; and an inverse MDFT (IMDFT) module, for combining the decoded MDFT-domain low-frequency and high-frequency spectrum data of the two channels and applying the IMDFT to obtain decoded stereo data in the time domain.
According to an eighth aspect of the invention, a stereo decoding method is provided, comprising: demultiplexing an encoded sound bitstream to obtain low-frequency stereo-coded data and the high-frequency parameter coded data of the left and right channels; applying stereo decoding to the low-frequency stereo-coded data to obtain the decoded low-frequency spectrum data of the two channels in the Modified Discrete Cosine Transform (MDCT) domain; converting the decoded low-frequency spectrum data of the two channels from the MDCT domain to the Modified Discrete Fourier Transform (MDFT) domain; mapping part of the spectrum data from the MDFT-domain low-frequency spectra of the two channels to the high-frequency part to obtain the mapped high-frequency spectrum data of the two channels, and then performing parametric decoding on the mapped high-frequency spectrum data according to the high-frequency parameter coded data of the two channels to obtain the decoded high-frequency spectrum data of the two channels in the MDFT domain; and combining the decoded MDFT-domain low-frequency and high-frequency spectrum data of the two channels and applying the inverse MDFT (IMDFT) to obtain decoded stereo data in the time domain.
The present invention maps a digital audio signal from the time domain to the MDCT domain, converts the low-frequency and high-frequency spectra in the MDCT domain to the MDFT domain, applies waveform coding to the low-frequency spectrum in the MDCT domain and parametric coding to the low-frequency and high-frequency spectra in the MDFT domain, and finally multiplexes the waveform-coded and parameter-coded data into an encoded sound bitstream. This reduces computational complexity and further improves the coding quality for music signals at lower bit rates.
Brief description of the drawings
Specific embodiments of the invention are described in detail below with reference to the accompanying drawings, in which:
Fig. 1 is a structural block diagram of a monophonic sound encoding device according to a preferred embodiment of the invention.
Fig. 2 is a structural block diagram of the resampling module shown in Fig. 1.
Fig. 3 is a structural block diagram of the low-frequency waveform coding module shown in Fig. 1.
Fig. 4 is a structural block diagram of the high-frequency parameter coding module shown in Fig. 1.
Fig. 5 is a schematic diagram of the spectrum mapping performed by the high-frequency parameter coding module, in which diagram a) shows the original signal spectrum and diagram b) shows the mapped signal spectrum.
Fig. 6 shows the time-frequency plane after time-frequency mapping, in which diagram a) is the time-frequency plane of a slowly-varying signal and diagram b) is that of a fast-varying (transient) signal.
Fig. 7 is a schematic diagram of the gain calculation in the high-frequency parameter coding module shown in Fig. 1, in which diagram a) illustrates transient positions and patterns and diagram b) illustrates the region division and patterns.
Fig. 8 is a structural block diagram of a monophonic sound decoding device according to a preferred embodiment of the invention.
Fig. 9 is a structural block diagram of the low-frequency waveform decoding module shown in Fig. 8.
Fig. 10 is a structural block diagram of the high-frequency parameter decoding module shown in Fig. 8.
Fig. 11 is a structural block diagram of a stereo encoding device according to a preferred embodiment of the invention.
Fig. 12 is a model diagram of the sum-difference stereo coding mode according to a preferred embodiment of the invention.
Fig. 13 is a model diagram of the parametric stereo coding mode according to a preferred embodiment of the invention.
Fig. 14 is a model diagram of the parametric-error stereo coding mode according to an embodiment of the invention.
Fig. 15 is a structural block diagram of a stereo decoding device according to a preferred embodiment of the invention.
Detailed description of embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in more detail below through embodiments and with reference to the accompanying drawings.
Fig. 1 is a structural block diagram of a monophonic sound encoding device according to a preferred embodiment of the invention.
As shown in Fig. 1, the monophonic sound encoding device according to the preferred embodiment of the invention comprises: a resampling module 101, a signal type judgment module 102, a Modified Discrete Cosine Transform (MDCT) module 103, a low-frequency waveform coding module 104, an MDCT-to-Modified-Discrete-Fourier-Transform (MDFT) conversion module 105, a high-frequency parameter coding module 106, and a bitstream multiplexing module 107.
First, the connections and functions of the modules in Fig. 1 are introduced in overview:
The resampling module 101 converts the input digital audio signal from its original sampling rate to a target sampling rate and outputs the resampled signal, frame by frame, to the signal type judgment module 102 and the MDCT module 103. Note that if the input digital audio signal already has the target sampling rate, an encoding device according to the principles of the invention need not include a resampling module; the digital audio signal can instead be input directly to the signal type judgment module 102 and the MDCT module 103.
The signal type judgment module 102 performs frame-by-frame signal-type analysis on the resampled sound signal and outputs the analysis result. Because of the complexity of real signals, the signal type can be expressed in several forms. For example, if the current frame is a slowly-varying signal, the module directly outputs a flag indicating a slowly-varying frame; if it is a fast-varying (transient) signal, the module additionally calculates the position where the transient occurs, and outputs a flag indicating a transient frame together with the transient position. The signal-type analysis result is passed to the MDCT module 103 to control the order of the MDCT. Note that if the signal type is instead determined by a closed-loop search, a sound encoder according to the invention need not include a signal type judgment module.
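The patent does not fix a particular analysis rule; one common possibility is a subframe-energy detector, where a frame counts as fast-varying when one subframe's energy jumps far above the average of the preceding subframes. The sketch below illustrates this under assumed values (8 subframes, threshold 8) that are not taken from the patent.

```python
import numpy as np

def classify_frame(frame, n_sub=8, ratio_thresh=8.0):
    """Label a frame 'slow' or 'fast' (transient) and locate the transient.

    frame: 1-D array whose length is divisible by n_sub.
    Returns ('slow', -1) or ('fast', i), where i is the subframe index
    at which the energy jump occurs.
    """
    sub = frame.reshape(n_sub, -1)
    energy = np.sum(sub ** 2, axis=1) + 1e-12   # per-subframe energy
    for i in range(1, n_sub):
        # Compare this subframe against the running average of earlier ones.
        if energy[i] / np.mean(energy[:i]) > ratio_thresh:
            return "fast", i
    return "slow", -1
```

The returned transient position would be sent as side information and also steer the MDCT order selection described below.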
According to the signal-type analysis result output by the signal type judgment module 102, the MDCT module 103 applies MDCTs of different orders to map the resampled sound signal to the MDCT domain, and outputs the MDCT-domain coefficients of the sound signal to the low-frequency waveform coding module 104 and the MDCT-to-MDFT conversion module 105. Specifically, if the current frame is slowly varying, the MDCT is applied frame by frame with a longer transform order; if the frame is transient, it is divided into subframes and the MDCT is applied subframe by subframe with a shorter transform order. After the MDCT, the MDCT-domain coefficients are divided into a low-frequency spectrum and a high-frequency spectrum: the low-frequency spectrum is output to the low-frequency waveform coding module 104, while the low-frequency spectrum, the high-frequency spectrum, and the signal-type analysis result are output to the MDCT-to-MDFT conversion module 105.
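To make the long/short MDCT concrete, here is a minimal direct (O(N²)) MDCT/IMDCT pair with the sine window; it is an illustrative sketch, not the patent's implementation. The transform order is simply the frame length passed in, so long and short transforms share the same code. Overlap-adding the inverse transforms of two 50%-overlapped frames reconstructs the overlapped region exactly (time-domain alias cancellation).

```python
import numpy as np

def mdct(frame):
    """MDCT of one 2N-sample frame (sine-windowed) -> N coefficients."""
    two_n = frame.size
    n_half = two_n // 2
    n = np.arange(two_n)
    k = np.arange(n_half)[:, None]
    window = np.sin(np.pi / two_n * (n + 0.5))          # sine window
    basis = np.cos(np.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
    return basis @ (window * frame)

def imdct(coeffs):
    """Inverse MDCT -> one sine-windowed 2N-sample frame for overlap-add."""
    n_half = coeffs.size
    two_n = 2 * n_half
    n = np.arange(two_n)
    k = np.arange(n_half)[:, None]
    window = np.sin(np.pi / two_n * (n + 0.5))
    basis = np.cos(np.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
    return window * (2.0 / n_half) * (coeffs @ basis)
```

Overlap-adding `imdct(mdct(...))` over frames hopped by N samples reconstructs the signal; a production codec would use an FFT-based O(N log N) formulation instead of the explicit basis matrix.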
The low-frequency waveform coding module 104 receives the low-frequency part of the MDCT-domain coefficients from the MDCT module 103, applies redundancy-removal processing to it, quantizes and encodes the processed low-frequency spectrum to obtain low-frequency coded data, and outputs the data to the bitstream multiplexing module 107. Note that if the temporal redundancy of the low-frequency component already meets the coding requirement, the low-frequency waveform coding module 104 may skip redundancy-removal processing.
The MDCT-to-MDFT conversion module 105 receives the MDCT-domain coefficients of the sound signal from the MDCT module 103, converts them into MDFT-domain coefficients that include phase information, and outputs the MDFT-domain coefficients to the high-frequency parameter coding module 106.
The high-frequency parameter coding module 106 receives the MDFT-domain coefficients from the MDCT-to-MDFT conversion module 105, extracts the required high-frequency parameters, such as gain parameters and tonality parameters, quantizes and encodes them, and outputs the result to the bitstream multiplexing module 107.
The bitstream multiplexing module 107 multiplexes the side information and coded data output by the signal type judgment module 102, the low-frequency waveform coding module 104, and the high-frequency parameter coding module 106 to form the encoded sound bitstream.
Below, the resampling module 101, the low-frequency waveform coding module 104, the MDCT-to-MDFT conversion module 105, and the high-frequency parameter coding module 106 of the above monophonic sound encoding device are explained in detail.
Fig. 2 is a structural block diagram of the resampling module shown in Fig. 1.
As shown in Fig. 2, the resampling module 101 comprises an up-sampler 201, a low-pass filter 202, and a down-sampler 203. The up-sampler 201 up-samples the signal x(n) of sampling frequency Fs by a factor of L to obtain a signal w(n) of sampling frequency L*Fs, and the low-pass filter 202 filters w(n) to produce the filtered signal v(n). The purpose of the low-pass filter 202 is to remove the images generated by the up-sampler 201 and to prevent the aliasing that the down-sampler 203 could introduce. The down-sampler 203 down-samples v(n) by a factor of M to obtain the signal y(n) of sampling frequency (L/M)*Fs. The resampled signal is output frame by frame to the signal type judgment module 102 and the MDCT module 103.
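The three-stage structure (zero-stuffing up-sampler, anti-imaging/anti-aliasing low-pass, down-sampler) can be sketched directly in NumPy. This is a hedged illustration: the windowed-sinc design and the Hamming window and tap count are arbitrary choices for the sketch, not taken from the patent.

```python
import numpy as np

def resample_rational(x, L, M, num_taps=161):
    """Convert x from rate Fs to (L/M)*Fs: up-sample by L, low-pass, down-sample by M."""
    # Up-sampler 201: insert L-1 zeros between samples (rate becomes L*Fs).
    w = np.zeros(x.size * L)
    w[::L] = x
    # Low-pass filter 202: windowed-sinc cut off at 1/(2*max(L, M))
    # cycles/sample at the high rate; this removes the up-sampling images
    # and prevents down-sampling aliasing. Gain L restores the amplitude
    # lost by zero-stuffing.
    fc = 1.0 / (2 * max(L, M))
    t = np.arange(num_taps) - (num_taps - 1) / 2
    h = L * 2 * fc * np.sinc(2 * fc * t) * np.hamming(num_taps)
    v = np.convolve(w, h, mode="same")
    # Down-sampler 203: keep every M-th sample (rate becomes (L/M)*Fs).
    return v[::M]
```

For L=2, M=1 the filter is a half-band (Nyquist) filter, so every second output sample reproduces an input sample essentially exactly.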
Fig. 3 is a structural block diagram of the low-frequency waveform coding module 104 shown in Fig. 1.
As shown in Fig. 3, the low-frequency waveform coding module 104 comprises a redundancy-removal module 301 and a quantization-encoding module 302. The low-frequency component output by the MDCT module is the more stationary part of the signal, but its temporal or frequency-domain correlation (i.e., redundancy) can still be strong. Because of the complexity of real signals, an MDCT of fixed order cannot achieve complete decorrelation. For example, when the signal type judgment module 102 judges the current frame to be transient, a shorter-order MDCT is used to process it, and the temporal and frequency-domain correlation of the low-frequency part in the MDCT domain remains strong; when the frame is judged to be slowly varying, a longer-order MDCT is used, and the frequency-domain correlation of the low-frequency part in the MDCT domain can still be strong. The redundancy-removal module 301 included in the sound encoder of the invention is therefore optional: it can further remove the temporal or frequency-domain redundancy of the low-frequency component obtained by the MDCT.
Low-frequency redundancy processing can be implemented in many ways. For example, a shorter-order transform or a higher-order predictor, such as a discrete cosine transform (DCT), discrete Fourier transform (DFT), MDCT, or long-term prediction (LTP), can remove the temporal correlation of the MDCT-domain low-frequency part between two subframes or two consecutive frames. As another example, a lower-order predictor, such as a linear predictor (LPC), can remove the frequency-domain correlation of the MDCT-domain low-frequency part. Accordingly, the redundancy-removal module 301 can evaluate several redundancy-removal methods by the redundancy they actually remove, i.e., the actual coding gain, decide whether to apply low-frequency redundancy processing and which method to use, and finally output the flags indicating whether redundancy removal is used and which method is chosen as side information to the bitstream multiplexing module 107.
The quantization-encoding module 302 quantizes and encodes the low-frequency data to obtain coded low-frequency data. The quantization scheme may be scalar quantization plus Huffman coding, similar to MPEG AAC, or a vector quantization scheme; for constant-bit-rate coding, a vector quantizer is a reasonable choice. The coded low-frequency data and the side information produced by the low-frequency redundancy processing are output to the bitstream multiplexing module 107.
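As a toy illustration of the quantization stage (the text above leaves the concrete scheme open), a uniform scalar quantizer bounds the reconstruction error by half the step size; a real coder would follow it with Huffman coding or replace it with a vector quantizer. The step size here is an arbitrary example value.

```python
import numpy as np

def quantize(spectrum, step):
    """Uniform scalar quantization: real coefficients -> integer indices."""
    return np.round(spectrum / step).astype(np.int64)

def dequantize(indices, step):
    """Inverse quantization: integer indices -> reconstructed coefficients."""
    return indices * step
```

The integer indices are what an entropy coder (e.g., Huffman) would compress; the decoder runs `dequantize` with the same step.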
The MDCT-to-MDFT conversion module 105 converts the MDCT-domain coefficients into MDFT-domain coefficients and outputs them to the high-frequency parameter coding module 106. Several concrete conversion methods are possible. One is to apply a modified discrete sine transform (MDST) to the sound signal to obtain MDST-domain coefficients and combine them with the MDCT-domain coefficients to form the MDFT-domain coefficients; note that with this method the MDCT-to-MDFT conversion module 105 must also receive the resampled sound signal. Another is to reconstruct the time-domain signal from the MDCT-domain coefficients, apply the MDST to obtain MDST-domain coefficients, and combine the MDCT- and MDST-domain coefficients into MDFT-domain coefficients. Alternatively, the time-domain signal can be reconstructed from the MDCT-domain coefficients and an MDFT applied to it directly. Finally, by establishing the relation between the MDCT-domain coefficients of the current, previous, and next frames and the MDST-domain coefficients of the current frame, three transform matrices can be determined that compute the current frame's MDST-domain coefficients directly from the three frames' MDCT-domain coefficients; the MDCT- and MDST-domain coefficients are then combined into the MDFT coefficients.
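One of the conversion routes above, computing the MDST alongside the MDCT from the time-domain frame and combining the two, can be sketched for a single frame. The coefficient is formed here as MDCT − j·MDST, mirroring the e^(−jθ) convention of the DFT; this exact sign pairing is an assumption for illustration. Unlike the real-valued MDCT, the complex result carries phase, and its magnitude satisfies an exact Parseval-type relation with the windowed frame energy.

```python
import numpy as np

def mdft(frame):
    """Complex MDFT coefficients of one sine-windowed 2N-sample frame.

    Real part = MDCT, imaginary part = -MDST, so phase information
    (lost by the MDCT alone) is retained.
    """
    two_n = frame.size
    n_half = two_n // 2
    n = np.arange(two_n)
    k = np.arange(n_half)[:, None]
    window = np.sin(np.pi / two_n * (n + 0.5))
    theta = np.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5)
    xw = window * frame
    return (np.cos(theta) @ xw) - 1j * (np.sin(theta) @ xw)
```

Because the cos/sin pair forms a complex exponential basis, the summed squared magnitude equals N times the windowed-frame energy, which is what makes |MDFT| a phase-insensitive energy measure for the parameter extraction below.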
The high-frequency parameter coding module 106 extracts, from the MDFT-domain low-frequency and high-frequency spectra, the high-frequency parameters used to recover the high-frequency spectrum from the low-frequency spectrum, quantizes and encodes them, and outputs the result to the bitstream multiplexing module 107.
Fig. 4 is a structural block diagram of the high-frequency parameter coding module 106 shown in Fig. 1. The high-frequency parameter coding module 106 comprises a spectrum mapper 401, a tonality parameter extractor 402, and a gain parameter extractor 403.
The spectrum mapper 401 maps specific bands of the MDFT-domain low-frequency spectrum to specific bands of the high-frequency spectrum, obtaining a mapped high-frequency spectrum in the MDFT domain, and outputs the mapped high-frequency spectrum to the tonality parameter extractor 402 and the gain parameter extractor 403, respectively. The time-frequency plane after mapping is identical to the original time-frequency plane, as shown in Fig. 5.
The high frequency spectrum that the output of tonality parameter extractor 402 received spectrum mapper 401 and MDCT to MDFT modular converter 105 export, is divided into multiple sub-band by the high frequency spectrum after mapping and original high-frequency spectrum.Next, calculate the tonality of original high-frequency frequency band respectively and map the tonality of the corresponding frequency band of rear high frequency spectrum, obtain being used for adjusting at decoding device end the tonality parameter mapped required for rear high frequency spectrum tonality, and these parameters are outputted in bit stream Multiplexing module 107 after quantization encoding, wherein, tonality parameter can comprise adjustment type and adjustment parameter.
The high frequency spectrum that the output of gain parameter extraction apparatus 403 received spectrum mapper 401 and MDCT to MDFT modular converter 105 export.The position that gain parameter extraction apparatus 403 occurs according to signal type and fast height, high frequency time-frequency plane frequency spectrum after mapping and original high-frequency time-frequency plane are divided multiple region, the ratio of the region energy that the energy calculating each region in original time-frequency plane is corresponding with mapping time-frequency plane is as gain parameter, and this gain parameter outputs in bit stream Multiplexing module 107 after quantization encoding.
The monophonic sound coding method according to the preferred embodiment of the invention is described in detail below. The method comprises the following steps:
Step 11: resample the input signal;
Step 12: analyze the signal type of the resampled sound signal; if it is a gradual signal, output the signal type directly; if it is a fast-changing signal, further calculate the position at which the transient occurs, and finally output the signal type analysis result, comprising the signal type and the transient position, as side information;
Step 13: according to the signal type analysis result, apply MDCT transforms of different orders to the resampled sound signal to obtain its MDCT-domain coefficients;
Step 14: divide the MDCT-domain coefficients into a low-frequency spectrum and a high-frequency spectrum;
Step 15: apply low-frequency waveform coding to the low-frequency spectrum to obtain the low-frequency waveform coded data;
Step 16: convert the MDCT-domain coefficients to MDFT-domain coefficients, obtaining the low-frequency and high-frequency spectra of the MDFT domain;
Step 17: extract the high-frequency parameters used to recover the high-frequency spectrum from the MDFT-domain low-frequency spectrum, and quantize and encode them to obtain the high-frequency parameter coded data;
Step 18: multiplex the coded data and the side information to obtain the sound coding bit stream.
Each step of the monophonic coding method according to the preferred embodiment is described in detail below.
In step 11, the resampling comprises the following. First, from the sampling rate Fs of the input signal and the target sampling rate Fmax, compute the resampling ratio Fmax/Fs = L/M, where the target sampling rate Fmax is the highest analysis frequency of the decoded signal, generally determined by the coding bit rate. Then upsample the input sound signal x(n) by a factor of L, inserting L−1 zeros between samples; pass the upsampled signal through a low-pass filter h(n) of length N (when N = ∞ the filter is an IIR filter) with cutoff frequency Fmax to obtain v(n); the sequence obtained by downsampling v(n) by a factor of M is y(n) = v(Mn). In this way the sampling rate of the resampled sound signal y(n) is L/M times that of the originally input sound signal x(n). Note that if the input digital sound signal already has the target sampling rate, step 11 need not be performed.
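The upsample/filter/downsample chain of step 11 can be sketched as follows (a minimal sketch: the windowed-sinc filter, its length, the Hamming taper, and the function name are illustrative choices, not taken from the patent):

```python
import numpy as np

def resample_rational(x, L, M, taps=101):
    # Upsample by L: insert L - 1 zeros between input samples.
    v_up = np.zeros(len(x) * L)
    v_up[::L] = x
    # Low-pass filter: windowed sinc with cutoff at the tighter of the two
    # Nyquist limits (normalized to the upsampled rate), passband gain L.
    cutoff = 1.0 / max(L, M)
    n = np.arange(taps) - (taps - 1) / 2
    h = L * cutoff * np.sinc(cutoff * n) * np.hamming(taps)
    v = np.convolve(v_up, h, mode="same")
    # Downsample by M: y(n) = v(Mn).
    return v[::M]

# Halve the sampling rate of a low-frequency sine (L/M = 1/2).
x = np.sin(2 * np.pi * 0.05 * np.arange(1000))
y = resample_rational(x, 1, 2)
```

The symmetric odd-length filter with `mode="same"` keeps the output time-aligned with the input, so y(n) tracks x(2n) directly.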
In step 12, signal type analysis is performed on the resampled digital sound signal. If it is a gradual signal, the signal type is output directly; if it is a fast-changing signal, the position at which the transient occurs is also calculated, and finally the signal type and the transient position are output.
Many methods can be used to judge the signal type, for example from the perceptual entropy of the signal, or from the energies of the signal's subframes. Preferably, the subframe-energy method is adopted; its detailed procedure is as follows:
Step 12-1: high-pass filter one frame of the digital sound signal y(n) to filter out the low-frequency part, e.g. frequencies below 500 Hz;
Step 12-2: divide the high-pass-filtered signal into several subframes yi(n); for computational convenience, a frame is usually divided into an integer number of subframes, e.g. a frame of 2048 samples may use subframes of 256 samples;
Step 12-3: calculate the energy Ei of each subframe yi(n), where i is the subframe index, and form the energy ratio of the current subframe to the previous one. When some ratio exceeds a threshold Te, the frame is judged to be a fast-changing signal; if the energy ratios of all subframes to their predecessors are below Te, the frame is judged to be a gradual signal. For a fast-changing signal, continue with step 12-4; otherwise skip step 12-4 and take the gradual signal type as the low-frequency sub-band signal type analysis result. The threshold Te can be obtained with well-known signal-processing techniques, e.g. as the statistical mean energy ratio of coded signals multiplied by a constant;
Step 12-4: for a fast-changing signal, the subframe with the maximum energy is judged to be the position where the transient occurs. The fast-changing signal type and the transient position form the low-frequency sub-band signal type analysis result.
If signal type analysis is not needed, step 12 need not be performed.
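Steps 12-1 to 12-4 can be sketched as follows (the high-pass filter of step 12-1 is omitted for brevity; the subframe length and the threshold Te are illustrative values, not fixed by the patent):

```python
import numpy as np

def classify_frame(frame, sub_len=256, te=2.5):
    # Step 12-2: split the frame into an integer number of subframes.
    subs = frame.reshape(-1, sub_len)
    # Step 12-3: per-subframe energies Ei.
    energies = np.sum(subs ** 2, axis=1)
    for i in range(1, len(energies)):
        # Current-to-previous energy ratio above Te marks a fast-changing frame.
        if energies[i] > te * energies[i - 1]:
            # Step 12-4: the maximum-energy subframe gives the transient position.
            return "fast", int(np.argmax(energies))
    return "gradual", None

quiet_then_burst = np.zeros(2048)
quiet_then_burst[1024:] = np.sin(2 * np.pi * 0.1 * np.arange(1024))
steady = np.sin(2 * np.pi * 0.01 * np.arange(2048))
```

For the burst signal the detector reports a fast-changing frame with the transient in the second half; the steady sine is classified as gradual.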
In step 13, according to the signal type analysis result, MDCT transforms of different orders are applied to the resampled sound signal to obtain its MDCT-domain coefficients.
The modified discrete cosine transform (MDCT) is described in detail below.
Take the time-domain signal consisting of the previous frame's M samples and the current frame's M samples, apply a window to these 2M samples, and then apply the MDCT to the windowed signal to obtain M spectral coefficients.
The impulse response of MDCT analysis filter is:
h k ( n ) = w ( n ) 2 M cos [ ( 2 n + M + 1 ) ( 2 k + 1 ) π 4 M ] ,
Then MDCT is transformed to: 0≤k≤M-1, wherein: w (n) is window function; The input time-domain signal that x (n) converts for MDCT; The output frequency-region signal that X (k) converts for MDCT.
For meeting the condition of signal Perfect Reconstruction, window function w (n) of MDCT conversion must meet following two conditions:
W (2M-1-n)=w (n) and w 2(n)+w 2(n+M)=1.
In practice, Sine window can be selected as window function.Certainly, also by using biorthogonal conversion, the above-mentioned restriction to window function can be revised with specific analysis filter and synthesis filter.
Like this, these frame data adopting MDCT to carry out time-frequency conversion just obtain different time-frequency plane figure according to signal type.Such as, suppose that time-frequency conversion exponent number when present frame is tempolabile signal is 2048, for time-frequency conversion exponent number during fast changed signal type is 256, then time-frequency plane figure as shown in Figure 6, and wherein Fig. 6 a is the time-frequency plane figure of tempolabile signal; Fig. 6 b is the time-frequency plane figure of fast changed signal.
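The analysis/synthesis pair above can be sketched directly from the filter formula (a sketch assuming the sine window; function names are illustrative). With the window conditions above satisfied, the time-domain aliasing of adjacent inverse transforms cancels under overlap-add:

```python
import numpy as np

def mdct(block, w):
    # X(k) = sum_n w(n) x(n) sqrt(2/M) cos[(2n+M+1)(2k+1)pi/(4M)], k = 0..M-1
    M = len(block) // 2
    n = np.arange(2 * M)
    k = np.arange(M).reshape(-1, 1)
    basis = np.sqrt(2.0 / M) * np.cos(np.pi / (4 * M) * (2 * n + M + 1) * (2 * k + 1))
    return basis @ (w * block)

def imdct(X, w):
    # Synthesis with the same basis; the output is overlap-added with neighbours.
    M = len(X)
    n = np.arange(2 * M).reshape(-1, 1)
    k = np.arange(M)
    basis = np.sqrt(2.0 / M) * np.cos(np.pi / (4 * M) * (2 * n + M + 1) * (2 * k + 1))
    return w * (basis @ X)

M = 64
w = np.sin(np.pi * (np.arange(2 * M) + 0.5) / (2 * M))  # sine window
rng = np.random.default_rng(0)
x = rng.standard_normal(4 * M)
# Three frames hopped by M samples; overlap-add the middle region.
frames = [imdct(mdct(x[i * M:i * M + 2 * M], w), w) for i in range(3)]
rec = np.concatenate([frames[0][M:] + frames[1][:M], frames[1][M:] + frames[2][:M]])
```

With this symmetric √(2/M) scaling on both sides, `rec` reproduces the middle 2M input samples exactly.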
In step 14, the MDCT-domain coefficients are divided into a low-frequency spectrum and a high-frequency spectrum. Since the sampling rate and the coding bit rate of the coded sound signal vary widely, the division of the bands is adjustable. Typically, the split point between the low-frequency and high-frequency spectra can lie between 1/3 and 1 of the coding bandwidth, where the coding bandwidth is not greater than the actual bandwidth of the signal to be coded. Here, by the Nyquist sampling theorem, the actual bandwidth of the signal is half its sampling frequency. For example, at a 16 kbps bit rate, when coding a 44.1 kHz monophonic sound signal, one choice of coding bandwidth is 12 kHz.
In step 15, low-frequency waveform coding comprises two steps: low-frequency redundancy removal, and low-frequency quantization and coding. Redundancy removal can be implemented in many ways. For example, a transform of shorter order or a predictor of higher order can be used to remove the temporal correlation of the MDCT-domain sound signal between two subframes or between two consecutive frames, e.g. the discrete cosine transform (DCT), the discrete Fourier transform (DFT), the modified discrete cosine transform (MDCT), or long-term prediction (LTP); or a predictor of lower order can be used to remove the frequency-domain correlation within the MDCT-domain sound signal, e.g. a linear predictor (LPC).
Preferably, low-frequency redundancy removal is described taking the shorter-order DCT and the lower-order LPC as examples.
First, redundancy removal with the shorter-order DCT is described. Here, redundancy removal is applied to the low-frequency spectrum of a fast-changing signal along the time direction: the 8 spectral coefficients at the same frequency position of the time-frequency plane are decorrelated with an 8×8 DCT, using the DCT-II basis functions.
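The 8×8 DCT-II decorrelation along the time direction can be sketched as follows (an unscaled DCT-II is used here for illustration; the patent fixes only the DCT-II basis, not the normalization):

```python
import numpy as np

def dct_ii(v):
    # Plain DCT-II: X(k) = sum_n v(n) cos(pi*k*(n+1/2)/N)
    N = len(v)
    n = np.arange(N)
    k = np.arange(N).reshape(-1, 1)
    return np.cos(np.pi / N * k * (n + 0.5)) @ v

# 8 subframes x a few frequency bins: decorrelate each bin across time.
coeffs = np.tile(np.array([1.0, -2.0, 0.5]), (8, 1))  # constant along time
decorrelated = np.apply_along_axis(dct_ii, 0, coeffs)
```

A coefficient track that is constant across the 8 subframes is fully compacted into the k = 0 output, which is exactly the redundancy-removal effect sought here.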
Second, redundancy removal with the lower-order LPC is described. Here, linear predictive coding is applied to the low-frequency spectrum: linear prediction analysis of the low-frequency spectrum yields the predictor parameters and the low-frequency residual spectrum, and the predictor parameters are quantized.
For low-frequency waveform quantization and coding, a scalar quantization plus Huffman coding scheme similar to that of MPEG AAC can be adopted, or a vector quantization scheme. For constant-bit-rate coding, a vector quantizer is a reasonable choice.
In step 16, the MDCT-domain coefficients are converted to MDFT-domain coefficients.
The modified discrete Fourier transform (MDFT), the relation between the MDCT and the MDFT, and the methods for converting MDCT-domain coefficients to MDFT-domain coefficients are described below.
First, the MDFT transform is introduced. Take the time-domain signal consisting of the previous frame's M samples and the current frame's M samples, apply a window to these 2M samples, and then apply the MDFT to the windowed signal to obtain the spectral coefficients. The MDFT is computed as
X(k) = Σ_{n=0}^{2M−1} s(n)·exp(j·π/(4M)·(2n+M+1)·(2k+1)), k = 0, 1, ..., 2M−1,
where w(n) is the window function, s(n) is the input time-domain signal of the MDFT, and X(k) is its output frequency-domain signal. The MDFT spectral coefficients X(k) have the property
X(k) = −conj(X(2M−1−k)),
so only the first M of the coefficients are needed to recover X(k) completely.
To satisfy the perfect reconstruction condition, the window function w(n) of the MDFT must satisfy the following two conditions:
w(2M−1−n) = w(n) and w²(n) + w²(n+M) = 1.
In practice, a sine window can be selected as the window function. Of course, the above restrictions on the window function can also be relaxed by using a biorthogonal transform with specific analysis and synthesis filters.
Second, the relation between the MDCT transform and the MDFT transform is introduced.
For a time-domain signal s(n), its MDCT-domain coefficients X(k) are computed as
X(k) = Σ_{n=0}^{2M−1} s(n)·cos(π/(4M)·(2n+M+1)·(2k+1)),
where 2M is the frame length. Similarly, the MDST-domain coefficients Y(k) are defined as
Y(k) = Σ_{n=0}^{2M−1} s(n)·sin(π/(4M)·(2n+M+1)·(2k+1)).
With the MDCT-domain coefficients X(k) as the real part and the MDST-domain coefficients Y(k) as the imaginary part, the MDFT-domain coefficients Z(k) are constructed as
Z(k) = X(k) + jY(k), k = 0, 1, ..., M−1,
where j is the imaginary unit. Thus
Z(k) = X(k) + jY(k)
= Σ_{n=0}^{2M−1} s(n)·cos(π/(4M)·(2n+M+1)·(2k+1)) + j·Σ_{n=0}^{2M−1} s(n)·sin(π/(4M)·(2n+M+1)·(2k+1))
= Σ_{n=0}^{2M−1} s(n)·exp(j·π/(4M)·(2n+M+1)·(2k+1)).
The MDFT is thus a complex transform carrying phase information, and it conserves energy: the transform-domain energy is consistent with the time-domain energy. Evidently the real part of the MDFT-domain coefficients is exactly the MDCT-domain coefficients.
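The identities above can be checked numerically (a sketch: unwindowed and unscaled, exactly as in the formulas, with even M so the conjugate-symmetry property holds as stated; under this unnormalized convention the energy is conserved up to the constant factor 2M):

```python
import numpy as np

def phases(M):
    # Phase matrix pi/(4M)*(2n+M+1)*(2k+1) for k, n = 0..2M-1.
    n = np.arange(2 * M)
    k = np.arange(2 * M).reshape(-1, 1)
    return np.pi / (4 * M) * (2 * n + M + 1) * (2 * k + 1)

M = 8
rng = np.random.default_rng(1)
s = rng.standard_normal(2 * M)
ph = phases(M)
X = np.cos(ph) @ s       # MDCT-domain coefficients
Y = np.sin(ph) @ s       # MDST-domain coefficients
Z = np.exp(1j * ph) @ s  # MDFT-domain coefficients
```

The real part of Z is the MDCT spectrum, the imaginary part the MDST spectrum, and the second half of Z is redundant via Z(k) = −conj(Z(2M−1−k)).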
Finally, several concrete conversion methods are given to explain the MDCT-to-MDFT conversion.
Conversion method 1: apply the MDST to the sound signal and combine the MDST-domain coefficients with the MDCT-domain coefficients to obtain the MDFT-domain coefficients.
To convert MDCT-domain coefficients to MDFT-domain coefficients, the relation among the MDCT, MDST and MDFT can be exploited: compute the MDST-domain coefficients and combine them with the MDCT-domain coefficients to obtain the MDFT-domain coefficients. This method comprises two steps: the MDST transform, and combining the MDCT-domain and MDST-domain coefficients into MDFT-domain coefficients.
Step a: MDST transform. To stay synchronized with the MDCT-domain coefficients, the MDFT and MDST use the same window function, window length, etc. as the current frame's MDCT. With the current MDCT length 2M and window function w(n), the MDST-domain coefficients Y(k) are
Y(k) = Σ_{n=0}^{2M−1} s(n)·sin(π/(4M)·(2n+M+1)·(2k+1)), k = 0, 1, ..., M−1.
Step b: combine the MDCT-domain and MDST-domain coefficients into MDFT-domain coefficients. With the MDCT-domain coefficients as the real part and the MDST-domain coefficients as the imaginary part, construct the MDFT-domain coefficients Z(k):
Z(k) = X(k) + jY(k), k = 0, 1, ..., M−1.
Conversion method 2: reconstruct the time-domain signal from the MDCT-domain coefficients, then apply the MDST and combine with the MDCT-domain coefficients to obtain the MDFT-domain coefficients.
The MDCT-domain coefficients are passed through the inverse modified discrete cosine transform (IMDCT) and overlap-add to obtain the reconstructed time-domain signal; the MDST of the reconstructed signal yields the MDST-domain coefficients, which are combined with the MDCT-domain coefficients into MDFT-domain coefficients. This method comprises three steps: time-domain signal reconstruction, the MDST transform, and combining the MDCT-domain and MDST-domain coefficients into MDFT-domain coefficients.
Step a: time-domain signal reconstruction. This is identical to the inverse modified discrete cosine transform. The IMDCT formula is
x_e(n) = Σ_{k=0}^{M−1} X(k)·h_k(n), with h_k(n) = w(n)·√(2/M)·cos[(2n+M+1)(2k+1)π/(4M)],
where x_e(n) is the output time-domain signal of the IMDCT, h_k(n) is the impulse response of the MDCT synthesis filter, w(n) is the window function, and X(k) are the MDCT-domain coefficients.
Step b: MDST transform. To stay synchronized with the MDCT-domain coefficients, the MDST uses the same window function, window length, etc. as the current frame's MDCT. With the current frame's MDCT length 2M and window function w(n), the MDST is
Y(k) = Σ_{n=0}^{2M−1} s(n)·sin(π/(4M)·(2n+M+1)·(2k+1)), k = 0, 1, ..., M−1.
Step c: combine the MDCT-domain and MDST-domain coefficients into MDFT-domain coefficients. With the MDCT-domain coefficients as the real part and the MDST-domain coefficients as the imaginary part, construct the MDFT-domain coefficients Z(k):
Z(k) = X(k) + jY(k), k = 0, 1, ..., M−1.
Conversion method 3: reconstruct the time-domain signal from the MDCT coefficients and apply the MDFT to obtain the MDFT-domain coefficients.
The MDCT-domain coefficients are passed through the IMDCT and overlap-add to obtain the reconstructed time-domain signal, and the MDFT of the reconstructed signal yields the MDFT-domain coefficients. This method comprises two steps: time-domain signal reconstruction and the MDFT transform.
Step a: time-domain data reconstruction, identical to the corresponding IMDCT step above.
Step b: MDFT transform. To stay synchronized with the MDCT-domain coefficients, the MDFT uses the same window function, window length, etc. as the current frame's MDCT. With the current frame's MDCT length 2M and window function w(n), the MDFT is
Z(k) = Σ_{n=0}^{2M−1} sr(n)·w(n)·exp(j·π/(4M)·(2n+M+1)·(2k+1)), k = 0, 1, ..., M−1,
where sr(n) is the reconstructed time-domain signal.
Conversion method 4: obtain the MDFT coefficients by processing the MDCT-domain coefficients directly.
By establishing the relation between the MDCT-domain coefficients of the current, previous, and next frames and the MDST-domain coefficients of the current frame, three transition matrices are determined that compute the current frame's MDST-domain coefficients from these three frames' MDCT-domain coefficients. The MDST-domain coefficients are thus obtained by processing the MDCT-domain coefficients directly, and finally the MDCT-domain and MDST-domain coefficients are combined into the MDFT coefficients. This method comprises two steps: MDST coefficient calculation and MDFT coefficient combination.
Step a: MDST coefficient calculation. Let the current frame length be 2M and the window function be w(n); let the previous frame's MDCT coefficients be S_{-1}(k), the current frame's S_0(k), and the next frame's S_{+1}(k). The MDST-domain coefficients Y(k) are computed by
Y(k) = S_{-1}(k)·T_{cs-1} + S_0(k)·T_{cs0} + S_{+1}(k)·T_{cs+1},
where T_{cs-1}, T_{cs0} and T_{cs+1} are transition matrices representing the contributions of the previous, current, and next frames' MDCT coefficients, respectively, to the current frame's MDST coefficients.
T_{cs-1}, T_{cs0} and T_{cs+1} are all sparse matrices: only a minority of their entries are non-zero, and most entries are 0 or close to 0; by approximating most entries as 0, the transition matrices can be simplified and the amount of computation reduced.
Step b: MDFT coefficient combination. The MDCT-domain and MDST-domain coefficients are combined into MDFT-domain coefficients: with the MDCT coefficients X(k) as the real part and the MDST coefficients Y(k) as the imaginary part, construct Z(k) = X(k) + jY(k), k = 0, 1, ..., M−1.
The transition matrices T_{cs-1}, T_{cs0}, T_{cs+1} and their computation are illustrated below, taking T_{cs0} as the example. Let the window function of the previous frame be w_{-1}(n), that of the current frame w_0(n), and that of the next frame w_{+1}(n), each of length 2M; let the previous frame's MDCT coefficients be S_{-1}(k), the current frame's S_0(k), and the next frame's S_{+1}(k). The current frame's MDST coefficients Y(k) are then
Y(k) = Σ_{n=0}^{2M−1} sr(n)·w_0(n)·sin(π/(4M)·(2n+M+1)·(2k+1)),
where sr(n) is the current frame's reconstructed time-domain signal, given by
sr(n) = (1/(2M))·Σ_{k=0}^{M−1} S_0(k)·cos(π/(4M)·(2n+M+1)·(2k+1))·w_0(n)
+ (1/(2M))·Σ_{k=0}^{M−1} S_{-1}(k)·cos(π/(4M)·(2(n+M)+M+1)·(2k+1))·f_{-1}(n)·w_{-1}(n+M)
+ (1/(2M))·Σ_{k=0}^{M−1} S_{+1}(k)·cos(π/(4M)·(2(n−M)+M+1)·(2k+1))·f_{+1}(n)·w_{+1}(n−M)
with
f_{-1}(n) = 1 for n < M, 0 otherwise;
f_{+1}(n) = 0 for n < M, 1 otherwise;
and the windows w_{-1}(n), w_0(n), w_{+1}(n) satisfying
w_{-1}(n+M)² + w_0(n)² = 1
w_0(n+M)² + w_{+1}(n)² = 1.
From the above formulas one obtains
Y(k) = Y_{-1}(k) + Y_0(k) + Y_{+1}(k),
where
Y_{-1}(k) = Σ_{n=0}^{2M−1} (1/(2M)) Σ_{j=0}^{M−1} S_{-1}(j)·cos(π/(4M)·(2(n+M)+M+1)·(2j+1))·f_{-1}(n)·w_{-1}(n+M)·w_0(n)·sin(π/(4M)·(2n+M+1)·(2k+1))
Y_0(k) = Σ_{n=0}^{2M−1} (1/(2M)) Σ_{j=0}^{M−1} S_0(j)·cos(π/(4M)·(2n+M+1)·(2j+1))·w_0(n)·w_0(n)·sin(π/(4M)·(2n+M+1)·(2k+1))
Y_{+1}(k) = Σ_{n=0}^{2M−1} (1/(2M)) Σ_{j=0}^{M−1} S_{+1}(j)·cos(π/(4M)·(2(n−M)+M+1)·(2j+1))·f_{+1}(n)·w_{+1}(n−M)·w_0(n)·sin(π/(4M)·(2n+M+1)·(2k+1))
Taking Y_0(k) as the example:
Y_0(k) = Σ_{j=0}^{M−1} (1/(2M))·S_0(j)·Σ_{n=0}^{2M−1} w_0(n)·w_0(n)·cos(π/(4M)·(2n+M+1)·(2j+1))·sin(π/(4M)·(2n+M+1)·(2k+1))
= Σ_{j=0}^{M−1} (1/(4M))·S_0(j)·Σ_{n=0}^{2M−1} w_0(n)·w_0(n)·( sin(π/(4M)·(2n+M+1)·(2j+2k+2)) − sin(π/(4M)·(2n+M+1)·(2j−2k)) ).
Let
G(m) = Σ_{n=0}^{2M−1} w_0(n)·w_0(n)·sin(π/(4M)·(2n+M+1)·2m), m = −2M, ..., 2M.
Then
Y_0(k) = Σ_{j=0}^{M−1} (1/(4M))·S_0(j)·( G(j+k+1) − G(j−k) ),
i.e. Y_0(k) can be expressed as a combination of S_0(j) and G. Define the vectors h^{(k)} by
h^{(k)}(j) = (1/(4M))·( G(j+k+1) − G(j−k) ), j = 0, 1, ..., M−1, k = 0, 1, ..., M−1.
Then, treating S_0 as a row vector,
Y_0(k) = S_0(k)·T_{cs0}, with T_{cs0} = ( h^{(0)} h^{(1)} ... h^{(M−1)} ).
This shows that Y_0(k) can be expressed as a combination of the S_0(k) with transition matrix T_{cs0}. Similarly, Y_{-1}(k) can be expressed as a combination of the S_{-1}(k) with transition matrix T_{cs-1}, and Y_{+1}(k) as a combination of the S_{+1}(k) with transition matrix T_{cs+1}. T_{cs-1}, T_{cs0} and T_{cs+1} are all sparse matrices: only a minority of their entries are non-zero, and most entries are 0 or close to 0; by approximating most entries as 0, the transition matrices can be simplified and the amount of computation reduced.
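Conversion method 4 can be checked end to end: build the three transition matrices column by column by pushing unit impulses through IMDCT, overlap-add, and MDST, then verify that they reproduce the directly computed MDST coefficients (a sketch assuming the sine window and a column-vector convention Y = T·S, rather than the row-vector form above; names and scalings are illustrative):

```python
import numpy as np

def trig_basis(M, fn):
    n = np.arange(2 * M)
    k = np.arange(M).reshape(-1, 1)
    return fn(np.pi / (4 * M) * (2 * n + M + 1) * (2 * k + 1))

def reconstruct_then_mdst(s_m1, s_0, s_p1, M, w):
    # IMDCT each frame (scale 2/M for perfect reconstruction with an unscaled
    # forward basis), overlap-add the current frame's 2M-sample segment,
    # then take its windowed MDST.
    C = trig_basis(M, np.cos)
    halves = [(2.0 / M) * w * (C.T @ s) for s in (s_m1, s_0, s_p1)]
    seg = np.concatenate([halves[0][M:] + halves[1][:M],
                          halves[1][M:] + halves[2][:M]])
    return trig_basis(M, np.sin) @ (w * seg)

M = 16
w = np.sin(np.pi * (np.arange(2 * M) + 0.5) / (2 * M))
zero = np.zeros(M)
# The pipeline is linear in each frame's coefficients: one impulse per column.
T_m1 = np.column_stack([reconstruct_then_mdst(e, zero, zero, M, w) for e in np.eye(M)])
T_0 = np.column_stack([reconstruct_then_mdst(zero, e, zero, M, w) for e in np.eye(M)])
T_p1 = np.column_stack([reconstruct_then_mdst(zero, zero, e, M, w) for e in np.eye(M)])

rng = np.random.default_rng(2)
x = rng.standard_normal(4 * M)
C = trig_basis(M, np.cos)
s_frames = [C @ (w * x[i * M:i * M + 2 * M]) for i in range(3)]
y_direct = trig_basis(M, np.sin) @ (w * x[M:3 * M])
y_matrix = T_m1 @ s_frames[0] + T_0 @ s_frames[1] + T_p1 @ s_frames[2]
```

Because only the previous, current, and next frames overlap the current 2M-sample window, the three matrices suffice, and `y_matrix` matches the direct MDST exactly.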
In step 17, high-frequency parameter coding extracts, from the MDFT-domain low-frequency and high-frequency spectra, the parameters used to recover the high-frequency spectrum. In the present invention, the high-frequency parameter coding method comprises the following steps:
Step 17-1: map specific bands of the low-frequency spectrum to specific bands of the high-frequency range, forming the mapped high-frequency spectrum. Spectrum mapping can be implemented in many ways, e.g. folding mapping, linear mapping, or frequency-multiplication mapping. Taking linear mapping as an example, suppose the low-frequency spectrum of the original signal spans [0, F_l] and the high-frequency spectrum spans [F_l, F_s], where 2×F_l < F_s < 3×F_l, as shown in Fig. 5a). After linear mapping, the spectrum shown in Fig. 5b) is obtained.
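In terms of spectral bins, step 17-1 can be sketched as follows (a minimal sketch: tiling the low band into the wider high band is one plausible reading of the mapping when 2×F_l < F_s < 3×F_l; the bin counts and the function name are illustrative):

```python
import numpy as np

def map_low_to_high(low_spec, f_l, f_s):
    # Fill the high band [f_l, f_s) by repeating the low band [0, f_l),
    # which is narrower than the target when 2*f_l < f_s < 3*f_l.
    width = f_s - f_l
    reps = -(-width // f_l)  # ceil(width / f_l)
    return np.tile(low_spec[:f_l], reps)[:width]

low = np.array([1.0, 2.0, 3.0, 4.0])   # bins 0..3, so F_l = 4
mapped = map_low_to_high(low, 4, 10)   # F_s = 10, so 2*F_l < F_s < 3*F_l
```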
Step 17-2: divide the mapped high-frequency time-frequency plane and the original high-frequency time-frequency plane into multiple regions according to the signal type and the transient position; then, for each region, compute the energy of the original high band and the energy of the corresponding region of the mapped high band, form the energy gain of the region, quantize the gains, and output the quantized gains to the bit-stream multiplexing module as side information.
The regions divided in step 17-2 are similar to the scale factor bands in MPEG AAC, and the energy of a region is obtained by summing the energies of its spectral lines. Since the mapped high-frequency spectrum is obtained by mapping the low-frequency spectrum, its structure is consistent with the low-frequency spectrum, as shown in Fig. 7. When the low band is a gradual frame, the high-frequency spectrum can be divided into regions along the frequency direction; when the low band is a fast-change frame, higher temporal resolution is needed to suppress pre-echo and post-echo, and different region divisions can then be made along the time direction according to the transient position. If the transient occurs at the position shown in Fig. 7a), the corresponding region division is as shown in Fig. 7b). For example, if during low-frequency coding the signal type judgment module 102 determines that the transient occurs in the 3rd window, mode 3 in Fig. 7a) is selected, and the region division corresponding to mode 3 in Fig. 7b) is (3, 1, 3, 1). To reduce the number of bits used for side information, the frequency resolution can be lowered for fast-change frames. Note in particular that the original high-frequency spectrum and the mapped high band must use the same region division. The gain of a region is then the ratio of the energy of the original high-frequency spectrum in that region to the energy of the mapped high-frequency spectrum in it. Finally, the gains of all regions are quantized and output to the bit-stream multiplexing module 107.
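The per-region gain of step 17-2 can be sketched as follows (the region boundaries here are illustrative, not the (3, 1, 3, 1) layout of Fig. 7; the small floor on the mapped energy guards the division and is an added assumption):

```python
import numpy as np

def region_gains(orig_hf, mapped_hf, bounds):
    # Gain of a region = energy of the original high band in that region
    # divided by the energy of the mapped high band in the same region.
    gains = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        e_orig = np.sum(orig_hf[lo:hi] ** 2)
        e_map = np.sum(mapped_hf[lo:hi] ** 2)
        gains.append(e_orig / max(e_map, 1e-12))
    return np.array(gains)

orig = np.array([2.0] * 4 + [3.0] * 4)
mapped = np.ones(8)
gains = region_gains(orig, mapped, [0, 4, 8])
```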
Step 17-3: compute the tonality of each original high-frequency band and of the corresponding mapped high-frequency band, obtain the side information for adjusting the tonality of the particular band, comprising the adjustment type and the adjustment parameter, and output this side information to the bit-stream multiplexing module. Several methods can be used to calculate tonality, e.g. by linear prediction in the time domain, by spectral flatness, or by the unpredictability measure of MPEG psychoacoustic model 2.
Taking MPEG psychoacoustic model 2 as the example, the tonality computation is as follows. In model 2, tonality is obtained from the "unpredictability" of each spectral line, computed from the amplitude and phase of the signal spectrum; moreover, the signal spectrum is divided into bands, each containing at least one spectral line.
Let the complex spectrum of the current frame signal be
X[k] = r[k]·e^{jφ[k]}, k = 1, ..., K,
where r[k] is the amplitude and φ[k] the phase.
Calculate the energy of each band:
e[b] = Σ_{k=kl}^{kh} r²[k],
where kl and kh are the lower and upper boundaries of band b.
The unpredictability of each spectral line is the relative distance between the current value and the value predicted from the previous two frames. The predicted amplitude and phase are
r_pred[k] = r_{t−1}[k] + (r_{t−1}[k] − r_{t−2}[k])
φ_pred[k] = φ_{t−1}[k] + (φ_{t−1}[k] − φ_{t−2}[k]),
and the unpredictability c[k] is defined as
c[k] = dist(X[k], X_pred[k]) / (r[k] + |r_pred[k]|) = |r[k]·e^{jφ[k]} − r_pred[k]·e^{jφ_pred[k]}| / (r[k] + |r_pred[k]|).
The unpredictability of a band is the sum over the band of each line's energy multiplied by its unpredictability:
c[b] = Σ_{k=kl}^{kh} c[k]·r²[k].
The normalized band unpredictability is defined as
c_s[b] = c[b] / e[b],
and the band tonality is computed from it as
t[b] = −0.299 − 0.43·log_e(c_s[b]),
limited to 0 ≤ t[b] ≤ 1: t[b] = 1 indicates a pure tone and t[b] = 0 white noise. With the above computation, the tonality of the original high-frequency spectrum and that of the mapped high-frequency spectrum can be obtained. The tonality adjustment parameters of the mapped high-frequency spectrum can then be calculated as follows.
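The model-2 tonality chain above (prediction, per-line unpredictability c[k], band aggregation, mapping to t[b]) can be sketched as follows; the small epsilons guarding the divisions and logarithm, and the band layout, are illustrative additions:

```python
import numpy as np

def band_tonality(spec, spec_prev, spec_prev2, bands):
    r = np.abs(spec)
    # Linear prediction of amplitude and phase from the two previous frames.
    r_pred = 2 * np.abs(spec_prev) - np.abs(spec_prev2)
    phi_pred = 2 * np.angle(spec_prev) - np.angle(spec_prev2)
    x_pred = r_pred * np.exp(1j * phi_pred)
    # Per-line unpredictability c[k].
    c = np.abs(spec - x_pred) / (r + np.abs(r_pred) + 1e-12)
    t = []
    for lo, hi in zip(bands[:-1], bands[1:]):
        e = np.sum(r[lo:hi] ** 2) + 1e-12        # band energy e[b]
        cb = np.sum(c[lo:hi] * r[lo:hi] ** 2)    # band unpredictability c[b]
        t.append(np.clip(-0.299 - 0.43 * np.log(cb / e + 1e-12), 0.0, 1.0))
    return np.array(t)

# A perfectly predictable tone (constant amplitude, linearly advancing phase).
tone = [np.exp(1j * p) * np.ones(4) for p in (0.6, 0.3, 0.0)]
# A maximally unpredictable line: the current frame is minus the prediction.
noise_now, noise_prev = -np.ones(4, complex), np.ones(4, complex)
```

The predictable tone yields t[b] clipped to 1, and the anti-predicted line yields t[b] clipped to 0, matching the pure-tone and white-noise limits stated above.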
Let Test denote the tonality of the mapped high-frequency spectrum, Eest its energy, and Tref the tonality of the original high band, with Test and Tref obtained by the computation above. The tonality adjustment of the mapped high-frequency spectrum is handled in the following ways:
Case 1: when the tonality Test of the mapped high band and the tonality Tref of the original high band are approximately equal, the adjustment type is coded as "no adjustment" and output to the bit-stream multiplexing module 107;
Case 2: when the tonality Test of the mapped band is less than the tonality Tref of the corresponding original high band, the adjustment type is "add tone". The energy ΔE_T of the tone to be added satisfies
T_ref = ( E_est·T_est/(1+T_est) + ΔE_T ) / ( E_est·1/(1+T_est) ) = ( E_est·T_est + ΔE_T·(1+T_est) ) / E_est,
which rearranges to ΔE_T = E_est·(T_ref − T_est)/(1+T_est). ΔE_T is quantized and coded as the adjustment parameter and output, together with the adjustment type code, to the bit-stream multiplexing module 107;
Case 3: when the tonality Test of the mapped band is greater than the tonality Tref of the corresponding original high band, the adjustment type is "add noise". The energy ΔE_N of the noise to be added satisfies
1/T_ref = ( E_est·1/(1+T_est) + ΔE_N ) / ( E_est·T_est/(1+T_est) ) = ( E_est + ΔE_N·(1+T_est) ) / ( E_est·T_est ),
which rearranges to ΔE_N = E_est·(T_est − T_ref)/(T_ref·(1+T_est)). ΔE_N is quantized and coded as the adjustment parameter and output, together with the adjustment type code, to the bit-stream multiplexing module 107.
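The three cases reduce to a small decision function (a sketch: the equality tolerance and the names are illustrative; the two closed forms are exactly those derived above):

```python
def tonality_adjustment(t_est, t_ref, e_est, tol=1e-3):
    """Pick the adjustment type and energy for one mapped band."""
    if abs(t_ref - t_est) <= tol:
        return "none", 0.0                      # case 1: tonalities match
    if t_est < t_ref:                           # case 2: mapped band too noisy
        return "add_tone", e_est * (t_ref - t_est) / (1 + t_est)
    # case 3: mapped band too tonal
    return "add_noise", e_est * (t_est - t_ref) / (t_ref * (1 + t_est))
```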
Below introduce monophonic sound sound decoding device and the method for the preferred embodiment of the present invention, because decode procedure is the inverse process of cataloged procedure, so only simply introduce decode procedure.
Fig. 8 is the structured flowchart of monophonic sound sound decoding device according to the preferred embodiment of the invention.
As shown in Figure 8, monophonic sound sound decoding device according to a preferred embodiment of the invention comprises: bit stream demultiplexing module 801, low frequency waveform decoder module 802, MDCT to MDFT modular converter 803, high-frequency parameter decoder module 804, inversely revise discrete Fourier transform (DFT) (IMDFT) module 805 and resampling module 806.
The connections between the modules shown in Fig. 8 and their respective functions are briefly introduced below.
Bit stream demultiplexing module 801 demultiplexes the received sound coding bit stream to obtain the coded data and side information of the corresponding data frame, outputs the corresponding coded data and side information to low frequency waveform decoder module 802, and outputs the corresponding side information to high-frequency parameter decoder module 804 and inverse modified discrete Fourier transform (IMDFT) module 805.
Low frequency waveform decoder module 802 decodes the low frequency waveform coded data of the frame, applies the inverse of the redundancy processing to the decoded data according to the redundancy processing side information, obtains the decoded low frequency spectrum in the MDCT domain, and outputs it to MDCT-to-MDFT conversion module 803.
MDCT-to-MDFT conversion module 803 receives the output of low frequency waveform decoder module 802, converts the decoded low frequency spectrum coefficients from the MDCT domain to the MDFT domain, and outputs the low frequency spectrum data in the MDFT domain to high-frequency parameter decoder module 804.
High-frequency parameter decoder module 804 maps part of the spectral data of the frame's low frequency spectrum in the MDFT domain to the high-frequency part, then adjusts its gain and tonality according to the high-frequency parameter coded data output by bit stream demultiplexing module 801 (comprising the gain adjustment and tonality adjustment side information) to obtain the decoded high frequency spectrum.
IMDFT module 805 combines the low frequency spectrum and the high frequency spectrum and applies the IMDFT. The IMDFT adopts transforms of different order lengths according to the signal type side information, yielding the time-domain signal of the frame.
Resampling module 806 converts the sampling frequency of the frame's time-domain signal output by IMDFT module 805 to a sampling frequency suitable for sound playback. Note that if the sampling frequency of the signal output by IMDFT module 805 is already suitable for sound playback, the sound decoding device of the present invention need not comprise this module.
The low frequency waveform decoder module 802 and high-frequency parameter decoder module 804 of the mono sound decoding device are explained in detail below.
Fig. 9 is a structural block diagram of the low frequency waveform decoder module shown in Fig. 8. As shown in Fig. 9, low frequency waveform decoder module 802 comprises inverse quantization module 901 and redundancy inverse processing module 902. First, inverse quantization module 901 performs inverse quantization decoding on the low frequency coded data obtained from bit stream demultiplexing module 801 to obtain the dequantized low frequency spectrum; the inverse quantization decoding method is the inverse of the quantization encoding used by the low frequency waveform coding module at the encoding side. Then redundancy inverse processing module 902 first checks, according to the flag side information, whether the inverse of the low frequency redundancy processing should be performed: if the flag indicates no inverse processing, the dequantized low frequency spectrum is left unchanged; otherwise, the inverse of the low frequency redundancy processing is applied to the dequantized low frequency spectrum.
Figure 10 is a structural block diagram of high-frequency parameter decoder module 804 shown in Fig. 8.
As shown in Figure 10, high-frequency parameter decoder module 804 comprises frequency spectrum mapper 1001, tonality adjuster 1002 and fader 1003.
Frequency spectrum mapper 1001 maps the specific frequency bands of the dequantized low frequency spectrum in the MDFT domain to the specific frequency bands of the high frequency spectrum. The spectrum mapping rule is consistent with that of spectrum mapper 401 in high-frequency parameter coding module 106 at the encoding side; the time-frequency plane after mapping is as shown in Fig. 5. Tonality adjuster 1002 divides the mapped high frequency spectrum into multiple sub-bands, using the same division method as tonality parameter extractor 402 in high-frequency parameter coding module 106 at the encoding side, and then decides according to the tonality adjustment type side information: if the adjustment type is "no adjustment", the mapped spectrum is left untouched; if the adjustment type is "noise addition", the adjustment parameter side information is dequantized, the energy of the noise to be added is computed from the dequantized result, and noise of the corresponding energy is added to the corresponding band of the mapped spectrum; if the adjustment type is "tone addition", the adjustment parameter side information is dequantized, the energy of the tone to be added is computed from the dequantized result, and a tone of the corresponding energy is added at the centre of that band of the mapped spectrum. When a tone is added, its phase is kept continuous across consecutive frames. Fader 1003 divides the time-frequency plane into multiple regions according to the transient position side information, using the same region division method as gain parameter extractor 403 in high-frequency parameter coding module 106. The target energy of the gain adjustment for each region is then obtained from the gain adjustment parameter side information, and finally the energy of each region is adjusted to match the target energy of that region.
The mono sound decoding method according to the preferred embodiment of the invention is described in detail below; the method comprises the following steps:
Step 21: demultiplex the sound coding bit stream to obtain the low frequency coded data, the high-frequency parameter coded data and all side information used in decoding.
Step 22: according to the low frequency coded data and side information, perform inverse quantization and decoding on the low frequency coded data, then apply the inverse of the low frequency redundancy processing to obtain the decoded low frequency spectrum in the MDCT domain;
Step 23: convert the dequantized low frequency spectrum from the MDCT domain to the MDFT domain, obtaining the low frequency spectrum in the MDFT domain;
Step 24: according to the low frequency spectrum in the MDFT domain and the side information, perform parameter decoding on the high-frequency parameters, obtaining the decoded high frequency spectrum in the MDFT domain;
Step 25: combine the decoded low frequency spectrum and high frequency spectrum in the MDFT domain and apply the IMDFT, obtaining the decoded time-domain signal;
Step 26: resample the decoded time-domain signal, converting its sampling rate to a sampling frequency suitable for sound playback.
In step 22, low frequency signal decoding comprises two steps: low frequency inverse quantization and the inverse of the low frequency redundancy processing. First, inverse quantization and decoding are applied to the low frequency coded data to obtain the dequantized low frequency spectrum. Then, whether the frame underwent low frequency redundancy processing at the encoding side is determined from the side information; if so, the inverse of the low frequency redundancy processing must be applied to the dequantized low frequency spectrum, otherwise the dequantized low frequency spectrum is left unchanged.
The low frequency inverse quantization and the inverse redundancy processing correspond respectively to the quantization method and the redundancy processing method used in low frequency signal coding. If the specific embodiment of the low frequency signal coding part uses vector quantization, the corresponding low frequency inverse quantization obtains the codeword indices from the bit stream and looks up the corresponding vectors in the fixed codebook according to the codeword indices. The vectors are concatenated in order into the dequantized low frequency spectrum. Whether the encoding side performed low frequency redundancy processing is determined from the side information. If not, no inverse redundancy processing is applied to the dequantized low frequency spectrum; if so, which low frequency redundancy processing method the encoding side adopted is also determined from the side information: if the encoding side used the DCT method, the decoding side applies an 8x8 IDCT to invert the redundancy processing of the low frequency spectrum; if the encoding side used the LPC method, the decoding side dequantizes the LPC model parameters to obtain the dequantized linear prediction parameters and applies inverse filtering to the low frequency residual spectrum.
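The codebook lookup just described can be sketched as follows. The toy codebook values are hypothetical and only illustrate the index-to-vector concatenation, not the patent's trained codebook.

```python
import numpy as np

def vq_dequantize(indices, codebook):
    """Look each codeword index up in the fixed codebook and concatenate
    the vectors in order to rebuild the dequantized low-frequency spectrum."""
    return np.concatenate([codebook[i] for i in indices])

# toy 4-entry codebook of 2-dimensional vectors (hypothetical values)
codebook = np.array([[0.0, 0.0], [1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]])
spectrum = vq_dequantize([2, 0, 1], codebook)
```

A real decoder would read the indices from the demultiplexed bit stream and use the same fixed codebook as the encoder.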
In step 23, the MDCT-to-MDFT conversion can be implemented in several known ways. One concrete method reconstructs the time-domain signal from the MDCT coefficients, applies an MDST to obtain the MDST coefficients, and combines the MDCT and MDST coefficients into the MDFT coefficients. Another reconstructs the time-domain signal from the MDCT coefficients and then applies an MDFT directly to obtain the MDFT coefficients. A third establishes the relation between the MDCT coefficients of the current, previous and next frames and the MDST coefficients of the current frame, determines three transition matrices that compute the current frame's MDST coefficients from these three frames of MDCT coefficients, thereby obtaining the MDST coefficients directly from MDCT-domain processing, and finally combines the MDCT and MDST coefficients into the MDFT coefficients. The MDCT-to-MDFT conversion method in this step is identical to the conversion method in the mono encoding according to the preferred embodiment of the invention and is therefore not repeated here.
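The first route, combining MDCT and MDST coefficients into a complex spectrum, can be sketched as follows. This is a minimal direct-evaluation sketch assuming the conventional MDCT/MDST phase term $\frac{\pi}{M}(n+\frac{1}{2}+\frac{M}{2})(k+\frac{1}{2})$; the function name and the sign convention of the imaginary part are illustrative, not from the patent.

```python
import numpy as np

def mdct_mdst_to_mdft(x):
    """Compute MDCT and MDST of one 2M-sample windowed frame directly and
    combine them into complex MDFT-style coefficients (MDCT + j*MDST).
    Direct O(M^2) evaluation for clarity; real codecs use fast transforms
    or the transition-matrix route that avoids time-domain reconstruction."""
    M = len(x) // 2
    n = np.arange(2 * M)
    k = np.arange(M)[:, None]
    phase = np.pi / M * (n + 0.5 + M / 2) * (k + 0.5)
    mdct = (x * np.cos(phase)).sum(axis=1)
    mdst = (x * np.sin(phase)).sum(axis=1)
    return mdct + 1j * mdst
```

Unlike the real-valued MDCT alone, the combined complex coefficients carry the phase information the high-frequency parameter coder needs.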
In step 24, the high-frequency parameter decoding method may comprise the following steps:
Step 24-1: map the specific frequency bands of the dequantized low frequency spectrum in the MDFT domain to the specific high frequency bands.
Step 24-2: compute the energy of each region of the mapped time-frequency plane; the region division is consistent with the encoder.
Step 24-3: obtain the tonality adjustment type from bit stream demultiplexing module 801; if the adjustment type is "no adjustment", go to step 24-5, otherwise go to step 24-4.
Step 24-4: obtain the tonality adjustment parameter from bit stream demultiplexing module 801 and dequantize it, then apply the tonality adjustment to the mapped spectrum according to the dequantized tonality adjustment parameter.
Step 24-5: obtain the quantized gain of each region of the time-frequency plane from bit stream demultiplexing module 801; after dequantization, adjust the gain of each region of the high frequency spectrum output by step 24-2 or step 24-4 so that the energy of each adjusted region equals the target energy, forming the high frequency spectrum of the signal.
The spectrum mapping in step 24-1 can be implemented in many known ways, for example folded mapping, linear mapping, or frequency-multiplication mapping. Taking linear mapping as an example, the mapping method is described below. Suppose the low frequency spectrum of the original signal spans $[0, F_L]$ and the high frequency spectrum spans $[F_L, F_s]$, where $2\times F_L < F_s < 3\times F_L$, as shown in Fig. 5 a). The spectrum obtained by linear mapping is then as shown in Fig. 5 b).
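In terms of spectral bins, this linear mapping can be sketched as below: with $2F_L < F_s < 3F_L$, the high band is wider than the low band, so the low-band bins are copied once in full and then partially repeated. The function name and bin-level formulation are illustrative assumptions.

```python
import numpy as np

def linear_map_high(low_spec, n_high):
    """Copy low-band bins (covering [0, F_L]) onto the high band [F_L, F_s].
    With 2*F_L < F_s < 3*F_L the high band holds n_high bins, where
    len(low_spec) < n_high < 2*len(low_spec), so the low band is copied
    once in full and then partially repeated."""
    n_low = len(low_spec)
    reps = -(-n_high // n_low)        # ceiling division
    return np.tile(low_spec, reps)[:n_high]
```

The mapped bins then serve as the raw high-frequency spectrum whose tonality and gain are adjusted in steps 24-4 and 24-5.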
In steps 24-3 and 24-4, after the adjustment type of each high-frequency region and the dequantized adjustment parameter are obtained, the tonality of the mapped high frequency spectrum is adjusted. Let the energy of the mapped band be $E_{est}$ and the dequantized adjustment parameter be $\Delta E_T$ or $\Delta E_N$; the adjustment is handled in the following two cases:
Situation 1: when the adjustment type is tone addition, a tone of energy $\Delta E_T$ is added at the centre of the band, and the phase of the added tone is kept continuous across consecutive frames;
Situation 2: when the adjustment type is noise addition, noise of energy $\Delta E_N$ is added, with random phase.
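The two decoder-side cases can be sketched as follows for one complex sub-band. This is an illustrative sketch: the uniform spreading of the noise energy over the band, and the omission of cross-frame phase continuity for the added tone, are simplifying assumptions.

```python
import numpy as np

def adjust_band(band, kind, delta_e, rng):
    """Apply the decoded tonality adjustment to one complex sub-band.
    'add_tone' places a single component of energy delta_e at the band
    centre (phase continuity with the previous frame is omitted here);
    'add_noise' spreads energy delta_e across the band with random phases."""
    band = band.copy()
    if kind == "add_tone":
        band[len(band) // 2] += np.sqrt(delta_e)
    elif kind == "add_noise":
        phases = rng.uniform(0.0, 2.0 * np.pi, len(band))
        band += np.sqrt(delta_e / len(band)) * np.exp(1j * phases)
    return band
```

In both branches the total energy added to the band equals the dequantized adjustment parameter.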
In step 25, the IMDFT corresponds to the MDFT at the encoding side. For the inverse modified discrete Fourier transform (IMDFT), the frequency-to-time mapping comprises three steps: the IMDFT itself, time-domain windowing, and time-domain overlap-add.
First, the IMDFT is applied to the dequantized spectrum to obtain the transformed time-domain signal $sr(n)$. The IMDFT is expressed as:

$$sr(n) = \frac{1}{2M}\sum_{k=0}^{2M-1} S(k)\,\exp\!\left(-\frac{i\pi}{4M}\left(2n+1+\frac{2M}{2}\right)(2k+1)\right)$$

Before the IMDFT, $S(k)$ must be extended to length $2M$:

$$S(k) = -\mathrm{conj}(S(2M-1-k)),\quad k = M,\ldots,2M-1$$

where $n$ is the sample index; $2M$ is the frame length, i.e. the number of time-domain samples, taking the value 2048 or 256; $k$ is the spectral index; and $\mathrm{conj}$ denotes complex conjugation.
Second, windowing is applied in the time domain to the time-domain signal obtained by the IMDFT. To satisfy the perfect-reconstruction condition of the filter bank, the window function $w(n)$ must meet two conditions: $w(2M-1-n) = w(n)$ and $w^2(n) + w^2(n+M) = 1$.
Typical window functions include the sine window and the KBD window. Alternatively, a biorthogonal transform can be used, with specific analysis and synthesis filters relaxing the above constraints on the window function.
Finally, overlap-add is applied to the windowed time-domain signal to obtain the time-domain audio signal. Specifically, the first $M$ samples of the current frame's windowed signal are overlap-added with the last $M$ samples of the previous frame's signal, giving the $M$ output time-domain audio samples: $timeSam_{i,n} = preSam_{i,n} + preSam_{i-1,n+M}$, where $i$ is the frame index and $n$ the sample index, $0 \le n < M$.
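The window conditions and the overlap-add step can be checked and sketched as follows, using the sine window (which satisfies both conditions) at a toy frame size; the names are illustrative.

```python
import numpy as np

M = 4                                    # half frame length (toy size)
n = np.arange(2 * M)
w = np.sin(np.pi / (2 * M) * (n + 0.5))  # sine window

# the two perfect-reconstruction conditions quoted in the text
assert np.allclose(w[2 * M - 1 - n], w)            # w(2M-1-n) = w(n)
assert np.allclose(w[:M] ** 2 + w[M:] ** 2, 1.0)   # w^2(n) + w^2(n+M) = 1

def overlap_add(cur_windowed, prev_windowed):
    """timeSam[i, n] = preSam[i, n] + preSam[i-1, n+M]: first M windowed
    samples of the current frame plus the last M of the previous frame."""
    return cur_windowed[:M] + prev_windowed[M:]
```

Each call yields the M output samples of one frame; the decoder keeps the previous windowed frame as state between calls.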
In step 26, the resampling is implemented in the same way as at the encoding device. Note that if the sampling frequency of the time-domain signal after the IMDFT is already suitable for sound playback, the resampling operation may be omitted.
The stereo encoding apparatus and method of the preferred embodiment of the present invention are introduced below.
Figure 11 is a structural block diagram of the stereo encoding apparatus according to the preferred embodiment of the invention. The apparatus comprises: resampling module 1101, sum-signal type decision module 1102, MDCT conversion module 1103, low frequency stereo coding module 1104, MDCT-to-MDFT conversion module 1105, high-frequency parameter coding module 1106 and bit stream multiplexing module 1107.
The connections and functions of the modules in Figure 11 are briefly introduced first, wherein:
Resampling module 1101 converts the input digital audio signals in the two channels from the original sampling rate to the target sampling rate, and outputs the resampled signals in the two channels, in units of frames, to sum-signal type decision module 1102 and MDCT conversion module 1103. Note that if the input digital audio signals in the two channels already have the target sampling rate, the encoding apparatus according to the principles of the present invention need not comprise this module, and the digital audio signals in the two channels can be input directly to sum-signal type decision module 1102 and MDCT conversion module 1103.
Sum-signal type decision module 1102 computes the sum signal from the left and right channels (L, R) of the resampled stereo signal, performs signal type analysis on the sum signal, and outputs the sum-signal type analysis result. Owing to the complexity of the signal itself, the sum-signal type can be expressed in several forms. For example, if the frame's sum signal is a slowly-varying signal, a flag indicating that the frame's sum signal is slowly varying is output directly; if it is a transient signal, the position where the transient occurs is additionally computed, and a flag indicating that the frame's sum signal is transient is output together with the transient position. The signal type analysis result is output to MDCT conversion module 1103 to control the MDCT order, and is also output to bit stream multiplexing module 1107. Note that if a closed-loop search is used to determine the sum-signal type analysis result, the encoding apparatus according to the principles of the present invention need not comprise this module.
MDCT conversion module 1103 applies MDCTs of different order lengths according to the sum-signal type analysis result output by sum-signal type decision module 1102, maps the resampled sound signals in the two channels to the MDCT domain, and outputs the MDCT coefficients of the sound signals in the two channels to low frequency stereo coding module 1104 and MDCT-to-MDFT conversion module 1105. If the stereo encoding apparatus according to the principles of the present invention does not comprise sum-signal type decision module 1102, the transform order is not controlled during the MDCT. Specifically, if the frame's sum signal is slowly varying, the sound signals in the two channels are each MDCT-transformed in units of frames, selecting a transform of longer order; if it is transient, the sound signals in the two channels are divided into subframes and MDCT-transformed in units of subframes, selecting an MDCT of shorter order. The MDCT coefficients in each of the two channels are divided into a low frequency spectrum and a high frequency spectrum; the low frequency spectra of the two channels are output to low frequency stereo coding module 1104, and the low frequency spectra and high frequency spectra of the two channels together with the sum-signal type analysis result are output to MDCT-to-MDFT conversion module 1105.
Low frequency stereo coding module 1104 receives the low frequency spectra in the MDCT domain of the two channels from MDCT conversion module 1103, divides the low frequency spectrum into several sub-bands, applies a stereo coding mode to each sub-band to obtain the low frequency stereo coded data, and outputs them to bit stream multiplexing module 1107 as the sound coding data of the sound coding bit stream. The stereo coding modes comprise a sum/difference stereo coding mode, a parametric stereo coding mode and a parametric-error stereo coding mode. During stereo coding, each sub-band selects one of these three coding modes. The coding mode selection information is simultaneously output to bit stream multiplexing module 1107 as side information.
MDCT-to-MDFT conversion module 1105 receives the MDCT coefficients of the two channels from MDCT conversion module 1103, converts them into MDFT coefficients of the two channels containing phase information, and outputs the MDFT coefficients of the two channels to high-frequency parameter coding module 1106.
High-frequency parameter coding module 1106 receives from MDCT-to-MDFT conversion module 1105 the low frequency spectra and high frequency spectra in the MDFT domain of the two channels, and extracts from them the high-frequency parameters of the two channels, which are used to recover the high frequency spectra of the two channels from their low frequency spectra. High-frequency parameter coding module 1106 then quantization-encodes the extracted high-frequency parameters of the two channels to obtain their high-frequency parameter coded data, and outputs these data to bit stream multiplexing module 1107 as the enhancement data of the sound coding bit stream.
Bit stream multiplexing module 1107 multiplexes the sound coding data and side information received from sum-signal type decision module 1102, low frequency stereo coding module 1104 and high-frequency parameter coding module 1106 to form the stereo sound coding bit stream.
In the present embodiment, MDCT conversion module 1103, MDCT-to-MDFT conversion module 1105 and high-frequency parameter coding module 1106 each process the left and right channels of the stereo signal separately, using the same processing methods as the modules of the same name in the mono sound encoding apparatus. Each of these three modules can therefore be realized by combining the modules of the same name from two mono sound encoding apparatuses, thereby implementing stereo processing.
It can be seen that the difference from the mono sound encoding apparatus of the preferred embodiment of the present invention is that the mono apparatus generates the sound coding data of the sound coding bit stream with low frequency waveform coding module 104, whereas the stereo encoding apparatus generates it with low frequency stereo coding module 1104, which divides the low frequency spectrum into sub-bands and applies stereo coding to each sub-band.
The stereo encoding method according to the preferred embodiment of the invention is described in detail below; the method comprises the following steps:
Step 31: resample the input digital audio signals in the two channels respectively;
Step 32: compute the sum signal from the resampled sound signals in the two channels and perform signal type analysis on the sum signal; if it is a slowly-varying signal, the signal type is directly taken as the sum-signal type analysis result; if it is a transient signal, the position where the transient occurs is additionally computed, and the signal type together with the transient position is taken as the sum-signal type analysis result.
Step 33: according to the sum-signal type analysis result, apply MDCTs of different order lengths to the resampled sound signals in the two channels respectively, obtaining the sound signals of the two channels in the MDCT domain.
Step 34: divide the MDCT coefficients of each of the two channels into a low frequency spectrum and a high frequency spectrum.
Step 35: divide the low frequency spectrum of each of the two channels into several sub-bands and apply stereo coding to each sub-band, obtaining the low frequency stereo coded data.
Step 36: convert the MDCT coefficients of the two channels into the MDFT coefficients of the two channels, obtaining the low frequency spectra and high frequency spectra of the two channels in the MDFT domain;
Step 37: according to the low frequency spectra and high frequency spectra of the two channels in the MDFT domain, extract the high-frequency parameters used to recover the high frequency spectra of the two channels from their low frequency spectra, and quantization-encode the high-frequency parameters of the two channels to obtain their high-frequency parameter coded data.
Step 38: multiplex the low frequency stereo coded data, the high-frequency parameter coded data of the two channels and the side information, obtaining the stereo sound coding bit stream.
The resampling method in step 31, the signal type decision method in step 32, the MDCT method in step 33, the MDCT-to-MDFT conversion method in step 36 and the high-frequency parameter coding method in step 37 were all introduced in the embodiment of the encoding method of the mono encoding apparatus of the present invention; the encoding method of the stereo encoding apparatus of the present invention adopts the same methods, so they are not repeated here.
The low frequency stereo coding process of step 35 is as follows: the low frequency spectra of the two channels are first divided into several sub-bands respectively (each sub-band division is applied to the low frequency spectra of both channels); then for each sub-band, one of the three coding modes, namely the sum/difference stereo coding mode, the parametric stereo coding mode and the parametric-error stereo coding mode, is selected to encode the spectra of the two channels in that sub-band. Two implementation methods for selecting the coding mode are given first:
Coding mode selection implementation method 1: encode and decode the low frequency spectra of the two channels with each of the three coding modes using the same number of bits, compute the error between the low frequency spectra of the two channels recovered by decoding and the low frequency spectra before encoding, and select the coding mode with the smallest error as the stereo coding mode. The coding mode selection information is output to bit stream multiplexing module 1107 as side information;
Coding mode selection implementation method 2: for the lower sub-bands whose frequency lies below a fixed value, for example the sub-bands below 1 kHz, encode and decode with the sum/difference stereo coding mode and the parametric stereo coding mode respectively, compute the error between the recovered low frequency spectra of the two channels and the low frequency spectra before encoding, select the coding mode with the smaller error, and output the coding mode selection information to bit stream multiplexing module 1107 as side information; for the higher sub-bands whose frequency lies above this value, for example the sub-bands above 1 kHz, adopt the parametric stereo coding mode. In this case, the selection information of the parametric stereo coding mode may or may not be output to bit stream multiplexing module 1107.
Of course, a fixed stereo coding mode may also be adopted in practical applications; in that case, no coding mode selection information needs to be output to bit stream multiplexing module 1107 as side information.
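The closed-loop selection of method 1 can be sketched generically as follows. The mode round-trip functions are stand-ins, not the patent's actual coders; only the select-by-minimum-error structure is taken from the text.

```python
import numpy as np

def select_mode(left, right, modes):
    """Closed-loop mode selection: run every candidate coder over the
    sub-band spectra, measure the error after decoding, and keep the mode
    with the smallest error.  'modes' maps a mode name to a round-trip
    function (encode followed by decode) returning reconstructed (L, R)."""
    best_name, best_err = None, np.inf
    for name, roundtrip in modes.items():
        l_rec, r_rec = roundtrip(left, right)
        err = np.sum((left - l_rec) ** 2 + (right - r_rec) ** 2)
        if err < best_err:
            best_name, best_err = name, err
    return best_name
```

The returned mode name is what would be written to the bit stream as the coding-mode side information.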
The implementation of each of the three stereo coding modes is described in detail below.
Figure 12 is a model diagram of the sum/difference stereo coding mode according to the preferred embodiment of the invention. The sum/difference stereo coding mode computes, from the low frequency spectra of a sub-band in the two channels, one sum spectrum and one difference spectrum for that sub-band. The specific implementation is as follows:
From the left and right channel spectra $\vec{L}$ and $\vec{R}$, the corresponding sum spectrum $\vec{M}$ and difference spectrum $\vec{S}$ are computed; after waveform quantization coding, the coded $\vec{M}$ and $\vec{S}$ are output to bit stream multiplexing module 1107 as the low frequency stereo coded data. $\vec{M}$ and $\vec{S}$ are computed as:
$$\vec{M} = (\vec{L} + \vec{R})/2$$
$$\vec{S} = (\vec{L} - \vec{R})/2$$
The waveform quantization coding of $\vec{M}$ and $\vec{S}$ can adopt the method by which low frequency waveform coding module 304 of the mono sound encoding apparatus quantization-encodes the low frequency spectrum.
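The sum/difference pair and its inverse can be sketched as follows; quantization is omitted, so the round trip here is exact.

```python
import numpy as np

def ms_encode(L, R):
    """Sum/difference stereo per sub-band: M = (L+R)/2, S = (L-R)/2."""
    return (L + R) / 2, (L - R) / 2

def ms_decode(M, S):
    """Inverse: L = M+S, R = M-S (exact when M and S are not quantized)."""
    return M + S, M - S
```

In the codec itself, M and S would be waveform-quantized before transmission, so reconstruction is exact only up to quantization error.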
Figure 13 is a model diagram of the parametric stereo coding mode according to the preferred embodiment of the invention. The parametric stereo coding mode computes, from the low frequency spectra of sub-band k in the two channels, one mono spectrum for sub-band k, together with the parameters used to recover the low frequency spectra of the two channels in sub-band k from that mono spectrum. Two specific implementations of parametric stereo coding are given below.
Parametric stereo coding implementation method 1 comprises the following steps:
Step 35-1a: in sub-band k, for one channel, e.g. the right channel $\vec{R}$, compute the weighting parameter $g_r(k)$ and the scaled spectrum of that channel $\vec{R}' = \vec{R}/g_r(k)$, so that after scaling $\vec{R}'$ and $\vec{L}$ have equal energy; $g_r(k)$ can be computed as:
$$g_r(k) = \sqrt{\frac{E_R(k)}{E_L(k)}}$$
where $E_R(k)$ and $E_L(k)$ are the energies of the right and left channels in sub-band k, respectively.
Step 35-1b: for each frequency bin i in sub-band k, compute the weighted sum spectrum $\vec{M}'$ and weighted difference spectrum $\vec{S}'$ of that bin. Since, after scaling, the left/right energy ratio of the bins within sub-band k is statistically approximately identical, the energies of $\vec{L}$ and $\vec{R}'$ are approximately equal, and the weighted sum spectrum $\vec{M}'$ and weighted difference spectrum $\vec{S}'$ are therefore nearly orthogonal. The computation is:
$$\vec{M}' = (\vec{L} + \vec{R}')/2 = \left[\vec{L} + \frac{1}{g_r(k)}\vec{R}\right]/2$$
$$\vec{S}' = (\vec{L} - \vec{R}')/2$$
Step 35-1c: generate the quadrature spectrum $\vec{D}$, of equal amplitude and orthogonal to the weighted sum spectrum $\vec{M}'$, and from the quadrature spectrum $\vec{D}$ and the weighted difference spectrum $\vec{S}'$ compute the quadrature-spectrum weighting parameter $g_d(k)$, such that the quadrature spectrum scaled by $g_d(k)$ and $\vec{S}'$ have equal energy. $g_d(k)$ can be computed as:
$$g_d(k) = \sqrt{\frac{E_S(k)}{E_D(k)}}$$
where $E_S(k)$ and $E_D(k)$ are the energies of the weighted difference spectrum $\vec{S}'$ and the quadrature spectrum $\vec{D}$ in sub-band k, respectively.
Step 35-1d: the above weighted sum spectrum $\vec{M}'$, $g_r(k)$ and $g_d(k)$ are each quantization-encoded and output to bit stream multiplexing module 1107. The quantization-encoded $\vec{M}'$ constitutes the low frequency stereo coded data, and the quantization-encoded $g_r(k)$ and $g_d(k)$ are side information.
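Steps 35-1a through 35-1d can be sketched as follows for one sub-band of complex spectra. The particular choice of quadrature spectrum, $\vec{D} = j\vec{M}'$ (a per-bin 90-degree rotation, which keeps $|\vec{D}| = |\vec{M}'|$ and $\vec{D}$ orthogonal to $\vec{M}'$), is an assumption of this sketch, as is the matching decoder.

```python
import numpy as np

def param_stereo_encode(L, R):
    """Method 1 for one sub-band: scale R so its energy matches L, form
    weighted sum/difference spectra, and describe the difference by the
    gain g_d of a quadrature spectrum D = j*M'.  Only M', g_r and g_d
    would be transmitted."""
    g_r = np.sqrt(np.sum(np.abs(R) ** 2) / np.sum(np.abs(L) ** 2))
    Rp = R / g_r                       # scaled right channel R'
    Mp = (L + Rp) / 2                  # weighted sum spectrum M'
    Sp = (L - Rp) / 2                  # weighted difference spectrum S'
    D = 1j * Mp                        # quadrature spectrum
    g_d = np.sqrt(np.sum(np.abs(Sp) ** 2) / np.sum(np.abs(D) ** 2))
    return Mp, g_r, g_d

def param_stereo_decode(Mp, g_r, g_d):
    """Rebuild L and R from the transmitted mono spectrum and gains."""
    Sp = g_d * (1j * Mp)               # approximate difference spectrum
    return Mp + Sp, g_r * (Mp - Sp)
```

The reconstruction is parametric rather than exact; it becomes exact in the degenerate case where one channel is a scaled copy of the other, since then $\vec{S}' = 0$ and $g_d = 0$.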
Unlike implementation method 1, the parameters $g_r(k)$ and $g_d(k)$ and the weighted sum spectrum $\vec{M}'$ in parametric stereo coding implementation method 2 are obtained according to a minimum-error principle, comprising the following steps:
Step 35-2a: for sub-band k, compute the first parameter $g_d(k)$ according to:
$$g_d(k) = \frac{-b(k) + \sqrt{b^2(k) + a^2(k)}}{a(k)}$$
where
$$a(k) = \sum_{i \in band(k)} \left(x_r[i,k]\,y_l[i,k] - x_l[i,k]\,y_r[i,k]\right),$$
$$b(k) = \sum_{i \in band(k)} \left(x_l[i,k]\,x_r[i,k] + y_l[i,k]\,y_r[i,k]\right)$$
and $x_l$, $y_l$ are the real and imaginary parts of the left channel low frequency spectrum, and $x_r$, $y_r$ are the real and imaginary parts of the right channel low frequency spectrum;
Step 35-2b: for sub-band k, calculate the second parameter g_r(k) according to the following formula:

g_r(k) = [ −( c(k) − d(k) ) + sqrt( ( c(k) − d(k) )² + g(k)·m²(k) ) ] / ( g(k)·m(k) )

where

c(k) = Σ_{i∈band(k)} ( x_l[i,k]·x_l[i,k] + y_l[i,k]·y_l[i,k] );

d(k) = Σ_{i∈band(k)} ( x_r[i,k]·x_r[i,k] + y_r[i,k]·y_r[i,k] );

m(k) = [ 2b(k)·( 1 − g_d²(k) ) + 2a(k)·g_d(k) ] / ( 1 + g_d²(k) )
Step 35-2c: for each frequency bin i in sub-band k, calculate the weighted sum spectrum M′[i,k] according to the following formulas:

x_m[i,k] = [ x_l[i,k] + g_d(k)·y_l[i,k] + g(k)·g_r(k)·( x_r[i,k] − g_d(k)·y_r[i,k] ) ] / [ ( 1 + g_d²(k) )·( 1 + g(k)·g_r²(k) ) ]

y_m[i,k] = [ −g_d(k)·x_l[i,k] + y_l[i,k] + g(k)·g_r(k)·( g_d(k)·x_r[i,k] + y_r[i,k] ) ] / [ ( 1 + g_d²(k) )·( 1 + g(k)·g_r²(k) ) ]

where x_m and y_m are the real and imaginary parts of the weighted sum spectrum M′, and g(k) is the importance factor of parameter stereo coding in sub-band k. It reflects the distribution of the parameter stereo coding error between the left and right channels and can be chosen according to the signal characteristics; for example, g(k) can equal the ratio of the left-channel to right-channel energy in sub-band k, E_l(k)/E_r(k).
Step 35-2d: the weighted sum spectrum M′ and the parameters g_r(k) and g_d(k) are each quantized and encoded, then output to bit stream multiplexing module 1107. The quantized and encoded M′ constitutes the low-frequency stereo coded data, and the quantized and encoded g_r(k) and g_d(k) constitute the side information.
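For reference, the minimum-error parameter computation of steps 35-2a through 35-2c can be sketched as below (Python with NumPy). The function name is hypothetical; the formulas are transcribed from the equations above as reconstructed from the text, and the importance factor g(k) is supplied by the caller:

```python
import numpy as np

def encode_min_error_band(L, R, g):
    """Sketch of steps 35-2a..35-2c for one sub-band.
    L, R: complex low-frequency spectra of the sub-band; g: importance
    factor g(k), e.g. E_l(k)/E_r(k). Assumes a(k) != 0 and m(k) != 0."""
    xl, yl = L.real, L.imag
    xr, yr = R.real, R.imag
    a = np.sum(xr * yl - xl * yr)
    b = np.sum(xl * xr + yl * yr)
    c = np.sum(xl * xl + yl * yl)                       # left-channel energy
    d = np.sum(xr * xr + yr * yr)                       # right-channel energy
    g_d = (-b + np.sqrt(b * b + a * a)) / a             # step 35-2a
    m = (2*b*(1 - g_d**2) + 2*a*g_d) / (1 + g_d**2)     # auxiliary term m(k)
    g_r = (-(c - d) + np.sqrt((c - d)**2 + g * m**2)) / (g * m)  # step 35-2b
    denom = (1 + g_d**2) * (1 + g * g_r**2)
    x_m = (xl + g_d*yl + g*g_r*(xr - g_d*yr)) / denom   # step 35-2c, real part
    y_m = (-g_d*xl + yl + g*g_r*(g_d*xr + yr)) / denom  # step 35-2c, imaginary part
    return g_d, g_r, x_m + 1j * y_m
```

Note that the formula for g_d(k) is the positive root of a·g_d² + 2b·g_d − a = 0, which can serve as a quick sanity check on the transcription.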
Figure 14 is a model diagram of the parameter-error stereo coding mode according to the preferred embodiment of the invention. The parameter-error stereo coding mode calculates, from the low-frequency spectra of a sub-band in the two channels, a mono spectrum of that sub-band, an error spectrum, and the parameters by which the low-frequency spectra of the sub-band in the two channels are recovered from the mono spectrum and the error spectrum.
Compared with the computation model of the parameter stereo coding mode, when higher coding precision is required the parameter-error stereo coding mode is adopted: the spectral error, i.e. the error spectrum E, is additionally calculated, and the error spectrum is also waveform-quantized and encoded. The specific implementation of the parameter-error stereo coding mode comprises the following steps:
Step 35-3a: for one channel in sub-band k, e.g. the right channel R, calculate the weighting parameter g_r(k) of that channel and obtain its scaled spectrum R′. Because the energy ratio of the left and right channels at each frequency bin i in the parameter-extraction band is statistically approximately the same, L and R′ are approximately equal in energy, so the weighted sum spectrum M′ and the weighted difference spectrum S′ are nearly orthogonal. The computation of g_r(k) is identical to that in step 35-1a.
Step 35-3b: for each frequency bin i in this sub-band, calculate the weighted sum spectrum M′[i,k] and the weighted difference spectrum S′[i,k] of that bin.
Step 35-3c: produce the quadrature spectrum D, which is of equal amplitude with and perpendicular to the weighted sum spectrum M′.
Step 35-3d: according to the quadrature spectrum D and the weighted difference spectrum S′, calculate the weighting parameter g_d(k), and obtain the quadrature spectrum D′ scaled by g_d(k); the computation of g_d(k) is identical to that in step 35-1c.
Step 35-3e: obtain the error spectrum E as the difference between the weighted difference spectrum S′ and the scaled quadrature spectrum D′, i.e. E = S′ − D′.
Step 35-3f: the weighted sum spectrum M′, the error spectrum E and the parameters g_r(k) and g_d(k) are each quantized and encoded, then output to bit stream multiplexing module 1107. The quantized and encoded M′ and E constitute the low-frequency stereo coded data, and the quantized and encoded g_r(k) and g_d(k) constitute the side information.
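The encoder side of the parameter-error mode can be sketched by extending the method-1 computation with the error spectrum of step 35-3e (Python with NumPy; the function name is hypothetical and, as before, the closed form sqrt(E_R/E_L) used for g_r(k) is an assumption):

```python
import numpy as np

def encode_param_error_band(L, R):
    """Sketch of steps 35-3a..35-3f for one sub-band: parameter stereo
    plus a waveform-coded error spectrum E = S' - g_d * D."""
    g_r = np.sqrt(np.sum(np.abs(R)**2) / np.sum(np.abs(L)**2))  # assumed form of g_r(k)
    R_p = R / g_r                    # step 35-3a: scaled right channel
    M = (L + R_p) / 2                # step 35-3b: weighted sum spectrum
    S = (L - R_p) / 2                #             weighted difference spectrum
    D = 1j * M                       # step 35-3c: quadrature spectrum
    g_d = np.sqrt(np.sum(np.abs(S)**2) / np.sum(np.abs(D)**2))  # step 35-3d
    E = S - g_d * D                  # step 35-3e: error spectrum
    return M, E, g_r, g_d            # step 35-3f: all four are quantized and sent
```

Because E carries exactly the residual that the parametric model misses, the decoder can reconstruct S′ losslessly (up to quantization) as g_d·D + E.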
The stereo decoding apparatus and method of the preferred embodiment of the present invention are introduced below.
Figure 15 is a structural block diagram of the stereo decoding apparatus according to the preferred embodiment of the invention. As shown in Figure 15, the stereo decoding apparatus of the preferred embodiment comprises: bit stream demultiplexing module 1501, low-frequency stereo decoding module 1502, MDCT-to-MDFT conversion module 1503, high-frequency parameter decoding module 1504, IMDFT transform module 1505 and resampling module 1506.
The connections and functions of the modules shown in Figure 15 are described below.
Bit stream demultiplexing module 1501 demultiplexes the received sound coding stream to obtain the sound coded data and side information of the corresponding data frame. It outputs the corresponding coded data and side information to low-frequency stereo decoding module 1502; this side information includes a flag indicating whether the inverse of low-frequency redundancy removal is to be performed. The side information output to high-frequency parameter decoding module 1504 includes the tonality adjustment type, tonality adjustment parameters, gain adjustment parameters and the positions where fast changes occur. The control signal output to IMDFT module 1505 is the signal type parameter. When low-frequency stereo coding module 1104 at the encoding end outputs coding mode selection information, the coding mode selection information is also output to low-frequency stereo decoding module 1502 as side information.
Low-frequency stereo decoding module 1502 performs stereo decoding on the low-frequency stereo coded data according to the coding mode selection information in the side information output by bit stream demultiplexing module 1501, obtains the low-frequency spectra of the two channels, and sends them to IMDFT transform module 1505 and MDCT-to-MDFT conversion module 1503.
MDCT-to-MDFT conversion module 1503 receives the output of the low-frequency stereo decoding module, converts the low-frequency spectrum coefficients of the two channels from the MDCT domain to the MDFT domain, and outputs the MDFT-domain low-frequency spectrum data of the two channels to high-frequency parameter decoding module 1504.
High-frequency parameter decoding module 1504 recovers the MDFT-domain high-frequency spectra of the two channels according to the MDFT-domain low-frequency spectra of the two channels and the high-frequency parameter coded data of the two channels output by bit stream demultiplexing module 1501.
IMDFT transform module 1505 combines the MDFT-domain low-frequency and high-frequency spectra of the two channels and performs the IMDFT transform, adopting IMDFT transforms of different lengths according to the signal type side information, to obtain the decoded stereo signal of the frame.
Resampling module 1506 converts the sampling frequency of the decoded stereo signal output by IMDFT module 1505 to a sampling frequency suitable for sound playback. Note that if the sampling frequency of the signal output by IMDFT module 1505 is already suitable for playback, the sound decoding apparatus of the present invention need not include this module.
In the present embodiment, MDCT-to-MDFT conversion module 1503, high-frequency parameter decoding module 1504, IMDFT transform module 1505 and resampling module 1506 each use two sets of the corresponding modules of the mono sound decoding apparatus to process the left- and right-channel signals respectively.
The stereo sound decoding method according to the preferred embodiment of the invention is described in detail below; the method comprises the following steps:
Step 41: demultiplex the sound coding stream, obtaining the low-frequency stereo coded data, the high-frequency parameter coded data of the two channels, and all side information used for decoding.
Step 42: perform stereo decoding on the low-frequency stereo coded data according to the low-frequency stereo coding mode selection information in the side information, obtaining the decoded low-frequency spectra of the two channels.
Step 43: convert the decoded low-frequency spectra of the two channels from the MDCT domain to the MDFT domain, obtaining the MDFT-domain low-frequency spectra of the two channels.
Step 44: according to the MDFT-domain low-frequency spectra of the two channels and the high-frequency parameter coded data of the two channels, recover the MDFT-domain high-frequency spectra of the two channels, obtaining the decoded high-frequency spectra of the two channels.
Step 45: combine the MDFT-domain low-frequency and high-frequency spectra of the two decoded channels and perform the IMDFT transform, obtaining the decoded stereo signal.
Step 46: resample the decoded stereo signal, converting its sampling rate to a sampling frequency suitable for sound playback.
The MDCT-to-MDFT conversion method in step 43, the high-frequency parameter decoding method in step 44, the IMDFT transform method in step 45 and the resampling method in step 46 have all been introduced in the embodiments of the decoding method of the mono decoding apparatus of the present invention; the stereo decoding apparatus of the present invention adopts the same methods, so they are not repeated here.
Step 42 performs stereo decoding according to the coding mode selection information. Corresponding to coding mode selection implementation method 1, the decoding method decodes the low-frequency stereo coded data of each sub-band according to the coding mode selection information. Corresponding to coding mode selection implementation method 2, the decoding method decodes the low-frequency stereo coded data of each sub-band among the lower-frequency sub-bands according to the coding mode selection information, and adopts the parameter stereo decoding mode for the higher-frequency sub-bands. Low-frequency stereo decoding comprises three stereo decoding modes.
The sum-difference stereo decoding mode recovers the low-frequency spectra of the two channels in a sub-band from the low-frequency sum spectrum and difference spectrum of that sub-band. The specific implementation is as follows:
Low-frequency stereo decoding module 1502 performs inverse-quantization decoding on the low-frequency stereo coded data received from bit stream demultiplexing module 1501 to obtain the low-frequency sum spectrum M and difference spectrum S, and recovers the low-frequency spectra of the left and right channels using the following formulas:

L = M + S

R = M − S
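The sum-difference reconstruction is the exact inverse of the encoder's M = (L + R)/2, S = (L − R)/2; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def decode_sum_diff(M, S):
    """Recover the left/right low-frequency spectra of a sub-band from
    its decoded sum spectrum M and difference spectrum S."""
    L = M + S
    R = M - S
    return L, R
```

Up to quantization noise in the decoded M and S, this round-trips the original channels exactly.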
The parameter stereo decoding mode recovers the left- and right-channel low-frequency spectra in a sub-band from the weighted sum spectrum M′ received by low-frequency stereo decoding module 1502 and the related parameters g_r(k) and g_d(k) in the side information. It corresponds to implementation methods 1 and 2 of the parameter stereo coding method at the encoding end; the decoding procedure for the two implementations is identical and comprises the following steps:
Step 42-1a: low-frequency stereo decoding module 1502 performs inverse-quantization decoding on the low-frequency stereo coded data and related parameters received from bit stream demultiplexing module 1501, obtaining the weighted sum spectrum M′ and the parameters g_r(k) and g_d(k).
Step 42-1b: produce the quadrature spectrum D, which is of equal amplitude with and perpendicular to the weighted sum spectrum M′, where D[i,k] = −y_m[i,k] + j·x_m[i,k].
Step 42-1c: scale the quadrature spectrum D by the obtained parameter g_d(k) to obtain the scaled quadrature spectrum D′.
Step 42-1d: obtain the spectra of the left and right channels from the weighted sum spectrum M′ and the scaled quadrature spectrum D′, where the spectrum of one channel (the right channel) is still in its scaled form; the computing formulas are as follows:

L = M′ + D′

R′ = M′ − D′

Step 42-1e: using the parameter g_r(k) obtained from the side information, rescale the scaled channel back to its original size, obtaining R.
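Steps 42-1b through 42-1e can be sketched as follows (Python with NumPy; the function name is illustrative). Note that D[i,k] = −y_m[i,k] + j·x_m[i,k] is exactly j·M′, and the channel assignment follows the encoder convention M′ = (L + R′)/2, S′ = (L − R′)/2:

```python
import numpy as np

def decode_param_stereo_band(M, g_r, g_d):
    """Sketch of steps 42-1b..42-1e for one sub-band.
    M: decoded weighted sum spectrum; g_r, g_d: decoded side-info parameters."""
    D = 1j * M              # step 42-1b: quadrature spectrum, -y_m + j*x_m
    D_p = g_d * D           # step 42-1c: scaled quadrature spectrum D'
    L = M + D_p             # step 42-1d: left channel
    R_scaled = M - D_p      #             right channel, still scaled
    R = g_r * R_scaled      # step 42-1e: undo the encoder's 1/g_r scaling
    return L, R
```

Since D′ only approximates the discarded weighted difference spectrum S′, the reconstruction is exact whenever S′ happened to be proportional to j·M′, and approximate otherwise; that residual is precisely what the parameter-error mode transmits.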
The parameter-error stereo decoding mode recovers the left- and right-channel spectra of a sub-band from the weighted sum spectrum M′ and error spectrum E obtained by low-frequency stereo decoding module 1502 and the corresponding parameters g_r(k) and g_d(k) in the side information. The specific implementation comprises the following steps:
Step 42-2a: low-frequency stereo decoding module 1502 performs inverse-quantization decoding on the low-frequency stereo coded data and related parameters received from bit stream demultiplexing module 1501, obtaining the weighted sum spectrum M′, the error spectrum E and the parameters g_r(k) and g_d(k).
Step 42-2b: produce the quadrature spectrum D, which is of equal amplitude with and perpendicular to the weighted sum spectrum M′.
Step 42-2c: scale the quadrature spectrum D by the obtained parameter g_d(k) to obtain the scaled quadrature spectrum D′.
Step 42-2d: add the scaled quadrature spectrum D′ to the error spectrum E, restoring the weighted difference spectrum S′.
Step 42-2e: obtain the spectra of the left and right channels from the weighted sum spectrum M′ and the weighted difference spectrum S′, where the spectrum of one channel (the right channel) is still in its scaled form;
Step 42-2f: using the parameter g_r(k), rescale the scaled channel back to its original size.
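A sketch of steps 42-2b through 42-2f (Python with NumPy; the function name is illustrative). Because the transmitted error spectrum restores S′ exactly, this mode round-trips the encoder losslessly up to quantization:

```python
import numpy as np

def decode_param_error_band(M, E, g_r, g_d):
    """Sketch of steps 42-2b..42-2f for one sub-band.
    M: weighted sum spectrum; E: error spectrum; g_r, g_d: side-info parameters."""
    D_p = g_d * (1j * M)    # steps 42-2b/42-2c: scaled quadrature spectrum D'
    S = D_p + E             # step 42-2d: restored weighted difference spectrum S'
    L = M + S               # step 42-2e: left channel
    R = g_r * (M - S)       # step 42-2f: right channel, rescaled to original size
    return L, R
```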
Obviously, many changes may be made to the invention described here without departing from its true spirit and scope. Therefore, all changes that would be apparent to those skilled in the art are intended to be included within the scope of the appended claims. The scope of protection of the present invention is limited only by the appended claims.

Claims (24)

1. A mono sound encoding apparatus, comprising:
a modified discrete cosine transform (MDCT) module, for mapping a digital sound signal from the time domain to the MDCT domain to obtain an MDCT-domain sound signal, and dividing the MDCT-domain sound signal into a low-frequency spectrum and a high-frequency spectrum;
a low-frequency waveform coding module, for quantizing and encoding the low-frequency spectrum of the MDCT-domain sound signal to obtain low-frequency waveform coded data;
an MDCT to modified discrete Fourier transform (MDFT) conversion module, for converting the low-frequency spectrum and high-frequency spectrum of the MDCT-domain sound signal into the low-frequency spectrum and high-frequency spectrum of the MDFT-domain sound signal;
a high-frequency parameter coding module, for calculating, according to the low-frequency spectrum and high-frequency spectrum of the MDFT-domain sound signal, high-frequency parameters used at the decoding end to recover the high-frequency spectrum from the low-frequency spectrum, and quantizing and encoding the high-frequency parameters to obtain high-frequency parameter coded data; and
a bit stream multiplexing module, for multiplexing the low-frequency waveform coded data and the high-frequency parameter coded data to output a sound coding stream.
2. The apparatus according to claim 1, further comprising:
a signal type analysis module, for performing signal type analysis on the digital sound signal before the MDCT module performs the mapping, to determine whether the digital sound signal is a fast-changed signal or a slowly-changed signal, and outputting the signal type analysis result to the MDCT module, the high-frequency parameter coding module and the bit stream multiplexing module, wherein
the MDCT module is further configured to adopt MDCT transforms of different lengths according to the signal type analysis result; the high-frequency parameter coding module is further configured to extract the high-frequency parameters according to the signal type analysis result; and the bit stream multiplexing module is further configured to multiplex the signal type analysis result together with the low-frequency waveform coded data and the high-frequency parameter coded data.
3. The apparatus according to claim 1, wherein the low-frequency waveform coding module further comprises a redundancy removal processing module, for performing redundancy removal processing on the low-frequency spectrum of the MDCT-domain sound signal before it is quantized and encoded.
4. The apparatus according to claim 1, wherein the high-frequency parameter coding module further comprises:
a spectrum mapper, for mapping specific bands of the low-frequency spectrum of the MDFT-domain sound signal to specific bands of the high-frequency spectrum, obtaining the mapped MDFT-domain high-frequency spectrum;
a tonality parameter extractor, for extracting, according to the mapped MDFT-domain high-frequency spectrum and the MDFT-domain high-frequency spectrum before mapping, the tonality parameters needed at the decoding end to adjust the tonality of the high-frequency spectrum; and
a gain parameter extractor, for extracting, according to the mapped MDFT-domain high-frequency spectrum and the MDFT-domain high-frequency spectrum before mapping, the gain parameters needed at the decoding end to adjust the gain of the high-frequency spectrum,
wherein the tonality parameters and the gain parameters are the high-frequency parameters used at the decoding end to recover the high-frequency spectrum from the low-frequency spectrum.
5. The apparatus according to claim 1, further comprising:
a resampling module, for converting the digital sound signal from its original sampling rate to a target sampling rate before the MDCT module performs the mapping.
6. A mono sound encoding method, comprising:
mapping a digital sound signal from the time domain to the modified discrete cosine transform (MDCT) domain to obtain an MDCT-domain sound signal, and dividing the MDCT-domain sound signal into a low-frequency spectrum and a high-frequency spectrum;
quantizing and encoding the low-frequency spectrum of the MDCT-domain sound signal to obtain low-frequency waveform coded data; converting the low-frequency spectrum and high-frequency spectrum of the MDCT-domain sound signal into the low-frequency spectrum and high-frequency spectrum of the modified discrete Fourier transform (MDFT) domain sound signal; calculating, according to the low-frequency spectrum and high-frequency spectrum of the MDFT-domain sound signal, high-frequency parameters used at the decoding end to recover the high-frequency spectrum from the low-frequency spectrum; and quantizing and encoding the high-frequency parameters to obtain high-frequency parameter coded data; and
multiplexing the low-frequency waveform coded data and the high-frequency parameter coded data to output a sound coding stream.
7. The method according to claim 6, further comprising:
performing signal type analysis on the digital sound signal before mapping it to the MDCT domain, to determine whether the digital sound signal is a fast-changed signal or a slowly-changed signal, and outputting the signal type analysis result;
adopting MDCT transforms of different lengths according to the signal type analysis result;
extracting the high-frequency parameters according to the signal type analysis result; and
multiplexing the signal type analysis result together with the low-frequency waveform coded data and the high-frequency parameter coded data.
8. The method according to claim 6, further comprising:
performing redundancy removal processing on the low-frequency spectrum of the MDCT-domain sound signal before quantizing and encoding it.
9. The method according to claim 6, wherein the step of calculating the high-frequency parameters further comprises:
mapping specific bands of the low-frequency spectrum of the MDFT-domain sound signal to specific bands of the high-frequency spectrum, obtaining the mapped MDFT-domain high-frequency spectrum;
extracting, according to the mapped MDFT-domain high-frequency spectrum and the MDFT-domain high-frequency spectrum before mapping, the tonality parameters needed at the decoding end to adjust the tonality of the high-frequency spectrum; and
extracting, according to the mapped MDFT-domain high-frequency spectrum and the MDFT-domain high-frequency spectrum before mapping, the gain parameters needed at the decoding end to adjust the gain of the high-frequency spectrum,
wherein the tonality parameters and the gain parameters are the high-frequency parameters used at the decoding end to recover the high-frequency spectrum from the low-frequency spectrum.
10. The method according to claim 6, further comprising:
converting the digital sound signal from its original sampling rate to a target sampling rate before mapping it from the time domain to the MDCT domain.
11. The method according to claim 6, wherein the conversion from MDCT to MDFT comprises one of the following:
performing a modified discrete sine transform (MDST) on the time-domain sound signal to obtain MDST-domain coefficients, and combining the MDST-domain coefficients with the MDCT-domain coefficients to obtain the MDFT-domain coefficients, wherein the MDCT-domain coefficients are obtained by mapping the sound signal to the MDCT domain;
reconstructing the time-domain signal from the MDCT-domain coefficients, performing an MDST on the reconstructed time-domain signal to obtain MDST-domain coefficients, and combining the MDCT-domain coefficients with the MDST-domain coefficients to obtain the MDFT-domain coefficients;
reconstructing the time-domain signal from the MDCT-domain coefficients, and performing an MDFT on the reconstructed time-domain signal to obtain the MDFT-domain coefficients; and
establishing the relation between the MDCT-domain coefficients of the current frame and of its preceding and following frames and the MDST-domain coefficients of the current frame, determining three conversion matrices by which the MDST-domain coefficients of the current frame are calculated from the MDCT-domain coefficients of these three frames, obtaining the MDST-domain coefficients from the MDCT-domain coefficients of the three frames and their corresponding conversion matrices, and then combining the MDCT-domain coefficients with the MDST-domain coefficients to obtain the MDFT coefficients.
12. A mono sound decoding apparatus, comprising:
a bit stream demultiplexing module, for demultiplexing a sound coding stream to obtain low-frequency waveform coded data and high-frequency parameter coded data;
a low-frequency waveform decoding module, for decoding the low-frequency waveform coded data to obtain the low-frequency spectrum decoded data of the sound signal in the modified discrete cosine transform (MDCT) domain;
an MDCT to modified discrete Fourier transform (MDFT) conversion module, for converting the low-frequency spectrum decoded data of the sound signal from the MDCT domain to the MDFT domain;
a high-frequency parameter decoding module, for mapping part of the spectrum data from the low-frequency spectrum of the MDFT-domain sound signal to the high-frequency part to obtain mapped high-frequency spectrum data, and then performing parameter decoding on the mapped high-frequency spectrum data according to the high-frequency parameter coded data to obtain high-frequency spectrum decoded data; and
an inverse modified discrete Fourier transform (IMDFT) module, for combining the low-frequency spectrum decoded data and the high-frequency spectrum decoded data and performing the IMDFT to obtain the sound decoded data in the time domain.
13. The apparatus according to claim 12, wherein the low-frequency waveform decoding module further comprises:
an inverse quantization module, for performing inverse-quantization decoding on the low-frequency waveform coded data to obtain inverse-quantized low-frequency spectrum data; and
a redundancy inverse processing module, for performing the inverse of redundancy removal on the inverse-quantized low-frequency spectrum data to obtain the low-frequency spectrum decoded data.
14. The apparatus according to claim 12, further comprising:
a resampling module, for converting the sampling frequency of the time-domain sound decoded data to a sampling frequency suitable for sound playback.
15. The apparatus according to claim 12, wherein the high-frequency parameter decoding module further comprises:
a spectrum mapping module, for mapping specific bands of the MDFT-domain low-frequency spectrum to specific bands of the high-frequency spectrum, obtaining the mapped high-frequency spectrum;
a tonality adjustment module, for performing tonality adjustment on the mapped high-frequency spectrum; and
a gain adjustment module, for performing gain adjustment on the tonality-adjusted high-frequency spectrum, obtaining the MDFT-domain high-frequency spectrum decoded data.
16. A mono sound decoding method, comprising:
demultiplexing a sound coding stream to obtain low-frequency waveform coded data and high-frequency parameter coded data;
decoding the low-frequency waveform coded data to obtain the low-frequency spectrum decoded data of the sound signal in the modified discrete cosine transform (MDCT) domain; converting the low-frequency spectrum decoded data from the MDCT domain to the modified discrete Fourier transform (MDFT) domain to obtain the MDFT-domain low-frequency spectrum decoded data; and performing parameter decoding on the high-frequency parameter coded data according to the MDFT-domain low-frequency spectrum decoded data to obtain the MDFT-domain high-frequency spectrum decoded data; and
combining the decoded MDFT-domain low-frequency spectrum decoded data and high-frequency spectrum decoded data and performing the inverse modified discrete Fourier transform (IMDFT) to obtain the decoded time-domain digital sound signal.
17. The method according to claim 16, wherein the step of decoding the low-frequency waveform coded data further comprises:
performing inverse-quantization decoding on the low-frequency waveform coded data to obtain the low-frequency spectrum decoded data; and
performing the inverse of redundancy removal on the low-frequency spectrum decoded data.
18. The method according to claim 16, further comprising:
converting the sampling frequency of the time-domain digital sound signal to a sampling frequency suitable for sound playback.
19. The method according to claim 16, wherein the conversion from MDCT to MDFT comprises one of the following:
reconstructing the time-domain sound signal from the MDCT-domain coefficients, performing a modified discrete sine transform (MDST) on the reconstructed time-domain signal to obtain MDST-domain coefficients, and combining the MDCT-domain coefficients with the MDST-domain coefficients to obtain the MDFT-domain coefficients, wherein the MDCT-domain coefficients are obtained by decoding the low-frequency waveform coded data;
reconstructing the time-domain signal from the MDCT-domain coefficients, and performing an MDFT on the reconstructed time-domain signal to obtain the MDFT-domain coefficients; and
establishing the relation between the MDCT-domain coefficients of the current frame and of its preceding and following frames and the MDST-domain coefficients of the current frame, determining three conversion matrices by which the MDST-domain coefficients of the current frame are calculated from the MDCT-domain coefficients of these three frames, obtaining the MDST-domain coefficients from the MDCT-domain coefficients of the three frames and their corresponding conversion matrices, and then combining the MDCT-domain coefficients with the MDST-domain coefficients to obtain the MDFT coefficients.
20. The method according to claim 16, wherein the step of performing parameter decoding on the high-frequency parameters further comprises:
mapping specific bands of the MDFT-domain low-frequency spectrum to specific bands of the high-frequency spectrum, obtaining the mapped high-frequency spectrum;
performing tonality adjustment on the mapped high-frequency spectrum; and
performing gain adjustment on the tonality-adjusted high-frequency spectrum, obtaining the MDFT-domain high-frequency spectrum decoded data.
21. A stereo encoding apparatus, comprising:
a modified discrete cosine transform (MDCT) module, for mapping the digital sound signals from the time domain to the MDCT domain, obtaining the MDCT-domain digital sound signals of the left and right channels, and dividing the MDCT-domain sound signals of the left and right channels into low-frequency spectra and high-frequency spectra;
a low-frequency stereo coding module, for performing stereo coding on the MDCT-domain low-frequency spectra of the left and right channels to obtain low-frequency stereo coded data;
an MDCT to modified discrete Fourier transform (MDFT) conversion module, for converting the MDCT-domain low-frequency spectra and high-frequency spectra of the left and right channels into MDFT-domain low-frequency spectra and high-frequency spectra;
a high-frequency parameter coding module, for calculating, according to the low-frequency spectra and high-frequency spectra of the MDFT-domain sound signals of the left and right channels, high-frequency parameters used at the decoding end to recover the high-frequency spectra from the low-frequency spectra of the left and right channels, and quantizing and encoding the high-frequency parameters to obtain the high-frequency parameter coded data of the left and right channels; and
a bit stream multiplexing module, for multiplexing the low-frequency stereo coded data and the high-frequency parameter coded data of the left and right channels to output a sound coding stream.
22. A stereo encoding method, comprising:
mapping the digital audio signal of each channel from the time domain to the modified discrete cosine transform (MDCT) domain to obtain the left- and right-channel digital audio signals in the MDCT domain, and dividing the left- and right-channel audio signals in the MDCT domain into a low-frequency spectrum and a high-frequency spectrum;
stereo-encoding the low-frequency spectra of the left and right channels in the MDCT domain to obtain low-frequency stereo encoded data; converting the low-frequency and high-frequency spectra of the left and right channels from the MDCT domain into low-frequency and high-frequency spectra in the modified discrete Fourier transform (MDFT) domain; calculating, from the low-frequency and high-frequency spectra of the left- and right-channel audio signals in the MDFT domain, the high-frequency parameters used at the decoding end to recover the high-frequency spectrum of each channel from its low-frequency spectrum; and quantizing and encoding the high-frequency parameters to obtain high-frequency-parameter encoded data for the left and right channels; and
multiplexing the low-frequency stereo encoded data and the left- and right-channel high-frequency-parameter encoded data into an output audio encoded bitstream.
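The encoder flow claimed above can be sketched as a pipeline. This is a structural sketch only: the transforms and coders are injected as callables because their internals belong to other claims, and the trivial stand-ins used in the test (identity "MDCT", energy-ratio "parameter") are hypothetical.

```python
import numpy as np

def encode_stereo_frame(left, right, split_bin, mdct, encode_low_stereo,
                        mdct_to_mdft, extract_hf_params):
    """Sketch of the claimed encoder flow: MDCT both channels, split each
    spectrum into low and high bands, jointly stereo-code the low bands,
    convert to the MDFT domain, and derive per-channel high-frequency
    parameters from the MDFT-domain low and high spectra."""
    specs = [mdct(ch) for ch in (left, right)]
    lows = [s[:split_bin] for s in specs]
    highs = [s[split_bin:] for s in specs]
    low_data = encode_low_stereo(lows[0], lows[1])      # joint low-band waveform coding
    hf_params = [extract_hf_params(mdct_to_mdft(lo), mdct_to_mdft(hi))
                 for lo, hi in zip(lows, highs)]        # per-channel HF parameters
    return {"low": low_data, "hf": hf_params}           # payload to be multiplexed
```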
23. A stereo decoding apparatus, comprising:
a bitstream demultiplexing module, configured to demultiplex the audio encoded bitstream to obtain low-frequency stereo encoded data and left- and right-channel high-frequency-parameter encoded data;
a low-frequency stereo decoding module, configured to stereo-decode the low-frequency stereo encoded data to obtain low-frequency-spectrum decoded data of the left- and right-channel audio signals in the modified discrete cosine transform (MDCT) domain;
an MDCT-to-modified-discrete-Fourier-transform (MDFT) conversion module, configured to convert the low-frequency-spectrum decoded data of the left- and right-channel audio signals from the MDCT domain into the MDFT domain to obtain the left- and right-channel low-frequency-spectrum decoded data in the MDFT domain;
a high-frequency parameter decoding module, configured to map portions of the left- and right-channel low-frequency spectra in the MDFT domain onto the high-frequency portion to obtain mapped left- and right-channel high-frequency-spectrum data, and then to parameter-decode the mapped high-frequency-spectrum data according to the left- and right-channel high-frequency-parameter encoded data to obtain the left- and right-channel high-frequency-spectrum decoded data; and
an inverse modified discrete Fourier transform (IMDFT) module, configured to combine the left- and right-channel low-frequency-spectrum decoded data and high-frequency-spectrum decoded data in the MDFT domain and perform the IMDFT to obtain stereo decoded data in the time domain.
24. A stereo decoding method, comprising:
demultiplexing the audio encoded bitstream to obtain low-frequency stereo encoded data and left- and right-channel high-frequency-parameter encoded data;
stereo-decoding the low-frequency stereo encoded data to obtain low-frequency-spectrum decoded data of the left- and right-channel audio signals in the modified discrete cosine transform (MDCT) domain; converting the low-frequency-spectrum decoded data of the left- and right-channel audio signals from the MDCT domain into the modified discrete Fourier transform (MDFT) domain to obtain the left- and right-channel low-frequency-spectrum decoded data in the MDFT domain; mapping portions of the left- and right-channel low-frequency spectra in the MDFT domain onto the high-frequency portion to obtain mapped left- and right-channel high-frequency-spectrum data; and parameter-decoding the mapped high-frequency-spectrum data according to the left- and right-channel high-frequency-parameter encoded data to obtain the left- and right-channel high-frequency-spectrum decoded data in the MDFT domain; and
combining the left- and right-channel low-frequency-spectrum decoded data and high-frequency-spectrum decoded data in the MDFT domain and performing an inverse modified discrete Fourier transform (IMDFT) to obtain stereo decoded data in the time domain.
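The decoder flow claimed above can be sketched in the same injected-callable style. This is a structural sketch under assumptions, not the patent's implementation: the demultiplexer, low-band stereo decoder, MDCT-to-MDFT conversion, high-band parameter decoder, and IMDFT are placeholders whose internals are defined by other claims, and the trivial stand-ins in the test are hypothetical.

```python
import numpy as np

def decode_stereo_frame(bitstream, demux, decode_low_stereo,
                        mdct_to_mdft, decode_high, imdft):
    """Sketch of the claimed decoder flow: demultiplex, stereo-decode the
    low band, convert it to the MDFT domain, reconstruct each channel's
    high band from its low band plus the decoded parameters, and run the
    inverse transform per channel."""
    low_data, hf_params = demux(bitstream)
    lows_mdct = decode_low_stereo(low_data)               # [left, right] low spectra
    out = []
    for low, params in zip(lows_mdct, hf_params):
        low_mdft = mdct_to_mdft(low)                      # real MDCT -> complex MDFT
        high_mdft = decode_high(low_mdft, params)         # map + tonality + gain
        out.append(imdft(np.concatenate([low_mdft, high_mdft])))
    return out                                            # [left, right] time signals
```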
CN201210085183.2A 2012-03-28 2012-03-28 A kind of sound codec devices and methods therefor Active CN103366750B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210085183.2A CN103366750B (en) 2012-03-28 2012-03-28 A kind of sound codec devices and methods therefor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210085183.2A CN103366750B (en) 2012-03-28 2012-03-28 A kind of sound codec devices and methods therefor

Publications (2)

Publication Number Publication Date
CN103366750A CN103366750A (en) 2013-10-23
CN103366750B true CN103366750B (en) 2015-10-21

Family

ID=49367950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210085183.2A Active CN103366750B (en) 2012-03-28 2012-03-28 A kind of sound codec devices and methods therefor

Country Status (1)

Country Link
CN (1) CN103366750B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105096957B (en) * 2014-04-29 2016-09-14 华为技术有限公司 Process the method and apparatus of signal
EP3614382B1 (en) * 2014-07-28 2020-10-07 Nippon Telegraph And Telephone Corporation Coding of a sound signal
CN105336334B (en) * 2014-08-15 2021-04-02 北京天籁传音数字技术有限公司 Multi-channel sound signal coding method, decoding method and device
EP3182411A1 (en) * 2015-12-14 2017-06-21 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for processing an encoded audio signal
CN108206021B (en) * 2016-12-16 2020-12-18 南京青衿信息科技有限公司 Backward compatible three-dimensional sound encoder, decoder and encoding and decoding methods thereof
CN107123428A (en) * 2017-04-26 2017-09-01 中国科学院微电子研究所 A kind of voice compressing system detected applied to abnormal sound and method
CN112599139B (en) * 2020-12-24 2023-11-24 维沃移动通信有限公司 Encoding method, encoding device, electronic equipment and storage medium
CN113220947A (en) * 2021-05-27 2021-08-06 支付宝(杭州)信息技术有限公司 Method and device for encoding event characteristics

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008519290A (en) * 2004-11-02 2008-06-05 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ Audio signal encoding and decoding using complex-valued filter banks
WO2009029036A1 (en) * 2007-08-27 2009-03-05 Telefonaktiebolaget Lm Ericsson (Publ) Method and device for noise filling
CN101521012A (en) * 2009-04-08 2009-09-02 武汉大学 Method and device for MDCT domain signal energy and phase compensation

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE10234130B3 (en) * 2002-07-26 2004-02-19 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Device and method for generating a complex spectral representation of a discrete-time signal


Similar Documents

Publication Publication Date Title
CN103366750B (en) A kind of sound codec devices and methods therefor
CN103366749B (en) A kind of sound codec devices and methods therefor
EP1943643B1 (en) Audio compression
CN101276587B (en) Audio encoding apparatus and method thereof, audio decoding device and method thereof
KR101589942B1 (en) Cross product enhanced harmonic transposition
CN101086845B (en) Sound coding device and method and sound decoding device and method
KR101747918B1 (en) Method and apparatus for decoding high frequency signal
KR101809592B1 (en) Audio encoder, audio decoder and related methods using two-channel processing within an intelligent gap filling framework
CN101067931B (en) Efficient configurable frequency domain parameter stereo-sound and multi-sound channel coding and decoding method and system
EP2056294B1 (en) Apparatus, Medium and Method to Encode and Decode High Frequency Signal
EP2212884B1 (en) An encoder
EP1852851A1 (en) An enhanced audio encoding/decoding device and method
CN101089951A (en) Band spreading coding method and device and decode method and device
CN103366751B (en) A kind of sound codec devices and methods therefor
EP1873753A1 (en) Enhanced audio encoding/decoding device and method
CN104103276A (en) Sound coding device, sound decoding device, sound coding method and sound decoding method
RU2409874C9 (en) Audio signal compression
JP7326285B2 (en) Method, Apparatus, and System for QMF-based Harmonic Transposer Improvements for Speech-to-Audio Integrated Decoding and Encoding
CN104078048A (en) Acoustic decoding device and method thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant