CN1154087C

CN1154087C - Improving sound quality of established low bit-rate audio coding systems without loss of decoder compatibility

Info

Publication number: CN1154087C
Application number: CNB008092699A
Authority: CN
Inventors: Yu-Li You; W. P. Smith; Z. Fejzo; S. Smyth
Original assignee: Digital Theater Systems Inc
Current assignee: DTS BVI Ltd
Priority date: 1999-06-21
Filing date: 2000-06-19
Publication date: 2004-06-16
Anticipated expiration: 2020-06-19
Also published as: EP1204970B1; JP4204227B2; US6226616B1; JP2003502704A; TW565826B; EP2228790B1; AU5745200A; WO2000079520A1; KR20020027364A; EP2228790A3; KR100606992B1; EP2228790A2; JP2008020931A; EP1204970A4; EP1204970A1; HK1043422A1; CN1357136A

Abstract

A multi-channel audio compression technology is presented that extends the range of sampling frequencies compared to existing technologies and/or lowers the noise floor while remaining compatible with those earlier generation technologies. The high-sampling frequency multi-channel audio is decomposed into core audio up to the existing sampling frequencies and a difference signal up to the sampling frequencies of the next generation technologies. The core audio is encoded using a first generation technology such as DTS, Dolby AC-3 or MPEG I or II such that the encoded core bit stream is fully compatible with a comparable decoder in the market. The difference signal is encoded using technologies that extend the sampling frequency and/or improve the quality of the core audio. The compressed difference signal is attached as an extension to the core bit stream. The extension data will be ignored by the first generation decoders but can be decoded by the second generation decoders. By summing the decoded core and extension audio signals together, a second generation decoder can effectively extend the audio signal bandwidth and/or improve the signal to noise ratio beyond that available through the core decoder alone.

Description

Method, encoder and decoder for improving the sound quality of low bit-rate audio coding systems

Field of the invention

The present invention relates to low bit-rate audio coding systems, and more particularly to a method of improving the sound quality of established low bit-rate audio coding systems without loss of decoder compatibility.

Background of the invention

Numerous low bit-rate audio coding systems are currently in use across a wide range of consumer and professional audio playback products and services. For example, the Dolby AC3 (Dolby Digital) audio coding system is a worldwide standard for encoding stereo and 5.1-channel audio soundtracks for Laser Disc, NTSC-coded DVD video and ATV, using bit rates up to 640 kbit/s. The MPEG I and MPEG II audio coding standards are widely used for stereo and multi-channel soundtrack encoding for PAL-coded DVD video, terrestrial digital radio broadcasting in Europe and satellite broadcasting in the United States, at bit rates up to 768 kbit/s. The DTS (Digital Theater Systems) Coherent Acoustics audio coding system is frequently used for studio-quality 5.1-channel audio soundtracks for Compact Disc, DVD video and Laser Disc, at bit rates up to 1536 kbit/s.

A major problem with these systems is their inflexible design: they cannot easily be upgraded to accommodate higher PCM sampling frequencies, longer PCM word lengths or higher system bit rates. This will become an important issue in the coming years as the music and film industries abandon the old compact disc digital audio format of 44.1 kHz sampling frequency and 16-bit word length and adopt the new mastering format of 96 kHz sampling and 24-bit word length.

As a result, audio delivery using existing audio coding systems such as AC-3, MPEG and DTS must adapt so that the benefits of this increased signal fidelity can pass to the consumer. Unfortunately, a large installed base of audio decoder processing chips (DSPs) that implement these decoder functions already resides in the existing consumer base. These decoders cannot easily be upgraded to accommodate the increasing sampling rates, word lengths or bit rates. Consequently, music and film content providers selling product through these media will be forced to continue supplying coded audio streams that comply with the old standards. This implies that in the future, delivery media such as DVD audio, ATV and satellite radio may be forced to deliver multiple bit streams, each conforming to a different standard. For example, one stream would be included so that owners of existing playback systems can receive and play the standard audio tracks, while a second stream would be present so that owners of newer equipment can play audio tracks encoded in the 96 kHz/24-bit PCM format and take advantage of the inherently higher fidelity.

The problem with this method of delivery is that many playback media may not be able to provide the extra bandwidth, or channel capacity, needed to send the additional audio streams. The bit rate of the additional bit streams (for example, those that support 96 kHz/24 bits) will be at least equal to, or greater than, the bit rate of those that support the old format. Hence the bit rate will most likely double or more in order to support two or more audio standards.

Summary of the invention

In view of the above problems, the present invention provides a coding method that extends the frequency range and lowers the noise floor while avoiding the delivery of duplicate audio data, and is therefore much more efficient at accommodating changes in PCM sampling rate, word length and coding bit rate.

This is accomplished with a "core" plus "extension" coding method, in which the traditional audio coding algorithm constitutes the "core" audio coder and remains unaltered. The audio data needed to represent higher audio frequencies (in the case of higher sampling rates), higher sample resolution (in the case of longer word lengths), or both, is transmitted as an "extension" stream. This allows audio content providers to include a single audio bit stream that is compatible with the different types of decoders resident in the consumer equipment base. The core stream is decoded by older decoders, which ignore the extension data, while newer decoders use both the core and extension data streams to give higher-quality sound reproduction.

A key feature of this system is that the extension data is generated by subtracting a reconstructed core signal (encoded/decoded and/or downsampled/upsampled) from the original "high fidelity" input signal. The resulting difference signal is encoded to produce the extension stream. With this technique, aliasing fold-back into either the core or the extension signal is avoided. Hence the quality of the core audio is unaffected by the inclusion of the extension stream. For the system to work in its most basic mode, only the latency, or delay, of the core coder needs to be known. This method can therefore be applied successfully to any audio coding system, even without knowledge of the coder's internal algorithms or implementation details. However, the system can be made to work more efficiently if the extension coder is designed to match the core coder over the frequency range of the core signal.
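The subtraction described above can be illustrated with a minimal Python sketch. It is not the patented coder: a uniform quantizer stands in for each lossy codec, and the names `quantize`, `encode`, `core_step` and `ext_step` are assumptions introduced only for illustration. The point is that the legacy output is exactly the core signal, while a newer decoder adds the decoded difference signal on top of it.

```python
import math

def quantize(x, step):
    """Uniform quantizer used as a stand-in for a lossy codec (encode + decode)."""
    return [step * round(v / step) for v in x]

def encode(signal, core_step=0.1, ext_step=0.01):
    """Return the reconstructed core and the reconstructed extension (difference) signal."""
    core = quantize(signal, core_step)            # what a legacy decoder would output
    diff = [s - c for s, c in zip(signal, core)]  # original minus reconstructed core
    ext = quantize(diff, ext_step)                # difference signal coded as the extension
    return core, ext

def rms_error(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

signal = [math.sin(2 * math.pi * 5 * n / 200) for n in range(200)]
core, ext = encode(signal)
legacy_out = core                                # older decoder: core only, extension ignored
new_out = [c + e for c, e in zip(core, ext)]     # newer decoder: core plus extension
print(rms_error(signal, legacy_out), rms_error(signal, new_out))
```

Because the extension is derived from the already reconstructed core, adding it cannot change what the legacy decoder produces; it only reduces the error left by the core.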

A first aspect of the present invention provides a multi-channel audio encoder for encoding a digital audio signal that is sampled at a known sampling rate and has an audio bandwidth, comprising: a decimation low-pass filter (LPF) configured to receive the digital audio signal and filter it to remove signal components above the audio bandwidth of a core encoder, the decimation LPF producing a filtered digital audio signal as its output; a decimator that downsamples the filtered signal to extract a core signal whose sampling rate is lower than that of the digital audio signal; the core encoder, whose sampling rate and audio bandwidth match the decimator, arranged to receive the core signal and encode it into core bits; a core decoder, with an input coupled to the core encoder to receive the core bits, that decodes the core bits to form a reconstructed core signal; an interpolator arranged to receive the reconstructed core signal and upsample it to the sampling rate of an extension encoder, producing an upsampled reconstructed core signal as its output; an interpolation LPF that receives the upsampled reconstructed core signal from the interpolator and filters it to remove interpolation aliasing, producing a filtered upsampled reconstructed core signal as its output; a summing node configured to receive the digital audio signal and the filtered upsampled reconstructed core signal and to subtract the latter from the digital audio signal to form a difference signal; the extension encoder, whose sampling rate and audio bandwidth equal those of the digital audio signal, arranged to receive the difference signal and encode it into extension bits; and a packer that packs the core bits and the extension bits into a bit stream in a core plus extension format, in which a first-generation audio decoder can extract and decode the core bits to reproduce an audio signal and a second-generation audio decoder can extract the core bits plus the extension bits to reproduce a higher-quality audio signal.

A second aspect of the present invention provides a multi-channel audio encoder for encoding a digital audio signal that is sampled at a known sampling rate and has an audio bandwidth, comprising: a decimation low-pass filter (LPF) configured to receive the digital audio signal and filter it to remove signal components above the audio bandwidth of a core encoder, the decimation LPF having a transition band near the audio bandwidth of the core encoder and producing a filtered digital audio signal as its output; a decimator that downsamples the filtered signal to extract a core signal whose sampling rate is lower than that of the digital audio signal; the core encoder, whose sampling rate and audio bandwidth match the decimator, arranged to receive the core signal and encode it into core bits; a core decoder, with an input coupled to the core encoder to receive the core bits, that decodes the core bits to form a reconstructed core signal; an interpolator arranged to receive the reconstructed core signal and upsample it to the sampling rate of an extension encoder, producing an upsampled reconstructed core signal as its output; an interpolation LPF that receives the upsampled reconstructed core signal from the interpolator and filters it to remove interpolation aliasing, producing a filtered upsampled reconstructed core signal as its output; a summing node configured to receive the digital audio signal and the filtered upsampled reconstructed core signal and to subtract the latter from the digital audio signal to form a difference signal; the extension encoder, whose sampling rate and audio bandwidth equal those of the digital audio signal, arranged to receive the difference signal and encode it into extension bits, the extension encoder allocating bits in the transition band and in the frequency bands above the transition band to extend the frequency range of the coded signal; and a packer that packs the core bits and the extension bits into a bit stream in a core plus extension format, in which a first-generation audio decoder can extract and decode the core bits to reproduce an audio signal and a second-generation audio decoder can extract the core bits plus the extension bits to reproduce a higher-quality audio signal.

A third aspect of the present invention provides a multi-channel audio encoder for encoding a digital audio signal that is sampled at a known sampling rate and has an audio bandwidth, comprising: a core encoder that extracts a core signal from the digital audio signal over an audio bandwidth and encodes it into core bits, the core encoder comprising an N-band filter bank that decomposes the core signal into N subbands, N subband encoders that generate the core bits, and N subband decoders that reconstruct N subband samples to form a reconstructed core signal; a summing node that forms a difference signal in the transform domain or the subband domain from the reconstructed core signal and the digital audio signal; and an extension encoder that encodes the difference signal into extension bits, the extension encoder being matched to the core encoder over its audio bandwidth and comprising: a two-band filter bank that splits the digital audio signal into a low band and a high band; an N-band filter bank, equivalent to that of the core encoder, that decomposes the digital audio signal in the low band into N subbands, the summing node residing in the extension encoder and comprising N subband nodes that respectively subtract the N reconstructed subband samples from the N subbands of the digital audio signal to form N difference subbands; N subband encoders that encode the N difference subbands to form low-band extension bits; an M-band filter bank that decomposes the digital signal in the high band into M subbands; and M subband encoders that encode the M subbands to form high-band extension bits.

A fourth aspect of the present invention provides a multi-channel audio encoder for encoding a digital audio signal that is sampled at a known sampling rate and has an audio bandwidth, comprising: a core encoder that extracts a core signal from the digital audio signal over an audio bandwidth and encodes it into core bits, the core encoder comprising an N-band filter bank that decomposes the core signal into N subbands, N subband encoders that generate the core bits, and N subband decoders that reconstruct N subband samples to form a reconstructed core signal; a summing node that forms a difference signal in the transform domain or the subband domain from the reconstructed core signal and the digital audio signal; and an extension encoder that encodes the difference signal into extension bits, the extension encoder being matched to the core encoder over its audio bandwidth and comprising: an L-band filter bank that decomposes the digital audio signal into N low-frequency subbands and M high-frequency subbands, the characteristics of the L-band filter bank over its N low-frequency subbands matching those of the N-band filter bank, the summing node residing in the extension encoder and comprising N subband nodes that respectively subtract the N reconstructed subband samples from the N subbands of the digital audio signal to form N difference subbands; N subband encoders that encode the N difference subbands to form low-band extension bits; and M subband encoders that encode the M subbands to form high-band extension bits.

A fifth aspect of the present invention provides a multi-channel black-box audio decoder for reconstructing multiple audio channels from a bit stream, each audio channel being sampled at a known sampling rate and having an audio bandwidth, comprising: an unpacker for reading in and storing the bit stream one frame at a time, each frame comprising a core field having core bits and an extension field having a sync word and extension bits, the unpacker extracting the core bits and detecting the sync word to extract and separate the extension bits; a core decoder that decodes the core bits to form a reconstructed core signal; an extension decoder that decodes the extension bits to form a reconstructed difference signal, the sampling rate and audio bandwidth of the extension decoder being greater than those of the core decoder; an interpolator that upsamples the reconstructed core signal to the sampling rate of the extension coder; a low-pass filter that filters the upsampled reconstructed core signal to attenuate interpolation aliasing; and a summing node that adds the reconstructed difference signal to the reconstructed core signal to improve the fidelity of the reconstructed core signal and extend its audio bandwidth.

A sixth aspect of the present invention provides a multi-channel audio decoder for reconstructing multiple audio channels from a bit stream, each audio channel being sampled at a known sampling rate and having an audio bandwidth, comprising: an unpacker for reading in and storing the bit stream one frame at a time, each frame comprising a core field having core bits and an extension field having a sync word and extension bits, the unpacker extracting the core bits and detecting the sync word to extract and separate the extension bits; N core subband decoders that decode the core bits into N core subband signals; N extension subband decoders that decode the extension bits into N low-frequency extension subband signals; M extension subband decoders that decode the extension bits into M high-frequency extension subband signals; N summing nodes that add the N core subband signals to the respective N extension subband signals to form N composite subband signals; and a filter that synthesizes the N composite subband signals and the M extension subband signals to reproduce the multi-channel audio signal.

A seventh aspect of the present invention provides a method of encoding a multi-channel digital audio signal that is sampled at a known sampling rate and has an audio bandwidth, which maintains compatibility with the existing base of first-generation audio decoders while providing higher-quality sound reproduction with second-generation audio decoders, characterized by the following steps: low-pass filtering the digital audio signal to remove signal components above a core audio bandwidth; downsampling the filtered signal to extract a core signal whose sampling rate matches a core sampling rate; encoding the core signal into core bits at the core sampling rate and audio bandwidth, in a manner compatible with the first-generation audio decoders and without aliasing fold-back, the core sampling rate and audio bandwidth being less than those of the digital audio signal; decoding the core bits with a first-generation audio decoder to form a reconstructed core signal; upsampling the reconstructed core signal to an extension sampling rate; low-pass filtering the upsampled reconstructed core signal to remove interpolation aliasing; subtracting the filtered signal from the digital audio signal to form a difference signal; encoding the difference signal into extension bits at an extension sampling rate and audio bandwidth equal to those of the digital audio signal; and packing the core bits and the extension bits into a bit stream in a core plus extension format, in which a first-generation audio decoder can extract and decode the core bits to reproduce an audio signal and a second-generation audio decoder can extract and decode the core bits plus the extension bits to reproduce a higher-quality audio signal.

An eighth aspect of the present invention provides a method of encoding a multi-channel digital audio signal that is sampled at a known sampling rate and has an audio bandwidth, which maintains compatibility with the existing base of first-generation audio decoders while providing higher-quality sound reproduction with second-generation audio decoders, comprising the following steps: low-pass filtering the digital audio signal to remove signal components above a core audio bandwidth, the filtering having a transition band around the core audio bandwidth; downsampling the filtered signal to extract a core signal whose sampling rate matches a core sampling rate; encoding the core signal into core bits at the core sampling rate and audio bandwidth, in a manner compatible with the first-generation audio decoders and without aliasing fold-back, the core sampling rate and audio bandwidth being less than those of the digital audio signal; upsampling the core signal to an extension sampling rate to form a reconstructed core signal; filtering the reconstructed core signal to remove interpolation aliasing; subtracting the filtered reconstructed core signal from the digital audio signal to form a difference signal; encoding the difference signal into extension bits at an extension sampling rate and audio bandwidth equal to those of the digital audio signal, the extension bits being allocated in the transition band and above the transition band to extend the frequency range of the coded audio signal; and packing the core bits and the extension bits into a bit stream in a core plus extension format, in which a first-generation audio decoder can extract and decode the core bits to reproduce an audio signal and a second-generation audio decoder can extract and decode the core bits plus the extension bits to reproduce a higher-quality audio signal.

A ninth aspect of the present invention provides a method of reconstructing a multi-channel audio signal, comprising the following steps: receiving a sequence of encoded frames, each frame comprising a core field in which a core sync word immediately precedes the core bits and an extension field in which an extension sync word immediately precedes the extension bits; detecting the core sync word to extract the core bits and then decoding them into a reconstructed core signal; detecting the extension sync word to extract the extension bits and then decoding them into a reconstructed difference signal at a sampling rate and audio bandwidth greater than those of the core bits; upsampling the reconstructed core signal to the sampling rate of the reconstructed difference signal; low-pass filtering the upsampled reconstructed core signal to attenuate interpolation aliasing; and summing the filtered reconstructed core signal and the reconstructed difference signal to reconstruct the multi-channel audio signal.
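The frame layout used by the decoder aspects (a core field followed by an extension field that begins with its own sync word) can be sketched in Python as follows. This is only an illustration under stated assumptions: the sync word values `CORE_SYNC` and `EXT_SYNC`, the two-byte length fields and the function names are hypothetical and are not taken from this document or from any real bit stream format.

```python
import struct

CORE_SYNC = b"CORE"   # hypothetical core sync word
EXT_SYNC = b"EXT1"    # hypothetical extension sync word

def pack_frame(core_bits: bytes, ext_bits: bytes = b"") -> bytes:
    """Build one frame: core field first, optional extension field after it."""
    frame = CORE_SYNC + struct.pack(">H", len(core_bits)) + core_bits
    if ext_bits:
        frame += EXT_SYNC + struct.pack(">H", len(ext_bits)) + ext_bits
    return frame

def unpack_frame(frame: bytes, decode_extension: bool):
    """Return (core_bits, ext_bits); ext_bits is None when absent or ignored."""
    if not frame.startswith(CORE_SYNC):
        raise ValueError("core sync word not found")
    (core_len,) = struct.unpack_from(">H", frame, 4)
    pos = 6 + core_len
    core_bits = frame[6:pos]
    ext_bits = None
    # A first-generation decoder stops here; a second-generation decoder
    # looks for the extension sync word right after the core field.
    if decode_extension and frame.startswith(EXT_SYNC, pos):
        (ext_len,) = struct.unpack_from(">H", frame, pos + 4)
        ext_bits = frame[pos + 6:pos + 6 + ext_len]
    return core_bits, ext_bits

frame = pack_frame(b"\x01\x02\x03", b"\xaa\xbb")
print(unpack_frame(frame, decode_extension=False))
print(unpack_frame(frame, decode_extension=True))
```

In this sketch a legacy decoder never reads past the core field, so the presence of the extension field does not affect it, while a newer decoder recovers both fields from the same frame.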

The features and advantages of the present invention will become more apparent to those skilled in the art from the following detailed description of preferred embodiments, taken together with the accompanying drawings.

Brief description of the drawings

Fig. 1 is a plot of the frequency response of the two-band decimation filter bank used in an earlier approach to separate the baseband and the high-frequency band for coding.

Fig. 2 is a block diagram of an encoder that embodies the generalized core plus extension framework of the present invention.

Figs. 3a and 3b are plots of the frequency spectra of the input audio and of the coded core and difference signals, respectively.

Fig. 4 illustrates the bit stream format for a single frame of core plus extension audio data.

Figs. 5a and 5b illustrate, respectively, a physical medium and a broadcast system for delivering a single bit stream to the decoder.

Fig. 6 is a block diagram of a decoder that is compatible with the core plus extension coder shown in Fig. 2.

Fig. 7 is a plot of the frequency spectrum of the reconstructed audio signal for a multi-tone test signal.

Figs. 8a and 8b are block diagrams of an encoder and a decoder, respectively, that embody a high-resolution extension framework.

Fig. 9 is a plot of the frequency spectrum of the difference signal for the high-resolution extension framework.

Fig. 10 is a plot of the frequency spectrum of the reconstructed multi-tone test signal for the high-resolution extension framework.

Figs. 11a and 11b are block diagrams of an encoder and a decoder, respectively, that embody a high-frequency extension framework.

Fig. 12 is a plot of the frequency spectrum of the reconstructed audio signal for a multi-tone test signal at a fixed bit rate.

Fig. 13 is a block diagram of an encoder that embodies an alternative high-frequency extension framework.

Figs. 14a and 14b are block diagrams of an extension encoder and an extension decoder, respectively.

Figs. 15a and 15b are block diagrams of a subband encoder and a subband decoder, respectively.

Fig. 16 is a block diagram of a black-box hardware architecture.

Fig. 17 illustrates the data flow from the serial input to the on-chip memory of the first processor.

Fig. 18 illustrates the data flow from the on-chip memory of the first processor to the serial port.

Fig. 19 illustrates the data flow from the on-chip memory of the first processor to the on-chip memory of the second processor.

Fig. 20 illustrates the data flow from the on-chip memory of the second processor to the on-chip memory of the first processor.

Figs. 21a and 21b are block diagrams of an open-box encoder and decoder, respectively.

Figs. 22a and 22b are block diagrams of another open-box encoder and decoder, respectively.

Detailed description of the invention

The present invention defines a "core" plus "extension" coding method for high-fidelity signals, which allows audio content providers to include a single audio bit stream that is compatible with the different types of decoders resident in the consumer base. The core bit stream is decoded by older decoders, which ignore the extension data, while newer decoders use both the core and extension data streams to give higher-quality sound reproduction. This approach satisfies both the existing customers who wish to keep their existing decoders and those who wish to purchase a new decoder capable of reproducing the higher-fidelity signal.

The original concept of encoding high-fidelity audio in a manner that keeps existing decoders compatible with next-generation encoders was introduced by Smyth et al. in application Ser. No. 08/642,254, filed May 2, 1996, entitled "A multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels", which is also assigned to DTS Inc. As shown in Figs. 4a and 4b of Smyth et al., the audio spectrum is initially split using a 256-tap two-band decimation pre-filter bank, giving an audio bandwidth of 24 kHz per band. The bottom band (0-24 kHz) is split and encoded in 32 uniform bands. The top band (24-48 kHz) is split and encoded in 8 uniform bands.

New decoders, designed to mirror the operation of the encoder, decode both the top and bottom bands and then reconstruct the high-fidelity audio signal using a 256-tap two-band interpolation filter bank. The system exhibits the desired unity-gain frequency response over the entire 48 kHz bandwidth.

Older decoders, in existence before the described high-fidelity coding technique, decode only the bottom band to produce a baseband audio signal. In this sense the system does maintain compatibility with existing decoders. However, as shown in Fig. 1 herein, the frequency response of the two-band decimation pre-filter bank causes an aliasing problem around 24 kHz when only the core is decoded. In order to provide a unity-gain response for the core plus extension coder, the frequency responses 8 and 10 of the bottom and top bands, respectively, cross over in their respective transition regions at 24 kHz. In a core-only decoder, however, the portion of the bottom-band frequency response 8 above 24 kHz is aliased down. As a result, the reconstructed baseband audio signal has a degree of corruption that is not found in the older baseband-only coding and decoding systems. The coding system therefore does not maintain "true" compatibility with existing decoders. Furthermore, this approach constrains any additional bits to be allocated to the top band, which may be suboptimal in many cases.

The broad sense core adds extension framework

The general process of coding and decoding is shown in Fig. 2-7.Be fed to simulation Anti-liased LPF14 in order to produce spread bit stream (Fig. 2) analogue audio frequency 12, to signal limit band.This bandlimited signal is sampled into discrete/digital audio and video signals 16.The cutoff frequency of LPF14 must be less than half of sampling rate to satisfy the Nyguist criterion.For example, the cutoff frequency that is suitable for 48kHZ for the 96kHZ sampling rate of expansion.

Digital audio and video signals 16 is sent to core encoder 18 (AC3, MPEG, DTC etc.) and with specific bit rate coding.The sampling rate of sound signal and bandwidth need by low-pass filtering and owe sampling to adjust to mate with core encoder in some cases.For simplicity, single channel or multichannel are thought in the input of audio frequency shown in the figure.Under the multichannel input condition, each channel subtracted each other with addition handle.This core bit stream 20 is maintained in the wrapper 22 before producing growth data.Core bit stream also is fed back to core code translator 24, and it is consistent with the code translator that exists in existing consumer's playback apparatus.

The resulting reconstructed core audio signal 26 is then subtracted 28 from a delayed version of the original input signal 16. The delay 32 is made to match the core encoder/decoder latency so that the decoded core audio and the input audio signal are exactly time-aligned. The difference signal 34 now represents the components of the original input signal that are absent from the signal coded in the core bit stream 20, that is, components of either higher resolution or higher frequency. The difference signal is then encoded by an extension coder 36, suitably using standard coding techniques such as subband coding or transform coding, to produce an extension bit stream 38. The extension bit stream and the core bit stream are time-aligned and, depending on the application, either multiplexed to form a composite stream 40 or held and transmitted as separate streams.

The concept of extending the frequency spectrum and lowering the noise floor is further illustrated in Figs. 3a and 3b. Fig. 3a shows a snapshot of the frequency spectrum 42 of a 96 kHz sampled audio input signal. The audio clearly contains frequency components out to 48 kHz. Trace 44 in Fig. 3b shows the spectrum of the signal after decimation and core encoding. The audio frequencies above 24 kHz have been filtered out, and the sampling frequency is dropped to 48 kHz by the decimator to match the core encoder. Trace 46 depicts the spectrum of the difference signal before it enters the extension coder. Clearly, the extension coder can concentrate its data resources on the parts of the spectrum not represented by the core encoder, namely the transition band 48 around 24 kHz and the high frequency extension 50 from 24 kHz to 48 kHz. In addition, some bits can be allocated to the residual core signal 52 to lower the noise floor in the core bandwidth. The special cases that follow examine applications in which the extension bits are allocated to (1) extend the resolution of the core signal, (2) extend both the core resolution and the high frequency content of the signal, and (3) extend only the high frequency content. For each of these cases the coding system can be configured as a "black-box" approach, in which only the delay associated with the core coding system need be known, or as an "open-box" approach that takes advantage of the particular core encoder structure.

To maintain backward compatibility with core-only decoders, the single composite bit stream 40 carrying the core and extension audio data, 20 and 38 respectively, is also formatted in a core plus extension manner. This bit stream is a sequence of synchronized frames 54, each consisting of two fields: a core field 56 and an extension field 58 (see Fig. 4). A core-only decoder detects the synchronization word (CORE-SYNC) 61, decodes the core bits 20 in the core field 56 to produce core audio, and then ignores the extension field 58 by jumping to the beginning of the next frame to decode it. An extension decoder, however, decodes the core bits and then checks whether the synchronization word (EXT-SYNC) 60 for the extension bits is present. If it is not, the decoder outputs the core audio and jumps to the beginning of the next frame. Otherwise, the decoder goes on to decode the extension bits in the extension field 58 to produce extension audio, and then combines it with the core audio to produce high quality audio. The core bits define the noise floor of the reconstructed core audio across its bandwidth. The extension bits further refine (lower) the noise floor within the core bandwidth and also define the noise floor for the remainder of the audio bandwidth.
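The frame handling described above can be sketched as follows. This is a minimal illustration, not the actual bit stream syntax: the byte layout (a 4-byte sync word followed by two 16-bit length fields), the sync word values, and the names `build_frame` and `decode_stream` are all assumptions made for the example.

```python
import struct

# Hypothetical sync words and header layout, assumed only for illustration.
CORE_SYNC = b"CORE"
EXT_SYNC = b"EXTN"
HEADER_LEN = 8  # CORE_SYNC (4 bytes) + frame length (2) + core length (2)


def build_frame(core: bytes, ext: bytes = b"") -> bytes:
    """Pack one frame: core field first, optional extension field after it."""
    body = core + ((EXT_SYNC + ext) if ext else b"")
    header = CORE_SYNC + struct.pack(">HH", HEADER_LEN + len(body), len(core))
    return header + body


def decode_stream(stream: bytes, has_extension_decoder: bool):
    """Return a list of (core_bits, extension_bits_or_None), one per frame."""
    frames, pos = [], 0
    while pos < len(stream):
        if stream[pos:pos + 4] != CORE_SYNC:
            raise ValueError("lost frame synchronization")
        frame_len, core_len = struct.unpack_from(">HH", stream, pos + 4)
        core_start = pos + HEADER_LEN
        ext_start = core_start + core_len
        frame_end = pos + frame_len
        core = stream[core_start:ext_start]
        ext = None
        # A core-only decoder never looks past the core field.
        if (has_extension_decoder and ext_start + 4 <= frame_end
                and stream[ext_start:ext_start + 4] == EXT_SYNC):
            ext = stream[ext_start + 4:frame_end]
        frames.append((core, ext))
        pos = frame_end  # jump to the beginning of the next frame
    return frames
```

A core-only decoder and an extension decoder read the same stream; the only difference in this sketch is whether the bytes after the core field are inspected.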

As shown in Figs. 5a and 5b, the composite bit stream 40 is either encoded on a transmission medium such as a CD or digital versatile disc (DVD), or broadcast over a direct broadcast system. As shown in Fig. 5a, the single composite bit stream 40 is written, using well-known techniques, onto a portable machine-readable storage medium such as a CD, DVD or other digital storage device. As shown in Fig. 5b, the composite bit stream 40 is embodied in a carrier wave 64 that is then broadcast over a satellite, cable or other communication system.

To decode the core and extension bit streams (Fig. 6), the unpacker 66 unpacks the composite bit stream 40 and directs the core and extension bit streams 20 and 38 to their respective decoders 68 and 70. The decoder 72 then sums the outputs 74 to reconstruct the high-fidelity audio signal 76. If the playback device has no extension decoder (as is the case for older equipment), the extension bit stream is simply ignored and the core bit stream is decoded to produce a core-quality audio signal. In this decoding example, the delays of the core decoder and the extension decoder are assumed to be the same. As described later, any difference in delay can be accommodated by adding delay stages at either the decoder or the encoder.

The advantages of the core plus extension coding technique are clearly shown in Fig. 7, which plots the core-only and core plus extension frequency spectra 78 and 80, respectively, in response to a multi-tone test signal. In this particular application, the audio system produces a core-only reconstructed audio signal with a noise floor of approximately -100 dB from DC to 24 kHz. As discussed in more detail later, this core-only response may be marginally better than the core-only signal produced by older systems, because the older systems use an analog anti-aliasing filter whereas the new encoders use a digital decimation filter. By comparison, the audio system (special case number 2) produces a core plus extension audio signal that lowers the noise floor of the core signal to approximately -160 dB and extends the signal bandwidth to 48 kHz at a noise floor of approximately -60 dB. Note that a higher noise floor is more tolerable at higher frequencies, where the ear is less sensitive.

High Resolution Extension Framework

Figs. 8a and 8b illustrate the encoding and decoding processes for improving only the coding resolution of the core process, that is, reducing the coding error in the decoded audio output signal without extending the bandwidth of the output audio signal. Because the bit rates of the existing coding schemes (AC3, MPEG, DTS) are fixed, obtaining a higher coding resolution would normally require encoding the audio signal with a completely different, non-compatible coder.

In the current scheme, the existing core encoder 84 is used to provide the best coding resolution possible within the bit rate constraint of the existing decoder (640 kbit/s for AC3, 768 kbit/s for MPEG, 1536 kbit/s for DTS). To improve the coding resolution further, that is, to reduce the coding error, the encoded core signal is decoded (86) to form a reconstructed core signal, which is subtracted from the input signal; the input signal is delayed (90) so that the two signals are exactly time-aligned. The extension encoder 82 encodes this difference signal using some arbitrary coding process. The packer packs the core and extension bits into a composite bit stream as described above. In this case the sampling frequency and the audio bandwidth are the same in the extension and core encoders 82 and 84, respectively. Note that if high-fidelity 96 kHz input audio is provided, it must be low-pass filtered and downsampled to match both encoders.

As shown in Fig. 8b, to decode the signal the unpacker 94 unpacks the composite bit stream and routes the core and extension bit streams to the separate decoder processes 96 and 98, respectively, and the outputs of the two decoders are summed together 100. If no extension decoder is present, the output of the core decoder is used directly. In this example, the extension bit stream can be regarded as a mechanism for improving the signal-to-noise ratio of the output audio signal: adding the output of the extension decoder lowers the coding noise floor. The degree of reduction depends on the bit rate allocated to the extension bit stream.
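The effect of summing the two decoder outputs can be illustrated with a small numerical sketch. The uniform quantizers below are stand-ins for the core and extension codecs (an assumption; real codecs are far more elaborate and introduce latency, which this toy model does not), and the step sizes and test signal are arbitrary.

```python
import math


def quantize(samples, step):
    """Uniform quantizer used as a stand-in for a lossy encode/decode pair."""
    return [step * round(s / step) for s in samples]


def snr_db(reference, decoded):
    """Signal-to-noise ratio of `decoded` measured against `reference`."""
    signal = sum(r * r for r in reference)
    noise = sum((r - d) ** 2 for r, d in zip(reference, decoded))
    return 10 * math.log10(signal / noise)


# Arbitrary two-tone test signal.
x = [math.sin(2 * math.pi * 0.013 * n) + 0.3 * math.sin(2 * math.pi * 0.171 * n)
     for n in range(2000)]

CORE_STEP = 1 / 64    # coarse resolution of the stand-in core codec
EXT_STEP = 1 / 4096   # finer resolution of the stand-in extension codec

core = quantize(x, CORE_STEP)                   # core encode + decode
difference = [a - b for a, b in zip(x, core)]   # encoder-side difference signal
extension = quantize(difference, EXT_STEP)      # extension encode + decode

core_only = core                                          # legacy decoder output
core_plus_ext = [c + e for c, e in zip(core, extension)]  # extension decoder output

print(f"core only      : {snr_db(x, core_only):.1f} dB")
print(f"core+extension : {snr_db(x, core_plus_ext):.1f} dB")
```

Because the extension codes only the residual left by the core, every bit spent on it lowers the noise floor of the sum, while a legacy decoder that ignores it still obtains the unchanged core output.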

Fig. 9 shows a snapshot of the frequency spectrum 102 of the difference signal before it enters the extension coder. The coding error of the core encoding process creates a noise floor across the 0 to 24 kHz bandwidth. The larger-amplitude error at 24 kHz is attributable to the transition band of the anti-aliasing filter. The extension coder allocates its available bits to reduce both the coding error and the transition band error. Most allocation schemes assign more bits to larger errors (such as the transition band) and fewer bits to smaller errors in order to optimize overall performance.

As shown in Fig. 10, by increasing the bit rate from the core rate of 1536 kbit/s to 2048 kbit/s and allocating the additional bits to the extension coder, the noise floor is shifted down substantially, from -100 dB for the core-only frequency response 78 to -160 dB for the core plus extension response 104, and is extended through the whole transition band. Note that these bit rates are only examples of what might be used with an existing DTS coding system. A noise floor of -160 dB is unattainable with other available coders and represents a significant improvement in audio fidelity.

High Frequency Extension Framework

Figs. 11a and 11b describe a coding framework that allows the extension bit stream to carry high frequency audio information that the core coding system cannot represent. In this example, the digital audio is represented by 24-bit PCM samples at a 96 kHz sampling rate. The digital audio is first low-pass filtered using a linear phase FIR filter 106 with an integer delay to remove signal components above 24 kHz. Note that the cutoff frequency of this digital filter is the same as that of the analog anti-aliasing filter in existing core-only audio coders. Because digital filters tend to have a narrower transition band than their analog counterparts, the core-only signal can actually be marginally better than the core-only signal in existing systems.

The filtered signal is then decimated 108 by a factor of two, giving an effective 48 kHz sampled signal. The downsampled signal is fed to the core encoder 110 in the normal fashion, and the resulting bit stream is placed in a frame buffer 111 that delays the bit stream by at least one frame. The delayed bit stream is then placed in the packer 112. The encoded core signal is also fed to the core decoder 114 to reconstruct a 48 kHz sampled digital audio stream, which contains coding error. Before it can be subtracted from the original 96 kHz input audio signal, it must first be upsampled by a factor of two and then low-pass filtered to remove the interpolation aliasing. Again, this filtering is suitably achieved using a linear phase FIR filter 118 with an integer sample delay. This signal therefore still carries only the audio information held in the core bit stream, that is, it contains no audio frequency components above 24 kHz. The reconstructed core signal is then subtracted from a delayed (119) version of the input signal 122 to produce the difference signal, which is passed through a delay 121 and encoded with a 96 kHz sampling encoder 123 to produce the extension bit stream.
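The signal path just described (low-pass filter, decimate by two, core encode and decode, upsample by two, low-pass filter, subtract from the delayed input) can be sketched as below. Everything specific here is an assumption made for illustration: the 63-tap Hamming-windowed sinc filter, its 21 kHz cutoff, the quantizer standing in for the core codec (which has no latency of its own in this toy model), and the function names.

```python
import math


def lowpass_fir(num_taps, cutoff):
    """Hamming-windowed sinc low-pass; cutoff is a fraction of the sampling rate.
    num_taps must be odd so the group delay (num_taps - 1) / 2 is an integer."""
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * cutoff if k == 0 else math.sin(2 * math.pi * cutoff * k) / (math.pi * k)
        taps.append(ideal * (0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))))
    gain = sum(taps)
    return [t / gain for t in taps]  # normalize to unity gain at DC


def fir(signal, taps):
    """Causal FIR filtering; output has the same length as the input."""
    return [sum(t * signal[n - k] for k, t in enumerate(taps) if n - k >= 0)
            for n in range(len(signal))]


def difference_signal(x, taps, core_step=2.0 ** -10):
    """Return (difference, delayed_input) for a 96 kHz input block x."""
    d = (len(taps) - 1) // 2                  # delay of one FIR, in 96 kHz samples
    low = fir(x, taps)                        # decimation low-pass filter
    down = low[::2]                           # decimate by two -> 48 kHz
    core = [core_step * round(s / core_step) for s in down]  # stand-in core codec
    up = [0.0] * len(x)
    up[::2] = [2.0 * c for c in core]         # zero-stuff, gain 2 restores level
    recon = fir(up, taps)                     # interpolation low-pass filter
    delayed = [0.0] * (2 * d) + x[:len(x) - 2 * d]  # input delay = both FIR delays
    return [a - b for a, b in zip(delayed, recon)], delayed


def residual_ratio(freq_hz, fs=96000, n=1500, skip=200):
    """Energy of the difference signal relative to the delayed input for one tone."""
    taps = lowpass_fir(63, 21000 / fs)
    x = [math.sin(2 * math.pi * freq_hz * i / fs) for i in range(n)]
    diff, delayed = difference_signal(x, taps)
    return sum(v * v for v in diff[skip:]) / sum(v * v for v in delayed[skip:])
```

In this sketch a 1 kHz tone is almost entirely captured by the core path, so `residual_ratio(1000)` is close to zero, whereas a 36 kHz tone is removed by the decimation filter and appears nearly intact in the difference signal, so `residual_ratio(36000)` is close to one.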

The decoding process is similar to those described earlier. As shown in Fig. 11b, the unpacker 124 unpacks the composite bit stream and feeds the core and extension bit streams to their respective decoders 126 and 128. If no extension decoder is present, the reconstructed audio is output directly (24-bit 48 kHz PCM in the figure). If an extension decoder is resident in the player, the decoded core audio is upsampled to 96 kHz (130), low-pass filtered (132) and summed with the output of the extension decoder (134).

The concept behind this process was first explained with reference to the general core plus extension framework in Figs. 3a and 3b. Fig. 3a shows a snapshot of the frequency spectrum of a 96 kHz sampled audio input signal. The audio clearly contains frequency components out to 48 kHz. Trace 44 in Fig. 3b shows the spectrum of the signal after decimation and core encoding. The audio frequencies above 24 kHz have been filtered out, and the sampling rate is dropped to 48 kHz by the decimator because the core encoder cannot operate at the higher rate. Trace 46 depicts the spectrum of the difference signal before it enters the extension coder. Clearly, the extension coder can concentrate its data resources on those parts of the spectrum not represented by the core encoder, namely between 24 kHz and 48 kHz.

One bit allocation scheme was first illustrated in Fig. 7, in which some extension bits are allocated to the core region and others to the high frequency spectrum. As shown, this both extends the bandwidth of the output audio signal and lowers the noise floor in the 0-24 kHz region. This example assumes that additional bits are available for the extension coder. An alternative application, shown in Fig. 12, keeps the total number of bits fixed at the existing level and divides them between the core and extension regions. Clearly, the high frequency performance is traded off against the noise floor 138, which is higher than the unaltered core noise floor 78 all the way up to 24 kHz. In yet another approach, any additional bits are allocated solely to the higher frequency spectrum and the noise floor in the core region is left alone. Because the error in the transition band around 24 kHz is fairly large, the high frequency spectrum is preferably defined to include the transition band.

This last case assumes either that the noise floor provided by the core encoder is good enough or that improving the high frequency spectrum is more important than lowering the noise floor. In either case, no extension bits are allocated to reducing the coding error associated with the reconstructed core signal. The encoding process can therefore be simplified to reduce both the number of computations and the delay, which lowers the cost and complexity of the audio equipment without affecting the decoder.

As shown in Fig. 13, this is accomplished by first low-pass filtering the digital audio with a linear phase FIR filter 140 having an integer delay to remove signal components above 24 kHz. The filtered signal is then decimated 142 by a factor of two, giving an effective 48 kHz sampled signal. The downsampled signal is then fed to the core encoder 144 in the normal fashion and the resulting bit stream is placed in the packer 146. The downsampled signal is also upsampled 148 by a factor of two and low-pass filtered 150 to remove the interpolation aliasing from the reconstructed signal. Again, this filtering is achieved using a linear phase FIR filter with an integer sample delay. The reconstructed signal therefore still carries only the audio information held in the core bit stream, that is, it contains no audio frequency components above 24 kHz, but it has no coding error. This reconstructed signal is then subtracted from a delayed (154) version of the input signal to produce the difference signal, which is delayed (157) and encoded with a 96 kHz sampling encoder 158 to produce the extension bit stream.

This scheme differs from that of Fig. 11 in that the core encoder and decoder are bypassed when generating the difference signal. The trade-off is that the noise floor in the frequency band covered by the core encoder cannot be improved, because the coding error of the core encoder is not reflected in the difference signal. Accordingly, the extension coder should not allocate bits to the low frequency subbands that lie away from the transition band of the decimation and interpolation filters.

Filter Characteristic Issues

The decimation anti-aliasing low-pass filter (LPF) filters the signal before core encoding; its purpose is to remove signal components that normally cannot be represented by the core algorithm. In other words, the decoders resident in consumer equipment are not programmed to make use of these frequencies. To avoid aliasing effects and a possible degradation in sound quality, this filter normally rolls off well before the transition point. However, the specification of this filter, that is, its ripple, transition bandwidth and stopband attenuation, can be adjusted by the user to obtain the required quality standard.

The purpose of the interpolation anti-aliasing filter is simply to ensure that the interpolation aliasing is attenuated effectively, so that the level of aliasing does not interfere with the overall quality. This filter can simply be a replica of the decimation anti-aliasing filter. However, the decimation filter is likely to be quite complex in order to guarantee the quality of the core signal. It may therefore be desirable to reduce the size of the interpolation filter in order to lighten the computational load at the encoder and decoder.

Normally it is desirable to keep the characteristics of the interpolation filter the same at the encoder and the decoder. This ensures that the delays and responses are exactly matched, so that the summation at the decoder exactly reverses the difference processing at the encoder. Occasionally it may be desirable to reduce the computational complexity of the decoder interpolation filter. Although this causes a slight mismatch between the encoder and decoder interpolation processes, appropriate filter design can make the difference very small. Another important issue is the delay of the filters. If the delays differ, the difference must be compensated by adding delay in either the extension chain or the core chain. The purpose, once again, is to ensure that the extension and core signals are exactly time-aligned before they are summed.

Codec Implementation

In the coding schemes described above, the encoders and decoders for the core and extension bit streams are arbitrary, that is, they can be any combination of subband coding, transform coding, and so on. The general core plus extension approach can be divided into two distinct implementations. The first is the black-box approach, which requires no knowledge of the algorithm or internal structure of the core codec; only the coding delay needs to be known. However, if the characteristics of the core encoder are known and the extension coder is designed to match them, the extension coding can in some cases be made more efficient.

Black-Box Codec

The black-box approach assumes no knowledge of the internal structure of the core encoder/decoder (codec) other than the delays of the core encoder and decoder. The block diagrams used above to describe the general core plus extension approach also illustrate the black-box approach. As shown, the core and extension encoding and decoding processes are completely separate. The only interaction occurs when the difference signal is formed or when the output signals are summed, both of which take place entirely in the time domain. Consequently, no knowledge of the internal structure of the core codec is required, and the choice of extension codec neither depends on nor is constrained by the core codec. However, the delays must be selected so that (a) the reconstructed core signal and the input signal are exactly time-aligned before the difference signal is formed, and (b) the core signal and the difference signal are exactly time-aligned before they are summed at the decoder. The currently preferred approach, as in Figs. 11a and 11b, is to place all of the delays in the encoder so as to minimize the memory required in the decoder.

To time-align the input signal with the reconstructed core signal, the input signal is delayed by an amount equal to:

$$\mathrm{Delay}_{\mathrm{Input}} = \mathrm{Delay}_{\mathrm{DecimationLPF}} + \mathrm{Delay}_{\mathrm{CoreEncoder}} + \mathrm{Delay}_{\mathrm{CoreDecoder}} + \mathrm{Delay}_{\mathrm{InterpolationLPF}}$$

To time-align the core and difference signals at the decoder, the delay of the frame buffer is set to:

$$\mathrm{Delay}_{\mathrm{FrameBuffer}} = \mathrm{Delay}_{\mathrm{DifferenceSignal}} + \mathrm{Delay}_{\mathrm{ExtensionEncoder}} + \mathrm{Delay}_{\mathrm{ExtensionDecoder}}$$

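A problem with the encoder shown in Fig. 11 is its excessive coding delay, which is

$$\begin{aligned}\mathrm{CodingDelay} ={}& \mathrm{Delay}_{\mathrm{DecimationLPF}} + \mathrm{Delay}_{\mathrm{CoreEncoder}} \\ &+ \mathrm{Delay}_{\mathrm{DifferenceSignal}} + \mathrm{Delay}_{\mathrm{ExtensionEncoder}} + \mathrm{Delay}_{\mathrm{ExtensionDecoder}} \\ &+ \mathrm{Delay}_{\mathrm{CoreDecoder}} + \mathrm{Delay}_{\mathrm{InterpolationLPF}}\end{aligned}$$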

If the scheme of Fig. 13 is used and the interpolation LPF is suitably designed, this delay can be reduced to

$$\begin{aligned}\mathrm{CodingDelay} ={}& \mathrm{Delay}_{\mathrm{DecimationLPF}} + \mathrm{Delay}_{\mathrm{CoreEncoder}} \\ &+ \mathrm{Delay}_{\mathrm{CoreDecoder}} + \mathrm{Delay}_{\mathrm{InterpolationLPF}}\end{aligned}$$
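The delay bookkeeping in these expressions can be checked with a short worked example. The latency figures below are hypothetical values in 96 kHz samples, chosen only to make the arithmetic concrete; they are not taken from any real codec.

```python
# Hypothetical latencies, in samples at 96 kHz (illustrative values only).
delays = {
    "decimation_lpf": 31,
    "core_encoder": 512,
    "core_decoder": 512,
    "interpolation_lpf": 31,
    "difference_signal": 0,
    "extension_encoder": 256,
    "extension_decoder": 256,
}


def input_delay(d):
    """Delay applied to the input so it lines up with the reconstructed core."""
    return (d["decimation_lpf"] + d["core_encoder"]
            + d["core_decoder"] + d["interpolation_lpf"])


def frame_buffer_delay(d):
    """Delay applied to the core bit stream so it lines up with the extension."""
    return d["difference_signal"] + d["extension_encoder"] + d["extension_decoder"]


def coding_delay_fig11(d):
    """Total coding delay for the scheme of Fig. 11."""
    return input_delay(d) + frame_buffer_delay(d)


def coding_delay_fig13(d):
    """Reduced coding delay for the scheme of Fig. 13."""
    return (d["decimation_lpf"] + d["core_encoder"]
            + d["core_decoder"] + d["interpolation_lpf"])


for name, fn in (("Fig. 11", coding_delay_fig11), ("Fig. 13", coding_delay_fig13)):
    samples = fn(delays)
    print(f"{name}: {samples} samples = {1000 * samples / 96000:.2f} ms")
```

With these assumed figures, the saving of the Fig. 13 scheme equals the three extension-path terms that drop out of the sum.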

Black-Box Extension Coder

One example of a codec suitable for the black-box extension coder 160 is shown in Figs. 14 and 15. This codec is based on filter-bank coding technology, which is essentially what all the major audio coding systems currently on the market use: DTS Coherent Acoustics, MPEG I and MPEG II use subband coding, while AC-3 and MPEG II AAC use transform coding. The codec details presented here can therefore be readily adapted to implement the extension codec used in the open-box implementations described below.

The extension encoder 160 is shown in Fig. 14(a). The difference signal is split and decimated by a filter bank 162 into N subbands. Each subband signal can be encoded using the subband coder 164 shown in Fig. 15(a). The subband bits from each subband coder are then packed 166 as extension bits.

The decoder is shown in Fig. 14(b). The extension bits are first unpacked 170 into the individual subband bits. The unpacked subband bits are then decoded by the subband decoder 172 shown in Fig. 15(b) to produce the reconstructed subband signals. Finally, the difference signal is reconstructed by running the synthesis filter bank 174 over the reconstructed subband signals.

Inside each subband coder (Fig. 15(a)), the subband samples are grouped into subband analysis windows. The subband samples in each such window are used to optimize a set of four prediction filter coefficients, which are then quantized using a tree-search VQ strategy. These vector-quantized prediction coefficients are used to predict the subband signal inside each analysis window. The prediction gain is obtained as the ratio of the variance of the subband samples to the variance of the prediction residual. If the prediction gain is positive and large enough to cover both the overhead of transmitting the VQ address of the prediction coefficients and the possible loss of prediction gain caused by later quantization of the prediction residual, the prediction residual is quantized and transmitted. Otherwise, prediction is abandoned and the subband samples themselves are quantized and transmitted. The use of adaptive prediction for a subband analysis window is indicated by a "prediction mode" flag in the compressed bit stream. In this way, adaptive prediction is activated dynamically whenever it can reduce the quantization error.
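The prediction-gain test for one analysis window can be sketched as follows. This is a simplified illustration under stated assumptions: the four coefficients are found with the standard Levinson-Durbin recursion on the window's autocorrelation, the VQ of the coefficients is omitted, and the 3 dB `overhead_db` threshold is a hypothetical figure standing in for the side-information cost.

```python
import math
import random


def autocorrelation(x, max_lag):
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for a[0..order-1] so that x_hat[n] = sum(a[k] * x[n-1-k])."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        k_i = (r[i] - sum(a[k] * r[i - k] for k in range(1, i))) / err
        updated = a[:]
        updated[i] = k_i
        for k in range(1, i):
            updated[k] = a[k] - k_i * a[i - k]
        a = updated
        err *= 1.0 - k_i * k_i
    return a[1:]


def prediction_gain_db(window, order=4):
    """Return (gain in dB, coefficients, residual) for one analysis window."""
    r = autocorrelation(window, order)
    if r[0] == 0.0:  # silent window: nothing to predict
        return 0.0, [0.0] * order, list(window)
    a = levinson_durbin(r, order)
    residual = [window[n] - sum(a[k] * window[n - 1 - k]
                                for k in range(order) if n - 1 - k >= 0)
                for n in range(len(window))]
    var_x = sum(s * s for s in window) / len(window)
    var_e = max(sum(e * e for e in residual) / len(window), 1e-30)
    return 10 * math.log10(var_x / var_e), a, residual


def choose_prediction_mode(window, overhead_db=3.0):
    """True when prediction pays for its (assumed) side-information overhead."""
    gain_db, _, _ = prediction_gain_db(window)
    return gain_db > overhead_db


tonal = [math.sin(2 * math.pi * 0.05 * n) for n in range(256)]
rng = random.Random(0)
noisy = [rng.gauss(0.0, 1.0) for _ in range(256)]
```

A strongly tonal window yields a large gain and switches prediction on, while a noise-like window yields almost no gain and is coded directly.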

If the prediction mode is on for a subband analysis window, a scale factor is calculated, which is either the RMS (root mean square) or the peak amplitude of the prediction residual. The prediction residual is normalized by this scale factor. If the prediction mode is off for a subband analysis window, the subband samples are analyzed for the possible presence of transients. A transient is defined as a sharp or rapid transition between a low-amplitude phase and a high-amplitude phase. If a single scale factor were used for such a window, it could be excessive for the low-level samples preceding the transient, possibly leading to pre-echo in low bit rate modes. To alleviate this problem, each analysis window is divided into a number of subwindows. The position of the transient is located in the analysis window in terms of the analysis subwindows, and two scale factors are calculated: one for the subwindows before the transient and one for the subwindows after it. The identification number of the subwindow in which the transient occurs is then packed into the encoded bit stream. Thereafter, the subband samples in each subwindow are normalized by their respective scale factors.
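The two-scale-factor handling of a transient can be sketched as below. The detection rule (a subwindow whose peak exceeds `jump_ratio` times every earlier peak), the use of peak rather than RMS scale factors, and the choice of four subwindows are all assumptions made for the example; the text does not specify them.

```python
def analyse_window(window, num_subwindows=4, jump_ratio=4.0):
    """Return (transient_subwindow, pre_scale, post_scale).

    transient_subwindow is 0 when no transient is found, in which case both
    scale factors are equal. len(window) is assumed to be a multiple of
    num_subwindows.
    """
    size = len(window) // num_subwindows
    peaks = [max(abs(s) for s in window[i * size:(i + 1) * size])
             for i in range(num_subwindows)]
    transient = 0
    for i in range(1, num_subwindows):
        if peaks[i] > jump_ratio * max(max(peaks[:i]), 1e-12):
            transient = i
            break
    if transient == 0:
        scale = max(peaks) or 1.0
        return 0, scale, scale
    pre_scale = max(peaks[:transient]) or 1.0
    post_scale = max(peaks[transient:])
    return transient, pre_scale, post_scale


def normalise(window, num_subwindows=4, jump_ratio=4.0):
    """Normalize each sample by the scale factor of its side of the transient."""
    transient, pre, post = analyse_window(window, num_subwindows, jump_ratio)
    split = transient * (len(window) // num_subwindows)
    normalised = [s / (pre if n < split else post) for n, s in enumerate(window)]
    return transient, normalised


# Quiet passage followed by a loud onset in the last subwindow.
attack = [0.01, -0.01] * 12 + [0.8, -0.4] * 4
steady = [0.5] * 32
```

With a single scale factor of 0.8, the quiet samples of `attack` would be normalized to about 0.0125 and largely lost to coarse quantization; with separate factors they span the full normalized range.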

Depending on the bit rate, the scale factors are quantized logarithmically using either a 64-level (2.2 dB step) or a 128-level (1.1 dB step) root square table. This allows dynamic tracking of audio over a 140 dB range. The choice of quantization table is embedded in the bit stream for each analysis window.
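Both tables cover the same range, since 64 x 2.2 dB and 128 x 1.1 dB both come to 140.8 dB. A logarithmic quantizer with these step sizes can be sketched as follows; the index convention (index 0 at full scale, counting downward) and the function names are assumptions, as the actual table contents are not given in the text.

```python
import math

# Step size in dB for each table size, as stated in the text.
STEP_DB = {64: 2.2, 128: 1.1}


def quantize_scale(value, levels, full_scale=1.0):
    """Map a positive scale factor to a table index (0 = full scale)."""
    if value <= 0:
        return levels - 1  # clamp silence to the smallest entry
    below_full_scale_db = 20 * math.log10(full_scale / value)
    index = round(below_full_scale_db / STEP_DB[levels])
    return min(levels - 1, max(0, index))


def dequantize_scale(index, levels, full_scale=1.0):
    """Recover the scale factor represented by a table index."""
    return full_scale * 10 ** (-index * STEP_DB[levels] / 20)


for levels in (64, 128):
    idx = quantize_scale(0.1, levels)
    back = dequantize_scale(idx, levels)
    print(levels, idx, f"{20 * math.log10(back / 0.1):+.2f} dB error")
```

The worst-case reconstruction error of such a quantizer is half a step, that is 1.1 dB for the 64-level table and 0.55 dB for the 128-level table.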

Bit allocation across all subbands of all channels can be accomplished using a water-filling algorithm over the time span of a subband analysis window. For high bit rate applications, the water-filling algorithm operates on the power of the subbands. For low bit rate applications, subjectively transparent coding is achieved by running a psychoacoustic analysis on all channels to obtain the signal-to-mask ratio (SMR) of each subband and then feeding these SMRs into the water-filling algorithm. In lossless or variable bit rate coding modes, the bit allocation is determined by a quantization step size that guarantees the quantization noise stays below a predetermined threshold, for example half an LSB of the source PCM samples. The bit allocation thus obtained is then embedded in the bit stream.
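A common greedy form of water-filling can be sketched as below. This is one possible realization, not necessarily the one used in the described system: it assumes each allocated bit lowers a subband's remaining level by about 6.02 dB, and the `max_bits` cap and function name are illustrative. The input levels may be subband powers in dB (high bit rates) or SMRs in dB (low bit rates).

```python
import heapq


def water_fill(levels_db, total_bits, max_bits=24):
    """Greedily hand out bits, one at a time, to the subband whose remaining
    level (power or SMR, in dB) is currently highest."""
    bits = [0] * len(levels_db)
    # Max-heap via negated levels; ties resolve to the lower subband index.
    heap = [(-level, index) for index, level in enumerate(levels_db)]
    heapq.heapify(heap)
    for _ in range(total_bits):
        if not heap:
            break  # every subband has reached max_bits
        negated_level, index = heapq.heappop(heap)
        bits[index] += 1
        if bits[index] < max_bits:
            # One more bit lowers the quantization noise by about 6.02 dB.
            heapq.heappush(heap, (negated_level + 6.02, index))
    return bits


allocation = water_fill([30.0, 18.0, 6.0, -20.0], total_bits=9)
print(allocation)
```

Subbands with high power or SMR receive bits first, and a subband far below the "water level" (the fourth one in this example) receives none.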

After bit allocation, the subband samples or the prediction residuals are quantized and the quantization indices are packed into the bit stream.

Psychoacoustic research has shown that the human auditory system bases its perception of spatial imagery more on the temporal envelope of the audio signal than on its temporal fine structure. Therefore, in very low bit rate modes it is possible to improve the overall reconstruction fidelity by encoding only a summation of the high frequency subbands of a selected number of audio channels. Upon decoding, the high frequency subbands of each individual channel can be reconstructed by copying this sum signal and then scaling it by the channel's own scale factors. If joint intensity coding is used, one of the joint channels (the source channel) carries the summed subband samples, and each of the other channels carries only an index to the source channel and its own scale factors.

In low bit rate applications, the scale factors, transient positions, bit allocation or quantization indices can be further encoded using entropy coding (for example, Huffman coding). In this case, the total number of bits actually used after entropy coding may be significantly less than the maximum number of bits allowed for a fixed bit rate application. To make full use of the allowed maximum, an iterative approach is used in which the unused bits are allocated incrementally to the subbands, starting from the lowest and ending at the highest, until all unused bits are depleted.

Hardware Implementation of the Black-Box Decoder

An implementation of a 5.1 channel, 96 kHz, 24-bit DTS decoder operating on two SHARC 21065L floating-point processors is shown in Figs. 16-20. All processing of the "core" blocks and all handling of the serial I/O data streams are performed in processor #1 (P#1) 180. Most of the signal processing operations required for extension decoding are confined to processor #2 (P#2) 182. This configuration allows a simple hardware upgrade path to be considered for the 96/24 "high definition" audio format. In particular, for "core" decoding only, processor #1 is sufficient; it interfaces with an external memory 184 through an external port 186 and is connected to an SPDIF receiver 188 and three SPDIF transmitters 190a, 190b and 190c through the output serial ports 192. The upgrade to a 96 kHz, 24-bit DTS decoder is made by connecting processor #2 to the external memory bus 194 in a cluster multiprocessing configuration. The on-chip bus arbitration logic of the SHARC allows the two processors to share a common bus.

The digital stream can be obtained from a DVD player or from the DVD transport mechanism inside a DVD player. An SPDIF receiver is needed to receive the digital stream and convert it to a format suitable for feeding the Rx serial port 195 of SHARC P#1. The incoming digital stream is transferred by DMA from the Rx serial port to a data buffer in the internal memory of SHARC P#1.

The block diagram 196 of Fig. 17 illustrates the flow of the incoming data stream. The six decoded PCM streams for the left and right channels (L, R), the surround left and surround right channels (SL, SR), and the center and low frequency effects channels (C, LFE) are multiplexed into three output streams. Three serial port DMA channels are used to transfer the output streams from the data buffers in the internal memory of SHARC P#1 to the appropriate transmit serial ports. The serial ports can be configured to feed any commercially available SPDIF transmitter or DAC.

The block diagram 198 of Fig. 18 illustrates the flow of the outgoing data stream. The cluster multiprocessing configuration allows each processor to access the shared external memory and the I/O registers of both processors. Data is exchanged between the two processors through double buffers in the shared external memory. In particular, the six channels of "core" audio data from the current DTS frame are transferred from the internal memory of P#1 to buffers in one block of the shared external memory (say, block A) using the external port DMA channel of P#1. In addition, the five channels of extension subband samples from the current DTS frame are transferred from the internal memory of P#1 to their corresponding buffers in the same block of shared external memory. The external port DMA channel of P#1 is again used for this transfer.

As shown in the block diagram 200 of Fig. 19, during the current DTS frame the "core" and extension data from the previous DTS frame are transferred from their corresponding buffers in block B of the shared external memory to the internal memory of P#2. The scheduling of these transfers and the toggling of the memory blocks (A/B) are handled by P#1 through control of the I/O registers of both processors. Similarly, the six channels of 96 kHz PCM audio from the previous DTS frame are transferred from buffers in block D of the shared external memory to the internal memory of P#1 using the external port DMA channel of P#1. The block diagram 201 of Fig. 20 illustrates this data flow. The scheduling of these transfers and the toggling of the memory blocks (C/D) are again handled by P#1 through control of the I/O registers of both processors.

Open-Box Codec I

An open-box implementation requires knowledge of the internal structure of the core codec. The codec examples shown in Figures 21 and 22 use core encoders that employ filter-bank coding techniques. These include, but are not limited to, subband coding (DTS Coherent Acoustics, MPEG I and MPEG II) and transform coding (Dolby AC-3 and MPEG II AAC). With the internal structure of the core encoder known, the extension encoder is selected and designed so that its response over the core bandwidth (for example, 0 to 24 kHz) matches that of the core encoder. As a result, the difference signal can be formed in the transform domain or subband domain rather than in the time domain, which reduces both delay and computation.

In the first example, the digital audio is represented by 24-bit PCM samples at a 96 kHz sampling rate. The digital audio is first low-pass filtered 202 to reduce its bandwidth to below 24 kHz and then decimated 204 by a factor of two, yielding an effective 48 kHz sampled signal. This down-sampled signal is then fed to the core encoder 206. An N-band filter bank 208 in the core encoder decomposes the down-sampled signal into N subbands. Each subband can be encoded using a variety of techniques such as adaptive prediction, scalar and/or vector quantization, and entropy coding 210. In an optimal configuration, the subband coding techniques match those used in the core encoder. The resulting bit stream is then placed in the packer. The bit stream is also fed to a core subband decoder 212 to reconstruct the subband samples, which the extension encoder later uses to generate the subband difference signals.
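The low-pass filtering and decimation step can be sketched as follows. The sketch assumes a Hamming-windowed sinc FIR filter; the filter type, the 63-tap length, and the 20 kHz cutoff are illustrative assumptions, since the text only requires the bandwidth to fall below 24 kHz before decimation by two. The function names are hypothetical.

```python
import math


def lowpass_taps(num_taps, cutoff_hz, sample_rate_hz):
    """Hamming-windowed sinc low-pass FIR, normalized to unity gain at DC."""
    fc = cutoff_hz / sample_rate_hz  # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            ideal = 2 * fc
        else:
            ideal = math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def decimate_by_two(signal, taps):
    """Filter (fully overlapped region only), then keep every second sample."""
    span = len(taps)
    filtered = [
        sum(t * signal[i + j] for j, t in enumerate(taps))
        for i in range(len(signal) - span + 1)
    ]
    return filtered[::2]


# Assumed parameters: 63 taps, 20 kHz cutoff, 96 kHz input rate.
taps = lowpass_taps(63, 20_000, 96_000)
tone_40k = [math.sin(2 * math.pi * 40_000 * n / 96_000) for n in range(400)]
core_input = decimate_by_two(tone_40k, taps)  # 40 kHz content is suppressed
```

A 40 kHz tone lies well above the core bandwidth, so it is strongly attenuated before decimation and does not alias into the 48 kHz core signal.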

The 96 kHz sampled input PCM signal is delayed 214 and then fed to a two-band filter bank 216 to produce two subband signals sampled at 48 kHz. The low-band signal is decomposed into N subband signals by an N-band filter bank 218 identical to the one used in the core encoder. The corresponding subband signal reconstructed from the core encoder is subtracted from each of them to generate the subband difference signals. The difference signals are encoded by a subband coder 222 and then placed in the packer 224. The high-band signal from the two-band filter bank is fed to an M-band filter bank 226 to generate M subband signals, which are then encoded by a subband coder 228 and placed in the packer. This subband coder may employ techniques such as adaptive prediction, scalar and vector quantization, and/or entropy coding. The delay ahead of the extension encoder is given by:

$$\mathrm{Delay} + \mathrm{Delay}_{\text{2-band filter}} = \mathrm{Delay}_{\text{Decimation LPF}}$$

so that the reconstructed core subband signals and the audio subband signals are exactly time-aligned at the summing junction 220 (see Figure 21(a)). In the decoder, the signals are automatically aligned at the summing junction (see Figure 21(b)). The M-band filter bank must be designed so that its delay matches that of the N-band filter bank; otherwise, an extra delay must be introduced so that the subband signals in the high band have the same delay as those in the low band.
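The alignment condition can be checked numerically. The sketch below assumes linear-phase FIR filters, whose group delay is (taps - 1)/2 samples at the 96 kHz rate. The tap counts are hypothetical, and `extension_input_delay` is an illustrative helper rather than anything named in the source.

```python
def fir_group_delay(num_taps):
    """Group delay, in samples, of a linear-phase FIR filter."""
    return (num_taps - 1) / 2


def extension_input_delay(decimation_lpf_taps, two_band_taps):
    """Delay to insert ahead of the extension encoder so that
    Delay + Delay_2band == Delay_DecimationLPF."""
    delay = fir_group_delay(decimation_lpf_taps) - fir_group_delay(two_band_taps)
    if delay < 0:
        raise ValueError("two-band filter delay exceeds decimation LPF delay")
    return delay


# Assumed example: 129-tap decimation LPF, 65-tap two-band filter.
print(extension_input_delay(129, 65))  # 32.0
```

If the two-band filter were longer than the decimation filter, the required delay would be negative, and the compensation would have to be applied in the core path instead; the helper raises an error in that case.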

Figure 21(b) shows the decoding process. The core bit stream is unpacked 230 and decoded 232 to produce N subband signals. If the player has no extension decoder, these core subband signals are fed to an N-band synthesis filter bank 234 to produce the core audio. Otherwise, this step is skipped, and the core subband signals are fed to the extension decoder 236 and added 238 to the difference subband signals decoded 240 from the extension bit stream. The summed subband signals are then sent to an N-band synthesis filter bank 242 to produce the low-band signal. The high-band signal is formed by decoding 244 the extension bit stream and feeding the M decoded subband signals to an M-band synthesis filter bank 246. Finally, the high-band and low-band signals are sent to a two-band synthesis filter bank 248 to produce the 96 kHz sampled audio output.
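The subband-domain difference coding of the low band can be illustrated with a deliberately simplified sketch. Plain uniform quantizers stand in for the real subband coders (adaptive prediction, vector quantization, and entropy coding are omitted), the filter banks are left out, and the step sizes and function names are assumptions made only for illustration.

```python
def quantize(value, step):
    """Uniform quantizer: a stand-in for a real subband coder/decoder pair."""
    return round(value / step) * step


def encode(subband, core_step=0.5, ext_step=0.05):
    """Return (core, extension) codes for one subband's samples."""
    core = [quantize(s, core_step) for s in subband]   # core coder + local decoder
    diff = [s - c for s, c in zip(subband, core)]      # subband summing junction
    ext = [quantize(d, ext_step) for d in diff]        # extension subband coder
    return core, ext


def decode(core, ext=None):
    """First-generation decoders pass ext=None and use the core alone."""
    if ext is None:
        return list(core)
    return [c + e for c, e in zip(core, ext)]


samples = [0.12, -0.73, 0.41, 0.98]
core_codes, ext_codes = encode(samples)
core_only = decode(core_codes)              # what a core-only decoder reproduces
enhanced = decode(core_codes, ext_codes)    # core plus extension
```

With these assumed step sizes, the core-plus-extension reconstruction is closer to the input than the core-only reconstruction, which is the effect the extension bits are meant to achieve over the core bandwidth.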

One advantage of this open-box approach is reduced coding delay:

$$\text{Encoding Delay} = \mathrm{Delay}_{\text{Decimation LPF}} + \mathrm{Delay}_{\text{Core Encoder}}$$

$$\text{Decoding Delay} = \mathrm{Delay}_{\text{Core Decoder}} + \mathrm{Delay}_{\text{2-band filter}}$$

Another is reduced decoding complexity:

$$\text{Decoding MIPS} = \mathrm{MIPS}_{\text{Core Decoder}} + \mathrm{MIPS}_{\text{M-band filter}} + \mathrm{MIPS}_{\text{2-band filter}}$$

If the numbers of FIR filter taps in the M-band and two-band filter banks are chosen small enough, the MIPS required by the M-band and two-band synthesis filter banks can be less than the MIPS of the N-band synthesis filter bank operating at 48 kHz. Consequently, the total MIPS needed to decode 96 kHz audio can be less than twice the MIPS required by a core decoder handling 48 kHz sampled audio.
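The step from the first statement to the second can be spelled out as follows. It assumes, as the decoder structure suggests but the text does not state explicitly, that the N-band synthesis filter bank accounts for part of the core decoder's load:

$$\mathrm{MIPS}_{\text{M-band filter}} + \mathrm{MIPS}_{\text{2-band filter}} < \mathrm{MIPS}_{\text{N-band filter}} \le \mathrm{MIPS}_{\text{Core Decoder}}$$

$$\Rightarrow\quad \text{Decoding MIPS} = \mathrm{MIPS}_{\text{Core Decoder}} + \mathrm{MIPS}_{\text{M-band filter}} + \mathrm{MIPS}_{\text{2-band filter}} < 2\,\mathrm{MIPS}_{\text{Core Decoder}}$$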

Open-Box Codec II

In Preferred Implementation II, if the M-band filter bank is replaced by an N-band filter bank, the three filter banks in the extension encoder can be combined into a single L-band filter bank, where L = M + N (Figure 22(a) and Figure 22(b)). If cosine modulation is implemented with a fast algorithm, the combined L-band filter bank can offer a lower computational load.

The encoder in Figure 22(a) is basically the same as open-box implementation I, except that the three analysis filter banks in the extension encoder are replaced by a single L-band analysis filter bank 250, and the subband signals reconstructed from the core encoder are subtracted from the lower N subband signals of the L-band filter bank to generate the difference subband signals. This is possible because each of the lower N subbands of the L-band filter bank in the extension encoder, operating at a 96 kHz sampling rate, covers the same audio spectrum as the corresponding subband of the N-band filter bank in the core encoder, operating at a 48 kHz sampling rate. For this scheme to succeed, it is of course critical that the filter characteristics of the L-band and N-band filter banks match each other, even though they operate at different sampling frequencies.
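The spectral coverage argument can be checked with simple arithmetic. The sketch below assumes uniform, critically sampled filter banks, so each subband is (sampling rate / 2) / (number of bands) wide; the band counts and helper names are illustrative assumptions.

```python
def subband_width_hz(sample_rate_hz, num_bands):
    """Width of each subband of a uniform filter bank spanning 0..fs/2."""
    return (sample_rate_hz / 2) / num_bands


def low_bands_match(n_bands, m_bands, core_rate_hz=48_000, ext_rate_hz=96_000):
    """True if the lower N subbands of the L-band bank (L = M + N) cover the
    same spectrum as the N subbands of the core filter bank."""
    l_bands = n_bands + m_bands
    return (subband_width_hz(ext_rate_hz, l_bands)
            == subband_width_hz(core_rate_hz, n_bands))


print(low_bands_match(32, 32))  # True: 750 Hz per subband in both banks
print(low_bands_match(32, 16))  # False: 1000 Hz versus 750 Hz
```

Under the uniform-band assumption, the subband widths agree at these sampling rates only when M equals N, which is consistent with replacing the M-band filter bank by an N-band one before combining the banks.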

The decoding process shown in Figure 22(b) is almost the same as that of Figure 21(b), except that a single L-band synthesis filter bank 252 replaces the three synthesis filter banks, and the reconstructed subbands from the core decoder are added to the corresponding lower N subbands of the L-band filter bank.

Although several illustrative embodiments of the invention have been shown and described, numerous variations and alternative embodiments will occur to those skilled in the art. For example, the sampling rates discussed here correspond to current standards and may change over time. Such variations and alternative embodiments are contemplated and can be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. A multi-channel audio encoder for encoding a digital audio signal that is sampled at a known sampling rate and has an audio bandwidth, characterized by comprising:

a decimation low-pass filter (LPF) configured to receive said digital audio signal, which filters the digital audio signal to remove signal components above the audio bandwidth of a core encoder, said decimation low-pass filter producing a filtered digital audio signal output;

a decimator that down-samples the filtered signal output to extract a core signal whose sampling rate is lower than that of said digital audio signal;

a core encoder whose sampling rate and audio bandwidth match the decimator, arranged to receive said core signal and encode it into core bits;

a core decoder that decodes the core bits to form a reconstructed core signal, having an input coupled to receive said core bits from said core encoder;

an interpolator that up-samples the reconstructed core signal to the sampling rate of the extension encoder, arranged to receive said reconstructed core signal and producing an up-sampled reconstructed core signal as output;

an interpolation LPF that receives said up-sampled reconstructed core signal from said interpolator and filters it to remove interpolation aliasing, producing a filtered up-sampled reconstructed core signal as output;

a summing node configured to receive said digital audio signal and said filtered up-sampled reconstructed core signal, which subtracts said filtered up-sampled reconstructed core signal from the digital audio signal to form a difference signal, wherein said extension encoder encodes the difference signal into extension bits;

an extension encoder whose sampling rate and audio bandwidth equal those of said digital audio signal, arranged to receive said difference signal and encode it into extension bits; and

a packer that packs the core bits and the extension bits into a bit stream in a core-plus-extension format, in which a first-generation audio decoder can extract and decode the core bits to reproduce the audio signal, and a second-generation audio decoder can extract the core bits plus the extension bits to reproduce a higher-quality audio signal.

2. The multi-channel audio encoder as claimed in claim 1, characterized in that said extension encoder allocates said extension bits between (a) the core spectrum, to reduce coding error in the core spectrum by providing improved resolution there, and (b) the extension spectrum, to encode frequencies above the bandwidth of said core encoder, whereby said extension bits refine the noise floor over the audio bandwidth of the core encoder and define the noise floor over the remainder of the audio bandwidth of the extension encoder.

3. The multi-channel audio encoder as claimed in claim 1, characterized in that said extension encoder allocates said extension bits in the vicinity of the transition band of said decimation LPF and in higher frequency bands, thereby defining the noise floor over the bandwidth of the extension encoder above said transition band;

wherein said core bits define the noise floor of the reconstructed core signal over its audio bandwidth.

4. The multi-channel audio encoder as claimed in claim 1, characterized in that the encoder maintains compatibility with an existing base of first-generation audio decoders while providing higher-quality sound reproduction with second-generation audio decoders, said core decoder comprising one of said first-generation audio decoders and said core encoder being compatible with the first-generation decoders.

5. A multi-channel audio encoder for encoding a digital audio signal that is sampled at a known sampling rate and has an audio bandwidth, characterized by comprising:

a decimation low-pass filter (LPF) configured to receive said digital audio signal, which filters the digital audio signal to remove signal components above the audio bandwidth of the decimation low-pass filter, said decimation low-pass filter having a transition band in the vicinity of the audio bandwidth of said core encoder and producing a filtered digital audio signal output;

an extension encoder whose sampling rate and audio bandwidth equal those of said digital audio signal, arranged to receive said difference signal and encode it into extension bits, said extension encoder allocating bits in said transition band and in frequency bands above it to extend the frequency range of said encoded signal; and

6. A multi-channel audio encoder for encoding a digital audio signal that is sampled at a known sampling rate and has an audio bandwidth, characterized by comprising:

a core encoder that extracts a core signal from the digital audio signal over an audio bandwidth and encodes it into core bits, said core encoder comprising an N-band filter bank that decomposes the core signal into N subbands, N subband coders that generate the core bits, and N subband decoders that reconstruct N subband samples to form a reconstructed core signal;

a summing node that forms a difference signal in the transform domain or subband domain from the reconstructed core signal and the digital audio signal;

an extension encoder that encodes said difference signal into extension bits, said extension encoder matching the core encoder over its audio bandwidth and comprising:

a two-band filter bank that splits the digital audio signal into a low band and a high band;

an N-band filter bank equivalent to that of the core encoder, which decomposes the digital audio signal in the low band into N subbands, said summing node residing in said extension encoder and comprising N subband nodes that respectively subtract the N reconstructed subband samples from the N subbands of the digital audio signal to form N difference subbands;

N subband coders that encode the N difference subbands to form low-band extension bits;

an M-band filter bank that decomposes the digital audio signal in the high band into M subbands; and

M subband coders that encode the M subbands to form high-band extension bits.

7. A multi-channel audio encoder for encoding a digital audio signal that is sampled at a known sampling rate and has an audio bandwidth, characterized by comprising:

an L-band filter bank that decomposes the digital audio signal into N low-frequency subbands and M high-frequency subbands, the characteristics of said L-band filter bank over its N low-frequency bands matching the characteristics of said N-band filter bank, said summing node residing in said extension encoder and comprising N subband nodes that respectively subtract the N reconstructed subband samples from the N subbands of the digital audio signal to form N difference subbands;

8. A multi-channel black-box audio decoder for reconstructing multiple audio channels from a bit stream, wherein each audio channel is sampled at a known sampling rate and has an audio bandwidth, characterized by comprising:

an unpacker for reading in and storing the bit stream one frame at a time, each said frame comprising a core field having core bits and an extension field having a sync word and extension bits, said unpacker extracting said core bits and detecting said sync word to extract and separate the extension bits;

a core decoder that decodes the core bits to form a reconstructed core signal;

an extension decoder that decodes the extension bits to form a reconstructed difference signal, said extension decoder having a sampling rate and an audio bandwidth greater than those of said core decoder;

an interpolator that up-samples the reconstructed core signal to the sampling rate of the extension encoder;

a low-pass filter that filters the up-sampled reconstructed core signal to attenuate interpolation aliasing; and

a summing node that adds the reconstructed difference audio signal to the reconstructed core audio signal to improve the fidelity of the reconstructed core signal and extend its audio bandwidth.

9. A multi-channel black-box audio decoder for reconstructing multiple audio channels from a bit stream, wherein each audio channel is sampled at a known sampling rate and has an audio bandwidth, characterized by comprising:

N core subband decoders that decode the core bits into N core subband signals;

N extension subband decoders that decode the extension bits into N low-frequency extension subband signals;

M extension subband decoders that decode the extension bits into M high-frequency extension subband signals;

N summing nodes that add the N core subband signals to the respective N extension subband signals to form N composite subband signals; and

a filter that synthesizes the N composite subband signals and the M extension subband signals to reproduce the multi-channel audio signal.

10. The audio decoder as claimed in claim 9, characterized in that said filter is a single (M+N)-band filter bank in which the N lower bands are compatible with the N core subband decoders.

11. The audio decoder as claimed in claim 9, characterized in that said filter comprises:

an N-band filter bank, compatible with the N core subband decoders, that synthesizes the N composite subband signals;

an M-band filter bank that synthesizes the M extension subband signals; and

a two-band filter bank that combines the outputs of the N-band and M-band filter banks to form the multi-channel audio signal.

12. A method of encoding a multi-channel digital audio signal that is sampled at a known sampling rate and has an audio bandwidth, which maintains compatibility with an existing base of first-generation audio decoders while providing higher-quality sound reproduction with second-generation audio decoders, characterized by comprising the steps of:

low-pass filtering the digital audio signal to remove signal components above the core audio bandwidth;

down-sampling the filtered signal to extract a core signal whose sampling rate matches the core sampling rate;

encoding the core signal into core bits at the core sampling rate and audio bandwidth, without fold-back aliasing, in a manner compatible with said first-generation audio decoders, said core sampling rate and audio bandwidth being less than the sampling rate and audio bandwidth of the digital audio signal;

decoding the core bits with a first-generation audio decoder to form a reconstructed core signal;

up-sampling the reconstructed core signal to the extension sampling rate;

low-pass filtering the up-sampled reconstructed core signal to remove interpolation aliasing;

subtracting said filtered signal from the digital audio signal to form said difference signal;

encoding the difference signal at an extension sampling rate and audio bandwidth equal to the sampling rate and audio bandwidth of said digital audio signal; and

packing the core bits and the extension bits into a bit stream in a core-plus-extension format, in which a first-generation audio decoder can extract and decode the core bits to reproduce the audio signal, and a second-generation audio decoder can extract and decode the core bits plus the extension bits to reproduce a higher-quality audio signal.

13. A method of encoding a multi-channel digital audio signal that is sampled at a known sampling rate and has an audio bandwidth, which maintains compatibility with an existing base of first-generation audio decoders while providing higher-quality sound reproduction with second-generation audio decoders, characterized by comprising the steps of:

low-pass filtering the digital audio signal to remove signal components above the core audio bandwidth, said filtering having a transition band around the core audio bandwidth;

up-sampling the core signal to the extension sampling rate to form a reconstructed core signal;

filtering the reconstructed core signal to remove interpolation aliasing;

subtracting said filtered reconstructed core signal from the digital audio signal to form a difference signal;

encoding the difference signal into extension bits at an extension sampling rate and audio bandwidth equal to the sampling rate and audio bandwidth of said digital audio signal, and allocating said extension bits in said transition band and above the transition band to extend the frequency range of the encoded audio signal; and

packing the core bits and the extension bits into a bit stream in a core-plus-extension format, in which a first-generation audio decoder can extract and decode the core bits to reproduce the audio signal, and a second-generation audio decoder can extract and decode the core bits plus the extension bits to reproduce a higher-quality audio signal.

14. A method of reconstructing a multi-channel audio signal, characterized by comprising the steps of:

receiving a sequence of encoded frames, each said frame comprising a core field in which a core sync word immediately precedes the core bits and an extension field in which an extension sync word immediately precedes the extension bits;

detecting the core sync word to extract the core bits and then decoding them into a reconstructed core signal;

detecting the extension sync word to extract the extension bits and then decoding them into a reconstructed difference signal at a sampling rate and audio bandwidth greater than those of said core bits;

up-sampling the reconstructed core signal to the sampling rate of the reconstructed difference signal;

low-pass filtering the up-sampled reconstructed core signal to attenuate interpolation aliasing; and

adding the filtered reconstructed core signal and the reconstructed difference signal to reconstruct the multi-channel audio signal.