CN101160725A - Lossless information encoding for maximum bitrate - Google Patents
- Publication number: CN101160725A
- Other identifiers: CNA2006800120914A, CN200680012091A
- Authority: CN (China)
- Prior art keywords: information, value, coded representation, rule, coding
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Landscapes
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
A compact encoded representation of information values that does not exceed a predetermined size can be derived when a first encoding rule, which produces an encoded representation of the information values with variable length, is compared with a second encoding rule, which produces an encoded representation of the information values with fixed length, and the encoding rule whose encoded representation requires the lower number of information units is selected. The maximum bit rate is thereby guaranteed not to exceed the bit rate of the second encoding rule. The choice of encoding rule is signaled by rule information sent together with the encoded representation of the information values, so that the correct information values can later be derived at the decoder using a decoding rule that matches the encoding rule used during encoding.
Description
Technical Field
The present invention relates to the lossless encoding of information values, and in particular to a concept for guaranteeing a maximum bit rate for the encoded representation of the information values.
Background
Recently, multi-channel audio reproduction techniques have become increasingly important. This may be because audio compression/encoding techniques, such as the well-known mp3 technique, have made it possible to distribute audio recordings over networks or other transmission channels with limited bandwidth. The mp3 coding technique became so well known because it allows recordings to be distributed in a stereo format, that is, as a digital representation of the audio recording comprising a first or left stereo channel and a second or right stereo channel.
Nevertheless, conventional two-channel sound systems have basic shortcomings, which is why surround techniques were developed. A recommended multi-channel surround representation includes, in addition to the two stereo channels L and R, an additional center channel C and two surround channels Ls and Rs. This reference sound format is also referred to as three/two stereo, meaning three front channels and two surround channels. Generally, five transmission channels are required. In a playback environment, at least five loudspeakers at five suitable positions are needed to obtain an optimum sweet spot at a certain distance from the five well-placed loudspeakers.
Several techniques are known for reducing the amount of data required to transmit a multi-channel audio signal; they are called joint stereo techniques. For this purpose, reference is made to Fig. 9, which shows a joint stereo device 60. This device may be a device implementing, for example, intensity stereo (IS) or binaural cue coding (BCC). Such a device generally receives at least two channels (CH1, CH2, ..., CHn) as input and outputs at least a single carrier channel and parametric data. The parametric data are defined such that an approximation of the original channels (CH1, CH2, ..., CHn) can be calculated in a decoder.
Normally, the carrier channel includes subband samples, spectral coefficients, time-domain samples and the like, which provide a comparatively fine representation of the underlying signal. The parametric data, by contrast, contain no such samples or spectral coefficients but instead contain control parameters for a particular reconstruction algorithm, such as weighting by multiplication, time shifting, frequency shifting, or phase shifting. The parametric data therefore provide only a comparatively coarse representation of the signal or of the associated channel. Stated in numbers, the amount of data required by a carrier channel is in the range of about 60 kbit/s to 70 kbit/s, whereas the parametric side information for one channel typically requires 1.5 kbit/s to 2.5 kbit/s. Well-known examples of parametric data are scale factors, intensity stereo information, and binaural cue parameters, as described below.
The BCC technique is described, for example, in: AES Convention Paper 5574, "Binaural Cue Coding applied to Stereo and Multi-Channel Audio Compression", C. Faller, F. Baumgarte, May 2002, Munich; the IEEE WASPAA paper "Efficient representation of spatial audio using perceptual parametrization", October 2001, Mohonk, NY; "Binaural cue coding applied to audio compression with flexible rendering", C. Faller and F. Baumgarte, AES 113th Convention, Los Angeles, Preprint 5686, October 2002; and "Binaural cue coding - Part II: Schemes and applications", C. Faller and F. Baumgarte, IEEE Trans. on Speech and Audio Proc., vol. 11, no. 6, Nov. 2003.
In BCC encoding, a number of audio input channels are first converted to a spectral representation using a transform based on the discrete Fourier transform (DFT) with overlapping windows. The resulting uniform spectrum is divided into non-overlapping partitions, each with a bandwidth approximately proportional to the equivalent rectangular bandwidth (ERB). For each partition, the BCC parameters are then estimated between two channels. The BCC parameters of each channel are normally given relative to a reference channel and are furthermore quantized. The transmitted parameters are finally calculated (encoded) according to prescribed formulas, which may also depend on the specific partitions of the signal to be processed.
A number of BCC parameters exist. The ICLD (inter-channel level difference) parameter, for example, describes the difference (ratio) of the energies contained in two compared channels. The ICC (inter-channel coherence/correlation) parameter describes the correlation between two channels, which can be understood as the similarity of their waveforms. The ICTD (inter-channel time difference) parameter describes a global time shift between two channels, whereas the IPD (inter-channel phase difference) parameter describes the difference in phase between the signals.
It should be noted that, in frame-wise processing of an audio signal, the BCC analysis is also performed frame-wise, i.e. time-varying, and in addition frequency-wise. This means that the BCC parameters are obtained individually for each spectral band. It further means that, if an audio filter bank decomposes the input signal into, for example, 32 band-pass signals, a BCC analysis block obtains a set of BCC parameters for each of the 32 bands.
A related technique, known as parametric stereo, is described in: J. Breebaart, S. van de Par, A. Kohlrausch, E. Schuijers, "High-Quality Parametric Spatial Audio Coding at Low Bitrates", AES 116th Convention, Berlin, Preprint 6072, May 2004; and E. Schuijers, J. Breebaart, H. Purnhagen, J. Engdegard, "Low Complexity Parametric Stereo Coding", AES 116th Convention, Berlin, Preprint 6073, May 2004.
In summary, recent approaches to the parametric coding of multi-channel audio signals (spatial audio coding, binaural cue coding, etc.) represent a multi-channel audio signal by means of a downmix signal (which may be monophonic or comprise several channels) and parametric side information (spatial cues) that characterizes the perceived spatial sound stage. It is desirable to keep the data rate of the side information as low as possible in order to minimize overhead and to leave as much of the available transmission capacity as possible for coding the downmix signal.
One way to keep the bit rate of the side information low is to encode the side information of a spatial audio scheme losslessly, for example by applying an entropy coding algorithm to it.
Lossless coding has been used extensively in general audio coding to ensure an optimally compact representation of quantized spectral coefficients and side information. Examples of suitable coding schemes and methods can be found in the ISO/IEC standards MPEG-1 Part 3, MPEG-2 Part 7, and MPEG-4 Part 3.
These standards and, for example, the IEEE paper "Noiseless Coding of Quantized Spectral Coefficients in MPEG-2 Advanced Audio Coding", S. R. Quackenbush, J. D. Johnston, IEEE WASPAA, Mohonk, NY, October 1997, describe state-of-the-art techniques that use the following measures to encode quantized parameters losslessly:
●Multi-dimensional Huffman coding of quantized spectral coefficients
●Use of a common (multi-dimensional) Huffman codebook for sets of coefficients
●Coding the value either as a whole, or coding sign information and magnitude information separately (i.e. having Huffman codebook entries only for a given absolute value, which reduces the required codebook size; "signed" versus "unsigned" codebooks)
●Use of alternative codebooks with different largest absolute values (LAVs), i.e. different maximum absolute values among the parameters to be encoded
●Use of alternative codebooks with different statistical distributions for each LAV
●Transmission of the choice of Huffman codebook to the decoder as side information
●Use of "sections" to define the range of application of each selected Huffman codebook
●Differential encoding of scale factors over frequency, followed by Huffman coding of the result
Another technique, which losslessly encodes coarsely quantized values into a single PCM code, is proposed in the MPEG-1 audio standard (where it is called grouping and is used for Layer 2); it is explained in more detail in the standard ISO/IEC 11172-3:93.
The publication "Binaural cue coding - Part II: Schemes and applications", C. Faller and F. Baumgarte, IEEE Trans. on Speech and Audio Proc., vol. 11, no. 6, Nov. 2003, gives some information on the coding of BCC parameters. It proposes that quantized ICLD parameters be differentially encoded:
●over frequency, with the result subsequently Huffman encoded (using a one-dimensional Huffman code)
●over time, with the result subsequently Huffman encoded (using a one-dimensional Huffman code)
Finally, the more efficient variant is selected as the representation of the original audio signal.
As mentioned above, it has been proposed to optimize compression performance by applying differential coding over frequency (and optionally over time) and then selecting the more efficient variant. The selected variant is then signaled to the decoder by means of some side information.
The prior-art techniques described above serve to reduce the amount of data that has to be transmitted, for example in an audio or video stream. Lossless coding techniques based on the entropy coding schemes described above generally produce a bit stream with a non-constant bit rate.
Although these prior-art techniques are suitable for significantly reducing the size of the data to be transferred, they all share one basic drawback. Because entropy coding mainly compresses information values that are assumed to occur often in the data set to be compressed, a long run of consecutive rare parameters results in a very long code. Since such parameter combinations are likely to occur from time to time in a complex data stream, the resulting bit stream will generally contain sections with a comparatively high bit rate.
If the bit rate in these sections exceeds the maximum feasible bit rate of the transport medium, for example the maximum net data rate of a wireless connection in a streaming application, the transfer of the encoded data may stall or even be interrupted, which is of course most disadvantageous.
Summary of the Invention
It is the object of the present invention to provide a concept for the lossless encoding of information values that at the same time guarantees a lower maximum bit rate.
According to a first aspect of the present invention, this object is achieved by an encoder for encoding information values that are described by more than one bit, in order to derive an encoded representation of the information values, the encoder comprising: a bit estimator adapted to estimate the number of information units required for encoding the information values using a first encoding rule and using a second encoding rule, the first encoding rule being such that the information values, when encoded, result in encoded representations having different numbers of information units, and the second encoding rule being such that the information values, when encoded, result in encoded representations having identical numbers of information units, wherein the encoded representation is derived from a combination of information values having at least two combined information values; and a provider adapted to provide the encoded representation derived using the encoding rule that results in the lower number of information units for the encoded representation, and to provide rule information indicating the encoding rule on which the encoded representation is based.
According to a second aspect of the present invention, this object is achieved by a decoder for decoding an encoded representation of information values that are described by more than one bit, and for processing rule information indicating the encoding rule used for encoding the information values, the decoder comprising: a receiver for receiving the encoded representation and the rule information; and a decompressor for decoding the encoded representation, the decompressor deriving the information values using, depending on the rule information, a first decoding rule or a second decoding rule, the first decoding rule being such that the information values are derived from encoded representations having different numbers of information units, and the second decoding rule being such that the information values are derived from encoded representations having identical numbers of information units, wherein the information values are derived from a combination of information values, within the encoded representation, having at least two combined information values.
According to a third aspect of the present invention, this object is achieved by a method for encoding information values that are described by more than one bit, in order to derive an encoded representation of the information values, the method comprising: estimating the number of information units required for encoding the information values using a first encoding rule and using a second encoding rule, the first encoding rule being such that the information values, when encoded, result in encoded representations having different numbers of information units, and the second encoding rule being such that the information values, when encoded, result in encoded representations having identical numbers of information units, wherein the encoded representation is derived from a combination of information values having at least two combined information values; and providing the encoded representation derived using the encoding rule that results in the lower number of information units for the encoded representation, and providing rule information indicating the encoding rule on which the encoded representation is based.
According to a fourth aspect of the present invention, this object is achieved by a computer program that, when running on a computer, performs the above method.
According to a fifth aspect of the present invention, this object is achieved by a method for decoding an encoded representation of information values that are described by more than one bit, and for processing rule information indicating the encoding rule used for encoding the information values, the method comprising: receiving the encoded representation and the rule information; and decoding the encoded representation using, depending on the rule information, a first decoding rule or a second decoding rule, the first decoding rule being such that the information values are derived from encoded representations having different numbers of information units, and the second decoding rule being such that the information values are derived from encoded representations having identical numbers of information units, wherein the information values are derived from a combination of information values, within the encoded representation, having at least two combined information values.
According to a sixth aspect of the present invention, this object is achieved by a computer program that, when running on a computer, performs the above method.
According to a seventh aspect of the present invention, this object is achieved by an encoded representation of information values, the encoded representation comprising: a first part generated using a first encoding rule, the first encoding rule being such that the information values, when encoded, result in encoded representations having different numbers of information units; a second part generated using a second encoding rule, the second encoding rule being such that the information values, when encoded, result in encoded representations having identical numbers of information units, wherein the encoded representation is derived from a combination of information values having at least two combined information values; and rule information indicating the encoding rule used.
The present invention is based on the finding that a compact encoded representation of information values that does not exceed a predetermined size can be derived when a first encoding rule, which produces an encoded representation of the information values with variable length, is compared with a second encoding rule, which produces an encoded representation with fixed length, and the encoding rule whose encoded representation requires the lower number of information units is selected. The maximum bit rate is thereby guaranteed not to exceed the bit rate of the second encoding rule. The choice of encoding rule is signaled by rule information sent together with the encoded representation of the information values, so that the correct information values can later be derived at the decoder using a decoding rule that matches the encoding rule used during encoding.
The principle is outlined in more detail in the following paragraphs, assuming an appropriately designed variable-length code that matches the statistics of the information values to be encoded.
When entropy coding is applied to quantized values, the actual bit demand for representing a data set is known to depend on the values to be coded. Generally, the more probable the values are, the fewer bits are consumed; conversely, very improbable data sets require a high bit rate. It may therefore happen that some data blocks require a very high data rate, which is disadvantageous, for example, when the transmission channel has limited capacity.
The proposed method guarantees a known upper limit for the bit demand of encoding entropy-coded data sets, even for very rare values. Specifically, the method ensures that the bit demand does not exceed the bit demand of a PCM code. The encoding method can be summarized as follows:
●Encode the data set using a conventional entropy (e.g. Huffman) coding process. Store the resulting bit demand.
●Calculate the bit demand of a PCM representation. Note that this is simply the number of values to be encoded multiplied by the PCM code length, or by a fraction of the PCM code length, and is therefore easy to calculate.
●If the bit demand of the entropy coding exceeds that of the PCM coding, select PCM coding and signal this choice to the decoder by appropriate side information.
The decoding stage works correspondingly.
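The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the string labels used as rule information, and the table `CODE_LENGTHS` (code word lengths of a hypothetical variable-length code) are all invented for the example.

```python
def pcm_bit_demand(num_values, num_levels):
    """Bit demand of plain PCM: every value uses the same fixed word length."""
    # Smallest word length that can distinguish num_levels different values.
    word_length = max(1, (num_levels - 1).bit_length())
    return num_values * word_length


def entropy_bit_demand(values, code_lengths):
    """Bit demand of a variable-length code, given its code word lengths."""
    return sum(code_lengths[v] for v in values)


def select_rule(values, code_lengths, num_levels):
    """Return (rule information, bit demand) for the cheaper encoding rule."""
    entropy_bits = entropy_bit_demand(values, code_lengths)
    pcm_bits = pcm_bit_demand(len(values), num_levels)
    # Fall back to PCM only if the entropy code is more expensive.
    if entropy_bits > pcm_bits:
        return "pcm", pcm_bits
    return "entropy", entropy_bits


# Hypothetical code word lengths for a 4-level quantizer (assumption).
CODE_LENGTHS = {0: 1, 1: 2, 2: 3, 3: 3}

print(select_rule([0, 0, 1, 2], CODE_LENGTHS, num_levels=4))  # frequent values
print(select_rule([3, 3, 3, 3], CODE_LENGTHS, num_levels=4))  # rare values
```

Because the fixed-length bit demand depends only on the number of values, the result of `select_rule` can never exceed `pcm_bit_demand`, which is the upper limit the method is meant to guarantee (the bits spent on the rule information itself are not counted in this sketch).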
In a preferred embodiment of the present invention, quantized values are encoded by comparing an entropy coding scheme with a PCM code.
In the above embodiment of the invention, the maximum bit rate is defined by the word length of the PCM code. Knowing this word length, a system comprising an encoder, a transmission medium, and a decoder can advantageously be designed for safe operation by choosing a transmission medium whose capacity exceeds the maximum bit rate defined by the PCM code.
In a second preferred embodiment, which builds on the previous embodiment, several information values are additionally combined into a single value that can be represented more efficiently by a PCM code, i.e. a single value whose range is close to a power of two. Grouping is explained in more detail by the following example:
A quantized variable with a range of 0...4 (i.e. 5 possible distinct values) cannot be represented efficiently by a PCM code, since the smallest possible code length of 3 bits wastes 3 of the 2^3 = 8 possible values. Combining three such variables (and thus 5^3 = 125 possible combinations) into a single 7-bit code reduces the redundancy significantly, since 5^3 = 125 is close to 2^7 = 128.
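The arithmetic of this example can be checked with a short sketch. The helper names are assumptions made for illustration; the calculation itself is just the smallest word length that covers all combinations of a group.

```python
def grouped_word_length(levels, group_size):
    """Smallest PCM word length (bits) covering levels**group_size combinations."""
    combinations = levels ** group_size
    return max(1, (combinations - 1).bit_length())


def unused_code_words(levels, group_size):
    """Number of PCM code words that no combination of values ever uses."""
    return 2 ** grouped_word_length(levels, group_size) - levels ** group_size


# Range 0...4 means 5 levels; compare group sizes 1, 2 and 3.
for group_size in (1, 2, 3):
    bits = grouped_word_length(5, group_size)
    print(group_size, bits, bits / group_size, unused_code_words(5, group_size))
```

For a group size of 1 this reproduces the 3-bit word with 3 wasted code words, and for a group size of 3 the 7-bit word covering 125 of 128 code words.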
A combined implementation of this grouping method and the proposed concept for bounding the bit demand would therefore use grouped PCM coding, instead of plain PCM, as the fall-back coding that determines the upper limit of the data rate.
This combined implementation has the clear advantage of lowering the maximum bit rate further.
Brief Description of the Drawings
Preferred embodiments of the present invention are described below with reference to the accompanying drawings, in which:
Fig. 1 shows an inventive encoder;
Fig. 2 shows an example of bit estimation according to the inventive concept;
Fig. 3a shows the grouping of two information values prior to PCM encoding;
Fig. 3b shows the grouping of three information values;
Fig. 4 shows an inventive decoder; and
Fig. 5 shows a multi-channel audio encoder according to the prior art.
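Detailed Description
Fig. 1 shows a block diagram of an inventive encoder for encoding information values, i.e. for deriving an encoded representation of the information values, while guaranteeing a fixed maximum bit rate. The encoder 100 comprises a bit estimator 102 and a provider 104.
The information values 106 to be encoded are input to the bit estimator 102 and to the provider 104. In one possible implementation, the bit estimator 102 estimates the required number of information units using a first encoding rule and using a second encoding rule. The information as to which encoding rule results in the encoded representation requiring the lower number of information units is made available to the provider 104 via the rule data link 108. The provider 104 then encodes the information values 106 using the indicated encoding rule and delivers at its output the encoded representation 110 together with the rule information 112, which indicates the encoding rule used.
In a modification of the embodiment described above, the bit estimator 102 encodes the information values 106 using both the first and the second encoding rule. The bit estimator 102 then counts the information units required by each of the two encoded representations and passes the encoded representation with the lower number of information units, together with the rule information, to the provider 104. The possible transfer of the already encoded representation from the bit estimator to the provider is indicated by the dashed data link 114 in Fig. 1. The provider 104 then simply forwards the encoded representation to its output and additionally delivers the rule information 112.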
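Fig. 2 shows how the bit estimator 102 estimates the number of bits required to derive an encoded representation, by comparing a Huffman code with a PCM code.
The Huffman codebook 120 assigns integer values 122 to code words 124, which are represented by sequences of bits. Note that the Huffman codebook is chosen to be as simple as possible here in order to focus on the basic idea of the inventive concept.
The PCM code used for the comparison, and for guaranteeing a maximum constant bit rate, consists of PCM code words with a length of 4 bits, allowing 16 possible code words, as shown in the PCM description 126.
In the simple example shown here, the information values 128 to be encoded are represented by six consecutive integers (0, 1, 1, 2, 5, 6), where each information value has only ten possible settings. The information values 128 are input to the bit estimator 102, which derives the number of bits required to build the encoded representation using the Huffman codebook (as indicated in the Huffman section 130 of the bit estimator 102) and using the PCM representation (as indicated in the PCM section 132). As can be seen from Fig. 2, the entropy-coded representation of the information values requires 22 bits, whereas the PCM-coded representation requires 24 bits, i.e. the number of information values multiplied by the bit length of a single PCM code word (6 × 4). In the case of Fig. 2, the inventive encoder will therefore use the entropy-coded representation and signal appropriate rule information, which is output together with the entropy-coded representation.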
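A comparison of this kind can be sketched by actually encoding the six values with both rules and counting bits. The code word lengths in `LENGTHS` are invented for illustration; they are not the codebook 120 of Fig. 2, are not claimed to be optimal, and were merely chosen so that the total matches the 22 bits quoted in the text. The canonical assignment of code words from lengths is likewise an assumption of this sketch.

```python
def canonical_codes(lengths):
    """Build prefix-free code words from a {value: length} table."""
    codes, code, prev_len = {}, 0, 0
    # Shorter code words first; ties broken by the value itself.
    for value, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= length - prev_len
        codes[value] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return codes


# Hypothetical code word lengths for the ten possible settings 0..9.
LENGTHS = {0: 2, 1: 3, 2: 3, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 8}
CODES = canonical_codes(LENGTHS)

values = [0, 1, 1, 2, 5, 6]

# Variable-length representation: concatenate the code words.
entropy_bits = "".join(CODES[v] for v in values)
# Fixed-length representation: one 4-bit PCM word per value.
pcm_bits = "".join(format(v, "04b") for v in values)

# Keep the shorter representation and remember which rule produced it.
rule = "pcm" if len(entropy_bits) > len(pcm_bits) else "entropy"
print(rule, len(entropy_bits), len(pcm_bits))
```

With these assumed lengths the variable-length representation is the shorter one, so the sketch would output it together with the rule label, mirroring the decision described for Fig. 2.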
Figs. 3a and 3b show how the maximum bit rate can be reduced further by advantageously grouping the information values 128 into groups of information values before PCM encoding.
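In the following, the same information values 128 as in Fig. 2 are used, to emphasize the effect of PCM grouping on the inventive concept of encoding information values.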
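Again, since a single information value has only ten possible settings, two consecutive information values can advantageously be combined into groups of information values 140a to 140c before the PCM representation of the combined value is built. This is possible because a 7-bit PCM code allows 128 different combinations, whereas a group of two arbitrary information values can form only 100 different combinations.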
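Each of the groups of information values 140a to 140c is now assigned a single 7-bit PCM code word 142a to 142c. As can be seen from Fig. 3a, applying the grouping strategy before building the PCM representation yields an encoded representation of the information values 128 with only 21 bits, compared with the 24 bits required by the ungrouped PCM representation of Fig. 2. With this grouping strategy, each information value consumes on average 3.5 bits in the data stream (7 bits / 2 information values).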
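As shown in Fig. 3b, the efficiency of the grouping can be increased further by grouping three values into groups of information values 146a and 146b. This yields 1000 possible combinations, which can be covered by a 10-bit PCM code (1024 code words), as shown by the PCM code words 148a and 148b in Fig. 3b. The PCM representation therefore requires only 20 bits, lowering the average number of bits per information value further to 3.33 (10/3).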
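The grouping of Figs. 3a and 3b can be sketched as follows. The text does not state how a group is mapped to its PCM code word, so the base-10 positional mapping used here (first value as the most significant digit) is an assumption, as are the function names.

```python
from typing import List


def bits_needed(num_combinations: int) -> int:
    """Smallest PCM word length that can index num_combinations distinct groups."""
    return max(1, (num_combinations - 1).bit_length())


def encode_grouped_pcm(values: List[int], group_size: int, alphabet_size: int = 10) -> str:
    """Combine group_size values into one integer and emit it as a fixed-length word."""
    if len(values) % group_size != 0:
        raise ValueError("number of values must be a multiple of the group size")
    word_bits = bits_needed(alphabet_size ** group_size)
    words = []
    for start in range(0, len(values), group_size):
        combined = 0
        for v in values[start:start + group_size]:
            if not 0 <= v < alphabet_size:
                raise ValueError(f"value {v} outside 0..{alphabet_size - 1}")
            combined = combined * alphabet_size + v  # assumed positional mapping
        words.append(format(combined, f"0{word_bits}b"))
    return "".join(words)


def decode_grouped_pcm(bits: str, group_size: int, alphabet_size: int = 10) -> List[int]:
    """Invert encode_grouped_pcm by splitting each word back into its digits."""
    word_bits = bits_needed(alphabet_size ** group_size)
    values: List[int] = []
    for start in range(0, len(bits), word_bits):
        combined = int(bits[start:start + word_bits], 2)
        group = []
        for _ in range(group_size):
            combined, digit = divmod(combined, alphabet_size)
            group.append(digit)
        values.extend(reversed(group))
    return values


values = [0, 1, 1, 2, 5, 6]
for k in (1, 2, 3):
    coded = encode_grouped_pcm(values, k)
    assert decode_grouped_pcm(coded, k) == values  # grouping is lossless
    print(k, len(coded))
```

For the six example values, group sizes 1, 2 and 3 give 24, 21 and 20 bits respectively, matching the counts stated for the ungrouped case, Fig. 3a and Fig. 3b.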
Grouping the values clearly benefits the bit rate required for encoding: relative to the 24 bits of the ungrouped PCM representation, the examples of Figs. 3a and 3b lower the maximum bit rate by 12.5% and 16.7%, respectively. Moreover, applying grouping to the example of Fig. 2 would even cause the bit estimator 102 to reach a different decision and to signal the PCM code, because the grouped PCM representation (21 or 20 bits) then requires fewer bits than the 22-bit entropy-coded representation.
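The percentages quoted above follow from simple arithmetic, which the short Python sketch below reproduces for ten possible settings per value. The helper name `pcm_bits_per_value` is hypothetical; the comparison with log2(10) is added here as context and is not taken from the text.

```python
import math


def pcm_bits_per_value(group_size: int, alphabet_size: int = 10) -> float:
    """Average bits per value when group_size values share one PCM code word."""
    word_bits = (alphabet_size ** group_size - 1).bit_length()
    return word_bits / group_size


baseline = pcm_bits_per_value(1)  # 4 bits per value without grouping
for k in (1, 2, 3):
    rate = pcm_bits_per_value(k)
    saving = 100.0 * (1.0 - rate / baseline)
    print(f"group size {k}: {rate:.2f} bits/value, saving {saving:.1f}%")

# No fixed-length scheme can go below log2(10) bits per value for ten settings.
print(f"lower bound: {math.log2(10):.2f} bits/value")
```

Groups of three already come within about 0.01 bit per value of that bound, which suggests that larger groups would bring little further gain for this alphabet size.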
Fig. 4 shows a block diagram of a decoder according to the invention. The decoder 160 comprises a decompressor 162 and a receiver 163. The receiver 163 provides the encoded representation 110 and the rule information 112, which indicates the coding rule used to encode the information values.
The decompressor 162 processes the rule information 112 to derive a decoding rule suitable for recovering the information values 106 from the encoded representation 110.
The decompressor 162 then decompresses the encoded representation 110 using that decoding rule and provides the information values 106 at its output.
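The decoder side can be sketched as a lookup from rule information to a decoding rule. This is a minimal Python illustration under stated assumptions: the rule names, the two toy decoding rules (a 4-bit PCM rule and a unary rule) and the function `decompress` are hypothetical and do not come from the patent.

```python
from typing import Callable, Dict, List


def pcm4_decode(bits: str) -> List[int]:
    """Toy fixed-length rule (assumption): read consecutive 4-bit PCM code words."""
    return [int(bits[i:i + 4], 2) for i in range(0, len(bits), 4)]


def unary_decode(bits: str) -> List[int]:
    """Toy variable-length rule (assumption): a value is a run of ones ended by a zero."""
    values: List[int] = []
    run = 0
    for bit in bits:
        if bit == "1":
            run += 1
        else:
            values.append(run)
            run = 0
    return values


# Maps the received rule information to the matching decoding rule.
DECODING_RULES: Dict[str, Callable[[str], List[int]]] = {
    "pcm4": pcm4_decode,
    "unary": unary_decode,
}


def decompress(rule_info: str, encoded: str) -> List[int]:
    """Select the decoding rule named by rule_info and apply it."""
    try:
        decode = DECODING_RULES[rule_info]
    except KeyError:
        raise ValueError(f"unknown rule information: {rule_info!r}") from None
    return decode(encoded)


print(decompress("pcm4", "10011000"))
print(decompress("unary", "01000"))
```

Keeping the rules in a table means that further coding rules can be added without changing `decompress` itself, in line with the remark that any combination of two or more codes may be compared.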
In the preceding paragraphs, the inventive concept has been detailed by comparing an entropy coding scheme, which produces codes of variable bit length, with a PCM coding scheme, which produces codes of fixed bit length. The inventive concept is by no means limited to the kinds of codes compared during this encoding process. In principle, any combination of two or more codes is suitable for the comparison and for deriving an encoded representation of the information values that is as compact as possible, in particular more compact than one derived using only a single code.
The invention is described in the context of audio coding, where parameters describing, for example, spatial properties of an audio signal are encoded and decoded according to the inventive concept. The inventive concept, which guarantees a maximum bit rate for the encoded content, can also be applied advantageously to any other parametric representation or information values.
The concept is particularly suited to implementations in which previously quantized parameters are entropy coded, since the expected coding efficiency is then high. However, a direct spectral representation of an audio or video signal can also serve as input to the inventive coding scheme. In particular, when a signal is described by different portions that follow one another in time, each time portion being described by parameters comprising a frequency representation of the signal, the coding measures described above can be applied across both frequency and time. PCM grouping can likewise be applied, grouping parameters across time or across frequency.
Although the inventive decoder described above obtains the information about which decoding rule to use from rule information signaled to the decoder, in an alternative embodiment the decoder 160 can also derive the decoding rule directly from the encoded representation 110, for example by recognizing a special bit sequence within the encoded representation. This has the advantage that the side information signaling the rule information can be omitted.
Depending on specific implementation requirements, the inventive methods can be implemented in hardware or in software. The implementation can be carried out using a digital storage medium, in particular a disk, DVD or CD having electronically readable control signals stored thereon, which cooperates with a programmable computer system such that the inventive methods are performed. In general, the invention is therefore a computer program product with program code stored on a machine-readable carrier, the program code being operative to perform the inventive methods when the computer program product runs on a computer. In other words, the inventive methods are a computer program having program code for performing at least one of the inventive methods when the computer program runs on a computer.
Although the foregoing has been particularly shown and described with reference to particular embodiments, those skilled in the art will understand that various other changes in form and detail may be made without departing from the spirit and scope of the invention. It should be understood that various changes may be made in adapting to different embodiments without departing from the broader concepts disclosed herein and encompassed by the appended claims.
Claims (20)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US67099305P | 2005-04-13 | 2005-04-13 | |
US60/670,993 | 2005-04-13 | ||
US11/233,351 | 2005-09-22 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410454271.4A Division CN104300991A (en) | 2005-04-13 | 2006-02-13 | Lossless information encoding for maximum bitrate |
Publications (1)
Publication Number | Publication Date |
---|---|
CN101160725A true CN101160725A (en) | 2008-04-09 |
Family
ID=39256956
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNA2006800120914A Pending CN101160725A (en) | 2005-04-13 | 2006-02-13 | Lossless information encoding for maximum bitrate |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101160725A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103369312A (en) * | 2012-03-27 | 2013-10-23 | 富士通株式会社 | Method and device for compressing image |
CN104704825A (en) * | 2012-08-21 | 2015-06-10 | Emc公司 | Lossless compression of fragmented image data |
CN106993191A (en) * | 2016-01-21 | 2017-07-28 | 晨星半导体股份有限公司 | Video stream decoding method and video stream decoding system |
CN107077849A (en) * | 2014-11-07 | 2017-08-18 | 三星电子株式会社 | Method and apparatus for recovering audio signal |
TWI601410B (en) * | 2016-01-11 | 2017-10-01 | 晨星半導體股份有限公司 | Video stream decoding method and video stream decoding system |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103369312A (en) * | 2012-03-27 | 2013-10-23 | 富士通株式会社 | Method and device for compressing image |
CN103369312B (en) * | 2012-03-27 | 2017-04-12 | 富士通株式会社 | Method and device for compressing image |
CN104704825A (en) * | 2012-08-21 | 2015-06-10 | Emc公司 | Lossless compression of fragmented image data |
US10249059B2 (en) | 2012-08-21 | 2019-04-02 | EMC IP Holding Company LLC | Lossless compression of fragmented image data |
US10282863B2 (en) | 2012-08-21 | 2019-05-07 | EMC IP Holding Company LLC | Lossless compression of fragmented image data |
CN104704825B (en) * | 2012-08-21 | 2019-08-30 | Emc 公司 | The lossless compression of segmented image data |
US11049283B2 (en) | 2012-08-21 | 2021-06-29 | EMC IP Holding Company LLC | Lossless compression of fragmented image data |
US11074723B2 (en) | 2012-08-21 | 2021-07-27 | EMC IP Holding Company LLC | Lossless compression of fragmented image data |
CN107077849A (en) * | 2014-11-07 | 2017-08-18 | 三星电子株式会社 | Method and apparatus for recovering audio signal |
CN107077849B (en) * | 2014-11-07 | 2020-09-08 | 三星电子株式会社 | Method and apparatus for restoring audio signal |
TWI601410B (en) * | 2016-01-11 | 2017-10-01 | 晨星半導體股份有限公司 | Video stream decoding method and video stream decoding system |
CN106993191A (en) * | 2016-01-21 | 2017-07-28 | 晨星半导体股份有限公司 | Video stream decoding method and video stream decoding system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7991610B2 (en) | Adaptive grouping of parameters for enhanced coding efficiency | |
CA2601821A1 (en) | Planar multiband antenna | |
AU2006233513B2 (en) | Lossless encoding of information with guaranteed maximum bitrate | |
EP1869775B1 (en) | Entropy coding with compact codebooks | |
US20020049586A1 (en) | Audio encoder, audio decoder, and broadcasting system | |
CN101160725A (en) | Lossless information encoding for maximum bitrate |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination |