WO2007011083A1 - Apparatus and method of encoding and decoding audio signal - Google Patents

Apparatus and method of encoding and decoding audio signal

Info

Publication number
WO2007011083A1
Authority
WO
Grant status
Application
Patent type
Prior art keywords
channels
block switching
levels
switched
block
Prior art date
Application number
PCT/KR2005/002306
Other languages
French (fr)
Inventor
Tilman Liebchen
Original Assignee
Lg Electronics Inc.
Noll, Peter

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/022Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/0017Lossless audio signal coding; Perfect reconstruction of coded audio signal by transmission of coding error

Abstract

A method and apparatus of encoding and decoding an audio file are disclosed. A pair of channels of an audio data frame included in an audio file independently or synchronously is switched based on whether a correlation between the channels exists. The channels are switched hierarchically at one or more block switching levels. Then, first block switching information is generated. The first block switching information indicates whether the channels are switched independently or synchronously and how the channels are switched at the block switching levels, respectively.

Description

[DESCRIPTION]

APPARATUS AND METHOD OF ENCODING AND DECODING AUDIO SIGNAL

Technical Field

The present invention relates to a method for processing audio signal, and

more particularly to a method and apparatus of encoding and decoding audio signal.

Background Art

The storage and replaying of audio signals has been accomplished in different

ways in the past. For example, music and talk has been recorded and preserved by

phonographic technology (e.g. record players), magnetic technology (e.g. cassette

tapes), and digital technology (e.g. compact discs). As audio storage technology

progresses, many challenges need to be overcome to optimize the quality and

storability of audio signals.

For the archiving and broadband transmission of music signals, lossless

reconstruction is becoming a more important feature than high efficiency in

compression by means of perceptual coding as defined in MPEG standards such as

MP3 or AAC. Although DVD audio and Super CD Audio include proprietary lossless

compression schemes, there is a demand for an open and general compression

scheme among content-holders and broadcasters. In response to this demand, a new lossless coding scheme has been considered as an extension to the MPEG-4

Audio standard. Lossless audio coding permits the compression of digital audio data

without any loss in quality due to a perfect reconstruction of the original signal.

Disclosure of Invention

The present invention relates to a method for processing forward-adaptive

linear prediction, which offers remarkable compression even with low

predictor orders. Nevertheless, performance can be significantly improved by using

higher predictor orders, more efficient quantization and encoding of the predictor

coefficients, and adaptive block length switching.

It is an object of the invention to provide a lossless audio coding method

to permit the compression of digital audio data without any loss in quality due to a

perfect reconstruction of the original signal.

Another object of the invention is to provide lossless coding techniques for

high-definition audio signals. Audio Lossless Coding will define methods for

lossless coding of audio signals with arbitrary sampling rates, resolutions of up to 32

bit, and up to 256 channels. The lossless codec uses forward-adaptive Linear

Predictive Coding (LPC) to reduce bit rates compared to PCM, leaving the

optimization entirely to the encoder. Thus, various encoder implementations are

possible, offering a certain range in terms of efficiency and complexity. Although remarkable compression is achieved even for low predictor orders,

still better compression becomes possible using high-order prediction. In this case,

more efficient coding of the predictor coefficients is necessary in order to limit the

amount of side information. This is achieved by applying a non-linear compander to

the most important coefficients, followed by linear quantization and entropy coding of

the quantized values. In addition, adaptive block length switching is used to account

for changing signal statistics. As a result, compression ratios are comparable to the

best high-order backward adaptive prediction schemes, but with a significantly less

complex decoder, and maintaining full random access to arbitrary parts of the

encoded signal.

The present invention relates to an encoder and/or decoder (including methods

of encoding and decoding) for data. Data may be encoded or decoded in a lossless

manner. Embodiments relate to a flexible, hierarchical block switch scheme,

allowing for up to six different block lengths within a frame. Embodiments relate to

independent block switching for each channel. Embodiments relate to a maximum

predictor order of 1023.

Additional advantages, objects, and features of the invention will be set forth

in part in the description which follows and in part will become apparent to those

having ordinary skill in the art upon examination of the following or may be learned

from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written

description and claims hereof as well as the appended drawings.

To achieve these objects and other advantages and in accordance with the

purpose of the invention, as embodied and broadly described herein, a method of

processing an audio file includes switching a pair of channels of an audio data frame

included in an audio file independently when the channels are not correlated with

each other, switching the channels synchronously when the channels are correlated

with each other, and generating first block switching information indicating whether

the channels are switched independently or synchronously.

In another aspect of the present invention, a method of encoding an audio file

includes switching a pair of channels of an audio data frame included in an audio file

independently or synchronously based on whether a correlation between the

channels exists, the channels being switched hierarchically at one or more block

switching levels, and generating first block switching information indicating whether

the channels are switched independently or synchronously and how the channels are

switched at the block switching levels, respectively.

In another aspect of the present invention, a method of decoding an audio

file includes receiving an audio file including an audio data frame which has a pair of

channels, the channels being switched independently or synchronously based on

whether a correlation between the channels exists and being switched hierarchically at one or more block switching levels, and parsing first block switching information

from the audio data frame, the first block switching information indicating whether the

channels are switched independently or synchronously and how the channels are

switched at the block switching levels, respectively.

In another aspect of the present invention, an apparatus of encoding an

audio file includes an encoder configured to switch a pair of channels of an audio

data frame included in an audio file independently or synchronously based on

whether a correlation between the channels exists, the channels being switched

hierarchically at one or more block switching levels, wherein the encoder generates

first block switching information indicating whether the channels are switched

independently or synchronously and how the channels are switched at the block

switching levels, respectively.

In a further aspect of the present invention, an apparatus of decoding an

audio file includes a decoder configured to receive an audio file including an audio

data frame which includes a pair of channels, the channels being switched

independently or synchronously based on whether a correlation between the

channels exists and being switched hierarchically at one or more block switching

levels, wherein the decoder parses first block switching information from the audio

data frame, the first block switching information indicating whether the channels are

switched independently or synchronously and how the channels are switched at the block switching levels, respectively.

It is to be understood that both the foregoing general description and the

following detailed description of the present invention are exemplary and explanatory

and are intended to provide further explanation of the invention as claimed.

Brief Description of Drawings

The accompanying drawings, which are included to provide a further

understanding of the invention and are incorporated in and constitute a part of this

application, illustrate embodiment(s) of the invention and together with the

description serve to explain the principle of the invention. In the drawings:

Figure 1 is an example illustration of an audio signal encoder.

Figure 2 is an example illustration of an audio signal decoder.

Figure 3 shows measured distributions of parcor coefficients for 48 kHz, 16-bit

audio material.

Figure 4 shows the compander functions C(r) and -C(-r).

Figure 5 is an example of a block switching hierarchy structure.

Figure 6 shows block switching examples and the corresponding

block switching information codes.

Figure 7 is an example of a bit stream of old block switching scheme.

Figure 8 is an example of a bit stream of new block switching (BS) scheme: No BS (top), synchronized BS between CPE channels 1 and 2 (middle), independent

BS (bottom).

Figure 9 shows a switched difference coding scheme.

Figure 10 shows a partition of the residual distribution.

Best Mode for Carrying out the Invention

Reference will now be made in detail to the preferred embodiments of the

present invention, examples of which are illustrated in the accompanying drawings.

Wherever possible, the same reference numbers will be used throughout the

drawings to refer to the same or like parts.

Prior to describing the present invention, it should be noted that most terms

disclosed in the present invention correspond to general terms well known in the art,

but some terms have been selected by the applicant as necessary and will

hereinafter be disclosed in the following description of the present invention.

Therefore, it is preferable that the terms defined by the applicant be understood on

the basis of their meanings in the present invention.

In a lossless audio coding method, since the encoding process has to be

perfectly reversible without loss of information, several parts of both encoder and

decoder have to be implemented in a deterministic way.

[Structure of the codec]

Figure 1 shows the typical processing for one input channel of audio data. A

buffer stores one block of input samples, and an optimum set of parcor coefficients is

calculated for each block. The number of coefficients, i.e. the order of the predictor,

can be adaptively chosen as well. The quantized parcor values are entropy coded for

transmission, and converted to LPC coefficients for the prediction filter which

calculates the prediction residual. The residual is entropy coded using different

entropy codes. The indices of the chosen codes have to be transmitted as side

information.

Finally, a multiplexing unit combines coded residual, code indices, predictor

coefficients and other additional information to form the compressed bitstream. The

encoder also provides a CRC checksum, which is supplied mainly for the decoder to

verify the decoded data. On the encoder side, the CRC can be used to ensure that

the compressed data is losslessly decodable.

Additional encoder options comprise block length switching, random access

and joint channel coding. The encoder may use these options to offer several

compression levels with different complexities. The basic version of the encoder uses

a fixed block length. Optionally, the encoder can switch between different block

lengths to adapt to stationary regions as well as to transient segments of the audio

signal. The codec allows random access in defined intervals down to some milliseconds, depending on the block length.

Furthermore, joint channel coding is used to exploit dependencies between

channels of stereo or multi-channel signals. This can be achieved by coding the

difference between two channels in those segments where this difference can be

coded more efficiently than one of the original channels.
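
The selection described above can be sketched as follows. This is a minimal illustration only: the function names and the cost proxy (sum of absolute sample values) are assumptions made for this example, and an actual encoder would compare the real coded sizes of the candidate signals.

```python
def magnitude_cost(block):
    """Crude cost proxy (assumption): sum of absolute sample values."""
    return sum(abs(v) for v in block)


def choose_pair_signals(left, right):
    """Pick which two of L, R and D = R - L to code for one segment."""
    diff = [r - l for l, r in zip(left, right)]
    costs = {
        "L": magnitude_cost(left),
        "R": magnitude_cost(right),
        "D": magnitude_cost(diff),
    }
    # Any two of the three signals let the decoder rebuild both channels,
    # so the signal that looks most expensive is dropped.
    dropped = max(costs, key=costs.get)
    return sorted(name for name in costs if name != dropped)
```

For strongly correlated channels the difference signal replaces one of the original channels; for uncorrelated channels both originals are kept.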

The entropy coding part of the prediction residual provides two alternative

coding techniques with different complexities. Besides low complexity yet efficient

Golomb-Rice coding, the BGMC arithmetic coding scheme offers even better

compression at the expense of a slightly increased complexity.

Furthermore, the encoder will also offer efficient compression of floating-point

audio data in the 32-bit IEEE format. This codec extension employs an algorithm that

basically splits the floating-point signal into a truncated integer signal and a

difference signal which contains the remaining fractional part. The integer signal is

then compressed using the normal encoding scheme for PCM signals, while the

difference signal is coded separately. A detailed description of the floating-point

extension can be found.

Figure 2 shows the lossless audio signal decoder, which is significantly

less complex than the encoder, since no adaptation has to be carried out. The

decoder merely decodes the entropy coded residual and the parcor values, converts

them into LPC coefficients, and applies the inverse prediction filter to calculate the lossless reconstruction signal.

The computational effort of the decoder mainly depends on the predictor

orders chosen by the encoder. Since the average order is typically well below the

maximum order, prediction with greater maximum orders does not necessarily lead to

a significant increase of decoder complexity. In most cases, realtime decoding is

possible even on low-end systems.

[Linear Prediction]

Linear prediction is used in many applications for speech and audio signal

processing. In the following, only FIR predictors are considered.

Prediction with FIR Filters

The current sample of a time-discrete signal x(n) can be approximately

predicted from previous samples x(n - k) . The prediction is given by

$$\hat{x}(n) = \sum_{k=1}^{K} h_k \cdot x(n-k), \qquad (1)$$

where K is the order of the predictor. If the predicted samples are close to the

original samples, the residual

$$e(n) = x(n) - \hat{x}(n) \qquad (2)$$

has a smaller variance than x(n) itself, hence e(n) can be encoded more

efficiently.
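
Equations (1) and (2) can be illustrated with a short sketch. The function names are chosen for this example only, and samples before the start of the signal are treated as zero, which is an assumption of the sketch rather than something stated in the description.

```python
def predict_sample(x, h, n):
    """Equation (1): predict x[n] from up to K = len(h) previous samples."""
    return sum(h[k - 1] * x[n - k] for k in range(1, len(h) + 1) if n - k >= 0)


def prediction_residual(x, h):
    """Equation (2): e(n) = x(n) - predicted x(n) for every sample."""
    return [x[n] - predict_sample(x, h, n) for n in range(len(x))]


# A linear ramp is predicted exactly by h = [2, -1] once two samples are known.
ramp_residual = prediction_residual([1, 2, 3, 4, 5], [2, -1])
```

Here only the first residual value is non-zero, so the residual has a much smaller variance than the ramp itself.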

The procedure of estimating the predictor coefficients from a segment of input

samples, prior to filtering that segment, is referred to as forward adaptation. In that

case, the coefficients have to be transmitted. If the coefficients are estimated from

previously processed segments or samples, e.g. from the residual, we speak of

backward adaptation. This procedure has the advantage that no transmission of the

coefficients is needed, since the data required to estimate the coefficients is available

to the decoder as well.

Forward-adaptive prediction with orders around 10 is widely used in speech

coding, and can be employed for lossless audio coding as well. The maximum order

of most forward-adaptive lossless prediction schemes is still rather small, e.g. K = 32.

An exception is the special 1-bit lossless codec for the Super Audio CD, which uses

predictor orders of up to 128.

On the other hand, backward-adaptive FIR filters with some hundred

coefficients are commonly used in many areas, e.g. channel equalization and echo

cancellation. Most systems are based on the LMS algorithm or a variation thereof, which has also been proposed for lossless audio coding. Such LMS-based coding

schemes with high orders are applicable since the predictor coefficients do not have

to be transmitted as side information, thus their number does not contribute to the

data rate. However, backward-adaptive codecs have the drawback that the

adaptation has to be carried out both in the encoder and the decoder, making the

decoder significantly more complex than in the forward-adaptive case.

Forward-Adaptive Prediction

In forward-adaptive linear prediction, the optimal predictor coefficients hk (in

terms of a minimized variance of the residual) are usually estimated for

each block by the autocorrelation method or the covariance method. The

autocorrelation method, using the Levinson-Durbin algorithm, has the additional

advantage of providing a simple means to iteratively adapt the order of the predictor.

Furthermore, the algorithm inherently calculates the corresponding parcor

coefficients as well.

Another crucial point in forward-adaptive prediction is to determine a suitable

predictor order. Increasing the order decreases the variance of the prediction error,

which leads to a smaller bit rate Re for the residual. On the other hand, the bit rate

Rc for the predictor coefficients will rise with the number of coefficients to be transmitted. Thus, the task is to find the optimum order which minimizes the total bit

rate. This can be expressed by minimizing

$$R_{total}(K) = R_e(K) + R_c(K) \qquad (3)$$

with respect to the prediction order K. As the prediction gain rises

monotonically with higher orders, Re decreases with K. On the other hand Rc rises

monotonically with K, since an increasing number of coefficients have to be

transmitted.

The search for the optimum order can be carried out efficiently by the

Levinson-Durbin algorithm, which determines recursively all predictors with

increasing order. For each order, a complete set of predictor coefficients is calculated.

Moreover, the variance $\sigma_e^2$ of the corresponding residual can be derived, resulting

in an estimate of the expected bit rate for the residual. Together with the bit rate for

the coefficients, the total bit rate can be determined in each iteration, i.e. for each

predictor order. The optimum order is found at the point where the total bit rate no

longer decreases.
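
The order search can be sketched as below. This is an illustrative sketch under stated assumptions: the residual bit-rate estimate (a Gaussian-entropy style formula), the fixed cost of 4 bits per coefficient, and all function names are assumptions of this example. The sketch also scans every order up to the maximum and keeps the best one, rather than stopping at the first order where the total no longer decreases. The reflection coefficient `k` follows the sign convention of this recursion, which may differ from the convention used for the parcor coefficients elsewhere in this description.

```python
import math


def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation sums r[0..max_lag] of one block."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, max_order):
    """Yield (order, h, k, err) for order = 1..max_order.

    h holds the predictor coefficients h_1..h_order of equation (1),
    k is the reflection coefficient of this step, err the residual energy.
    """
    h, err = [], r[0]
    for m in range(1, max_order + 1):
        if err <= 0:
            break
        acc = r[m] - sum(h[i] * r[m - 1 - i] for i in range(m - 1))
        k = acc / err
        h = [h[i] - k * h[m - 2 - i] for i in range(m - 1)] + [k]
        err *= 1.0 - k * k
        yield m, h, k, err


def residual_bits(variance, n):
    """Assumed estimate of R_e: Gaussian entropy per sample times n samples."""
    per_sample = 0.5 * math.log2(2 * math.pi * math.e * max(variance, 1e-12))
    return n * max(0.0, per_sample)


def choose_order(x, max_order, bits_per_coeff=4.0):
    """Return (order, total_bits) minimizing R_e(K) + R_c(K) as in equation (3)."""
    n = len(x)
    r = autocorrelation(x, max_order)
    best = (0, residual_bits(r[0] / n, n))
    for m, _h, _k, err in levinson_durbin(r, max_order):
        total = residual_bits(err / n, n) + bits_per_coeff * m
        if total < best[1]:
            best = (m, total)
    return best
```

Because each iteration of the recursion yields both the coefficients and the residual energy, the total bit rate can be evaluated per order without recomputing earlier orders.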

While it is obvious from equation (3) that the coefficient bit rate has a direct

effect on the total bit rate, a slower increase of Rc also shifts the minimum of Rtotal to higher orders (where Re is smaller as well), which would lead to better

compression. Hence, efficient though accurate quantization of the predictor

coefficients plays an important role in achieving maximum compression.

Quantization of Predictor Coefficients

Direct quantization of the predictor coefficients hk is not very efficient for

transmission, since even small quantization errors may result in large deviations from

the desired spectral characteristics of the optimum prediction filter. For this reason,

the quantization of predictor coefficients is based on the parcor (reflection)

coefficients rk , which can be calculated by means of the Levinson-Durbin algorithm.

In that case, the resulting values are restricted to the interval [-1 , 1]. Although parcor

coefficients are less sensitive to quantization, they are still too sensitive when their

magnitude is close to unity. The first two parcor coefficients r1 and r2 are typically

very close to -1 and +1, respectively, while the remaining coefficients rk, k > 2,

usually have smaller magnitudes. The distributions of the first coefficients are very

different, but high-order coefficients tend to converge to a zero-mean gaussian-like

distribution (Figure 3).

Therefore, only the first two coefficients are companded based on the

following function:

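[Equation (4), which defines the compander function C(r), appears as an image in the original publication and is not reproduced in this text.]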

This compander results in a significantly finer resolution at r1 → -1, whereas

-C(-r2) can be used to provide a finer resolution at r2 → +1 (see Figure 4).

However, in order to simplify computation, +C(-r2) is actually used for the

second coefficient, leading to an opposite sign of the companded value.

The two companded coefficients are then quantized using a simple 7-bit

uniform quantizer. This results in the following values:

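[Equations (5) and (6), which give the quantized values of the two companded coefficients, appear as images in the original publication and are not reproduced in this text.]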

The remaining coefficients rk, k > 2 are not companded but simply quantized

using a 7-bit uniform quantizer again:

$$a_k = \lfloor 64\, r_k \rfloor \qquad (7)$$

In all cases the resulting quantized values ak are restricted to the range [-64,

+63]. These quantized coefficients are re-centered around their most probable values,

and then encoded using Golomb-Rice codes. As a result, the average bit rate of the

encoded parcor coefficients can be reduced to approximately 4 bits/coefficient,

without noticeable degradation of the spectral characteristics. Thus, it is possible to

employ very high orders up to K = 1023, preferably in conjunction with large block

lengths.
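
The quantization steps above can be sketched as follows. Since the compander equation is not reproduced in this text, the sketch assumes the form C(r) = -1 + sqrt(2) * sqrt(r + 1), which is the form commonly given in published descriptions of MPEG-4 ALS. That formula, the use of floor rounding, and the function names are assumptions of this example. Re-centering and Golomb-Rice coding are not shown.

```python
import math


def compand(r):
    """Assumed compander C(r) = -1 + sqrt(2) * sqrt(r + 1), for r in [-1, 1]."""
    return -1.0 + math.sqrt(2.0) * math.sqrt(r + 1.0)


def clamp_7bit(a):
    """Restrict a quantized value to the range [-64, +63]."""
    return max(-64, min(63, a))


def quantize_parcor(parcor):
    """Quantize parcor coefficients r_1..r_K to 7-bit integers a_1..a_K."""
    quantized = []
    for k, r in enumerate(parcor, start=1):
        if k == 1:
            value = 64.0 * compand(r)    # finer resolution near r1 = -1
        elif k == 2:
            value = 64.0 * compand(-r)   # +C(-r2): finer resolution near r2 = +1
        else:
            value = 64.0 * r             # equation (7), no companding
        quantized.append(clamp_7bit(math.floor(value)))
    return quantized
```

Note that the second coefficient is quantized with an opposite sign, as stated above, so a value of r2 close to +1 maps to a quantized value close to -64.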

However, the direct form predictor filter uses predictor coefficients hk

according to Eq. (1). In order to employ identical coefficients in the encoder and the

decoder, these hk values have to be derived from the quantized ak values in both

cases (see Figures 1 and 2). While it is up to the encoder how to determine a set of

suitable parcor coefficients, the lossless coding method specifies an integer-arithmetic

function for conversion between quantized values ak and direct predictor

coefficients hk which ensures their identical reconstruction in both encoder and

decoder.

Block Length Switching

Embodiments relate to encoders, decoders, methods of encoding, and methods of decoding. In embodiments, an encoder is at least one of an audio

encoder, and an Audio Lossless Coding encoder. In embodiments, a method of

encoding is implemented in at least one of an audio encoder, and an Audio Lossless

Coding encoder. In embodiments, a decoder is at least one of an audio decoder,

and an Audio Lossless Coding decoder. In embodiments, a method of decoding is

implemented in at least one of an audio decoder, and an Audio Lossless Coding

decoder.

<Hierarchical Block Switching>

Embodiments relate to a block switching mechanism which subdivides a

frame of audio data into four quarter-length blocks, instead of encoding it as one

single block. Switching between one long and four short blocks may be performed

adaptively on a frame-by-frame basis.

Even though this switching mechanism may enable a higher compression

ratio than using a constant block length, there may be some drawbacks. For example,

if only 1 :4 switching is possible, 1 :2 or 1 :8 switching (and combinations thereof) may

be more efficient in some cases, in accordance with embodiments. For example, if

switching is done identically for all channels, there may be challenges if different

channels require different switching, in accordance with embodiments. For example,

since a more flexible block switching scheme enables the use of a wide range of block lengths (including very long ones), even higher maximum predictor orders may

be feasible, in accordance with embodiments.

In embodiments, a more flexible, hierarchical block switching scheme allows

for up to six different block lengths (differing by factors of two) within a frame. In

embodiments, independent block switching for each channel may be implemented

(e.g. each channel pair may be switched independently in the case of joint channel

coding). In embodiments, a maximum predictor order of 1023 may be implemented.

In embodiments, the same compression can be achieved with relatively low

decoder complexity, which also allows higher compression at the same complexity.

Audio Lossless Coding (ALS) includes a relatively simple block switching

mechanism. Each frame of N samples is either encoded using one full length block

(NB = N) or four blocks of length NB = N/4, where the same block partition applies

to all channels. Under some circumstances, this scheme may have some limitations.

For example, only 1 :4 switching may be possible, although different switching (e.g.

1 :2, 1 :8, and combinations thereof) may be more efficient in some cases. For

example, switching is done identically for all channels, although different channels

may require different switching (which is especially true if the channels are not

correlated).

In embodiments, a relatively flexible block switching scheme may be

implemented, where each frame can be hierarchically subdivided into many blocks. For example, Figure 5 illustrates a frame which can be hierarchically subdivided up to

32 blocks. Arbitrary combinations of blocks with NB = N, N/2, N/4, N/8, N/16, and

N/32 may be possible within a frame, as long as each block results from a

subdivision of a superordinate block of double length, in accordance with

embodiments. For example, as illustrated in example Figure 5, a partition into N/4 +

N/4 + N/2 may be possible, while a partition into N/4 + N/2 + N/4 may not be possible.

In embodiments, the actual partition may be signaled in an additional field

block switching information (bs_info) (illustrated in the right column of Figure 6),

where the length depends on the number of block switching levels. Table 1

illustrates an example relationship of the maximum number of levels, the minimum

NB, and the number of bytes used for bs_info.

Table 1 : Block switching levels.

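[Table 1 (maximum number of levels, minimum NB, number of bytes for bs_info) appears as an image in the original publication and is not reproduced in this text.]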

The bs_info field may include up to 4 bytes, in accordance with embodiments.

The mapping of bits with respect to the levels 1 to 5 may be [(0)1223333 44444444 55555555 55555555]. The first bit may be reserved for indicating independent block

switching. In the example of Figure 6, there are three levels, thus the minimum

block length is NB = N/8, and bs_info consists of one byte. Starting at the maximum

block length NB = N, the bits of bs_info are set if a block is further subdivided. For

the topmost example there is no subdivision at all, thus the code is (0)0000000. The

frame in the second row is subdivided ((0)1...), where only the second block of length

N/2 is further split ((0)101...) into two blocks of length N/4. If an N/4 block is split as in

the fourth row, it is indicated in the following bits ((0)111 0100).
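
The mapping between a block partition and the partition bits of bs_info can be sketched as follows. The function names and the representation of a partition as a list of denominators (for example [2, 4, 4] for N/2 + N/4 + N/4) are assumptions of this example, and the leading bit that signals independent switching is left out.

```python
from fractions import Fraction


def encode_bs_info(partition, levels):
    """Return the partition bits for a list of block-length denominators."""
    bits = ["0"] * (2 ** levels - 1)
    position = Fraction(0)
    for d in partition:
        if d < 1 or d & (d - 1):
            raise ValueError("block lengths must be N divided by a power of two")
        level = d.bit_length() - 1
        size = Fraction(1, d)
        if level > levels or position % size != 0:
            # A misaligned block is not a subdivision of a double-length block.
            raise ValueError("partition is not hierarchical")
        index = int(position / size)
        for l in range(level):
            # Mark every superordinate block as subdivided.
            bits[(2 ** l - 1) + (index >> (level - l))] = "1"
        position += size
    if position != 1:
        raise ValueError("partition does not cover the frame")
    return "".join(bits)


def decode_bs_info(bits, levels):
    """Return the list of block-length denominators described by the bits."""
    def walk(level, index):
        if level < levels and bits[(2 ** level - 1) + index] == "1":
            return walk(level + 1, 2 * index) + walk(level + 1, 2 * index + 1)
        return [2 ** level]
    return walk(0, 0)
```

With three levels, the partition N/2 + N/4 + N/4 yields the bits 1010000, matching the second example above, while N/4 + N/2 + N/4 is rejected because the N/2 block does not result from subdividing the frame in half.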

In each frame, bs_info fields may be transmitted for all channel pairs (CPEs)

and all single channels (SCEs), enabling independent block switching for different

channels, in accordance with embodiments.

<Independent Block Switching>

In Independent Block Switching, while the frame length is identical for all

channels, block switching can be done individually for each channel, in accordance

with embodiments. If difference coding is used, both channels of a channel pair

should be switched synchronously, but other channel pairs can still use different

block switching. If the two channels of a channel pair are not correlated with each

other, difference coding may not pay off, and thus there will be no need to switch

both channels synchronously. Accordingly, if the two channels of a channel pair are not correlated with each other, switching the channels independently may be

more practical.

There may be a bs_info field for each CPE and SCE in a frame (e.g. the two

channels of a CPE are switched synchronously), in accordance with embodiments. If

they are switched independently, the first bit of bs_info may be set to 1, and the

information applies to the CPE's first channel. In this example, another bs_info field

for the second channel becomes necessary.

In embodiments, as a result of the increased flexibility, the arrangement of

blocks in the bit stream can be adapted dynamically. In the old scheme illustrated in example

Figure 7, all channels use the same partition (e.g. either one long or four short

blocks) and corresponding short blocks of different channels are arranged

successively (e.g. blocks 1.1 , 2.1 , and 3.1 ), leading to an interleaved structure.

In embodiments illustrated in example Figure 8, short blocks are only

interleaved if they belong to a channel pair that uses difference coding and therefore

synchronized block switching (e.g. the middle row of Figure 8). This interleaving may

be beneficial, since in a channel pair a block of one channel (e.g. block 1.2) may

depend on previous blocks from both channels (e.g. blocks 1.1 and 2.1), so these

previous blocks may need to be available prior to the current one. For channels

whose blocks are switched independently, channel data can be arranged separately

(e.g. bottom row of Figure 8).

<Higher Predictor Orders>

Embodiments relate to higher predictor orders. Absent hierarchical block

switching, there may be a factor of 4 between the long and the short block length (e.g.

4096 & 1024 or 8192 & 2048), in accordance with embodiments. In embodiments

(e.g. where hierarchical block switching is implemented), this factor can be increased

(e.g. up to 32), enabling a larger range (e.g. 16384 down to 512 or even 32768 to

1024 for high sampling rates).

In embodiments, in order to make better use of very long blocks, higher

maximum predictor orders may be employed. The maximum order may be Kmax =

1023. In embodiments, Kmax may be bound by the block length NB, where Kmax

< NB / 8 (e.g. Kmax = 255 for NB = 2048). Therefore, using Kmax = 1023 may

require a block length of at least NB = 8192.

In embodiments, the max_order field in the file header is 10 bits. In

embodiments, the opt_order field of the block data is 10 bits. The actual number of

bits in a particular block may depend on the maximum order allowed for a block. If

the block is short, this local maximum order may be smaller than the global maximum

order (stated in max_order in the file header). For example, if Kmax = 1023, but NB =

2048, the opt_order field is 8 bits (instead of 10) due to a maximum local order of 255.

The number of bits of the opt_order field is determined based on the following equation:

opt_order bits = min(global prediction order, local prediction order), where the global

prediction order is determined from max_order, and the local prediction order is

determined from the length of the block. In detail, the global and local prediction orders

(both expressed in bits) are determined by global prediction order =

ceil(log2(maximum prediction order + 1)), and local prediction order =

max(ceil(log2((NB >> 3) - 1)), 1).
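
The field-width calculation can be checked with a few lines of code. The function name and the guard for very short blocks are assumptions of this sketch; the two formulas are taken directly from the text above.

```python
import math


def opt_order_bits(max_order, block_length):
    """Number of bits used for opt_order in a block of the given length."""
    if block_length < 16:
        # (block_length >> 3) - 1 must be at least 1 for log2 to be defined.
        raise ValueError("block too short for this formula")
    global_bits = math.ceil(math.log2(max_order + 1))
    local_bits = max(math.ceil(math.log2((block_length >> 3) - 1)), 1)
    return min(global_bits, local_bits)


# The example from the text: Kmax = 1023 and NB = 2048.
example_bits = opt_order_bits(1023, 2048)
```

For NB = 2048 the local term is ceil(log2(255)) = 8, which is smaller than the global term ceil(log2(1024)) = 10, so the field is 8 bits wide.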

In embodiments, it is necessary to predict the data samples of each subdivided

block of a channel. The first sample of a current block is predicted using the last K

samples of the previous block. The value K is determined from opt_order, which is

derived from the above equation.

If the current block is a channel's first block, no samples from the previous

block may be used. In this case, prediction with progressive order is employed,

where the scaled parcor coefficients are converted progressively to LPC coefficients

inside the prediction filter.

Random Access

Random access stands for fast access to any part of the encoded audio signal

without costly decoding of previous parts. It is an important feature for applications

that employ seeking, editing, or streaming of the compressed data. In order to enable random access, the encoder has to insert frames that can be decoded without

decoding previous frames. In those random access frames, no samples from

previous frames may be used for prediction.

The distance between random access frames can be chosen from 255 to one

frame. Depending on frame length and sampling rate, random access down to some

milliseconds is possible.
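
To make the relationship between frame length, sampling rate and random access granularity concrete, the sketch below converts a random access distance into milliseconds. The function name and the example frame length of 2048 samples are assumptions chosen for illustration; only the 1 to 255 frame range comes from the text.

```python
def random_access_interval_ms(distance_frames: int, frame_length: int,
                              sample_rate_hz: int) -> float:
    """Time between two random access frames, in milliseconds."""
    if not 1 <= distance_frames <= 255:
        raise ValueError("distance must be between 1 and 255 frames")
    return 1000.0 * distance_frames * frame_length / sample_rate_hz


# Hypothetical setup: 2048-sample frames at 48 kHz.
print(random_access_interval_ms(1, 2048, 48000))   # every frame is an entry point
print(random_access_interval_ms(12, 2048, 48000))  # roughly half a second
```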

However, prediction at the beginning of random access frames still constitutes

a problem. A conventional K-th order predictor would normally need K samples from

the previous frame in order to predict the current frame's first sample. Since

samples from previous frames may not be used, the encoder has either to assume

zeros, or to transmit the first K original samples directly, starting the prediction at

position K + 1.

As a result, compression at the beginning of random access frames would be

poor. In order to minimize this problem, the codec uses progressive prediction, which

makes use of as many available samples as possible. While it is of course not

feasible to predict the first sample of a random access frame, we can use first-order

prediction for the second sample, second-order prediction for the third sample, and

so forth, until the samples from position K + 1 on are predicted using the full K-th

order predictor. Since the predictor coefficients $h_k$ are calculated recursively from

the quantized parcor coefficients $a_k$ anyway, it is possible to calculate each coefficient set from orders 1 to K without additional costs.
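
The progressive scheme described above can be sketched as follows. This is a minimal illustration, not the codec's actual filter: it uses the textbook step-up recursion with the convention that the prediction is the sum of h_i times x(n - i), floating-point coefficients, and rounding to the nearest integer. The sign convention, the rounding rule and all function names are assumptions.

```python
import math


def coefficient_sets(parcor):
    """Return the predictor coefficients for every order 1..K.

    Step-up recursion: h_i(m) = h_i(m-1) - k_m * h_(m-i)(m-1), h_m(m) = k_m.
    """
    sets, prev = [], []
    for m, k in enumerate(parcor, start=1):
        cur = [prev[i] - k * prev[m - 2 - i] for i in range(m - 1)] + [k]
        sets.append(cur)
        prev = cur
    return sets


def predict(history, coeffs):
    """Predict the next sample from `history` (most recent sample last)."""
    total = sum(c * history[-1 - i] for i, c in enumerate(coeffs))
    return math.floor(total + 0.5)


def encode_random_access_frame(samples, parcor):
    """Residual of a frame that may not use samples from previous frames."""
    sets = coefficient_sets(parcor)
    residual = []
    for n, x in enumerate(samples):
        order = min(n, len(parcor))  # order grows 0, 1, 2, ... up to K
        pred = predict(samples[:n], sets[order - 1]) if order else 0
        residual.append(x - pred)
    return residual


def decode_random_access_frame(residual, parcor):
    """Inverse of encode_random_access_frame."""
    sets = coefficient_sets(parcor)
    samples = []
    for n, e in enumerate(residual):
        order = min(n, len(parcor))
        pred = predict(samples, sets[order - 1]) if order else 0
        samples.append(e + pred)
    return samples
```

Because the decoder forms each prediction from the same already reconstructed integers as the encoder, the round trip is exact even though the coefficients are floating-point values in this sketch.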

In the case of 500 ms random access intervals, this scheme produces an

absolute overhead of only 0.01-0.02% compared to continuous prediction without

random access.

Joint Channel Coding

Joint channel coding can be used to exploit dependencies between the two

channels of a stereo signal, or between any two channels of a multi-channel signal.

While it is straightforward to process two channels $x_1(n)$ and $x_2(n)$ independently,

a simple way to exploit dependencies between these channels is to encode the

difference signal

$$d(n) = x_2(n) - x_1(n) \qquad (8)$$

instead of $x_1(n)$ or $x_2(n)$. Switching between $x_1(n)$, $x_2(n)$ and $d(n)$ in each

block can be carried out by comparison of the individual signals, depending on which

two signals can be coded most efficiently (see Figure 9). Such prediction with

switched difference coding is beneficial in cases where two channels are very similar.

In the case of multi-channel material, the channels can be rearranged by the encoder in order to assign suitable channel pairs.
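
A minimal sketch of switched difference coding for one block is given below. The cost measure (sum of absolute sample values) is only a stand-in for "can be coded most efficiently"; a real encoder would compare actual coded sizes. The function names are hypothetical.

```python
def cost(signal):
    """Hypothetical cost proxy: sum of absolute sample values."""
    return sum(abs(v) for v in signal)


def choose_pair(x1, x2):
    """Keep the two cheapest signals out of x1, x2 and d = x2 - x1."""
    d = [b - a for a, b in zip(x1, x2)]
    candidates = {"x1": x1, "x2": x2, "d": d}
    dropped = max(candidates, key=lambda name: cost(candidates[name]))
    return {name: sig for name, sig in candidates.items() if name != dropped}


def reconstruct(coded):
    """Recover (x1, x2) from whichever two signals were kept."""
    if "d" not in coded:
        return coded["x1"], coded["x2"]
    if "x1" in coded:
        x1 = coded["x1"]
        return x1, [a + e for a, e in zip(x1, coded["d"])]
    x2 = coded["x2"]
    return [b - e for b, e in zip(x2, coded["d"])], x2
```

For two very similar channels the difference signal is small, so one original channel is replaced by d(n); for dissimilar channels both originals are kept.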

Besides simple difference coding, the lossless audio codec also supports a more

complex scheme for exploiting interchannel redundancy between arbitrary channels

of multichannel signals.

Entropy Coding of The Residual

In simple mode, the residual values $e(n)$ are entropy coded using Rice

codes. For each block, either all values can be encoded using the same Rice code,

or the block can be further divided into four parts, each encoded with a different Rice

code. The indices of the applied codes have to be transmitted, as shown in Figure 1.

Since there are different ways to determine the optimal Rice code for a given set of

data, it is up to the encoder to select suitable codes depending on the statistics of the

residual.
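
The simple mode can be illustrated with a generic Rice coder. The sketch below is not the codec's bit-exact format: the signed-to-unsigned mapping, the unary convention (ones terminated by a zero), the exhaustive parameter search and all function names are assumptions made for illustration.

```python
def to_unsigned(e: int) -> int:
    """Interleave signed values: 0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ..."""
    return 2 * e if e >= 0 else -2 * e - 1


def to_signed(u: int) -> int:
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)


def rice_encode(e: int, k: int) -> str:
    """Rice code with parameter k: unary quotient, then k remainder bits."""
    u = to_unsigned(e)
    q, r = u >> k, u & ((1 << k) - 1)
    remainder = format(r, "b").zfill(k) if k else ""
    return "1" * q + "0" + remainder


def rice_decode(bits: str, k: int) -> int:
    q = bits.index("0")
    r = int(bits[q + 1:q + 1 + k], 2) if k else 0
    return to_signed((q << k) | r)


def best_parameter(block, max_k: int = 15) -> int:
    """Pick the parameter that minimises the coded size of the block."""
    return min(range(max_k + 1),
               key=lambda k: sum(len(rice_encode(e, k)) for e in block))


def encode_block(block, split: bool):
    """Encode a block with one code, or with one code per quarter."""
    n = len(block)
    parts = [block[i * n // 4:(i + 1) * n // 4] for i in range(4)] if split else [block]
    coded = []
    for part in parts:
        k = best_parameter(part)
        coded.append((k, [rice_encode(e, k) for e in part]))
    return coded
```

Each entry returned by `encode_block` pairs a code index with the codewords of one part, which mirrors the requirement that the indices of the applied codes be transmitted.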

Alternatively, the encoder can use a more complex and efficient coding

scheme called BGMC (Block Gilbert-Moore Codes). In BGMC mode, the encoding of

residuals is accomplished by splitting the distribution in two categories (Figure 10):

Residuals that belong to a central region of the distribution, $|e(n)| < e_{\max}$, and ones

that belong to its tails.

The residuals in tails are simply re-centered (i.e. for $e(n) > e_{\max}$ we have $e_t(n) = e(n) - e_{\max}$) and encoded using Rice codes as described earlier. However, to

encode residuals in the center of the distribution, the BGMC encoder splits them into

LSB and MSB components first, then it encodes MSBs using block Gilbert-Moore

(arithmetic) codes, and finally it transmits LSBs using direct fixed-length codes. Both

parameters emax and the number of directly transmitted LSBs are selected such that

they only slightly affect the coding efficiency of this scheme, while making it

significantly less complex.
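
The partitioning step of BGMC mode can be sketched as below. Only the split itself is shown; the arithmetic (Gilbert-Moore) coding of the MSBs is omitted. The handling of the boundary value, the symmetric treatment of the negative tail, the offset used for central values and the tuple layout are all assumptions of this sketch.

```python
def split_residual(e: int, e_max: int, lsb_bits: int):
    """Classify a residual as a tail value or as (MSB, LSB) of a central value."""
    if e >= e_max:
        return ("tail+", e - e_max)   # re-centered, to be Rice coded
    if e <= -e_max:
        return ("tail-", e + e_max)   # assumed mirror image of the positive tail
    u = e + e_max - 1                 # shift central values to 0 .. 2*e_max - 2
    return ("center", u >> lsb_bits, u & ((1 << lsb_bits) - 1))


def merge_residual(parts, e_max: int, lsb_bits: int) -> int:
    """Inverse of split_residual."""
    if parts[0] == "tail+":
        return parts[1] + e_max
    if parts[0] == "tail-":
        return parts[1] - e_max
    _, msb, lsb = parts
    return ((msb << lsb_bits) | lsb) - e_max + 1
```

In this sketch the MSB part would go to the arithmetic coder, the LSB part would be written with `lsb_bits` fixed-length bits, and the tail values would be passed to a Rice coder.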

[Compression Results]

In the following, the lossless audio codec is compared with two of the most

popular programs for lossless audio compression: The open-source codec FLAC,

which uses forward-adaptive prediction as well, and Monkey's Audio (MAC 3.97), a

backward-adaptive codec as the current state-of-the-art algorithm in terms of

compression. Both codecs were run with options providing maximum compression

(flac -8 and mac -c4000). The results for the encoder were determined for a medium

compression level (with the prediction order restricted to K ≤ 60) and a maximum

compression level (K ≤ 1023), both with random access of 500 ms. The tests were

conducted on a 1.7 GHz Pentium-M system with 1024 MB of memory. The test material comprises

nearly 1 GB of stereo waveform data with sampling rates of 48, 96, and 192 kHz, and

resolutions of 16 and 24 bits.

[Compression Ratio]

In the following, the compression ratio is defined as

$$C = \frac{\text{CompressedFileSize}}{\text{OriginalFileSize}} \cdot 100\% \qquad (9)$$

where smaller values mean better compression. The results for the examined

audio formats are shown in Table 2 (192 kHz material is not supported by the

FLAC codec).

Table 2: Comparison of average compression ratios for different audio formats

(kHz/bits)

Figure imgf000029_0001

The results show that ALS at maximum level outperforms both FLAC and

Monkey's Audio for all formats, but particularly for high-definition material (i.e. 96 kHz / 24-bit and above). Even at medium level ALS delivers the best overall compression.

[Complexity]

The complexity of different codecs strongly depends on the actual

implementation, particularly that of the encoder. As mentioned earlier, the audio

signal encoder of the present invention is just a snapshot of an ongoing development.

Thus, we restrict our analysis to the decoder, a simple C code implementation with

no further optimizations. The compressed data was generated by the currently best

encoder implementation. The average CPU load for real-time decoding of various

audio formats, encoded at different complexity levels, is shown in Table 3. Even for

maximum complexity, the CPU load of the decoder is only around 20-25%, which in

turn means that file-based decoding is at least 4-5 times faster than real-time.
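
The step from CPU load to decoding speed is a simple reciprocal, shown here as a short worked example. The function name is hypothetical, and the calculation assumes that file-based decoding can use the whole processor.

```python
def realtime_factor(cpu_load_percent: float) -> float:
    """How many times faster than real time a decoder runs at full CPU usage."""
    if not 0 < cpu_load_percent <= 100:
        raise ValueError("CPU load must be in (0, 100]")
    return 100.0 / cpu_load_percent


print(realtime_factor(25))  # upper end of the quoted load range
print(realtime_factor(20))  # lower end of the quoted load range
```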

Table 3: Average CPU load (percentage on a 1.7 GHz Pentium-M),

depending on audio format (kHz/bits) and ALS encoder complexity.

Figure imgf000030_0001

The codec is designed to offer a large range of complexity levels. While the maximum level achieves the highest compression at the expense of slowest

encoding and decoding speed, the faster medium level only slightly degrades

compression, but decoding is significantly less complex than for the maximum level

(around 5% CPU load for 48 kHz material). Using a low-complexity level (K ≤ 15,

Rice coding) degrades compression by only 1-1.5% compared to the medium level,

but the decoder complexity is further reduced by a factor of three (less than 2% CPU

load for 48 kHz material). Thus, audio data can be decoded even on hardware with

very low computing power.

While the encoder complexity may be increased by both higher maximum

orders and a more elaborate block switching algorithm (in accordance with

embodiments), the decoder may be affected by a higher average predictor order.

As the results for a scheme in accordance with embodiments with $K_{\max}$ 127,

The foregoing embodiments (e.g. hierarchical block switching) and advantages

are merely examples and are not to be construed as limiting the appended claims.

The above teachings can be applied to other apparatuses and methods, as would be

appreciated by one of ordinary skill in the art. Many alternatives, modifications, and

variations will be apparent to those skilled in the art.

[Syntax]

The present invention relates to the syntax contained in the encoded bit

stream. The syntax is as follows:

File Header: The block_switching field is extended from 1 to 2 bits, the

max_order field is extended from 8 to 10 bits. The frame_length and

user_frame_length fields are merged, resulting in a frame_length field of 16 bits,

while the user_frame_length field is removed.

Table 4: Syntax of als_header

Figure imgf000032_0001
Figure imgf000033_0001

Frame Data: If block switching is used, the bs_info field is added. Depending

on the value of block_switching, it has 8, 16, or 32 bits. The first bit of a CPE's

bs_info field holds the independent_bs flag. The number of blocks is implicitly derived

from bs_info as well. If block_switching is off, there is no bs_info field, so blocks is

one and independent_bs is zero.
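
How a decoder might size bs_info and read the independent_bs flag can be sketched as follows. The mapping of the 2-bit block_switching values to 8, 16 and 32 bits follows the claims ("01", "10", "11"); treating the value 0 as "block switching off" and the function names are assumptions of this sketch.

```python
# block_switching value -> length of bs_info in bits
BS_INFO_BITS = {0b01: 8, 0b10: 16, 0b11: 32}


def bs_info_length(block_switching: int) -> int:
    """Length of bs_info in bits; 0 means no bs_info field is transmitted."""
    if block_switching == 0:  # assumed encoding of "block switching off"
        return 0
    if block_switching not in BS_INFO_BITS:
        raise ValueError("block_switching is a 2-bit field")
    return BS_INFO_BITS[block_switching]


def independent_bs(bs_info: int, block_switching: int) -> bool:
    """Read the foremost (most significant) bit of a CPE's bs_info field."""
    length = bs_info_length(block_switching)
    if length == 0:
        return False  # without bs_info the flag is zero
    return bool((bs_info >> (length - 1)) & 1)
```

The remaining bits of bs_info describe the block partition itself; decoding them into a number of blocks is not covered here.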

In order to improve readability, both new and old syntax are shown separately

in the following table, instead of mixing new with old syntax elements.

Table 5: Syntax of frame_data

Figure imgf000033_0002
SCE = channels
for (cp = 0; cp < CPE; cp++){
    if (block_switching){
        bs_info                              8,16,32   uimsbf
    }
    if (independent_bs){
        for (c = 0; c < 2; c++){
            if (c == 1){
                bs_info                      8,16,32   uimsbf
            }
            for (b = 0; b < blocks; b++){
                block_header()
                block_data()
            }
        }
    }
    else{
        for (b = 0; b < blocks; b++){
            for (c = 0; c < 2; c++){
                block_header()
                block_data()
            }
        }
    }
}
for (sc = 0; sc < SCE; sc++){
    if (block_switching){
        bs_info                              8,16,32   uimsbf
    }
    for (b = 0; b < blocks; b++){
        block_header()
        block_data()
        if (inter_channel_correlation){
            channel_data(c)
        }
    }
}

Block Header: The short_blocks field is removed, since block switching

information is completely transmitted on frame level (bs_info, see previous

paragraph).

Table 6: Syntax of block_header

Figure imgf000035_0001

Block Data: The opt_order field is extended to a maximum of 10 bits

(previously 8 bits).

Table 7: Syntax of block_data

Figure imgf000035_0002
[Semantics]

File Header:

Table 8: Elements of als header

Figure imgf000036_0001
Figure imgf000037_0001

Frame Data:

Table 9: Elements of frame data

Figure imgf000037_0002
Figure imgf000038_0001

Table 10: Elements of block header

Figure imgf000038_0002

Table 11 : Elements of block data

Figure imgf000038_0003

Industrial Applicability

It will be apparent to those skilled in the art that various modifications and

variations can be made in the present invention without departing from the spirit or scope of the invention. For example, the present invention can be adapted to another

audio signal codec, such as a lossy audio signal codec. Thus, it is intended that the

present invention covers the modifications and variations of this invention provided

they come within the scope of the appended claims and their equivalents.

Claims

[CLAIMS]
1. A method of processing an audio file, the method comprising:
switching a pair of channels of an audio data frame included in an audio file
independently when the channels are not correlated with each other;
switching the channels synchronously when the channels are correlated with
each other; and
generating first block switching information indicating whether the channels
are switched independently or synchronously.
2. The method of claim 1, wherein the pair of channels are switched
hierarchically at one or more block switching levels, and the first block switching
information further indicates how the channels are switched at the block switching
levels, respectively.
3. The method of claim 2, further comprising generating second block
switching information indicating a number of the block switching levels, wherein a
total length of the first block switching information is determined based on the second
block switching information.
4. The method of claim 3, wherein the second block switching information is included in a file header included in the audio file.
5. The method of claim 3, wherein the second block switching information is
indicated by 2 bits.
6. The method of claim 3, wherein the number of the block switching levels
indicated in the second block switching information is any one of up to 3 levels, 4
levels, and 5 levels.
7. The method of claim 6, wherein the second block switching information is
defined by "01" to indicate up to 3 levels, by "10" to indicate 4 levels, and by "11" to
indicate 5 levels.
8. The method of claim 3, wherein the total length of the first block switching
information is any one of 8 bits, 16 bits, and 32 bits.
9. The method of claim 3, wherein the total length of the first block switching
information is 8 bits when the number of the block switching levels indicated in the
second block switching information is up to 3 levels.
10. The method of claim 3, wherein the total length of the first block switching
information is 16 bits when the number of the block switching levels indicated in the
second block switching information is 4 levels.
11. The method of claim 3, wherein the total length of the first block switching
information is 32 bits when the number of the block switching levels indicated in the
second block switching information is 5 levels.
12. The method of claim 1, wherein a foremost bit of the first block switching
information indicates whether the channels are switched independently or
synchronously.
13. The method of claim 12, wherein the foremost bit of the first block
switching information is set to "1" when the channels are switched independently.
14. The method of claim 2, wherein a foremost bit of the first block switching
information indicates whether the channels are switched independently or
synchronously, and the remaining bits of the first block switching information
indicate how the channels are switched at the block switching levels, respectively.
15. The method of claim 14, wherein the foremost bit of the first block
switching information is set to "1" when the channels are switched independently.
16. A method of encoding an audio file, the method comprising:
switching a pair of channels of an audio data frame included in an audio file
independently or synchronously based on whether a correlation between the
channels exists, the channels being switched hierarchically at one or more block
switching levels; and
generating first block switching information indicating whether the channels
are switched independently or synchronously and how the channels are switched at
the block switching levels, respectively.
17. A method of decoding an audio file, the method comprising:
receiving an audio file including an audio data frame which has a pair of
channels, the channels being switched independently or synchronously based on
whether a correlation between the channels exists and being switched hierarchically
at one or more block switching levels; and
parsing first block switching information from the audio data frame, the first
block switching information indicating whether the channels are switched
independently or synchronously and how the channels are switched at the block switching levels, respectively.
18. An apparatus of encoding an audio file, the apparatus comprising:
an encoder configured to switch a pair of channels of an audio data frame
included in an audio file independently or synchronously based on whether a
correlation between the channels exists, the channels being switched hierarchically
at one or more block switching levels, wherein the encoder generates first block
switching information indicating whether the channels are switched independently or
synchronously and how the channels are switched at the block switching levels,
respectively.
19. An apparatus of decoding an audio file, the apparatus comprising:
a decoder configured to receive an audio file including an audio data frame
which includes a pair of channels, the channels being switched independently or
synchronously based on whether a correlation between the channels exists and
being switched hierarchically at one or more block switching levels, wherein the
decoder parses first block switching information from the audio data frame, the first
block switching information indicating whether the channels are switched
independently or synchronously and how the channels are switched at the block
switching levels, respectively.
PCT/KR2005/002306 2005-07-18 2005-07-18 Apparatus and method of encoding and decoding audio signal WO2007011083A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/KR2005/002306 WO2007011083A1 (en) 2005-07-18 2005-07-18 Apparatus and method of encoding and decoding audio signal

Applications Claiming Priority (93)

Application Number Priority Date Filing Date Title
PCT/KR2005/002306 WO2007011083A1 (en) 2005-07-18 2005-07-18 Apparatus and method of encoding and decoding audio signal
US11481927 US7835917B2 (en) 2005-07-11 2006-07-07 Apparatus and method of processing an audio signal
US11481932 US8032240B2 (en) 2005-07-11 2006-07-07 Apparatus and method of processing an audio signal
US11481941 US8050915B2 (en) 2005-07-11 2006-07-07 Apparatus and method of encoding and decoding audio signals using hierarchical block switching and linear prediction coding
US11481917 US7991272B2 (en) 2005-07-11 2006-07-07 Apparatus and method of processing an audio signal
US11481930 US8032368B2 (en) 2005-07-11 2006-07-07 Apparatus and method of encoding and decoding audio signals using hierarchical block swithcing and linear prediction coding
US11481916 US8108219B2 (en) 2005-07-11 2006-07-07 Apparatus and method of encoding and decoding audio signal
US11481933 US7966190B2 (en) 2005-07-11 2006-07-07 Apparatus and method for processing an audio signal using linear prediction
US11481915 US7996216B2 (en) 2005-07-11 2006-07-07 Apparatus and method of encoding and decoding audio signal
US11481939 US8121836B2 (en) 2005-07-11 2006-07-07 Apparatus and method of processing an audio signal
US11481926 US7949014B2 (en) 2005-07-11 2006-07-07 Apparatus and method of encoding and decoding audio signal
US11481940 US8180631B2 (en) 2005-07-11 2006-07-07 Apparatus and method of processing an audio signal, utilizing a unique offset associated with each coded-coefficient
US11481942 US7830921B2 (en) 2005-07-11 2006-07-07 Apparatus and method of encoding and decoding audio signal
US11481931 US7411528B2 (en) 2005-07-11 2006-07-07 Apparatus and method of processing an audio signal
US11481929 US7991012B2 (en) 2005-07-11 2006-07-07 Apparatus and method of encoding and decoding audio signal
JP2008521311A JP2009500687A (en) 2005-07-11 2006-07-10 Processing apparatus and method of the audio signal
JP2008521316A JP2009510810A (en) 2005-07-11 2006-07-10 Processing apparatus and method of the audio signal
EP20060757768 EP1913583A4 (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
PCT/KR2006/002683 WO2007008005A1 (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
JP2008521310A JP2009500686A (en) 2005-07-11 2006-07-10 Encoding and decoding apparatus and method of the audio signal
JP2008521313A JP2009500688A (en) 2005-07-11 2006-07-10 Processing apparatus and method of the audio signal
PCT/KR2006/002682 WO2007008004A3 (en) 2005-07-11 2006-07-10 Apparatus and method of encoding and decoding audio signal
PCT/KR2006/002679 WO2007008001A3 (en) 2005-07-11 2006-07-10 Apparatus and method of encoding and decoding audio signal
CN 200680029417 CN101243489A (en) 2005-07-11 2006-07-10 Apparatus and method of coding and decoding an audio signal
CN 200680024866 CN101218852A (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
JP2008521308A JP2009500684A (en) 2005-07-11 2006-07-10 Method of processing an audio signal, encoding of audio signals and decoding apparatus and method
EP20060769222 EP1908058A4 (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
JP2008521307A JP2009500683A (en) 2005-07-11 2006-07-10 Encoding and decoding apparatus and method of the audio signal
EP20060757766 EP1913581A4 (en) 2005-07-11 2006-07-10 Apparatus and method of encoding and decoding audio signal
JP2008521306A JP2009500682A (en) 2005-07-11 2006-07-10 Encoding and decoding apparatus and method of the audio signal
PCT/KR2006/002685 WO2007008007A1 (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
EP20060757764 EP1913579A4 (en) 2005-07-11 2006-07-10 Apparatus and method of encoding and decoding audio signal
CN 200680030479 CN101243493A (en) 2005-07-11 2006-07-10 Apparatus and method of coding and decoding an audio signal
CN 200680030549 CN101243495A (en) 2005-07-11 2006-07-10 Apparatus and method of coding and decoding an audio signal
JP2008521319A JP2009500693A (en) 2005-07-11 2006-07-10 Encoding and decoding apparatus and method of the audio signal
CN 200680030469 CN101243492A (en) 2005-07-11 2006-07-10 Apparatus and method of coding and decoding an audio signal
CN 200680029407 CN101243496B (en) 2005-07-11 2006-07-10 An audio signal processing apparatus and method of
CN 200680025137 CN101218631A (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
JP2008521309A JP2009500685A (en) 2005-07-11 2006-07-10 Encoding and decoding apparatus and method of the audio signal
CN 200680028892 CN101238509A (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
PCT/KR2006/002681 WO2007008003A3 (en) 2005-07-11 2006-07-10 Apparatus and method of encoding and decoding audio signal
EP20060757765 EP1913580A4 (en) 2005-07-11 2006-07-10 Apparatus and method of encoding and decoding audio signal
PCT/KR2006/002680 WO2007008002A3 (en) 2005-07-11 2006-07-10 Apparatus and method of encoding and decoding audio signal
CN 200680025269 CN101218630B (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
PCT/KR2006/002689 WO2007008011A3 (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
CN 200680030541 CN101243497A (en) 2005-07-11 2006-07-10 Apparatus and method of coding and decoding an audio signal
PCT/KR2006/002691 WO2007008013A3 (en) 2005-07-11 2006-07-10 Apparatus and method of encoding and decoding audio signal
CN 200680025139 CN101218629A (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
EP20060769219 EP1913584A4 (en) 2005-07-11 2006-07-10 Apparatus and method of encoding and decoding audio signal
JP2008521314A JP2009500689A (en) 2005-07-11 2006-07-10 Processing apparatus and method of the audio signal
EP20060769226 EP1913588A4 (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
EP20060769225 EP1911021A4 (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
EP20060769218 EP1913589A4 (en) 2005-07-11 2006-07-10 Apparatus and method of encoding and decoding audio signal
EP20060769220 EP1913585A4 (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
CN 200680030511 CN101243494A (en) 2005-07-11 2006-07-10 Apparatus and method of coding and decoding an audio signal
PCT/KR2006/002690 WO2007008012A3 (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
PCT/KR2006/002686 WO2007008008A3 (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
PCT/KR2006/002687 WO2007008009A1 (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
EP20060769224 EP1913794A4 (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
JP2008521318A JP2009500692A (en) 2005-07-11 2006-07-10 Processing apparatus and method of the audio signal
EP20060757767 EP1913582A4 (en) 2005-07-11 2006-07-10 Apparatus and method of encoding and decoding audio signal
CN 200680025138 CN101218628B (en) 2005-07-11 2006-07-10 Apparatus and method of encoding and decoding an audio signal
JP2008521305A JP2009500681A (en) 2005-07-11 2006-07-10 Encoding and decoding apparatus and method of the audio signal
CN 200680028982 CN101238510A (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
EP20060769223 EP1913587A4 (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
PCT/KR2006/002678 WO2007008000A3 (en) 2005-07-11 2006-07-10 Apparatus and method of encoding and decoding audio signal
PCT/KR2006/002688 WO2007008010A1 (en) 2005-07-11 2006-07-10 Apparatus and method of processing an audio signal
PCT/KR2006/002677 WO2007007999A3 (en) 2005-07-11 2006-07-10 Apparatus and method of encoding and decoding audio signal
EP20060769227 EP1911020A4 (en) 2005-07-11 2006-07-10 Apparatus and method of encoding and decoding audio signal
JP2008521317A JP2009500691A (en) 2005-07-11 2006-07-10 Processing apparatus and method of the audio signal
JP2008521315A JP2009500690A (en) 2005-07-11 2006-07-10 Processing apparatus and method of the audio signal
US12232527 US7962332B2 (en) 2005-07-11 2008-09-18 Apparatus and method of encoding and decoding audio signal
US12232526 US8010372B2 (en) 2005-07-11 2008-09-18 Apparatus and method of encoding and decoding audio signal
US12232593 US8326132B2 (en) 2005-07-11 2008-09-19 Apparatus and method of encoding and decoding audio signal
US12232595 US8417100B2 (en) 2005-07-11 2008-09-19 Apparatus and method of encoding and decoding audio signal
US12232590 US8055507B2 (en) 2005-07-11 2008-09-19 Apparatus and method for processing an audio signal using linear prediction
US12232591 US8255227B2 (en) 2005-07-11 2008-09-19 Scalable encoding and decoding of multichannel audio with up to five levels in subdivision hierarchy
US12232662 US8510120B2 (en) 2005-07-11 2008-09-22 Apparatus and method of processing an audio signal, utilizing unique offsets associated with coded-coefficients
US12232659 US8554568B2 (en) 2005-07-11 2008-09-22 Apparatus and method of processing an audio signal, utilizing unique offsets associated with each coded-coefficients
US12232658 US8510119B2 (en) 2005-07-11 2008-09-22 Apparatus and method of processing an audio signal, utilizing unique offsets associated with coded-coefficients
US12232734 US8155144B2 (en) 2005-07-11 2008-09-23 Apparatus and method of encoding and decoding audio signal
US12232747 US8149878B2 (en) 2005-07-11 2008-09-23 Apparatus and method of encoding and decoding audio signal
US12232740 US8149876B2 (en) 2005-07-11 2008-09-23 Apparatus and method of encoding and decoding audio signal
US12232743 US7987008B2 (en) 2005-07-11 2008-09-23 Apparatus and method of processing an audio signal
US12232744 US8032386B2 (en) 2005-07-11 2008-09-23 Apparatus and method of processing an audio signal
US12232739 US8155152B2 (en) 2005-07-11 2008-09-23 Apparatus and method of encoding and decoding audio signal
US12232741 US8149877B2 (en) 2005-07-11 2008-09-23 Apparatus and method of encoding and decoding audio signal
US12232748 US8155153B2 (en) 2005-07-11 2008-09-23 Apparatus and method of encoding and decoding audio signal
US12232783 US8275476B2 (en) 2005-07-11 2008-09-24 Apparatus and method of encoding and decoding audio signals
US12232784 US7987009B2 (en) 2005-07-11 2008-09-24 Apparatus and method of encoding and decoding audio signals
US12232782 US8046092B2 (en) 2005-07-11 2008-09-24 Apparatus and method of encoding and decoding audio signal
US12232781 US7930177B2 (en) 2005-07-11 2008-09-24 Apparatus and method of encoding and decoding audio signals using hierarchical block switching and linear prediction coding
US12314891 US8065158B2 (en) 2005-07-11 2008-12-18 Apparatus and method of processing an audio signal

Publications (1)

Publication Number Publication Date
WO2007011083A1 (en) 2007-01-25

Family

ID=37668949

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2005/002306 WO2007011083A1 (en) 2005-07-18 2005-07-18 Apparatus and method of encoding and decoding audio signal

Country Status (1)

Country Link
WO (1) WO2007011083A1 (en)

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LIEBCHEN T. ET AL.: 'MPEG-4 ALS: an emerging standard for lossless audio coding' DATA COMPRESSION CONFERENCE, 2004. PROCEEDINGS (DCC 2004) pages 439 - 448, XP010692571 *
LIEBCHEN T.: 'An introduction to MPEG-4 audio lossless coding' 2004 INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP '04) vol. 3, 17 May 2004 - 21 May 2004, pages 1012 - 1015, XP010718364 *
MORIYA T. ET AL.: 'Extended linear prediction tools for the lossless audio coding' ICASSP 2004 vol. 3, 17 May 2004 - 21 May 2004, pages 1008 - 1011, XP010718363 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct app. not ent. europ. phase

Ref document number: 05761300

Country of ref document: EP

Kind code of ref document: A1