WO2016003206A1 - Multichannel audio signal processing method and device - Google Patents

Multichannel audio signal processing method and device

Info

Publication number
WO2016003206A1
WO2016003206A1 · PCT/KR2015/006788
Authority
WO
WIPO (PCT)
Prior art keywords
signal
channel
input
output
matrix
Prior art date
Application number
PCT/KR2015/006788
Other languages
French (fr)
Korean (ko)
Inventor
백승권 (Baek Seung-kwon)
서정일 (Seo Jeong-il)
성종모 (Sung Jong-mo)
이태진 (Lee Tae-jin)
장대영 (Jang Dae-young)
김진웅 (Kim Jin-woong)
Original Assignee
한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)
Priority date
Filing date
Publication date
Priority to KR10-2014-0082030 (critical)
Priority to KR20140082030
Application filed by Electronics and Telecommunications Research Institute (ETRI)
Priority to KR10-2015-0094195
Priority to KR1020150094195A (patent KR20160003572A)
Priority claimed from DE112015003108.1T (patent DE112015003108T5)
Publication of WO2016003206A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing

Abstract

Disclosed are a multichannel audio signal processing method and a multichannel audio signal processing device. The multichannel audio signal processing method may generate output signals of N channels from down-mixed signals of N/2 channels according to an N-N/2-N structure.

Description

Multichannel audio signal processing method and apparatus

The present invention relates to a method and apparatus for processing a multichannel audio signal, and more particularly, to a method and apparatus for processing a multichannel audio signal more efficiently using an N-N/2-N structure.

MPEG Surround (MPS) is an audio codec for coding multichannel signals such as 5.1-channel and 7.1-channel signals, that is, an encoding and decoding technology capable of compressing and transmitting a multichannel signal at a high compression rate. MPS is required to maintain backward compatibility in the encoding and decoding process. Therefore, a bitstream compressed with MPS and transmitted to the decoder must satisfy the constraint that it can still be reproduced in mono or stereo even by a legacy audio codec.

Therefore, even if the number of input channels constituting the multichannel signal increases, the bitstream transmitted to the decoder must include an encoded mono signal or a stereo signal. The decoder may further receive additional information such that a mono signal or a stereo signal transmitted through the bitstream may be upmixed. The decoder may recover the multichannel signal from the mono signal or the stereo signal using the additional information.

However, as multichannel audio signals with more than 5.1 or 7.1 channels come into use, the quality of the audio signal suffers when such multichannel audio signals are processed with the structures defined by conventional MPS.

The present invention provides a method and apparatus for processing a multichannel audio signal through an N-N/2-N structure.

A multichannel audio signal processing method according to an embodiment of the present invention includes: identifying an N/2-channel downmix signal and a residual signal generated from an N-channel input signal; applying the N/2-channel downmix signal and the residual signal to a first matrix; outputting, through the first matrix, a first signal that is input to N/2 decorrelators corresponding to N/2 OTT (one-to-two) boxes and a second signal that is passed to a second matrix without being input to the decorrelators; outputting decorrelated signals from the first signal through the N/2 decorrelators; applying the decorrelated signals and the second signal to the second matrix; and generating an N-channel output signal through the second matrix.
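The following Python sketch is a non-normative illustration of this claimed signal flow, assuming random placeholder matrices and a trivial stand-in decorrelator; the names decode_n_n2_n, M1, and M2 are illustrative assumptions, and the actual matrices are derived from the CLD, ICC, and CPC parameters described later in this document.

```python
# Illustrative sketch (not the normative decoder) of the claimed N-N/2-N flow:
# downmix + residual -> first matrix M1 -> decorrelators -> second matrix M2 -> N channels.
# M1, M2 and the decorrelator are placeholders; the patent derives them from spatial cues.
import numpy as np

def decode_n_n2_n(downmix, residual, M1, M2, decorrelate):
    """downmix, residual: (N/2, T) arrays; M1, M2: mixing matrices; decorrelate: callable."""
    half_n = downmix.shape[0]
    x = np.vstack([downmix, residual])          # input vector of the first matrix
    v = M1 @ x                                  # output of the first matrix
    first = v[:half_n]                          # signals routed to the decorrelators
    second = v[half_n:2 * half_n]               # signals passed directly to the second matrix
    d = np.vstack([decorrelate(first[i]) for i in range(half_n)])  # decorrelated signals
    w = np.vstack([second, d, residual])        # input vector of the second matrix (simplified)
    return M2 @ w                               # N-channel output

# Toy usage with random placeholder matrices and a trivial "decorrelator".
if __name__ == "__main__":
    half_n, T = 2, 8
    rng = np.random.default_rng(0)
    dmx, res = rng.standard_normal((half_n, T)), rng.standard_normal((half_n, T))
    M1 = rng.standard_normal((2 * half_n, 2 * half_n))
    M2 = rng.standard_normal((2 * half_n, 3 * half_n))
    y = decode_n_n2_n(dmx, res, M1, M2, lambda s: s[::-1].copy())
    print(y.shape)  # (4, 8): N = 4 output channels
```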

When no LFE channel is included in the N-channel output signal, the N/2 decorrelators may correspond one-to-one to the N/2 OTT boxes.

If the number of decorrelators exceeds a reference value, the decorrelator indices may be reused repeatedly through a modulo operation with the reference value.

When an LFE channel is included in the N-channel output signal, the number of decorrelators used may be N/2 minus the number of LFE channels, and an OTT box that outputs an LFE channel may not use a decorrelator.

When no temporal shaping tool is used, a vector including the second signal, the decorrelated signals output from the decorrelators, and the residual signals output from the decorrelators may be input to the second matrix.

When a temporal shaping tool is used, a vector corresponding to a direct signal, composed of the second signal and the residual signals output from the decorrelators, and a vector corresponding to a diffuse signal, composed of the decorrelated signals output from the decorrelators, may be input to the second matrix.

When subband domain temporal processing (STP) is used, the generating of the N-channel output signal may shape the temporal envelope of the output signal by applying a scale factor, derived from the diffuse signal and the direct signal, to the diffuse signal portion of the output signal.

The generating of the N-channel output signal may flatten and reshape the envelope of the direct signal portion for each channel of the N-channel output signal when guided envelope shaping (GES) is used.

The size of the first matrix may be determined according to the number of channels of the downmix signal to which the first matrix is applied and the number of decorrelators, and the elements of the first matrix may be determined by CLD parameters or CPC parameters.

In accordance with another aspect of the present invention, there is provided a method of processing a multichannel audio signal, including: identifying an N/2-channel downmix signal and an N/2-channel residual signal; and generating an N-channel output signal by inputting the N/2-channel downmix signal and the N/2-channel residual signal to N/2 OTT boxes, wherein the N/2 OTT boxes are arranged in parallel without being connected to one another, and an OTT box that outputs an LFE channel among the N/2 OTT boxes (1) receives only the downmix signal and not the residual signal, (2) uses only the CLD parameter among the CLD and ICC parameters, and (3) does not output a decorrelated signal through a decorrelator.
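A minimal sketch of this per-box behavior follows, assuming a toy OTT upmix parameterized by CLD and ICC; the helper name ott_upmix and the gain formulas are illustrative assumptions and are not taken from the specification.

```python
# Illustrative per-OTT-box behavior: an LFE box uses only the downmix and the CLD
# (no residual, no ICC, no decorrelator); a regular box uses all of them.
import numpy as np

def ott_upmix(downmix, cld_db, icc=1.0, residual=None, decorrelate=None, is_lfe=False):
    """Toy OTT box: returns two channels derived from one downmix channel."""
    c = 10.0 ** (cld_db / 20.0)                     # level ratio from the CLD (illustrative)
    g1 = c / np.sqrt(1.0 + c * c)                   # channel gains derived from the CLD
    g2 = 1.0 / np.sqrt(1.0 + c * c)
    if is_lfe:
        # (1) downmix only, (2) CLD only, (3) no decorrelated signal
        return g1 * downmix, g2 * downmix
    diffuse = decorrelate(downmix) if decorrelate is not None else np.zeros_like(downmix)
    if residual is not None:
        diffuse = diffuse + residual                # residual augments/replaces the diffuse part
    mix = np.sqrt(max(0.0, 1.0 - icc)) * diffuse    # ICC controls the diffuse contribution
    return g1 * downmix + mix, g2 * downmix - mix

# Example call with a trivial stand-in decorrelator.
left, right = ott_upmix(np.ones(4), cld_db=3.0, icc=0.8, decorrelate=lambda s: s[::-1])
```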

An apparatus for processing a multichannel audio signal according to an embodiment of the present invention includes a processor for performing a multichannel audio signal processing method, the method including: identifying an N/2-channel downmix signal and a residual signal generated from an N-channel input signal; applying the N/2-channel downmix signal and the residual signal to a first matrix; outputting, through the first matrix, a first signal that is input to N/2 decorrelators corresponding to N/2 OTT boxes and a second signal that is passed to a second matrix without being input to the decorrelators; outputting decorrelated signals from the first signal through the N/2 decorrelators; applying the decorrelated signals and the second signal to the second matrix; and generating an N-channel output signal through the second matrix.

When no LFE channel is included in the N-channel output signal, the N/2 decorrelators may correspond one-to-one to the N/2 OTT boxes.

If the number of decorrelators exceeds a reference value, the decorrelator indices may be reused repeatedly through a modulo operation with the reference value.

When an LFE channel is included in the N-channel output signal, the number of decorrelators used may be N/2 minus the number of LFE channels, and an OTT box that outputs an LFE channel may not use a decorrelator.

When no temporal shaping tool is used, a vector including the second signal, the decorrelated signals output from the decorrelators, and the residual signals output from the decorrelators may be input to the second matrix.

When a temporal shaping tool is used, a vector corresponding to a direct signal, composed of the second signal and the residual signals output from the decorrelators, and a vector corresponding to a diffuse signal, composed of the decorrelated signals output from the decorrelators, may be input to the second matrix.

When subband domain temporal processing (STP) is used, the generating of the N-channel output signal may shape the temporal envelope of the output signal by applying a scale factor, derived from the diffuse signal and the direct signal, to the diffuse signal portion of the output signal.

The generating of the N-channel output signal may flatten and reshape the envelope of the direct signal portion for each channel of the N-channel output signal when guided envelope shaping (GES) is used.

The size of the first matrix may be determined according to the number of channels of the downmix signal to which the first matrix is applied and the number of decorrelators, and the elements of the first matrix may be determined by CLD parameters or CPC parameters.

In accordance with another aspect of the present invention, an apparatus for processing a multichannel audio signal includes a processor for performing a multichannel audio signal processing method, the method including: identifying an N/2-channel downmix signal and an N/2-channel residual signal; and generating an N-channel output signal by inputting the N/2-channel downmix signal and the N/2-channel residual signal to N/2 OTT boxes.

The N/2 OTT boxes are arranged in parallel without being connected to one another, and an OTT box that outputs an LFE channel among the N/2 OTT boxes (1) receives only the downmix signal and not the residual signal, (2) uses only the CLD parameter among the CLD and ICC parameters, and (3) does not output a decorrelated signal through a decorrelator.

According to an embodiment of the present invention, by processing a multichannel audio signal according to the N-N/2-N structure, an audio signal having more channels than the number of channels defined in MPS can be processed efficiently.

FIG. 1 is a diagram illustrating a 3D audio decoder, according to an exemplary embodiment.

FIG. 2 is a diagram illustrating a domain processed by a 3D audio decoder, according to an exemplary embodiment.

FIG. 3 illustrates a USAC 3D encoder and a USAC 3D decoder, according to an exemplary embodiment.

FIG. 4 is a first diagram illustrating a detailed configuration of a first encoding unit of FIG. 3 according to an embodiment.

FIG. 5 is a second diagram illustrating a detailed configuration of a first encoding unit of FIG. 3 according to an embodiment.

FIG. 6 is a third diagram illustrating a detailed configuration of the first encoding unit of FIG. 3 according to an embodiment.

FIG. 7 is a fourth diagram illustrating a detailed configuration of the first encoding unit of FIG. 3 according to an embodiment.

FIG. 8 is a first diagram illustrating a detailed configuration of a second decoding unit of FIG. 3 according to an embodiment.

FIG. 9 is a second diagram illustrating a detailed configuration of a second decoding unit of FIG. 3 according to an embodiment.

FIG. 10 is a third diagram illustrating a detailed configuration of a second decoding unit of FIG. 3 according to an embodiment.

FIG. 11 is a diagram illustrating an example of implementing FIG. 3 according to an embodiment.

FIG. 12 is a diagram schematically illustrating FIG. 11 according to an embodiment.

FIG. 13 is a diagram illustrating a detailed configuration of a second encoding unit and a first decoding unit of FIG. 12 according to an embodiment.

FIG. 14 is a diagram illustrating a result of combining the first encoding unit and the second encoding unit of FIG. 11 and combining the first decoding unit and the second decoding unit, according to an exemplary embodiment.

FIG. 15 is a diagram schematically illustrating FIG. 14 according to an embodiment.

FIG. 16 is a diagram illustrating an audio processing scheme for an N-N/2-N structure according to an embodiment.

FIG. 17 is a diagram illustrating an N-N/2-N structure in a tree form according to an embodiment.

FIG. 18 illustrates an encoder and a decoder for an FCE structure according to an embodiment.

FIG. 19 illustrates an encoder and a decoder for a TCE structure according to an embodiment.

FIG. 20 illustrates an encoder and a decoder for an ECE structure according to an embodiment.

FIG. 21 illustrates an encoder and a decoder for a SiCE structure according to an embodiment.

FIG. 22 illustrates a process of processing an audio signal of 24 channels according to an FCE structure according to an embodiment.

FIG. 23 is a diagram illustrating a process of processing an audio signal of 24 channels according to an ECE structure according to an embodiment.

FIG. 24 is a diagram illustrating a process of processing an audio signal of 14 channels according to an FCE structure according to an embodiment.

FIG. 25 is a diagram illustrating a process of processing an audio signal of 14 channels according to an ECE structure and a SiCE structure according to an embodiment.

FIG. 26 illustrates a process of processing an 11.1 channel audio signal according to a TCE structure according to an embodiment.

FIG. 27 illustrates a process of processing an 11.1 channel audio signal according to an FCE structure according to an embodiment.

FIG. 28 is a diagram illustrating a process of processing an audio signal of 9.0 channels according to a TCE structure according to an embodiment.

FIG. 29 is a diagram illustrating a process of processing an audio signal of 9.0 channels according to an FCE structure according to an embodiment.

Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram illustrating a 3D audio decoder, according to an exemplary embodiment.

According to the present invention, a multichannel audio signal may be downmixed at an encoder, and the downmix signal may be upmixed at a decoder to restore the multichannel audio signal. In the embodiments described with reference to FIGS. 2 to 29, the decoder-side processing corresponds to FIG. 1. Since FIGS. 2 to 29 illustrate processes for processing a multichannel audio signal, they may correspond to any one of the bitstream, the USAC 3D decoder, the DRC-1, and the format conversion components of FIG. 1.

FIG. 2 is a diagram illustrating a domain processed by a 3D audio decoder, according to an exemplary embodiment.

The USAC decoder described in FIG. 1 is for coding a core band and processes an audio signal in one of a time domain and a frequency domain. The DRC-1 processes the audio signal in the frequency domain when the audio signal is multiband. Format conversion, on the other hand, processes audio signals in the frequency domain.

FIG. 3 illustrates a USAC 3D encoder and a USAC 3D decoder, according to an exemplary embodiment.

Referring to FIG. 3, the USAC 3D encoder may include both a first encoding unit 301 and a second encoding unit 302, or alternatively only the second encoding unit 302. Similarly, the USAC 3D decoder may include a first decoding unit 303 and a second decoding unit 304, or alternatively only the first decoding unit 303.

An N-channel input signal is input to the first encoding unit 301. The first encoding unit 301 may then downmix the N-channel input signal to output an M-channel downmix signal, where N has a value larger than M. For example, when N is even, M may be N/2, and when N is odd, M may be (N-1)/2 + 1. In summary, this may be expressed as Equation 1.

<Equation 1>

M = N/2 (when N is even), M = (N-1)/2 + 1 (when N is odd)

The second encoding unit 302 may generate a bitstream by encoding the M-channel downmix signal. A general-purpose audio coder may be used to encode the M-channel downmix signal. For example, when the second encoding unit 302 is a USAC coder, an extension of HE-AAC, the second encoding unit 302 may encode and transmit 24 channel signals.

However, when the N-channel input signal is encoded using only the second encoding unit 302, more bits are required than when it is encoded using both the first encoding unit 301 and the second encoding unit 302, and sound quality degradation may also occur.

Meanwhile, the first decoding unit 303 may output an M-channel downmix signal by decoding the bitstream generated by the second encoding unit 302. The second decoding unit 304 may then generate an N-channel output signal by upmixing the M-channel downmix signal. The N-channel output signal may be restored to be similar to the N-channel input signal input to the first encoding unit 301.

For example, the first decoding unit 303 may decode the M-channel downmix signal, and a general-purpose audio coder may be used. For example, when the first decoding unit 303 is a USAC coder, an extension of HE-AAC, the first decoding unit 303 may decode a 24-channel downmix signal.

FIG. 4 is a first diagram illustrating a detailed configuration of the first encoding unit of FIG. 3 according to an embodiment.

The first encoding unit 301 may include a plurality of downmixing units 401. The N-channel input signals input to the first encoding unit 301 may be grouped into pairs of two channels and then input to the downmixing units 401. Thus, each downmixing unit 401 may represent a two-to-one (TTO) box. The downmixing unit 401 may extract spatial cues such as CLD (Channel Level Difference), ICC (Inter-Channel Correlation/Coherence), IPD (Inter-channel Phase Difference), CPC (Channel Prediction Coefficient), or OPD (Overall Phase Difference) from the two input channel signals, and may generate a one-channel (mono) downmix signal by downmixing the two-channel (stereo) input signal.
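The following sketch illustrates, under simplified assumptions, how a TTO box might extract CLD and ICC from a channel pair and produce a mono downmix; the function name tto_downmix and the cue formulas are illustrative stand-ins and do not reproduce the exact definitions of the specification.

```python
# Illustrative TTO (two-to-one) box: extract CLD/ICC-style cues from a stereo pair
# and produce a mono downmix. The formulas are simplified stand-ins for the MPS cues.
import numpy as np

def tto_downmix(ch1, ch2, eps=1e-12):
    e1, e2 = np.sum(ch1 ** 2), np.sum(ch2 ** 2)                   # per-channel energies
    cld_db = 10.0 * np.log10((e1 + eps) / (e2 + eps))             # channel level difference in dB
    icc = np.sum(ch1 * ch2) / np.sqrt((e1 + eps) * (e2 + eps))    # normalized correlation
    downmix = 0.5 * (ch1 + ch2)                                   # simple sum downmix
    return downmix, {"CLD": cld_db, "ICC": icc}

# Example: downmix one stereo pair of an N-channel input.
left = np.sin(np.linspace(0, 10, 480))
right = 0.5 * np.sin(np.linspace(0, 10, 480) + 0.1)
mono, cues = tto_downmix(left, right)
print(mono.shape, cues)
```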

The plurality of downmixing units 401 included in the first encoding unit 301 may be arranged in a parallel structure. For example, when an N-channel input signal is input to the first encoding unit 301 and N is an even number, N/2 downmixing units 401 implemented as TTO boxes may be required in the first encoding unit 301. In the case of FIG. 4, the first encoding unit 301 may downmix the N-channel input signal through N/2 TTO boxes to generate an M-channel (N/2-channel) downmix signal.

FIG. 5 is a second diagram illustrating a detailed configuration of a first encoding unit of FIG. 3 according to an embodiment.

FIG. 4 illustrates a detailed configuration of the first encoding unit 301 when an N-channel input signal is input to the first encoding unit 301 and N is an even number. FIG. 5 illustrates a detailed configuration of the first encoding unit 301 when an N-channel input signal is input to the first encoding unit 301 and N is an odd number.

Referring to FIG. 5, the first encoding unit 301 may include a plurality of downmixing units 501. In this case, the first encoding unit 301 may include (N-1)/2 downmixing units 501. In addition, the first encoding unit 301 may include a delay unit 502 to process the remaining one channel signal.

In this case, the N-channel input signals input to the first encoding unit 301 may be grouped into pairs of two channels and then input to the downmixing units 501. Thus, each downmixing unit 501 may represent a TTO box. The downmixing unit 501 extracts spatial cues such as CLD, ICC, IPD, CPC, or OPD from the two input channel signals and generates a one-channel (mono) downmix signal by downmixing the two-channel (stereo) input signal. The number of channels M of the downmix signal output from the first encoding unit 301 is determined by the number of downmixing units 501 and the number of delay units 502.

The delay value applied to the delay unit 502 may be the same as the delay value applied to the downmixer 501. If the downmix signal of the M channel, which is an output signal of the first encoding unit 301, is a PCM signal, the delay value may be determined according to Equation 2 below.

<Equation 2>

Figure PCTKR2015006788-appb-I000002

Here, Enc_Delay represents a delay value applied to the downmixing unit 501 and the delay unit 502. Delay1 (QMF Analysis) represents a delay value generated during QMF analysis for 64 bands of the MPS and may be 288. Delay2 (Hybrid QMF Analysis) represents a delay value generated during Hybrid QMF analysis using a 13-tap filter, and may be 6 * 64 = 384. Here, the reason why 64 is applied is that Hybrid QMF analysis is performed after QMF analysis is performed for 64 bands.
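As a plausible reading of the numbers above (assuming the overall encoder delay for a PCM output is simply the sum of the QMF and hybrid QMF analysis delays; the exact forms of Equations 2 and 3 are not reproduced here), a small sketch:

```python
# Hedged sketch of the encoder-side delay budget described above. Assumption:
# for a PCM output the total delay is the sum of the two analysis delays;
# the exact forms of Equations 2 and 3 are not reproduced from the patent.
QMF_ANALYSIS_DELAY = 288            # 64-band QMF analysis delay (given in the text)
HYBRID_QMF_ANALYSIS_DELAY = 6 * 64  # 13-tap hybrid filtering after the 64-band QMF = 384

def encoder_delay_pcm() -> int:
    """Assumed total delay applied to the downmixing and delay units for a PCM output."""
    return QMF_ANALYSIS_DELAY + HYBRID_QMF_ANALYSIS_DELAY

print(encoder_delay_pcm())  # 672 under this assumption
```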

If the downmix signal of the M channel, which is the output signal of the first encoding unit 301, is a QMF signal, the delay value may be determined according to Equation (3).

<Equation 3>

Figure PCTKR2015006788-appb-I000003

FIG. 6 is a third diagram illustrating a detailed configuration of the first encoding unit of FIG. 3 according to an embodiment. FIG. 7 is a fourth diagram illustrating a detailed configuration of the first encoding unit of FIG. 3 according to an embodiment.

It is assumed that the N-channel input signal is composed of an N′-channel input signal and a K-channel input signal. In this case, it is assumed that the N′-channel input signal is input to the downmixing units of the first encoding unit 301, while the K-channel input signal is not downmixed.

In this case, M, the number of channels of the downmix signal input to the second encoding unit 302, may be determined by Equation 4.

<Equation 4>

Figure PCTKR2015006788-appb-I000004

FIG. 6 illustrates a structure of the first encoding unit 301 when N′ is an even number, and FIG. 7 illustrates a structure of the first encoding unit 301 when N′ is an odd number.

Referring to FIG. 6, when N′ is an even number, the N′-channel input signals may be input to the plurality of downmixing units 601, and the K-channel input signals may be input to the plurality of delay units 602. Here, the N′-channel input signals may be input to downmixing units 601 representing N′/2 TTO boxes, and the K-channel input signals may be input to K delay units 602.

According to FIG. 7, when N′ is an odd number, the N′-channel input signals may be input to the plurality of downmixing units 701 and to one delay unit 702, and the K-channel input signals may be input to the plurality of delay units 702. Here, the N′-channel input signals may be input to downmixing units 701 representing (N′-1)/2 TTO boxes and to one delay unit 702, and the K-channel input signals may be input to K delay units 702.

FIG. 8 is a first diagram illustrating a detailed configuration of the second decoding unit of FIG. 3 according to an embodiment.

Referring to FIG. 8, the second decoding unit 304 may generate an N-channel output signal by upmixing the M-channel downmix signal transmitted from the first decoding unit 303. The first decoding unit 303 may decode the M-channel downmix signal included in the bitstream. In this case, the second decoding unit 304 may generate the N-channel output signal by upmixing the M-channel downmix signal using the spatial cues transmitted from the first encoding unit 301 of FIG. 3.

For example, when N, the number of channels of the output signal, is an even number, the second decoding unit 304 may include a plurality of decorrelating units 801 and upmixing units 802. When N is an odd number, the second decoding unit 304 may include a plurality of decorrelating units 801, upmixing units 802, and a delay unit 803. That is, when N is an even number, the delay unit 803 shown in FIG. 8 may be unnecessary.

In this case, since an additional delay may occur in the process of generating a decorrelated signal in the decorrelating unit 801, the delay value of the delay unit 803 may be different from the delay value applied at the encoder. FIG. 8 illustrates the case where N, the number of channels of the output signal derived from the second decoding unit 304, is an odd number.

When the output signal of the N channel output from the second decoding unit 304 is a PCM signal, the delay value of the delay unit 803 may be determined according to Equation 5 below.

<Equation 5>

Figure PCTKR2015006788-appb-I000005

Here, Dec_Delay represents the delay value of the delay unit 803. Delay1 represents the delay value generated by QMF analysis, Delay2 represents the delay value generated by hybrid QMF analysis, and Delay3 represents the delay value generated by QMF synthesis. Delay4 represents the delay value generated when the decorrelation filter is applied in the decorrelating unit 801.

When the output signal of the N channel output from the second decoding unit 304 is a QMF signal, the delay value of the delay unit 803 may be determined according to Equation 6 below.

<Equation 6>

Figure PCTKR2015006788-appb-I000006

First, each of the plurality of decorrelating units 801 may generate a decorrelated signal from the M-channel downmix signal input to the second decoding unit 304. The decorrelated signal generated by each of the decorrelating units 801 may be input to the upmixing unit 802.

In this case, unlike the way decorrelated signals are generated in MPS, the plurality of decorrelating units 801 may generate decorrelated signals directly from the M-channel downmix signal. That is, when the M-channel downmix signal transmitted from the encoder is used to generate the decorrelated signals, sound quality degradation may be avoided when reproducing the sound field of the multichannel signal.

Hereinafter, an operation of the upmixing unit 802 included in the second decoding unit 304 will be described. The M-channel downmix signal input to the second decoding unit 304 may be defined as m(n) = [m_0(n), m_1(n), ..., m_{M-1}(n)]^T. The M decorrelated signals generated using the M-channel downmix signal may be defined as the vector in Figure PCTKR2015006788-appb-I000007, and the N-channel output signal output through the second decoding unit 304 may be defined as the vector in Figure PCTKR2015006788-appb-I000008.

Then, the second decoding unit 304 may generate an output signal of the N channel according to Equation 7 below.

<Equation 7>

Figure PCTKR2015006788-appb-I000009

Here, M(n) denotes a matrix for upmixing the M-channel downmix signal at sample time n. M(n) may be defined by Equation 8 below.

<Equation 8>

Figure PCTKR2015006788-appb-I000010

In Equation 8, 0 is a 2x2 zero matrix, and (Figure PCTKR2015006788-appb-I000011) is a 2x2 matrix that may be defined as in Equation 9.

<Equation 9>

Figure PCTKR2015006788-appb-I000012

Here, (Figure PCTKR2015006788-appb-I000013), which is a component of (Figure PCTKR2015006788-appb-I000014), may be derived from the spatial cues transmitted from the encoder. The spatial cues actually transmitted from the encoder are determined for each index b, which is a frame unit, while (Figure PCTKR2015006788-appb-I000015), which is applied on a per-sample basis, may be determined by interpolation between adjacent frames.

(Figure PCTKR2015006788-appb-I000016) may be determined by Equation 10 according to the MPS method.

<Equation 10>

Figure PCTKR2015006788-appb-I000017

In Equation 10, (Figure PCTKR2015006788-appb-I000018) may be derived from the CLD, and (Figure PCTKR2015006788-appb-I000019) and (Figure PCTKR2015006788-appb-I000020) may be derived from the CLD and the ICC. Equation 10 may be derived according to the spatial cue processing method defined in MPS.

In Equation 7, (Figure PCTKR2015006788-appb-I000021) denotes an operator that interlaces the elements of the vectors to create a new column vector. In Equation 7, (Figure PCTKR2015006788-appb-I000022) may be determined according to Equation 11 below.

<Equation 11>

Figure PCTKR2015006788-appb-I000023

Through this process, Equation 7 may be represented by Equation 12 below.

<Equation 12>

Figure PCTKR2015006788-appb-I000024

In Equation 12, braces {} are used to clearly indicate the pairing of the input signal and the output signal. According to Equation 11, each channel of the M-channel downmix signal and its decorrelated signal are paired with each other and form the input of the upmixing matrix of Equation 12. That is, according to Equation 12, by applying a decorrelated signal to each channel of the M-channel downmix signal, distortion of sound quality during the upmixing process can be minimized and a sound field effect as close as possible to that of the original signal can be generated.

Equation 12 described above may also be represented by Equation 13 below.

<Equation 13>

Figure PCTKR2015006788-appb-I000025
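A sketch of this pairwise upmixing follows, assuming (as Equations 8 to 13 suggest) that M(n) is block diagonal with one 2x2 matrix per downmix/decorrelated-signal pair; the 2x2 entries here are random placeholders standing in for the CLD/ICC-derived values of Equations 9 and 10, and the name upmix_pairs is an assumption.

```python
# Illustrative pairwise upmix: each channel m_i of the M-channel downmix is paired
# with its decorrelated signal d_i and multiplied by a 2x2 matrix (placeholder for
# the CLD/ICC-derived matrix of Equations 9-10); this is what the block-diagonal
# matrix M(n) of Equation 8 does in a single step.
import numpy as np

def upmix_pairs(m, d, R):
    """m, d: (M, T) downmix and decorrelated signals; R: (M, 2, 2) per-pair matrices."""
    num_pairs, T = m.shape
    y = np.empty((2 * num_pairs, T))
    for i in range(num_pairs):
        pair = np.vstack([m[i], d[i]])        # interlaced input as in Equation 11
        y[2 * i:2 * i + 2] = R[i] @ pair      # per-pair upmix as in Equation 13
    return y                                  # N = 2*M output channels

rng = np.random.default_rng(1)
m = rng.standard_normal((3, 16))
d = rng.standard_normal((3, 16))
R = rng.standard_normal((3, 2, 2))            # placeholders for the 2x2 upmix matrices
print(upmix_pairs(m, d, R).shape)             # (6, 16)
```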

FIG. 9 is a second diagram illustrating a detailed configuration of the second decoding unit of FIG. 3 according to an embodiment.

Referring to FIG. 9, the second decoding unit 304 may upmix the M-channel downmix signal transmitted from the first decoding unit 303 to generate an N-channel output signal. When the M-channel downmix signal is composed of an N′/2-channel audio signal and a K-channel audio signal, the second decoding unit 304 may process it in a manner that mirrors the structure applied at the encoder.

For example, assuming that the M-channel downmix signal input to the second decoding unit 304 satisfies Equation 4, the second decoding unit 304 may include a plurality of delay units 903 as shown in FIG. 9.

In this case, when N′ is odd for the M-channel downmix signal satisfying Equation 4, the second decoding unit 304 may have the structure shown in FIG. 9. When N′ is even for the M-channel downmix signal satisfying Equation 4, the one delay unit 903 located below the upmixing unit 902 in the second decoding unit 304 of FIG. 9 may be excluded.

FIG. 10 is a third diagram illustrating a detailed configuration of a second decoding unit of FIG. 3 according to an embodiment.

Referring to FIG. 10, the second decoding unit 304 may generate an N-channel output signal by upmixing an M-channel downmix signal transmitted from the first decoding unit 303. In this case, in the second decoding unit 304 illustrated in FIG. 10, the upmixing unit 1002 may include a plurality of signal processing units 1003 representing a one-to-two box.

Each of the plurality of signal processing units 1003 may generate a two-channel output signal using one channel of the M-channel downmix signal and the decorrelated signal generated by the decorrelating unit 1001. The plurality of signal processing units 1003 arranged in parallel in the upmixing unit 1002 may generate an (N-1)-channel output signal.

If N is an even number, the delay unit 1004 may be excluded from the second decoding unit 304. Then, the plurality of signal processing units 1003 arranged in parallel in the upmixing unit 1002 may generate output signals of N channels.

The signal processor 1003 may upmix according to Equation 13. The upmixing process performed by all the signal processing units 1003 may be represented by one upmixing matrix as shown in Equation 12.

FIG. 11 is a diagram illustrating an example of implementing FIG. 3 according to an embodiment.

Referring to FIG. 11, the first encoding unit 301 may include a plurality of TTO-box downmixing units 1101 and a plurality of delay units 1102. The second encoding unit 302 may include a plurality of USAC encoders 1103. Meanwhile, the first decoding unit 303 may include a plurality of USAC decoders 1106, and the second decoding unit 304 may include a plurality of OTT-box upmixing units 1107 and a plurality of delay units 1108.

Referring to FIG. 11, the first encoding unit 301 may output an M-channel downmix signal from the N-channel input signal. The M-channel downmix signal may then be input to the second encoding unit 302. Among the M-channel downmix signals, pairs of one-channel downmix signals that have passed through the TTO-box downmixing units 1101 may be encoded in stereo form by a USAC encoder 1103 included in the second encoding unit 302.

A downmix signal that has passed through a delay unit 1102 rather than a TTO-box downmixing unit 1101 may be encoded in mono or stereo form by a USAC encoder 1103. In other words, a one-channel downmix signal of the M-channel downmix signal that has passed through one delay unit 1102 may be encoded in mono form by a USAC encoder 1103, while two channels of the M-channel downmix signal that have passed through two delay units 1102 may be encoded together in stereo form by a USAC encoder 1103.

The M channel signals may be encoded by the second encoding unit 302 to generate a plurality of bitstreams. The plurality of bitstreams may be reformatted into one bitstream through the multiplexer 1104.

The bitstream generated by the multiplexer 1104 is transferred to the demultiplexer 1105, and the demultiplexer 1105 may demultiplex it into a plurality of bitstreams corresponding to the USAC decoders 1106 included in the first decoding unit 303.

The demultiplexed bitstreams may each be input to a USAC decoder 1106 included in the first decoding unit 303. The USAC decoders 1106 may decode according to the method used by the USAC encoders 1103 included in the second encoding unit 302. The first decoding unit 303 may thus output the M-channel downmix signal from the plurality of bitstreams.

Thereafter, the second decoding unit 304 may generate the N-channel output signal using the M-channel downmix signal. In this case, the second decoding unit 304 may upmix part of the input M-channel downmix signal using the OTT-box upmixing units 1107. Specifically, one channel of the M-channel downmix signal is input to an upmixing unit 1107, and the upmixing unit 1107 generates a two-channel output signal using that one-channel downmix signal and a decorrelated version of it. For example, the upmixing unit 1107 may generate the two-channel output signal using Equation 13.

Meanwhile, upmixing according to the upmixing matrix of Equation 13 is performed M times across the plurality of upmixing units 1107, so that the second decoding unit 304 can generate the N-channel output signal. Since Equation 12 is obtained simply by performing the upmixing of Equation 13 M times, M in Equation 12 may be equal to the number of upmixing units 1107 included in the second decoding unit 304.

When the first encoding unit 301 passes a K-channel audio signal among the N-channel input signals into the M-channel downmix signal through delay units 1102 rather than through the TTO-box downmixing units 1101, the K-channel audio signal may be processed in the second decoding unit 304 by delay units 1108 rather than by the OTT-box upmixing units 1107. In this case, the number of output channels produced through the upmixing units 1107 may be N-K.

FIG. 12 is a diagram schematically illustrating FIG. 11 according to an embodiment.

Referring to FIG. 12, the N-channel input signals may be input to the downmixing units 1201 included in the first encoding unit 301 in pairs of two channels. Each downmixing unit 1201 may be configured as a TTO box and downmix the two input signals to generate one downmix signal. The first encoding unit 301 may generate an M-channel downmix signal from the N-channel input signals by using the plurality of downmixing units 1201 arranged in parallel. According to an embodiment of the present invention, N is an integer greater than M, and M may be N/2.

Then, a stereo-type USAC encoder 1202 included in the second encoding unit 302 may generate a bitstream by encoding the two downmix signals output from two downmixing units 1201.

A stereo-type USAC decoder 1203 included in the first decoding unit 303 may restore, from the bitstream, two one-channel downmix signals belonging to the M-channel downmix signal. The two one-channel downmix signals may be input to two upmixing units 1204, each representing an OTT box, included in the second decoding unit 304. Each upmixing unit 1204 may then generate a two-channel output signal, which forms part of the N-channel output signal, using the one-channel downmix signal and a decorrelated version of it.

FIG. 13 is a diagram illustrating a detailed configuration of a second encoding unit and a first decoding unit of FIG. 12 according to an embodiment.

In FIG. 13, the USAC encoder 1302 included in the second encoding unit 302 may include a TTO-box downmixing unit 1303, a spectral band replication (SBR) unit 1304, and a core encoding unit 1305.

A TTO-box downmixing unit 1301 included in the first encoding unit 301 downmixes two of the N-channel input signals to generate one channel of the M-channel downmix signal. The number of channels M may be determined by the number of downmixing units 1301.

The two downmix signals output from two downmixing units 1301 included in the first encoding unit 301 may then be input to the TTO-box downmixing unit 1303 included in the USAC encoder 1302. The downmixing unit 1303 may generate a one-channel downmix signal by downmixing the pair of one-channel downmix signals output from the two downmixing units 1301.

To handle the high-frequency band of the mono signal generated by the downmixing unit 1303, the SBR unit 1304 may extract only the low-frequency band, excluding the high-frequency band, from the mono signal. The core encoding unit 1305 may then generate a bitstream by encoding the low-frequency-band mono signal corresponding to the core band.

In conclusion, according to an embodiment of the present invention, TTO-type downmixing may be performed in cascade to generate a bitstream containing the M-channel downmix signal from the N-channel input signal. In other words, a TTO-box downmixing unit 1301 may downmix a stereo pair of the N-channel input signals, and the outputs of two downmixing units 1301, which form part of the M-channel downmix signal, may be input to the TTO-box downmixing unit 1303. That is, four of the N input channels may be successively reduced to a one-channel downmix signal through cascaded TTO downmixing.
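A minimal sketch of this two-stage (cascaded) TTO downmix, reducing four input channels to one, is given below; the averaging downmix is an illustrative stand-in for the actual TTO boxes, which would also extract spatial cues at each stage.

```python
# Illustrative cascaded TTO downmixing: 4 input channels -> 2 (first stage, units 1301)
# -> 1 (second stage, unit 1303 inside the USAC encoder). Averaging stands in for the
# real TTO boxes.
import numpy as np

def tto(a, b):
    return 0.5 * (a + b)                      # placeholder TTO downmix

def cascade_four_to_one(ch):
    """ch: (4, T) array of four input channels."""
    stage1_a = tto(ch[0], ch[1])              # first-stage TTO box
    stage1_b = tto(ch[2], ch[3])              # first-stage TTO box
    return tto(stage1_a, stage1_b)            # second-stage TTO box inside the USAC encoder

x = np.random.default_rng(2).standard_normal((4, 32))
print(cascade_four_to_one(x).shape)           # (32,): one mono channel
```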

The bitstream generated by the second encoding unit 302 may be input to the USAC decoder 1306 of the first decoding unit 303. In FIG. 13, the USAC decoder 1306 included in the first decoding unit 303 may include a core decoding unit 1307, an SBR unit 1308, and an OTT-box upmixing unit 1309.

The core decoding unit 1307 may output a mono signal of the core band corresponding to the low frequency band using the bitstream. Then, the SBR unit 1308 may restore the high frequency band by copying the low frequency band of the mono signal. The upmixing unit 1309 may generate a stereo signal constituting the downmix signal of the M channel by upmixing the mono signal output from the SBR unit 1308.

Then, an OTT-box upmixing unit 1310 included in the second decoding unit 304 may generate a stereo signal by upmixing one mono channel of the stereo signal generated by the first decoding unit 303.

In conclusion, according to an embodiment of the present invention, OTT-type upmixing may be performed in cascade to recover the N-channel output signal from the bitstream. In other words, the OTT-box upmixing unit 1309 may generate a stereo signal by upmixing a mono (one-channel) signal. Each of the two mono signals constituting the stereo output of the upmixing unit 1309 may be input to an OTT-box upmixing unit 1310, which outputs a stereo signal by upmixing the input mono signal. That is, four output channels can be generated from one mono signal through cascaded OTT upmixing.

FIG. 14 is a diagram illustrating a result of combining the first encoding unit and the second encoding unit of FIG. 11 and combining the first decoding unit and the second decoding unit, according to an exemplary embodiment.

The first encoding unit and the second encoding unit of FIG. 11 may be combined and implemented as one encoding unit 1401 as illustrated in FIG. 14. Similarly, the first decoding unit and the second decoding unit of FIG. 11 may be combined and implemented as one decoding unit 1402 as shown in FIG. 14.

The encoding unit 1401 of FIG. 14 may include encoding units 1403, each of which adds a TTO-box downmixing unit 1404 in front of a USAC encoder that includes a TTO-box downmixing unit 1405, an SBR unit 1406, and a core encoding unit 1407. In this case, the encoding unit 1401 may include a plurality of encoding units 1403 arranged in a parallel structure. In other words, the encoding unit 1403 may correspond to a USAC encoder that additionally includes the TTO-box downmixing unit 1404.

That is, according to an embodiment of the present invention, the encoding unit 1403 may generate a one-channel mono signal by applying cascaded TTO downmixing to four of the N input channels.

In the same manner, the decoding unit 1402 of FIG. 14 may include decoding units 1410, each of which adds an OTT-box upmixing unit 1404 to a USAC decoder that includes a core decoding unit 1411, an SBR unit 1412, and an OTT-box upmixing unit 1413. In this case, the decoding unit 1402 may include a plurality of decoding units 1410 arranged in a parallel structure. In other words, the decoding unit 1410 may correspond to a USAC decoder that additionally includes the OTT-box upmixing unit 1404.

That is, according to an embodiment of the present invention, the decoding unit 1410 may generate four channels of the N-channel output signal by applying cascaded OTT upmixing to the mono signal.

FIG. 15 is a diagram schematically illustrating FIG. 14 according to an embodiment.

In FIG. 15, the encoding unit 1501 may correspond to the encoding unit 1403 of FIG. 14, that is, to a modified USAC encoder. The modified USAC encoder may be implemented by additionally including a TTO-box downmixing unit 1503 in the original USAC encoder, which includes a TTO-box downmixing unit 1504, an SBR unit 1505, and a core encoding unit 1506.

Similarly, in FIG. 15, the decoding unit 1502 may correspond to the decoding unit 1410 of FIG. 14, that is, to a modified USAC decoder. The modified USAC decoder may be implemented by additionally including an OTT-box upmixing unit 1510 in the original USAC decoder, which includes a core decoding unit 1507, an SBR unit 1508, and an OTT-box upmixing unit 1509.

FIG. 16 is a diagram illustrating an audio processing scheme for an N-N/2-N structure according to an embodiment.

Referring to FIG. 16, an N-N/2-N structure obtained by modifying the structures defined in MPEG Surround is illustrated. In MPEG Surround, spatial synthesis may be performed in the decoder as shown in Table 1. Spatial synthesis transforms the input signals from the time domain into a non-uniform subband domain through a hybrid Quadrature Mirror Filter (QMF) analysis bank; here, the non-uniform subbands correspond to the hybrid structure.

The decoder then operates in the hybrid subband domain. The decoder may generate output signals from the input signals by performing spatial synthesis based on the spatial parameters transmitted by the encoder, and it can then use a hybrid QMF synthesis bank to transform the output signals from the hybrid subband domain back to the time domain.

Figure PCTKR2015006788-appb-I000026

FIG. 16 illustrates a process of processing a multichannel audio signal through the mixing matrices of the spatial synthesis performed by the decoder. MPEG Surround basically defines 5-1-5, 5-2-5, 7-2-7, and 7-5-7 structures, whereas the present invention proposes an N-N/2-N structure.

In the N-N/2-N structure, the N-channel input signal is converted into an N/2-channel downmix signal, and the N-channel output signal is then generated from the N/2-channel downmix signal. The decoder according to an embodiment of the present invention may generate the N-channel output signal by upmixing the N/2-channel downmix signal. In principle, the number of channels N in the N-N/2-N structure of the present invention is not limited. That is, the N-N/2-N structure may support not only the channel configurations supported by MPS but also channel configurations of multichannel audio signals not supported by MPS.

In FIG. 16, NumInCh refers to the number of channels of the downmix signal, and NumOutCh refers to the number of channels of the output signal. In other words, NumInCh is N/2 and NumOutCh is N.

In FIG. 16, the N/2-channel downmix signals (X_0 to X_{NumInCh-1}) and the residual signals form the input vector X. Since NumInCh is N/2, X_0 to X_{NumInCh-1} represent the N/2-channel downmix signals. Since the number of one-to-two (OTT) boxes is N/2, N, the number of channels of the output signal, must be even to process the N/2-channel downmix signal.

The input vector X, which is multiplied by the vector corresponding to the matrix M1 (Figure PCTKR2015006788-appb-I000027), is a vector including the N/2-channel downmix signal. When no LFE channel is included in the N-channel output signal, up to N/2 decorrelators may be used. However, if N, the number of channels of the output signal, exceeds 20, the decorrelator filters can be reused.

To ensure orthogonality of the decorrelator outputs, some decorrelator indices are repeated once N reaches 20, so the number of available decorrelators needs to be limited to a certain number (e.g., 10). Therefore, according to a preferred embodiment of the present invention, N, the number of channels of the output signal in the N-N/2-N structure, needs to be less than twice this limited number (e.g., N < 20). If LFE channels are included in the output signal, N needs to be configured with a number of channels smaller than twice the limited number plus the number of LFE channels (e.g., N < 24), since LFE channels do not use decorrelators.

The outputs of the decorrelators may be replaced with residual signals in specific frequency regions, depending on the bitstream. If an LFE channel is one of the outputs of an OTT box, no decorrelator is used for the upmix of that OTT box.

In FIG. 16, the decorrelators labeled 1 to M (where M = NumInCh - NumLfe), their output (decorrelated) signals, and the residual signals correspond to different OTT boxes. d_1 to d_M denote the decorrelated signals output from the decorrelators D_1 to D_M, and res_1 to res_M denote the residual signals output from the decorrelators D_1 to D_M. The decorrelators D_1 to D_M correspond to different OTT boxes, respectively.
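A sketch of one way to assign decorrelator indices to OTT boxes under these constraints follows: boxes that produce an LFE channel receive no decorrelator, and the remaining indices wrap around by a modulo operation once a cap (e.g., 10, as suggested above) is reached. The function name and the exact policy are illustrative assumptions.

```python
# Illustrative decorrelator index assignment for the N/2 OTT boxes:
# boxes whose output includes an LFE channel get no decorrelator, and the
# remaining boxes reuse indices modulo a cap to bound the number of filters.
def assign_decorrelators(num_ott_boxes, lfe_boxes, max_decorrelators=10):
    """Returns a dict: OTT box index -> decorrelator index (or None for LFE boxes)."""
    mapping, next_idx = {}, 0
    for box in range(num_ott_boxes):
        if box in lfe_boxes:
            mapping[box] = None                          # LFE: no decorrelator used
        else:
            mapping[box] = next_idx % max_decorrelators  # modulo reuse beyond the cap
            next_idx += 1
    return mapping

# Example: a 24-channel output (12 OTT boxes) with one LFE-producing box.
print(assign_decorrelators(12, lfe_boxes={11}))
```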

In the following, the vectors and matrices used in the N-N/2-N structure are defined. The input signals to the decorrelators in the N-N/2-N structure are defined as the vector in Figure PCTKR2015006788-appb-I000028.

The vector in Figure PCTKR2015006788-appb-I000029 may be determined differently depending on whether a temporal shaping tool is used.

(1) When a temporal shaping tool is not used

When no temporal shaping tool is used, the vector (Figure PCTKR2015006788-appb-I000030) is derived according to Equation 14 from (Figure PCTKR2015006788-appb-I000031) by the matrix (Figure PCTKR2015006788-appb-I000032) corresponding to the matrix M1. Here, (Figure PCTKR2015006788-appb-I000033) is a matrix of N rows and one column.

<Equation 14>

Figure PCTKR2015006788-appb-I000034

In Equation 14, among the elements of the vector (Figure PCTKR2015006788-appb-I000035), the elements (Figure PCTKR2015006788-appb-I000036) to (Figure PCTKR2015006788-appb-I000037) may be input directly to the matrix M2 without being input to the N/2 decorrelators corresponding to the N/2 OTT boxes. Accordingly, (Figure PCTKR2015006788-appb-I000038) to (Figure PCTKR2015006788-appb-I000039) may be defined as the direct signals. Among the elements of the vector (Figure PCTKR2015006788-appb-I000040), the signals other than (Figure PCTKR2015006788-appb-I000041) to (Figure PCTKR2015006788-appb-I000042), namely (Figure PCTKR2015006788-appb-I000043) to (Figure PCTKR2015006788-appb-I000044), may be input to the N/2 decorrelators corresponding to the N/2 OTT boxes.

The vector (Figure PCTKR2015006788-appb-I000045) is composed of the direct signals, the decorrelated signals d_1 to d_M output from the decorrelators, and the residual signals res_1 to res_M output from the decorrelators. The vector (Figure PCTKR2015006788-appb-I000046) may be determined by Equation 15 below.

<Equation 15>

Figure PCTKR2015006788-appb-I000047

In Equation 15, (Figure PCTKR2015006788-appb-I000049), defined as (Figure PCTKR2015006788-appb-I000048), means the set of all k satisfying (Figure PCTKR2015006788-appb-I000050). And (Figure PCTKR2015006788-appb-I000051) means the decorrelated signal output from the decorrelator (Figure PCTKR2015006788-appb-I000053) when the signal (Figure PCTKR2015006788-appb-I000052) is input to that decorrelator. In particular, (Figure PCTKR2015006788-appb-I000054) means the signal output from the decorrelator when the OTT box is OTTx and the residual signal is (Figure PCTKR2015006788-appb-I000055).

The subbands of the output signal can be defined for all time slots n and all hybrid subbands k. The output signal (Figure PCTKR2015006788-appb-I000056) can be determined by Equation 16 using the vector w and the matrix M2.

<Equation 16>

Figure PCTKR2015006788-appb-I000057

Here, (Figure PCTKR2015006788-appb-I000058) denotes the matrix M2, composed of NumOutCh rows and NumInCh-NumLfe columns. (Figure PCTKR2015006788-appb-I000059), that is, (Figure PCTKR2015006788-appb-I000060), can be defined by Equation 17 below.

<Equation 17>

Figure PCTKR2015006788-appb-I000061

Here, (Figure PCTKR2015006788-appb-I000062) is defined as shown above, and (Figure PCTKR2015006788-appb-I000063) can be smoothed according to Equation 18 below.

<Equation 18>

Figure PCTKR2015006788-appb-I000064

Here, (Figure PCTKR2015006788-appb-I000065) denotes a function in which the first row is the hybrid band k and the second row is the corresponding processing band, and (Figure PCTKR2015006788-appb-I000066) corresponds to the last parameter set of the previous frame.

Meanwhile, (Figure PCTKR2015006788-appb-I000067) denotes the hybrid subband signals that can be synthesized into the time domain through the hybrid synthesis filter bank. Here, the hybrid synthesis filter bank is a combination of the Nyquist synthesis banks and the QMF synthesis bank, and (Figure PCTKR2015006788-appb-I000068) can be transformed from the hybrid subband domain to the time domain through the hybrid synthesis filter bank.

(2) When a temporal shaping tool is used

If a temporal shaping tool is used, the vector (Figure PCTKR2015006788-appb-I000069) is the same as described above, but the vector (Figure PCTKR2015006788-appb-I000070) may be divided into two vectors as shown in Equation 19 and Equation 20 below.

<Equation 19>

Figure PCTKR2015006788-appb-I000071

<Equation 20>

Figure PCTKR2015006788-appb-I000072

(Figure PCTKR2015006788-appb-I000073) denotes the direct signals input directly to the matrix M2 without passing through the decorrelators together with the residual signals output from the decorrelators, and (Figure PCTKR2015006788-appb-I000074) denotes the decorrelated signals output from the decorrelators. (Figure PCTKR2015006788-appb-I000076), defined as (Figure PCTKR2015006788-appb-I000075), means the set of all k satisfying (Figure PCTKR2015006788-appb-I000077). Also, when the input signal (Figure PCTKR2015006788-appb-I000079) is input to the decorrelator (Figure PCTKR2015006788-appb-I000078), (Figure PCTKR2015006788-appb-I000080) means the decorrelated signal output from the decorrelator (Figure PCTKR2015006788-appb-I000081).

Using (Figure PCTKR2015006788-appb-I000082) and (Figure PCTKR2015006788-appb-I000083) as defined in Equation 19 and Equation 20, the final output signal can be divided into (Figure PCTKR2015006788-appb-I000084) and (Figure PCTKR2015006788-appb-I000085). (Figure PCTKR2015006788-appb-I000086) includes the direct signal, and (Figure PCTKR2015006788-appb-I000087) includes the diffuse signal. In other words, (Figure PCTKR2015006788-appb-I000088) is the result derived from the direct signals input directly to the matrix M2 without passing through the decorrelators, and (Figure PCTKR2015006788-appb-I000089) is the result derived from the diffuse signals output from the decorrelators and input to the matrix M2.

When a temporal shaping tool is used for the N-N/2-N structure, either Subband Domain Temporal Processing (STP) or Guided Envelope Shaping (GES) is applied, and (Figure PCTKR2015006788-appb-I000090) and (Figure PCTKR2015006788-appb-I000091) are derived separately. Which of (Figure PCTKR2015006788-appb-I000092) and (Figure PCTKR2015006788-appb-I000093) is used is identified by the bitstream element bsTempShapeConfig.

<When STP is used>

In order to synthesize the desired degree of decorrelation between the channels of the output signal, diffuse signals are generated through the decorrelators for spatial synthesis. The generated diffuse signal is then mixed with the direct signal. In general, however, the temporal envelope of the diffuse signal does not match the envelope of the direct signal.

Subband domain temporal processing is therefore used to shape the envelope of the diffuse signal portion of each output channel so that it matches the temporal shape of the downmix signal transmitted from the encoder. Such processing may be implemented with envelope estimation, such as calculating the envelope ratio between the direct and diffuse signals, and with shaping of the upper spectral portion of the diffuse signal.

That is, the temporal energy envelopes of the direct signal portion and the diffuse signal portion of the output signal generated through upmixing can be estimated. The shaping factor may be calculated as the ratio between the temporal energy envelope of the direct signal portion and that of the diffuse signal portion.

STP may be signaled as (Figure PCTKR2015006788-appb-I000094). If (Figure PCTKR2015006788-appb-I000095), the diffuse signal portion of the output signal generated through upmixing can be processed via STP.

Meanwhile, in order to reduce the need for delay alignment of the transmitted original downmix signal relative to the spatial upmix used to generate the output signal, the downmix of the spatial upmix is computed as an approximation of the transmitted original downmix signal.

For the N-N/2-N structure, the direct downmix signal for the (NumInCh-NumLfe) channels can be defined by Equation 21 below.

<Equation 21>

Figure PCTKR2015006788-appb-I000096

Here, (Figure PCTKR2015006788-appb-I000097) includes the pair-wise output signals corresponding to channel d of the output signal for the N-N/2-N structure. (Figure PCTKR2015006788-appb-I000098) may be defined as shown in Table 2 below for the N-N/2-N structure.

Figure PCTKR2015006788-appb-T000001

The broadband envelope of the downmix and the envelope of the diffuse signal portion of each upmix channel can be estimated according to Equation 22 using the normalized direct energy.

<Equation 22>

Figure PCTKR2015006788-appb-I000099
Here, (Figure PCTKR2015006788-appb-I000100) denotes a bandpass factor, and (Figure PCTKR2015006788-appb-I000101) denotes a spectral flattening factor.

Since there are direct signals for the (NumInCh-NumLfe) channels in the N-N/2-N structure, the energy (Figure PCTKR2015006788-appb-I000103) of the direct signal satisfying (Figure PCTKR2015006788-appb-I000102) can be obtained in the same manner as in the 5-1-5 structure defined in MPEG Surround. The scale factor for the final envelope processing may be defined as in Equation 23 below.

<Equation 23>

Figure PCTKR2015006788-appb-I000104

In Equation 23, the scale factor for the N-N/2-N structure can be defined as

Figure PCTKR2015006788-appb-I000105
. The scale factor is then applied to the diffuse signal portion of the output signal, so that the temporal envelope of the output signal is mapped approximately onto the temporal envelope of the downmix signal. The diffuse signal portion processed with the scale factor in each of the N output channels is then mixed with the direct signal portion. Whether the diffuse signal portion has been processed with the scale factor may be signaled for each channel of the output signal (
Figure PCTKR2015006788-appb-I000106
 indicates that the diffuse signal portion was processed with the scale factor).

<When GES is used>

When temporal shaping is performed on the diffuse signal portion of the output signal as described above, characteristic distortion may occur. Guided Envelope Shaping (GES) can therefore improve the temporal/spatial quality while avoiding such distortion. The decoder processes the direct signal portion and the diffuse signal portion of the output signal separately, and when GES is applied only the direct signal portion of the upmixed output signal is modified.

GES recovers the broadband envelope of the synthesized output signal. It involves a modified upmixing process together with flattening and reshaping of the envelope of the direct signal portion of each output channel.

For the reshaping, parametric broadband envelope side information included in the bitstream may be used. This side information contains the ratio between the envelope of the original input signal and the envelope of the downmix signal. At the decoder, the envelope ratio may be applied to the direct signal portion of each time slot of the frame, for each channel of the output signal. GES does not alter the diffuse signal portion of any output channel.
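A minimal Python sketch of the flattening and reshaping described above is given below, assuming the target broadband envelope per time slot (for example, the downmix envelope scaled by the transmitted envelope ratio) is already available as an array; the function name and interfaces are assumptions, and this is not the normative GES processing.

import numpy as np

def ges_reshape_direct(direct, target_env, eps=1e-12):
    # direct: complex array (num_time_slots, num_subbands) holding only the
    # direct part of one output channel; the diffuse part is left untouched.
    # target_env: desired broadband envelope per time slot (hypothetical input).
    # Current broadband envelope of the direct part, one value per time slot.
    env = np.sqrt(np.sum(np.abs(direct) ** 2, axis=1))
    # Flatten the direct part, then impose the target broadband envelope.
    return direct / (env[:, None] + eps) * target_env[:, None]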

If

Figure PCTKR2015006788-appb-I000107
, the GES process may proceed. When GES is enabled, the diffuse signal and the direct signal of the output signal may each be synthesized using the post-mixing matrix M2 modified in the hybrid subband domain according to Equation 24 below.

<Equation 24>

Figure PCTKR2015006788-appb-I000108

In Equation 24, the direct signal portion of the output signal y provides the direct signal and the residual signal, and the diffuse signal portion of the output signal y provides the diffuse signal. Of these, only the direct signal is processed by GES.

The result of processing the GES may be determined according to Equation 25 below.

<Equation 25>

Figure PCTKR2015006788-appb-I000109

Depending on the tree structure, GES can extract the envelope of a particular channel of the upmixed output signal, excluding the LFE channel, from the transmitted downmix signal and from the downmix obtained by the decoder during spatial synthesis.

The output signal

Figure PCTKR2015006788-appb-I000110
in the N-N/2-N structure may be defined as shown in Table 3 below.

Figure PCTKR2015006788-appb-T000002

The input signal

Figure PCTKR2015006788-appb-I000111
in the N-N/2-N structure may be defined as shown in Table 4 below.

Figure PCTKR2015006788-appb-T000003

The downmix signals

Figure PCTKR2015006788-appb-I000112
in the N-N/2-N structure may be defined as shown in Table 5 below.

Figure PCTKR2015006788-appb-T000004

In the following, the matrix M1 (
Figure PCTKR2015006788-appb-I000113
) and the matrix M2 (
Figure PCTKR2015006788-appb-I000114
), both defined for all time slots n and all hybrid subbands k, are described. These matrices are interpolated versions of
Figure PCTKR2015006788-appb-I000115
and
Figure PCTKR2015006788-appb-I000116
, which are defined for a given parameter time slot and a given processing band m based on the CLD, ICC and CPC parameters valid for that parameter time slot and processing band.

<Definition of Matrix M1 (Pre-Matrix)>

Figure PCTKR2015006788-appb-I000117
, which corresponds to the matrix M1 in the N-N/2-N structure of FIG. 16, describes how the downmix signal is fed to the decorrelators used in the decoder. The matrix M1 may be expressed as a pre-matrix.

The size of the matrix M1 depends on the number of channels of the downmix signal input to the matrix M1 and the number of decorrelators used in the decoder. On the other hand, the elements of the matrix M1 may be derived from the CLD and / or CPC parameters. M1 may be defined by Equation 26 below.

<Equation 26>

Figure PCTKR2015006788-appb-I000118

At this time,

Figure PCTKR2015006788-appb-I000119
Is defined as

Meanwhile,

Figure PCTKR2015006788-appb-I000120
can be smoothed according to Equation 27 below.

<Equation 27>

Figure PCTKR2015006788-appb-I000121

Here,

Figure PCTKR2015006788-appb-I000122
and
Figure PCTKR2015006788-appb-I000123
are defined such that the first row applies in the hybrid subbands
Figure PCTKR2015006788-appb-I000124
, the second row applies in the processing bands, and the third row applies in the specific hybrid subbands, where
Figure PCTKR2015006788-appb-I000125
is the complex conjugation of
Figure PCTKR2015006788-appb-I000126
. In addition,
Figure PCTKR2015006788-appb-I000127
denotes the last parameter set of the previous frame.

The matrices

Figure PCTKR2015006788-appb-I000128
constituting the matrix M1 may be defined as follows.

(1) Matrix R1

The matrix

Figure PCTKR2015006788-appb-I000129
controls the number of signals input to the decorrelators. Since it does not add uncorrelated signals, it can be expressed purely as a function of the CLD and CPC parameters.

The matrix

Figure PCTKR2015006788-appb-I000130
may be defined differently according to the channel structure. In the N-N/2-N structure, in order to prevent OTT boxes from being cascaded, all channels of the input signal are fed pairwise, two channels at a time, to the OTT boxes. Accordingly, the number of OTT boxes for the N-N/2-N structure is N/2.

In this case, the matrix

Figure PCTKR2015006788-appb-I000131
depends on the vector containing the input signal,
Figure PCTKR2015006788-appb-I000132
, and on the number of OTT boxes, which equals its column size. However, LFE upmixes based on OTT boxes are not considered here, since they require no decorrelator in the N-N/2-N structure. All elements of the matrix
Figure PCTKR2015006788-appb-I000133
are either 1 or 0.

In the N-N/2-N structure,

Figure PCTKR2015006788-appb-I000134
may be defined by Equation 28 below.

<Equation 28>

Figure PCTKR2015006788-appb-I000135

All OTT boxes in the N-N/2-N structure form a parallel processing stage, not a cascade; no OTT box in the N-N/2-N structure is connected to any other OTT box. The matrix can therefore be composed of the unit matrix

Figure PCTKR2015006788-appb-I000136
and the unit matrix
Figure PCTKR2015006788-appb-I000137
. In this case, the unit matrix
Figure PCTKR2015006788-appb-I000138
may be a unit matrix of size N * N.
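The following Python sketch shows one plausible reading of such a 0/1 routing matrix for the N-N/2-N structure: every downmix channel is passed straight through to the direct path and, except for an LFE-carrying OTT box (assumed to be last in the channel order), also routed to its own decorrelator. This is an illustration under those assumptions, not the normative matrix of Equation 28.

import numpy as np

def build_r1_sketch(num_dmx_channels, num_lfe_boxes=0):
    # Stacked identity blocks: the first block feeds the direct path, the
    # second feeds the decorrelators.  All elements are 0 or 1, and no OTT
    # box feeds another, reflecting the parallel (non-cascaded) structure.
    eye = np.eye(num_dmx_channels)
    direct_rows = eye
    # LFE-carrying OTT boxes (assumed to be the last ones) get no decorrelator.
    decorr_rows = eye[: num_dmx_channels - num_lfe_boxes]
    return np.vstack([direct_rows, decorr_rows])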

(2) Matrix G1

In order to handle downmix signals, or externally supplied downmix signals, prior to MPEG Surround decoding, bitstream-controlled correction factors may be applied. The correction factor matrix

Figure PCTKR2015006788-appb-I000139
can be applied to the downmix signal or to an externally supplied downmix signal.

The matrix

Figure PCTKR2015006788-appb-I000140
ensures that the level of the downmix signal in a particular time/frequency tile represented by the parameters is the same as the level of the downmix signal obtained when the spatial parameters were estimated at the encoder.

Three cases are distinguished: (i) no external downmix compensation (

Figure PCTKR2015006788-appb-I000141
), (ii) parameterized external downmix compensation (
Figure PCTKR2015006788-appb-I000142
), and (iii) residual coding based on external downmix compensation (
Figure PCTKR2015006788-appb-I000143
). If
Figure PCTKR2015006788-appb-I000144
, the decoder does not support residual coding based on external downmix compensation.

If external downmix compensation is not applied in the N-N/2-N structure (

Figure PCTKR2015006788-appb-I000145
), the matrix
Figure PCTKR2015006788-appb-I000146
for the N-N/2-N structure may be defined by Equation 29 below.

<Equation 29>

Figure PCTKR2015006788-appb-I000147

Here,

Figure PCTKR2015006788-appb-I000148
denotes a unit matrix of size NumInCh * NumInCh, and
Figure PCTKR2015006788-appb-I000149
denotes a zero matrix of size NumInCh * NumInCh.

In contrast, if external downmix compensation is applied in the N-N/2-N structure (

Figure PCTKR2015006788-appb-I000150
),
Figure PCTKR2015006788-appb-I000151
for the N-N/2-N structure may be defined by Equation 30 below.

<Equation 30>

Figure PCTKR2015006788-appb-I000152

here,

Figure PCTKR2015006788-appb-I000153
Is defined as

On the other hand, when residual coding based on external downmix compensation is applied in the N-N/2-N structure (

Figure PCTKR2015006788-appb-I000154
),
Figure PCTKR2015006788-appb-I000155
may be defined by Equation 31 below.

<Equation 31>

Figure PCTKR2015006788-appb-I000156

here,

Figure PCTKR2015006788-appb-I000157
It can be defined as. And,
Figure PCTKR2015006788-appb-I000158
Can be updated.

(3) Matrix H1

In the N-N/2-N structure, the number of channels of the downmix signal may be more than five. Thus, for all parameter sets and processing bands, the inverse matrix H may be a unit matrix whose size equals the number of columns of the input signal vector

Figure PCTKR2015006788-appb-I000159
.

<Definition of matrix M2 (post-matrix)>

In the N-N/2-N structure, the matrix M2 (

Figure PCTKR2015006788-appb-I000160
) defines how the direct and uncorrelated signals are combined to regenerate the multi-channel output signal.
Figure PCTKR2015006788-appb-I000161
may be defined by Equation 32 below.

<Equation 32>

Figure PCTKR2015006788-appb-I000162

here,

Figure PCTKR2015006788-appb-I000163
Is defined as

Meanwhile,

Figure PCTKR2015006788-appb-I000164
can be smoothed according to Equation 33 below.

<Equation 33>

Figure PCTKR2015006788-appb-I000165

Here,

Figure PCTKR2015006788-appb-I000166
and
Figure PCTKR2015006788-appb-I000167
are defined such that the first row applies in the hybrid subbands
Figure PCTKR2015006788-appb-I000168
, the second row applies in the processing bands, and the third row applies in the specific hybrid subbands
Figure PCTKR2015006788-appb-I000169
, where the complex conjugation of
Figure PCTKR2015006788-appb-I000170
, namely
Figure PCTKR2015006788-appb-I000171
, is used. In addition,
Figure PCTKR2015006788-appb-I000172
denotes the last parameter set of the previous frame.

The elements of the matrix

Figure PCTKR2015006788-appb-I000173
for the matrix M2 can be calculated from the equivalent model of the OTT box. The OTT box includes a decorrelator and a mixing unit. The mono input signal of the OTT box is fed to both the decorrelator and the mixing unit. The mixing unit generates a stereo output signal from the mono input signal, the uncorrelated signal produced by the decorrelator, and the CLD and ICC parameters. Here, the CLD controls the localization in the stereo field, and the ICC controls the stereo wideness of the output signal.
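The following Python sketch illustrates this equivalent model with simplified mixing gains derived from the CLD and ICC; the gain formulas are illustrative assumptions, not the normative MPEG Surround post-matrix coefficients of Equations 34 to 37.

import numpy as np

def ott_mix(mono, decorr, cld_db, icc):
    # mono:   direct (dry) input signal of the OTT box.
    # decorr: decorrelated version of the same signal.
    # cld_db: channel level difference in dB (controls localization).
    # icc:    inter-channel correlation in [0, 1] (controls stereo wideness).
    r = 10.0 ** (cld_db / 10.0)
    c1 = np.sqrt(r / (1.0 + r))    # relative level of the first output channel
    c2 = np.sqrt(1.0 / (1.0 + r))  # relative level of the second output channel
    # Mix in more of the decorrelated signal as the target correlation drops.
    dry = np.sqrt((1.0 + icc) / 2.0)
    wet = np.sqrt((1.0 - icc) / 2.0)
    ch1 = c1 * (dry * mono + wet * decorr)
    ch2 = c2 * (dry * mono - wet * decorr)
    return ch1, ch2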

Then, the result output from any OTT box can be defined by Equation 34 below.

<Equation 34>

Figure PCTKR2015006788-appb-I000174

Labeling the OTT boxes as

Figure PCTKR2015006788-appb-I000175
(
Figure PCTKR2015006788-appb-I000176
),
Figure PCTKR2015006788-appb-I000177
denotes an element of an arbitrary matrix for the OTT box, for time slot
Figure PCTKR2015006788-appb-I000178
and parameter band
Figure PCTKR2015006788-appb-I000179
.

In this case, the post-gain matrix may be defined as in Equation 35 below.

<Equation 35>

Figure PCTKR2015006788-appb-I000180

Here, the definitions

Figure PCTKR2015006788-appb-I000181
,
Figure PCTKR2015006788-appb-I000182
,
Figure PCTKR2015006788-appb-I000183
, and
Figure PCTKR2015006788-appb-I000184
apply.

Meanwhile,

Figure PCTKR2015006788-appb-I000185
(
Figure PCTKR2015006788-appb-I000186
for
Figure PCTKR2015006788-appb-I000187
Can be defined as

And,

Figure PCTKR2015006788-appb-I000188
Is defined as

At this time, in the N-N/2-N structure,

Figure PCTKR2015006788-appb-I000189
may be defined by Equation 36 below.

<Equation 36>

Figure PCTKR2015006788-appb-I000190

Here, CLD and ICC may be defined by Equation 37 below.

<Equation 37>

Figure PCTKR2015006788-appb-I000191

At this time,

Figure PCTKR2015006788-appb-I000192
It can be defined as.

<Definition of the Decorrelators>

In the N-N/2-N structure, decorrelation may be performed by reverberation filters in the QMF subband domain. The reverberation filters exhibit different filter characteristics depending on which hybrid subband is currently being processed, over all hybrid subbands.

The reverberation filter is an IIR lattice filter. The IIR lattice filters use different filter coefficients for different decorrelators so as to produce mutually uncorrelated, orthogonal signals.

The decorrelation process carried out by the decorrelators consists of several steps. First, the output of the matrix M1,

Figure PCTKR2015006788-appb-I000193
, is fed into a set of all-pass decorrelation filters. The filtered signals may then be energy-shaped. Here, energy shaping adjusts the spectral or temporal envelope so that the uncorrelated signals match the input signal more closely.

The input signal

Figure PCTKR2015006788-appb-I000194
of any decorrelator is part of the vector
Figure PCTKR2015006788-appb-I000195
. To ensure orthogonality between the uncorrelated signals derived through the plurality of decorrelators, the decorrelators have different filter coefficients.

The decorrelation filter consists of a plurality of all-pass (IIR) sections preceded by a fixed, frequency-dependent delay. The frequency axis may be divided into different regions corresponding to the QMF division frequencies. Within each region, the length of the delay and the length of the filter coefficient vector are the same. In addition, the filter coefficients of the decorrelator, which include a fractional delay realized by an additional phase rotation, depend on the hybrid subband index.
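As a minimal sketch of the basic building block, the following Python function applies a single Schroeder-style all-pass section to one subband signal; a real decorrelator cascades several such sections behind a frequency-dependent delay and uses different coefficients per decorrelator and per region, so the delay and gain values here are placeholders.

import numpy as np

def allpass_section(x, delay, gain):
    # One all-pass recursion y[n] = -g*x[n] + x[n-D] + g*y[n-D] applied to a
    # 1-D (possibly complex) subband signal x.
    y = np.zeros(len(x), dtype=complex)
    for n in range(len(x)):
        x_d = x[n - delay] if n >= delay else 0.0
        y_d = y[n - delay] if n >= delay else 0.0
        y[n] = -gain * x[n] + x_d + gain * y_d
    return y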

As discussed above, the decorrelator filters have different filter coefficients to ensure orthogonality between the uncorrelated signals they output. In the N-N/2-N structure, N/2 decorrelators are required, but the number of decorrelators may be limited to ten. In an N-N/2-N structure without an LFE channel, when the number of OTT boxes, N/2, exceeds ten, the decorrelators are reused for the OTT boxes beyond the tenth according to a modulo-10 operation (see the sketch after Table 6 below).

Table 6 below shows the decorrelator indices in the decoder of the N-N/2-N structure. Referring to Table 6, the N/2 decorrelators are indexed in groups of ten; that is, the 0th decorrelator and the 10th decorrelator

Figure PCTKR2015006788-appb-I000196
have the same index.

Figure PCTKR2015006788-appb-T000005
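A one-line Python sketch of the modulo-based reuse described above (the function name is illustrative):

def decorrelator_index(ott_box_index, max_decorrelators=10):
    # OTT box 10 reuses the configuration of box 0, box 11 that of box 1, etc.
    return ott_box_index % max_decorrelators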

The N-N/2-N structure may be implemented by the syntax of Table 7 below.

Figure PCTKR2015006788-appb-I000197
Figure PCTKR2015006788-appb-I000198
Figure PCTKR2015006788-appb-I000199

At this time, bsTreeConfig may be implemented by Table 8.

Figure PCTKR2015006788-appb-T000006

In addition, bsNumInCh, which is the number of channels of the downmix signal in the N-N / 2-N structure, may be implemented as shown in Table 9 below.

Figure PCTKR2015006788-appb-T000007

In the N-N/2-N structure, the number of LFE channels among the output signals,

Figure PCTKR2015006788-appb-I000200
, may be implemented as shown in Table 10 below.

Figure PCTKR2015006788-appb-T000008

In the N-N / 2-N structure, the channel order of the output signal may be implemented as shown in Table 11 according to the number of channels of the output signal and the number of LFE channels.

Figure PCTKR2015006788-appb-T000009

In Table 7, bsHasSpeakerConfig is a flag indicating whether the layout of the output signal to be actually reproduced is different from the channel order specified in Table 11. If bsHasSpeakerConfig == 1, audioChannelLayout, which is the layout of the loudspeakers during actual playback, may be used for rendering.

The audioChannelLayout shows the layout of the loudspeakers for actual playback. If the loudspeaker includes an LFE channel, the LFE channels should be processed using one OTT box together with the non-LFE channel and may be located last in the channel list. For example, the LFE channel is located last in the channel lists L, Lv, R, Rv, Ls, Lss, Rs, Rss, C, LFE, Cvr, and LFE2.

FIG. 17 is a diagram illustrating an N-N/2-N structure in a tree form according to an embodiment.

The N-N / 2-N structure illustrated in FIG. 16 may be represented in a tree form as shown in FIG. 17. In FIG. 17, all OTT boxes can regenerate two channels of output signals based on CLD, ICC, residual signal and input signal. OTT boxes and their corresponding CLD, ICC, residual and input signals may be numbered in the order in which they appear in the bitstream.

According to FIG. 17, there are N/2 OTT boxes. The decoder, which is a multichannel audio signal processing apparatus, may generate the N-channel output signal from the N/2-channel downmix signal using the N/2 OTT boxes. The N/2 OTT boxes are not arranged in multiple layers; rather, they perform upmixing in parallel, one for each channel of the N/2-channel downmix signal. In other words, no OTT box is connected to another OTT box.
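A structural Python sketch of this parallel arrangement is given below; the OTT boxes are modelled as hypothetical callables that map one downmix channel (and its residual, which may be None for an LFE-carrying box) to a pair of output channels.

def upmix_n_over_2_to_n(dmx, res, ott_boxes):
    # dmx, res, ott_boxes: lists of length N/2.  Box i produces output
    # channels 2*i and 2*i+1; no box feeds another, so they run independently.
    output = []
    for dmx_ch, res_ch, box in zip(dmx, res, ott_boxes):
        ch_a, ch_b = box(dmx_ch, res_ch)
        output.extend([ch_a, ch_b])
    return output  # N output channels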

Meanwhile, in FIG. 17, the left figure shows a case where the LFE channel is not included in the N-channel output signal, and the right figure shows a case where the LFE channel is included in the N-channel output signal.

In this case, when the LFE channel is not included in the N-channel output signal, the N/2 OTT boxes may generate the N-channel output signal using the residual signal res and the downmix signal M. However, when the LFE channel is included in the N-channel output signal, the OTT box that outputs the LFE channel among the N/2 OTT boxes uses only the downmix signal, without the residual signal.

In addition, when the LFE channel is included in the N-channel output signal, the OTT boxes that do not output the LFE channel upmix the downmix signal using both the CLD and the ICC, whereas the OTT box that outputs the LFE channel upmixes the downmix signal using only the CLD.

Likewise, if the LFE channel is included in the N-channel output signal, the OTT boxes that do not output the LFE channel generate uncorrelated signals through their decorrelators, whereas the OTT box that outputs the LFE channel performs no decorrelation and therefore generates no uncorrelated signal.

FIG. 18 illustrates an encoder and a decoder for an FCE structure according to an embodiment.

Referring to FIG. 18, a Four Channel Element (FCE) corresponds to a device that downmixes a four-channel input signal to generate a one-channel output signal, or upmixes a one-channel input signal to generate a four-channel output signal.

The FCE encoder 1801 may generate a one-channel output signal from a four-channel input signal using two TTO boxes 1803 and 1804 and the USAC encoder 1805. The TTO boxes 1803 and 1804 each downmix two input channels, so that the four input channels are reduced to two downmix channels. The USAC encoder 1805 performs core-band encoding of the downmix signal.

The FCE decoder 1802 performs the inverse of the operation performed by the FCE encoder 1801. The FCE decoder 1802 may generate a four-channel output signal from a one-channel input signal using the USAC decoder 1806 and two OTT boxes 1807 and 1808. The OTT boxes 1807 and 1808 each upmix a one-channel signal decoded by the USAC decoder 1806, producing the four-channel output signal. The USAC decoder 1806 performs core-band decoding of the FCE downmix signal.

The FCE structure may perform coding at a low bitrate by operating in a parametric mode using spatial cues such as CLD, IPD, and ICC. The parametric configuration may be changed based on at least one of the operating bitrate, the total number of channels of the input signal, the parameter resolution, and the quantization level. The FCE encoder 1801 and the FCE decoder 1802 can be used over a wide range of bitrates, from 128 kbps down to 48 kbps.

The number of channels (four) of the output signal of the FCE decoder 1802 is the same as the number of channels (four) of the input signal input to the FCE encoder 1801.
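The channel grouping of the FCE downmix path can be summarised with the following Python sketch; tto_box and usac_encode are hypothetical callables standing in for a TTO downmix and the USAC core encoder.

def fce_encode(ch, tto_box, usac_encode):
    # Four input channels -> two pairwise TTO downmixes -> USAC core encoder,
    # which produces the single-channel FCE downmix.
    assert len(ch) == 4
    mid_0 = tto_box(ch[0], ch[1])   # first TTO box: channels 0 and 1
    mid_1 = tto_box(ch[2], ch[3])   # second TTO box: channels 2 and 3
    return usac_encode([mid_0, mid_1])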

FIG. 19 illustrates an encoder and a decoder for a TCE structure according to an embodiment.

Referring to FIG. 19, a Three Channel Element (TCE) corresponds to an apparatus that generates a one-channel output signal from a three-channel input signal, or a three-channel output signal from a one-channel input signal.

The TCE encoder 1901 may include one TTO box 1903 and one QMF converter 1904 and one USAC encoder 1905. Here, the QMF converter may include a hybrid analyzer / synthesizer. At this time, input signals of two channels may be input to the TTO box 1903, and input signals of one channel may be input to the QMF converter 1904. The TTO box 1903 may downmix the input signals of the two channels to generate the downmix signal of one channel. The QMF converter 1904 may convert an input signal of one channel into a QMF domain.

The output result of the TTO box 1903 and the output result of the QMF converter 1904 may be input to the USAC encoder 1905. The USAC encoder 1905 may encode the core bands of the two channel signals input as the output result of the TTO box 1903 and the output result of the QMF converter 1904.

According to FIG. 19, since the input signal has an odd number of channels (three), only two of the input channels are fed to the TTO box 1903, while the remaining channel bypasses the TTO box 1903 and is input to the USAC encoder 1905. Since the TTO box 1903 operates in a parametric mode, the TCE encoder 1901 is mainly applied when the channel configuration of the input signal is 11.1 or 9.0.

The TCE decoder 1902 may include one USAC decoder 1906, one OTT box 1907, and one QMF inverse converter 1908. The one-channel input signal received from the TCE encoder 1901 is decoded through the USAC decoder 1906, which decodes the core band of that one-channel input signal.

Input signals of two channels output through the USAC decoder 1906 may be input to the OTT box 1907 and the QMF inverse converter 1908 for each channel. QMF inverse transformer 1908 may include a hybrid analyzer / synthesizer. The OTT box 1907 may generate an output signal of two channels by upmixing an input signal of one channel. In addition, the QMF inverse converter 1908 may inversely convert the input signal of one of the two channels of the input signal output through the USAC decoder 1906 from the QMF domain to the time domain or frequency domain.

The number of channels of three output signals of the TCE decoder 1902 is equal to the number of channels of three input signals input to the TCE encoder 1901.

FIG. 20 illustrates an encoder and a decoder for an ECE structure according to an embodiment.

Referring to FIG. 20, an Eight Channel Element (ECE) corresponds to a device that downmixes an eight-channel input signal to generate a one-channel output signal, or upmixes a one-channel input signal to generate an eight-channel output signal.

The ECE encoder 2001 may generate an output signal of one channel from eight input signals using six TTO boxes 2003 to 2008 and USAC encoder 2009. First, input signals of eight channels are input as input signals of two channels, respectively, by four TTO boxes 2003 to 2006. Then, each of the four TTO boxes 2003 to 2006 may generate an output signal of one channel by downmixing input signals of two channels. The output results of the four TTO boxes 2003 to 2006 are input to two TTO boxes 2007 and 2008 connected to the four TTO boxes 2003 to 2006.

The two TTO boxes 2007 and 2008 may downmix the output signals of two channels among the output signals of the four TTO boxes 2003 to 2006 to generate the output signal of one channel. Then, the output results of the two TTO boxes 2007 and 2008 are input to the USAC encoder 2009 connected to the two TTO boxes 2007 and 2008. The USAC encoder 2009 may encode the input signal of two channels to generate the output signal of one channel.

In conclusion, the ECE encoder 2001 may generate an output signal of one channel from an input signal of eight channels using TTO boxes connected in a two-stage tree form. In other words, the four TTO boxes 2003 to 2006 and the two TTO boxes 2007 and 2008 may be connected to each other in a cascade to form a tree of two layers. The ECE encoder 2001 may be used in 48kbps mode or 64kbps mode for the case where the channel structure of the input signal is 22.2 or 14.0.

The ECE decoder 2002 may generate eight channels of output signals from one channel of input signals using six OTT boxes 2011 to 2016 and USAC decoders 2010. First, an input signal of one channel generated by the ECE encoder 2001 may be input to the USAC decoder 2010 included in the ECE decoder 2002. The USAC decoder 2010 may then decode the core band of the input signal of one channel to generate an output signal of two channels. The output signals of the two channels output from the USAC decoder 2010 may be input to the OTT box 2011 and the OTT box 2012 for each channel. The OTT box 2011 may generate an output signal of two channels by upmixing an input signal of one channel. Similarly, the OTT box 2012 may upmix the input signal of one channel to generate an output signal of two channels.

Then, the output results of the OTT boxes 2011 and 2012 may be input to the OTT boxes 2013 to 2016 connected to the OTT boxes 2011 and 2012, respectively. Each of the OTT boxes 2013 to 2016 receives one channel of the two-channel output of the OTT boxes 2011 and 2012 and upmixes that one-channel input signal into a two-channel output signal. The total number of output channels generated by the four OTT boxes 2013 to 2016 is thus eight.

In conclusion, the ECE decoder 2002 may generate eight channels of output signals from one channel of input signals using OTT boxes connected in a two-stage tree form. In other words, the four OTT boxes 2013 to 2016 and the two OTT boxes 2011 and 2012 may be connected to each other in a cascade to form a tree of two layers.

The number of channels of eight output signals of the ECE decoder 2002 is equal to the number of channels of eight input signals input to the ECE encoder 2001.
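The two-layer tree of the ECE downmix path can be summarised with the following Python sketch, using the same hypothetical tto_box and usac_encode callables as in the FCE sketch above.

def ece_encode(ch, tto_box, usac_encode):
    # Eight input channels -> four TTO boxes -> two TTO boxes -> USAC core
    # encoder, which produces the single-channel ECE downmix.
    assert len(ch) == 8
    layer1 = [tto_box(ch[2 * i], ch[2 * i + 1]) for i in range(4)]          # 8 -> 4
    layer2 = [tto_box(layer1[2 * i], layer1[2 * i + 1]) for i in range(2)]  # 4 -> 2
    return usac_encode(layer2)                                              # 2 -> 1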

FIG. 21 illustrates an encoder and a decoder for a SiCE structure according to an embodiment.

Referring to FIG. 21, a Six Channel Element (SiCE) corresponds to an apparatus that generates a one-channel output signal from a six-channel input signal, or a six-channel output signal from a one-channel input signal.

The SiCE encoder 2101 may include four TTO boxes 2103 to 2106 and one USAC encoder 2107. The six input channels are fed to the three TTO boxes 2103 to 2105, and each of these three TTO boxes generates a one-channel output signal by downmixing two of the six input channels. Two of the three TTO boxes 2103 to 2105 are connected to the remaining TTO box; in the case of FIG. 21, the TTO boxes 2103 and 2104 are connected to the TTO box 2106.

The output results of the TTO boxes 2103 and 2104 may be input to the TTO box 2106. As shown in FIG. 21, the TTO box 2106 may downmix two input signals to generate one channel of output signal. On the other hand, the output result of the TTO box 2105 is not input to the TTO box 2106. That is, the output result of the TTO box 2105 is input to the USAC encoder 2107 by bypassing the TTO box 2106.

The USAC encoder 2107 may generate the output signal of one channel by encoding the core bands of the two channel input signals that are the output results of the TTO box 2105 and the TTO box 2106.

In the SiCE encoder 2101, the three TTO boxes 2103 to 2105 and the one TTO box 2106 constitute different layers. However, unlike the ECE encoder 2001, in the SiCE encoder 2101 two of the three TTO boxes (2103 and 2104) are connected to the TTO box 2106, while the remaining TTO box 2105 bypasses the TTO box 2106. The SiCE encoder 2101 can process an input signal having a 14.0 channel structure at 48 kbps and 64 kbps.

The SiCE decoder 2102 may include one USAC decoder 2108 and four OTT boxes 2109-2112.

The one-channel output signal generated by the SiCE encoder 2101 may be input to the SiCE decoder 2102. The USAC decoder 2108 of the SiCE decoder 2102 then decodes the core band of the one-channel input signal to generate two output channels. One of the two channels generated by the USAC decoder 2108 is input to the OTT box 2109, and the other channel bypasses the OTT box 2109 and is fed directly into the OTT box 2112.

The OTT box 2109 then upmixes the one-channel signal delivered from the USAC decoder 2108 to generate a two-channel output signal. One of the two channels generated by the OTT box 2109 is input to the OTT box 2110, and the other channel is input to the OTT box 2111. Thereafter, the OTT boxes 2110 to 2112 each upmix their one-channel input signals to generate two-channel output signals.

The encoder of the FCE structure, the TCE structure, the ECE structure, and the SiCE structure described above with reference to FIGS. 18 to 21 may generate an output signal of one channel from an N-channel input signal using a plurality of TTO boxes. In this case, one TTO box may exist inside the USAC encoder included in the FCE structure, the TCE structure, the ECE structure, and the SiCE encoder.

Meanwhile, the encoder of the ECE structure and the SiCE structure may be configured of two layers of TTO boxes. In addition, when the number of channels of the input signal is odd, such as the TCE structure and the SiCE structure, the TTO box may be bypassed.

The decoder of the FCE structure, the TCE structure, the ECE structure, and the SiCE structure may generate an N-channel output signal from an input signal of one channel using a plurality of OTT boxes. At this time, one OTT box may exist inside the USAC decoder included in the decoder of the FCE structure, the TCE structure, the ECE structure, and the SiCE structure.

Meanwhile, the decoder of the ECE structure and the SiCE structure may be configured of two layers of OTT boxes. In addition, when the number of channels of the input signal is odd, such as the TCE structure and the SiCE structure, there is a case of bypassing the OTT box.

FIG. 22 illustrates a process of processing an audio signal of 24 channels according to an FCE structure according to an embodiment.

In detail, the configuration of FIG. 22 handles a 22.2-channel structure and may operate at 128 kbps and 96 kbps. Referring to FIG. 22, the 24 input channels may be input, four channels at a time, to six FCE encoders 2201. Then, as described with reference to FIG. 18, each FCE encoder 2201 may generate a one-channel output signal from a four-channel input signal. The one-channel output signal from each of the six FCE encoders 2201 illustrated in FIG. 22 may then be output in the form of a bitstream through the bitstream formatter. That is, the bitstream may include six output signals.

The bitstream deformatter can then derive six output signals from the bitstream. Six output signals may be input to each of six FCE decoders 2202. Then, as described with reference to FIG. 18, the FCE decoder 2202 may generate four channel output signals from one channel input signal. A total of 24 channels of output signals may be generated through six FCE decoders 2202.

FIG. 23 is a diagram illustrating a process of processing an audio signal of 24 channels according to an ECE structure according to an embodiment.

FIG. 23 assumes a case where an input signal of 24 channels is input as in the 22.2 channel structure described with reference to FIG. 22. However, it is assumed that the operation mode of FIG. 23 operates at 48 kbps and 64 kbps, which are lower bit rates than FIG. 22.

Referring to FIG. 23, eight channels of input signals of 24 channels may be input to three ECE encoders 2301, respectively. Then, as described with reference to FIG. 20, the ECE encoder 2301 may generate an output signal of one channel from input signals of eight channels. Then, an output signal of one channel output from each of the three ECE encoders 2301 illustrated in FIG. 23 may be output in the form of a bitstream through the bitstream formatter. That is, the bitstream may include three output signals.

The bitstream deformatter can then derive three output signals from the bitstream. The three output signals may be input to the three ECE decoders 2302, respectively. Then, as described with reference to FIG. 20, each ECE decoder 2302 may generate an eight-channel output signal from a one-channel input signal. A total of 24 channels of output signals may be generated through the three ECE decoders 2302.

FIG. 24 is a diagram illustrating a process of processing an audio signal of 14 channels according to an FCE structure according to an embodiment.

FIG. 24 illustrates a process of generating four channels of output signals through three FCE encoders 2401 and one CPE encoder 2402 with input signals of fourteen channels. At this time, FIG. 24 shows a case in which operation is performed at a relatively high bit rate such as 128 kbps or 96 kbps.

Three FCE encoders 2401 may generate one channel of output signals from four channels of input signals, respectively. In addition, one CPE encoder 2402 may generate an output signal of one channel by downmixing an input signal of two channels. Then, the bitstream formatter may generate a bitstream including four output signals from the output results of three FCE encoders 2401 and the output results of one CPE encoder 2402.

Meanwhile, the bitstream deformatter extracts four output signals from the bitstream; three of them are delivered to the three FCE decoders 2403 and the remaining one to the CPE decoder 2404. Each of the three FCE decoders 2403 may then generate a four-channel output signal from a one-channel input signal, and the CPE decoder 2404 may generate a two-channel output signal from a one-channel input signal. That is, a total of 14 channels of output signals may be generated through the three FCE decoders 2403 and the one CPE decoder 2404.

FIG. 25 is a diagram illustrating a process of processing an audio signal of 14 channels according to an ECE structure and a SiCE structure according to an embodiment.

Referring to FIG. 25, the ECE encoder 2501 and the SiCE encoder 2502 process 14 input signals. Unlike FIG. 24, FIG. 25 is applied to a relatively low bit rate (eg 48 kbps, 96 kbps).

The ECE encoder 2501 may generate an output signal of one channel from input signals of eight channels among the input signals of 14 channels. The SiCE encoder 2502 may generate an output signal of one channel from input signals of six channels among the input signals of 14 channels. The bitstream formatter may generate a bitstream using two output signals as an output result of the ECE encoder 2501 and the SiCE encoder 2502.

Meanwhile, the bitstream deformatter may extract two output signals from the bitstream, which are input to the ECE decoder 2503 and the SiCE decoder 2504, respectively. The ECE decoder 2503 can generate an eight-channel output signal from a one-channel input signal, and the SiCE decoder 2504 can generate a six-channel output signal from a one-channel input signal. That is, a total of 14 channels of output signals may be generated through the ECE decoder 2503 and the SiCE decoder 2504.

FIG. 26 illustrates a process of processing an 11.1 channel audio signal according to a TCE structure according to an embodiment.

Referring to FIG. 26, four CPE encoders 2601 and one TCE encoder 2602 may generate five channels of output signals from 11.1 channels of input signals. In the case of FIG. 26, an audio signal may be processed at a relatively high bit rate such as 128 kbps and 96 kbps.

Each of the four CPE encoders 2601 may generate one channel of output signals from two channels of input signals. Meanwhile, one TCE encoder 2602 may generate one channel output signal from three channel input signals. The output results of the four CPE encoders 2601 and one TCE encoder 2602 may be input to a bitstream formatter and output as a bitstream. That is, the bitstream may include output signals of five channels.

Meanwhile, the bitstream deformatter may extract five channels of output signals from the bitstream. Five output signals may then be input to four CPE decoders 2603 and one TCE decoder 2604. The four CPE decoders 2603 may then generate two channels of output signals from one channel of input signals, respectively. The TCE decoder 2604 may generate three channels of output signals from one channel of input signals. Finally, 11 channels of output signals may be output through four CPE decoders 2603 and one TCE decoder 2604.

FIG. 27 illustrates a process of processing an 11.1 channel audio signal according to an FCE structure according to an embodiment.

Unlike FIG. 26, FIG. 27 may operate at a relatively low bit rate (eg, 64kbps, 48kbps). Referring to FIG. 27, three FCE encoders 2701 may generate three channels of output signals from twelve channels of input signals. Specifically, each of the three FCE encoders 2701 may generate an output signal of one channel from input signals of four channels among the input signals of twelve channels. Then, the bitstream formatter may generate a bitstream using three channel output signals output from three FCE encoders 2701.

Meanwhile, the bitstream deformatter may extract three output signals from the bitstream, which may be input to the three FCE decoders 2702, respectively. Each FCE decoder 2702 may then generate a four-channel output signal from a one-channel input signal, so that output signals of 12 channels are generated through the three FCE decoders 2702.

FIG. 28 is a diagram illustrating a process of processing an audio signal of 9.0 channels according to a TCE structure according to an embodiment.

Referring to FIG. 28, a process of processing a nine-channel input signal is illustrated. The configuration of FIG. 28 can process the nine input channels at relatively high bitrates (e.g., 128 kbps, 96 kbps). At this time, the nine input channels may be processed based on three CPE encoders 2801 and one TCE encoder 2802. Each of the three CPE encoders 2801 may generate a one-channel output signal from two input channels, while the TCE encoder 2802 may generate a one-channel output signal from three input channels. Then, a total of four channels of output signals can be input to the bitstream formatter and output as a bitstream.

The bitstream deformatter may extract output signals of four channels included in the bitstream. Then, four channels of output signals may be input to three CPE decoders 2803 and one TCE decoder 2804. Each of the three CPE decoders 2803 may generate two channels of output signals from one channel of input signals. Meanwhile, one TCE decoder 2804 may generate three channel output signals from one channel input signal. A total of nine channels of output signals can then be generated.

FIG. 29 is a diagram illustrating a process of processing an audio signal of 9.0 channels according to an FCE structure according to an embodiment.

Referring to FIG. 29, a process of processing a nine-channel input signal is illustrated. The configuration of FIG. 29 can process the nine input channels at relatively low bitrates (64 kbps, 48 kbps). In this case, the nine input channels may be processed based on two FCE encoders 2901 and one SCE encoder 2902. Each of the two FCE encoders 2901 may generate a one-channel output signal from four input channels, while the SCE encoder 2902 may generate a one-channel output signal from a one-channel input signal. Then, a total of three channels of output signals may be input to the bitstream formatter and output in the bitstream.

The bitstream deformatter may extract output signals of three channels included in the bitstream. Then, output signals of three channels may be input to two FCE decoders 2903 and one SCE decoder 2904. Each of the two FCE decoders 2903 may generate four channels of output signals from one channel of input signals. Meanwhile, one SCE decoder 2904 may generate one channel output signal from one channel input signal. A total of nine channels of output signals can then be generated.

Table 12 below shows a configuration of a parameter set according to the number of channels of an input signal when spatial coding is performed. Here, bsFreqRes means the number of analysis bands equal to the number of USAC encoders.

Figure PCTKR2015006788-appb-T000010

The USAC encoder can encode the core band of the input signal. The USAC encoder can control the plurality of encoders according to the number of input signals by using channel-to-object mapping information based on metadata representing the relationship between the channel elements (CPEs and SCEs), the objects, and the rendered channel signals. Table 13 below shows the bitrates and sampling rates used in the USAC encoder. According to the sampling rates of Table 13, the encoding parameters of spectral band replication (SBR) may be adjusted appropriately.

Figure PCTKR2015006788-appb-T000011

Methods according to an embodiment of the present invention can be implemented in the form of program instructions that can be executed by various computer means and recorded in a computer readable medium. The computer readable medium may include program instructions, data files, data structures, etc. alone or in combination. Program instructions recorded on the media may be those specially designed and constructed for the purposes of the present invention, or they may be of the kind well-known and available to those having skill in the computer software arts.

As described above, the present invention has been described by way of limited embodiments and drawings, but the present invention is not limited to the above embodiments, and those skilled in the art to which the present invention pertains may make various modifications and variations from these descriptions.

Therefore, the scope of the present invention should not be limited to the described embodiments, but should be determined not only by the claims below but also by the equivalents of the claims.

Claims (20)

  1. A multichannel audio signal processing method comprising:
    identifying an N/2-channel downmix signal and a residual signal generated from an N-channel input signal;
    applying the N/2-channel downmix signal and the residual signal to a first matrix;
    outputting, through the first matrix, a first signal to be input to N/2 decorrelators corresponding to N/2 OTT boxes and a second signal to be transmitted to a second matrix without being input to the N/2 decorrelators;
    outputting uncorrelated signals from the first signal through the N/2 decorrelators;
    applying the uncorrelated signals and the second signal to the second matrix; and
    generating an N-channel output signal through the second matrix.
  2. The method of claim 1,
    wherein, when the LFE channel is not included in the N-channel output signal, N/2 decorrelators corresponding to the N/2 OTT boxes are used.
  3. The method of claim 1,
    If the number of the decorrelator exceeds the reference value of the modulo operation, the index of the decorrelator is repeatedly reused according to the reference value.
  4. The method of claim 1,
    wherein, when the LFE channel is included in the N-channel output signal, the number of decorrelators used is N/2 minus the number of LFE channels, and
    the OTT box that outputs the LFE channel does not use a decorrelator.
  5. The method of claim 1,
    If no temporal shaping tool is used, the second matrix is
    And a vector including the second signal, the uncorrelated signal derived from the decorrelator, and the residual signal derived from the decorrelator, is input.
  6. The method of claim 1,
    When a temporal shaping tool is used, the second matrix is
    And a vector corresponding to a direct signal composed of the second signal and the residual signal derived from the decorrelator, and a vector corresponding to a spread signal composed of the uncorrelated signal derived from the decorrelator.
  7. The method of claim 6,
    Generating the output signal of the N channel,
    When subband domain time processing (STP) is used, a multi-channel audio signal processing method for shaping the temporal envelope of an output signal by applying a scale factor based on a spread signal and a direct signal to the spread signal portion of the output signal.
  8. The method of claim 6,
    Generating the output signal of the N channel,
    When guided envelope shaping (GES) is used, the method of processing multi-channel audio signals by flattening and reshaping the envelope for the direct signal portion for each channel of the output signal of the N channel.
  9. The method of claim 1,
    The size of the first matrix is
    It is determined according to the number of channels and the number of decorrelators of the downmix signal applying the first matrix,
    The element of the first matrix,
    A multichannel audio signal processing method determined by a CLD parameter or a CPC parameter.
  10. A multichannel audio signal processing method comprising:
    identifying an N/2-channel downmix signal and an N/2-channel residual signal; and
    inputting the N/2-channel downmix signal and the N/2-channel residual signal into N/2 OTT boxes to generate an N-channel output signal,
    wherein the N/2 OTT boxes are arranged in parallel without being connected to each other, and
    the OTT box that outputs the LFE channel among the N/2 OTT boxes
    (1) receives only the downmix signal, without the residual signal,
    (2) uses only the CLD parameter among the CLD parameter and the ICC parameter, and
    (3) does not output an uncorrelated signal through a decorrelator.
  11. A multichannel audio signal processing apparatus comprising:
    a processor configured to perform a multichannel audio signal processing method,
    the multichannel audio signal processing method comprising:
    identifying an N/2-channel downmix signal and a residual signal generated from an N-channel input signal;
    applying the N/2-channel downmix signal and the residual signal to a first matrix;
    outputting, through the first matrix, a first signal to be input to N/2 decorrelators corresponding to N/2 OTT boxes and a second signal to be transmitted to a second matrix without being input to the N/2 decorrelators;
    outputting uncorrelated signals from the first signal through the N/2 decorrelators;
    applying the uncorrelated signals and the second signal to the second matrix; and
    generating an N-channel output signal through the second matrix.
  12. The method of claim 11,
    wherein, when the LFE channel is not included in the N-channel output signal, N/2 decorrelators corresponding to the N/2 OTT boxes are used.
  13. The method of claim 11,
    And the index of the decorrelator is repeatedly reused according to the reference value when the number of decorrelators exceeds a reference value of a modulo operation.
  14. The method of claim 11,
    wherein, when the LFE channel is included in the N-channel output signal, the number of decorrelators used is N/2 minus the number of LFE channels, and
    the OTT box that outputs the LFE channel does not use a decorrelator.
  15. The method of claim 11,
    If no temporal shaping tool is used, the second matrix is
    And a vector including the second signal, the uncorrelated signal derived from the decorrelator, and the residual signal derived from the decorrelator, is input.
  16. The method of claim 11,
    When a temporal shaping tool is used, the second matrix is
    And a vector corresponding to a direct signal composed of the second signal and the residual signal derived from the decorrelator, and a vector corresponding to a spread signal composed of the uncorrelated signal derived from the decorrelator.
  17. The method of claim 16,
    Generating the output signal of the N channel,
    A multi-channel audio signal processing apparatus, when subband domain time processing (STP) is used, shaping the temporal envelope of the output signal by applying a scale factor based on the spread signal and the direct signal to the spread signal portion of the output signal.
  18. The method of claim 16,
    Generating the output signal of the N channel,
    When guided envelope shaping (GES) is used, the multi-channel audio signal processing apparatus for flattening and reshaping the envelope for the direct signal portion for each channel of the N-channel output signal.
  19. The method of claim 11,
    The size of the first matrix is
    It is determined according to the number of channels and the number of decorrelators of the downmix signal applying the first matrix,
    The element of the first matrix,
    A multi-channel audio signal processing apparatus determined by a CLD parameter or a CPC parameter.
  20. A multichannel audio signal processing apparatus comprising:
    a processor configured to perform a multichannel audio signal processing method,
    the multichannel audio signal processing method comprising:
    identifying an N/2-channel downmix signal and an N/2-channel residual signal; and
    inputting the N/2-channel downmix signal and the N/2-channel residual signal into N/2 OTT boxes to generate an N-channel output signal,
    wherein the N/2 OTT boxes are arranged in parallel without being connected to each other, and
    the OTT box that outputs the LFE channel among the N/2 OTT boxes
    (1) receives only the downmix signal, without the residual signal,
    (2) uses only the CLD parameter among the CLD parameter and the ICC parameter, and
    (3) does not output an uncorrelated signal through a decorrelator.
PCT/KR2015/006788 2014-07-01 2015-07-01 Multichannel audio signal processing method and device WO2016003206A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
KR10-2014-0082030 2014-07-01
KR20140082030 2014-07-01
KR10-2015-0094195 2015-07-01
KR1020150094195A KR20160003572A (en) 2014-07-01 2015-07-01 Method and apparatus for processing multi-channel audio signal

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
DE112015003108.1T DE112015003108T5 (en) 2014-07-01 2015-07-01 Operation of the multi-channel audio signal systems
US15/323,028 US9883308B2 (en) 2014-07-01 2015-07-01 Multichannel audio signal processing method and device
CN201580036477.8A CN106471575B (en) 2014-07-01 2015-07-01 Multi-channel audio signal processing method and device
US15/870,700 US10264381B2 (en) 2014-07-01 2018-01-12 Multichannel audio signal processing method and device
US16/357,180 US20190289413A1 (en) 2014-07-01 2019-03-18 Multichannel audio signal processing method and device

Related Child Applications (3)

Application Number Title Priority Date Filing Date
US15/323,028 A-371-Of-International US9883308B2 (en) 2014-07-01 2015-07-01 Multichannel audio signal processing method and device
US201615323028A A-371-Of-International 2016-12-29 2016-12-29
US15/870,700 Continuation US10264381B2 (en) 2014-07-01 2018-01-12 Multichannel audio signal processing method and device

Publications (1)

Publication Number Publication Date
WO2016003206A1 true WO2016003206A1 (en) 2016-01-07

Family

ID=55019650

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2015/006788 WO2016003206A1 (en) 2014-07-01 2015-07-01 Multichannel audio signal processing method and device

Country Status (1)

Country Link
WO (1) WO2016003206A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050195981A1 (en) * 2004-03-04 2005-09-08 Christof Faller Frequency-based coding of channels in parametric multi-channel coding systems
WO2007078254A2 (en) * 2006-01-05 2007-07-12 Telefonaktiebolaget Lm Ericsson (Publ) Personalized decoding of multi-channel surround sound
WO2007111568A2 (en) * 2006-03-28 2007-10-04 Telefonaktiebolaget L M Ericsson (Publ) Method and arrangement for a decoder for multi-channel surround sound
WO2010050740A2 (en) * 2008-10-30 2010-05-06 삼성전자주식회사 Apparatus and method for encoding/decoding multichannel signal
KR20120099191A (en) * 2006-01-11 2012-09-07 삼성전자주식회사 Method of generating a multi-channel signal from down-mixed signal and computer-readable medium thereof



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15815538

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 15323028

Country of ref document: US

WWE Wipo information: entry into national phase

Ref document number: 112015003108

Country of ref document: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15815538

Country of ref document: EP

Kind code of ref document: A1