KR100923478B1 - Synthesizing a mono audio signal based on an encoded multichannel audio signal

Info

Publication number: KR100923478B1
Application number: KR1020067017564A
Authority: KR
Inventors: 아리 라카니에미; 파시 오잘라
Original assignee: 노키아 코포레이션
Priority date: 2004-03-12
Filing date: 2004-03-12
Publication date: 2009-10-27
Also published as: KR20060121985A

Abstract

The present invention relates to a method of synthesizing a mono audio signal based on an available encoded multichannel audio signal. The encoded multichannel audio signal is assumed to comprise, at least for a part of an audio frequency band, separate parameter values for each channel of the multichannel audio signal. To reduce the processing load in synthesizing the mono audio signal, it is proposed that the parameter values of the multiple channels be combined in the parameter domain, at least for a part of the audio frequency band. The combined parameter values are then used to synthesize the mono audio signal. The invention relates equally to a corresponding audio decoder, a corresponding coding system and a corresponding software program product.

Description

Synthesizing a mono audio signal based on an encoded multichannel audio signal

The present invention relates to a method of synthesizing a mono audio signal based on an available encoded multichannel audio signal, where the encoded multichannel audio signal comprises, at least for a part of an audio frequency band, separate parameter values for each channel of the multichannel audio signal. The invention relates equally to a corresponding audio decoder, a corresponding coding system and a corresponding software program product.

Audio coding systems are well known in the state of the art. They are used in particular for transmitting or storing audio signals.

An audio coding system used for the transmission of audio signals comprises an encoder at the transmitting end and a decoder at the receiving end. The transmitting end and the receiving end may be, for example, mobile terminals. The audio signal to be transmitted is provided to the encoder. The encoder is responsible for adapting the incoming audio data rate to a bit rate level at which the bandwidth conditions in the transmission channel are not violated. Ideally, the encoder discards only irrelevant information from the audio signal in the encoding process. The encoded audio signal is then transmitted by the transmitting end of the audio coding system and received at the receiving end. The decoder at the receiving end reverses the encoding process to obtain a decoded audio signal with little or no audible degradation.

If the audio coding system is used for archiving audio data, the encoded audio data provided by the encoder is stored in a storage unit, and the decoder decodes the data retrieved from this storage unit, for example for presentation by a media player. In this alternative, the aim is for the encoder to achieve a bit rate that is as low as possible, in order to save storage space.

Depending on the allowed bit rate, different coding schemes can be applied to an audio signal.

In most cases, the low frequency band and the high frequency band of an audio signal are correlated with each other. Audio codec bandwidth extension algorithms therefore typically first split the bandwidth of the audio signal to be encoded into two frequency bands. The low frequency band is then processed independently by a so-called core codec, while the high frequency band is processed using knowledge of the signals and coding parameters from the low frequency band. Using parameters from the low frequency band coding in the high frequency band significantly reduces the bit rate resulting from the high band coding.

FIG. 1 shows a typical split band encoding and decoding system. The system comprises an audio encoder 10 and an audio decoder 20. The audio encoder 10 includes a two-band analysis filterbank 11, a low band encoder 12 and a high band encoder 13. The audio decoder 20 includes a low band decoder 21, a high band decoder 22 and a two-band synthesis filterbank 23. The low band encoder 12 and decoder 21 may be, for example, the Adaptive Multi-Rate Wideband (AMR-WB) standard encoder and decoder, while the high band encoder 13 and decoder 22 may comprise an independent coding algorithm, a bandwidth extension algorithm or a combination of both. By way of example, the presented system is assumed to use the extended AMR-WB (AMR-WB+) codec as the split band coding algorithm.

The input audio signal 1 is first processed by the two-band analysis filterbank 11, in which the audio frequency band is split into a low frequency band and a high frequency band. For illustration, FIG. 2 presents an example of the frequency response of a two-band filterbank for the case of AMR-WB+. A 12 kHz audio band is divided into a band L from 0 kHz to 6.4 kHz and a band H from 6.4 kHz to 12 kHz. In the two-band analysis filterbank 11, the resulting frequency bands are moreover critically down-sampled. That is, the low frequency band is down-sampled to 12.8 kHz and the high frequency band is re-sampled to 11.2 kHz.

The low frequency band and the high frequency band are then encoded independently of each other by the low band encoder 12 and the high band encoder 13, respectively.

To this end, the low band encoder 12 comprises full source signal coding algorithms. These include an algebraic code excited linear prediction (ACELP) type algorithm and a transform based algorithm. The algorithm actually employed is selected based on the signal characteristics of the respective input audio signal. The ACELP algorithm is typically selected for encoding speech signals and transients, while the transform based algorithm is typically selected for encoding music and tone-like signals, to better handle the frequency resolution.

In the AMR-WB+ codec, the high band encoder 13 uses linear prediction coding (LPC) to model the spectral envelope of the high frequency band signal. The high frequency band can thus be described by LPC synthesis filter coefficients, which define the spectral characteristics of the synthesized signal, and by gain factors for an excitation signal, which control the amplitude of the synthesized high frequency band audio signal. The high band excitation signal is copied from the low band encoder 12. Only the LPC coefficients and the gain factors are provided for transmission.

The outputs of the low band encoder 12 and the high band encoder 13 are multiplexed into a single bit stream 2.

The multiplexed bit stream 2 is transmitted, for example over a communication channel, to the audio decoder 20, in which the low frequency band and the high frequency band are decoded separately.

In the low band decoder 21, the processing of the low band encoder 12 is reversed in order to synthesize the low frequency band audio signal.

In the high band decoder 22, an excitation signal is generated by re-sampling the low frequency band excitation provided by the low band decoder 21 to the sampling rate used in the high frequency band. That is, the low frequency band excitation signal is reused for decoding the high frequency band by transposing the low frequency band signal to the high frequency band. Alternatively, a random excitation signal can be generated for the reconstruction of the high frequency band signal. The high frequency band signal is then reconstructed by filtering the scaled excitation signal through the high band LPC model defined by the LPC coefficients.
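
The reconstruction step described above, a scaled excitation passed through an all-pole LPC model, can be sketched as follows. This is a minimal illustration and not the AMR-WB+ implementation: the function name `lpc_synthesis`, the sign convention A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p and the sample values are assumptions made for the example.

```python
def lpc_synthesis(excitation, lpc, gain=1.0):
    """Filter a scaled excitation through the all-pole model 1/A(z).

    Assumed convention: `lpc` holds a[1..p] of
    A(z) = 1 + a[1]*z^-1 + ... + a[p]*z^-p, so that
    y[n] = gain*x[n] - sum(a[k]*y[n-k] for k in 1..p).
    """
    out = []
    for n, x in enumerate(excitation):
        acc = gain * x  # scale the excitation sample by the gain factor
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]  # feedback from earlier output samples
        out.append(acc)
    return out


# A unit impulse through a first-order model with a[1] = -0.5
impulse_response = lpc_synthesis([1.0, 0.0, 0.0, 0.0], [-0.5])
```

With these hypothetical inputs the output decays geometrically, which is the expected behaviour of a stable first-order all-pole filter.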

In the two-band synthesis filterbank 23, the decoded low frequency band signals and high frequency band signals are up-sampled to the original sampling frequency and combined into a synthesized output audio signal 3.

The input audio signal 1 to be encoded may be a mono audio signal or a multichannel audio signal comprising at least a first and a second channel signal. An example of a multichannel audio signal is a stereo audio signal, which consists of a left channel signal and a right channel signal.

For stereo operation of the AMR-WB+ codec, the input audio signal is split in the same way into a low frequency band signal and a high frequency band signal in the two-band analysis filterbank 11. The low band encoder 12 generates a mono signal by combining the left channel signals and the right channel signals in the low frequency band. The mono signal is encoded as described above. In addition, the low band encoder 12 uses parametric coding to encode the differences of the left and right channel signals relative to the mono signal. The high band encoder 13 encodes the left channel and the right channel separately by determining separate LPC coefficients and gain factors for each channel.

If the input audio signal 1 is a multichannel audio signal, but the device that is to present the synthesized audio signal 3 does not support multichannel audio output, the incoming multichannel bit stream 2 has to be converted into a mono audio signal by the audio decoder 20. In the low frequency band, converting the multichannel signal into a mono signal is straightforward, because the low band decoder 21 can simply omit the stereo parameters in the received bit stream and decode only the mono part. For the high frequency band, however, more processing is required, since no separate mono signal part of the high frequency band is available in the bit stream.

Conventionally, the stereo bit stream for the high frequency band is decoded separately for the left and right channel signals, and the mono signal is then generated by combining the left and right channel signals in a down-mixing process. This approach is illustrated in FIG. 3.

FIG. 3 schematically shows details of the high band decoder 22 of FIG. 1 for a mono audio signal output. To this end, the high band decoder comprises a left channel processing portion 30 and a right channel processing portion 33. The left channel processing portion 30 includes a mixer 31, which is connected to an LPC synthesis filter 32. The right channel processing portion 33 equally includes a mixer 34, which is connected to an LPC synthesis filter 35. The outputs of both LPC synthesis filters 32, 35 are connected to a further mixer 36.

The low frequency band excitation signal provided by the low band decoder 21 is fed to each of the mixers 31 and 34. The mixer 31 applies the gain factors for the left channel to the low frequency band excitation signal. The left channel high band signal is then reconstructed by the LPC synthesis filter 32 by filtering the scaled excitation signal through the high band LPC model defined by the LPC coefficients for the left channel. The mixer 34 applies the gain factors for the right channel to the low frequency band excitation signal. The right channel high band signal is then reconstructed by the LPC synthesis filter 35 by filtering the scaled excitation signal through the high band LPC model defined by the LPC coefficients for the right channel.

The reconstructed left channel high frequency band signal and the reconstructed right channel high frequency band signal are then converted by the mixer 36 into a mono high frequency band signal by computing their average in the time domain.

This is in principle a simple and workable approach. However, it requires a separate synthesis of multiple channels, even though only a single channel signal is needed in the end.

Moreover, if the multichannel audio input signal 1 is unbalanced in such a way that most of the energy of the multichannel audio signal lies on one of the channels, a direct mixing of the multiple channels by computing their average will result in an attenuation in the combined signal. In the extreme case, one of the channels is completely silent, which leads to an energy level of the combined signal that is half the energy level of the original active input channel.
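
The attenuation caused by plain time-domain averaging can be illustrated with a short sketch. The helper name `downmix_average` and the sample values are assumptions chosen for illustration only; the example merely shows that, with one silent channel, every mono sample is half the corresponding sample of the active channel.

```python
def downmix_average(left, right):
    """Conventional down-mix: sample-wise average of two channel signals."""
    return [(l + r) / 2 for l, r in zip(left, right)]


# Hypothetical unbalanced input: all energy on the left channel
left = [1.0, -1.0, 1.0, -1.0]
right = [0.0, 0.0, 0.0, 0.0]

mono = downmix_average(left, right)
peak_left = max(abs(s) for s in left)
peak_mono = max(abs(s) for s in mono)
```

Here `peak_mono` is half of `peak_left`, so the mono output is noticeably quieter than the only channel that actually carried a signal.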

It is an object of the invention to reduce the processing load required for synthesizing a mono audio signal based on an encoded multichannel audio signal.

A method of synthesizing a mono audio signal based on an available encoded multichannel audio signal is proposed, where the encoded multichannel audio signal comprises, at least for a part of an audio frequency band, separate parameter values for each channel of the multichannel audio signal. The proposed method comprises combining the parameter values of the multiple channels in the parameter domain, at least for a part of the audio frequency band. The proposed method further comprises synthesizing a mono audio signal using the combined parameter values for this part of the audio frequency band.

In addition, an audio decoder for synthesizing a mono audio signal based on an available encoded multichannel audio signal is proposed. The encoded multichannel audio signal comprises, at least for a part of the frequency band of the original multichannel audio signal, separate parameter values for each channel of the multichannel audio signal. The proposed audio decoder comprises at least one parameter selection portion adapted to combine the parameter values of the multiple channels in the parameter domain, at least for a part of the frequency band of the multichannel audio signal. The proposed audio decoder further comprises an audio signal synthesis portion adapted to synthesize a mono audio signal, at least for a part of the frequency band of the multichannel audio signal, based on the combined parameter values provided by the parameter selection portion.

In addition, a coding system is proposed which comprises, in addition to the proposed decoder, an audio encoder providing the encoded multichannel audio signal.

Finally, a software program product is proposed in which software code for synthesizing a mono audio signal based on an available encoded multichannel audio signal is stored. The encoded multichannel audio signal comprises, at least for a part of the frequency band of the original multichannel audio signal, separate parameter values for each channel of the multichannel audio signal. The proposed software code realizes the steps of the proposed method when running in an audio decoder.

The encoded multichannel audio signal may be, in particular but not exclusively, an encoded stereo audio signal.

The invention proceeds from the consideration that, for obtaining a mono audio signal, a separate decoding of the available multiple channels can be avoided if the parameter values available for these channels are combined in the parameter domain before decoding. The combined parameter values can then be used for a single-channel decoding.

An advantage of the invention is that it saves processing load at the decoder and reduces the complexity of the decoder. If the multiple channels are stereo channels processed in a split band system, for example, about half of the processing load required for the high frequency band synthesis filtering can be saved, compared to performing the high frequency band synthesis filtering separately for both channels and mixing the resulting left and right channel signals.

In one embodiment of the invention, the parameters comprise gain factors for each of the multiple channels and linear prediction coefficients for each of the multiple channels.

Combining the parameter values can be realized in a static manner, for example by always computing the average of the parameter values available for all channels. Advantageously, however, combining the parameter values is controlled, for at least one parameter, based on information on the respective activity in the multiple channels. This allows a mono audio signal to be achieved with spectral characteristics and a signal level as close as possible to the spectral characteristics and the signal level in the respective active channel, and thus an improved audio quality of the synthesized mono audio signal.

If the activity in a first channel is significantly higher than in a second channel, the first channel can be assumed to be an active channel, while the second channel can be assumed to be a silent channel that provides essentially no audible contribution to the original audio signal. If a silent channel is present, the silent channel's values of at least one parameter are advantageously ignored completely when combining the parameter values. As a result, the synthesized mono signal will be similar to the active channel. In all other cases, the parameter values can be combined, for example, by forming the average or a weighted average over all channels. For a weighted average, the weight assigned to a channel rises with its activity relative to the other channel or channels. Other methods can also be used to realize the combining. Equally, the parameter values for a silent channel that are not to be discarded can be combined with the parameter values of the active channel by averaging or by some other method.
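
One possible reading of this activity-controlled combination is sketched below. It is an illustration under stated assumptions, not the claimed method: the function name `combine_parameters`, the scalar activity measure and the threshold `silence_ratio` (used to decide when a channel counts as silent) are all hypothetical.

```python
def combine_parameters(values, activities, silence_ratio=0.01):
    """Combine one parameter across channels, steered by channel activity.

    values        -- one parameter value per channel
    activities    -- one non-negative activity measure per channel
    silence_ratio -- hypothetical threshold: a channel whose activity is
                     below silence_ratio * (highest activity) is treated
                     as silent and ignored completely
    """
    peak = max(activities)
    if peak == 0:
        # No activity information at all: fall back to the plain average
        return sum(values) / len(values)
    kept = [(v, a) for v, a in zip(values, activities)
            if a >= silence_ratio * peak]
    total = sum(a for _, a in kept)
    # Weighted average: a channel's weight rises with its relative activity
    return sum(v * a for v, a in kept) / total


balanced = combine_parameters([2.0, 4.0], [1.0, 1.0])   # plain average
one_silent = combine_parameters([2.0, 4.0], [1.0, 0.0]) # silent channel ignored
weighted = combine_parameters([2.0, 4.0], [3.0, 1.0])   # leans to channel 1
```

With equal activities the result is the ordinary average, and with one silent channel the combined value equals that of the active channel, which is the behaviour the paragraph above describes.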

Various types of information can form the information on the respective activity in the multiple channels. It may be given, for example, by a gain factor for each of the multiple channels, by a combination of gain factors over a short period of time for each of the multiple channels, or by linear prediction coefficients for each of the multiple channels. The activity information may equally be given by the energy level in at least a part of the frequency band of the multichannel audio signal for each of the multiple channels, or by separate side information on the activity received from the encoder providing the encoded multichannel audio signal.

To obtain the encoded multichannel audio signal, the original multichannel audio signal may, for example, be split into a low frequency band signal and a high frequency band signal. The low frequency band signal may then be encoded in a conventional manner. The high frequency band signal may also be encoded in a conventional manner separately for the multiple channels, which results in parameter values for each of the multiple channels. At least the encoded high frequency band part of the entire encoded multichannel audio signal may then be processed according to the invention.

It is to be understood that, in order to avoid an imbalance between the low frequency band and the high frequency band, for example an imbalance in signal level, multichannel parameter values of the low frequency band part of the entire signal may equally be processed according to the invention. Alternatively, the parameter values for silent channels in the high frequency band that affect the signal level could generally be retained, and only those parameter values for silent channels that affect the spectral characteristic of the signal discarded.

The invention can be implemented, for example but not exclusively, in an AMR-WB+ based coding system.

Other objects and features of the present invention will become apparent from the following detailed description considered in conjunction with the accompanying drawings.

FIG. 1 is a schematic block diagram of a split band coding system.

FIG. 2 is a diagram of the frequency response of a two-band filterbank.

FIG. 3 is a schematic block diagram of a conventional high band decoder for a stereo-to-mono conversion.

FIG. 4 is a schematic block diagram of a high band decoder for a stereo-to-mono conversion according to a first embodiment of the invention.

FIG. 5 is a diagram showing the frequency responses of stereo signals and of the mono signal resulting with the high band decoder of FIG. 4.

FIG. 6 is a schematic block diagram of a high band decoder for a stereo-to-mono conversion according to a second embodiment of the invention.

FIG. 7 is a flow chart illustrating the operation in a system using the high band decoder of FIG. 6.

FIG. 8 is a flow chart illustrating a first option for the parameter combining in the flow chart of FIG. 7.

FIG. 9 is a flow chart illustrating a second option for the parameter combining in the flow chart of FIG. 7.

The invention is assumed to be implemented in the system of FIG. 1, to which reference is therefore also made in the following. A stereo input audio signal 1 is provided to the audio encoder 10 for encoding, while a decoded mono audio signal 3 has to be provided by the audio decoder 20 for presentation.

To be able to provide such a mono audio signal 3 with a low processing load, the high band decoder 22 of the system may be realized according to a simple first embodiment of the invention.

FIG. 4 is a schematic block diagram of this high band decoder 22. The low band excitation input of the high band decoder 22 is connected via a mixer 40 and an LPC synthesis filter 41 to the output of the high band decoder 22. The high band decoder 22 additionally comprises a gain averaging unit 42, which is connected to the mixer 40, and an LPC averaging unit 43, which is connected to the LPC synthesis filter 41.

The system operates as follows.

A stereo signal input to the audio encoder 10 is split into a low frequency band and a high frequency band by the two-band analysis filterbank 11. The low band encoder 12 encodes the low frequency band audio signal as described above. The AMR-WB+ high band encoder 13 encodes the high band stereo signal separately for the left and right channels. More specifically, it determines linear prediction coefficients and gain factors for each channel as described above.

상기 부호화된 모노 저 주파수 대역 신호, 상기 스테레오 저 주파수 대역 매개 변수 값들 및 상기 스테레오 고 주파수 대역 매개 변수 값들은 상기 오디오 복호기(20)에 비트 스트림(2)으로 송신된다.The encoded mono low frequency band signal, the stereo low frequency band parameter values and the stereo high frequency band parameter values are transmitted to the audio decoder 20 in a bit stream 2.

The low band decoder 21 receives the low frequency band part of the bit stream for decoding. In this decoding, it omits the stereo parameters and decodes only the mono part. The result is a mono low frequency band audio signal.

The high band decoder 22 receives, on the one hand, the high frequency band parameter values from the transmitted bit stream and, on the other hand, the low band excitation signal output by the low band decoder 21.

The high frequency band parameters comprise a left channel gain factor, a right channel gain factor, left channel LPC coefficients and right channel LPC coefficients. In the gain average calculator 42, the respective gain factors for the left channel and the right channel are averaged, and the average gain factor is used by the mixer 40 to scale the low band excitation signal. The resulting signal is provided to the LPC synthesis filter 41 for filtering.

In the LPC average calculator 43, the respective linear prediction coefficients for the left channel and the right channel are combined. In AMR-WB+, the LPC coefficients of both channels can be combined, for example, by averaging the received coefficients in the Immittance Spectral Pair (ISP) domain. The averaged coefficients are then used to configure the LPC synthesis filter 41, to which the scaled low band excitation signal is subjected.

The scaled and filtered low band excitation signal forms the desired mono high band audio signal.
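The signal path of FIG. 4 can be illustrated with a minimal Python sketch. Several points are assumptions made only for this illustration and are not taken from the description: the function names, the expression of the gain factors in dB, and the direct averaging of the filter coefficients a_k of A(z) = 1 + sum_k a_k z^-k, which merely stands in for the ISP-domain averaging mentioned above (the conversion to and from the ISP domain is omitted).

```python
from typing import List, Sequence


def average(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Element-wise mean of two equally long parameter vectors."""
    if len(a) != len(b):
        raise ValueError("parameter vectors must have the same length")
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def lpc_synthesis(excitation: Sequence[float], lpc: Sequence[float]) -> List[float]:
    """All-pole filter 1/A(z) with A(z) = 1 + sum_k lpc[k-1] * z^-k."""
    out: List[float] = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def synthesize_mono_high_band(
    excitation: Sequence[float],
    gain_left_db: float,   # assumption: gain factors given in dB
    gain_right_db: float,
    lpc_left: Sequence[float],
    lpc_right: Sequence[float],
) -> List[float]:
    """Hypothetical helper mirroring blocks 40 to 43 of FIG. 4."""
    gain_db = (gain_left_db + gain_right_db) / 2.0   # gain average calculator 42
    scale = 10.0 ** (gain_db / 20.0)                 # dB to linear factor
    scaled = [scale * x for x in excitation]         # mixer 40
    # Assumption: plain coefficient averaging replaces ISP-domain averaging.
    lpc = average(lpc_left, lpc_right)               # LPC average calculator 43
    return lpc_synthesis(scaled, lpc)                # LPC synthesis filter 41
```

The synthesis filter runs once per frame regardless of the number of input channels, which is where the saving in processing load comes from. Averaging direct-form coefficients does not guarantee a stable filter in general, which is one reason a real decoder would rather average in a domain such as ISP.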

The mono low band audio signal and the mono high band audio signal are combined in the two-band synthesis filter bank 23, and the resulting synthesized signal 3 is output for presentation.

Compared to a system using the high band decoder of FIG. 3, a system using the high band decoder of FIG. 4 has the advantage that it requires only about half of the processing power for generating the synthesized signal, since the synthesis is performed only once.

It should be noted, however, that the above-mentioned problem of a possible attenuation in the combined signal remains for a stereo audio input that has an active signal in only one of the channels.

Moreover, for stereo audio input signals with only one active channel, averaging the linear prediction coefficients has the unwanted side effect of "flattening" the spectrum of the resulting combined signal. Instead of having the spectral characteristics of the active channel, the combined signal has somewhat distorted spectral characteristics, because the "real" spectrum of the active channel is combined with the practically flat or random-like spectrum of the silent channel.

This effect is illustrated in FIG. 5. FIG. 5 is a diagram showing the amplitude over frequency for three different LPC synthesis filter frequency responses calculated over a frame of 80 ms. The solid line represents the LPC synthesis filter frequency response of the active channel. The dotted line represents the LPC synthesis filter frequency response of the silent channel. The dashed line represents the LPC synthesis filter frequency response that results when the LPC models of both channels are averaged in the ISP domain. It can be seen that the averaged LPC filter produces a spectrum that does not closely resemble either of the real spectra. In practice, this phenomenon can be heard as reduced audio quality in the high frequency band.

To be able to provide a mono audio signal 3 that not only requires a low processing load but also avoids the limitations left unresolved by the high band decoder of FIG. 4, the high band decoder 22 of the system of FIG. 1 can be implemented in accordance with a second embodiment of the invention.

FIG. 6 is a schematic block diagram of such a high band decoder 22. A low band excitation input of the high band decoder 22 is connected via a mixer 60 and an LPC synthesis filter 61 to the output of the high band decoder 22. The high band decoder 22 further comprises a gain selection logic 62, which is connected to the mixer 60, and an LPC selection logic 63, which is connected to the LPC synthesis filter 61.

The processing in a system using the high band decoder 22 of FIG. 6 will now be described with reference to FIG. 7. FIG. 7 is a flow chart showing the processing in the audio encoder 10 in its upper part and the processing in the audio decoder 20 of the system in its lower part. The upper part and the lower part are separated by a horizontal dashed line.

The stereo audio signal 1 input to the encoder is split by the two-band analysis filter bank 11 into a low frequency band and a high frequency band. The low band encoder 12 encodes the low frequency band. The AMR-WB+ high band encoder 13 encodes the high frequency band separately for the left and right channels. More specifically, it determines linear prediction coefficients and dedicated gain factors for both channels as high frequency band parameters.

The low band decoder 21 receives the low frequency band related part of the bit stream 2 and decodes this part. In the decoding, the low band decoder 21 omits the received stereo parameters and decodes only the mono part. The result is a mono low band audio signal.

The high band decoder 22 receives, on the one hand, a left channel gain factor, a right channel gain factor, linear prediction coefficients for the left channel and linear prediction coefficients for the right channel and, on the other hand, the low band excitation signal output by the low band decoder 21. The left channel gain and the right channel gain are used at the same time as channel activity information. It should be noted that, instead, some other channel activity information indicating the distribution of activity in the high frequency band between the left channel and the right channel could be provided by the high band encoder 13 as an additional parameter.

The channel activity information is evaluated, and the gain factors for the left channel and the right channel are combined by the gain selection logic 62, in accordance with the evaluation, into a single gain factor. The selected gain is then applied via the mixer 60 to the low frequency band excitation signal provided by the low band decoder 21.

Moreover, the LPC coefficients for the left channel and the right channel are combined by the LPC selection logic 63, in accordance with the evaluation, into a single set of LPC coefficients. The combined LPC model is supplied to the LPC synthesis filter 61. The LPC synthesis filter 61 applies the selected LPC model to the scaled low frequency band excitation signal provided by the mixer 60.

The resulting high frequency band audio signal is then combined in the two-band synthesis filter bank 23 with the mono low frequency band audio signal into a mono full band audio signal, which can be output for presentation by applications or devices that are not capable of processing stereo audio signals.

The proposed evaluation of the channel activity information and the subsequent combination of the parameter values, which are indicated in the flow chart of FIG. 7 as a block with double lines, can be implemented in different ways. Two options will be presented with reference to the flow charts of FIGS. 8 and 9.

In the first option, illustrated in FIG. 8, the gain factors for the left channel are first averaged over the duration of one frame, and, equally, the gain factors for the right channel are averaged over the duration of one frame.

The averaged right channel gain is then subtracted from the averaged left channel gain, which yields a gain difference for each frame.

If the gain difference is smaller than a first threshold value, the combined gain factors for this frame are set equal to the gain factors provided for the right channel. Moreover, the combined LPC models for this frame are set equal to the LPC models provided for the right channel.

If the gain difference is larger than a second threshold value, the combined gain factors for this frame are set equal to the gain factors provided for the left channel. Moreover, the combined LPC models for this frame are set equal to the LPC models provided for the left channel.

In all other cases, the combined gain factors for this frame are set equal to the average of the respective gain factor for the left channel and the respective gain factor for the right channel. The combined LPC models for this frame are set equal to the average of the respective LPC model for the left channel and the respective LPC model for the right channel.

The first threshold value and the second threshold value are selected depending on the required sensitivity and on the type of application for which the stereo-to-mono conversion is required. Suitable values are, for example, -20 dB for the first threshold value and 20 dB for the second threshold value.

Thus, if one of the channels can be regarded as a silent channel while the other channel can be regarded as an active channel during a respective frame, due to the large difference in the average gain factors, the LPC models and gain factors of the silent channel are ignored for the duration of this frame. This is possible because the silent channel makes no audible contribution to the mixed audio output. Such a combination of parameters ensures that the spectral characteristics and the signal level are as close as possible to those of the respective active channel.
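The decision rule of the first option can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the gain factors are assumed to be expressed in dB so that their difference can be compared directly with the example thresholds of -20 dB and 20 dB, and each channel is represented by a single LPC coefficient vector per frame that is averaged directly, whereas the description averages LPC models, for example in the ISP domain.

```python
from typing import List, Sequence, Tuple


def mean(values: Sequence[float]) -> float:
    """Arithmetic mean of the gain values of one frame."""
    return sum(values) / len(values)


def average(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Element-wise mean of two equally long parameter vectors."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def combine_option_1(
    gains_left_db: Sequence[float],
    gains_right_db: Sequence[float],
    lpc_left: Sequence[float],
    lpc_right: Sequence[float],
    low_threshold_db: float = -20.0,   # example value from the text
    high_threshold_db: float = 20.0,   # example value from the text
) -> Tuple[List[float], List[float]]:
    """Return (combined gains, combined LPC vector) for one frame."""
    # Averaged left channel gain minus averaged right channel gain.
    gain_diff = mean(gains_left_db) - mean(gains_right_db)
    if gain_diff < low_threshold_db:
        # Left channel regarded as silent: use right channel parameters only.
        return list(gains_right_db), list(lpc_right)
    if gain_diff > high_threshold_db:
        # Right channel regarded as silent: use left channel parameters only.
        return list(gains_left_db), list(lpc_left)
    # Comparable activity: average both parameter types.
    # Assumption: direct averaging stands in for ISP-domain averaging.
    return average(gains_left_db, gains_right_db), average(lpc_left, lpc_right)
```

For a frame in which the left channel is about 40 dB stronger than the right channel, the sketch returns the left channel gains and LPC vector unchanged, so neither the level nor the spectrum of the active channel is diluted by the silent channel.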

It should be noted that, instead of omitting the stereo parameters, the low band decoder could also form combined parameter values and apply them to the mono part of the signal, just as described for the high frequency band processing.

In the second option for combining the parameter values, illustrated in FIG. 9, the gain factors for the left channel and the gain factors for the right channel are likewise each averaged over the duration of one frame.

If the gain difference between the averaged gains, formed as in the first option, is smaller than a first, low threshold value, the combined LPC models for this frame are set equal to the LPC models provided for the right channel.

If the gain difference is larger than a second, high threshold value, the combined LPC models for this frame are set equal to the LPC models provided for the left channel.

In all other cases, the combined LPC models for this frame are set equal to the average of the respective LPC model for the left channel and the respective LPC model for the right channel.

In any case, the combined gain factors for the frame are set equal to the average of the respective gain factor for the left channel and the respective gain factor for the right channel.
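The second option differs from the first only in that the thresholds steer the LPC selection, while the gains are averaged unconditionally. A minimal Python sketch under the same assumptions (hypothetical function names, gains in dB, gain difference taken as averaged left gain minus averaged right gain, one directly averaged LPC vector per channel and frame, threshold defaults chosen here only for illustration) could look like this:

```python
from typing import List, Sequence, Tuple


def average(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Element-wise mean of two equally long parameter vectors."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def select_lpc_source(gain_diff_db: float, low_db: float, high_db: float) -> str:
    """Classify a frame as 'right', 'left' or 'both' from the gain difference."""
    if gain_diff_db < low_db:
        return "right"
    if gain_diff_db > high_db:
        return "left"
    return "both"


def combine_option_2(
    gains_left_db: Sequence[float],
    gains_right_db: Sequence[float],
    lpc_left: Sequence[float],
    lpc_right: Sequence[float],
    low_db: float = -20.0,    # assumption: illustrative default
    high_db: float = 20.0,    # assumption: illustrative default
) -> Tuple[List[float], List[float]]:
    """Return (combined gains, combined LPC vector) for one frame."""
    mean_left = sum(gains_left_db) / len(gains_left_db)
    mean_right = sum(gains_right_db) / len(gains_right_db)
    source = select_lpc_source(mean_left - mean_right, low_db, high_db)
    if source == "right":
        lpc = list(lpc_right)
    elif source == "left":
        lpc = list(lpc_left)
    else:
        # Assumption: direct averaging stands in for ISP-domain averaging.
        lpc = average(lpc_left, lpc_right)
    # The gains are averaged in every case, whatever the LPC decision.
    return average(gains_left_db, gains_right_db), lpc
```

With a strongly dominant left channel, the sketch keeps the left channel LPC vector but still returns the averaged gains, so the spectral shape follows the active channel while the level is not raised relative to plain averaging.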

The LPC coefficients have a direct effect only on the spectral characteristics of the synthesized signal. Combining only the LPC coefficients therefore yields the desired spectral characteristics but does not solve the problem of signal attenuation. If the low frequency band is not mixed in accordance with the invention, however, this has the advantage that the balance between the low frequency band and the high frequency band is preserved. Preserving the signal level in the high frequency band would change the balance between the low and high frequency bands by producing relatively too loud signals in the high frequency band, possibly resulting in reduced subjective audio quality.

It should be noted that the described embodiments are only some of a wide variety of possible embodiments, which can furthermore be modified in many ways.

Claims

(Deleted)

21. A method of synthesizing a mono audio signal based on an available encoded multichannel audio signal, wherein the encoded multichannel audio signal comprises, for at least a part of an audio frequency band, separate parameter values for each channel of the multichannel audio signal, the method comprising, for at least a part of the audio frequency band:

combining parameter values of the multiple channels in the parameter domain; and

synthesizing a mono audio signal using the combined parameter values,

wherein the combining of the parameter values is controlled, for at least one parameter, based on information on the respective activity in the multiple channels.

22. The method of claim 21, wherein the parameters comprise gain factors for each of the multiple channels and linear prediction coefficients for each of the multiple channels.

23. The method of claim 21 or 22, wherein the information on the respective activity in the multiple channels comprises at least one of:

a gain factor for each of the multiple channels;

a combination of short-term gain factors for each of the multiple channels;

linear prediction coefficients for each of the multiple channels;

an energy level in at least a part of the frequency band of the multichannel audio signal for each of the multiple channels; and

separate side information on the activity received from an encoding end providing the encoded multichannel audio signal.

24. The method of claim 21 or 22, wherein, if the information on the activity in the multiple channels indicates that the activity in a first one of the multiple channels is significantly lower than in at least one other of the multiple channels, the value of at least one parameter available for the first channel is ignored.

25. The method of claim 24, wherein, if the information on the activity in the multiple channels indicates that the activity in a first one of the multiple channels is significantly lower than in at least one other of the multiple channels, the values of at least one other available parameter are averaged.

26. The method of claim 21 or 22, wherein, if the information on the activity in the multiple channels does not indicate that the activity in a first one of the multiple channels is significantly lower than in at least one other of the multiple channels, the values of the parameters available for the multiple channels are averaged.

27. The method of claim 21 or 22, wherein the multichannel signal is a stereo signal.

28. The method of claim 21 or 22, further comprising: dividing an original multichannel audio signal into a low frequency band signal and a high frequency band signal; encoding the low frequency band signal; and encoding the high frequency band signal separately for the multiple channels, resulting in the parameter values for each of the multiple channels, wherein at least the resulting parameter values for the high frequency band signal are combined for synthesizing the mono audio signal.

29. An audio decoder for synthesizing a mono audio signal based on an available encoded multichannel audio signal, wherein the encoded multichannel audio signal comprises, for at least a part of the frequency band of an original multichannel audio signal, separate parameter values for each channel of the multichannel audio signal, the audio decoder comprising:

at least one parameter selector for combining parameter values of the multiple channels in the parameter domain for at least a part of the frequency band of the multichannel audio signal; and

an audio signal synthesizer for synthesizing a mono audio signal for at least a part of the frequency band of the multichannel audio signal based on the combined parameter values provided by the at least one parameter selector,

wherein the parameter selector combines the parameter values for at least one parameter based on information on the respective activity in the multiple channels.

30. The audio decoder of claim 29, wherein the parameters comprise gain factors for each of the multiple channels and linear prediction coefficients for each of the multiple channels.

31. The audio decoder of claim 29 or 30, wherein the information on the respective activity in the multiple channels includes:

a gain factor for each of the multiple channels;

a combination of short-term gain factors for each of the multiple channels;

linear prediction coefficients for each of the multiple channels;

32. The audio decoder of claim 29 or 30, wherein, if the information on the activity in the multiple channels indicates that the activity in a first one of the multiple channels is significantly lower than in at least one other of the multiple channels, the parameter selector ignores, in the combining, the value of at least one parameter available for the first channel.

33. The audio decoder of claim 32, wherein, if the information on the activity in the multiple channels indicates that the activity in a first one of the multiple channels is significantly lower than in at least one other of the multiple channels, the parameter selector averages, in the combining, the values of at least one other parameter available for the multiple channels.

34. The audio decoder of claim 29 or 30, wherein, if the information on the activity in the multiple channels does not indicate that the activity in one of the multiple channels is significantly lower than in at least one other of the multiple channels, the parameter selector averages, in the combining, the values of the parameters available for the multiple channels.

35. The audio decoder of claim 29 or 30, wherein the multichannel signal is a stereo signal.

36. A mobile terminal comprising an audio decoder according to claim 29 or 30.

37. A coding system comprising an audio encoder for providing an encoded multichannel audio signal and an audio decoder according to claim 29 or 30, wherein the encoded multichannel audio signal comprises, for at least a part of the frequency band of an original multichannel audio signal, separate parameter values for each channel of the multichannel audio signal.

38. The coding system of claim 37, wherein the audio encoder comprises an evaluation element for determining information about activity on the multiple channels and providing the information for use by the audio decoder.

39. A computer readable medium having stored thereon software code for synthesizing a mono audio signal based on an available encoded multichannel audio signal, wherein the encoded multichannel audio signal comprises, for at least a part of the frequency band of an original multichannel audio signal, separate parameter values for each channel of the multichannel audio signal,

wherein the software code realizes the steps of the method of claim 21 when executed in an audio decoder.

(Deleted)