KR20160060161A

KR20160060161A - Audio codec supporting time-domain and frequency-domain coding modes

Info

Publication number: KR20160060161A
Application number: KR1020167012861A
Authority: KR
Inventors: 랄프 가이거; 콘스탄틴 슈미트; 베른하트 그릴; 맨프레드 러츠키; 미카엘 베르너; 마크 가이어; 요하네스 힐페르트; 마리아 루이스 발레로; 볼프강 예거스
Original assignee: 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베.
Priority date: 2011-02-14
Filing date: 2012-02-14
Publication date: 2016-05-27
Also published as: CN103548078A; ZA201306872B; TW201241823A; TWI484480B; JP2014507016A; WO2012110480A1; AU2016200351B2; KR101751354B1; KR20140000322A; HK1192793A1; EP2676269A1; CN103548078B; US20130332174A1; AU2016200351A1; BR112013020589A2; EP2676269B1; MY160264A; JP5851525B2; CA2827296C; US9037457B2

Abstract

An audio codec supporting both time-domain and frequency-domain coding modes, with low delay and increased coding efficiency in terms of rate/distortion ratio, is obtained by configuring the audio decoder to operate in different operating modes such that, if the active operating mode is a first operating mode, the mode dependent set of available frame coding modes is disjoint from a first subset of time-domain coding modes and overlaps with a second subset of frequency-domain coding modes, whereas, if the active operating mode is a second operating mode, the mode dependent set of available frame coding modes overlaps with both subsets, that is, with the subset of time-domain coding modes as well as the subset of frequency-domain coding modes.

Description

AUDIO CODEC SUPPORTING TIME-DOMAIN AND FREQUENCY-DOMAIN CODING MODES

The present invention relates to an audio codec supporting time-domain and frequency-domain coding modes.

Recently, the MPEG Unified Speech and Audio Coding (USAC) codec has been finalized. USAC is a codec which codes audio signals using a mix of Advanced Audio Coding (AAC), Transform Coded Excitation (TCX) and Algebraic Code-Excited Linear Prediction (ACELP). In particular, MPEG USAC uses a frame length of 1024 samples and allows switching between AAC-like frames of 1024 or 8x128 samples, TCX 1024 frames or, within one frame, a combination of ACELP frames (256 samples), TCX 256 frames and TCX 512 frames.

Disadvantageously, the MPEG USAC codec is not suitable for applications that require low delay. Two-way communication applications, for example, require such short delays. Owing to the USAC frame length of 1024 samples, USAC is not a candidate for these low-delay applications.
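To illustrate why a 1024-sample frame is problematic for low-delay use, the following sketch computes the duration of a single frame. The sampling rates are assumptions chosen for illustration; the text only fixes the frame length, and the overall codec delay also depends on look-ahead and transform overlap, which are not modelled here.

```python
def frame_duration_ms(frame_length: int, sample_rate_hz: int) -> float:
    """Return the duration of one frame in milliseconds."""
    if frame_length <= 0 or sample_rate_hz <= 0:
        raise ValueError("frame length and sample rate must be positive")
    return 1000.0 * frame_length / sample_rate_hz


USAC_FRAME_LENGTH = 1024  # samples per frame, as stated in the text

if __name__ == "__main__":
    # Assumed example sampling rates, not taken from the document.
    for rate in (16000, 32000, 48000):
        duration = frame_duration_ms(USAC_FRAME_LENGTH, rate)
        print(f"{rate} Hz: {duration:.2f} ms per frame")
```

Under these assumed rates, a single frame alone already spans tens of milliseconds, before any additional algorithmic delay is counted.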

In WO 2011147950, it has been proposed to render the USAC approach suitable for low-delay applications by restricting the coding modes of the USAC codec to the TCX and ACELP modes only. It has further been proposed to make the frame structure finer in order to obey the low-delay requirement imposed by low-delay applications.

However, there is still a need for an audio codec that enables low coding delay at increased efficiency in terms of rate/distortion ratio. Preferably, the codec should be able to handle audio signals of different types, such as speech and music, efficiently.

[1] 3GPP, "Audio codec processing functions; Extended Adaptive Multi-Rate - Wideband (AMR-WB+) codec; Transcoding functions", 2009, 3GPP TS 26.920.
[2] USAC codec (Unified Speech and Audio Codec), ISO/IEC CD 23003-3, September 24, 2010.

Accordingly, it is an object of the present invention to provide an audio codec that offers low delay for low-delay applications, but at increased coding efficiency in terms of rate/distortion ratio compared to, for example, USAC.

This object is achieved by the subject matter of the appended independent claims.

The basic idea underlying the present invention is that an audio codec supporting both time-domain and frequency-domain coding modes, with low delay and increased coding efficiency in terms of rate/distortion ratio, can be obtained if the audio decoder is configured to operate in different operating modes such that, if the active operating mode is a first operating mode, the mode dependent set of available frame coding modes is disjoint from a first subset of time-domain coding modes and overlaps with a second subset of frequency-domain coding modes, whereas, if the active operating mode is a second operating mode, the mode dependent set of available frame coding modes overlaps with both subsets, that is, with the subset of time-domain coding modes as well as the subset of frequency-domain coding modes. For example, the decision as to which of the first and second operating modes is used may be made depending on the transmission bitrate available for transmitting the data stream. The dependency may be such that the second operating mode is used at lower available transmission bitrates and the first operating mode at higher available transmission bitrates. In particular, providing the encoder with operating modes makes it possible to prevent the encoder from choosing any time-domain coding mode in coding circumstances, such as those determined by the available transmission bitrate, in which choosing a time-domain coding mode would very likely cause a loss of coding efficiency when rate/distortion performance is considered over the long term. More precisely, the inventors found that suppressing the selection of any time-domain coding mode at (relatively) high available transmission bandwidth increases coding efficiency: in the short term, a time-domain coding mode may appear preferable to the frequency-domain coding modes, but this assumption is very likely to turn out to be incorrect if the audio signal is analyzed over a longer period. Such a longer analysis or look-ahead is, however, not possible in low-delay applications, and accordingly, preventing the encoder from using any time-domain coding mode in these circumstances enables increased coding efficiency.
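The bitrate-dependent choice of operating mode described above can be sketched as follows. This is a minimal illustration, not the patented decision rule: the threshold value and the names `OperatingMode` and `select_operating_mode` are hypothetical, and the text only states that higher available bitrates lead to the first operating mode and lower ones to the second.

```python
from enum import Enum


class OperatingMode(Enum):
    FIRST = 1   # time-domain frame coding modes are not available
    SECOND = 2  # time-domain and frequency-domain modes are both available


def select_operating_mode(available_bitrate_bps: int,
                          threshold_bps: int = 32000) -> OperatingMode:
    """Pick the operating mode from the available transmission bitrate.

    threshold_bps is a hypothetical value chosen for illustration only.
    """
    if available_bitrate_bps < 0:
        raise ValueError("bitrate must be non-negative")
    if available_bitrate_bps >= threshold_bps:
        return OperatingMode.FIRST
    return OperatingMode.SECOND


if __name__ == "__main__":
    for bitrate in (12000, 24000, 48000, 64000):
        print(bitrate, select_operating_mode(bitrate).name)
```

Because the decision depends only on the available bitrate and not on a long signal analysis, it does not add look-ahead delay.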

In accordance with an embodiment of the present invention, the above idea is exploited to the extent that the bitrate efficiency of the data stream is further increased. Synchronously controlling the operating mode of encoder and decoder costs very little bitrate, or no bitrate at all if the synchronicity is provided by some other means. The fact that encoder and decoder operate in, and switch between, the operating modes synchronously may therefore be exploited to reduce the signaling overhead for signaling the frame coding modes associated with the individual frames of the data stream, each of which corresponds to one of consecutive portions of the audio signal. In particular, while the associator of the decoder is configured to associate each of the consecutive frames of the data stream with one of the mode dependent set of the plurality of frame coding modes depending on a frame mode syntax element associated with the frames of the data stream, the associator may change the dependency of this association depending on the active operating mode. In particular, the change in dependency may be such that, if the active operating mode is the first operating mode, the mode dependent set is disjoint from the first subset and overlaps with the second subset, and, if the active operating mode is the second operating mode, the mode dependent set overlaps with both subsets. However, less strict solutions that increase bitrate efficiency by exploiting knowledge of the circumstances associated with the operating modes are also feasible.
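A small worked example may help show how a smaller mode dependent set can reduce signaling overhead. It assumes a plain fixed-length code for the frame mode syntax element and hypothetical set sizes; the document does not prescribe either, and other embodiments keep the number of possible values constant.

```python
def fixed_length_bits(num_modes: int) -> int:
    """Bits needed to signal one of num_modes alternatives with a fixed-length code."""
    if num_modes < 1:
        raise ValueError("at least one mode is required")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without float rounding.
    return (num_modes - 1).bit_length()


if __name__ == "__main__":
    # Hypothetical sizes of the mode dependent set in each operating mode.
    modes_first_operating_mode = 2   # frequency-domain modes only (assumed)
    modes_second_operating_mode = 4  # time- and frequency-domain modes (assumed)
    print(fixed_length_bits(modes_first_operating_mode))
    print(fixed_length_bits(modes_second_operating_mode))
```

With these assumed sizes, each frame needs one bit less of mode signaling while the first operating mode is active.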

Advantageous aspects of embodiments of the present invention are the subject of the dependent claims.

In particular, preferred embodiments of the present invention are described in more detail below with reference to the drawings, in which:
FIG. 1 shows a block diagram of an audio decoder according to an embodiment;
FIG. 2 shows a schematic of a bijective mapping between the possible values of the frame mode syntax element and the frame coding modes of the mode dependent set according to an embodiment;
FIG. 3 shows a block diagram of a time-domain decoder according to an embodiment;
FIG. 4 shows a block diagram of a frequency-domain encoder according to an embodiment;
FIG. 5 shows a block diagram of an audio encoder according to an embodiment; and
FIG. 6 shows an embodiment for a time-domain and frequency-domain encoder according to an embodiment.
With regard to the description of the figures, it should be understood that, unless explicitly stated otherwise, the description of an element in one figure applies equally to elements bearing the same reference sign in another figure.

FIG. 1 shows an audio decoder 10 in accordance with an embodiment of the present invention. The audio decoder comprises a time-domain decoder 12 and a frequency-domain decoder 14. Further, the audio decoder 10 comprises an associator 16 configured to associate each of consecutive frames 18a-18c of a data stream 20 with one out of a mode dependent set of a plurality 22 of frame coding modes, which are exemplarily illustrated in FIG. 1 as A, B and C. There may be more than three frame coding modes, and the number may thus be changed from three to some other number. Each frame 18a-c corresponds to one of consecutive portions 24a-c of an audio signal 26 which the audio decoder is to reconstruct from the data stream 20.

To be more precise, the associator 16 is connected between an input 28 of the decoder 10 on the one hand and the inputs of the time-domain decoder 12 and the frequency-domain decoder 14 on the other hand, so as to provide the latter with the associated frames 18a-c in a manner described in more detail below.

The time-domain decoder 12 is configured to decode frames having one of a first subset 30 of one or more of the plurality 22 of frame coding modes associated therewith, and the frequency-domain decoder 14 is configured to decode frames having one of a second subset 32 of one or more of the plurality 22 of frame coding modes associated therewith. The first and second subsets are disjoint from each other, as illustrated in FIG. 1. To be more precise, the time-domain decoder 12 has an output for outputting reconstructed portions 24a-c of the audio signal 26 corresponding to frames having one of the first subset 30 of frame coding modes associated therewith, and the frequency-domain decoder 14 has an output for outputting reconstructed portions of the audio signal 26 corresponding to frames having one of the second subset 32 of frame coding modes associated therewith.
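The division of labour between the two decoders can be sketched as a simple dispatch on subset membership. The assignment of modes A, B and C to subsets 30 and 32 below is an assumption made for illustration; the text only states that the two subsets are disjoint. The function name `route_frame` and the returned labels are hypothetical.

```python
# Assumed membership of the frame coding modes; not specified in the text.
TIME_DOMAIN_MODES = frozenset({"A"})            # first subset 30
FREQUENCY_DOMAIN_MODES = frozenset({"B", "C"})  # second subset 32

# The two subsets must not share any frame coding mode.
assert TIME_DOMAIN_MODES.isdisjoint(FREQUENCY_DOMAIN_MODES)


def route_frame(frame_coding_mode: str) -> str:
    """Return which decoder handles a frame with the given frame coding mode."""
    if frame_coding_mode in TIME_DOMAIN_MODES:
        return "time_domain_decoder"       # decoder 12
    if frame_coding_mode in FREQUENCY_DOMAIN_MODES:
        return "frequency_domain_decoder"  # decoder 14
    raise ValueError(f"unknown frame coding mode: {frame_coding_mode!r}")


if __name__ == "__main__":
    for mode in ("A", "B", "C"):
        print(mode, "->", route_frame(mode))
```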

As shown in FIG. 1, the audio decoder 10 may optionally have a combiner 34 connected between the outputs of the time-domain decoder 12 and the frequency-domain decoder 14 on the one hand and an output 36 of the decoder 10 on the other hand. In particular, although FIG. 1 suggests that portions 24a-c do not overlap each other but immediately follow one another in time t, in which case the combiner 34 could be omitted, it is also possible that portions 24a-c are, at least partially, consecutive in time t but partially overlap each other, for example in order to allow for the time-aliasing cancellation involved with a lapped transform used by the frequency-domain decoder 14, as is the case with the embodiment of the frequency-domain decoder 14 explained in more detail below.

Before proceeding with the description of the embodiment of FIG. 1, it should be noted that the number of frame coding modes A-C shown in FIG. 1 is merely illustrative. The audio decoder of FIG. 1 may support more than three coding modes. In the following, the frame coding modes of subset 32 are called frequency-domain coding modes, whereas the frame coding modes of subset 30 are called time-domain coding modes. The associator 16 forwards frames 18a-c of any time-domain coding mode 30 to the time-domain decoder 12, and frames 18a-c of any frequency-domain coding mode to the frequency-domain decoder 14. The combiner 34 registers the reconstructed portions of the audio signal 26, as output by the time-domain and frequency-domain decoders 12 and 14, so that they are arranged consecutively in time t as indicated in FIG. 1. Optionally, the combiner 34 may perform an overlap-add functionality between frequency-domain coding mode portions 24, or other specific measures at the transitions between immediately consecutive portions, such as an overlap-add functionality for performing aliasing cancellation between portions output by the frequency-domain decoder 14. Forward aliasing cancellation may be performed between immediately following portions 24a-c output separately by the time-domain and frequency-domain decoders 12 and 14, that is, for transitions from frequency-domain coding mode portions 24 to time-domain coding mode portions 24 and vice versa. For further details regarding possible implementations, reference is made to the more detailed embodiments described below.
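The overlap-add functionality mentioned for the combiner 34 can be illustrated with a generic sketch. This is plain overlap-add of already windowed portions, assuming equally spaced portion starts; it is not the time-aliasing cancellation or forward aliasing cancellation of any particular codec, and the name `overlap_add` and the sample values are hypothetical.

```python
from typing import List, Sequence


def overlap_add(portions: Sequence[Sequence[float]], hop: int) -> List[float]:
    """Sum portions whose start positions are spaced hop samples apart."""
    if hop <= 0:
        raise ValueError("hop must be positive")
    # The output is long enough to hold the portion that ends last.
    length = max((i * hop + len(p) for i, p in enumerate(portions)), default=0)
    out = [0.0] * length
    for i, portion in enumerate(portions):
        start = i * hop
        for n, sample in enumerate(portion):
            out[start + n] += sample
    return out


if __name__ == "__main__":
    # Two portions of a constant signal with complementary fades in the overlap.
    first = [1.0, 1.0, 0.75, 0.25]   # fades out over the last two samples
    second = [0.25, 0.75, 1.0, 1.0]  # fades in over the first two samples
    print(overlap_add([first, second], hop=2))
```

In this example the fades sum to one in the overlap region, so the constant signal is reconstructed without a discontinuity at the transition.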

As will be described in more detail below, the associator 16 is configured to associate the consecutive frames 18a-c of the data stream 20 with the frame coding modes A-C in a manner which avoids the use of a time-domain coding mode in cases where such use is inappropriate, such as at high available transmission bitrates, where time-domain coding modes are likely to be inefficient in terms of rate/distortion ratio compared to frequency-domain coding modes, so that using a time-domain frame coding mode for a certain frame 18a-18c would very likely lead to a decrease in coding efficiency.

Accordingly, the associator 16 is configured to associate the frames with the frame coding modes depending on a frame mode syntax element associated with the frames 18a-c in the data stream 20. For example, the syntax of the data stream 20 may be configured such that each frame 18a-c comprises such a frame mode syntax element 38 for determining the frame coding mode to which the corresponding frame 18a-c belongs.

Further, the associator 16 is configured to operate in an active one of a plurality of operating modes, or to select a current operating mode out of the plurality of operating modes. The associator 16 may perform this selection depending on the data stream or depending on an external control signal. For example, as will be described in more detail below, the decoder 10 changes its operating mode synchronously with the change of operating mode at the encoder, and, in order to implement this synchronicity, the encoder may signal the active operating mode, and any change of the active one of the operating modes, within the data stream 20. Alternatively, the encoder and the decoder 10 may be controlled synchronously by some external control signal, such as control signals provided by lower transport layers such as the Evolved Packet System (EPS) or the Real-time Transport Protocol (RTP).

In order to realize the avoidance of inappropriate choices or inappropriate use of time-domain coding modes as outlined above, the associator 16 is configured to change the dependency of the association of the frames 18 with the coding modes depending on the active operating mode. In particular, if the active operating mode is a first operating mode, the mode dependent set of the plurality of frame coding modes is, for example, the one shown at 40, which is disjoint from the first subset 30 and overlaps with the second subset 32, whereas, if the active operating mode is a second operating mode, the mode dependent set is, for example, as shown at 42 in FIG. 1 and overlaps with both the first and second subsets 30 and 32.

That is, in accordance with the embodiment of FIG. 1, the audio decoder 10 is controllable via the data stream 20 or an external control signal so as to change its active operating mode between a first and a second operating mode, thereby changing the operating mode dependent set of frame coding modes accordingly, namely between 40 and 42, so that, in one operating mode, the mode dependent set 40 is disjoint from the set of time-domain coding modes, whereas, in the other operating mode, the mode dependent set 42 contains at least one time-domain coding mode as well as at least one frequency-domain coding mode.
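The constraints on the mode dependent sets 40 and 42 can be expressed directly as set relations. The concrete membership of each set below is an assumption for illustration; only the disjoint/overlap conditions come from the text. The helper name `satisfies_constraints` is hypothetical.

```python
SUBSET_30 = frozenset({"A"})       # time-domain coding modes (assumed membership)
SUBSET_32 = frozenset({"B", "C"})  # frequency-domain coding modes (assumed membership)

# Mode dependent sets per operating mode (assumed membership).
MODE_DEPENDENT_SET = {
    "first": frozenset({"B", "C"}),        # set 40
    "second": frozenset({"A", "B", "C"}),  # set 42
}


def satisfies_constraints(operating_mode: str, mode_set: frozenset) -> bool:
    """Check the disjoint/overlap conditions stated for each operating mode."""
    overlaps_time = not mode_set.isdisjoint(SUBSET_30)
    overlaps_freq = not mode_set.isdisjoint(SUBSET_32)
    if operating_mode == "first":
        # Disjoint from subset 30 and overlapping subset 32.
        return (not overlaps_time) and overlaps_freq
    if operating_mode == "second":
        # Overlapping both subsets.
        return overlaps_time and overlaps_freq
    raise ValueError(f"unknown operating mode: {operating_mode!r}")


if __name__ == "__main__":
    for name, mode_set in MODE_DEPENDENT_SET.items():
        print(name, sorted(mode_set), satisfies_constraints(name, mode_set))
```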

In order to explain in more detail the change in the dependency of the association performed by the associator 16, reference is made to FIG. 2, which exemplarily shows a fragment of the data stream 20, the fragment including a frame mode syntax element 38 associated with a certain one of the frames 18a-18c of FIG. 1. In this regard, it should be noted that the structure of the data stream 20 shown in FIG. 1 has been chosen merely for illustration purposes, and that a different structure may be used as well. For example, although the frames 18a to 18c in FIG. 1 are shown as simply connected or contiguous portions of the data stream 20 without any interleaving between them, such interleaving may also be applied. Moreover, although FIG. 1 suggests that the frame mode syntax element 38 is contained within the frame to which it refers, this is not necessarily the case. Rather, the frame mode syntax element 38 may be positioned within the data stream 20 outside the frames 18a to 18c. Further, the number of frame mode syntax elements 38 contained in the data stream 20 does not need to be equal to the number of frames 18a to 18c in the data stream 20. Rather, the frame mode syntax element 38 of FIG. 2 may, for example, be associated with more than one of the frames 18a to 18c in the data stream 20.

In any case, depending on the way the frame mode syntax element 38 has been inserted into the data stream 20, there is a mapping 44 between the frame mode syntax element 38, as contained in and transmitted via the data stream 20, and a set 46 of possible values of the frame mode syntax element 38. For example, the frame mode syntax element 38 may be inserted into the data stream 20 directly, that is, using a binary representation such as pulse code modulation (PCM), or using a variable-length code and/or entropy coding such as Huffman or arithmetic coding. Accordingly, the associator 16 may be configured to extract 48 the frame mode syntax element 38 from the data stream 20, such as by decoding, so as to derive any one of the set 46 of possible values, the possible values being indicated in FIG. 2 by small triangles. At the encoder side, the insertion 50 is performed correspondingly, such as by encoding.
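For the simplest case named above, a plain binary representation, extracting the frame mode syntax element amounts to reading a fixed number of bits from the data stream. The sketch below assumes a 2-bit field stored most significant bit first; the field width, the bit order, and the names `BitReader` and `extract_frame_mode_value` are hypothetical and not taken from the document.

```python
class BitReader:
    """Reads bits from a byte string, most significant bit first."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._pos = 0  # current position, counted in bits

    def read(self, num_bits: int) -> int:
        value = 0
        for _ in range(num_bits):
            if self._pos >= 8 * len(self._data):
                raise EOFError("no more bits in the data stream")
            byte = self._data[self._pos // 8]
            bit = (byte >> (7 - self._pos % 8)) & 1
            value = (value << 1) | bit
            self._pos += 1
        return value


def extract_frame_mode_value(reader: BitReader, num_bits: int = 2) -> int:
    """Extract one frame mode syntax element (assumed fixed width)."""
    return reader.read(num_bits)


if __name__ == "__main__":
    reader = BitReader(bytes([0b10110000]))
    print(extract_frame_mode_value(reader))
    print(extract_frame_mode_value(reader))
```

A variable-length or entropy-coded syntax element would replace the fixed-width read with the corresponding decoder, but the result is still one value out of the set of possible values.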

That is, each possible value which the frame mode syntax element 38 may assume, i.e. each possible value within the possible value range 46 of the frame mode syntax element 38, is associated with a certain one of the plurality of frame coding modes A, B and C. In particular, there is a bijective mapping between the possible values of set 46 on the one hand and the mode dependent set of frame coding modes on the other hand. The mapping, illustrated by the double-headed arrow 52 in FIG. 2, changes depending on the active operating mode. The bijective mapping 52 is part of the functionality of the associator 16, which changes the mapping 52 depending on the active operating mode. As explained with respect to FIG. 1, the mode dependent set 40 or 42 overlaps with both frame coding mode subsets 30 and 32 in case of the second operating mode, which is the case illustrated in FIG. 2, but is disjoint from subset 30, i.e. does not contain any element of subset 30, in case of the first operating mode. In other words, the bijective mapping 52 maps the domain of possible values of the frame mode syntax element 38 onto the co-domain of frame coding modes, called the mode dependent set 40 and 42, respectively. As illustrated in FIGS. 1 and 2 by the use of solid lines for the triangles representing the possible values of set 46, the domain of the bijective mapping 52 may remain the same in both operating modes, i.e. the first and the second operating mode, while the co-domain of the bijective mapping 52 changes as illustrated and described above.
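The operating-mode-dependent bijective mapping 52 can be sketched as a pair of lookup tables that share the same domain of possible values but have different co-domains. The concrete values and mode assignments are assumptions for illustration (mode A is taken to be the time-domain mode); the names `MAPPING_52`, `associate` and `is_bijective` are hypothetical.

```python
# Same domain {0, 1} in both operating modes; the co-domain changes.
# Mode "A" is assumed to be a time-domain mode, "B" and "C" frequency-domain modes.
MAPPING_52 = {
    "first": {0: "B", 1: "C"},   # co-domain: mode dependent set 40 (no time-domain mode)
    "second": {0: "A", 1: "B"},  # co-domain: mode dependent set 42 (both kinds of mode)
}


def is_bijective(mapping: dict) -> bool:
    """A finite mapping is a bijection onto its image if no target repeats."""
    return len(set(mapping.values())) == len(mapping)


def associate(syntax_element_value: int, operating_mode: str) -> str:
    """Map a frame mode syntax element value to a frame coding mode."""
    try:
        return MAPPING_52[operating_mode][syntax_element_value]
    except KeyError:
        raise ValueError("unknown operating mode or syntax element value") from None


if __name__ == "__main__":
    for mode in ("first", "second"):
        print(mode, [associate(v, mode) for v in (0, 1)], is_bijective(MAPPING_52[mode]))
```

The same transmitted value thus selects a different frame coding mode depending on the active operating mode, which is why encoder and decoder must switch operating modes synchronously.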

However, even the number of possible values within set 46 may change. This is indicated by the triangle drawn with a dashed line in FIG. 2. To be more precise, the number of available frame coding modes may differ between the first and second operating modes. If so, however, the associator 16 is in any case implemented such that the co-domain of the bijective mapping 52 behaves as outlined above: there is no overlap between the mode dependent set and subset 30 while the first operating mode is active.

Stated differently, the following is noted. Internally, the value of the frame mode syntax element 38 may be represented by some binary value whose range of possible values accommodates the set 46 of possible values independently of the currently active operating mode. More precisely, the associator 16 may internally represent the value of the frame mode syntax element 38 by a binary value of a binary representation. Using these binary values, the possible values of set 46 are sorted onto an ordinal scale, so that the possible values of set 46 remain comparable to each other even when the operating mode changes. The first possible value of set 46 on this ordinal scale may, for example, be defined as the one associated with the highest probability among the possible values of set 46, the second as the one with the next lower probability, and so forth. The possible values of the frame mode syntax element 38 are thus comparable to each other despite a change of the operating mode. In the latter example, it may happen that the domain and co-domain of the bijective mapping 52, i.e. the set 46 of possible values and the mode dependent set of frame coding modes, remain the same while the active operating mode changes between the first and second operating modes, but the bijective mapping 52 changes the association between the frame coding modes of the mode dependent set on the one hand and the comparable possible values of set 46 on the other hand. In the latter embodiment, the decoder 10 of Fig. 1 is still able to take advantage of an encoder acting in accordance with the embodiments explained below, namely one that refrains from choosing the inappropriate time-domain coding modes in the first operating mode. Associating the more probable possible values of set 46 solely with the frequency-domain coding modes 32 in the first operating mode, and using the less probable possible values of set 46 for the time-domain coding modes 30 only during the first operating mode, while changing this policy in the second operating mode, results in a higher compression rate for the data stream 20 if entropy coding is used for inserting the frame mode syntax element 38 into, and extracting it from, the data stream 20. In other words, in the first operating mode none of the time-domain coding modes 30 may be associated with a possible value of set 46 whose probability is higher than the probability of a possible value mapped by mapping 52 onto any of the frequency-domain coding modes 32. Such a case does exist in the second operating mode, where at least one time-domain coding mode 30 is associated with a possible value whose probability is higher than that of another possible value associated, according to mapping 52, with a frequency-domain coding mode 32.

The probabilities just mentioned, which are associated with the possible values 46 and are optionally used for encoding/decoding them, may be static or adaptively changed. Different sets of probability estimates may be used for the different operating modes. If the probabilities are changed adaptively, context-adaptive entropy coding may be used.
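The idea of keeping a separate, adaptively updated probability estimate per operating mode can be illustrated with a minimal sketch. It is not the entropy coder of any particular codec: the class name `AdaptiveModeModel`, the count-based estimate and the mode labels `"first"` and `"second"` are assumptions made for illustration, and the ideal code length of -log2(p) bits stands in for an actual arithmetic coder.

```python
import math


class AdaptiveModeModel:
    """Count-based probability estimate for the values of a syntax element.

    Hypothetical helper for illustration; not taken from the source.
    """

    def __init__(self, num_values):
        # Start from a uniform estimate (one pseudo-count per value).
        self.counts = [1] * num_values

    def probability(self, value):
        return self.counts[value] / sum(self.counts)

    def cost_bits(self, value):
        # Ideal code length an entropy coder would approach.
        return -math.log2(self.probability(value))

    def update(self, value):
        # Adapt the estimate after each coded value.
        self.counts[value] += 1


# One model per operating mode, so the statistics of the two modes stay separate.
models = {"first": AdaptiveModeModel(2), "second": AdaptiveModeModel(2)}


def code_value(value, operating_mode):
    """Return the ideal cost in bits of coding `value`, then adapt the model."""
    model = models[operating_mode]
    bits = model.cost_bits(value)
    model.update(value)
    return bits
```

With such a scheme, a value that occurs repeatedly in one operating mode becomes cheaper to code in that mode without affecting the estimate used in the other mode.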

As illustrated in Fig. 1, one preferred embodiment of the associator 16 is such that the association is performed depending on the active operating mode, as described above, while the frame mode syntax element 38 is encoded into and decoded from the data stream 20 such that the number of differentiable possible values within set 46 is independent of whether the active operating mode is the first or the second operating mode. In particular, in the case of Fig. 1 the number of differentiable possible values is two, as also illustrated in Fig. 2 when considering the triangles drawn with solid lines. In that case, for example, the associator 16 may be configured such that, if the active operating mode is the first operating mode, the mode dependent set 40 comprises a first and a second frame coding mode A and B of the second subset 32 of frame coding modes, and the frequency-domain decoder 14, which is responsible for these frame coding modes, is configured to use different time-frequency resolutions in decoding frames associated with the first and the second frame coding mode A and B, respectively. By this measure, one bit may, for example, be sufficient to transmit the frame mode syntax element 38 within the data stream 20 directly, i.e. without any further entropy coding, with merely the bijective mapping 52 changing upon a change from the first to the second operating mode and vice versa.
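A minimal sketch of such a one-bit, mode-dependent bijective mapping is given below. The concrete assignments are assumptions made for illustration only: the text fixes that the first operating mode maps onto the frequency-domain modes A and B, but the labels `FD_A`, `FD_B` and `TD_CELP`, the composition of the second mode's set, and the choice of which bit value maps to which mode are hypothetical.

```python
FIRST, SECOND = "first", "second"

# Hypothetical bijective mappings from the 1-bit syntax element value
# to a frame coding mode, one mapping per operating mode.
BIJECTIVE_MAPPING = {
    FIRST: {0: "FD_A", 1: "FD_B"},      # disjoint from the time-domain subset
    SECOND: {0: "TD_CELP", 1: "FD_A"},  # overlaps both subsets
}


def decode_frame_mode(bit, operating_mode):
    """Decoder side: map the received bit onto a frame coding mode."""
    return BIJECTIVE_MAPPING[operating_mode][bit]


def encode_frame_mode(frame_mode, operating_mode):
    """Encoder side: invert the mapping; reject modes outside the set."""
    inverse = {mode: bit for bit, mode in BIJECTIVE_MAPPING[operating_mode].items()}
    if frame_mode not in inverse:
        raise ValueError(
            f"{frame_mode} is not available in the {operating_mode} operating mode"
        )
    return inverse[frame_mode]
```

The same bit value therefore selects different frame coding modes depending on the active operating mode, while the size of the syntax element stays at one bit.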

As will be outlined in more detail below with respect to Figs. 3 and 4, the time-domain decoder 12 may be a code-excited linear-prediction (CELP) decoder, and the frequency-domain decoder may be a transform decoder configured to decode the frames associated with any of the second subset of frame coding modes on the basis of transform coefficient levels encoded into the data stream 20.

Reference is made, for example, to Fig. 3. Fig. 3 shows an example of the time-domain decoder 12 together with a frame associated with a time-domain coding mode, which passes through the time-domain decoder 12 to yield the corresponding portion 24 of the reconstructed audio signal 26. In the embodiment of Fig. 3 and the embodiment of Fig. 4 described later, the time-domain decoder 12 as well as the frequency-domain decoder are linear prediction based decoders configured to obtain linear prediction filter coefficients for each frame from the data stream 20. Although Figs. 3 and 4 suggest that each frame 18 has linear prediction filter coefficients 60 incorporated in it, this is not necessarily the case. The linear predictive coding (LPC) transmission rate at which the linear prediction coefficients 60 are transmitted within the data stream 20 may be equal to the frame rate of the frames 18 or may differ from it. Nevertheless, encoder and decoder may synchronously operate with, or apply, linear prediction filter coefficients individually associated with each frame by interpolating from the LPC transmission rate onto the LPC application rate.
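Interpolating from the LPC transmission rate onto the LPC application rate can be sketched as follows. This is only an illustration under stated assumptions: the function name `interpolate_lpc` is hypothetical, the interpolation is plain linear interpolation between two transmitted parameter vectors, and the vectors are treated as generic filter parameters. Practical codecs usually interpolate a line-spectral representation rather than direct-form coefficients, because interpolating direct-form coefficients does not guarantee a stable synthesis filter.

```python
def interpolate_lpc(prev_params, next_params, num_steps):
    """Linearly interpolate between two transmitted LPC parameter vectors.

    Returns `num_steps` vectors; the last one equals `next_params`.
    Both inputs must have the same length (the filter order).
    """
    if len(prev_params) != len(next_params):
        raise ValueError("parameter vectors must have the same order")
    result = []
    for i in range(1, num_steps + 1):
        t = i / num_steps  # interpolation weight of the newer vector
        result.append(
            [(1 - t) * p + t * q for p, q in zip(prev_params, next_params)]
        )
    return result
```

Since encoder and decoder apply the same rule, both sides arrive at the same per-frame coefficients even when coefficients are transmitted less often than once per frame.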

As shown in Fig. 3, the time-domain decoder 12 may comprise a linear prediction synthesis filter 62 and an excitation signal constructor 64. As shown in Fig. 3, the linear prediction synthesis filter 62 is fed with the linear prediction filter coefficients obtained from the data stream 20 for the current time-domain coding mode frame 18. The excitation signal constructor 64 and the linear prediction synthesis filter 62 are connected in series so as to output the reconstructed corresponding audio signal portion 24 at the output of the synthesis filter 62. In particular, the excitation signal constructor 64 is configured to construct an excitation signal 68 using an excitation parameter 66 which, as indicated in Fig. 3, may be contained within the currently decoded frame, which is associated with any time-domain coding mode. The excitation signal 68 is a kind of residual signal whose spectral envelope is formed by the linear prediction synthesis filter 62. In particular, the linear prediction synthesis filter is controlled by the linear prediction filter coefficients conveyed within the data stream 20 for the currently decoded frame (associated with any time-domain coding mode), so as to yield the reconstructed portion 24 of the audio signal 26.
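The series connection of excitation signal and synthesis filter can be illustrated by a minimal all-pole filter. The sketch assumes the common convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p with synthesis filter 1/A(z); the function name `lp_synthesis` and the optional `history` argument (the last p output samples of the preceding frame, oldest first) are hypothetical.

```python
def lp_synthesis(excitation, lpc, history=None):
    """Filter `excitation` through the all-pole synthesis filter 1/A(z).

    lpc     -- coefficients a_1..a_p of A(z) = 1 + sum_k a_k z^-k
    history -- last p output samples of the previous frame (oldest first);
               defaults to silence
    """
    order = len(lpc)
    past = list(history) if history is not None else [0.0] * order
    if len(past) < order:
        raise ValueError("history must hold at least `order` samples")
    output = []
    for e in excitation:
        # s[n] = e[n] - sum_k a_k * s[n-k]; past[-k] is s[n-k]
        s = e - sum(a * past[-k] for k, a in enumerate(lpc, start=1))
        output.append(s)
        past.append(s)
    return output
```

For example, a unit impulse filtered with the single coefficient a_1 = -0.5 yields the decaying sequence 1, 0.5, 0.25, ..., which shows how the filter imposes an envelope on a spectrally flat excitation.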

For further details on a possible implementation of the CELP decoder of Fig. 3, reference is made to known codecs such as the above-mentioned unified speech and audio coding (USAC) [2] or the extended adaptive multi-rate wideband (AMR-WB+) codec [1]. In accordance with the latter codecs, the CELP decoder of Fig. 3 may be implemented as an algebraic CELP (ACELP) decoder, according to which the excitation signal 68 is formed by combining a code/parameter-controlled signal, i.e. the innovation excitation, with a continuously updated adaptive excitation. The adaptive excitation results from modifying the excitation signal finally obtained and applied for the immediately preceding time-domain coding mode frame in accordance with an adaptive excitation parameter that is also conveyed within the data stream 20 for the currently decoded time-domain coding mode frame 18. The adaptive excitation parameter may, for example, define a pitch lag and a gain, prescribing how to modify the past excitation in terms of pitch and gain so as to obtain the adaptive excitation for the current frame. The code 66 may be used for a codebook look-up, or may otherwise, logically or arithmetically, define the pulses of the innovation excitation, for example in terms of their number and position.
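The combination of adaptive and innovation excitation can be sketched as follows. This is a strongly simplified illustration, not the ACELP procedure of AMR-WB+ or USAC: the function name `build_excitation`, the integer pitch lag, the representation of the code as a list of `(position, sign)` pulses and the two scalar gains are all assumptions.

```python
def build_excitation(past_excitation, pitch_lag, pitch_gain,
                     pulses, code_gain, length):
    """Construct an excitation as adaptive part plus innovation part.

    past_excitation -- excitation samples of the preceding frame(s)
    pitch_lag       -- integer delay (in samples) into the past excitation
    pulses          -- iterable of (position, sign) pairs for the innovation
    """
    if not 1 <= pitch_lag <= len(past_excitation):
        raise ValueError("pitch lag must lie within the past excitation")
    buffer = list(past_excitation)
    adaptive = []
    for _ in range(length):
        # Repeat the past excitation with period `pitch_lag`; samples just
        # generated are reused when the lag is shorter than the frame.
        sample = buffer[len(buffer) - pitch_lag]
        adaptive.append(sample)
        buffer.append(sample)
    innovation = [0.0] * length
    for position, sign in pulses:
        innovation[position] += sign
    return [pitch_gain * a + code_gain * c
            for a, c in zip(adaptive, innovation)]
```

The returned samples would then be fed to the synthesis filter and, in a real decoder, appended to the past excitation for use by the next frame.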

Similarly, Fig. 4 shows a possible embodiment of the frequency-domain decoder 14. Fig. 4 shows a current frame 18 entering the frequency-domain decoder 14, the frame 18 being associated with any frequency-domain coding mode. The frequency-domain decoder 14 comprises a frequency-domain noise shaper 70, the output of which is connected to a retransformer 72. The output of the retransformer 72 is, in turn, the output of the frequency-domain decoder 14, which outputs the reconstructed portion of the audio signal corresponding to the frame 18 currently being decoded.

As shown in Fig. 4, the data stream 20 may convey transform coefficient levels 74 and linear prediction filter coefficients 76 for frames associated with any frequency-domain coding mode. While the linear prediction filter coefficients 76 may have the same structure as the linear prediction filter coefficients associated with frames of any time-domain coding mode, the transform coefficient levels 74 represent the excitation signal for the frequency-domain frames 18 in the transform domain. As known from USAC, for example, the transform coefficient levels 74 may be coded differentially along the spectral axis. The quantization accuracy of the transform coefficient levels 74 may be controlled by a common scale factor or gain factor. The scale factor may be part of the data stream and is assumed here to be part of the transform coefficient levels 74. However, any other quantization scheme may be used as well. The transform coefficient levels 74 are fed to the frequency-domain noise shaper 70. The same applies to the linear prediction filter coefficients 76 for the currently decoded frequency-domain frame 18. The frequency-domain noise shaper 70 is then configured to obtain an excitation spectrum of the excitation signal from the transform coefficient levels 74 and to shape this excitation spectrum spectrally in accordance with the linear prediction filter coefficients 76. More precisely, the frequency-domain noise shaper 70 is configured to dequantize the transform coefficient levels 74 in order to yield the spectrum of the excitation signal. The frequency-domain noise shaper 70 then converts the linear prediction filter coefficients 76 into a weighting spectrum that corresponds to the transfer function of the linear prediction synthesis filter defined by the linear prediction filter coefficients 76. This conversion may involve an odd discrete Fourier transform (ODFT) applied to the LPCs so as to turn the LPCs into spectral weighting values. Further details may be obtained from the USAC standard. Using the weighting spectrum, the frequency-domain noise shaper 70 shapes, or weights, the excitation spectrum obtained from the transform coefficient levels 74, thereby obtaining the spectrum of the reconstructed signal. By this shaping/weighting, the quantization noise introduced at the encoder side by quantizing the transform coefficients is shaped so as to be perceptually less significant. The retransformer 72 then retransforms the shaped excitation spectrum output by the frequency-domain noise shaper 70 so as to obtain the reconstructed portion corresponding to the frame 18 just decoded.
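The dequantize-then-shape step of the frequency-domain noise shaper can be illustrated with a short sketch. It is written under several assumptions and does not reproduce the normative USAC procedure: a single hypothetical `step_size` stands in for the scale factor, the weights are taken as the magnitude response of 1/A(z) with A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and that response is evaluated at the odd frequencies pi*(k + 0.5)/N in the manner of an odd DFT.

```python
import cmath
import math


def lpc_to_spectral_weights(lpc, num_bins):
    """Magnitude of the synthesis filter 1/A(z) at odd frequencies.

    The bin centres pi*(k + 0.5)/num_bins correspond to an odd DFT grid.
    """
    weights = []
    for k in range(num_bins):
        omega = math.pi * (k + 0.5) / num_bins
        a_of_z = 1.0 + sum(
            a * cmath.exp(-1j * omega * n) for n, a in enumerate(lpc, start=1)
        )
        weights.append(1.0 / abs(a_of_z))
    return weights


def shape_spectrum(levels, step_size, lpc):
    """Dequantize coefficient levels and weight them with the LPC envelope."""
    excitation = [step_size * q for q in levels]  # uniform dequantization
    weights = lpc_to_spectral_weights(lpc, len(levels))
    return [w * x for w, x in zip(weights, excitation)]
```

With an empty coefficient list the weights are all one and the spectrum passes through unchanged; with a single coefficient such as a_1 = -0.9 the weights fall off towards high frequencies, so the quantization noise carried by the levels follows the same envelope.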

As already described above, the frequency-domain decoder 14 of Fig. 4 may support different coding modes. In particular, the frequency-domain decoder 14 may be configured to apply different time-frequency resolutions in decoding frequency-domain frames associated with different frequency-domain coding modes. For example, the retransform performed by the retransformer 72 may be a lapped transform, according to which consecutive and mutually overlapping windowed portions of the signal to be transformed are subjected to individual transforms, with the retransformer 72 yielding a reconstruction of these windowed portions 78a, 78b and 78c. As already noted above, the combiner 34 may mutually compensate the aliasing occurring at the overlap of these windowed portions by, for example, an overlap-add process. The lapped transform or lapped retransform of the retransformer 72 may, for example, be a critically sampled transform/retransform that requires time aliasing cancellation. For example, the retransformer 72 may perform an inverse modified discrete cosine transform (MDCT). In any case, the frequency-domain coding modes A and B may, for example, differ in that the portion corresponding to the currently decoded frame 18 is either covered by one windowed portion 78, which also extends into the preceding and succeeding portions and thus yields one larger set of transform coefficient levels 74 within the frame 18, or is divided into two consecutive windowed sub-portions 78c and 78b, which mutually overlap and extend into, and overlap with, the preceding and the succeeding portion, respectively, and thus yield two smaller sets of transform coefficient levels 74 within the frame 18. Accordingly, the frequency-domain noise shaper 70 and the retransformer 72 may, for example, perform two operations, shaping and retransforming, for frames of mode A, whereas they perform, for example, one operation per frame of frame coding mode B.
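Time aliasing cancellation by overlap-add can be demonstrated with a small, unoptimized MDCT pair. The sketch rests on assumptions not stated in the source: a sine window is used for both analysis and synthesis, the inverse transform is scaled by 2/N for N coefficients per block, and the transforms are computed directly from their defining sums. All function names are hypothetical.

```python
import math


def sine_window(block_len):
    """Sine window; satisfies w[n]^2 + w[n + block_len/2]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / block_len) for n in range(block_len)]


def _basis(n, k, half):
    return math.cos(math.pi / half * (n + 0.5 + half / 2) * (k + 0.5))


def mdct(block):
    """Windowed MDCT: 2N input samples -> N coefficients."""
    half = len(block) // 2
    w = sine_window(len(block))
    return [
        sum(w[n] * block[n] * _basis(n, k, half) for n in range(len(block)))
        for k in range(half)
    ]


def imdct(coeffs):
    """Windowed inverse MDCT: N coefficients -> 2N aliased samples."""
    half = len(coeffs)
    w = sine_window(2 * half)
    return [
        w[n] * (2.0 / half) * sum(coeffs[k] * _basis(n, k, half) for k in range(half))
        for n in range(2 * half)
    ]


def overlap_add(first, second):
    """Add the second half of `first` to the first half of `second`."""
    half = len(first) // 2
    return [first[half + n] + second[n] for n in range(half)]
```

Transforming two blocks that overlap by half their length and overlap-adding the two inverse transforms reproduces the samples of the shared region, because the aliasing terms of the two blocks have opposite signs and cancel. A mode using two shorter windows per frame would apply the same functions to shorter blocks.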

The embodiments of an audio decoder described above were primarily designed to take advantage of an audio encoder that operates in different operating modes, namely one that changes the selection among the frame coding modes between these operating modes to the extent that time-domain frame coding modes are not selected in one of these operating modes but only in the other. It should be noted, however, that the embodiments of an audio encoder described below would also, at least as far as a subset of these embodiments is concerned, fit an audio decoder that does not support different operating modes. This is true at least for those encoder embodiments according to which the data stream generation does not change between these operating modes. In other words, in accordance with some of the audio encoder embodiments described below, the restriction of the selection of frame coding modes to frequency-domain coding modes in one of the operating modes is not reflected in the data stream itself, so that operating mode changes are transparent in this respect (apart from the absence of time-domain frame coding modes while that operating mode is active). However, the specifically dedicated audio decoders according to the various embodiments outlined above form, together with the respective audio encoder embodiments, audio codecs that additionally take advantage of the restricted frame coding mode selection during a special operating mode which corresponds, as outlined above, to special transmission conditions, for example.

Fig. 5 shows an audio encoder according to an embodiment of the present invention. The audio encoder of Fig. 5 is generally indicated at 100 and comprises an associator 102, a time-domain encoder 104 and a frequency-domain encoder 106, with the associator 102 being connected between an input 108 of the audio encoder 100 on the one hand and the inputs of the time-domain encoder 104 and the frequency-domain encoder 106 on the other hand. The outputs of the time-domain encoder 104 and the frequency-domain encoder 106 are connected to an output 110 of the audio encoder 100. Accordingly, the audio signal to be encoded, indicated at 112 in Fig. 5, enters input 108, and the audio encoder 100 is configured to form a data stream 114 from it.

The associator 102 is configured to associate each of consecutive portions 116a to 116c of the audio signal 112, which correspond to the previously described portions 24, with one of a mode dependent set of a plurality of frame coding modes (see 40 and 42 in Figs. 1 to 4).

The time-domain encoder 104 is configured to encode portions 116a to 116c associated with one of a first subset 30 of one or more of the plurality of frame coding modes into a corresponding frame 118a to 118c of the data stream 114. The frequency-domain encoder 106 is likewise responsible for encoding portions associated with any frequency-domain coding mode of set 32 into a corresponding frame 118a to 118c of the data stream 114.

The associator 102 is configured to operate in an active one of a plurality of operating modes. More precisely, the associator 102 is configured such that exactly one of the plurality of operating modes is active at any time, but the selection of the active operating mode may change while the portions 116a to 116c of the audio signal 112 are encoded sequentially.

In particular, the associator 102 is configured such that, if the active operating mode is a first operating mode, the mode dependent set behaves like set 40 of Fig. 1, namely it is disjoint from the first subset 30 and overlaps the second subset 32, whereas, if the active operating mode is a second operating mode, the mode dependent set of the plurality of frame coding modes behaves like set 42 of Fig. 1, i.e. it overlaps the first and the second subsets 30 and 32.

As outlined above, the functionality of the audio encoder of Fig. 5 makes it possible to control the encoder 100 externally such that it is prevented from disadvantageously selecting any time-domain frame coding mode when external conditions, such as the transmission conditions, are such that selecting any time-domain frame coding mode would yield a lower coding efficiency in terms of rate/distortion ratio than restricting the selection to frequency-domain frame coding modes only. As shown in Fig. 5, the associator 102 may, for example, be configured to receive an external control signal 120. The associator 102 may, for example, be connected to some external entity such that the external control signal 120 provided by the external entity is indicative of the available transmission bandwidth for transmitting the data stream 114. This external entity may, for example, be part of an underlying lower transmission layer, i.e. lower with respect to the Open Systems Interconnection (OSI) layer model. For example, the external entity may be part of an LTE communication network. The control signal 120 may, naturally, be provided on the basis of an estimate of the actually available transmission bandwidth or an estimate of the mean future available transmission bandwidth. As already noted above with respect to Figs. 1 to 4, the "first operating mode" may be associated with available transmission bandwidths below a certain threshold, whereas the "second operating mode" may be associated with available transmission bandwidths exceeding that threshold. As a result, the encoder 100 is prevented from choosing any time-domain frame coding mode under inappropriate conditions in which time-domain coding is very likely to yield less efficient compression, namely when the available transmission bandwidth is below the threshold.
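The external control described above can be sketched as a threshold decision followed by a restricted mode choice. Everything concrete here is an assumption made for illustration: the bandwidth unit, the mode labels, the contents of the two mode dependent sets and the use of a per-mode cost (for example a rate/distortion measure computed elsewhere) are not specified by the source.

```python
# Hypothetical mode dependent sets of frame coding modes.
MODE_DEPENDENT_SET = {
    "first": ("FD_A", "FD_B"),      # no time-domain mode available
    "second": ("TD_CELP", "FD_A"),  # time-domain mode available
}


def select_operating_mode(available_kbps, threshold_kbps):
    """Derive the active operating mode from the control signal."""
    return "first" if available_kbps < threshold_kbps else "second"


def choose_frame_mode(costs, operating_mode):
    """Pick the cheapest frame coding mode among those currently allowed.

    costs -- mapping from frame coding mode to a cost, e.g. a
             rate/distortion measure (how it is computed is out of scope)
    """
    allowed = MODE_DEPENDENT_SET[operating_mode]
    return min(allowed, key=lambda mode: costs[mode])
```

Even if a time-domain mode has the lowest nominal cost for a portion, it is never selected while the first operating mode is active, because it is not a member of the allowed set.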

However, the control signal 120 may also be provided by some other entity, such as a speech detector that analyzes the audio signal 112 to be reconstructed in order to distinguish between speech phases, i.e. time intervals during which a speech component is predominant within the audio signal 112, and non-speech phases, in which other audio sources such as music are predominant within the audio signal 112. The control signal 120 may be indicative of this change between speech and non-speech phases, and the associator 102 may be configured to change between the operating modes accordingly. For example, the associator 102 may enter the above-mentioned "second operating mode" in speech phases, while the "first operating mode" may be associated with non-speech phases, reflecting the fact that choosing time-domain frame coding modes during non-speech phases is very likely to result in less efficient compression.

The associator 102 may be configured to encode a frame mode syntax element 122 (compare syntax element 38 in Fig. 1) into the data stream 114 so as to indicate, for each portion 116a to 116c, which frame coding mode of the plurality of frame coding modes the respective portion is associated with. The insertion of this frame mode syntax element 122 into the data stream 114 need not depend on the operating mode in the way that yields the data stream 20 with the frame mode syntax elements 38 of Figs. 1 to 4. As already noted above, the generation of the data stream 114 may be performed independently of the currently active operating mode.

In terms of bitrate overhead, however, it is preferable for the audio encoder 100 of Fig. 5 to generate the data stream 114 so as to yield the data stream 20 discussed above with respect to the embodiments of Figs. 1 to 4, according to which the data stream generation is advantageously adapted to the currently active operating mode.

Accordingly, in an embodiment of the audio encoder 100 of Fig. 5 that fits the audio decoder embodiments described above with respect to Figs. 1 to 4, the associator 102 may be configured to encode the frame mode syntax element 122 into the data stream 114 using the bijective mapping 52 between the set 46 of possible values of the frame mode syntax element 122 associated with a respective portion 116a to 116c on the one hand and the mode dependent set of frame coding modes on the other hand, with the bijective mapping 52 changing depending on the active operating mode. In particular, the change may be such that, if the active operating mode is the first operating mode, the mode dependent set behaves like set 40, i.e. it is disjoint from the first subset 30 and overlaps the second subset 32, whereas, if the active operating mode is the second operating mode, the mode dependent set behaves like set 42, i.e. it overlaps both the first and the second subsets 30 and 32. In particular, as already noted above, the number of possible values in set 46 may be two, irrespective of whether the active operating mode is the first or the second operating mode, and the associator 102 may be configured such that, if the active operating mode is the first operating mode, the mode dependent set comprises frequency-domain frame coding modes A and B, and the frequency-domain encoder 106 may be configured to use different time-frequency resolutions in encoding the respective portions 116a to 116c depending on whether their frame coding mode is mode A or mode B.

Fig. 6 shows an embodiment of a possible implementation of the time-domain encoder 104 and the frequency-domain encoder 106, corresponding to the fact already noted above that code-excited linear-prediction coding may be used for the time-domain frame coding mode, while transform coded excitation linear-prediction coding is used for the frequency-domain coding modes. Accordingly, in Fig. 6 the time-domain encoder 104 is a code-excited linear-prediction encoder, and the frequency-domain encoder 106 is a transform encoder configured to encode the portions associated with any frequency-domain frame coding mode using transform coefficient levels and to encode these into the corresponding frames 118a to 118c of the data stream 114.

Reference is made to FIG. 6 to explain a possible implementation of the time-domain encoder 104 and the frequency-domain encoder 106. According to FIG. 6, the frequency-domain encoder 106 and the time-domain encoder 104 share a linear predictive coding (LPC) analyzer 130. It should be noted, however, that this circumstance is not critical to the present embodiment, and that a different implementation may also be used in which the two encoders 104 and 106 are completely separate from each other. Moreover, with regard to the encoder embodiments as well as the decoder embodiments described above with respect to FIGS. 1 to 4, it should be noted that the present invention is not restricted to cases where both kinds of coding modes, i.e. the frequency-domain frame coding modes as well as the time-domain frame coding modes, are linear prediction based. Rather, encoder and decoder embodiments are also applicable to other cases in which either the time-domain coding or the frequency-domain coding is implemented in a different manner.

Returning to FIG. 6, the frequency-domain encoder 106 of FIG. 6 comprises, besides the LPC analyzer 130, a transformer 132, an LPC-to-frequency-domain weighting converter 134, a frequency-domain noise shaper 136 and a quantizer 138. The transformer 132, the frequency-domain noise shaper 136 and the quantizer 138 are serially connected between a common input 140 and an output 142 of the frequency-domain encoder 106. The LPC converter 134 is connected between the output of the LPC analyzer 130 and a weighting input of the frequency-domain noise shaper 136. The input of the LPC analyzer 130 is connected to the common input 140.

The time-domain encoder 104 comprises, besides the LPC analyzer 130, a linear prediction (LP) analysis filter 144 and a code-based excitation signal approximator 146, both serially connected between the common input 140 and an output 148 of the time-domain encoder 104. The linear prediction coefficient input of the LP analysis filter 144 is connected to the output of the LPC analyzer 130.

In encoding the audio signal 112 entering at input 140, the LPC analyzer 130 continuously determines linear prediction coefficients for each portion 116a to 116c of the audio signal 112. The LPC determination may involve determining autocorrelations of consecutive, overlapping or non-overlapping, windowed portions of the audio signal and performing LPC estimation on the resulting autocorrelations (optionally after subjecting the autocorrelations to lag windowing), for example using a (Wiener-)Levinson-Durbin algorithm, a Schur algorithm or another algorithm.
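
As a rough illustration of the LPC determination named above, the following Python sketch computes autocorrelations of one portion and runs the Levinson-Durbin recursion on them. It is a textbook formulation, not the analyzer 130 itself: windowing and lag windowing are omitted, and the function names and the convention that the prediction-error filter is A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p are assumptions of this sketch.

```python
def autocorrelation(x, order):
    """Autocorrelation r[0..order] of one (already windowed) portion x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for the prediction-error filter coefficients.

    Returns (a, err) with a[0] == 1.0 and err the residual prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input (e.g. silence): keep remaining coefficients at zero
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err  # reflection coefficient of stage m
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + k_m * prev[m - k]
        a[m] = k_m
        err *= 1.0 - k_m * k_m
    return a, err
```

For autocorrelations of a first-order process, for example `r = [1.0, 0.9, 0.81]`, the recursion yields a first coefficient close to -0.9 and a second coefficient close to zero.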

As described with respect to FIGS. 3 and 4, the LPC analyzer 130 need not signal the linear prediction coefficients within the data stream 114 at an LPC transmission rate equal to the frame rate of the frames 118a to 118c. A rate even higher than that may also be used. In general, the LPC analyzer 130 may determine the LPC information 60 and 76 at an LPC determination rate defined, for example, by the rate of the autocorrelations described above, on the basis of which the LPCs are determined. The LPC analyzer 130 may then insert the LPC information 60 and 76 into the data stream at an LPC transmission rate that may be lower than the LPC determination rate, and the time-domain and frequency-domain encoders 104 and 106, in turn, may apply the linear prediction coefficients while updating them at an LPC application rate higher than the LPC transmission rate, by interpolating the transmitted LPC information 60 and 76 within the frames 118a to 118c of the data stream 114. In particular, since the frequency-domain encoder 106 and the frequency-domain decoder apply the LPC coefficients once per transform, the LPC application rate within frequency-domain frames may be lower than the rate at which the LPC coefficients applied in the time-domain encoder/decoder are adapted or updated by interpolation from the LPC transmission rate. Since the interpolation may also be performed, synchronously, at the decoding side, the same LPC coefficients are available to the time-domain and frequency-domain encoders on the one hand and the time-domain and frequency-domain decoders on the other hand. In any case, the LPC analyzer 130 determines linear prediction coefficients for the audio signal 112 at some LPC determination rate equal to or higher than the frame rate, and inserts them into the data stream at an LPC transmission rate that may be equal to or lower than the LPC determination rate. The LP analysis filter 144, however, may interpolate so as to update the LP analysis filter at an LPC application rate higher than the LPC transmission rate. The LPC converter 134 may or may not perform interpolation so as to determine LPC coefficients for each transform or each LPC-to-spectral-weighting conversion needed. For transmission, the LPC coefficients may be subject to quantization in an appropriate domain, such as the line spectral frequency (LSF) / line spectral pair (LSP) domain.
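
The relation between transmission rate and application rate can be sketched as follows. The text does not fix an interpolation rule, so linear interpolation between two transmitted coefficient sets in the LSF domain is an assumption here, as are the function name `interpolate_lsf` and the notion of equally spaced sub-frames.

```python
def interpolate_lsf(prev_lsf, curr_lsf, num_subframes):
    """Derive one LSF vector per sub-frame from two transmitted LSF vectors.

    Assumption: plain linear interpolation, with the last sub-frame using the
    currently transmitted vector unchanged.
    """
    vectors = []
    for s in range(1, num_subframes + 1):
        w = s / num_subframes  # weight of the current transmitted set
        vectors.append([(1.0 - w) * p + w * c for p, c in zip(prev_lsf, curr_lsf)])
    return vectors
```

Since encoder and decoder both see only the transmitted sets, running the same interpolation on both sides gives them identical coefficients at the higher application rate, which is the synchronization property the paragraph above relies on.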

The time-domain encoder 104 may operate as follows. The LP analysis filter 144 may filter the time-domain coding mode portions of the audio signal 112 depending on the linear prediction coefficients output by the LPC analyzer 130. At the output of the LP analysis filter 144, an excitation signal 150 thus results. The excitation signal is approximated by the approximator 146. In particular, the approximator 146 sets a code, such as codebook indices or other parameters, to approximate the excitation signal 150, for example by minimizing or maximizing some measure defined, for example, by a deviation between the excitation signal 150 on the one hand and the synthetically generated excitation signal as defined by the codebook index 66 on the other hand, in the synthesized domain, i.e. after applying the respective synthesis filter according to the LPCs onto the respective excitation signals. The measure may optionally be perceptually weighted so as to emphasize deviations in perceptually more relevant frequency bands. The innovation excitation determined by the code set by the approximator 146 may be called the innovation parameter.
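
The analysis-by-synthesis idea in this paragraph can be sketched in Python as follows. The sketch assumes a tiny fixed codebook, a single gain per code vector, zero filter memory and an unweighted squared error as the measure; none of these choices is prescribed by the text, and the function names are hypothetical.

```python
def lp_analysis_filter(x, a):
    """Excitation e[n] = sum_k a[k] * x[n-k], with a[0] == 1 and zero initial state."""
    return [
        sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
        for n in range(len(x))
    ]


def lp_synthesis_filter(e, a):
    """Inverse of the analysis filter: y[n] = e[n] - sum_{k>=1} a[k] * y[n-k]."""
    y = []
    for n in range(len(e)):
        y.append(e[n] - sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0))
    return y


def search_codebook(excitation, codebook, a):
    """Return (index, gain) minimizing the squared error in the synthesized domain."""
    target = lp_synthesis_filter(excitation, a)
    best = None
    for index, code in enumerate(codebook):
        syn = lp_synthesis_filter(code, a)
        energy = sum(v * v for v in syn)
        if energy == 0.0:
            continue  # an all-zero code vector cannot be scaled to match anything
        gain = sum(t * s for t, s in zip(target, syn)) / energy
        error = sum((t - gain * s) ** 2 for t, s in zip(target, syn))
        if best is None or error < best[0]:
            best = (error, index, gain)
    return best[1], best[2]
```

Comparing in the synthesized domain rather than on the excitation itself means that the error is judged on the signal the decoder would actually reconstruct.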

Thus, the approximator 146 may output one or more innovation parameters per portion in the time-domain frame coding mode so as to be inserted into the corresponding frames having the time-domain coding mode associated with them via, for example, the frame mode syntax element 122. The frequency-domain encoder 106, in turn, may operate as follows. The transformer 132 transforms the frequency-domain portions of the audio signal 112 using, for example, a lapped transform so as to obtain one or more spectra per portion. The resulting spectrogram at the output of the transformer 132 enters the frequency-domain noise shaper 136, which shapes the sequence of spectra representing the spectrogram in accordance with the LPCs. To this end, the LPC converter 134 converts the linear prediction coefficients of the LPC analyzer 130 into frequency-domain weighting values so as to spectrally weight the spectra. This time, the spectral weighting is performed such that the transfer function of an LP analysis filter results. That is, an odd discrete Fourier transform (ODFT), for example, may be used to convert the LPC coefficients into spectral weights by which the spectra output by the transformer 132 may then be divided, whereas multiplication is used at the decoder side.
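
A minimal Python sketch of this LPC-to-spectral-weight conversion follows. It evaluates A(z) directly at odd frequencies rather than through a fast transform, takes the weight of each bin to be 1/|A|, and ignores any perceptual weighting of the coefficients; these are simplifying assumptions, and the function names are hypothetical.

```python
import cmath
import math


def lpc_to_weights(a, num_bins):
    """Sample 1/|A(e^{jw})| at the odd frequencies w_k = pi * (k + 0.5) / num_bins.

    Assumes A(z) has no zero on the unit circle, which holds for the
    minimum-phase filters produced by a Levinson-Durbin recursion.
    """
    weights = []
    for k in range(num_bins):
        w = math.pi * (k + 0.5) / num_bins
        a_of_w = sum(a[n] * cmath.exp(-1j * w * n) for n in range(len(a)))
        weights.append(1.0 / abs(a_of_w))
    return weights


def shape_at_encoder(spectrum, weights):
    """Divide by the weights, i.e. apply the magnitude response of A(z)."""
    return [c / g for c, g in zip(spectrum, weights)]


def shape_at_decoder(spectrum, weights):
    """Multiply by the same weights to undo the encoder-side shaping."""
    return [c * g for c, g in zip(spectrum, weights)]
```

Dividing at the encoder and multiplying at the decoder are exact inverses, so apart from quantization the decoder recovers the transformer output, while the quantization noise takes on the spectral shape of the weights.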

Thereafter, the quantizer 138 quantizes the resulting excitation spectrum output by the frequency-domain noise shaper 136 into transform coefficient levels 60 for insertion into the corresponding frames of the data stream 114.
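
The quantization into transform coefficient levels can be pictured with the following sketch. The text does not specify the quantizer, so a uniform scalar quantizer with a single step size is assumed here purely for illustration; `quantize` and `dequantize` are hypothetical names.

```python
import math


def quantize(spectrum, step):
    """Map each coefficient to an integer level (round half up)."""
    return [int(math.floor(c / step + 0.5)) for c in spectrum]


def dequantize(levels, step):
    """Decoder-side reconstruction of the coefficients from the levels."""
    return [q * step for q in levels]
```

A larger step size yields smaller integer levels and therefore fewer bits, at the cost of a larger reconstruction error.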

In accordance with the embodiments described above, an embodiment of the present invention may be derived by modifying the unified speech and audio coding (USAC) codec discussed in the introductory portion of the specification of the present application, namely by modifying the USAC encoder to operate in different operating modes so as to refrain from choosing the algebraic code-excited linear prediction (ACELP) mode in the case of a certain one of the operating modes. To enable the achievement of a low delay, the USAC codec may further be modified in the following way. For example, irrespective of the operating mode, transform coded excitation (TCX) and ACELP frame coding modes may be used. To achieve low delay, the frame length may be reduced so as to reach a framing of 20 milliseconds. In particular, in rendering a USAC codec more efficient in accordance with the above embodiments, the operating modes of USAC, namely narrowband, wideband and super-wideband, may be amended such that merely a proper subset of the overall available frame coding modes is available within the individual operating modes, in accordance with the table explained below.

As becomes clear from the above table, in the embodiments described above the decoder's operating mode need not be determined exclusively from an external signal or exclusively from the data stream, but may be determined on the basis of a combination of both. For example, in the above table, the data stream may indicate to the decoder a main mode, i.e. narrowband, wideband, super-wideband or fullband, by way of a coarse operating mode syntax element that is present in the data stream at some rate that may be lower than the frame rate. The encoder inserts this syntax element in addition to the syntax elements 38. The exact operating mode, however, may necessitate inspecting an additional external signal indicating the available bitrate. In the case of super-wideband, for example, the exact mode depends on whether the available bitrate lies below 48 kbps, is at least 48 kbps but below 96 kbps, or is at least 96 kbps.
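
How a decoder might combine the coarse syntax element with the external bitrate signal can be sketched as follows. Only the super-wideband thresholds of 48 and 96 kbps are taken from the text; the treatment of the exact boundary values, the string labels and the function name `refine_operating_mode` are assumptions, and the other main modes are simply passed through because the text does not state their thresholds.

```python
def refine_operating_mode(coarse_mode, available_kbps):
    """Combine the coarse mode from the data stream with the external bitrate signal.

    Returns a (coarse_mode, bitrate_range) pair; bitrate_range is None where the
    text gives no thresholds.
    """
    if coarse_mode != "SWB":
        return (coarse_mode, None)
    if available_kbps < 48:
        return ("SWB", "below 48 kbps")
    if available_kbps < 96:
        return ("SWB", "48 kbps to below 96 kbps")
    return ("SWB", "96 kbps and above")
```

The returned pair would then select which set of frame coding modes, and hence which bijective mapping, applies to the frames that follow.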

With regard to the above embodiments, it should be noted that although, in accordance with alternative embodiments, the set of all frame coding modes with which the frames or time portions of the information signal may be associated consists exclusively of time-domain or frequency-domain frame coding modes, this may be different, so that there may also be one or more frame coding modes that are neither time-domain nor frequency-domain coding modes.

Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus. Some or all of the method steps may be executed by (or using) a hardware apparatus such as, for example, a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be executed by such an apparatus.

Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a CD, a ROM, a PROM, an EPROM, an EEPROM or a flash memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer readable.

Some embodiments according to the invention comprise a non-transitory data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system such that one of the methods described herein is performed.

Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may, for example, be stored on a machine-readable carrier.

Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine-readable carrier.

In other words, an embodiment of the inventive method is therefore a computer program having a program code for performing one of the methods described herein when the computer program runs on a computer.

A further embodiment of the inventive method is therefore a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein. The data carrier, the digital storage medium or the recorded medium is typically tangible or non-transitory.

A further embodiment of the inventive method is therefore a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may, for example, be configured to be transferred via a data communication connection, for example via the Internet.

A further embodiment comprises a processing means, for example a computer or a programmable logic device, configured or adapted to perform one of the methods described herein.

A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.

A further embodiment according to the invention comprises an apparatus or a system configured to transfer (for example, electronically or optically) a computer program for performing one of the methods described herein to a receiver. The receiver may, for example, be a computer, a mobile device, a memory device or the like. The apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.

In some embodiments, a programmable logic device (for example, a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are preferably performed by any hardware apparatus.

The above-described embodiments are merely illustrative of the principles of the present invention. It is understood that modifications and variations of the arrangements and the details described herein will be apparent to others skilled in the art. It is the intent, therefore, to be limited only by the scope of the appended patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.

10: Audio decoder
12: Time-domain decoder
14: Frequency-domain decoder
16: Associator
18a-18c: Frames
20: Data stream
24a-c: Portions of the audio signal
26: Audio signal
30: First subset
32: Second subset
34: Combiner
38: Frame mode syntax element
40, 42: Mode-dependent sets
46: Set of possible values
52: Bijective mapping
62: Linear prediction synthesis filter
64: Excitation signal constructor
66: Codebook index
68: Excitation signal
70: Frequency-domain noise shaper
72: Retransformer
74: Transform coefficient levels
76: Linear prediction filter coefficients
100: Audio encoder
102: Associator
104: Time-domain encoder
106: Frequency-domain encoder
108: Input of the audio encoder
114: Data stream
120: External control signal
122: Frame mode syntax element
130: Linear predictive coding (LPC) analyzer
132: Transformer
134: LPC-to-frequency-domain weighting converter
136: Frequency-domain noise shaper
138: Quantizer
140: Input of the frequency-domain encoder
142: Output of the frequency-domain encoder
144: Linear prediction analysis filter
146: Code-based excitation signal approximator
150: Excitation signal

Claims

An audio encoder comprising:
a time-domain encoder (104);
a frequency-domain encoder (106); and
an associator (102) configured to associate each of consecutive portions (116a-c) of an audio signal (112) with one of a set of one or more associable frame coding modes of a plurality of frame coding modes (22),
wherein the time-domain encoder (104) is configured to encode portions having one of a first subset (30) of one or more of the plurality of frame coding modes (22) associated therewith into a corresponding frame (118a-c) of a data stream (114), and the frequency-domain encoder (106) is configured to encode portions having one of a second subset (32) of one or more of the plurality of frame coding modes associated therewith into a corresponding frame of the data stream,
wherein the associator (102) is configured to operate in an active one of a plurality of operating modes such that, if the active operating mode is a first operating mode, the set of one or more associable frame coding modes of the plurality of frame coding modes is disjoint from the first subset (30) and overlaps with the second subset (32), and, if the active operating mode is a second operating mode, the set of one or more associable frame coding modes of the plurality of frame coding modes overlaps with the first and second subsets (30, 32),
wherein the associator (102) is configured to encode, for each portion, a frame mode syntax element (122) into the data stream (114) so as to indicate which frame coding mode of the plurality of frame coding modes the respective portion is associated with, and
wherein the associator (102) is configured to encode the frame mode syntax element (122) into the data stream (114) using a bijective mapping between a set of possible values of the frame mode syntax element associated with the respective portion on the one hand and the set of one or more associable frame coding modes on the other hand, the bijective mapping changing depending on the active operating mode.

The audio encoder of claim 1, wherein the associator (102) is configured such that, if the active operating mode is the first operating mode, the set of associable frame coding modes of the plurality of frame coding modes is disjoint from the first subset (30) and overlaps with the second subset (32), and, if the active operating mode is the second operating mode, the set of associable frame coding modes of the plurality of frame coding modes overlaps with the first and second subsets (30, 32).

The audio encoder of claim 1, wherein the number of possible values in the set of possible values is two, and the associator (102) is configured such that, if the active operating mode is the first operating mode, the set of associable frame coding modes comprises a first and a second frame coding mode of the second subset of frame coding modes, and wherein the frequency-domain encoder is configured to use different time-frequency resolutions in encoding portions having the first and second frame coding modes associated therewith.

The audio encoder of claim 1, wherein the time-domain encoder (104) is a code-excited linear prediction encoder.

The audio encoder of claim 1, wherein the frequency-domain encoder (106) is a transform encoder configured to encode the portions having one of the second subset associated therewith using transform coefficient levels, and to encode these into the corresponding frames of the data stream.

The audio encoder of claim 1, wherein the time-domain encoder and the frequency-domain encoder are linear prediction based encoders configured to signal LPC filter coefficients for each portion of the audio signal (112),
wherein the time-domain encoder (104) is configured to apply a linear prediction analysis filter depending on the LPC filter coefficients onto the portions of the audio signal having one of the first subset of one or more of the frame coding modes associated therewith so as to obtain an excitation signal (150), to approximate the excitation signal by use of codebook indices, and to insert these into the corresponding frames, and
wherein the frequency-domain encoder (106) is configured to transform the portions of the audio signal having one of the second subset of one or more of the frame coding modes associated therewith so as to obtain a spectrum, to shape the spectrum according to the LPC filter coefficients for the portions having one of the second subset associated therewith so as to obtain an excitation spectrum, to quantize the excitation spectrum into transform coefficient levels within the frames having one of the second subset associated therewith, and to insert the quantized excitation spectrum into the corresponding frames.

A method of audio encoding using a time-domain encoder (104) and a frequency-domain encoder (106), the method comprising:
associating each of consecutive portions (116a-c) of an audio signal (112) with one of a set of associable frame coding modes of a plurality of frame coding modes (22);
encoding, by the time-domain encoder (104), portions having one of a first subset (30) of one or more of the plurality of frame coding modes (22) associated therewith into a corresponding frame (118a-c) of a data stream (114); and
encoding, by the frequency-domain encoder (106), portions having one of a second subset (32) of one or more of the plurality of frame coding modes (22) associated therewith into a corresponding frame of the data stream (114),
wherein the associating is performed in an active one of a plurality of operating modes such that, if the active operating mode is a first operating mode, the set of associable frame coding modes of the plurality of frame coding modes is disjoint from the first subset (30) and overlaps with the second subset (32), and, if the active operating mode is a second operating mode, the set of associable frame coding modes of the plurality of frame coding modes overlaps with the first and second subsets (30, 32),
wherein the method further comprises encoding, for each portion, a frame mode syntax element (122) into the data stream (114) so as to indicate which frame coding mode of the plurality of frame coding modes the respective portion is associated with, and
wherein the frame mode syntax element (122) is encoded into the data stream (114) using a bijective mapping between a set of possible values of the frame mode syntax element associated with the respective portion on the one hand and the set of one or more associable frame coding modes on the other hand, the bijective mapping changing depending on the active operating mode.

A computer program having program code for executing the method according to claim 7 when running on a computer.