EP1398760A1 - Signalling of window switchings in an MPEG Layer 3 audio data stream - Google Patents
Signalling of window switchings in an MPEG Layer 3 audio data stream
- Publication number
- EP1398760A1
- Authority
- EP
- European Patent Office
- Prior art keywords
- subbands
- window
- information
- window forms
- block
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0204—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
- G10L19/0208—Subband vocoders
Definitions
- The invention relates to a method and to an apparatus for encoding or decoding an audio signal that is processed using multiple subbands and overlapping window functions, and using extended subband signal window switching configurations.
- A cosine or Fourier transformation is used for generating spectral coefficients from time domain input samples.
- The coefficients are coded, thereby removing redundancy and irrelevancy.
- The coded coefficients are decoded and inversely transformed into time domain samples.
- The lengths of the transformation blocks are switched from long to short, and vice versa, depending on the current characteristics of the input signal, in order to mask pre-echoes and reduce audible noise arising in blocks with a more or less silent period before a sudden increase of the input signal amplitude.
- Transformation block length switching is also used in ISO/IEC 11172-3 (MPEG-1 Audio Layer 3), in ISO/IEC 13818-3 (MPEG-2 Audio Layer 3) and in AAC (Advanced Audio Coding).
- The transform block length switching information, or window length switching information, is transmitted within the overhead (between 'main_data_begin' and 'main_data') of the frames of the datastream, using a flag called 'window_switching_flag' for each set of coefficients, called a 'granule'.
- The different layers in MPEG-1 Audio and MPEG-2 Audio, as well as other audio codecs like the Minidisc system, use subband coding/decoding, wherein the total frequency band is split into a predetermined number of subbands, e.g. 32 subbands, or 3 subbands in the case of Minidisc.
- Fig. 2 depicts several subbands SB1 ...
- Real windows/transformation blocks may include between e.g. 12 and 2048 samples at original PCM sampling rates of e.g. 32 kHz, 44.1 kHz or 48 kHz. The windows overlap by e.g. 50%, as shown in Fig. 2.
- The type of transformation can be an MDCT that uses subsampling by a factor of 2 so that, despite the overlap, the overall quantity of coefficients is not increased.
- The window functions shown in Fig. 2 are symbolic only; real window functions have e.g. a sine/cosine, Kaiser-Bessel or Fielder shape.
- In MPEG-1 Audio Layer 3, in MPEG-2 Audio Layer 3 and in Minidisc codecs it is also possible to select, for a given period of the input signal, a different transform block or window length in different subbands.
- The information about which subband or which group of subbands is to use which transformation or window length needs to be included in the datastream for evaluation in the decoder.
- This parameter is called 'mixed_block_flag'; it determines that in the lowest two subbands SB1 and SB2 only long blocks are to be used whereas, in a uniform manner, in the upper 30 subbands the block length will vary between long blocks and short blocks, including transition blocks called start blocks and stop blocks.
- The block or window type is signalled, too, using the 2-bit parameter 'block_type'. If short blocks are used, a block type sequence arises in each case as shown, for instance, for subbands 3 and 4 in Fig. 2: long block (code 0), start block (code 1, having asymmetrical window function halves), 3 short blocks (code 2; at least one short block, generally speaking), stop block (code 3, having asymmetrical window function halves), long block (code 0).
- A problem to be solved by the invention is to provide improved adaptation of the allowable block or window lengths or window forms within the total range of subbands. This problem is solved by the methods disclosed in claims 1 and 5. Apparatuses that utilise these methods are disclosed in claims 9 and 10.
- The corresponding 2-bit value of 'block_type' is sent repeatedly, although the decoder already knows from the occurrence of the parameter 'window_switching_flag' that the above-described sequence of 'start block', 'short window(s)', 'stop block' and 'long window' will follow. Transmitting the changing parameter 'block_type' several times is therefore redundant.
- The superfluous parameter 'block_type' is not sent for block type signalling purposes. Instead, the two corresponding bits are used for signalling differing subband signal window switching configuration types to the decoder.
- These configuration types define in which of the total number of subbands used the window switching is affected by the above parameter 'window_switching_flag' or, conversely, in which of them the window switching is not affected by it.
- These configuration types can further define different fixed subband groups within the total number of subbands that are affected by the parameter 'window_switching_flag'.
- These configuration types can further define variable subband groups within the total number of subbands that are affected by the parameter 'window_switching_flag'. Both alternatives can also be combined.
- The inventive method is suited for encoding an audio signal that is processed using multiple subbands and overlapping window functions into which the signals in the subbands are partitioned, wherein the resulting sample blocks are in each case transformed into corresponding blocks of spectral domain coefficients and are coded using data reduction, and wherein different window forms are used and the information about the window forms used is transmitted, recorded or stored in the side information for the coded coefficients, and wherein upon deciding to process, during a given time period, in a first group of subbands the subband signals at least in part with a given sequence of window forms different from the corresponding sequence of window forms used to process the subband signals in a second group of subbands, additional information about such mixing of window forms is transmitted, recorded or stored in said side information, and wherein following such decision to process in a first group of subbands the subband signals at least in part with a given sequence of window forms different from the corresponding sequence of window forms used to process the subband signals in a second group of subbands, information about the
- The inventive method is suited for decoding an audio signal that was processed using multiple subbands and overlapping window functions into which the signals in the subbands were partitioned, wherein the resulting sample blocks were in each case transformed into corresponding blocks of spectral domain coefficients and were coded using data reduction, and wherein different window forms were used and the information about the window forms used was transmitted, recorded or stored in the side information for the coded coefficients, and wherein upon the decision to process, during a given time period, in a first group of subbands the subband signals at least in part with a given sequence of window forms different from the corresponding sequence of window forms used to process the subband signals in a second group of subbands, additional information about such mixing of window forms was transmitted, recorded or stored in said side information, the decoding including the steps:
- The inventive apparatus for encoding an audio signal includes:
- Stage SAFW carries out subband analysis filtering (i.e. generating the above 32 subband signals), windowing and transformation into the spectral domain.
- Stage ScFCal calculates the scale factors from the spectral coefficients.
- Stage ScFCod codes the scale factors, using side information received from stage BRAdj.
- Stage NQCod carries out normalisation, quantisation and coding of the coefficients from the subbands, using side information from stage BRAdj.
- Stage FrFo performs formatting of the audio frames to be transmitted, recorded or stored.
- Stage FFTA performs, in parallel, an FFT (fast Fourier transform) analysis of the input signal EINP in order to provide a source for psycho-acoustic information.
- The subsequent stage ThCalSD calculates therefrom the masking thresholds and signal/masking ratios, and determines the window switching information required for the subbands. That window switching information is applied in stage SAFW to the subband signals and to the corresponding transformation operations.
- Stage BAllCal calculates the required bit allocation.
- The subsequent stage BRAdj controls the adjustment to the desired fixed bit rate by sending corresponding control signals to stages ScFCod and NQCod. Only one channel of two (stereo) or more channels is depicted, whereby the stages FFTA, ThCalSD, BAllCal and BRAdj are normally used for all channels in common.
- Stage SIDec decodes the side information generated in the encoder and required by the decoder, e.g. scale factor information, bit allocation information, window switching information, normalisation information, quantisation information and threshold information.
- Stage SIDec controls the subsequent stages INQDec and SSFW.
- Stage INQDec performs inverse coding, inverse quantisation and inverse normalisation on the received or replayed coefficients from the subbands.
- Stage SSFW carries out inverse transformation, corresponding window switching and subband synthesis filtering, and provides the output PCM samples. Only one channel of two (stereo) or more channels is depicted.
- The inventive window switching, as indicated by the example in Fig. 2 with subbands 1/2, 3/4 and 31/32, uses differing subband signal window switching configuration types and is applied in stage SAFW in the encoder and in stage SSFW in the decoder.
- The information about the configuration type to be selected is determined in stage ThCalSD, transferred, and evaluated in stage SIDec in the decoder.
- The invention can be used in extended systems based on MPEG-1 Audio Layer 3, MPEG-2 Audio Layer 3, or AAC, for example.
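The signalling idea described above can be illustrated with a minimal Python sketch. It is not taken from the patent text: the block type codes (0 to 3) and the count of 32 subbands come from the description, but the table `CONFIG_TYPES`, the function names and the way unaffected subbands are represented are hypothetical and chosen only for illustration. The description does not state which subband groups belong to which configuration type; only entry 0 mirrors the 'mixed_block_flag' behaviour (lowest two subbands stay long).

```python
from typing import Dict, List

# Block type codes as listed in the description for the 2-bit 'block_type'.
LONG, START, SHORT, STOP = 0, 1, 2, 3

NUM_SUBBANDS = 32  # e.g. 32 subbands, as in MPEG-1/2 Audio

# HYPOTHETICAL mapping: 2-bit configuration type -> subbands (0-based) whose
# window switching follows 'window_switching_flag'. The groupings are
# assumptions made for this sketch, not values from the source.
CONFIG_TYPES: Dict[int, range] = {
    0: range(2, NUM_SUBBANDS),   # lowest two subbands stay long
    1: range(4, NUM_SUBBANDS),   # assumed grouping
    2: range(8, NUM_SUBBANDS),   # assumed grouping
    3: range(0, NUM_SUBBANDS),   # assumed: all subbands switch
}


def switching_sequence(num_short: int = 3) -> List[int]:
    """Block type codes implied once window switching has been signalled:
    long, start, one or more short blocks, stop, long."""
    if num_short < 1:
        raise ValueError("at least one short block is required")
    return [LONG, START] + [SHORT] * num_short + [STOP, LONG]


def affected_subbands(config_type: int) -> List[int]:
    """Subbands that follow 'window_switching_flag' for a configuration type."""
    if config_type not in CONFIG_TYPES:
        raise ValueError("configuration type must fit in 2 bits (0..3)")
    return list(CONFIG_TYPES[config_type])


def decode_window_info(window_switching_flag: bool, two_bits: int,
                       num_short: int = 3) -> Dict[int, List[int]]:
    """Interpret the two former 'block_type' bits as a configuration type.

    Returns, per subband, either the implied switching sequence or [LONG],
    which here simply stands for 'long blocks only'."""
    if not window_switching_flag:
        return {sb: [LONG] for sb in range(NUM_SUBBANDS)}
    switched = set(affected_subbands(two_bits))
    seq = switching_sequence(num_short)
    return {sb: (list(seq) if sb in switched else [LONG])
            for sb in range(NUM_SUBBANDS)}
```

For example, `switching_sequence(1)` yields `[0, 1, 2, 3, 0]`, the shortest sequence the description allows. Because the decoder can derive this sequence from 'window_switching_flag' alone, the two bits are free to select an entry such as those in the assumed `CONFIG_TYPES` table.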
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP20030292035 EP1398760B1 (de) | 2002-08-28 | 2003-08-18 | Signalling of window switchings in an MPEG Layer 3 audio data stream |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP02090308A EP1394772A1 (de) | 2002-08-28 | 2002-08-28 | Signalling of window switchings in an MPEG Layer 3 audio data stream |
EP02090308 | 2002-08-28 | ||
EP20030292035 EP1398760B1 (de) | 2002-08-28 | 2003-08-18 | Signalling of window switchings in an MPEG Layer 3 audio data stream |
Publications (2)
Publication Number | Publication Date |
---|---|
EP1398760A1 (de) | 2004-03-17 |
EP1398760B1 (de) | 2005-04-13 |
Family
ID=31889453
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP20030292035 Expired - Lifetime EP1398760B1 (de) | 2002-08-28 | 2003-08-18 | Signalling of window switchings in an MPEG Layer 3 audio data stream |
Country Status (1)
Country | Link |
---|---|
EP (1) | EP1398760B1 (de) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0998051A2 (de) * | 1998-10-29 | 2000-05-03 | Matsushita Electric Industrial Co., Ltd. | Method for determining and adapting the block size for audio transform coding |
Non-Patent Citations (1)
Title |
---|
INTERNATIONAL STANDARDS ORGANIZATION: "Final text for DIS 11172-3 (rev. 2): Information Technology - Coding of Moving Pictures and Associated Audio for Digital Storage Media - Part 1 - Coding at up to about 1.5 Mbit/s (ISO/IEC JTC 1/SC 29/WG 11 N 0156) [MPEG 92] - Section 3: Audio", CODED REPRESENTATION OF AUDIO, PICTURE MULTIMEDIA AND HYPERMEDIA INFORMATION (TENTATIVE TITLE). APRIL 20, 1992. ISO/IEC JTC 1/SC 29 N 147. FINAL TEXT FOR DIS 11172-1 (REV. 2): INFORMATION TECHNOLOGY - CODING OF MOVING PICTURES AND ASSOCIATED AUDIO FO, 1992, pages III - V,174-337, XP002083108 * |
Also Published As
Publication number | Publication date |
---|---|
EP1398760B1 (de) | 2005-04-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7627480B2 (en) | Support of a multichannel audio extension | |
US6529604B1 (en) | Scalable stereo audio encoding/decoding method and apparatus | |
JP4731774B2 (ja) | Scalable coding method for high-quality audio | |
JP3926399B2 (ja) | Method for signalling noise substitution during audio signal coding | |
US7627482B2 (en) | Methods, storage medium, and apparatus for encoding and decoding sound signals from multiple channels | |
US7181404B2 (en) | Method and apparatus for audio compression | |
EP1394772A1 (de) | Signalling of window switchings in an MPEG Layer 3 audio data stream | |
Sinha et al. | Audio compression at low bit rates using a signal adaptive switched filterbank | |
US20040186735A1 (en) | Encoder programmed to add a data payload to a compressed digital audio frame | |
US20030215013A1 (en) | Audio encoder with adaptive short window grouping | |
KR20030014752A (ko) | Audio coding | |
WO2006000842A1 (en) | Multichannel audio extension | |
KR20010021226A (ko) | Digital acoustic signal coding apparatus, digital acoustic signal coding method, and medium recording a digital acoustic signal coding program | |
KR100955014B1 (ko) | Method and apparatus for encoding and decoding a digital information signal | |
AU729584B2 (en) | Method and device for coding an audio-frequency signal by means of "forward" and "backward" LPC analysis | |
KR100750115B1 (ko) | Method and apparatus for encoding and decoding an audio signal | |
Iwakami et al. | Audio coding using transform‐domain weighted interleave vector quantization (twin VQ) | |
EP1398760B1 (de) | Signalling of window switchings in an MPEG Layer 3 audio data stream | |
Prandoni et al. | Perceptually hidden data transmission over audio signals | |
KR0181488B1 (ko) | MPEG audio decoding apparatus and method using the regularity of the bit allocation table | |
JPH07508375A (ja) | Data reduction method for storage and/or transmission of studio digital audio signals | |
JP2003195896A (ja) | Audio decoding apparatus, decoding method thereof, and storage medium | |
Mandal et al. | Digital Audio Compression | |
JPH07106977A (ja) | Information decoding device | |
EP1341161A2 (de) | Method and apparatus for encoding and decoding a digital information signal | |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR |
AX | Request for extension of the European patent | Extension state: AL LT LV MK |
17P | Request for examination filed | Effective date: 20040327 |
GRAP | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
AKX | Designation fees paid | Designated state(s): DE FR GB IT |
GRAS | Grant fee paid | Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
AK | Designated contracting states | Kind code of ref document: B1; Designated state(s): DE FR GB IT |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: IT; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED; Effective date: 20050413 |
REG | Reference to a national code | Ref country code: GB; Ref legal event code: FG4D |
REG | Reference to a national code | Ref country code: GB; Ref legal event code: 746; Effective date: 20050407 |
REG | Reference to a national code | Ref country code: IE; Ref legal event code: FG4D |
REF | Corresponds to: | Ref document number: 60300500; Country of ref document: DE; Date of ref document: 20050519; Kind code of ref document: P |
RAP2 | Party data changed (patent owner data changed or rights of a patent transferred) | Owner name: THOMSON LICENSING |
PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
ET | FR: translation filed | |
26N | No opposition filed | Effective date: 20060116 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: DE; Payment date: 20070823; Year of fee payment: 5 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: GB; Payment date: 20070803; Year of fee payment: 5 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: FR; Payment date: 20070821; Year of fee payment: 5 |
GBPC | GB: European patent ceased through non-payment of renewal fee | Effective date: 20080818 |
REG | Reference to a national code | Ref country code: FR; Ref legal event code: ST; Effective date: 20090430 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: FR; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20080901. Ref country code: DE; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20090303 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: GB; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20080818 |