WO2003063135A1 - Audio coding method and apparatus using harmonic extraction - Google Patents
- Publication number
- WO2003063135A1 (application PCT/KR2002/002348)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- audio data
- pcm audio
- harmonic components
- pcm
- harmonic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
Definitions
- the present invention relates to a method of compressing an audio signal, and more particularly, to a method of and apparatus for efficiently compressing an audio signal into an MPEG-1 layer-3 audio signal with a low-speed bit rate.
- Moving Picture Experts Group-1 establishes standards regarding digital video compression and digital audio compression and is supported by the International Standardization Organization (ISO).
- ISO International Standardization Organization
- the MPEG-1 audio standard is used to compress 16-bit audio that is sampled at a 44.1 kHz sampling rate and stored on a 60-minute or
- Layer III is the most complex, uses many more filters than layer II, and adopts Huffman coding. Upon encoding at 112 kbps, an excellent-quality sound can be heard. Upon encoding at 128 kbps, a sound nearly the same as the original sound is obtained. Upon encoding at 160 kbps or 192 kbps, an excellent sound that the human ear cannot distinguish from the original sound can be heard.
- MPEG-1 layer-3 audio is referred to as MP3 audio.
- MP3 audio is produced through a discrete cosine transform (DCT), bit allocation based on psycho-acoustic model 2, quantization, and the like.
- DCT discrete cosine transform
- modified DCT is performed using the result of psycho-acoustic model 2.
- MDCT modified DCT
- the ear of a human is the most important. The human ear cannot hear if the intensity of a sound is at or below a predetermined level. If someone talks loudly in the office, it can be easily recognized who is talking. However, if an airplane passes at that moment, the talking cannot be heard. Even after the airplane has passed, the talking still cannot be heard because of a lingering sound.
- data having a volume equal to or greater than a masking threshold is sampled among data having a volume equal to or greater than the minimum audible limit corresponding to when it is quiet.
- the sampling is performed on each sub-band.
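The selection rule described above can be sketched per sub-band as follows. This is a minimal illustration, not the patent's implementation: the function name and the idea of passing both thresholds as per-sub-band lists in dB are assumptions made for the example.

```python
from typing import List


def audible_components(levels_db: List[float],
                       quiet_threshold_db: List[float],
                       masking_threshold_db: List[float]) -> List[bool]:
    """Return, for each sub-band, whether its component is kept.

    A component is kept only if its level is at or above both the minimum
    audible limit in quiet and the masking threshold of that sub-band.
    """
    keep = []
    for level, quiet, mask in zip(levels_db, quiet_threshold_db, masking_threshold_db):
        keep.append(level >= quiet and level >= mask)
    return keep
```

In the office example, the talking corresponds to a sub-band whose level exceeds the quiet limit but falls below the masking threshold raised by the airplane, so it is not kept.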
- the present invention provides a method of effectively processing an audio signal at a low speed by removing a harmonic component from an original signal using a fast Fourier transform (FFT) adopted in psycho-acoustic model 2 and compressing only a transient component using MDCT.
- FFT fast Fourier transform
- Korean Patent Publication No. 1995-022322 discloses a bit allocation method employing a psycho-acoustic model. However, the disclosed method is different from a method of the present invention for increasing compression efficiency by removing a harmonic component from an original signal using the result of an FFT adopted in a psycho-acoustic model.
- Korean Patent Publication No. 1998-072457 discloses a signal processing method and apparatus in the psycho-acoustic model 2, by which the amount of computation is significantly reduced by reducing computation overload while compressing an audio signal.
- the disclosed signal processing method includes a step of obtaining an individual masking boundary value using an FFT result, a step of selecting a global masking boundary value, and a step of shifting to the next frequency position.
- This method is the same as the present invention in that an FFT result value is used but different in that it uses a different quantization method.
- U.S. Patent No. 5,930,373 discloses a method for enhancing the quality of a sound signal using the residue harmonics of a low frequency signal.
- the disclosed method and the quantization method according to the present invention are different in that they use different techniques of using residue harmonics.
- An aspect of the present invention is to provide a method of effectively processing an audio signal at a low bit rate by removing a harmonic component from an original audio signal using the result of a fast Fourier transform (FFT) used in psycho-acoustic model 2 and compressing only the residual transient component using a modified discrete cosine transform (MDCT).
- an audio coding method using harmonic components in which PCM audio data is first received and stored. Then, psycho-acoustic model 2 based on the audible limit characteristics of a human is applied to the stored data to obtain fast Fourier transformation (FFT) result, perceptual energy information regarding received data, and bit allocation information used for quantization. Thereafter, harmonic components are extracted from the received PCM audio data using the FFT result information. Next, the extracted harmonic components are encoded, and the encoded harmonic components are decoded. Then, a MDCT is performed on a number of samples of the received PCM audio data from which the extracted harmonic components are removed, which depends on the value of the perceptual energy information. Thereafter, the MDCTed audio data is quantized by allocating bits according to the bit allocation information. Finally, an audio packet is produced from the quantized, MDCTed audio data and the encoded harmonic components.
- a PCM audio data storage unit receives and stores PCM audio data.
- a psycho-acoustic model 2 performing unit receives the PCM audio data from the PCM audio data storage unit and performs psycho-acoustic model 2 to obtain FFT result information, perceptual energy information regarding received data, and bit allocation information used for quantization.
- a harmonic extraction unit extracts harmonic components from the received PCM audio data using the FFT result information.
- a harmonic encoding unit encodes the extracted harmonic components outputting encoded harmonic components.
- a harmonic decoding unit decodes the encoded harmonic components.
- An MDCT unit performs an MDCT on the stored PCM audio data from which the decoded harmonic components are removed, according to the perceptual energy information.
- a quantization unit quantizes the MDCTed audio data according to the bit allocation information.
- An MPEG layer III bitstream production unit transforms the quantized, MDCTed audio data and the encoded harmonic components output from the harmonic encoding unit into an MPEG audio layer III packet.
- FIG. 1 shows the format of an MPEG-1 layer III audio stream
- FIG. 2 is a block diagram of an apparatus for producing an MPEG-1 layer III audio stream
- FIG. 3 is a flowchart illustrating a computation process in a psycho-acoustic model
- FIG. 4 is a block diagram of an apparatus according to the present invention for producing a low-speed MPEG-1 layer III audio stream
- FIG. 5 is a flowchart illustrating harmonic extraction, harmonic encoding, and harmonic decoding based on psycho-acoustic model 2;
- FIGS. 6A, 6B, 6C, and 6D illustrate harmonic component samples extracted in stages in order to extract harmonic components using an FFT result in psycho-acoustic model 2;
- a moving picture experts group (MPEG)-1 layer III audio stream is composed of audio access units (AAUs) 100.
- the AAU 100 is a minimal unit that can be independently accessed, and compresses and stores data with a fixed number of samples.
- the AAU 100 includes a header 110, a cyclic redundancy check (CRC) 120, audio data 130, and auxiliary data 140.
- CRC cyclic redundancy check
- the header 110 stores a syncword, ID information, layer information, information regarding whether a protection bit exists, bitrate index information, sampling frequency information, information regarding whether a padding bit exists, a private bit, mode information, mode extension information, copyright information, information regarding whether an audio stream is an original one or a copy, and information on emphasis characteristics.
- the CRC 120 is optional.
- the presence or absence of the CRC 120 is defined in the header 110, and the length of the CRC 120 is 16 bits.
- the audio data 130 is a portion into which compressed audio data is inserted.
- the auxiliary data 140 is data which is filled into a space remaining when the end of the audio data 130 does not reach the end of an AAU.
- Arbitrary data other than MPEG audio can be inserted into the auxiliary data 140.
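The header fields listed above can be unpacked from the first four bytes of an AAU as sketched below. The field widths are an assumption taken from the MPEG-1 audio frame header layout (32 bits in total) and do not appear in the text above; the names `AauHeader`, `FIELD_WIDTHS`, `parse_header`, and `has_crc` are hypothetical.

```python
from dataclasses import dataclass

# (field name, width in bits), most significant field first.
# Widths are assumed from the MPEG-1 audio header layout; they sum to 32.
FIELD_WIDTHS = [
    ("syncword", 12), ("id_bit", 1), ("layer", 2), ("protection_bit", 1),
    ("bitrate_index", 4), ("sampling_frequency", 2), ("padding_bit", 1),
    ("private_bit", 1), ("mode", 2), ("mode_extension", 2),
    ("copyright", 1), ("original_or_copy", 1), ("emphasis", 2),
]


@dataclass
class AauHeader:
    syncword: int
    id_bit: int
    layer: int
    protection_bit: int
    bitrate_index: int
    sampling_frequency: int
    padding_bit: int
    private_bit: int
    mode: int
    mode_extension: int
    copyright: int
    original_or_copy: int
    emphasis: int


def parse_header(data: bytes) -> AauHeader:
    """Split the first 32 bits of an AAU into the header fields."""
    if len(data) < 4:
        raise ValueError("an AAU header needs at least 4 bytes")
    word = int.from_bytes(data[:4], "big")
    remaining = 32
    values = {}
    for name, width in FIELD_WIDTHS:
        remaining -= width
        values[name] = (word >> remaining) & ((1 << width) - 1)
    return AauHeader(**values)


def has_crc(header: AauHeader) -> bool:
    """In MPEG-1 audio, a protection bit of 0 signals that the 16-bit CRC follows."""
    return header.protection_bit == 0
```

For example, `parse_header(bytes.fromhex("fffb9000"))` yields a syncword of `0xFFF`, a bitrate index of 9, and a protection bit of 1, so no CRC follows the header.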
- FIG. 2 is a block diagram of an apparatus for producing an MPEG-1 layer III audio stream.
- a pulse code modulation (PCM) audio signal input unit 210 has a buffer in which PCM audio data is stored.
- the PCM audio signal input unit 210 receives, as the PCM audio data, granules, each composed of 576 samples.
- a psycho-acoustic model 2 performing unit 220 receives the PCM audio data from the buffer of the PCM audio signal input unit 210 and performs psycho-acoustic model 2.
- a discrete cosine transforming (DCT) unit 230 receives the PCM audio data in units of granules and performs a DCT operation at the same time when psycho-acoustic model 2 is performed.
- a modified DCT (MDCT) unit 240 performs an MDCT using the result of the application of psycho-acoustic model 2 and the result of the DCT performed by the DCT unit 230. If perceptual energy is greater than a predetermined threshold, the MDCT is performed using a short window. If the perceptual energy is smaller than the predetermined threshold, the MDCT is performed using a long window.
- In perceptual coding, which is an audio signal compression technique, a reproduced signal differs from the original signal. That is, detailed information that people cannot perceive, owing to the characteristics of the human ear, can be omitted.
- Perceptual energy denotes energy that a human can perceive.
- a quantization unit 250 performs quantization using bit allocation information generated as a result of the application of psycho-acoustic model 2 and using the result of the MDCT operation.
- An MPEG-1 layer III bitstream producing unit 260 transforms the quantized data into data to be inserted into an audio data area of an MPEG-1 bitstream, using Huffman coding.
- FIG. 3 is a flowchart illustrating a computation process in a psycho-acoustic model.
- PCM audio data is received in granules, each composed of 576 samples, in step 310.
- long windows, each composed of 1024 samples, or short windows, each composed of 256 samples, are formed using the received PCM audio data, in step 320. That is, one packet is constituted of multiple samples.
- In step 330, a fast Fourier transform (FFT) is performed, one window at a time, on the windows formed in step 320. Then, psycho-acoustic model 2 is applied in step 340.
- a perceptual energy value is obtained through the application of psycho-acoustic model 2 and is supplied to an MDCT unit, which selects the window to be applied.
- a signal to masking ratio (SMR) value for each threshold bandwidth is calculated and applied to a quantization unit to determine the number of bits to be allocated.
- MDCT and quantization are performed using the perceptual energy value and the SMR value, in step 360.
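The per-window transform and the signal to masking ratio used in FIG. 3 can be sketched as follows. This is an illustration under stated assumptions, not psycho-acoustic model 2 itself: a naive DFT stands in for the FFT (it gives the same values, only slower), the masking thresholds are taken as given inputs, and the mapping from SMR to bits uses the general rule of thumb of about 6.02 dB of signal-to-noise ratio per quantization bit, which the text does not state. All function names are hypothetical.

```python
import cmath
import math
from typing import List


def dft_half(window: List[float]) -> List[complex]:
    """Naive DFT of one window, returning bins 0..N/2."""
    n = len(window)
    return [
        sum(window[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def sound_pressure_db(window: List[float]) -> List[float]:
    """Energy of each frequency bin in dB (floored to avoid log of zero)."""
    return [10.0 * math.log10(max(abs(c) ** 2, 1e-12)) for c in dft_half(window)]


def smr_db(signal_db: List[float], mask_db: List[float]) -> List[float]:
    """Signal to masking ratio per band: signal level minus masking threshold."""
    return [s - m for s, m in zip(signal_db, mask_db)]


def bits_from_smr(smr: List[float], db_per_bit: float = 6.02) -> List[int]:
    """Assumed rule of thumb: enough bits to push quantization noise below the mask."""
    return [max(0, math.ceil(v / db_per_bit)) for v in smr]
```

A band whose signal lies below its masking threshold has a negative SMR and receives no bits in this sketch.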
- FIG. 4 is a block diagram of an apparatus for producing a low-speed MPEG-1 layer III audio stream, according to the present invention.
- a PCM audio signal storage unit 410 has a buffer in which it stores PCM audio data.
- a psycho-acoustic model 2 performing unit 420 performs an FFT on 1024 samples or 256 samples at a time and outputs perceptual energy information and bit allocation information. As described above with reference to FIG. 3, when psycho-acoustic model 2 is applied, the perceptual energy information and the bit allocation information that depends on an SMR are output. Since the psycho-acoustic model 2 performing unit 420 performs an FFT, a harmonic extraction unit 430 extracts a harmonic component from the result of the FFT. This will be described later with reference to FIG. 6.
- a harmonic encoding unit 440 encodes the extracted harmonic component and transmits the encoded harmonic component to an MPEG-1 layer III bitstream producing unit 480.
- the encoded harmonic component forms MPEG-1 audio, together with quantized audio data.
- a harmonic decoding unit 450 decodes the encoded harmonic component to obtain PCM data in the time domain.
- an MDCT unit 460 subtracts the decoded harmonic component from the original input PCM signal and performs an MDCT on the result of the subtraction. To be more specific, if the perceptual energy information value received from the psycho-acoustic model 2 performing unit 420 is greater than a predetermined threshold, an MDCT is performed on 18 samples at a time.
- If the perceptual energy information value is smaller than the predetermined threshold, an MDCT is performed on 36 samples at a time.
- the harmonic component extraction is performed on data arranged in a frequency domain using a tonal/non-tonal decision condition and auditory limit characteristics that are defined in psycho-acoustic model 2. This will be described later in detail.
- a quantization unit 470 performs quantization using the bit allocation information obtained by the psycho-acoustic model 2 performing unit 420.
- the MPEG-1 layer III bitstream producing unit 480 packetizes the harmonic component data made by the harmonic encoding unit 440 and quantized audio data obtained by the quantization unit 470 to obtain compressed audio data.
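The role of the MDCT unit 460 can be sketched as follows: subtract the decoded harmonic waveform from the input PCM data, pick a block length of 18 or 36 samples from the perceptual energy, and transform the residual. The sketch makes several assumptions that the text does not specify: blocks overlap by 50%, no window function is applied, a perceptual energy exactly equal to the threshold is treated as the long-block case, and all function names are hypothetical.

```python
import math
from typing import List


def select_block_length(perceptual_energy: float, threshold: float) -> int:
    """18 samples above the threshold, 36 otherwise (equality is an assumption)."""
    return 18 if perceptual_energy > threshold else 36


def mdct(block: List[float]) -> List[float]:
    """Direct MDCT: 2N input samples produce N coefficients."""
    two_n = len(block)
    n = two_n // 2
    return [
        sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(two_n))
        for k in range(n)
    ]


def residual_mdct(pcm: List[float], decoded_harmonics: List[float],
                  perceptual_energy: float, threshold: float) -> List[List[float]]:
    """Remove the decoded harmonics, then transform 50%-overlapped blocks."""
    residual = [p - h for p, h in zip(pcm, decoded_harmonics)]
    size = select_block_length(perceptual_energy, threshold)
    hop = size // 2
    return [mdct(residual[i:i + size])
            for i in range(0, len(residual) - size + 1, hop)]
```

If the harmonic model reproduced the input exactly, the residual and therefore every MDCT coefficient would be zero, which is the case in which the fewest quantization bits are needed.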
- FIG. 5 is a flowchart illustrating a harmonic extraction step 510, a harmonic encoding step 520, and a harmonic decoding step 530 based on psycho-acoustic model 2.
- the steps performed in psycho-acoustic model 2 in FIG. 5 are the same as the steps performed in psycho-acoustic model 2 in FIG. 3.
- the result of the FFT performed by the psycho-acoustic model 2 performing unit is used in step 510 of extracting a harmonic component.
- the extracted harmonic component is encoded to an MPEG-1 bitstream in step 520.
- the harmonic extraction step 510 will now be described in greater detail with reference to FIGS. 6A through 6D.
- FIGS. 6A, 6B, 6C, and 6D illustrate samples extracted in stages when harmonic components are extracted using the result of the FFT performed in psycho-acoustic model 2.
- When PCM audio data as shown in FIG. 6A are input, an FFT is first performed on the received data in order to determine the sound pressure for each datum.
- One of the plurality of received PCM audio data whose sound pressure has been obtained is selected. If the values of the PCM audio data on the left and right sides of the selected data are smaller than the selected PCM audio data value, only the selected PCM audio data is extracted. This process is applied to all of the received PCM audio data.
- Sound pressure is the energy value of a sample in a frequency domain.
- only samples having sound pressures that are greater than a predetermined level are determined to be harmonic components. Accordingly, the samples shown in FIG. 6B are extracted. Thereafter, only samples having sound pressures that are greater than a predetermined level are extracted. For example, if the predetermined level is set to be 7.0 dB, samples having sound pressures smaller than 7.0 dB are not selected, and only the samples shown in FIG. 6C remain. The remaining samples are not all considered to be harmonic components, and some samples are extracted from the remaining samples according to the table of FIG. 7. Hence, finally, the samples shown in FIG. 6D remain.
- FIG. 7 is a table showing a limited frequency range that varies according to a K value.
- K is a value representing the location of a sample in a frequency domain
- If the K value is smaller than 3 or greater than 500, the limited frequency range is 0, so the values of the corresponding samples are 0 and the samples are not selected.
- If the K value is equal to or greater than 3 and smaller than 63, the corresponding range value is set to 2.
- If the K value is equal to or greater than 63 and smaller than 127, the corresponding range value is set to 3.
- If the K value is equal to or greater than 127 and smaller than 255, the corresponding range value is set to 6.
- If the K value is equal to or greater than 255 and smaller than 500, the corresponding range value is set to 12.
- The limit of 500 was set in consideration of the limit of the audible frequency of a human, based on the assumption that the quality of reproduced sound does not differ whether or not sample values corresponding to a frequency position (K value) equal to or greater than 500 are considered. Consequently, only the sample values of FIG. 6D are extracted and determined to be harmonic components.
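The three extraction stages of FIGS. 6B through 6D can be sketched as follows. Several points are assumptions rather than statements of the source: the K intervals for the range values 2 and 6 are inferred from the neighbouring intervals, K equal to exactly 500 is treated like the 255 to 500 interval, and the way the range value is applied (a peak survives only if no louder surviving peak lies within that many bins of it) is one plausible reading of "according to the table of FIG. 7". The function names are hypothetical.

```python
from typing import List


def range_for_bin(k: int) -> int:
    """Limited frequency range for frequency position K (FIG. 7, partly inferred)."""
    if k < 3 or k > 500:
        return 0      # samples here are not selected
    if k < 63:
        return 2      # interval bounds inferred from adjacent rows
    if k < 127:
        return 3
    if k < 255:
        return 6      # interval bounds inferred from adjacent rows
    return 12         # K == 500 is not covered by the text; assumed to belong here


def extract_harmonics(pressure_db: List[float], level_db: float = 7.0) -> List[int]:
    """Return the frequency positions kept as harmonic components."""
    # Stage 1 (FIG. 6B): keep samples larger than both neighbours.
    peaks = [k for k in range(1, len(pressure_db) - 1)
             if pressure_db[k] > pressure_db[k - 1]
             and pressure_db[k] > pressure_db[k + 1]]
    # Stage 2 (FIG. 6C): drop samples whose sound pressure is below the level.
    loud = [k for k in peaks if pressure_db[k] >= level_db]
    # Stage 3 (FIG. 6D): assumed use of the range table, see lead-in.
    harmonics = []
    for k in loud:
        r = range_for_bin(k)
        if r == 0:
            continue
        rivals = [j for j in loud if j != k and abs(j - k) <= r]
        if all(pressure_db[k] >= pressure_db[j] for j in rivals):
            harmonics.append(k)
    return harmonics
```

With peaks at positions 2, 5, 7, 12, and 16, the peak at 2 is dropped by the range table, the peak at 12 by the 7.0 dB level if it is quieter than that, and the weaker of the two close peaks at 5 and 7 by the stage 3 rule.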
- Harmonic encoding 520 includes amplitude encoding, frequency encoding, and phase encoding. These three encoding methods use Equations 1 and 2:
- Equation 1: Enc_peak_AmpMax = … ; Equation 2: Enc_Amp = …
- AmpMax denotes a peak amplitude
- Enc_peak_AmpMax denotes a result value obtained by encoding the value AmpMax
- Amp denotes amplitudes other than the peak amplitude
- the peak amplitude is first encoded in an 8-bit log scale to obtain Enc_peak_AmpMax as shown in Equation 1, and the other amplitudes Amp are encoded in a 5-bit log scale to obtain Enc_Amp as shown in Equation 2.
- phase encoding is achieved using 3 bits. After such harmonic extraction and harmonic encoding, encoded harmonic components are decoded and then undergo MDCT.
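The bit widths above (8 bits for the peak amplitude, 5 bits for the other amplitudes, 3 bits for phase) can be illustrated with a generic log-scale quantizer. This is not a reconstruction of Equations 1 and 2, whose exact form is not available here: the base-2 logarithm, the assumed amplitude ceiling of 2^16, the uniform phase steps, and the function names are all assumptions for the sake of a runnable example. Frequency encoding is left out because its bit width is not given.

```python
import math


def encode_log(amp: float, bits: int, max_log2: float = 16.0) -> int:
    """Quantize log2(amp) uniformly onto 2**bits levels (assumed scheme)."""
    levels = (1 << bits) - 1
    value = min(math.log2(max(amp, 1.0)), max_log2)  # clamp to [0, max_log2]
    return round(value / max_log2 * levels)


def decode_log(code: int, bits: int, max_log2: float = 16.0) -> float:
    """Inverse of encode_log, up to quantization error."""
    levels = (1 << bits) - 1
    return 2.0 ** (code / levels * max_log2)


def encode_phase(phase: float) -> int:
    """3-bit phase code: eight uniform steps around the circle."""
    turns = (phase % (2.0 * math.pi)) / (2.0 * math.pi)
    return round(turns * 8) % 8
```

Under these assumptions the peak would be stored as `encode_log(amp_max, 8)` and each remaining amplitude as `encode_log(amp, 5)`, so the coarser 5-bit scale trades precision for fewer bits per harmonic.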
- FIG. 8 is a flowchart illustrating a process for producing an audio stream by removing harmonic components, according to the present invention.
- PCM audio data is received and stored.
- psycho-acoustic model 2 using the audible limit characteristics of a human being is applied to the stored data in order to obtain FFT result information, perceptual energy information regarding the received data, and bit allocation information used for quantization.
- harmonic components are extracted from the received PCM audio data using the FFT result information.
- the harmonic components are extracted in the following process.
- sound pressure for each of the plurality of received PCM audio data is obtained using the FFT result information.
- one of the plurality of received PCM audio data whose sound pressures are obtained is selected. If the values of the PCM audio data on the left and right sides of the selected data are smaller than the value of the selected PCM audio data, only the selected PCM audio data is extracted. This process is applied to all of the received PCM audio data. Thereafter, only PCM audio data that each have sound pressure greater than a predetermined value of 7.0 dB are extracted from the PCM audio data extracted in the previous step. Finally, harmonic components are extracted by not selecting PCM audio data in a predetermined frequency range among the audio data extracted in the previous step.
- the extracted harmonic components are encoded and output in step 840.
- encoded harmonic components are decoded in step 850.
- the received PCM audio data from which the decoded harmonic components are removed is subject to MDCT according to the perceptual energy information.
- If the perceptual energy value is greater than a predetermined threshold, MDCT is performed using a short window, for example, on 18 samples at a time. If the perceptual energy value is smaller than the predetermined threshold, MDCT is performed using a long window, for example, on 36 samples at a time.
- In step 870, the MDCT result values are quantized by allocating bits according to the bit allocation information.
- In step 880, the quantized audio data and the encoded harmonic components are subjected to Huffman coding to obtain an audio packet.
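The data flow of FIG. 8 can be summarized in a short orchestration sketch. Each stage is passed in as a function so that the sketch stays independent of any particular implementation; the function name `encode_frame` and all parameter names are hypothetical, and the stage signatures are assumptions chosen for the example.

```python
from typing import Any, Callable, List, Sequence, Tuple


def encode_frame(
    pcm: Sequence[float],
    model: Callable[[Sequence[float]], Tuple[Any, float, Any]],
    extract: Callable[[Sequence[float], Any], Any],
    encode_harmonics: Callable[[Any], Any],
    decode_harmonics: Callable[[Any], Sequence[float]],
    mdct: Callable[[List[float], float], Any],
    quantize: Callable[[Any, Any], Any],
    pack: Callable[[Any, Any], Any],
) -> Any:
    """Run one block of PCM data through the stages described for FIG. 8."""
    # Psycho-acoustic model 2: FFT result, perceptual energy, bit allocation.
    fft_info, perceptual_energy, bit_allocation = model(pcm)
    # Harmonic extraction from the FFT result, then encoding (step 840).
    harmonics = extract(pcm, fft_info)
    encoded = encode_harmonics(harmonics)
    # Decoding (step 850) gives the time-domain waveform the decoder will see.
    decoded = decode_harmonics(encoded)
    # Remove the decoded harmonics and transform the residual.
    residual = [p - d for p, d in zip(pcm, decoded)]
    coefficients = mdct(residual, perceptual_energy)
    # Quantize (step 870) and build the packet (step 880).
    quantized = quantize(coefficients, bit_allocation)
    return pack(quantized, encoded)
```

Subtracting the decoded harmonics rather than the originally extracted ones keeps the residual consistent with what a decoder can reconstruct, since the encoder's quantization error in the harmonic parameters then ends up in the residual instead of being lost.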
- the embodiments of the present invention can be written as computer programs and can be implemented in general-use digital computers that execute the programs using a computer readable recording medium.
- Examples of computer readable recording media include magnetic storage media (e.g., ROM, floppy disks, hard disks, etc.), optical recording media (e.g., CD-ROMs, or DVDs), and a storage medium such as a carrier wave (e.g., transmission through the Internet).
- the number of quantization bits generated upon production of a low-speed MPEG-1 layer III audio stream is minimized.
- harmonic components are simply removed from an input audio signal, and only a transient portion is compressed using MDCT. Therefore, the input audio signal can be effectively compressed at a low-speed bitrate.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Theoretical Computer Science (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CA002490064A CA2490064A1 (en) | 2002-06-27 | 2002-12-12 | Audio coding method and apparatus using harmonic extraction |
| DE10297751T DE10297751B4 (de) | 2002-06-27 | 2002-12-12 | Audiocodierverfahren und Vorrichtung, die die Harmonischen-Extraktion verwenden |
| JP2003562916A JP2005531014A (ja) | 2002-06-27 | 2002-12-12 | ハーモニック成分を利用したオーディオコーディング方法及び装置 |
| GB0427660A GB2408184B (en) | 2002-06-27 | 2002-12-12 | Audio coding method and apparatus using harmonic extraction |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR10-2002-0036310A KR100462611B1 (ko) | 2002-06-27 | 2002-06-27 | 하모닉 성분을 이용한 오디오 코딩방법 및 장치 |
| KR2002/36310 | 2002-06-27 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2003063135A1 (en) | 2003-07-31 |
Family
ID=27607091
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/KR2002/002348 Ceased WO2003063135A1 (en) | 2002-06-27 | 2002-12-12 | Audio coding method and apparatus using harmonic extraction |
Country Status (9)
| Country | Link |
|---|---|
| US (1) | US20040002854A1 |
| JP (1) | JP2005531014A |
| KR (1) | KR100462611B1 |
| CN (1) | CN1262990C |
| CA (1) | CA2490064A1 |
| DE (1) | DE10297751B4 |
| GB (1) | GB2408184B |
| RU (1) | RU2289858C2 |
| WO (1) | WO2003063135A1 |
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP1806736A4 (en) * | 2004-10-28 | 2008-03-19 | Matsushita Electric Industrial Co Ltd | SCALABLE CODING DEVICE, SCALABLE DECODING DEVICE AND METHOD THEREFOR |
| RU2337478C2 (ru) * | 2004-03-31 | 2008-10-27 | Интел Корпорейшн | Декодирование высокоизбыточных кодов с контролем четности с использованием многопорогового прохождения сообщения |
| US7716561B2 (en) | 2004-03-31 | 2010-05-11 | Intel Corporation | Multi-threshold reliability decoding of low-density parity check codes |
| US8015468B2 (en) | 2004-12-29 | 2011-09-06 | Intel Corporation | Channel estimation and fixed thresholds for multi-threshold decoding of low-density parity check codes |
| RU2430407C1 (ru) * | 2010-04-20 | 2011-09-27 | Общество с ограниченной ответственностью "Научно-производственное предприятие "Цифровые решения" | Устройство для вычисления дискретного косинусного преобразования |
| US8209579B2 (en) | 2004-03-31 | 2012-06-26 | Intel Corporation | Generalized multi-threshold decoder for low-density parity check codes |
| US8631060B2 (en) | 2007-12-13 | 2014-01-14 | Qualcomm Incorporated | Fast algorithms for computation of 5-point DCT-II, DCT-IV, and DST-IV, and architectures |
Families Citing this family (17)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2005094183A2 (en) * | 2004-03-30 | 2005-10-13 | Guy Fleishman | Apparatus and method for digital coding of sound |
| KR100707186B1 (ko) * | 2005-03-24 | 2007-04-13 | 삼성전자주식회사 | 오디오 부호화 및 복호화 장치와 그 방법 및 기록 매체 |
| JP4720302B2 (ja) * | 2005-06-07 | 2011-07-13 | トヨタ自動車株式会社 | 自動変速機のクラッチ装置 |
| KR100684029B1 (ko) * | 2005-09-13 | 2007-02-20 | 엘지전자 주식회사 | 푸리에 변환을 이용한 배음 생성 방법 및 이를 위한 장치,다운 샘플링에 의한 배음 생성 방법 및 이를 위한 장치와소리 보정 방법 및 이를 위한 장치 |
| KR100788706B1 (ko) * | 2006-11-28 | 2007-12-26 | 삼성전자주식회사 | 광대역 음성 신호의 부호화/복호화 방법 |
| EP2165328B1 (en) * | 2007-06-11 | 2018-01-17 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Encoding and decoding of an audio signal having an impulse-like portion and a stationary portion |
| US20090099844A1 (en) * | 2007-10-16 | 2009-04-16 | Qualcomm Incorporated | Efficient implementation of analysis and synthesis filterbanks for mpeg aac and mpeg aac eld encoders/decoders |
| RU2464540C2 (ru) * | 2007-12-13 | 2012-10-20 | Квэлкомм Инкорпорейтед | Быстрые алгоритмы для вычисления 5-точечного dct-ii, dct-iv и dst-iv, и архитектуры |
| CN101552005A (zh) * | 2008-04-03 | 2009-10-07 | 华为技术有限公司 | 编码方法、解码方法、系统及装置 |
| EP3937167B1 (en) | 2008-07-11 | 2023-05-10 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio encoder and audio decoder |
| ES2988414T3 (es) | 2008-07-11 | 2024-11-20 | Fraunhofer Ges Zur Foerderungder Angewandten Forschung E V | Decodificador de audio |
| CN101751928B (zh) * | 2008-12-08 | 2012-06-13 | 扬智科技股份有限公司 | 应用音频帧频谱平坦度简化声学模型分析的方法及其装置 |
| BR122022013454B1 (pt) | 2009-10-20 | 2023-05-16 | Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. | Codificador de áudio, decodificador de áudio, método para codificar uma informação de áudio, método para decodificar uma informação de áudio que utiliza uma detecção de um grupo de valores espectrais previamente decodificados |
| JP5914527B2 (ja) * | 2011-02-14 | 2016-05-11 | フラウンホーファー−ゲゼルシャフト・ツール・フェルデルング・デル・アンゲヴァンテン・フォルシュング・アインゲトラーゲネル・フェライン | 過渡検出及び品質結果を使用してオーディオ信号の一部分を符号化する装置及び方法 |
| MX2013013261A (es) * | 2011-05-13 | 2014-02-20 | Samsung Electronics Co Ltd | Asignacion de bits, codificacion y decodificacion de audio. |
| RU2464649C1 (ru) * | 2011-06-01 | 2012-10-20 | Корпорация "САМСУНГ ЭЛЕКТРОНИКС Ко., Лтд." | Способ обработки звукового сигнала |
| CN103516440B (zh) | 2012-06-29 | 2015-07-08 | 华为技术有限公司 | 语音频信号处理方法和编码装置 |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5481614A (en) * | 1992-03-02 | 1996-01-02 | At&T Corp. | Method and apparatus for coding audio signals based on perceptual model |
| US5819212A (en) * | 1995-10-26 | 1998-10-06 | Sony Corporation | Voice encoding method and apparatus using modified discrete cosine transform |
| KR19980072457A (ko) * | 1997-03-05 | 1998-11-05 | 이준우 | 오디오 신호의 압축시 심리음향에서의 신호처리방법 및 그 장치 |
| KR20020077959A (ko) * | 2001-04-03 | 2002-10-18 | 엘지전자 주식회사 | 디지탈 오디오 부호화기 및 복호화 방법 |
Family Cites Families (25)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5023910A (en) * | 1988-04-08 | 1991-06-11 | At&T Bell Laboratories | Vector quantization in a harmonic speech coding arrangement |
| JPH0364800A (ja) * | 1989-08-03 | 1991-03-20 | Ricoh Co Ltd | 音声符号化及び復号化方式 |
| JP3266920B2 (ja) * | 1991-09-25 | 2002-03-18 | 三菱電機株式会社 | 音声符号化装置及び音声復号化装置並びに音声符号化復号化装置 |
| US5717821A (en) * | 1993-05-31 | 1998-02-10 | Sony Corporation | Method, apparatus and recording medium for coding of separated tone and noise characteristic spectral components of an acoustic sibnal |
| KR100368854B1 (ko) * | 1993-06-30 | 2003-05-17 | 소니 가부시끼 가이샤 | 디지털신호의부호화장치,그의복호화장치및기록매체 |
| JPH0736486A (ja) * | 1993-07-22 | 1995-02-07 | Matsushita Electric Ind Co Ltd | 音声符号化装置 |
| JP3131542B2 (ja) * | 1993-11-25 | 2001-02-05 | シャープ株式会社 | 符号化復号化装置 |
| CA2136891A1 (en) * | 1993-12-20 | 1995-06-21 | Kalyan Ganesan | Removal of swirl artifacts from celp based speech coders |
| DE19537338C2 (de) * | 1995-10-06 | 2003-05-22 | Fraunhofer Ges Forschung | Verfahren und Vorrichtung zum Codieren von Audiosignalen |
| JP2778567B2 (ja) * | 1995-12-23 | 1998-07-23 | 日本電気株式会社 | 信号符号化装置及び方法 |
| JPH09246983A (ja) * | 1996-03-08 | 1997-09-19 | Nec Eng Ltd | ディジタル信号処理装置 |
| US6269338B1 (en) * | 1996-10-10 | 2001-07-31 | U.S. Philips Corporation | Data compression and expansion of an audio signal |
| JPH10178349A (ja) * | 1996-12-19 | 1998-06-30 | Matsushita Electric Ind Co Ltd | オーディオ信号の符号化方法および復号方法 |
| US5930373A (en) * | 1997-04-04 | 1999-07-27 | K.S. Waves Ltd. | Method and system for enhancing quality of sound signal |
| DE19742201C1 (de) * | 1997-09-24 | 1999-02-04 | Fraunhofer Ges Forschung | Verfahren und Vorrichtung zum Codieren von Audiosignalen |
| CA2246532A1 (en) * | 1998-09-04 | 2000-03-04 | Northern Telecom Limited | Perceptual audio coding |
| KR100300887B1 (ko) * | 1999-02-24 | 2001-09-26 | 유수근 | 디지털 오디오 데이터의 역방향 디코딩 방법 |
| JP2000267700A (ja) * | 1999-03-17 | 2000-09-29 | Yrp Kokino Idotai Tsushin Kenkyusho:Kk | 音声符号化復号方法および装置 |
| JP2000276194A (ja) * | 1999-03-25 | 2000-10-06 | Yamaha Corp | 波形圧縮方法及び波形生成方法 |
| US6377916B1 (en) * | 1999-11-29 | 2002-04-23 | Digital Voice Systems, Inc. | Multiband harmonic transform coder |
| DE10000934C1 (de) * | 2000-01-12 | 2001-09-27 | Fraunhofer Ges Forschung | Vorrichtung und Verfahren zum Bestimmen eines Codierungs-Blockrasters eines decodierten Signals |
| JP4055336B2 (ja) * | 2000-07-05 | 2008-03-05 | 日本電気株式会社 | 音声符号化装置及びそれに用いる音声符号化方法 |
| KR100348899B1 (ko) * | 2000-09-19 | 2002-08-14 | 한국전자통신연구원 | 캡스트럼 분석을 이용한 하모닉 노이즈 음성 부호화기 및부호화 방법 |
| US6732071B2 (en) * | 2001-09-27 | 2004-05-04 | Intel Corporation | Method, apparatus, and system for efficient rate control in audio encoding |
| KR100472442B1 (ko) * | 2002-02-16 | 2005-03-08 | 삼성전자주식회사 | 웨이브렛 패킷 변환을 이용한 오디오 압축 방법 및 그시스템 |
2002
- 2002-06-27 KR KR10-2002-0036310A patent/KR100462611B1/ko not_active Expired - Fee Related
- 2002-12-12 GB GB0427660A patent/GB2408184B/en not_active Expired - Fee Related
- 2002-12-12 CN CNB028293487A patent/CN1262990C/zh not_active Expired - Fee Related
- 2002-12-12 CA CA002490064A patent/CA2490064A1/en not_active Abandoned
- 2002-12-12 RU RU2004138088/09A patent/RU2289858C2/ru not_active IP Right Cessation
- 2002-12-12 WO PCT/KR2002/002348 patent/WO2003063135A1/en not_active Ceased
- 2002-12-12 DE DE10297751T patent/DE10297751B4/de not_active Expired - Fee Related
- 2002-12-12 JP JP2003562916A patent/JP2005531014A/ja active Pending
-
2003
- 2003-01-13 US US10/340,828 patent/US20040002854A1/en not_active Abandoned
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5481614A (en) * | 1992-03-02 | 1996-01-02 | At&T Corp. | Method and apparatus for coding audio signals based on perceptual model |
| US5819212A (en) * | 1995-10-26 | 1998-10-06 | Sony Corporation | Voice encoding method and apparatus using modified discrete cosine transform |
| KR19980072457A (ko) * | 1997-03-05 | 1998-11-05 | 이준우 | Psychoacoustic signal processing method and apparatus for audio signal compression |
| KR20020077959A (ko) * | 2001-04-03 | 2002-10-18 | 엘지전자 주식회사 | Digital audio encoder and decoding method |
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| RU2337478C2 (ru) * | 2004-03-31 | 2008-10-27 | Интел Корпорейшн | Decoding of highly redundant parity-check codes using multi-threshold message passing |
| US7716561B2 (en) | 2004-03-31 | 2010-05-11 | Intel Corporation | Multi-threshold reliability decoding of low-density parity check codes |
| US8209579B2 (en) | 2004-03-31 | 2012-06-26 | Intel Corporation | Generalized multi-threshold decoder for low-density parity check codes |
| EP1806736A4 (en) * | 2004-10-28 | 2008-03-19 | Matsushita Electric Industrial Co Ltd | SCALABLE CODING DEVICE, SCALABLE DECODING DEVICE AND METHOD THEREFOR |
| US8019597B2 (en) | 2004-10-28 | 2011-09-13 | Panasonic Corporation | Scalable encoding apparatus, scalable decoding apparatus, and methods thereof |
| US8015468B2 (en) | 2004-12-29 | 2011-09-06 | Intel Corporation | Channel estimation and fixed thresholds for multi-threshold decoding of low-density parity check codes |
| US8631060B2 (en) | 2007-12-13 | 2014-01-14 | Qualcomm Incorporated | Fast algorithms for computation of 5-point DCT-II, DCT-IV, and DST-IV, and architectures |
| RU2430407C1 (ru) * | 2010-04-20 | 2011-09-27 | Общество с ограниченной ответственностью "Научно-производственное предприятие "Цифровые решения" | Device for computing the discrete cosine transform |
Also Published As
| Publication number | Publication date |
|---|---|
| GB0427660D0 (en) | 2005-01-19 |
| CN1262990C (zh) | 2006-07-05 |
| KR20040001184A (ko) | 2004-01-07 |
| JP2005531014A (ja) | 2005-10-13 |
| US20040002854A1 (en) | 2004-01-01 |
| GB2408184B (en) | 2006-01-04 |
| DE10297751T5 (de) | 2005-07-07 |
| DE10297751B4 (de) | 2005-12-22 |
| CN1639769A (zh) | 2005-07-13 |
| CA2490064A1 (en) | 2003-07-31 |
| RU2004138088A (ru) | 2005-06-27 |
| KR100462611B1 (ko) | 2004-12-20 |
| RU2289858C2 (ru) | 2006-12-20 |
| GB2408184A (en) | 2005-05-18 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
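| US20040002854A1 (en) | | Audio coding method and apparatus using harmonic extraction |
| CN103325377B (zh) | | Audio coding method |
| CN100525457C (zh) | | Method and apparatus for encoding/decoding an audio bitstream with auxiliary information |
| KR100634506B1 (ko) | | Low bit-rate encoding/decoding method and apparatus |
| CN101223576B (zh) | | Method and apparatus for extracting important spectral components from an audio signal, and low bit-rate audio signal encoding and/or decoding method and apparatus using the same |
| JP2006048043A (ja) | | Method and apparatus for restoring high frequencies of audio data |
| CN102365680A (zh) | | Method and apparatus for encoding and decoding an audio signal |
| KR100707173B1 (ko) | | Low bit-rate encoding/decoding method and apparatus |
| KR100750115B1 (ko) | | Audio signal encoding and decoding method and apparatus |
| JP3353868B2 (ja) | | Acoustic signal transform coding method and decoding method |
| US20080133250A1 (en) | | Method and Related Device for Improving the Processing of MP3 Decoding and Encoding |
| CN101406064B (zh) | | Method and apparatus for quantizing and dequantizing an input signal, and method and apparatus for encoding and decoding an input signal |
| US20050254586A1 (en) | | Method of and apparatus for encoding/decoding digital signal using linear quantization by sections |
| JP3348759B2 (ja) | | Transform coding method and transform decoding method |
| KR100928966B1 (ko) | | Low bit-rate encoding/decoding method and apparatus |
| KR100940532B1 (ko) | | Low bit-rate decoding method and apparatus |
| Reyes et al. | | A new perceptual entropy-based method to achieve a signal adapted wavelet tree in a low bit rate perceptual audio coder |
| JP2000293200A (ja) | | Speech compression coding method |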
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SC SD SE SG SK SL TJ TM TN TR TT TZ UA UG UZ VC VN YU ZA ZM ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
| | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101) | |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
| | ENP | Entry into the national phase | Ref document number: 0427660; Country of ref document: GB; Kind code of ref document: A; Free format text: PCT FILING DATE = 20021212 |
| | WWE | WIPO information: entry into national phase | Ref document number: 2490064; Country of ref document: CA |
| | ENP | Entry into the national phase | Ref document number: 2004138088; Country of ref document: RU; Kind code of ref document: A |
| | WWE | WIPO information: entry into national phase | Ref document number: 2003562916; Country of ref document: JP |
| | WWE | WIPO information: entry into national phase | Ref document number: 20028293487; Country of ref document: CN |
| | RET | DE translation (DE OG part 6b) | Ref document number: 10297751; Country of ref document: DE; Date of ref document: 20050707; Kind code of ref document: P |
| | WWE | WIPO information: entry into national phase | Ref document number: 10297751; Country of ref document: DE |
| | REG | Reference to national code | Ref country code: DE; Ref legal event code: 8607 |
| | 122 | EP: PCT application non-entry in European phase | |