US20020029143A1 - System and method for effectively implementing fixed masking thresholds in an audio encoder device - Google Patents
- Publication number
- US20020029143A1 US09/221,394
- Authority
- US
- United States
- Prior art keywords
- data
- audio data
- masking thresholds
- thresholds
- bit allocator
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/002—Dynamic bit allocation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0204—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
- G10L19/0208—Subband vocoders
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B1/00—Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission
- H04B1/66—Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission for reducing bandwidth of signals; for improving efficiency of transmission
- H04B1/665—Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission for reducing bandwidth of signals; for improving efficiency of transmission using psychoacoustic properties of the ear, e.g. masking effect
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
Definitions
- This invention relates generally to signal processing systems, and relates more particularly to a system and method for effectively implementing fixed masking thresholds in an audio encoder device.
- codec 110 comprises a decoder 114 and an encoder 112 that includes a psycho-acoustic modeler (PAM) 126 .
- PAM psycho-acoustic modeler
- encoder 112 receives source audio data from any compatible audio source via path 116 , responsively filters the source audio into frequency sub-bands, and then generates encoded audio data that may be provided to an audio device (such as a recordable compact-disc device or a computer system) via path 138 .
- FIG. 2 a graph 210 for one embodiment of exemplary masking thresholds for the FIG. 1 encoder-decoder system 110 is shown.
- Graph 210 displays audio data signal energy on vertical axis 212 , and also displays a series of frequency sub-bands on horizontal axis 214 .
- psycho-acoustic modeler (PAM) 126 receives source audio data, and then utilizes characteristics of human hearing to generate the masking thresholds 228 . Experiments have determined that human hearing cannot detect some sounds of lower energy when those lower energy sounds are close in frequency to sounds of higher energy.
- sub-band 3 ( 220 ) includes a 60 dB sound 232 , a 30 dB sound 234 , and a masking threshold 230 of 36 dB.
- the 30 dB sound 234 falls below masking threshold 230 , and is therefore not detectable by the human ear due to the masking effect of the 60 dB sound 232 .
- encoder 112 may thus discard any sounds that fall below masking thresholds 228 to advantageously reduce the amount of audio data and expedite the encoding process.
- Psycho-acoustic modeler 126 thus provides useful information for reducing the amount of audio data that must be encoded by encoder 112 .
- implementing psycho-acoustic modeler 126 within encoder 112 substantially increases the complexity of encoder 112 , and also approximately doubles the processing power required to control encoder 112 .
- the cost and difficulty of successfully implementing psycho-acoustic modeler 126 are therefore significant negative aspects of the FIG. 1 encoder-decoder system 110 .
- An encoder device that exhibits reduced complexity, while still achieving acceptable quality in the encoded audio data would thus provide distinct advantages to system manufacturers and users. Therefore, for all the foregoing reasons, an improved system and method are needed to effectively implement fixed masking thresholds in an audio encoder device.
- a system and method are disclosed for effectively implementing fixed masking thresholds in an audio encoder device.
- system designers of the encoder initially create a masking threshold lookup table.
- the masking threshold lookup table may include masking threshold values that are based upon empirically-derived absolute human hearing thresholds.
- the lookup table may similarly include masking thresholds that are selectively tuned to deviate from the absolute human hearing thresholds.
- a filter bank in the encoder receives and filters source audio data into frequency sub-bands to provide filtered audio data to a bit allocator.
- the bit allocator then responsively analyzes the filtered audio data using the masking thresholds contained in the lookup table. Specifically, the bit allocator identifies masked audio data to be any filtered audio data that falls below the fixed masking thresholds from the lookup table. Similarly, the bit allocator identifies any filtered audio data that lies above the fixed masking thresholds in the lookup table as non-masked audio data.
- the bit allocator may then discard the filtered audio data that was identified as masked audio data to advantageously decrease the total amount of filtered audio data to be processed by the encoder.
- the bit allocator allocates all available allocation bits to the filtered audio data that was previously identified as non-masked audio data to generate allocated audio data to a quantizer.
- the quantizer quantizes the allocated audio data to generate quantized audio data to a bitstream packer.
- the bitstream packer packs the quantized audio data to produce encoded audio data for storage onto an appropriate and compatible storage medium, in accordance with the present invention.
- FIG. 1 is a block diagram for one embodiment of an audio encoder-decoder system
- FIG. 2 is a graph for one embodiment of exemplary masking thresholds for the FIG. 1 encoder-decoder system
- FIG. 3 is a block diagram for one embodiment of an encoder-decoder system, in accordance with the present invention.
- FIG. 4 is a block diagram for one embodiment of the encoder filter bank of FIG. 3, in accordance with the present invention.
- FIG. 5 is a block diagram for one embodiment of the masking threshold lookup table of FIG. 3, in accordance with the present invention.
- FIG. 6 is a graph showing absolute hearing thresholds, in accordance with the present invention.
- FIG. 7 is a graph for one embodiment of exemplary fixed masking thresholds, in accordance with the present invention.
- FIG. 8 is a flowchart of method steps for one embodiment to effectively implement fixed masking thresholds, in accordance with the present invention.
- the present invention relates to an improvement in signal processing systems.
- the following description is presented to enable one of ordinary skill in the art to make and use the invention and is provided in the context of a patent application and its requirements.
- Various modifications to the preferred embodiment will be readily apparent to those skilled in the art and the generic principles herein may be applied to other embodiments.
- the present invention is not intended to be limited to the embodiment shown, but is to be accorded the widest scope consistent with the principles and features described herein.
- the present invention comprises an encoder device that includes a filter bank for filtering source audio data to produce frequency sub-bands, a lookup table for storing masking thresholds corresponding to the frequency sub-bands, and a bit allocator for using the masking thresholds to identify and discard masked audio data to thereby reduce the total amount of audio data that requires processing by the encoder device.
- codec 310 comprises an encoder 312 , and a decoder 314 .
- Encoder 312 preferably includes a filter bank 318 , a masking threshold lookup table 326 , a bit allocator 322 , a quantizer 332 , and a bitstream packer 336 .
- Decoder 314 preferably includes a bitstream unpacker 344 , a dequantizer 348 , and a filter bank 352 .
- encoder 312 and decoder 314 preferably function in response to a set of program instructions called an audio manager that is executed by a processor device (not shown).
- encoder 312 and decoder 314 may also be implemented and controlled using appropriate hardware configurations.
- the FIG. 3 embodiment specifically discusses encoding and decoding digital audio data, however the present invention may advantageously be utilized to process and manipulate other types of electronic information.
- encoder 312 receives source audio data from any compatible audio source via path 316 .
- the source audio data on path 316 includes digital audio data that is preferably formatted in a linear pulse code modulation (LPCM) format.
- Encoder 312 preferably processes 16 -bit digital samples of the source audio data in units called “frames”. In the preferred embodiment, each frame contains 1152 samples.
- filter bank 318 receives and separates the source audio data into a set of discrete frequency sub-bands to generate filtered audio data.
- the filtered audio data from filter bank 318 preferably includes thirty-two unique and separate frequency sub-bands.
- Filter bank 318 then provides the filtered audio data (sub-bands) to bit allocator 322 via path 320 .
- Bit allocator 322 then accesses relevant information from lookup table 326 via path 328 , and responsively generates allocated audio data to quantizer 332 via path 330 .
- Bit allocator 322 creates the allocated audio data by assigning binary digits (bits) to represent the signal contained in each of the sub-bands received from filter bank 318 .
- the functionality of lookup table 326 and bit allocator 322 are further discussed below in conjunction with FIGS. 5 - 8 .
- quantizer 332 compresses and codes the allocated audio data to generate quantized audio data to bitstream packer 336 via path 334 .
- Bitstream packer 336 responsively packs the quantized audio data to generate encoded audio data that may then be provided to an audio device (such as a recordable compact disc device or a computer system) via path 338 .
- encoded audio data is provided from an audio device to bitstream unpacker 344 via path 340 .
- Bitstream unpacker 344 responsively unpacks the encoded audio data to generate quantized audio data to dequantizer 348 via path 346 .
- Dequantizer 348 then dequantizes the quantized audio data to generate dequantized audio data to filter bank 352 via path 350 .
- Filter bank 352 responsively filters the dequantized audio data to generate and provide decoded audio data to an audio playback system via path 354 .
- FIG. 4 a block diagram for one embodiment of the FIG. 3 encoder filter bank 318 is shown, in accordance with the present invention.
- filter bank 318 receives source audio data from a compatible audio source via path 316 .
- Filter bank 318 then responsively divides the received source audio data into a series of frequency sub-bands that are each provided to bit allocator 322 .
- the FIG. 4 embodiment preferably generates thirty two sub-bands 320 ( a ) through 320 ( h ), however, in alternate embodiments, filter bank 318 may readily output a greater or lesser number of sub-bands.
- lookup table 326 includes a frequency 1 ( 512 ) through a frequency N ( 518 ), and a masking threshold 1 ( 520 ) through a masking threshold N ( 526 ).
- each frequency 512 through 518 uniquely corresponds with an individual masking threshold 520 through 526 .
- frequency 1 ( 512 ) corresponds to masking threshold 1 ( 520 )
- frequency N ( 518 ) corresponds to masking threshold N ( 526 ).
- frequencies 512 through 518 may represent the individual frequency sub-bands generated by filter 318 , or, alternately, may represent individual frequencies from the filtered audio data generated by filter bank 318 .
- bit allocator 322 may thus identify a particular frequency or a frequency sub-band 512 through 518 contained in the filtered audio data received from filter bank 318 .
- Bit allocator 322 may then access the masking threshold 520 through 526 that correspond to the particular frequency or frequency sub-band by referencing lookup table 326 .
- Bit allocator 322 may then advantageously identify and discard any masked audio data (from the filtered audio data) that falls below the masking thresholds 520 through 526 .
- Implementing encoder 312 with masking threshold lookup table 326 thus significantly reduces the overall complexity of encoder 312 , while still preserving the benefits of utilizing masking thresholds.
- FIG. 6 a graph 610 illustrating absolute hearing thresholds 616 is shown, in accordance with the present invention.
- graph 610 displays audio data signal energy in decibels on vertical axis 612 .
- Graph 610 also displays frequency sub-bands (generated by filter bank 318 ) on horizontal axis 614 .
- absolute hearing thresholds 616 represent empirically determined limits of human hearing. In other words, human hearing does not detect sound energy that falls below absolute hearing thresholds 616 .
- masking thresholds 520 through 526 of lookup table 326 are defined with reference to absolute hearing thresholds 616 .
- masking thresholds 520 through 526 may be substantially equal to absolute hearing thresholds 616 .
- selected segments of absolute hearing thresholds 616 may advantageously be altered or “tuned” to achieve improved performance of encoder 312 .
- selected higher frequency sub-bands may be represented in lookup table 326 by using corresponding masking thresholds that are tuned to threshold values which are higher than those corresponding thresholds contained in absolute hearing thresholds 616 .
- This tuning of lookup table 326 (for the selected higher frequency sub-bands) may thus facilitate optimal allocation of available allocation bits by bit allocator 322 , while still maintaining high quality in the encoded audio data.
- FIG. 7 a graph 710 for one embodiment of exemplary fixed masking thresholds is shown, in accordance with the present invention.
- Graph 710 displays audio data signal energy on vertical axis 712 , and also displays a series of frequency sub-bands on horizontal axis 714 .
- Graph 710 is presented to illustrate principles of the present invention, and therefore, the values shown in graph 710 are intended as examples only. The present invention may thus readily function with operational values other than those presented in graph 710 of FIG. 7.
- graph 710 includes sub-band 1 ( 716 ) through sub-band 6 ( 726 ), and masking threshold values 728 that change for each FIG. 7 sub-band.
- bit allocator 322 initially receives frequency sub-band 1 ( 716 ) from filter bank 318 , and responsively accesses corresponding masking threshold 730 by referencing lookup table 326 . Bit allocator 322 may then advantageously identify and discard any masked audio data from sub-band 1 ( 716 ) that falls below masking threshold 730
- Bit allocator 322 next similarly accesses and utilizes masking threshold 732 in connection with sub-band 2 ( 718 ) to identify and discard any masked audio data. Bit allocator 322 then continues to sequentially access and utilize masking thresholds for individual sub-bands until a current frame is complete. The foregoing process is repeated for each frame of audio data until all frames have been processed by encoder 312 .
- filter bank 318 of encoder 312 receives and filters source audio data into frequency sub-bands to provide filtered audio data to bit allocator 322 .
- bit allocator 322 analyzes the filtered audio data using the fixed masking thresholds contained in lookup table 326 , as discussed above in conjunction with FIGS. 3 and 5- 7 . Specifically, bit allocator 322 identifies any filtered audio data that falls below the fixed masking thresholds in lookup table 326 as masked audio data. Similarly, bit allocator 322 identifies any filtered audio data that lies above the fixed masking thresholds in lookup table 326 as non-masked audio data.
- bit allocator 322 may advantageously disregard or discard the filtered audio data that was identified as masked audio data in the preceding step 816 .
- bit allocator 322 next allocates all available allocation bits to the filtered audio data that was identified as non-masked audio data (in foregoing step 816 ) to generate allocated audio data to quantizer 332 .
- the step 820 bit allocation process may be performed using similar techniques to those disclosed in co-pending U.S. patent application Ser. No. ______, entitled “System And Method For Preventing Artifacts In An Audio Decoder Device,” filed on ______, which is hereby incorporated by reference.
- step 822 quantizer responsively quantizes the allocated audio data to generate quantized audio data to bitstream packer 336 .
- bitstream packer 336 packs the quantized audio data to produce encoded audio data for storage onto an appropriate and compatible storage medium, in accordance with the present invention.
Abstract
Description
- The present application is related to co-pending U.S. patent application Ser. No. 09/128,924, entitled “System And Method For Implementing A Refined Psycho-Acoustic Modeler,” filed on Aug. 4, 1998, and to co-pending U.S. patent application Ser. No. 09/150,117, entitled “System And Method For Efficiently Implementing A Masking Function In A Psycho-Acoustic Modeler,” filed on Sep. 9, 1998, and also to co-pending U.S. patent application Ser. No. ______, entitled “System And Method For Preventing Artifacts In An Audio Decoder Device,” filed on ______, which are hereby incorporated by reference. The foregoing related applications are commonly assigned.
- 1. Field of the Invention
- This invention relates generally to signal processing systems, and relates more particularly to a system and method for effectively implementing fixed masking thresholds in an audio encoder device.
- 2. Description of the Background Art
- Providing an effective method of encoding audio data is often a significant consideration for designers, manufacturers, and users of contemporary electronic systems. Developments in modern digital audio technology have necessitated corresponding improvements in sophisticated, high-performance audio encoding methodologies. For example, the operation of recordable audio compact-disc devices typically requires an encoder-decoder (codec) system to receive and encode source audio data into a format (such as MPEG) that may then be recorded onto appropriate media using the compact-disc device.
- Referring now to FIG. 1, a block diagram for one embodiment of an audio encoder-decoder (codec) 110 is shown. In the FIG. 1 embodiment, codec 110 comprises a decoder 114 and an encoder 112 that includes a psycho-acoustic modeler (PAM) 126. During an encoding operation, encoder 112 receives source audio data from any compatible audio source via path 116, responsively filters the source audio into frequency sub-bands, and then generates encoded audio data that may be provided to an audio device (such as a recordable compact-disc device or a computer system) via path 138. The operation of psycho-acoustic modeler (PAM) 126 is further discussed below in conjunction with FIG. 2.
- Referring now to FIG. 2, a graph 210 for one embodiment of exemplary masking thresholds for the FIG. 1 encoder-decoder system 110 is shown. Graph 210 displays audio data signal energy on vertical axis 212, and also displays a series of frequency sub-bands on horizontal axis 214. In operation, psycho-acoustic modeler (PAM) 126 receives source audio data, and then utilizes characteristics of human hearing to generate the masking thresholds 228. Experiments have determined that human hearing cannot detect some sounds of lower energy when those lower energy sounds are close in frequency to sounds of higher energy.
- For example, sub-band 3 (220) includes a 60 dB sound 232, a 30 dB sound 234, and a masking threshold 230 of 36 dB. The 30 dB sound 234 falls below masking threshold 230, and is therefore not detectable by the human ear due to the masking effect of the 60 dB sound 232. In practice, encoder 112 may thus discard any sounds that fall below masking thresholds 228 to advantageously reduce the amount of audio data and expedite the encoding process.
- Psycho-acoustic modeler 126 thus provides useful information for reducing the amount of audio data that must be encoded by encoder 112. However, implementing psycho-acoustic modeler 126 within encoder 112 substantially increases the complexity of encoder 112, and also approximately doubles the processing power required to control encoder 112. The cost and difficulty of successfully implementing psycho-acoustic modeler 126 are therefore significant negative aspects of the FIG. 1 encoder-decoder system 110. An encoder device that exhibits reduced complexity, while still achieving acceptable quality in the encoded audio data would thus provide distinct advantages to system manufacturers and users. Therefore, for all the foregoing reasons, an improved system and method are needed to effectively implement fixed masking thresholds in an audio encoder device.
- In accordance with the present invention, a system and method are disclosed for effectively implementing fixed masking thresholds in an audio encoder device. In one embodiment of the present invention, system designers of the encoder initially create a masking threshold lookup table. The masking threshold lookup table may include masking threshold values that are based upon empirically-derived absolute human hearing thresholds. In alternate embodiments, the lookup table may similarly include masking thresholds that are selectively tuned to deviate from the absolute human hearing thresholds.
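The creation of such a lookup table can be illustrated with a minimal sketch. It is not taken from the source: it assumes thirty-two equal-width sub-bands, a 44.1 kHz sample rate, and Terhardt's well-known closed-form approximation of the threshold in quiet (in dB SPL) as a stand-in for the "empirically-derived" thresholds. The names `threshold_in_quiet_db`, `build_lookup_table`, and `tuning_db` are hypothetical, and relating dB SPL to the encoder's internal energy scale would need a calibration step that is not shown.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Terhardt's approximation of the absolute hearing threshold, in dB SPL.

    Assumption: used here only as a stand-in for the empirically derived
    thresholds that the source refers to.
    """
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def build_lookup_table(num_subbands=32, sample_rate_hz=44100.0, tuning_db=None):
    """Return one fixed masking threshold per sub-band.

    tuning_db (hypothetical) maps a sub-band index to an offset in dB, so
    that selected entries can deviate from the absolute threshold.
    """
    tuning_db = tuning_db or {}
    band_width_hz = sample_rate_hz / 2.0 / num_subbands
    table = []
    for band in range(num_subbands):
        center_hz = (band + 0.5) * band_width_hz  # centre of the sub-band
        table.append(threshold_in_quiet_db(center_hz) + tuning_db.get(band, 0.0))
    return table


if __name__ == "__main__":
    untuned = build_lookup_table()
    # Raise the threshold of the highest sub-band by 10 dB (illustrative value).
    tuned = build_lookup_table(tuning_db={31: 10.0})
    print(round(untuned[31], 2), round(tuned[31], 2))
```

Because the table is computed once at design time, the encoder only performs an index lookup per sub-band at run time, which is the complexity saving the text describes.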
- Next, a filter bank in the encoder receives and filters source audio data into frequency sub-bands to provide filtered audio data to a bit allocator. The bit allocator then responsively analyzes the filtered audio data using the masking thresholds contained in the lookup table. Specifically, the bit allocator identifies masked audio data to be any filtered audio data that falls below the fixed masking thresholds from the lookup table. Similarly, the bit allocator identifies any filtered audio data that lies above the fixed masking thresholds in the lookup table as non-masked audio data.
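The masked versus non-masked decision described above reduces to a per-sub-band comparison against the fixed table. The sketch below is illustrative only: `classify_subbands` is a hypothetical name, the levels and thresholds are made-up values, and treating a level exactly equal to its threshold as non-masked is an assumption, since the source only addresses data that falls below or lies above the threshold.

```python
def classify_subbands(levels_db, thresholds_db):
    """Return a list of booleans, True where a sub-band is masked.

    A sub-band is treated as masked when its level falls below the fixed
    threshold for that sub-band. Equality counts as non-masked (assumption).
    """
    if len(levels_db) != len(thresholds_db):
        raise ValueError("need exactly one threshold per sub-band")
    return [level < threshold
            for level, threshold in zip(levels_db, thresholds_db)]


if __name__ == "__main__":
    levels = [60.0, 30.0, 36.0]      # hypothetical sub-band levels in dB
    thresholds = [36.0, 36.0, 36.0]  # hypothetical fixed thresholds in dB
    print(classify_subbands(levels, thresholds))
```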
- The bit allocator may then discard the filtered audio data that was identified as masked audio data to advantageously decrease the total amount of filtered audio data to be processed by the encoder. Next, the bit allocator allocates all available allocation bits to the filtered audio data that was previously identified as non-masked audio data to generate allocated audio data to a quantizer.
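The source does not spell out how the available bits are distributed among the non-masked sub-bands, so the following is only one plausible sketch. It assumes a greedy scheme in which each bit goes to the sub-band whose level currently exceeds its threshold by the largest margin, and it assumes the common rule of thumb that each added quantizer bit is worth about 6.02 dB. The names `allocate_bits`, `DB_PER_BIT`, and `max_bits_per_band` are hypothetical.

```python
import heapq

DB_PER_BIT = 6.02  # assumed signal-to-noise gain per added quantizer bit


def allocate_bits(levels_db, thresholds_db, total_bits, max_bits_per_band=15):
    """Greedily hand out total_bits to the non-masked sub-bands.

    Masked sub-bands (level below threshold) are discarded and get 0 bits.
    """
    bits = [0] * len(levels_db)
    # Max-heap on the margin above the threshold, stored as negative values
    # because heapq is a min-heap. Masked sub-bands are never entered.
    heap = [(-(level - threshold), band)
            for band, (level, threshold)
            in enumerate(zip(levels_db, thresholds_db))
            if level >= threshold]
    heapq.heapify(heap)
    remaining = total_bits
    while remaining > 0 and heap:
        neg_margin, band = heapq.heappop(heap)
        bits[band] += 1
        remaining -= 1
        if bits[band] < max_bits_per_band:
            # One more bit lowers the remaining margin by about 6.02 dB.
            heapq.heappush(heap, (neg_margin + DB_PER_BIT, band))
    return bits


if __name__ == "__main__":
    # Hypothetical levels and thresholds; sub-band 1 is masked.
    print(allocate_bits([60.0, 30.0, 50.0], [36.0, 36.0, 36.0], total_bits=6))
```

With these illustrative inputs the masked sub-band receives no bits, and the whole budget is spent on the two sub-bands that lie above their thresholds.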
- In response, the quantizer quantizes the allocated audio data to generate quantized audio data to a bitstream packer. Finally, the bitstream packer packs the quantized audio data to produce encoded audio data for storage onto an appropriate and compatible storage medium, in accordance with the present invention. The present invention thus efficiently and effectively provides a system and method for effectively implementing fixed masking thresholds in an audio encoder device.
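The final two stages can be sketched in simplified form. This is an assumption-laden illustration, not the format used by the encoder in the source: it uses plain uniform quantization of samples normalized to the range -1.0 to 1.0 and packs the resulting codes most-significant-bit first with zero padding, without the headers or scale factors that a real bitstream format such as MPEG would carry. The names `quantize` and `pack_codes` are hypothetical.

```python
def quantize(sample, bits):
    """Uniformly quantize a sample in [-1.0, 1.0] to an unsigned code.

    The code uses `bits` bits, so it ranges from 0 to 2**bits - 1.
    """
    if bits < 1:
        raise ValueError("bits must be at least 1")
    max_code = (1 << bits) - 1
    clipped = max(-1.0, min(1.0, sample))
    return round((clipped + 1.0) / 2.0 * max_code)


def pack_codes(codes_and_widths):
    """Concatenate (code, bit_width) pairs MSB-first into bytes.

    The final byte is padded with zero bits on the right.
    """
    accumulator = 0
    total_bits = 0
    for code, width in codes_and_widths:
        accumulator = (accumulator << width) | (code & ((1 << width) - 1))
        total_bits += width
    padding = (-total_bits) % 8
    accumulator <<= padding
    return accumulator.to_bytes((total_bits + padding) // 8, "big")


if __name__ == "__main__":
    # Two samples quantized with different (hypothetical) bit allocations.
    codes = [(quantize(1.0, 4), 4), (quantize(-1.0, 2), 2)]
    print(pack_codes(codes).hex())
```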
- FIG. 1 is a block diagram for one embodiment of an audio encoder-decoder system;
- FIG. 2 is a graph for one embodiment of exemplary masking thresholds for the FIG. 1 encoder-decoder system;
- FIG. 3 is a block diagram for one embodiment of an encoder-decoder system, in accordance with the present invention;
- FIG. 4 is a block diagram for one embodiment of the encoder filter bank of FIG. 3, in accordance with the present invention;
- FIG. 5 is a block diagram for one embodiment of the masking threshold lookup table of FIG. 3, in accordance with the present invention;
- FIG. 6 is a graph showing absolute hearing thresholds, in accordance with the present invention;
- FIG. 7 is a graph for one embodiment of exemplary fixed masking thresholds, in accordance with the present invention; and
- FIG. 8 is a flowchart of method steps for one embodiment to effectively implement fixed masking thresholds, in accordance with the present invention.
- The present invention relates to an improvement in signal processing systems. The following description is presented to enable one of ordinary skill in the art to make and use the invention and is provided in the context of a patent application and its requirements. Various modifications to the preferred embodiment will be readily apparent to those skilled in the art and the generic principles herein may be applied to other embodiments. Thus, the present invention is not intended to be limited to the embodiment shown, but is to be accorded the widest scope consistent with the principles and features described herein.
- The present invention comprises an encoder device that includes a filter bank for filtering source audio data to produce frequency sub-bands, a lookup table for storing masking thresholds corresponding to the frequency sub-bands, and a bit allocator for using the masking thresholds to identify and discard masked audio data to thereby reduce the total amount of audio data that requires processing by the encoder device.
- Referring now to FIG. 3, a block diagram for one embodiment of an encoder-decoder (codec)310 is shown, in accordance with the present invention. In the FIG. 3 embodiment,
codec 310 comprises anencoder 312, and adecoder 314.Encoder 312 preferably includes afilter bank 318, a masking threshold lookup table 326, abit allocator 322, aquantizer 332, and abitstream packer 336.Decoder 314 preferably includes abitstream unpacker 344, adequantizer 348, and afilter bank 352. - In the FIG. 3 embodiment,
encoder 312 anddecoder 314 preferably function in response to a set of program instructions called an audio manager that is executed by a processor device (not shown). In alternate embodiments,encoder 312 anddecoder 314 may also be implemented and controlled using appropriate hardware configurations. The FIG. 3 embodiment specifically discusses encoding and decoding digital audio data, however the present invention may advantageously be utilized to process and manipulate other types of electronic information. - During an encoding operation,
encoder 312 receives source audio data from any compatible audio source viapath 316. In the FIG. 3 embodiment, the source audio data onpath 316 includes digital audio data that is preferably formatted in a linear pulse code modulation (LPCM) format.Encoder 312 preferably processes 16-bit digital samples of the source audio data in units called “frames”. In the preferred embodiment, each frame contains 1152 samples. - In practice,
filter bank 318 receives and separates the source audio data into a set of discrete frequency sub-bands to generate filtered audio data. In the FIG. 3 embodiment, the filtered audio data fromfilter bank 318 preferably includes thirty-two unique and separate frequency sub-bands.Filter bank 318 then provides the filtered audio data (sub-bands) to bit allocator 322 viapath 320. -
Bit allocator 322 then accesses relevant information from lookup table 326 viapath 328, and responsively generates allocated audio data to quantizer 332 viapath 330.Bit allocator 322 creates the allocated audio data by assigning binary digits (bits) to represent the signal contained in each of the sub-bands received fromfilter bank 318. The functionality of lookup table 326 and bit allocator 322 are further discussed below in conjunction with FIGS. 5-8. - Next,
quantizer 332 compresses and codes the allocated audio data to generate quantized audio data tobitstream packer 336 viapath 334.Bitstream packer 336 responsively packs the quantized audio data to generate encoded audio data that may then be provided to an audio device (such as a recordable compact disc device or a computer system) viapath 338. - During a decoding operation, encoded audio data is provided from an audio device to
bitstream unpacker 344 viapath 340. Bitstream unpacker 344 responsively unpacks the encoded audio data to generate quantized audio data to dequantizer 348 viapath 346.Dequantizer 348 then dequantizes the quantized audio data to generate dequantized audio data to filterbank 352 viapath 350.Filter bank 352 responsively filters the dequantized audio data to generate and provide decoded audio data to an audio playback system viapath 354. - Referring now to FIG. 4, a block diagram for one embodiment of the FIG. 3
encoder filter bank 318 is shown, in accordance with the present invention. In the FIG. 4 embodiment,filter bank 318 receives source audio data from a compatible audio source viapath 316.Filter bank 318 then responsively divides the received source audio data into a series of frequency sub-bands that are each provided to bit allocator 322. The FIG. 4 embodiment preferably generates thirty two sub-bands 320(a) through 320(h), however, in alternate embodiments,filter bank 318 may readily output a greater or lesser number of sub-bands. - Referring now to FIG. 5, a block diagram for one embodiment of the FIG. 3 masking threshold lookup table326 is shown, in accordance with the present invention. In other embodiments of the present invention, lookup table 326 may readily be implemented using any other appropriate and compatible data structure. In the FIG. 5 embodiment, lookup table 326 includes a frequency 1 (512) through a frequency N (518), and a masking threshold 1 (520) through a masking threshold N (526). In the FIG. 5 embodiment, each
frequency 512 through 518 uniquely corresponds with an individual masking threshold 520 through 526. For example, frequency 1 (512) corresponds to masking threshold 1 (520), and frequency N (518) corresponds to masking threshold N (526).

In the FIG. 5 embodiment,
frequencies 512 through 518 may represent the individual frequency sub-bands generated by filter bank 318, or, alternately, may represent individual frequencies from the filtered audio data generated by filter bank 318. In practice, bit allocator 322 may thus identify a particular frequency or a frequency sub-band 512 through 518 contained in the filtered audio data received from filter bank 318. Bit allocator 322 may then access the masking threshold 520 through 526 that corresponds to the particular frequency or frequency sub-band by referencing lookup table 326.
Bit allocator 322 may then advantageously identify and discard any masked audio data (from the filtered audio data) that falls below the masking thresholds 520 through 526. Implementing encoder 312 with masking threshold lookup table 326 thus significantly reduces the overall complexity of encoder 312, while still preserving the benefits of utilizing masking thresholds.

Referring now to FIG. 6, a
graph 610 illustrating absolute hearing thresholds 616 is shown, in accordance with the present invention. In FIG. 6, graph 610 displays audio data signal energy in decibels on vertical axis 612. Graph 610 also displays frequency sub-bands (generated by filter bank 318) on horizontal axis 614.

In
graph 610, absolute hearing thresholds 616 represent empirically determined limits of human hearing. In other words, human hearing does not detect sound energy that falls below absolute hearing thresholds 616. In selected embodiments of the present invention, masking thresholds 520 through 526 of lookup table 326 (FIG. 5) are defined with reference to absolute hearing thresholds 616. For example, masking thresholds 520 through 526 may be substantially equal to absolute hearing thresholds 616.

In other embodiments of the present invention, selected segments of
absolute hearing thresholds 616 may advantageously be altered or “tuned” to achieve improved performance of encoder 312. For example, selected higher frequency sub-bands may be represented in lookup table 326 by using corresponding masking thresholds that are tuned to threshold values which are higher than those corresponding thresholds contained in absolute hearing thresholds 616. This tuning of lookup table 326 (for the selected higher frequency sub-bands) may thus facilitate optimal allocation of available allocation bits by bit allocator 322, while still maintaining high quality in the encoded audio data.

Referring now to FIG. 7, a
graph 710 for one embodiment of exemplary fixed masking thresholds is shown, in accordance with the present invention. Graph 710 displays audio data signal energy on vertical axis 712, and also displays a series of frequency sub-bands on horizontal axis 714. Graph 710 is presented to illustrate principles of the present invention, and therefore, the values shown in graph 710 are intended as examples only. The present invention may thus readily function with operational values other than those presented in graph 710 of FIG. 7.

In FIG. 7,
graph 710 includes sub-band 1 (716) through sub-band 6 (726), and masking threshold values 728 that change for each FIG. 7 sub-band. In practice, bit allocator 322 initially receives frequency sub-band 1 (716) from filter bank 318, and responsively accesses corresponding masking threshold 730 by referencing lookup table 326. Bit allocator 322 may then advantageously identify and discard any masked audio data from sub-band 1 (716) that falls below masking threshold 730.
Bit allocator 322 next similarly accesses and utilizes masking threshold 732 in connection with sub-band 2 (718) to identify and discard any masked audio data. Bit allocator 322 then continues to sequentially access and utilize masking thresholds for individual sub-bands until a current frame is complete. The foregoing process is repeated for each frame of audio data until all frames have been processed by encoder 312.

Referring now to FIG. 8, a flowchart of method steps for one embodiment to effectively implement fixed masking thresholds is shown, in accordance with the present invention. Initially, in
step 812, filter bank 318 of encoder 312 receives and filters source audio data into frequency sub-bands to provide filtered audio data to bit allocator 322.

Next, in
step 814, system designers of encoder 312 create a masking threshold lookup table 326. The contents and functionality of the lookup table 326 are discussed above in conjunction with FIGS. 3 and 5-7. Then, in step 816, bit allocator 322 analyzes the filtered audio data using the fixed masking thresholds contained in lookup table 326, as discussed above in conjunction with FIGS. 3 and 5-7. Specifically, bit allocator 322 identifies any filtered audio data that falls below the fixed masking thresholds in lookup table 326 as masked audio data. Similarly, bit allocator 322 identifies any filtered audio data that lies above the fixed masking thresholds in lookup table 326 as non-masked audio data.

Then, in
step 818, bit allocator 322 may advantageously disregard or discard the filtered audio data that was identified as masked audio data in the preceding step 816. In step 820, bit allocator 322 next allocates all available allocation bits to the filtered audio data that was identified as non-masked audio data (in foregoing step 816) to generate allocated audio data to quantizer 332. In one embodiment of the present invention, the step 820 bit allocation process may be performed using similar techniques to those disclosed in co-pending U.S. patent application Ser. No. ______, entitled “System And Method For Preventing Artifacts In An Audio Decoder Device,” filed on ______, which is hereby incorporated by reference.

In
step 822, quantizer 332 responsively quantizes the allocated audio data to generate quantized audio data to bitstream packer 336. Finally, in step 824, bitstream packer 336 packs the quantized audio data to produce encoded audio data for storage onto an appropriate and compatible storage medium, in accordance with the present invention.

The invention has been explained above with reference to a preferred embodiment. Other embodiments will be apparent to those skilled in the art in light of this disclosure. For example, the present invention may readily be implemented using configurations and techniques other than those described in the preferred embodiment above. Additionally, the present invention may effectively be used in conjunction with systems other than the one described above as the preferred embodiment. Therefore, these and other variations upon the preferred embodiments are intended to be covered by the present invention, which is limited only by the appended claims.
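As an informal illustration only, and not part of the disclosed embodiments, the fixed-threshold lookup of FIG. 5 and the classification and allocation of steps 816 through 820 might be sketched as follows. The threshold values, the names `MASKING_THRESHOLDS_DB`, `classify_sub_bands`, and `allocate_bits`, and the greedy 6.02 dB-per-bit allocation rule are all assumptions made for this sketch; the description itself defers the allocation technique to the co-pending application.

```python
from typing import Dict, List

# Hypothetical fixed masking thresholds in dB, one per sub-band index.
# This list stands in for lookup table 326; the values are illustrative only.
MASKING_THRESHOLDS_DB: List[float] = [30.0, 20.0, 12.0, 10.0, 15.0, 25.0]

# Assumption: each allocated bit lowers quantization noise by roughly 6.02 dB.
DB_PER_BIT = 6.02


def classify_sub_bands(energies_db: List[float]) -> List[bool]:
    """Return True for each sub-band whose energy lies above its fixed threshold."""
    return [e > t for e, t in zip(energies_db, MASKING_THRESHOLDS_DB)]


def allocate_bits(energies_db: List[float], available_bits: int) -> List[int]:
    """Greedily hand out bits to non-masked sub-bands; masked sub-bands get none."""
    non_masked = classify_sub_bands(energies_db)
    # Remaining signal-to-threshold margin for every non-masked sub-band.
    margin: Dict[int, float] = {
        i: energies_db[i] - MASKING_THRESHOLDS_DB[i]
        for i, keep in enumerate(non_masked)
        if keep
    }
    bits = [0] * len(energies_db)
    for _ in range(available_bits):
        if not margin:
            break  # every sub-band is masked, so there is nothing to allocate
        neediest = max(margin, key=margin.get)
        bits[neediest] += 1
        margin[neediest] -= DB_PER_BIT
    return bits
```

Because the thresholds are fixed at design time, the per-frame work in this sketch reduces to one table read and one comparison per sub-band, which is the complexity saving the description attributes to lookup table 326.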
Claims (29)
Priority Applications (8)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/221,394 US6418404B1 (en) | 1998-12-28 | 1998-12-28 | System and method for effectively implementing fixed masking thresholds in an audio encoder device |
AU31258/00A AU3125800A (en) | 1998-12-28 | 1999-12-15 | System and method for effectively implementing fixed masking thresholds in an audio encoder device |
KR1020007008584A KR20010040705A (en) | 1998-12-28 | 1999-12-15 | System and method for effectively implementing fixed masking thresholds in an audio encoder device |
EP99965312A EP1145223A3 (en) | 1998-12-28 | 1999-12-15 | System and method for effectively implementing fixed masking thresholds in an audio encoder device |
JP2000591609A JP2002534039A (en) | 1998-12-28 | 1999-12-15 | Apparatus and method for effectively achieving a fixed masking threshold in an audio encoding device |
CA002320169A CA2320169A1 (en) | 1998-12-28 | 1999-12-15 | System and method for effectively implementing fixed masking thresholds in an audio encoder device |
PCT/US1999/030193 WO2000039787A2 (en) | 1998-12-28 | 1999-12-15 | System and method for effectively implementing fixed masking thresholds in an audio encoder device |
TW088122371A TW451059B (en) | 1998-12-28 | 1999-12-18 | System and method for effectively implementing fixed masking thresholds in an audio encoder device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/221,394 US6418404B1 (en) | 1998-12-28 | 1998-12-28 | System and method for effectively implementing fixed masking thresholds in an audio encoder device |
Publications (2)
Publication Number | Publication Date |
---|---|
US20020029143A1 (en) | 2002-03-07 |
US6418404B1 (en) | 2002-07-09 |
Family
ID=22827639
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/221,394 Expired - Fee Related US6418404B1 (en) | 1998-12-28 | 1998-12-28 | System and method for effectively implementing fixed masking thresholds in an audio encoder device |
Country Status (8)
Country | Link |
---|---|
US (1) | US6418404B1 (en) |
EP (1) | EP1145223A3 (en) |
JP (1) | JP2002534039A (en) |
KR (1) | KR20010040705A (en) |
AU (1) | AU3125800A (en) |
CA (1) | CA2320169A1 (en) |
TW (1) | TW451059B (en) |
WO (1) | WO2000039787A2 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100476103B1 (en) * | 2002-08-09 | 2005-03-10 | 한국과학기술원 | Implementation of Masking Algorithm Using the Feature Space Filtering |
KR100713452B1 (en) | 2003-12-06 | 2007-05-02 | 삼성전자주식회사 | Apparatus and method for coding of audio signal |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6963649B2 (en) * | 2000-10-24 | 2005-11-08 | Adaptive Technologies, Inc. | Noise cancelling microphone |
DE60209888T2 (en) * | 2001-05-08 | 2006-11-23 | Koninklijke Philips Electronics N.V. | CODING AN AUDIO SIGNAL |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4866777A (en) * | 1984-11-09 | 1989-09-12 | Alcatel Usa Corporation | Apparatus for extracting features from a speech signal |
DE3805946A1 (en) | 1988-02-25 | 1989-09-07 | Fraunhofer Ges Forschung | DEVICE FOR DETERMINING CHARACTERISTIC PARAMETERS FROM THE INPUT AND OUTPUT SIGNALS OF A SYSTEM FOR AUDIO SIGNAL PROCESSING |
US5040217A (en) | 1989-10-18 | 1991-08-13 | At&T Bell Laboratories | Perceptual coding of audio signals |
JP3446216B2 (en) | 1992-03-06 | 2003-09-16 | ソニー株式会社 | Audio signal processing method |
JP3173218B2 (en) | 1993-05-10 | 2001-06-04 | ソニー株式会社 | Compressed data recording method and apparatus, compressed data reproducing method, and recording medium |
US5632003A (en) * | 1993-07-16 | 1997-05-20 | Dolby Laboratories Licensing Corporation | Computationally efficient adaptive bit allocation for coding method and apparatus |
JP3277679B2 (en) | 1994-04-15 | 2002-04-22 | ソニー株式会社 | High efficiency coding method, high efficiency coding apparatus, high efficiency decoding method, and high efficiency decoding apparatus |
US5809454A (en) * | 1995-06-30 | 1998-09-15 | Sanyo Electric Co., Ltd. | Audio reproducing apparatus having voice speed converting function |
JP3328532B2 (en) * | 1997-01-22 | 2002-09-24 | シャープ株式会社 | Digital data encoding method |
1998
- 1998-12-28: US application US09/221,394, published as US6418404B1; status: Expired - Fee Related
1999
- 1999-12-15: AU application AU31258/00A, published as AU3125800A; status: Abandoned
- 1999-12-15: CA application CA002320169A, published as CA2320169A1; status: Abandoned
- 1999-12-15: EP application EP99965312A, published as EP1145223A3; status: Withdrawn
- 1999-12-15: KR application KR1020007008584A, published as KR20010040705A; status: Application Discontinuation
- 1999-12-15: JP application JP2000591609A, published as JP2002534039A; status: Withdrawn
- 1999-12-15: WO application PCT/US1999/030193, published as WO2000039787A2; status: Application Discontinuation
- 1999-12-18: TW application TW088122371A, published as TW451059B; status: IP Right Cessation
Also Published As
Publication number | Publication date |
---|---|
EP1145223A3 (en) | 2002-09-11 |
US6418404B1 (en) | 2002-07-09 |
AU3125800A (en) | 2000-07-31 |
TW451059B (en) | 2001-08-21 |
CA2320169A1 (en) | 2000-07-06 |
JP2002534039A (en) | 2002-10-08 |
WO2000039787A3 (en) | 2001-08-16 |
EP1145223A2 (en) | 2001-10-17 |
WO2000039787A2 (en) | 2000-07-06 |
KR20010040705A (en) | 2001-05-15 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US6502069B1 (en) | Method and a device for coding audio signals and a method and a device for decoding a bit stream | |
KR100871999B1 (en) | Audio coding | |
KR100903017B1 (en) | Scalable coding method for high quality audio | |
USRE46082E1 (en) | Method and apparatus for low bit rate encoding and decoding | |
US9741351B2 (en) | Adaptive quantization noise filtering of decoded audio data | |
US6240379B1 (en) | System and method for preventing artifacts in an audio data encoder device | |
WO1995032499A1 (en) | Encoding method, decoding method, encoding-decoding method, encoder, decoder, and encoder-decoder | |
WO2007066970A1 (en) | Method, medium, and apparatus encoding and/or decoding an audio signal | |
CA2490064A1 (en) | Audio coding method and apparatus using harmonic extraction | |
WO1996035269A1 (en) | Non-linearly quantizing an information signal | |
US6418404B1 (en) | System and method for effectively implementing fixed masking thresholds in an audio encoder device | |
US6647063B1 (en) | Information encoding method and apparatus, information decoding method and apparatus and recording medium | |
US6745162B1 (en) | System and method for bit allocation in an audio encoder | |
JP2000151413A (en) | Method for allocating adaptive dynamic variable bit in audio encoding | |
JP3942882B2 (en) | Digital signal encoding apparatus and digital signal recording apparatus having the same | |
KR100754389B1 (en) | Apparatus and method for encoding a speech signal and an audio signal | |
JPH0537395A (en) | Band-division encoding method | |
JP2002351500A (en) | Method of encoding digital data | |
Stuart et al. | MLP lossless compression | |
JPH0916199A (en) | Semi-reversible coding device of voice | |
JPH07336231A (en) | Method and device for coding signal, method and device for decoding signal and recording medium | |
JP2001094432A (en) | Sub-band coding and decoding method | |
JP2005043761A (en) | Information amount conversion device and information amount conversion system | |
JPH06348294A (en) | Band dividing and coding device |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: SONY CORPORATION, JAPAN; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: YIN, LIN; REEL/FRAME: 009683/0446; Effective date: 19981214. Owner name: SONY ELECTRONICS INC., NEW JERSEY; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: YIN, LIN; REEL/FRAME: 009683/0446; Effective date: 19981214 |
FPAY | Fee payment | Year of fee payment: 4 |
REMI | Maintenance fee reminder mailed | |
LAPS | Lapse for failure to pay maintenance fees | |
STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
FP | Lapsed due to failure to pay maintenance fee | Effective date: 20100709 |