WO2010036897A1 - Method and apparatus for signal processing using transform-domain log-companding - Google Patents
- Publication number: WO2010036897A1 (application PCT/US2009/058387)
- Authority: WIPO (PCT)
- Prior art keywords: data signal, transform, inverse, coefficients, companding
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/50—Conversion to or from non-linear codes, e.g. companding
- G10L19/032—Quantisation or dequantisation of spectral components
- G10L19/0204—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
- G10L19/0212—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
Definitions
- the present disclosure relates generally to communications, and more specifically, to signal compression using spectral domain log companding.
- a codec generally includes an encoder and a decoder.
- the encoder typically divides the incoming speech signal (a digital signal representing audio information) into segments of time called “frames,” analyzes each frame to extract certain relevant parameters, and quantizes the parameters into an encoded frame.
- the encoded frames are transmitted over a transmission channel (i.e., a wired or wireless network connection) to a receiver that includes a decoder.
- the decoder receives and processes encoded frames, dequantizes them to produce the parameters, and recreates speech frames using the dequantized parameters.
- a method for encoding includes receiving a data signal, performing a transform of the data signal to provide at least two coefficients, and performing log companding of the at least two coefficients to provide a compressed data signal.
- a method for decoding includes receiving a compressed data signal, performing expansion by inverse log companding of the compressed data signal to obtain at least two coefficients, and performing inverse transform on the at least two coefficients to provide a data signal.
- an apparatus for encoding is disclosed.
- the apparatus includes a receiver configured to receive a data signal, a transform circuit configured to decompose the data signal to provide at least two coefficients, and a log companding circuit configured to encode the at least two coefficients to provide a compressed data signal.
- an apparatus for decoding includes a receiver configured to receive a compressed data signal, an inverse log companding circuit configured to decode the compressed data signal to obtain at least two coefficients, and an inverse transform circuit configured to reconstruct a data signal from the at least two coefficients.
- an apparatus for encoding is disclosed.
- the apparatus includes means for receiving a data signal, means for performing a transform of the data signal to provide at least two coefficients, and means for performing log companding of the at least two coefficients to provide a compressed data signal.
- an apparatus for decoding is disclosed.
- the apparatus includes means for receiving a compressed data signal, means for performing inverse log companding by decoding the compressed data signal to obtain at least two coefficients, and means for performing inverse transform on the at least two coefficients to provide a data signal.
- a computer program product for encoding includes a computer-readable medium comprising instructions executable to receive a data signal, perform a transform of the data signal to provide at least two coefficients, and perform log companding of the at least two coefficients to provide a compressed data signal.
- a computer program product for decoding includes a computer-readable medium comprising instructions executable to receive a compressed data signal, perform inverse log companding by decoding the compressed data signal to obtain at least two coefficients, and perform inverse transform on the at least two coefficients to provide a data signal.
- in yet a further aspect of the disclosure, a headset includes a receiver configured to receive a compressed data signal; an inverse log companding circuit configured to decode the compressed data signal to obtain at least two coefficients; an inverse transform circuit configured to reconstruct a data signal from the at least two coefficients; and a transducer configured to provide audio output based on the reconstructed data signal.
- a sensing device includes a sensor configured to detect a data signal; a transform circuit configured to decompose the data signal to provide at least two coefficients; a log companding circuit configured to encode the at least two coefficients to provide a compressed data signal; and a transmitter configured to transmit the compressed data signal.
- in yet a further aspect of the disclosure, a handset includes a transducer configured to detect an audio signal; a transform circuit configured to decompose the audio signal to provide at least two coefficients; a log companding circuit configured to encode the at least two coefficients to provide a compressed audio signal; and an antenna configured to transmit the compressed audio signal.
- a watch is disclosed. The watch includes a receiver configured to receive a compressed data signal; an inverse log companding circuit configured to decode the compressed data signal to obtain at least two coefficients; an inverse transform circuit configured to reconstruct a data signal from the at least two coefficients; and a user interface configured to provide an indication based on the reconstructed data signal.
- the one or more aspects comprise the features hereinafter fully described and particularly pointed out in the claims.
- the following description and the annexed drawings set forth in detail certain illustrative features of the one or more aspects. These features are indicative, however, of but a few of the various ways in which the principles of various aspects may be employed, and this description is intended to include all such aspects and their equivalents.
- FIG. 1 is a diagram illustrating an example of a wireless network;
- FIG. 2 is a block diagram illustrating a signal compression system configured in accordance with various aspects disclosed herein;
- FIGs. 3A-3C are plots of example probability distributions of the first, second and sixth Discrete Cosine Transform (DCT) coefficients, respectively, in accordance with various aspects of the disclosure;
- FIGs. 4A and 4B are flow diagrams illustrating encoding/decoding functions performed in accordance with aspects of the disclosure;
- FIG. 5 is a block diagram illustrating a system for facilitating speech/audio signal processing in a wireless network, in accordance with aspects of the disclosure;
- FIG. 6 is a block diagram illustrating a receiver for facilitating improved wireless audio/speech decoding, in accordance with aspects of the disclosure;
- FIG. 7 is a block diagram illustrating a transmitter for facilitating speech/audio signal compression, in accordance with aspects of the disclosure;
- FIG. 8 is a block diagram illustrating an encoding apparatus configured in accordance with aspects of the disclosure.
- FIG. 9 is a block diagram illustrating a decoding apparatus configured in accordance with aspects of the disclosure.
- An example of a short range communications network suitable for supporting one or more aspects presented throughout this disclosure is illustrated in FIG. 1.
- the network 100 is shown with various wireless nodes that communicate using any suitable radio technology or wireless protocol.
- the wireless nodes may be configured to support Ultra-Wideband (UWB) technology.
- the wireless nodes may be configured to support various wireless protocols such as Bluetooth or IEEE 802.11, just to name a few.
- the network 100 is shown with a computer 102 in communication with the other wireless nodes.
- the computer 102 may receive digital photos from a digital camera 104, send documents to a printer 106 for printing, synch-up with e-mail on a personal digital assistant (PDA) 108, transfer music files to a digital audio player (e.g., MP3 player) 110, back up data and files to a mobile storage device 112, and communicate with a remote network (e.g., the Internet) via a wireless hub 114.
- the network 100 may also include a number of mobile and compact nodes, either wearable or implanted into the human body.
- a person may be wearing a headset 116 (e.g., headphones, earpiece, etc.) that receives streamed audio from the computer 102, a watch 118 that is set by the computer 102, and/or a sensor 120 which monitors vital body parameters (e.g., a biometric sensor, a heart rate monitor, a pedometer, an EKG device, etc.).
- aspects presented throughout this disclosure may also be configured to support communications in a wide area network supporting any suitable wireless protocol, including by way of example, Evolution-Data Optimized (EV-DO), Ultra Mobile Broadband (UMB), Code Division Multiple Access (CDMA) 2000, Long Term Evolution (LTE), or Wideband CDMA (W-CDMA), just to name a few.
- the wireless node may be configured to support wired communications using cable modem, Digital Subscriber Line (DSL), fiber optics, Ethernet, HomeRF, or any other suitable wired access protocol.
- a wireless device may communicate via an impulse-based wireless communication link.
- an impulse-based wireless communication link may utilize ultra-wideband pulses that have a relatively short length (e.g., on the order of a few nanoseconds or less) and a relatively wide bandwidth.
- the ultra-wideband pulses may have a fractional bandwidth on the order of approximately 20% or more and/or have a bandwidth on the order of approximately 500 MHz or more.
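- The two thresholds above can be checked with a few lines of arithmetic. The following is a minimal sketch that assumes the common definition of fractional bandwidth (bandwidth divided by center frequency), which the text does not spell out; the function names and the example band edges are illustrative only.

```python
def fractional_bandwidth(f_low_hz: float, f_high_hz: float) -> float:
    """Bandwidth divided by center frequency (assumed definition)."""
    center_hz = (f_low_hz + f_high_hz) / 2.0
    return (f_high_hz - f_low_hz) / center_hz


def is_ultra_wideband(f_low_hz: float, f_high_hz: float) -> bool:
    """Apply the two thresholds quoted in the text: about 20% fractional
    bandwidth or more, and/or about 500 MHz of bandwidth or more."""
    bandwidth_hz = f_high_hz - f_low_hz
    return fractional_bandwidth(f_low_hz, f_high_hz) >= 0.20 or bandwidth_hz >= 500e6


# Hypothetical band edges, chosen only to exercise both outcomes.
print(is_ultra_wideband(3.1e9, 4.1e9))    # 1 GHz wide
print(is_ultra_wideband(2.40e9, 2.42e9))  # 20 MHz wide
```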
- the teachings herein may be incorporated into (e.g., implemented within or performed by) a variety of apparatuses (e.g., devices), such as a phone (e.g., a cellular phone), a personal data assistant (PDA), an entertainment device (e.g., a music or video device), a headset (e.g., headphones, an earpiece, etc.), a microphone, a medical sensing device (e.g., a biometric sensor, a heart rate monitor, a pedometer, an EKG device, a smart bandage, etc.), a user I/O device (e.g., a watch, a remote control, a light switch, a keyboard, a mouse, etc.), an environment sensing device (e.g., a tire pressure monitor), a monitor that may receive data from the medical or environment sensing device, a computer, or a point-of-sale device.
- teachings herein may be adapted for use in low power applications (e.g., through the use of an impulse-based signaling scheme and low duty cycle modes) and may support a variety of data rates including relatively high data rates (e.g., through the use of high-bandwidth pulses).
- aspects disclosed herein take advantage of the fact that the human ear is less sensitive to concealment of drop-outs in the frequency domain than to concealment of drop-outs in the time-domain.
- aspects disclosed herein apply equally well to a wide range of signals including audio, ultra-wideband speech, wideband speech and narrowband speech, among others.
- aspects of the disclosure provide a low-complexity, low-latency solution to audio/speech compression that is robust to channel errors, utilizes spectral domain log companding (compression and expansion), and achieves transparent quality for wideband speech and audio.
- Aspects disclosed herein can be implemented with hardware friendly operations such as shift-and-adds, which require less power and area than traditional decoders.
- Aspects disclosed herein approach signal compression by applying log companding on spectral domain representations of signals.
- aspects of the disclosure combine these concepts by first computing the frequency domain representation of the signal. Transforms project data from one basis to another with the goal of representing the original data in a way which allows for the application of some psychoacoustic masking. Typically, this is done by separating a signal into specific frequency bands (interchangeably referred to herein as "bins") through the use of transforms, as in the case of the MP3 encoder, for example.
- Upon computing the spectral domain representations of the audio/speech signals, aspects of the disclosure perform log companding with different compression ratios on each spectral coefficient. Since very little audio/speech energy resides in the upper frequency bands, the allocation of very few bits in those bands can maintain good quality. The resulting average number of bits per sample can therefore be reduced and is scalable with audio/speech quality. In addition, since the signal is encoded in the spectral domain, bursty channel errors affect frequency bands in the time-frequency plane rather than causing simple dropouts in time. These errors are much less disagreeable to the human ear and, when subjected to simple spectral domain interpolation, can be effectively concealed.
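- The text does not specify how the spectral domain interpolation is carried out. The sketch below shows one possible reading, in which a lost coefficient is replaced by the average of the same frequency bin in the neighboring frames; the function name, the data layout, and the fallback to zero are all assumptions made for illustration.

```python
def conceal_lost_coefficients(frames, lost):
    """Fill lost (frame_index, bin_index) entries by averaging the same bin
    in the previous and next frames, when those neighbors were received.

    frames: list of per-frame coefficient lists (hypothetical layout).
    lost:   set of (frame_index, bin_index) pairs flagged as corrupted.
    """
    repaired = [list(frame) for frame in frames]
    for t, k in sorted(lost):
        neighbors = []
        if t > 0 and (t - 1, k) not in lost:
            neighbors.append(frames[t - 1][k])
        if t + 1 < len(frames) and (t + 1, k) not in lost:
            neighbors.append(frames[t + 1][k])
        # With no usable neighbor, fall back to zero (an assumed choice).
        repaired[t][k] = sum(neighbors) / len(neighbors) if neighbors else 0.0
    return repaired


frames = [[1.0, 0.2], [9.9, 0.3], [3.0, 0.4]]
print(conceal_lost_coefficients(frames, {(1, 0)}))
```

Only the flagged bin is touched, so the remaining bins of the affected frame keep their received values.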
- the invention may be implemented by performing a transform in the time-scale domain, in addition to the time-frequency domain.
- an example of a time-scale transform is a wavelet transform.
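- As a concrete illustration of a time-scale transform, the sketch below implements a single level of the Haar wavelet transform. The choice of the Haar wavelet and the orthonormal scaling are assumptions; the text does not name a particular wavelet.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling


def haar_forward(x):
    """One Haar level: pairwise sums (approximation) and differences (detail)."""
    approx = [(x[i] + x[i + 1]) * _S for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * _S for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the even-length input from one level of Haar coefficients."""
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) * _S, (a - d) * _S])
    return x


approx, detail = haar_forward([1.0, 1.0, 2.0, 2.0])
print(approx, detail)  # detail is all zeros for pairwise-constant input
```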
- the system 200 includes an encoder 210 and a decoder 220.
- the encoder 210 includes a time-to-frequency decomposition block 212, a plurality of companders 214 and a packetizer 216.
- the decoder 220 includes an unpacketizer 222, a plurality of inverse companders 224, and an inverse transform block 226.
- time-to-frequency decomposition block 212 uses a Discrete Cosine Transform (DCT).
- the DCT algorithm decorrelates the signal into multiple frequency bands or bins.
- an 8-point DCT transform may be performed, although the point number may vary.
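- A minimal sketch of an 8-point DCT and its inverse follows. It assumes the DCT-II with orthonormal scaling, which the text does not specify, and uses plain Python rather than any particular signal processing library.

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II of a frame (8 samples in the example from the text)."""
    n = len(x)
    out = []
    for k in range(n):
        acc = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * acc)
    return out


def idct_ii(c):
    """Inverse of dct_ii: rebuilds the time-domain frame from coefficients."""
    n = len(c)
    out = []
    for i in range(n):
        acc = 0.0
        for k in range(n):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            acc += scale * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(acc)
    return out


# A constant frame puts all of its energy in the first coefficient.
print(dct_ii([0.5] * 8))
```

With this scaling, a constant frame of amplitude 0.5 yields a first coefficient of about 1.41, which falls outside the range of -1 to 1 even though every input sample is within it.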
- the statistical distribution of each spectral coefficient is Laplacian in nature with much higher probability for lower amplitude coefficients, compared with higher amplitude coefficients.
- the variances of the coefficients significantly decrease for the higher-order coefficients.
- Example probability distributions of the first, second and sixth DCT coefficients, respectively, are shown in FIGs. 3A-3C. As can be seen from the example distributions in FIGs. 3A-3C, fewer bits may be allocated for the higher DCT coefficients.
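- To make the effect of a per-coefficient bit allocation concrete, the sketch below computes the average bits per sample and the resulting bitrate. The allocation shown is hypothetical and not taken from the disclosure, and the calculation assumes non-overlapping frames in which an 8-point transform of 8 samples yields 8 coefficients.

```python
def average_bits_per_sample(bits_per_coefficient):
    """Mean number of bits per coefficient, which equals bits per sample when
    each frame of N samples produces N coefficients (assumed here)."""
    return sum(bits_per_coefficient) / len(bits_per_coefficient)


def bitrate_bps(bits_per_coefficient, sample_rate_hz):
    """Bitrate of the coefficient payload, ignoring any packet overhead."""
    return average_bits_per_sample(bits_per_coefficient) * sample_rate_hz


# Hypothetical allocation: fewer bits for the higher DCT coefficients.
allocation = (8, 7, 6, 5, 4, 4, 3, 3)
print(average_bits_per_sample(allocation))
print(bitrate_bps(allocation, 32000))
```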
- any transform that decorrelates a signal into multiple frequency bands may be used to achieve similar results.
- use of the DCT may be compared to classifying the energy of a signal into evenly divided frequency bands. For example, for data sampled at 32 kHz or 48 kHz, the coefficients from an 8-point DCT could roughly represent the amount of energy in consecutive 2 kHz or 3 kHz frequency bands, respectively, up to 16 kHz or 24 kHz. It is known from psychoacoustic modeling that human hearing becomes less sensitive at frequencies above 16 kHz.
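- The band arithmetic in this example can be reproduced directly: the usable spectrum extends to half the sampling rate and is split evenly among the transform coefficients. The helper name below is illustrative.

```python
def dct_band_edges_hz(sample_rate_hz, num_points=8):
    """Approximate (low, high) frequency range covered by each coefficient."""
    nyquist_hz = sample_rate_hz / 2
    width_hz = nyquist_hz / num_points
    return [(k * width_hz, (k + 1) * width_hz) for k in range(num_points)]


for rate in (32000, 48000):
    edges = dct_band_edges_hz(rate)
    print(rate, edges[0], edges[-1])
```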
- Log companding, such as the μ-law/A-law algorithm, is an efficient compression tool for signals having a Laplacian/Exponential distribution, and works well for signals, such as speech, that have a distribution that resembles a Laplacian distribution, despite having a wide dynamic range.
- coarser quantization is used for larger sample values and progressively finer quantization is used for smaller sample values. This characteristic has been successfully exploited in telephony compression algorithms, e.g., G.711 specifications, which allow for intelligible transmission of speech at much lower bitrates (e.g., 8 bits per sample).
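- The behavior described above follows from the shape of the companding curve. The sketch below uses the continuous μ-law formula with μ = 255 as an illustration; G.711 itself specifies a piecewise-linear approximation of this curve, which is not reproduced here.

```python
import math


def mu_law_compress(x, mu=255.0):
    """Map x in [-1, 1] to [-1, 1], stretching values near zero."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=255.0):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


# A sample at 1% of full scale lands well above 1% of the output range,
# so a uniform quantizer applied afterwards resolves it finely.
print(mu_law_compress(0.01))
print(mu_law_expand(mu_law_compress(0.01)))
```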
- G.711 log companding (compression and expansion) specifications are described in the International Telecommunication Union (ITU-T) Recommendation G.711 (November 1988) - Pulse code modulation (PCM) of voice frequencies and in the G711.C, G.711 ENCODING/DECODING FUNCTIONS, and are incorporated herein in their entirety.
- There are two G.711 log companding schemes: a μ-law companding scheme and an A-law companding scheme.
- Both the μ-law companding scheme and the A-law companding scheme are Pulse Code Modulation (PCM) methods. That is, an analog signal is sampled and the amplitude of each sampled signal is quantized, i.e., assigned a digital value.
- the logarithmic curve is divided into segments, wherein each successive segment is twice the length of the previous segment.
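- The doubling rule can be illustrated with a short calculation. The first segment length, the number of segments, and the number of uniform steps per segment used below are illustrative assumptions, not values quoted from the recommendation.

```python
def segment_edges(first_length, num_segments):
    """Segment boundaries where each segment is twice as long as the last."""
    edges = [0]
    length = first_length
    for _ in range(num_segments):
        edges.append(edges[-1] + length)
        length *= 2
    return edges


def step_sizes(first_length, num_segments, steps_per_segment):
    """Uniform step size inside each segment; it doubles with the segment."""
    return [first_length * 2 ** s // steps_per_segment for s in range(num_segments)]


print(segment_edges(32, 8))
print(step_sizes(32, 8, 16))
```

Because the step size doubles from one segment to the next, the quantizer is fine near zero and coarse near full scale while each segment stays linear.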
- the A-law and μ-law companding schemes have different segment lengths because the μ-law and A-law companding schemes calculate the linear approximation differently. It should be noted that although aspects have been described in reference to log companding using the G.711 specifications, any log companding specification that allows intelligible transmission of speech at low bitrates may be used to achieve similar goals.
- log companding, which operates on values between -1 and 1, is applied on the DCT coefficients by the plurality of log companders 214, each using a different companding parameter, such as a μ constant (μ1 to μn).
- Log companding effectively allocates more quantization steps around 0, and fewer as the magnitude of the sample values increases.
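- One way to see this allocation is to list the reconstruction levels of a small companded quantizer and measure the gaps between them. The sketch below assumes the continuous μ-law expansion with μ = 255 and a symmetric 4-bit quantizer; both are illustrative choices.

```python
import math


def mu_law_expand(y, mu=255.0):
    """Continuous mu-law expansion of y in [-1, 1]."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def reconstruction_levels(bits, mu=255.0):
    """Non-negative reconstruction levels of a symmetric companded quantizer."""
    top = 2 ** (bits - 1) - 1
    return [mu_law_expand(q / top, mu) for q in range(top + 1)]


def gaps(levels):
    """Distance between consecutive reconstruction levels."""
    return [b - a for a, b in zip(levels, levels[1:])]


g = gaps(reconstruction_levels(4))
print(g[0], g[-1])  # the gap next to zero is far smaller than the last gap
```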
- the first, second, and third coefficients may be respectively scaled down by a factor of 4, 2 and 2, which ensures a correct data range for the plurality of log companders 214.
- clipping is performed on DCT coefficient values with a magnitude greater than 1.
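- The scaling and clipping steps can be sketched as follows. The scale factors 4, 2 and 2 come from the text; leaving the remaining coefficients unscaled and the helper name are assumptions.

```python
SCALE_FACTORS = (4.0, 2.0, 2.0)  # applied to the first three coefficients


def scale_and_clip(coefficients):
    """Scale down the first three coefficients, then clip all to [-1, 1]."""
    out = []
    for k, c in enumerate(coefficients):
        if k < len(SCALE_FACTORS):
            c = c / SCALE_FACTORS[k]
        out.append(max(-1.0, min(1.0, c)))
    return out


print(scale_and_clip([2.0, -1.0, 0.5, 1.5, -3.0]))
```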
- the decoder 220 reverses the companding and DCT transform performed to compress the signal.
- the received signal is unpacketized by unpacketizer 222
- the first three coefficients are scaled up by 4, 2 and 2, respectively, and inverse log companding is performed in inverse companders 224.
- Inverse DCT transform is performed in Inverse Transform Block 226 to obtain a reconstructed time-frequency signal.
- Referring to FIGs. 4A and 4B, flow diagrams of functions performed in accordance with aspects disclosed herein are shown. Examples of functions performed in the encoder are shown in an encoding process 400A in FIG. 4A.
- a transform is performed in step 420 to achieve time-frequency decomposition of the signal.
- Log companding with different companding parameters, such as μ constants, is performed in step 430, and a compressed data signal is outputted in step 440.
- Upon receiving a compressed data signal in step 450, inverse log companding is performed in step 460. Inverse transform is performed in step 470, and the data signal is output in step 480.
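- The encoding and decoding flows can be sketched end to end for a single 8-sample frame. This is a minimal illustration under several assumptions that are not taken from the disclosure: an orthonormal DCT-II, the continuous μ-law curve, a symmetric uniform quantizer, and hypothetical per-coefficient μ constants and bit widths. The decoder here simply undoes the encoder steps in reverse order.

```python
import math

N = 8
SCALES = (4.0, 2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0)           # first three from the text
MUS = (255.0, 255.0, 127.0, 127.0, 63.0, 63.0, 31.0, 31.0)  # hypothetical
BITS = (8, 7, 6, 5, 4, 4, 3, 3)                              # hypothetical


def _basis(k, i):
    """Orthonormal DCT-II basis function k evaluated at sample i."""
    scale = math.sqrt((1.0 if k == 0 else 2.0) / N)
    return scale * math.cos(math.pi * (i + 0.5) * k / N)


def encode(frame):
    """Transform, scale, clip, compand, and quantize one frame."""
    codes = []
    for k in range(N):
        c = sum(frame[i] * _basis(k, i) for i in range(N)) / SCALES[k]
        c = max(-1.0, min(1.0, c))
        y = math.copysign(math.log1p(MUS[k] * abs(c)) / math.log1p(MUS[k]), c)
        levels = 2 ** (BITS[k] - 1) - 1
        codes.append(round(y * levels))
    return codes


def decode(codes):
    """Dequantize, expand, rescale, and inverse transform one frame."""
    coeffs = []
    for k in range(N):
        levels = 2 ** (BITS[k] - 1) - 1
        y = codes[k] / levels
        c = math.copysign(math.expm1(abs(y) * math.log1p(MUS[k])) / MUS[k], y)
        coeffs.append(c * SCALES[k])
    return [sum(coeffs[k] * _basis(k, i) for k in range(N)) for i in range(N)]


frame = [0.25] * N
codes = encode(frame)
print(codes)
print(decode(codes))
```

With these hypothetical bit widths a frame occupies 40 bits, or 5 bits per sample on average.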
- Referring to FIG. 5, illustrated is a system 500 that facilitates speech/audio signal processing in a wireless network, in accordance with various aspects.
- System 500 may include an encoder 510 and a decoder 540, for example.
- Encoder 510 can reside at least partially within a base station, for example. It is to be appreciated that system 500 is represented as including functional blocks, which can be functional blocks that represent functions implemented by a processor, software, or combination thereof (e.g., firmware). Encoder 510 includes a logical grouping of electrical components 520, 530 that can act in conjunction. Decoder 540 also includes a logical grouping of electrical components 550, 560 that can act in conjunction.
- logical grouping 520, 530 can include means for performing transform on a received speech/audio signal 520, which functions to perform time-frequency decomposition of the speech/audio signal into multiple frequency bands. Further, logical grouping 520, 530 can comprise means for performing log companding 530, which functions to compress the signal by applying different compression ratios on each spectral coefficient for each frequency band. Additionally, logical grouping 520, 530 can include a memory (not shown) that retains instructions for executing functions associated with electrical components 520, 530.
- logical grouping 550, 560 can include means for performing inverse log companding 550, which functions to decode the signal by applying the inverse compression ratios, and means for inverse transform 560, which functions as a time-frequency reconstruction circuit to invert the time-frequency decomposition of the signal.
- FIG. 6 is an illustration of a receiver 600 that facilitates improved wireless audio/speech decoding.
- Receiver 600 receives a signal from, for instance, a receive antenna (not shown), performs typical actions on the received signal (e.g., filtering, amplifying, downconverting, etc.), and digitizes the conditioned signal to obtain samples.
- Receiver 602 can comprise a demodulator 604 that can demodulate received symbols and provide them to a processor 606 for channel estimation.
- Processor 606 can be a processor dedicated to analyzing information received by receiver 600, a processor that controls one or more components of receiver 600, and/or a processor that both analyzes information received by receiver 600 and controls one or more components of receiver 600.
- Receiver 600 can additionally comprise memory 608 that is operatively coupled to processor 606 and that may store data to be transmitted, received data, information related to available channels, data associated with analyzed signal and/or interference strength, information related to an assigned channel, power, rate, or the like, and any other suitable information for estimating a channel and communicating via the channel.
- Memory 608 can additionally store protocols and/or algorithms associated with estimating and/or utilizing a channel (e.g., performance based, capacity based, etc.). Additionally, the memory 608 may store executable code and/or instructions. For example, the memory 608 may store instructions for decompressing a received speech/audio signal. Further, the memory 608 may store instructions for performing inverse log companding to decode the signal by applying inverse encoding ratios, and for performing inverse transform to inverse the time-frequency decomposition of the signal.
- nonvolatile memory can include read only memory (ROM), programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable PROM (EEPROM), or flash memory.
- Volatile memory can include random access memory (RAM), which acts as external cache memory.
- RAM is available in many forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM).
- the memory 608 of the subject systems and methods is intended to comprise, without being limited to, these and any other suitable types of memory.
- Processor 606 is further operatively coupled to a decoder 610, in which an inverse log companding block 612 may perform inverse log companding to decode the signal by applying inverse compression ratios, and an inverse transform block 618 (e.g., a time-frequency reconstruction circuit) may perform inverse transform to invert the time-frequency decomposition of the signal.
- the inverse log companding block 612 and/or inverse transform block 618 may include aspects as described above with reference to FIGs. 2-5 to obtain a time-frequency reconstructed signal.
- inverse log companding block 612 and/or inverse transform block 618 may be part of processor 606 or a number of processors (not shown).
- An output block 620 provides the output from the processor 606.
- FIG. 7 is an illustration of an example transmitter system 700 that facilitates speech/audio signal compression, in accordance with aspects disclosed herein.
- System 700 comprises a transmitter 724 that transmits to the one or more mobile devices (not shown) through a plurality of transmit antennas (not shown).
- Input into the transmitter may be analyzed by a processor 714 that can be similar to the processor described above with regard to FIG. 6, and which is coupled to a memory 716 that stores information related to data to be transmitted to or received from mobile device(s) (not shown) or a disparate base station (not shown), and/or any other suitable information related to performing the various actions and functions set forth herein.
- Processor 714 is further coupled to an encoder 718, in which a transform block 720 can perform a transform of the signal, and a log companding block 722 can perform log companding to encode the signal by applying a different compression ratio on each spectral coefficient for each frequency band.
- the transform block 720 and/or log companding block 722 may include aspects as described above with reference to FIGs. 2-5.
- Information to be transmitted may be provided to a modulator 726.
- Modulator 726 can multiplex the information for transmission by a transmitter 724 through antenna (not shown) to mobile device(s) (not shown).
- the transform block 720 and/or log companding block 722 may be part of processor 714 or a number of processors (not shown).
- FIG. 8 illustrates an encoding apparatus 800 for encoding a data signal for a wireless communication device having various modules operable to encode the data signal using time-frequency decomposition and log companding.
- a data signal receiver 802 is used for receiving a data signal.
- a time-frequency decomposer 804 is configured to perform a time-frequency decomposition of the data signal to provide at least two spectral coefficients.
- a log compander 806 is configured to perform log companding of the at least two spectral coefficients to provide a compressed data signal.
- FIG. 9 illustrates a decoding apparatus 900 for decoding a data signal for a wireless communication device having various modules operable to decode the data signal using inverse log companding and inverse time-frequency decomposition.
- a compressed signal receiver 902 is used for receiving a compressed signal.
- An inverse log compander 904 is configured to perform inverse log companding by decoding the compressed data signal to obtain at least two spectral coefficients.
- a time-frequency decomposer 906 is configured to perform inverse time-frequency decomposition on the at least two spectral coefficients to provide a data signal.
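The encode path of FIG. 8 and the decode path of FIG. 9 can be sketched as a round trip. This is a sketch under stated assumptions rather than the disclosed implementation: a plain DFT stands in for the time-frequency decomposition, a mu-law curve with `MU = 255` stands in for the log companding, the spectral peak is assumed to be sent as side information, and quantization is omitted so the round trip is exact up to floating-point error. All function names are hypothetical.

```python
import cmath
import math

MU = 255.0  # assumed companding parameter

def dft(x):
    """Naive forward DFT, used here as the time-frequency decomposition."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Naive inverse DFT returning the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]

def compand(c, peak):
    """Log-compand the magnitude of a coefficient and keep its phase."""
    g = math.log1p(MU * abs(c) / peak) / math.log1p(MU)
    return cmath.rect(g, cmath.phase(c))

def expand(c, peak):
    """Inverse log companding: undo compand() for the same peak."""
    mag = math.expm1(abs(c) * math.log1p(MU)) / MU
    return cmath.rect(mag * peak, cmath.phase(c))

def encode(signal):
    spec = dft(signal)
    peak = max(abs(c) for c in spec) or 1.0   # avoid dividing by zero
    return [compand(c, peak) for c in spec], peak

def decode(compressed, peak):
    return idft([expand(c, peak) for c in compressed])

# Hypothetical 16-sample test signal with two sinusoidal components.
signal = [math.sin(2 * math.pi * 3 * t / 16) + 0.1 * math.sin(2 * math.pi * 5 * t / 16)
          for t in range(16)]
compressed, peak = encode(signal)
restored = decode(compressed, peak)
max_err = max(abs(a - b) for a, b in zip(signal, restored))
```

Because no quantizer sits between `compand` and `expand`, this sketch only demonstrates that the two stages invert each other; any bit-rate saving would come from quantizing the companded coefficients.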
- a CDMA system may implement a radio technology such as Universal Terrestrial Radio Access (UTRA), cdma2000, etc.
- UTRA includes Wideband-CDMA (W-CDMA) and other variants of CDMA.
- cdma2000 covers IS-2000, IS-95 and IS-856 standards.
- GSM Global System for Mobile Communications
- An OFDMA system may implement a radio technology such as Evolved UTRA (E-UTRA), Ultra Mobile Broadband (UMB), IEEE 802.11 (Wi-Fi), IEEE 802.16 (WiMAX), IEEE 802.20, Flash-OFDM, etc.
- UTRA and E-UTRA are part of Universal Mobile Telecommunication System (UMTS).
- 3GPP Long Term Evolution (LTE) is a release of UMTS that uses E-UTRA, which employs OFDMA on the downlink and SC-FDMA on the uplink.
- UTRA, E-UTRA, UMTS, LTE and GSM are described in documents from an organization named "3rd Generation Partnership Project" (3GPP).
- wireless communication systems may additionally include peer-to-peer (e.g., mobile-to-mobile) ad hoc network systems often using unpaired unlicensed spectrums, 802.xx wireless LAN, BLUETOOTH and any other short- or long-range wireless communication techniques.
- the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored or transmitted as one or more instructions or code on a computer-readable medium.
- Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another.
- a storage medium may be any available media that can be accessed by a computer.
- such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
- any connection may be termed a computer-readable medium.
- a computer-readable medium includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs usually reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
- an apparatus may be represented as a series of interrelated functional blocks that may represent functions implemented by, for example, one or more integrated circuits (e.g., an ASIC) or may be implemented in some other manner as taught herein.
- an integrated circuit may include a processor, software, other components, or some combination thereof.
- Such an apparatus may include one or more modules that may perform one or more of the functions described above with regard to various figures.
- these components may be implemented via appropriate processor components. These processor components may in some aspects be implemented, at least in part, using structure as taught herein. In some aspects a processor may be adapted to implement a portion or all of the functionality of one or more of these components.
- an apparatus may comprise one or more integrated circuits.
- a single integrated circuit may implement the functionality of one or more of the illustrated components, while in other aspects more than one integrated circuit may implement the functionality of one or more of the illustrated components.
- the components and functions described herein may be implemented using any suitable means. Such means also may be implemented, at least in part, using corresponding structure as taught herein. For example, the components described above may be implemented in an "ASIC" and also may correspond to similarly designated “means for” functionality. Thus, in some aspects one or more of such means may be implemented using one or more of processor components, integrated circuits, or other suitable structure as taught herein.
- any reference to an element herein using a designation such as “first,” “second,” and so forth does not generally limit the quantity or order of those elements. Rather, these designations may be used herein as a convenient method of distinguishing between two or more elements or instances of an element. Thus, a reference to first and second elements does not mean that only two elements may be employed there or that the first element must precede the second element in some manner. Also, unless stated otherwise, a set of elements may comprise one or more elements.
- terminology of the form “at least one of: A, B, or C” used in the description or the claims means “A or B or C or any combination thereof.”
- any of the various illustrative logical blocks, modules, processors, means, circuits, and algorithm steps described in connection with the aspects disclosed herein may be implemented as electronic hardware (e.g., a digital implementation, an analog implementation, or a combination of the two, which may be designed using source coding or some other technique), various forms of program or design code incorporating instructions (which may be referred to herein, for convenience, as "software" or a "software module"), or combinations of both.
- various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the aspects disclosed herein.
- the various illustrative logical blocks, modules, and circuits described in connection with the aspects disclosed herein may be implemented within or performed by an integrated circuit ("IC"), an access terminal, or an access point.
- the IC may comprise a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, electrical components, optical components, mechanical components, or any combination thereof designed to perform the functions described herein, and may execute codes or instructions that reside within the IC, outside of the IC, or both.
- a general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine.
- a processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
- a software module (e.g., including executable instructions and related data) and other data may reside in a data memory such as RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of computer-readable storage medium known in the art.
- a sample storage medium may be coupled to a machine such as, for example, a computer/processor (which may be referred to herein, for convenience, as a "processor") such that the processor can read information (e.g., code) from and write information to the storage medium.
- a sample storage medium may be integral to the processor.
- the processor and the storage medium may reside in an ASIC.
- the ASIC may reside in user equipment.
- any suitable computer-program product may comprise a computer-readable medium comprising codes (e.g., executable by at least one computer) relating to one or more of the aspects disclosed herein.
- a computer program product may comprise packaging materials.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Signal Processing (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Nonlinear Science (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Telephone Function (AREA)
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2011529264A JP2012504373A (en) | 2008-09-26 | 2009-09-25 | Method and apparatus for signal processing using transform domain log companding |
KR1020117009533A KR101278880B1 (en) | 2008-09-26 | 2009-09-25 | Method and apparatus for signal processing using transform-domain log-companding |
CN2009801377943A CN102165699A (en) | 2008-09-26 | 2009-09-25 | Method and apparatus for signal processing using transform-domain log-companding |
EP09793011A EP2345166A1 (en) | 2008-09-26 | 2009-09-25 | Method and apparatus for signal processing using transform-domain log-companding |
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10064508P | 2008-09-26 | 2008-09-26 | |
US61/100,645 | 2008-09-26 | ||
US10107008P | 2008-09-29 | 2008-09-29 | |
US61/101,070 | 2008-09-29 | ||
US12/428,336 | 2009-04-22 | ||
US12/428,336 US20100106269A1 (en) | 2008-09-26 | 2009-04-22 | Method and apparatus for signal processing using transform-domain log-companding |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2010036897A1 true WO2010036897A1 (en) | 2010-04-01 |
Family
ID=41667444
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2009/058387 WO2010036897A1 (en) | 2008-09-26 | 2009-09-25 | Method and apparatus for signal processing using transform-domain log-companding |
Country Status (7)
Country | Link |
---|---|
US (1) | US20100106269A1 (en) |
EP (1) | EP2345166A1 (en) |
JP (2) | JP2012504373A (en) |
KR (1) | KR101278880B1 (en) |
CN (1) | CN102165699A (en) |
TW (1) | TW201019315A (en) |
WO (1) | WO2010036897A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103532936A (en) * | 2013-09-28 | 2014-01-22 | 福州瑞芯微电子有限公司 | Bluetooth audio self-adaption transmission method |
US9626521B2 (en) | 2014-04-16 | 2017-04-18 | Arizona Board Of Regents On Behalf Of Arizona State University | Physiological signal-based encryption and EHR management |
US9642543B2 (en) | 2013-05-23 | 2017-05-09 | Arizona Board Of Regents | Systems and methods for model-based non-contact physiological data acquisition |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2965687A1 (en) * | 2010-09-30 | 2012-04-06 | France Telecom | NOISE LIMITATION FOR TRANSMISSION IN A MULTI-CHANNEL CHANNEL |
US9767823B2 (en) | 2011-02-07 | 2017-09-19 | Qualcomm Incorporated | Devices for encoding and detecting a watermarked signal |
US9767822B2 (en) | 2011-02-07 | 2017-09-19 | Qualcomm Incorporated | Devices for encoding and decoding a watermarked signal |
US9177570B2 (en) * | 2011-04-15 | 2015-11-03 | St-Ericsson Sa | Time scaling of audio frames to adapt audio processing to communications network timing |
US9077183B2 (en) | 2011-09-06 | 2015-07-07 | Portland State University | Distributed low-power wireless monitoring |
CN103974268B (en) * | 2013-01-29 | 2017-09-29 | 上海携昌电子科技有限公司 | The adjustable low delay sensor network data transmission method of fine granulation |
WO2016041204A1 (en) * | 2014-09-19 | 2016-03-24 | Telefonaktiebolaget L M Ericsson (Publ) | Methods for compressing and decompressing iq data, and associated devices |
US10542961B2 (en) | 2015-06-15 | 2020-01-28 | The Research Foundation For The State University Of New York | System and method for infrasonic cardiac monitoring |
WO2017080835A1 (en) * | 2015-11-10 | 2017-05-18 | Dolby International Ab | Signal-dependent companding system and method to reduce quantization noise |
JP6652096B2 (en) * | 2017-03-22 | 2020-02-19 | ヤマハ株式会社 | Sound system and headphone device |
US10373630B2 (en) * | 2017-03-31 | 2019-08-06 | Intel Corporation | Systems and methods for energy efficient and low power distributed automatic speech recognition on wearable devices |
CN110035299B (en) * | 2019-04-18 | 2021-02-05 | 雷欧尼斯(北京)信息技术有限公司 | Compression transmission method and system for immersive object audio |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5222189A (en) * | 1989-01-27 | 1993-06-22 | Dolby Laboratories Licensing Corporation | Low time-delay transform coder, decoder, and encoder/decoder for high-quality audio |
US6363338B1 (en) * | 1999-04-12 | 2002-03-26 | Dolby Laboratories Licensing Corporation | Quantization in perceptual audio coders with compensation for synthesis filter noise spreading |
US6377916B1 (en) * | 1999-11-29 | 2002-04-23 | Digital Voice Systems, Inc. | Multiband harmonic transform coder |
US6675125B2 (en) * | 1999-11-29 | 2004-01-06 | Syfx | Statistics generator system and method |
US6681207B2 (en) * | 2001-01-12 | 2004-01-20 | Qualcomm Incorporated | System and method for lossy compression of voice recognition models |
US20030135374A1 (en) * | 2002-01-16 | 2003-07-17 | Hardwick John C. | Speech synthesizer |
US7225135B2 (en) * | 2002-04-05 | 2007-05-29 | Lectrosonics, Inc. | Signal-predictive audio transmission system |
US7043423B2 (en) * | 2002-07-16 | 2006-05-09 | Dolby Laboratories Licensing Corporation | Low bit-rate audio coding systems and methods that use expanding quantizers with arithmetic coding |
JP4296957B2 (en) * | 2004-02-18 | 2009-07-15 | トヨタ自動車株式会社 | Control device for continuously variable transmission for vehicle |
US20070094035A1 (en) * | 2005-10-21 | 2007-04-26 | Nokia Corporation | Audio coding |
JP5236005B2 (en) * | 2008-10-10 | 2013-07-17 | 日本電信電話株式会社 | Encoding method, encoding apparatus, decoding method, decoding apparatus, program, and recording medium |
Filing Date | Country | Application Number | Publication | Status |
---|---|---|---|---|
2009-04-22 | US | US12/428,336 | US20100106269A1 | Abandoned (not active) |
2009-09-25 | WO | PCT/US2009/058387 | WO2010036897A1 | Application Filing (active) |
2009-09-25 | EP | EP09793011A | EP2345166A1 | Withdrawn (not active) |
2009-09-25 | KR | KR1020117009533A | KR101278880B1 | IP Right Grant (active) |
2009-09-25 | JP | JP2011529264A | JP2012504373A | Withdrawn (not active) |
2009-09-25 | CN | CN2009801377943A | CN102165699A | Pending (active) |
2009-09-28 | TW | TW098132773A | TW201019315A | Unknown |
2012-12-07 | JP | JP2012268459A | JP2013081229A | Pending (active) |
Non-Patent Citations (2)
Title |
---|
MILNER B ET AL: "Low bit-rate feature vector compression using transform coding and non-uniform bit allocation", PROCEEDINGS OF INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP'03) 6-10 APRIL 2003 HONG KONG, CHINA; [IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP)], 2003 IEEE INTERNATIONAL CONFERENCE, vol. 2, 6 April 2003 (2003-04-06), pages 129 - 132, XP010640898, ISBN: 978-0-7803-7663-2 * |
See also references of EP2345166A1 * |
Also Published As
Publication number | Publication date |
---|---|
EP2345166A1 (en) | 2011-07-20 |
JP2013081229A (en) | 2013-05-02 |
CN102165699A (en) | 2011-08-24 |
TW201019315A (en) | 2010-05-16 |
JP2012504373A (en) | 2012-02-16 |
KR101278880B1 (en) | 2013-06-26 |
KR20110074887A (en) | 2011-07-04 |
US20100106269A1 (en) | 2010-04-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20100106269A1 (en) | Method and apparatus for signal processing using transform-domain log-companding | |
US10878827B2 (en) | Energy lossless-encoding method and apparatus, audio encoding method and apparatus, energy lossless-decoding method and apparatus, and audio decoding method and apparatus | |
CN103069484B (en) | Time/frequency two dimension post-processing | |
CN108831501B (en) | High frequency encoding/decoding method and apparatus for bandwidth extension | |
US6029126A (en) | Scalable audio coder and decoder | |
RU2585990C2 (en) | Device and method for encoding by huffman method | |
EP1080542B1 (en) | System and method for masking quantization noise of audio signals | |
JP4859670B2 (en) | Speech coding apparatus and speech coding method | |
US8190440B2 (en) | Sub-band codec with native voice activity detection | |
ES2687249T3 (en) | Non-sound / sound decision for voice processing | |
ZA200606713B (en) | Classification of audio signals | |
US8027242B2 (en) | Signal coding and decoding based on spectral dynamics | |
EP3614384B1 (en) | Method for estimating noise in an audio signal, noise estimator, audio encoder, audio decoder, and system for transmitting audio signals | |
JP2006018023A (en) | Audio signal coding device, and coding program | |
US8781822B2 (en) | Audio and speech processing with optimal bit-allocation for constant bit rate applications | |
Talbi et al. | New Speech Compression Technique based on Filter Bank Design and Psychoacoustic Model | |
TWI602173B (en) | Audio processing method and non-transitory computer readable medium | |
Radha et al. | Comparative analysis of compression techniques for Tamil speech datasets | |
CN116137151A (en) | System and method for providing high quality audio communication in low code rate network connection |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | WWE | Wipo information: entry into national phase | Ref document number: 200980137794.3; Country of ref document: CN |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 09793011; Country of ref document: EP; Kind code of ref document: A1 |
| | DPE1 | Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101) | |
| | WWE | Wipo information: entry into national phase | Ref document number: 1848/CHENP/2011; Country of ref document: IN |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWE | Wipo information: entry into national phase | Ref document number: 2011529264; Country of ref document: JP |
| | ENP | Entry into the national phase | Ref document number: 20117009533; Country of ref document: KR; Kind code of ref document: A |
| | WWE | Wipo information: entry into national phase | Ref document number: 2009793011; Country of ref document: EP |