GB2349054A - Digital audio signal encoders - Google Patents

Digital audio signal encoders

Info

Publication number
GB2349054A
Authority
GB
United Kingdom
Prior art keywords
allowed
distortion
signal
encoding
factor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB9908659A
Other versions
GB9908659D0 (en)
Inventor
Jeremy Bennett
Alberto Duenas
Stuart Mcdonald
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Synamedia Ltd
Ericsson Television AS
Original Assignee
NDS Ltd
Tandberg Television AS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NDS Ltd, Tandberg Television AS
Priority to GB9908659A
Publication of GB9908659D0
Publication of GB2349054A
Withdrawn

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

In encoding digital audio signals (for transmission) by a perceptual audio encoding method, involving implementing a psycho-acoustic model (PAM) on a frame of audio data which produces a set of "allowed" distortion for each listening band of the human ear, converting the frame of audio data from the time to the frequency domain and dividing it into the listening bands, and converting the frequency amplitude values into a set of bits, an improved algorithm (Fig.4, not shown) is used to obtain an optimal quantisation factor (QF) for implementing the latter task.

Description

IMPROVEMENTS IN OR RELATING TO ENCODING AND DECODING AUDIO SIGNALS
This invention relates to the encoding and decoding of audio signals, in particular digital audio signals.
Processing of audio signals in the digital domain is well known and for digital television sound signals is defined in the MPEG standard. It is inefficient to transmit an uncoded digital audio signal. Accordingly the audio signal is generally encoded prior to transmission. Figure 1 shows the main encoding tasks for perceptual audio encoding. These are: implementing a psychoacoustic model (PAM) on a frame of audio data which produces a set of allowed distortion for each listening band of the human ear; converting the frame of audio data from the time to frequency domain and dividing it into the listening bands; and converting the frequency amplitude values into a set of bits.
The standard technique for implementing the third task is quantisation, in which the large amplitude values are divided by a large real value called the quantisation factor (QF) to produce a set of small integers. The amplitudes are then reconstructed during decoding by multiplying the integer values by the QF. The distortion produced by the process must be less than that allowed by the PAM, which will be called "allowed". Hence, the important process for the quantisation task is the search for a value of the QF which produces distortion less than or equal to "allowed". As the purpose of encoding is to reduce the amount of storage required, the value must also minimise the number of bits required to represent the set of integers. A brute-force search can be used when there are no restrictions placed on the search time. However, in the context of real-time implementation there is insufficient time to operate such a search and a non-optimal but faster method must be used.
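The quantisation step described above can be illustrated with a minimal Python sketch. It assumes a plain uniform quantiser that rounds to the nearest integer and measures distortion as the sum of squared reconstruction errors; the function names and the sample band are hypothetical and are not taken from the patent or from the MPEG standard.

```python
def quantise(amplitudes, qf):
    """Divide each amplitude by the QF and round to the nearest integer."""
    return [round(a / qf) for a in amplitudes]


def reconstruct(integers, qf):
    """Decoder side: scale the integers back up by the same QF."""
    return [q * qf for q in integers]


def distortion(amplitudes, qf):
    """Energy of the error signal: sum of squared differences (assumed measure)."""
    recon = reconstruct(quantise(amplitudes, qf), qf)
    return sum((a - r) ** 2 for a, r in zip(amplitudes, recon))


# Hypothetical band of four frequency amplitudes.
band = [100.0, -40.0, 23.0, 7.0]
integers = quantise(band, 10.0)
error_energy = distortion(band, 10.0)
```

A larger QF gives smaller integers, and so fewer bits, at the cost of a larger error energy, which is the trade-off the search for the QF has to resolve.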
The method suggested in the MPEG-2 AAC algorithm is to use two iteration loops. For this algorithm, the number of bits to code a whole frame is fixed before the quantisation process. A base QF is chosen as the QF for all of the listening bands. The iteration loops are then as follows:
1. The inner loop increases the QF of each band until the number of bits required for the frame is not greater than the allowed number of bits.
2. The outer loop carries out the following steps:
a. Calling the inner loop;
b. Calculating the distortion in each band and checking whether this quantisation process produced the best results so far. If so, the results are saved;
c. If all of the bands have distortion which is too large, the iterations terminate and the best results are restored;
d. If all of the bands have distortion less than or equal to allowed, the iterations terminate and the best results are restored;
e. Decreasing the QF in each band with distortion greater than allowed;
f. Repeating the process.
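The two loops above can be sketched as follows. This is only an illustration under stated assumptions: a uniform rounding quantiser, a toy bit-cost model, a multiplicative QF step, an iteration cap, and "best results" taken to mean the fewest bands over their allowed distortion. None of these details come from the MPEG-2 AAC specification, and all names are hypothetical.

```python
def quantise(band, qf):
    return [round(a / qf) for a in band]


def band_distortion(band, qf):
    return sum((a - q * qf) ** 2 for a, q in zip(band, quantise(band, qf)))


def frame_bits(bands, qfs):
    # Toy cost model (assumption): zero costs nothing, any other value
    # costs its magnitude bits plus one sign bit.
    return sum(abs(q).bit_length() + 1
               for band, qf in zip(bands, qfs)
               for q in quantise(band, qf) if q != 0)


def two_loop_search(bands, allowed, bit_budget, base_qf=1.0, step=1.25, max_iters=32):
    qfs = [base_qf] * len(bands)
    best = None  # (number of bands over allowed, QFs)
    for _ in range(max_iters):
        # Inner loop: raise every QF until the frame fits the bit budget.
        while frame_bits(bands, qfs) > bit_budget:
            qfs = [qf * step for qf in qfs]
        # Outer loop, step b: measure distortion and keep the best result.
        over = [band_distortion(b, qf) > a for b, qf, a in zip(bands, qfs, allowed)]
        if best is None or sum(over) < best[0]:
            best = (sum(over), list(qfs))
        # Steps c and d: stop when every band fails or every band passes.
        if all(over) or not any(over):
            break
        # Step e: lower the QF only in the bands that exceed "allowed".
        qfs = [qf / step if o else qf for qf, o in zip(qfs, over)]
    return best[1]


# Hypothetical two-band frame.
bands = [[100.0, -40.0], [23.0, 7.0]]
result = two_loop_search(bands, allowed=[20.0, 5.0], bit_budget=16)
```

The iteration cap stands in for the restriction on the number of iterations that a real-time implementation needs, since the two loops can pull the QFs in opposite directions.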
The best results are stored because this method does not readily converge to a solution. Also, while this method does produce a good estimate for the QFs, it is not particularly fast and the number of iterations must be restricted for real-time implementations.
One object of the present invention is to provide a system which overcomes the disadvantages of the known systems.
According to one aspect of the present invention, there is provided a method of encoding a digital audio signal comprising one or more portions in which an allowed distortion level is known, the method comprising representing the or each portion of the signal as one of a predetermined range of factors, selecting the size of the range of factors to be reduced from the predetermined range; iterating within the reduced range to select the optimal factor for the or each portion, and encoding the signal in accordance with the optimal factor.
In this way each of the listening bands is searched separately and the quantisation factor for each is estimated independently, to decrease the time and resources required for the quantisation process. In addition, the solution is bracketed and as such a standard non-linear equation solver such as Newton-Raphson can be used, which offers many processing advantages. This method is both deterministic and ensures convergence to a solution.
According to a second aspect of the present invention there is provided apparatus for encoding a digital audio signal comprising one or more portions in which an allowed distortion level is known, the apparatus comprising means for representing the or each portion of the signal as one of a predetermined range of factors; a selector for selecting the size of the range of factors to be reduced from the predetermined range; a selector for searching within the reduced range to select the optimal factor for the or each portion, and an encoder for encoding the signal in accordance with the optimal factor.
Reference will now be made, by way of example, to the accompanying drawings in which:
Figure 1 is a prior art block diagram of a perceptual audio encoding process;
Figure 2 is a graph showing the relationship between scalefactor and number of bits;
Figure 3 is a graph showing the relationship between scalefactor and distortion; and
Figure 4 is a flow chart of the audio encoding process of the present invention.
The ear is a complex organ and has a number of known characteristics. These characteristics are used to generate a model of the ear. This model shows some frequency dependency; for example, if the ear has just experienced a loud noise, it may be some time before it can 'hear' a soft noise. This type of characteristic is exploited in the model used in the present invention.
The number of bits required to encode a band decreases monotonically with increasing scalefactor. This is shown pictorially in Figure 2, which shows the relationship between the value of the QF for a listening band and the number of bits required to code the set of integers for the MPEG-2 audio algorithm. The formula relating the scalefactor shown in Figure 2 to the actual QF shall be called the quantisation equation. Hence, any search for the optimal value of the QF should maximise the value of the QF while ensuring that the distortion is not greater than "allowed". Figure 3 shows the relationship between the value of the QF for the above band and the distortion produced by that value. The dotted horizontal line in Figure 3 represents the allowed distortion produced by the PAM. As can be seen, there is no clear relationship to utilise. To ensure convergence to a reasonable solution, two values for the QF must be chosen which are known to bracket the solution. In other words, quantisation using the chosen QFmin must produce distortion not greater than "allowed", while the distortion produced using the chosen QFmax must be greater than "allowed". Then a search for the solution can be replaced by solving the equation: distortion = "allowed". The solution must exist according to the Intermediate Value Theorem (IVT). A standard iterative technique for solving this problem is derived from the IVT and is called the Newton-Raphson method. Here, the estimated value for the next QF is:
$$QF_{new} = QF_{min} + \frac{(QF_{max} - QF_{min})\,(\text{allowed} - \text{distortion}_{min})}{\text{distortion}_{max} - \text{distortion}_{min}}$$
QFnew is used for the quantisation and if the distortion produced is greater than "allowed", QFmax is replaced by QFnew, else QFmin is replaced by QFnew.
This process is repeated until QFmin and QFmax converge and the solution has been found.
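A single estimation step and the bracket update can be sketched in Python as below. As written in the description, the estimate interpolates linearly between the two ends of the bracket, a form often called false position. The function names and the numbers in the example are hypothetical.

```python
def next_qf(qf_min, qf_max, dist_min, dist_max, allowed):
    """Estimate the QF at which the distortion would cross "allowed",
    assuming distortion varies linearly between the two bracket ends."""
    return qf_min + (qf_max - qf_min) * (allowed - dist_min) / (dist_max - dist_min)


def update_bracket(qf_min, qf_max, qf_new, dist_new, allowed):
    """Replace whichever end of the bracket lies on the same side as QFnew."""
    if dist_new > allowed:
        return qf_min, qf_new   # QFnew becomes the new QFmax
    return qf_new, qf_max       # QFnew becomes the new QFmin


# Hypothetical bracket: QFmin = 4 gives distortion 10, QFmax = 20 gives 90.
estimate = next_qf(4, 20, 10.0, 90.0, allowed=50.0)
```

Because the distortion at QFmax is above "allowed" and the distortion at QFmin is not, the denominator is always positive and the estimate always falls between the two ends of the bracket.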
The search time is proportional to the difference of the two initial estimates so the selection for the initial values is important. The maximum value for the QF is chosen as the maximum value for which the quantisation process will produce at least one non-zero integer value. If the QF is increased, then the band will be quantised as a set of zeros which is not valid. This value can be calculated by using the quantisation equation on the maximum amplitude in the band and making the QF the subject of the formula.
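As an illustration of making the QF the subject of the formula, the sketch below assumes a uniform quantiser that rounds to the nearest integer and integer QF values. Under that assumption the peak amplitude quantises to a non-zero integer only while peak / QF is greater than one half, so the largest usable integer QF is the largest integer below twice the peak. The real quantisation equation of the MPEG-2 audio algorithm is different, and the function names here are hypothetical.

```python
import math


def quantise(band, qf):
    return [round(a / qf) for a in band]


def max_useful_qf(band):
    """Largest integer QF that still leaves at least one non-zero integer."""
    peak = max(abs(a) for a in band)
    qf = math.ceil(2 * peak) - 1   # largest integer strictly below 2 * peak
    if qf < 1:
        raise ValueError("band is too quiet for any integer QF to keep a non-zero value")
    return qf


# Hypothetical band of four frequency amplitudes.
band = [100.0, -40.0, 23.0, 7.0]
qf_max = max_useful_qf(band)
```

Increasing the result by one quantises the whole band to zeros, which is the invalid case the description rules out.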
The minimum initial value can be determined by examining the process of quantisation. The energy in the error signal is calculated as the sum of the squares of the differences. The largest difference between the reconstructed value and the original is half of the QF. Hence, the most energy that the error signal can contain is the number of amplitudes in the band multiplied by the square of QF/2. Making the maximum energy equal to "allowed" and QF the subject of the formula:
$$N \left(\frac{QF}{2}\right)^{2} = \text{allowed} \quad\Rightarrow\quad QF = 2\sqrt{\frac{\text{allowed}}{N}}$$
where "allowed" is the distortion allowed by the PAM; and N is the number of amplitudes in the band.
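The bound can be checked numerically with a short Python sketch, assuming a uniform rounding quantiser and a sum-of-squares distortion measure. The function names, the band and the allowed value are hypothetical.

```python
import math


def distortion(band, qf):
    """Sum of squared reconstruction errors for a uniform rounding quantiser."""
    return sum((a - round(a / qf) * qf) ** 2 for a in band)


def qf_min_bound(allowed, n):
    """QF at which the worst-case error energy N * (QF / 2)**2 equals "allowed"."""
    return 2.0 * math.sqrt(allowed / n)


# Hypothetical band and allowed distortion.
band = [100.0, -40.0, 23.0, 7.0]
allowed = 18.0
qf_min = qf_min_bound(allowed, len(band))
actual = distortion(band, qf_min)
```

Since each individual error is at most QF/2, the measured distortion at this QF cannot exceed "allowed", which is what makes the value a safe lower end for the bracket.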
The process described above can be understood more clearly with reference to the flow chart of Figure 4.
QFmax is calculated and the distortion produced is determined (40). The level of the distortion is compared with the "allowed" level (42). If the distortion is less than "allowed", QFmax (44) is used. If the distortion is greater than "allowed" (48), then QFmin is calculated and the distortion produced is determined (50).
The level of distortion is compared with "allowed" (52). If the distortion is the same (54), QFmin is used (56). If the distortion is not the same (58), further processing occurs. QFnew is calculated from QFmin and QFmax and the distortion is determined (60). Then the distortion is compared to "allowed" (62).
If the distortion is the same (64), QFnew is used (66). If the distortion is more (68) or less (70), one of two iterative loops is started. If the distortion is more, QFmax is set to QFnew (72).
If QFmax = QFmin + 1 (74) (76), QFmin is used (78). If QFmax ≠ QFmin + 1 (80), a new QFnew is calculated (60) and the loop is repeated.
If the distortion is less than "allowed" (70), QFmin is set to QFnew (82). If QFmax = QFmin + 1 (84) (86), QFmin is used (88). If QFmax ≠ QFmin + 1 (90), the loop is repeated starting at (60).
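The decisions of the flow chart can be sketched for a single band as the Python function below. It is a sketch under several assumptions that are not in the description: QF values are integers, the quantiser is a uniform rounding quantiser, the amplitudes are integer-valued with at least one non-zero value, the first test accepts distortion equal to "allowed", and QFnew is rounded and kept strictly inside the bracket so that every pass narrows it. The name search_qf is hypothetical; the numbers in the comments refer to the steps of Figure 4.

```python
import math


def distortion(band, qf):
    """Sum of squared reconstruction errors for a uniform rounding quantiser."""
    return sum((a - round(a / qf) * qf) ** 2 for a in band)


def search_qf(band, allowed):
    # (40)-(44): start from the largest QF that keeps one non-zero integer.
    qf_max = math.ceil(2 * max(abs(a) for a in band)) - 1
    dist_max = distortion(band, qf_max)
    if dist_max <= allowed:
        return qf_max
    # (50)-(56): worst-case bound N * (QF / 2)**2 <= allowed gives QFmin.
    qf_min = max(1, math.floor(2 * math.sqrt(allowed / len(band))))
    dist_min = distortion(band, qf_min)
    if dist_min == allowed:
        return qf_min
    while qf_max > qf_min + 1:
        # (60): interpolate, then round and clamp strictly inside the bracket.
        est = qf_min + (qf_max - qf_min) * (allowed - dist_min) / (dist_max - dist_min)
        qf_new = min(max(round(est), qf_min + 1), qf_max - 1)
        dist_new = distortion(band, qf_new)
        if dist_new == allowed:        # (64)-(66)
            return qf_new
        if dist_new > allowed:         # (68), (72)
            qf_max, dist_max = qf_new, dist_new
        else:                          # (70), (82)
            qf_min, dist_min = qf_new, dist_new
    return qf_min                      # (78) or (88): bracket has closed


# Hypothetical band of integer amplitudes and allowed distortion.
band = [100, -40, 23, 7]
chosen = search_qf(band, 18.0)
```

The loop keeps the distortion at QFmin at or below "allowed" and the distortion at QFmax above it, so the value returned always satisfies the PAM limit. Because the distortion curve is not monotonic, the result is a crossing inside the bracket rather than a guaranteed global maximum of the QF.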
The method of the invention can be modified to deal with higher bitrates. The above technique does not necessarily use all the bits allowed for the quantisation, hence reducing the audio quality below that which is achievable for the bitrate. The method can be modified to use more bits if allowed. The number of bits required to encode the whole frame is calculated after each iteration of the QFnew estimator for all of the bands. The QF used for the quantisation process is taken as QFmin. If the number of bits required is less than that allowed for this frame, the quantisation process is terminated and the current results are used.

Claims (6)

  1. A method of encoding a digital audio signal comprising one or more portions in which an allowed distortion level is known, the method comprising representing the or each portion of the signal as one of a predetermined range of factors; selecting the size of the range of factors to be reduced from the predetermined range; searching within the reduced range to select the optimal factor for the or each portion; and encoding the signal in accordance with the optimal factor.
  2. The method of claim 1, wherein the selecting step comprises selecting a maximum and a minimum factor which bracket the optimal factor.
  3. The method of claim 2, wherein the selecting step comprises selecting different maximum and minimum factors for different portions of the signal.
  4. The method of claim 2 or claim 3, wherein the selecting step comprises selecting the maximum factor to produce distortion greater than the allowed distortion level and selecting the minimum factor to produce distortion less than the allowed distortion level.
  5. The method of any preceding claim, wherein the representing step comprises representing the or each portion of the signal as one of a predetermined range of quantisation factors.
  6. Apparatus for encoding a digital audio signal comprising one or more portions in which an allowed distortion level is known, the apparatus comprising means for representing the or each portion of the signal as one of a predetermined range of factors; a selector for selecting the size of the range of factors to be reduced from the predetermined range; a selector for searching within the reduced range to select the optimal factor for the or each portion; and an encoder for encoding the signal in accordance with the optimal factor.
GB9908659A 1999-04-16 1999-04-16 Digital audio signal encoders Withdrawn GB2349054A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB9908659A GB2349054A (en) 1999-04-16 1999-04-16 Digital audio signal encoders

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB9908659A GB2349054A (en) 1999-04-16 1999-04-16 Digital audio signal encoders

Publications (2)

Publication Number Publication Date
GB9908659D0 GB9908659D0 (en) 1999-06-09
GB2349054A true GB2349054A (en) 2000-10-18

Family

ID=10851609

Family Applications (1)

Application Number Title Priority Date Filing Date
GB9908659A Withdrawn GB2349054A (en) 1999-04-16 1999-04-16 Digital audio signal encoders

Country Status (1)

Country Link
GB (1) GB2349054A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0473367A1 (en) * 1990-08-24 1992-03-04 Sony Corporation Digital signal encoders
EP0559348A2 (en) * 1992-03-02 1993-09-08 AT&T Corp. Rate control loop processor for perceptual encoder/decoder
US5301255A (en) * 1990-11-09 1994-04-05 Matsushita Electric Industrial Co., Ltd. Audio signal subband encoder
EP0612159A2 (en) * 1993-02-19 1994-08-24 Matsushita Electric Industrial Co., Ltd. An enhancement method for a coarse quantizer in the ATRAC
US5414795A (en) * 1991-03-29 1995-05-09 Sony Corporation High efficiency digital data encoding and decoding apparatus
WO1996035269A1 (en) * 1995-05-03 1996-11-07 Sony Corporation Non-linearly quantizing an information signal

Also Published As

Publication number Publication date
GB9908659D0 (en) 1999-06-09

Similar Documents

Publication Publication Date Title
CN101939782B (en) Adaptive transition frequency between noise fill and bandwidth extension
JP3881943B2 (en) Acoustic encoding apparatus and acoustic encoding method
US10121480B2 (en) Method and apparatus for encoding audio data
JP3343965B2 (en) Voice encoding method and decoding method
JP3071795B2 (en) Subband coding method and apparatus
AU2003299395B2 (en) Method for encoding and decoding audio at a variable rate
JP3881946B2 (en) Acoustic encoding apparatus and acoustic encoding method
JPH05304479A (en) High efficient encoder of audio signal
JP2005189886A (en) Method for improving coding efficiency of audio signal
JP2007525716A (en) Apparatus and method for determining step size of quantizer
US20080164942A1 (en) Audio data processing apparatus, terminal, and method of audio data processing
JP2003501925A (en) Comfort noise generation method and apparatus using parametric noise model statistics
JPH0713600A (en) Vocoder ane method for encoding of drive synchronizing time
CN114550732B (en) Coding and decoding method and related device for high-frequency audio signal
JP2007504503A (en) Low bit rate audio encoding
US9287895B2 (en) Method and decoder for reconstructing a source signal
JP2001102930A (en) Method and device for correcting quantization error, and method and device for decoding audio information
EP1385150A1 (en) Method and system for parametric characterization of transient audio signals
JP3472279B2 (en) Speech coding parameter coding method and apparatus
JP3751001B2 (en) Audio signal reproducing method and reproducing apparatus
JP2006011170A (en) Signal-coding device and method, and signal-decoding device and method
US10734005B2 (en) Method of encoding, method of decoding, encoder, and decoder of an audio signal using transformation of frequencies of sinusoids
JP2004302259A (en) Hierarchical encoding method and hierarchical decoding method for sound signal
GB2349054A (en) Digital audio signal encoders
JP4125520B2 (en) Decoding method for transform-coded data and decoding device for transform-coded data

Legal Events

Date Code Title Description
COOA Change in applicant's name or ownership of the application
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)