WO1996010870A1 - Psychoacoustic audio-signal coding system and method - Google Patents

Psychoacoustic audio-signal coding system and method

Publication number
WO1996010870A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
ratio
blocks
mask
mask ratio
Prior art date
Application number
PCT/EP1995/003866
Other languages
French (fr)
Inventor
Giorgio Parladori
Gain Antonio Mian
Renato Andreola
Original Assignee
Alcatel Italia S.P.A.
Alcatel N.V.
Priority date
Filing date
Publication date
Application filed by Alcatel Italia S.P.A., Alcatel N.V. filed Critical Alcatel Italia S.P.A.
Priority to AU38027/95A priority Critical patent/AU3802795A/en
Publication of WO1996010870A1 publication Critical patent/WO1996010870A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B1/00 Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission
    • H04B1/66 Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission for reducing bandwidth of signals; for improving efficiency of transmission
    • H04B1/665 Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission for reducing bandwidth of signals; for improving efficiency of transmission using psychoacoustic properties of the ear, e.g. masking effect

Abstract

The present invention relates to a psychoacoustic method and system for encoding audio signals that exploit the perceptual characteristics of the human auditory system. In particular, the system is based on a method of analysis that highlights the characteristics of the audio signal that are perceptually significant. According to one aspect of the invention, this information can be used to compress the information contained in the signal so that it can be transmitted efficiently.

Description

Psychoacoustic Audio-Signal Coding System and Method
The present invention relates to a psychoacoustic method and system for audio-signal coding that exploit the perceptual characteristics of the human auditory system.
The mechanisms by which humans perceive sound stimuli are known in the literature; see, e.g., the book by E. Zwicker and R. Feldtkeller entitled "Das Ohr als Nachrichtenempfaenger" (Hirzel, Stuttgart, 1967).
The discipline that studies these phenomena is called psychoacoustics.
In general, audio-signal encoders use the results of psychoacoustic analysis to determine which characteristics of the signal must be kept unchanged during coding in order to maintain perceptual transparency, that is, the same subjective quality of the signal.
The analysis defines the maximum noise spectrum (noise mask) that the coding may introduce and below which the system is perceptually transparent; in other words, it defines the maximum amount of noise that the ear cannot perceive.
A vector of signal-to-mask ratios, one for each of the frequency intervals into which the audible spectrum has been subdivided, is then calculated. In the psychoacoustic model described above, starting from a frequency analysis of the signal to be coded, one can define a signal-to-mask ratio that exploits the concealment (masking) phenomenon produced by a single frequency or, more generally, by a narrow frequency band over the entire spectrum of the signal.
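As a minimal sketch of the vector of signal-to-mask ratios mentioned above, the following Python fragment computes one ratio per frequency band, expressed in dB. The function name, the input lists and the numeric values are assumptions made for illustration only; the text does not specify how the per-band signal power or the noise mask is obtained.

```python
import math

def signal_to_mask_ratios(signal_power, mask_power):
    """Return the per-band signal-to-mask ratio in dB.

    signal_power[k] is the signal power in band k and mask_power[k] is the
    maximum inaudible noise power (noise mask) in the same band. Both are
    hypothetical inputs in linear power units.
    """
    if len(signal_power) != len(mask_power):
        raise ValueError("one mask value is needed per band")
    return [10.0 * math.log10(s / m) for s, m in zip(signal_power, mask_power)]

# Three illustrative bands (values chosen only for the example).
smr = signal_to_mask_ratios([1.0, 0.5, 0.01], [0.01, 0.05, 0.02])
```

A band whose ratio is negative, such as the third band here, carries a signal that lies below its own noise mask.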
In the field of audio coding, psychoacoustic analysis has proved very effective in obtaining high compression ratios. It is an object of the present invention to improve on the known art by obtaining a higher compression ratio. This object is achieved through a method as set forth in claim 1 and through a system as set forth in claim 6.
Further advantageous aspects of the present invention are set forth in the dependent claims. The present invention exploits the time-concealment phenomenon to obtain a psychoacoustic analysis that is better matched to the mechanisms of auditory perception.
According to one aspect of the invention, it is possible to calculate concealment curves that allow higher signal compression while maintaining the same subjective quality of the reconstructed signal. The invention provides a method usable in any coding context in which a good compression ratio is desired; moreover, it can be used in compliance with ISO standard No. 11172-3, in which the use of the signal-to-mask ratio is recommended. It is therefore possible to use the improvement provided by the invention while still observing the constraints described in the standard.
The invention will now be described in greater detail with reference to the attached drawings, wherein:
- Fig. 1 is a block diagram of a generic psychoacoustic coding system,
- Fig. 2 represents an implementation of the subsystem which carries out the modification of the signal-to-mask ratio.
Fig. 1 shows a generic encoder 1 that receives a signal from an input line and, from a psychoacoustic analyzer 2, information on how to encode the signal.
This schematic is equivalent to the structure used in ISO standard No. 11172-3 and also serves as the basis for the invention described hereinafter.
Time concealment is an alteration in the ability to detect a noise accompanying a tone when the tone power is varied. With reference to ISO standard 11172-3, the signal-to-mask ratio SM and the procedure for calculating it are considered. In the standard, the values of SM are calculated on the basis of frequency concealment.
The basic idea of time concealment is to increase the SM values in those situations where the listener's sensitivity to noise is greater and to decrease them when the sensitivity falls. In order to quantify the effects of the system introduced in the present invention, it is useful to introduce the concept of perceptual entropy. The perceptual entropy is the minimum information rate (bit/s) that must be transmitted to ensure that the received signal is perceptually indistinguishable from the original one. Introducing time concealment in the coding according to the invention does not necessarily lower the average perceptual entropy, but the bits are certainly distributed differently among the sub-bands. On the basis of experimental observations (see, e.g., James O. Pickles, "An Introduction to the Physiology of Hearing", London, Academic Press, 1982, pages 10-99), it was possible to establish that the nerve fibers leaving the hair cells respond to a tone with a stepped envelope with a peak of activity having an amplitude about twice the stationary activity (peaks/s) and a duration of about 20 ms.
Doubling the number of peaks per second conveys more information to the brain about the onset of the tone than is conveyed afterwards in an equal amount of time. It has further been found that the ratio between the onset peak and the stationary average value of peaks per second is not independent of the amplitude of the signal: as the amplitude increases, the peak increases more than the stationary value does. Moreover, the description of a second tone reaching the brain is much less detailed if it is preceded by a strong tone. This phenomenon is called postconcealment. For postconcealment, the insensitivity to noise is greatest in the vicinity of the fall in tone power, and the power of the concealed noise decreases with a time constant of about 20 ms independently of the tone frequency.
In order to exploit the phenomena described above, according to one aspect of the invention the schematic shown in Fig. 2 is used. A signal corresponding to the instantaneous power of the signal to be encoded is applied to terminal 10. Terminal 10 is connected to block 12, which calculates and updates the average power, and also to block 13, which carries out the product of the instantaneous power and the average power. Block 13 is connected to block 14, which calculates the result of the function f() applied to the argument output by block 13. Block 15 multiplies the result of block 14 by the signal-to-mask ratio received at terminal 11. The modified signal-to-mask ratio is then obtained at terminal 16.
Consider a generic subdivision of the signal to be coded into time blocks: the variation of power in band k and in block n is indicated by the ratio
$$\frac{P_k(n)}{\bar{P}_k(n)}$$
of the instantaneous power $P_k(n)$ in band k and block n to the average power $\bar{P}_k(n)$ in the same block. The average power is evaluated by applying a suitable function. According to one aspect of the present invention, a first-order low-pass filter is used, with a time constant equal to the time constant with which time-concealment phenomena evolve. The equation for updating the average power is:
$$\bar{P}_k(n) = \rho\,\bar{P}_k(n-1) + (1-\rho)\,P_k(n-1)$$
with the coefficient $\rho$ set by the ratio $T/\tau$, where $\tau$ is the time constant of the moving-average filter and $T$ is the power updating time.
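The average-power recursion above can be sketched in Python as follows. The mapping from the updating time T and the time constant τ to the filter coefficient is an assumption: the sketch uses the common first-order low-pass choice exp(-T/τ), which the text does not confirm. The function names, the 8 ms updating time and the zero initial average are likewise hypothetical; only the 20 ms time constant is taken from the description.

```python
import math

def smoothing_coefficient(update_time_s, time_constant_s):
    """Assumed mapping rho = exp(-T / tau) for a first-order low-pass filter."""
    return math.exp(-update_time_s / time_constant_s)

def update_average_power(prev_average, prev_instantaneous, rho):
    """One step of the recursion avg(n) = rho * avg(n-1) + (1 - rho) * p(n-1)."""
    return rho * prev_average + (1.0 - rho) * prev_instantaneous

def average_power_track(instantaneous, rho, initial_average=0.0):
    """Average power for each block n of one band, given p(0), p(1), ..."""
    averages = [initial_average]
    for n in range(1, len(instantaneous)):
        averages.append(
            update_average_power(averages[-1], instantaneous[n - 1], rho)
        )
    return averages

# Hypothetical setting: blocks updated every 8 ms, 20 ms time constant.
rho = smoothing_coefficient(0.008, 0.020)
track = average_power_track([1.0, 1.0, 1.0, 1.0], rho)
```

For a unit step in instantaneous power and a zero initial average, the tracked average after n blocks equals 1 - rho**n, so it approaches the stationary power at the rate set by the time constant.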
Clearly, the parameters are to be tuned in order to maximize performance. In order to conveniently exploit the analysis carried out according to one aspect of the invention, an increasing function f() is applied to the ratio between the powers, and the value
$$f\!\left(\frac{P_k(n)}{\bar{P}_k(n)}\right)$$
is used for modifying the signal-to-mask ratios of each block n in accordance with the following formula:
$$SM' = f\!\left(\frac{P_k(n)}{\bar{P}_k(n)}\right) \cdot SM$$
According to one embodiment of the invention, an increasing function of the type
[Figure imgf000009_0001: expression of the function f()]
is used, leading to a new definition of SM.
Parameter "a" controls the adaptation rate of function f() and is to be determined so as to maximize performance.
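The modification of the signal-to-mask ratio can be illustrated with the following minimal sketch. The expression of f() used in the embodiment is not reproduced in the text, so the power law below is only a stand-in: it is increasing and takes a parameter named a, but it should not be read as the function of the invention. The function names, the default a = 0.5, the division guard and the sample values are all assumptions.

```python
def power_law(ratio, a=0.5):
    """Hypothetical increasing function f(x) = x**a, with f(1) = 1."""
    return ratio ** a

def modify_smr(smr, instantaneous_power, average_power, f=power_law, floor=1e-12):
    """Scale each block's signal-to-mask ratio by f(P(n) / average P(n))."""
    modified = []
    for sm, p, p_avg in zip(smr, instantaneous_power, average_power):
        ratio = p / max(p_avg, floor)  # guard against division by zero
        modified.append(f(ratio) * sm)
    return modified

# One band, three blocks: steady power, a sudden rise, a sudden fall.
smr = [20.0, 20.0, 20.0]
p = [1.0, 4.0, 0.25]
p_avg = [1.0, 1.0, 1.0]
new_smr = modify_smr(smr, p, p_avg)
```

With this stand-in, a block whose power rises above the running average has its ratio increased, and a block whose power falls below the average has it decreased, which matches the qualitative behaviour described for time concealment.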

Claims

CLAIMS
1. Psychoacoustic method of coding an audio signal, including the steps of:
- subdividing said audio signal into frequency bands;
- subdividing said audio signal of each band into time blocks;
- determining for each block a signal-to-mask ratio;
- coding said blocks in accordance with said signal-to-mask ratio;
characterized in that, prior to encoding said blocks, said signal-to-mask ratio is modified by using the time-concealment phenomenon.
2. Method according to claim 1, characterized in that said modification of said signal-to-mask ratio is carried out by multiplying said signal-to-mask ratio by a function of the power in said blocks.
3. Method according to claim 2, characterized in that the argument of said function is given by the ratio of the instantaneous power to the average power in said block.
4. Method according to claim 3, characterized in that said average power is evaluated through the formula:
$$\bar{P}_k(n) = \rho\,\bar{P}_k(n-1) + (1-\rho)\,P_k(n-1)$$
5. Method according to claim 2, characterized in that said function is an increasing function.
6. Psychoacoustic system for coding an audio signal, designed to implement the method of claim 1, including:
- means for subdividing said audio signal into frequency bands;
- means for subdividing said audio signal of each band into time blocks;
- means for determining a signal-to-mask ratio for each block;
- means for coding said blocks according to said signal-to-mask ratio;
characterized by further comprising means for modifying said signal-to-mask ratio, prior to encoding said blocks, by using the time-concealment phenomenon.
7. System according to claim 6, characterized in that said means for modifying said signal-to-mask ratio comprise:
- means for evaluating the average power;
- means for computing the ratio of said average power to the instantaneous power;
- means for calculating a function having said ratio as a variable;
- means for multiplying said function with said signal-to-mask ratio.
PCT/EP1995/003866 1994-10-04 1995-09-29 Psychoacoustic audio-signal coding system and method WO1996010870A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU38027/95A AU3802795A (en) 1994-10-04 1995-09-29 Psychoacoustic audio-signal coding system and method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
ITMI94A002015 1994-10-04
ITMI942015A IT1271240B (en) 1994-10-04 1994-10-04 AUDIO SIGNAL CODING METHOD AND SYSTEM

Publications (1)

Publication Number Publication Date
WO1996010870A1 true WO1996010870A1 (en) 1996-04-11

Family

ID=11369642

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP1995/003866 WO1996010870A1 (en) 1994-10-04 1995-09-29 Psychoacoustic audio-signal coding system and method

Country Status (3)

Country Link
AU (1) AU3802795A (en)
IT (1) IT1271240B (en)
WO (1) WO1996010870A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0458645A2 (en) * 1990-05-25 1991-11-27 Sony Corporation Subband digital signal encoding apparatus
EP0610007A2 (en) * 1993-02-02 1994-08-10 Sony Corporation High efficiency encoding and decoding

Also Published As

Publication number Publication date
ITMI942015A0 (en) 1994-10-04
IT1271240B (en) 1997-05-27
AU3802795A (en) 1996-04-26
ITMI942015A1 (en) 1996-04-04

Similar Documents

Publication Publication Date Title
JP3297051B2 (en) Apparatus and method for adaptive bit allocation encoding
US8913754B2 (en) System for dynamic spectral correction of audio signals to compensate for ambient noise
KR100299528B1 (en) Apparatus and method for encoding / decoding audio signal using intensity-stereo process and prediction process
RU2381571C2 (en) Synthesisation of monophonic sound signal based on encoded multichannel sound signal
US8805696B2 (en) Quality improvement techniques in an audio encoder
JP5539203B2 (en) Improved transform coding of speech and audio signals
DE69633633T2 (en) Multi-channel predictive subband coder with adaptive psychoacoustic bit allocation
KR100551862B1 (en) Enhancing the performance of coding systems that use high frequency reconstruction methods
KR100242864B1 (en) Digital signal coder and the method
JP3343962B2 (en) High efficiency coding method and apparatus
EP3598442B1 (en) Systems and methods for modifying an audio signal using custom psychoacoustic models
HU213963B (en) High-activity coder and decoder for digital data
EP0740428A1 (en) Tonality for perceptual audio compression based on loudness uncertainty
DE69932861T2 (en) METHOD FOR CODING AN AUDIO SIGNAL WITH A QUALITY VALUE FOR BIT ASSIGNMENT
CA2118916C (en) Process for reducing data in the transmission and/or storage of digital signals from several dependent channels
JPH1028057A (en) Audio decoder and audio encoding/decoding system
EP0525774B1 (en) Digital audio signal coding system and method therefor
JPH0653911A (en) Method and device for encoding voice data
Davidson et al. Parametric bit allocation in a perceptual audio coder
KR20020044416A (en) Personal wireless communication apparatus and method having a hearing compensation facility
WO1996010870A1 (en) Psychoacoustic audio-signal coding system and method
Davidson Digital audio coding: Dolby AC-3
JP2000148161A (en) Method and device for automatically controlling sound quality and volume
JP3134363B2 (en) Quantization method
Lanciani Auditory perception and the MPEG audio standard

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU CA CN FI JP KR MX NZ US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: CA

122 Ep: pct application non-entry in european phase