WO2002003758A1 - Method and apparatus for perceptual evaluation of audio products

Info

Publication number: WO2002003758A1
Application number: PCT/SG2000/000099
Authority: WO
Inventors: Ming Zhang
Original assignee: Nanyang Technological University
Priority date: 2000-07-05
Filing date: 2000-07-05
Publication date: 2002-01-10
Also published as: AU2000267448A1

Abstract

An improved method and apparatus employs psychoacoustic modeling to evaluate the distortions for audio products, such as multimedia sound cards, CD players, power amplifiers, loudspeakers, etc. The characteristic of this method and apparatus is to use psychoacoustic modeling to conform the measurements of distortion to human auditory perception. The proposed method and apparatus is to measure audio distortion for the audio products, not only for audio CODEC systems.

Description

METHOD AND APPARATUS FOR PERCEPTUAL EVALUATION OF AUDIO PRODUCTS

FIELD OF THE INVENTION

The present invention relates to a method and apparatus for evaluating distortions in an audio product based on a psychoacoustic model, more particularly to an efficient method and apparatus consistent with actual human auditory perception.

BACKGROUND OF THE INVENTION

An audio test and measurement device is usually used to evaluate the performance of audio products. The usual performance parameters, such as "Total Harmonic Distortion (THD)", "Inter-Modulation Distortion (IMD)" and "Signal to Noise Ratio (SNR)", are physical characteristics of the products, but are typically not consistent with actual human auditory perception. As a result, it often happens that a listener judges a sound produced by an audio product having a greater THD or IMD (or a lower SNR) to be less distorted than one having a lower THD or IMD (or a greater SNR). Thus the traditional methods of measuring audio products will unavoidably lead to results that do not conform to human auditory perception, and the figures so obtained will fail to represent the quality of a sound as a listener perceives it.

Some papers and various techniques for evaluating audio distortions based on human perception have been proposed. One of them is U.S. Pat. No. 4,706,290, which comprises a primary and a secondary network for the measurement of loudspeaker subharmonics so that the results obtained will approximate human auditory perception. However, as this apparatus measures weighted harmonic distortions in the time domain, the results do not best reflect how the human auditory faculty functions. Further, the apparatus has to employ various analog circuitry, making it rather difficult to adjust the circuit parameters precisely to a desired level. U.S. Pat. No. 5,402,495 and U.S. Pat. No. 5,563,953 provide improved apparatuses and methods in which an MPEG psychoacoustic model is employed. However, these apparatuses and methods are better suited to measuring distortions of audio CODEC systems. The perceptual evaluation of the quality of audio CODEC systems can also be found in the literature, e.g., B. Paillard, et al., "PERCEVAL: Perceptual evaluation of the quality of audio signals," J. Audio Eng. Soc., Vol. 40, No. 1/2, pp. 21-31, 1992; J.G. Beerends and J.A. Stemerdink, "A perceptual audio quality measure based on a psychoacoustic sound representation," J. Audio Eng. Soc., Vol. 40, No. 12, pp. 978, 1992; and T. Sporer and U. Gbur, "Evaluating a measurement system," J. Audio Eng. Soc., Vol. 43, No. 5, pp. 353-363, 1995.

However, in practice, there are some differences between measuring audio CODEC systems and measuring audio products (e.g., multimedia sound cards, power amplifiers, or the like). These differences include: a) for audio CODEC systems, the inputs and outputs are in general intended to be identical, whereas for an audio product the inputs and outputs differ somewhat (in most cases the output signals are basically amplified input signals); b) for audio CODEC systems, the sampling rate can only be selected from a set of fixed sampling rates, e.g., 32 kHz, 44.1 kHz, and 48 kHz for MPEG, AC-3, etc., whereas for audio product measurements the sampling rate depends on the data acquisition device; c) in general, an audio CODEC measurement considers digital input and output signals (e.g., PCM), and the number of bits is selected from a fixed set (e.g., 16, 20, or 24 bits), whereas for audio product measurements the input and output signals are analogue and the number of bits is determined by the data acquisition device.

The above-described differences affect the measurement method and apparatus as follows: a) the direct difference between the input and output audio signals is used as the error signal in U.S. Pat. No. 5,402,495 and U.S. Pat. No. 5,563,953; however, this approach cannot be used in audio product measurement; b) psychoacoustic models for some CODEC standards, e.g., MPEG, AC-3, etc., cannot be used directly in audio product measurements; an interpolating method should be used to obtain a psychoacoustic model at any other sampling rate; c) a voltage value of the output signals of audio products must be mapped to a sound pressure level (SPL) value, measured in dB, by establishing a relationship between the two. Accordingly, a basic objective of the present invention is to improve the known measurements of distortion for audio products generally, not only for audio CODEC systems.

SUMMARY OF THE INVENTION

It is, therefore, an object of the invention to provide an improved method and apparatus for efficiently measuring the distortions of audio products, such as multimedia sound cards, power amplifiers, CD players, and loudspeakers, and not only of audio CODEC systems.

It is another object of the present invention to provide a method and apparatus for evaluating an audio distortion in audio products based on psychoacoustic models in order to make the measuring results conform to human auditory perception.

It is a further object of the present invention to provide a method and apparatus for interpolating the psychoacoustic models for audio product measurements.

It is still yet another object of the present invention to provide a method and apparatus for transforming voltage values of the output signals of audio products to sound pressure level (SPL) values, measured in dB, for audio product measurements.

An apparatus for estimating the audio distortion of an audio product comprises: a data acquisition device for receiving an output signal from the audio product, an audio signal generator for generating a test signal for the input of the audio product, which audio signal generator can be integrated into said data acquisition device, a power spectrum estimator for estimating values of a power spectrum of the digital output signal, a transformation unit for transforming the values of the power spectrum to sound pressure level values, an interpolation unit for interpolating a standard absolute hearing threshold based on the sampling rate of the data acquisition device to generate an interpolated hearing threshold, an excite curve determination unit for determining an excite curve based on a psychoacoustic model using the sound pressure level values corresponding to predetermined frequencies k of an input signal x(n), which is input to the audio product, a masking curve determination unit for determining a masking curve using the interpolated hearing threshold and the excite curve, and a perceptual distortion estimator for determining the audio distortion using the masking curve.

A method for estimating the audio distortion of an audio product comprises the following steps: generating a single digital sine-wave signal or a signal combining multiple sine-waves and passing it through a D/A converter to act as a test signal, inputting the test signal to the input of the audio product, receiving the output signal from the audio product, converting the output signal from the audio product to a digital output signal of the audio product using a data acquisition device, determining power spectrum values of the digital output signal of the audio product, transforming the power spectrum values to sound pressure level values, interpolating a standard absolute hearing threshold based on the sampling rate of the data acquisition device used, thereby generating an interpolated hearing threshold, determining an excite curve based on a psychoacoustic model using the sound pressure level values corresponding to frequencies k of an input signal x(n), which is input to the audio product, determining a masking curve using the interpolated hearing threshold and the excite curve, and estimating the audio distortion using the masking curve.
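The first step of the method, generating the digital test signal, can be illustrated with a minimal Python sketch. The function names, the 48 kHz sampling rate, and the list BAND_START_HZ (lower edges of a few Zwicker critical bands, chosen here only for illustration) are assumptions and are not specified in the text.

```python
import math

def sine_wave(freq_hz, sample_rate_hz, num_samples, amplitude=1.0):
    """Return num_samples of a digital sine wave at freq_hz."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(num_samples)]

def multi_sine(freqs_hz, sample_rate_hz, num_samples):
    """Sum equal-amplitude sine waves, scaled so the peak cannot exceed 1.0."""
    scale = 1.0 / len(freqs_hz)
    tones = [sine_wave(f, sample_rate_hz, num_samples, scale) for f in freqs_hz]
    return [sum(samples) for samples in zip(*tones)]

# Hypothetical choice of tone frequencies (Hz): lower edges of a few
# critical bands. The text does not list the frequencies it uses.
BAND_START_HZ = [100.0, 510.0, 1080.0, 2000.0, 3700.0]

single = sine_wave(1000.0, 48000.0, 480)            # single-tone test signal
combined = multi_sine(BAND_START_HZ, 48000.0, 480)  # multi-tone test signal
```

In a real measurement these samples would be sent to the D/A converter of the data acquisition device; that hardware step is outside the scope of this sketch.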

Using this apparatus or method, a more efficient measuring of the distortions of audio products in general, not only suitable for CODEC systems, is provided. Preferred embodiments of this invention are claimed in the dependent claims.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other objects, features, and advantages of the present invention are better understood by reading the following detailed description of the invention, taken in conjunction with the accompanying drawings. In the drawings, it should be noted that like symbols or reference numerals represent like elements, wherein:

FIG. 1 is a block diagram of an audio product measurement system that implements the preferred embodiment of the invention.

FIG. 2 is a block diagram illustrating one implementation of a power spectrum estimator.

FIG. 3 is a block diagram illustrating one implementation of the computation of a masking curve.

DETAILED DESCRIPTION OF THE INVENTION

In the preferred embodiment of the present invention, an improved method and apparatus of measuring audio products leads to results conforming to human auditory perception.

FIG. 1 is a block diagram of the preferred embodiment of the invention. An audio signal generator 5-1 produces audio test signals u(n), which are usually digital signals and are referred to below as "first digital signals". In this invention, the audio test signals u(n) generated by the audio signal generator 5-1 have two formats: a single sine wave of varied frequency, and a signal combining multiple sine waves whose frequencies are the starting frequencies of different Bark bands. The audio test signals u(n) are provided to a data acquisition device (DAD) 4-1, where they are converted into first analog signals u(t) and then output to the input 2-1 of an audio product under test (APUT) 1-1. An output 3-1 of the APUT 1-1 is connected back to the DAD 4-1, and second analog signals x(t), which the APUT 1-1 delivers to the output 3-1, are converted into second digital signals x(n) within the DAD 4-1. A power spectrum estimator 6-1 then computes the spectrum power from the second digital signals x(n). The spectrum power needs to be transformed to sound pressure levels (SPL), expressed in dB, by a power-to-SPL transformer 7-1. The transformation coefficient used in this transformation is as follows:

R_Tr = 10^{9.6} / P_out (1)

where R_Tr is the transformation coefficient and P_out is the maximum output power of the audio product under test (APUT) 8-1. Then the transformation can be described as:

Y(f) (dB) = 10 log₁₀(R_Tr |X(f)|²),

Y(k) (dB) = 10 log₁₀(R_Tr |X(k)|²), k ≠ f (2)

wherein Y(f) is the value of the sound pressure level (SPL) at a frequency f. The sound pressure level (SPL) values Y(k) corresponding to frequencies k of the second digital signals x(n) are used by a computing masking circuit 11-1 to compute a masking curve for the whole audio frequency band, i.e., 20 Hz to 20 kHz. The measuring results can be generated by a perceptual distance estimator 13-1, using the formula

T_mea = ∑_{f ≠ k} max[0, Y(f) − M(f)] (3)

where T_mea is the measuring result 14-1 and M(f) is the value of the masking curve at frequency f.
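Equations (1) to (3) can be worked through in a short Python sketch. It reads the constant in equation (1) as 10^9.6, so that the maximum output power maps to 96 dB; that reading, the function names, and the small floor that avoids log(0) are assumptions made for illustration.

```python
import math

def transformation_coefficient(p_out):
    """Equation (1): R_Tr = 10**9.6 / P_out (constant assumed, see lead-in)."""
    return 10.0 ** 9.6 / p_out

def power_to_spl_db(power, r_tr):
    """Equation (2): Y = 10*log10(R_Tr * |X|^2).

    The 1e-30 floor is an added safeguard against log10(0).
    """
    return 10.0 * math.log10(max(r_tr * power, 1e-30))

def perceptual_distortion(spl_db, masking_db, test_bins):
    """Equation (3): sum of max(0, Y(f) - M(f)) over bins f not in test_bins.

    test_bins holds the bin indices k occupied by the test signal itself,
    which are excluded from the sum.
    """
    return sum(max(0.0, y - m)
               for f, (y, m) in enumerate(zip(spl_db, masking_db))
               if f not in test_bins)
```

With this reading, a spectral component at the maximum output power P_out is assigned 96 dB SPL, and only the components that rise above the masking curve outside the test-tone bins contribute to T_mea.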

FIG. 2 illustrates one way to implement the power spectrum estimator 6-1 of FIG. 1. The second digital signal arriving from the audio product under test (APUT) 1-1 via the data acquisition device (DAD) 4-1, shown in FIG. 2 as 1-11, is windowed by a Hanning window circuit 1-12. The resulting windowed signal x'(n), denoted as 1-13 in FIG. 2, is then transformed to a frequency spectrum X(f), denoted as 1-15 in FIG. 2, by a Fast Fourier Transformation (FFT) 1-14. The spectrum power |X(f)|² can then be obtained using a spectrum power circuit 1-16.
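The chain of FIG. 2 (window, transform, squared magnitude) can be sketched in plain Python as follows. To stay self-contained, the sketch evaluates a direct discrete Fourier transform rather than an FFT; the two give the same spectrum, but the direct form is much slower and is used here only for illustration. The function names are assumptions.

```python
import cmath
import math

def hann_window(num_samples):
    """Symmetric Hann (Hanning) window; requires num_samples > 1."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (num_samples - 1))
            for n in range(num_samples)]

def power_spectrum(x):
    """Return |X(f)|^2 for bins 0..N/2 of the Hann-windowed signal x."""
    n_total = len(x)
    windowed = [w * s for w, s in zip(hann_window(n_total), x)]
    spectrum = []
    for k in range(n_total // 2 + 1):
        # Direct DFT of bin k (an FFT would compute the same value faster).
        acc = sum(v * cmath.exp(-2j * math.pi * k * n / n_total)
                  for n, v in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum
```

Bin k of the result corresponds to the frequency k·fs/N, where fs is the sampling rate of the data acquisition device and N is the number of samples.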

FIG. 3 illustrates one way to implement the computation of the masking curve 9-1. The sound pressure level (SPL) value Y(k) corresponding to frequencies k of the second digital signal x(n) is used to compute the excite curve Fd(n) 2-13 by means of an electronic circuit based on a standard psychoacoustic model. In general, standard absolute hearing threshold values H(f) 2-14 are only defined for the sampling rates 32 kHz, 44.1 kHz, and 48 kHz. However, the sampling rate of the data acquisition device (DAD) 4-1 may not be one of these sampling rates. Thus, an interpolator 2-17 is used to interpolate the hearing threshold based on the sampling rate 2-16 of the data acquisition device (DAD) in order to obtain a suitable hearing threshold H'(f) 2-18. The specific interpolating method used in this embodiment is linear interpolation. However, any suitable interpolation algorithm may be used, e.g., an interpolation of order n. The excite curve Fd(n) 2-13 and the interpolated hearing threshold H'(f) 2-18 are used to compute the masking curve M(f) 2-20 by a comparator 2-19. In this embodiment of the present invention, an MPEG psychoacoustic model is used for computing the excite curve and an absolute hearing threshold. For more detailed information about the MPEG psychoacoustic model, one can refer to "Coding of moving pictures and associated audio", ISO/IEC/JTC1/SC29/WG11 NO501 MPEG 93, July 1993, the content of which is herewith incorporated by reference.
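The interpolator and comparator of FIG. 3 can be sketched as follows. The sketch assumes that the standard threshold is available as a table of (frequency, dB) pairs sorted by frequency, that frequencies outside the table are clamped to the nearest end value, and that the comparator keeps the larger of the excite curve and the interpolated threshold at each frequency. None of these details is stated in the text, and the function names are hypothetical.

```python
import bisect

def bin_frequencies(sample_rate_hz, fft_size):
    """Centre frequency of each bin 0..N/2 for the DAD sampling rate."""
    return [k * sample_rate_hz / fft_size for k in range(fft_size // 2 + 1)]

def interpolate_threshold(std_freqs_hz, std_threshold_db, target_freqs_hz):
    """Linearly interpolate H(f) onto target_freqs_hz (ends are clamped)."""
    result = []
    for f in target_freqs_hz:
        if f <= std_freqs_hz[0]:
            result.append(std_threshold_db[0])
        elif f >= std_freqs_hz[-1]:
            result.append(std_threshold_db[-1])
        else:
            # Index of the table entry at or just below f.
            i = bisect.bisect_right(std_freqs_hz, f) - 1
            f0, f1 = std_freqs_hz[i], std_freqs_hz[i + 1]
            h0, h1 = std_threshold_db[i], std_threshold_db[i + 1]
            result.append(h0 + (f - f0) / (f1 - f0) * (h1 - h0))
    return result

def masking_curve(excite_db, threshold_db):
    """Comparator, assumed to keep the larger of the two curves per bin."""
    return [max(e, h) for e, h in zip(excite_db, threshold_db)]
```

A typical call would pass bin_frequencies(fs, N) as target_freqs_hz, so that H'(f) is defined on the same frequency grid as the power spectrum obtained at the sampling rate fs of the data acquisition device.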

It is to be understood that various modifications could be made to the illustrative embodiments provided herein without departing from the scope of the present invention. Accordingly, the invention is not limited except as by the appended claims.

Claims

What is claimed is:

1. An apparatus for estimating the audio distortion of an audio product, said apparatus comprising: a data acquisition device for receiving an output signal from the audio product, said data acquisition device being arranged to generate a digital output signal, a power spectrum estimator for estimating values of a power spectrum of the digital output signal, a transformation unit for transforming the values of the power spectrum to sound pressure level values, an interpolation unit for interpolating a standard absolute hearing threshold based on the sampling rate of the data acquisition device to generate an interpolated hearing threshold, an excite curve determination unit for determining an excite curve based on a psychoacoustic model using the sound pressure level values corresponding to predetermined frequencies k of an input signal x(n), which is input to the audio product, a masking curve determination unit for determining a masking curve using the interpolated hearing threshold and the excite curve, a perceptual distortion estimator for determining the audio distortion using the masking curve.

2. The apparatus for estimating the audio distortion of an audio product according to claim 1, wherein the psychoacoustic model is a standard psychoacoustic model.

3. The apparatus for estimating the audio distortion of an audio product according to claim 2, wherein the psychoacoustic model is an MPEG psychoacoustic model.

4. The apparatus for estimating the audio distortion of an audio product according to any one of the preceding claims, wherein the masking curve determination unit comprises a comparator for receiving the excite curve and the interpolated hearing threshold and generating the masking curve.

5. The apparatus for estimating the audio distortion of an audio product according to any one of the preceding claims, further comprising an audio signal generating unit for generating an audio signal, which is provided to the data acquisition device.

6. The apparatus for estimating the audio distortion of an audio product according to claim 5, wherein the data acquisition device is arranged in such a way that it converts the analog audio signal to a digital audio signal, which is provided to the audio product as a test signal.

7. A method for estimating the audio distortion of an audio product, said method comprising the following steps: generating a single digital sine-wave signal or a signal combined by multiple sine-waves as a test signal, inputting the test signal to the input of the audio product, receiving the output signal from the audio product, converting the output signal from the audio product to a digital output signal of the audio product using a data acquisition device, determining power spectrum values of the digital output signal of the audio product, transforming the power spectrum values to sound pressure level values, interpolating a standard absolute hearing threshold based on the sampling rate of a data acquisition device used, thereby generating an interpolated hearing threshold, determining an excite curve based on a psychoacoustic model using the sound pressure level values corresponding to frequencies k of an input signal, which is input to the audio product, determining a masking curve using the interpolated hearing threshold and the excite curve, estimating the audio distortion using the masking curve.