EP1468416A1 - Method for the qualitative evaluation of a digital audio signal - Google Patents

Method for the qualitative evaluation of a digital audio signal

Info

Publication number
EP1468416A1
EP1468416A1 EP03715043A EP03715043A EP1468416A1 EP 1468416 A1 EP1468416 A1 EP 1468416A1 EP 03715043 A EP03715043 A EP 03715043A EP 03715043 A EP03715043 A EP 03715043A EP 1468416 A1 EP1468416 A1 EP 1468416A1
Authority
EP
European Patent Office
Prior art keywords
audio signal
signal
evaluated
quality indicator
implements
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP03715043A
Other languages
English (en)
French (fr)
Other versions
EP1468416B1 (de
Inventor
Alexandre Joly
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Telediffusion de France ets Public de Diffusion
Original Assignee
Telediffusion de France ets Public de Diffusion
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telediffusion de France ets Public de Diffusion
Publication of EP1468416A1
Application granted
Publication of EP1468416B1
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/69 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for evaluating synthetic or decoded voice signals

Definitions

  • the subject of the present invention is a method for evaluating a digital audio signal, in particular a digitally transmitted signal and / or a digital signal to which digital coding has been applied, in particular with rate reduction and / or decoding.
  • a digitally transmitted signal can be a standalone audio signal (broadcasting) or an audio signal that accompanies a program such as an audiovisual program.
  • the first directly compares the original signal to the degraded signal (after coding, broadcasting, multiplexing, etc.)
  • the second compares only parameters extracted from two signals (called reduced reference).
  • the faults generated by the diffusion chain are detected using their main known characteristics.
  • This last class overcomes the constraints linked to the use of the reference signal. Indeed, in all other cases, the reference must be sent to the place of comparison and then perfectly synchronized with the degraded signal. This makes the system more complex and more expensive.
  • Degradations due to transmission errors significantly reduce the signal quality. They appear, for example, during the broadcast of an MPEG digital stream or during broadcasting, notably of radio, on the Internet.
  • the full-reference methods, in which the signal to be evaluated is compared to the reference signal, correspond to the conventional techniques used, for example, to estimate the quality of audio coders. Their general principle is based on the calculation, via a perceptual hearing model, of an internal representation of the original signal and of the degraded signal, then on a comparison of these two internal representations. Such a method is described in the article by John G. BEERENDS and Jan A. STEMERDINK entitled "A Perceptual Audio Quality Measure Based on a Psychoacoustic Sound Representation", published in the "Journal of the Audio Engineering Society", vol. 40, no. 12, December 1992, pages 963 to 978.
  • the OBQ (Output-Based Objective Speech Quality) measurement is the most advanced of the techniques without reference. This method of estimating the quality of a speech signal only, without a reference signal, is based on the calculation of perceptual parameters representing the content of the signal, gathered in a vector. These vectors, calculated on non-degraded signals, will constitute a reference base. The quality will be estimated by comparing the same parameters, extracted from the degraded signals, with the vectors of the reference base.
  • the main method using neural networks is the OSSQAR (Objective Scaling of Sound Quality And Reproduction) measurement. The general principle of this method is to use a hearing model in conjunction with a neural network.
  • the network is trained to predict the subjective quality of a signal from its perceptual representation calculated by the hearing model, to simulate the phenomena of psychoacoustics. It should be noted that the results obtained by these methods are much better when the signals are part of the learning base or at least when they have close characteristics.
  • Such methods are therefore not suitable for evaluating the quality of arbitrary signals, for example the audio signals of a radio or TV broadcast.
  • the present invention proposes a method in which the indicators are simpler, can be calculated in real time and in continuous time, and require a significantly lower bit rate. Since degradations may modify only a few samples while still degrading the quality significantly, the proposed method allows the entire audio stream to be analyzed.
  • the method according to the invention allows a reliable estimate of the quality of an audio signal having passed through a digital type transmission or coding. Indeed, the disturbances undergone by the transmission channels can induce the appearance of errors on the transmitted data; these errors result in degradations in the final audio signal.
  • the technological approach proposed consists in making one measurement on the audio signal at the input, and another at the output, of the chain or of any other system to be studied. A comparison between these measurements makes it possible to ensure the "transparency" of the transmission channel and to assess the extent of the degradations introduced. Whether or not it is used in conjunction with methods without reference, which detect the degradations from the signature of the most important defects to be sought, the proposed approach allows a reliable estimate of the degradations introduced. It also makes it possible to compensate for a lack of reference signal. This method makes it possible to reduce the reference bit rate necessary for estimating the quality in the case of measurements with reduced reference, and the number of parameters to be used in the case of measurements without reference.
  • the invention thus relates to a method for evaluating a digital audio signal, characterized in that it implements in real time and in continuous time, in successive time windows, the calculation of a quality indicator constituted, for each time window, of a vector whose size is advantageously at least one hundred times smaller than the number of audio samples of a time window.
  • This dimension is for example between 1 and 10 and preferably between 1 and 5.
  • the digital audio signal to be evaluated can be a signal which has been transmitted digitally and / or which has been subjected to digital coding, in particular with reduction in bit rate, from a digital reference signal.
  • the method is characterized in that the generation of a said quality indicator vector implements, for a reference audio signal and for the audio signal to be evaluated, the following steps: a) calculate the power spectral density of the audio signal for each time window and apply a filter representative of the attenuation of the outer and middle ear to obtain a filtered spectral density, b) calculate from this filtered spectral density individual excitations using the frequency spreading function in the basilar scale, c) determine from said individual excitations the compressed loudness using a function modeling the nonlinear frequency sensitivity of the ear, to obtain basilar components, d) separate the basilar components into classes, preferably three classes, and calculate for each class a number C representing the sum of the frequencies of this class, said vector consisting of said numbers C, e) calculate a distance between the vectors of the reference audio signal and of the audio signal to be evaluated associated with each time window in order to perform a so-called
  • the method is characterized in that the generation of a said quality indicator vector implements, for the reference audio signal and for the audio signal to be evaluated, the following steps: a) calculating N coefficients of a prediction filter by an autoregressive modeling. b) determining in each time window the maximum of the residue by difference between the predicted signal using the prediction filter and the audio signal, said maximum of the prediction residue constituting said quality indicator vector, c) calculating a distance between said vectors of the reference audio signal and of the audio signal to be evaluated associated with each time window in order to carry out a so-called evaluation of the degradation of the audio signal.
  • the method is characterized in that the generation of a said quality indicator vector implements, for the reference audio signal and for the audio signal to be evaluated, the following steps: a) calculate for each time window the power spectral density of the audio signal and apply to it a filter representative of the attenuation of the outer and middle ear, to obtain a filtered spectral density, b) calculate individual excitations from this filtered spectral density using the frequency spreading function in the basilar scale, c) obtain from said individual excitations the compressed loudness using a function modeling the nonlinear frequency sensitivity of the ear, to obtain basilar components, d) calculate from said basilar components N' prediction coefficients of a prediction filter by autoregressive modeling, e) generate for each time window a said quality indicator vector from only some of the N' prediction coefficients.
  • the quality indicator vector comprises between 5 and 10 of said prediction coefficients.
  • the method is characterized in that the generation of a said quality indicator vector implements, at least for the audio signal to be evaluated, the following steps: a) calculate a temporal activity of the signal in each time window, b) calculate a sliding average over N1 successive values of the temporal activity, c) keep the minimum value among M1 successive values of the sliding average.
  • the quality indicator vector can be constituted by said minimum value, or alternatively by a binary value resulting from the comparison of said minimum value with a given threshold.
  • the method can be characterized in that it implements the calculation of a quality score by determining a cumulative time interval during which said minimum value is less than a given threshold and / or by determining the number of times per second where said minimum value is less than a given threshold or else in that said minimum values are generated both for the reference audio signal and for the audio signal to be evaluated and in that a quality vector is generated by comparison between the corresponding minimum values of the reference audio signal and of the audio signal to be evaluated, for example by calculating the difference or the ratio between said minimum values.
  • the method is characterized in that the generation of a said quality indicator vector implements, at least for the audio signal to be evaluated, the following steps: a) calculate a temporal activity of the signal in each time window, b) calculate a sliding average over N2 successive values of the temporal activity, c) keep the maximum value among M2 successive values of the sliding average.
  • the quality indicator vector can be constituted by said maximum value or by a binary value resulting from the comparison of said maximum value with a given threshold.
  • the method can be characterized in that a degradation indicator is generated by comparison between the maximum value obtained on the reference audio signal and its corresponding maximum value obtained on the audio signal to be evaluated, for example by calculating the difference or the ratio between these maximum values.
  • the method is characterized in that the generation of a said quality indicator vector implements, at least for the audio signal to be evaluated, the calculation of the Fourier transform in successive blocks of N3 samples constituting said time windows, and the calculation of the minimum of the spectrum in M3 successive blocks, which constitutes a quality indicator vector.
  • the method can be characterized in that it includes a step of evaluating the introduction of noise into the audio signal to be evaluated by comparing the value of said minimum of the spectrum in M3 successive blocks associated with the audio signal to be evaluated with the maximum value of the M3 minima obtained in the same M3 successive blocks associated with the reference audio signal.
  • the method comprises a step of evaluating the introduction of noise into the audio signal to be evaluated by comparing the value of said minimum of the spectrum in M3 successive blocks with an average value of the spectrum minima obtained in blocks prior to the M3 successive blocks, for example by calculating the difference or the ratio between these values.
  • the method is characterized in that the generation of a said quality indicator vector implements, at least for the audio signal to be evaluated, the calculation of a spectrum flattening parameter, which is the ratio between an arithmetic mean and a geometric mean of the components of the signal spectrum.
  • the method can then be characterized in that it implements an indicator for detecting a degradation of the audio signal by the introduction of broadband noise by comparing said spectrum flattening parameter between the reference audio signal and the audio signal to be evaluated, for example by calculating the difference or the ratio between these two parameters.
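  • As an illustration of the spectrum flattening parameter and of its comparison between reference and evaluated signal, a minimal Python sketch follows. It is not taken from the patent: the function names, the small constant `eps` and the choice of a ratio for the comparison are assumptions made here.

```python
import math


def flattening(spectrum, eps=1e-12):
    """Ratio of the arithmetic mean to the geometric mean of the spectrum.

    `spectrum` holds non-negative power values. `eps` is an assumption of
    this sketch; it only keeps the logarithm finite for empty bins.
    """
    n = len(spectrum)
    arithmetic = sum(spectrum) / n
    geometric = math.exp(sum(math.log(p + eps) for p in spectrum) / n)
    return arithmetic / geometric


def flattening_change(reference_spectrum, degraded_spectrum):
    """Compare the parameter on both signals, here by a simple ratio."""
    return flattening(reference_spectrum) / flattening(degraded_spectrum)
```

  • With this definition a perfectly flat spectrum gives a value close to 1 and a peaky spectrum gives a large value, so a ratio well above 1 between reference and evaluated signal would suggest that broadband noise has been added.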
  • FIG. 1 is a flowchart illustrating a quality assessment with full reference.
  • FIG. 2 illustrates an audio transmission with loss of quality
  • FIGS. 3 to 10 illustrate evaluation methods according to the present invention
  • FIGS. 11 and 12 illustrate an audio quality evaluation system implementing the present invention.
  • the management and recovery of decoding errors is not standardized. The influence of these errors on the perceived quality therefore depends on the decoder used.
  • the audibility of these faults is also linked to the type of element affected in the frame, for example MPEG, and to its audio content.
  • the quality can be estimated in binary fashion: either the signal has not been degraded and the quality will depend on the initial coding used, or errors have been introduced and significant degradations appear.
  • the quality can then be estimated by methods without reference, by counting the degradations detected over regular time intervals, for example of the order of one second.
  • Subjective tests have in fact made it possible to obtain a reliable estimate of the perceived quality, from the number and the length of the interruptions linked to impulse-type degradations in a signal.
  • the proposed method makes it possible to reduce the bit rate required for transporting the reference. This allows the use of reserved channels with a relatively limited bit rate. These measurements make it possible to detect degradations other than those due to transmission errors.
  • the present invention allows a reduction in the bit rate in the case of measurements with reduced reference and, by adding simple measurements without reference, makes it possible to keep measurements of the significant degradations, for example if the reference is lost, by locally generating a vector which simply characterizes the degradations and which can therefore be easily processed and transmitted to a control installation, in particular a centralized one.
  • the measurements taken along the chain and at various points on the network inform the monitoring and management system for digital television broadcasting on its overall performance. Measurements of signal degradations inform the broadcasting operator about the quality of service delivered.
  • the process is characterized by two complementary operating modes: With reduced reference.
  • the technological approach proposed consists in making a measurement on the audio signal, at the input, and another at the output of the transmission chain or any other system to be studied (encoder, decoder, etc.). A comparison between these measures makes it possible to ensure the "transparency" of the chain or system and to assess the extent of the degradations introduced.
  • the method performs an evaluation in real time and in continuous time.
  • the reference measurements at the input of the chain represent a very small amount of data compared to the audio signal data, hence its classification as “reduced reference”.
  • the reference data or measurements used are both a reduced representation of the content of the signal and a measure of the importance of a type of degradation.
  • the invention makes it possible to compensate for a lack of reference signal.
  • the method defines measures for the characteristic digital faults to be sought.
  • the proposed approach allows an estimation of the degradations introduced on any signal, and in a reliable manner and this approach can be implemented both on the scale of a transmission network and locally on an equipment.
  • the computation complexity according to the method is low, and the indicator obtained represents a small quantity of data compared to the digital audio stream.
  • the method can be applied indifferently to purely digital signals or to signals having undergone after transmission a digital to analog conversion then analog to digital.
  • Perceptual modeling. The principle of objective perceptual measurements is based on transforming the physical representation (sound pressure, level, time and frequency) into the psychoacoustic representation (loudness, masking level, time and critical bands or Barks) of two signals (the reference signal and the signal to be evaluated) in order to compare them. This transformation relies on a model of the human auditory system (generally, a spectral analysis in the Bark domain followed by spreading phenomena). A distance can then be calculated between the psychoacoustic representations of the two signals, and this distance can be linked to the quality of the signal to be evaluated (the smaller the distance, the closer the signal to be evaluated is to the original signal and the better its quality).
  • the first process implements a parameter called "Perceptual Count Difference".
  • Windowing of the temporal signal in blocks then, for each of the blocks, calculation of the excitation induced by the signal using a hearing model.
  • This representation of the signals takes into account the phenomena of psychoacoustics, and provides a histogram whose counts are the values of the basilar components.
  • the attenuation filter for the outer and middle ear is applied to the power spectral density, obtained from the signal spectrum.
  • This filter also takes into account the absolute hearing threshold.
  • the notion of critical bands is modeled by a transformation of the frequency scale into a basilar scale.
  • the following stage corresponds to the calculation of the individual excitations to take account of the masking phenomena, thanks to the frequency spreading function in the basilar scale and to a nonlinear addition.
  • the last step makes it possible to obtain the compressed loudness, by a power function, to model the non-linear frequency sensitivity of the ear, by a histogram comprising the 109 basilar components.
  • the histogram counts obtained are then grouped into three classes. This vectorization makes it possible to obtain a visual representation of the evolution of the structure of the signals. It also provides a simple and concise characterization of the signal, and therefore a particularly useful reference parameter.
  • the second strategy takes into account the Beerends scaling zones.
  • gain compensation between the excitation of the reference signal and that of the signal to be tested is carried out by the ear, the fixed limits are then the following:
  • a point (X, Y) constituting a vector is therefore obtained for each time window of the signal, which corresponds to the transmission of two values per window of, for example, 1024 samples, or a bit rate of 3 kbits/s for an audio signal sampled at 48 kHz.
  • the associated representation is thus a trajectory parameterized by time, as shown in Figure 3.
  • the distance (Euclidean) between the reference signal and the degraded signal is then calculated.
  • the distance between the points makes it possible to estimate the extent of the degradations introduced between the reference signal and the degraded signal. This distance can be considered as a perceptual distance due to the use of models of psychoacoustics.
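  • The grouping of the basilar components into classes and the Euclidean distance between reference and degraded signal can be sketched as follows in Python. This is only an illustration under stated assumptions: the class limits are not given in this text, so equal-width contiguous classes are assumed, all three class sums are kept, and the function names are hypothetical. The hearing model that produces the basilar components is not reproduced.

```python
import math


def class_counts(basilar, n_classes=3):
    """Sum the basilar components (histogram counts) inside each class.

    Equal-width contiguous classes are an assumption of this sketch.
    """
    n = len(basilar)
    bounds = [round(i * n / n_classes) for i in range(n_classes + 1)]
    return [sum(basilar[lo:hi]) for lo, hi in zip(bounds, bounds[1:])]


def window_distance(reference_basilar, degraded_basilar):
    """Euclidean distance between the class vectors of one time window."""
    return math.dist(class_counts(reference_basilar),
                     class_counts(degraded_basilar))
```

  • Identical windows give a distance of 0; the larger the distance, the larger the estimated degradation for that window.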
  • the main advantage of the parameter comes from the fact that psychoacoustic phenomena are taken into account without increasing the bit rate necessary for transferring the reference. This makes it possible to reduce the reference to 2 values for 1024 signal samples (3 kbits / s).
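  • The bit rates quoted for the reference can be checked with a short calculation. The sketch below assumes that each transmitted value is coded on 32 bits; this word length is not stated in the text and is chosen here because it reproduces the quoted figures. The function name is hypothetical.

```python
def reference_bitrate(values_per_window, bits_per_value,
                      window_samples=1024, sample_rate_hz=48000):
    """Bit rate (bits/s) needed to transmit the reference vector."""
    windows_per_second = sample_rate_hz / window_samples  # 46.875 at 48 kHz
    return values_per_window * bits_per_value * windows_per_second


print(reference_bitrate(2, 32))  # two values per window, 32-bit words assumed
print(reference_bitrate(1, 32))  # a single value per window
print(reference_bitrate(1, 1))   # a single thresholded bit per window
```

  • Under this assumption, two values per window give 3 kbits/s, one value gives 1.5 kbits/s and one bit gives about 47 bits/s.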
  • the second method implements an autoregressive modeling of the signal.
  • the general principle of linear prediction is to model the signal as being a combination of its past values. The idea is to calculate the
  • Prediction errors or residuals are calculated by difference between these two signals.
  • the presence and amount of noise in a signal can be determined by analyzing these residues.
  • the comparison of the residues obtained on the reference signal and those calculated from the degraded signal, and therefore of the noise levels, makes it possible to estimate the importance of the modifications and defects inserted.
  • the reference to be transmitted corresponds to the maximum of the residuals over a time window of given size. There is in fact no point in transmitting all the residuals if the bit rate of the reference is to be reduced.
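  • A minimal Python sketch of this parameter is given below: autoregressive modeling of one time window by the autocorrelation method, then the maximum of the prediction residuals. The Levinson-Durbin recursion is used here to solve the normal equations; this choice, the filter order and all function names are assumptions of the sketch, not details taken from the patent.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of one time window."""
    n = len(x)
    return [sum(x[t] * x[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]


def prediction_coefficients(r, order):
    """Levinson-Durbin solution of the normal equations.

    Returns a[0..order-1] so that x[t] is predicted by sum(a[k] * x[t-1-k]).
    """
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        if err <= 0.0:  # silent window: keep the zero predictor
            break
        refl = (r[i + 1] - sum(a[k] * r[i - k] for k in range(i))) / err
        previous = a[:]
        a[i] = refl
        for k in range(i):
            a[k] = previous[k] - refl * previous[i - 1 - k]
        err *= 1.0 - refl * refl
    return a


def max_residual(window, order):
    """Largest absolute prediction error inside the window."""
    a = prediction_coefficients(autocorrelation(window, order), order)
    residuals = [window[t] - sum(a[k] * window[t - 1 - k] for k in range(order))
                 for t in range(order, len(window))]
    return max(abs(e) for e in residuals)
```

  • The value returned by `max_residual` for the reference window and for the degraded window can then be compared, for example by difference.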
  • the gradient algorithm which is described for example in the aforementioned work of M. BELLANGER p. 371 and following.
  • the main drawback of the previous parameter is the need, in the case of an implementation on a DSP, to store the N0 samples in order to estimate the autocorrelation, obtain the coefficients of the filter and then calculate the residuals.
  • This second parameter avoids this by using another algorithm to calculate the coefficients of the filter: the gradient algorithm. It uses the error made to update the coefficients.
  • the filter coefficients are changed in the direction of the gradient of the instantaneous quadratic error, with the opposite sign.
  • the reference vector to be transmitted can thus be reduced to a single number.
  • the comparison consists of a simple calculation of the distance between the maxima of the reference and of the degraded signal, for example by difference.
  • Figure 5 summarizes the principle of the parameter calculation. The main advantage of the two parameters is the bit rate required to transfer the reference. This reduces the reference to 1 real number for 1024 signal samples.
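  • The gradient-based variant can be sketched as a standard LMS (least mean squares) adaptive predictor, which updates the coefficients sample by sample and therefore needs no stored block for the autocorrelation. The sketch below is an assumption about how such an update may look; the order, the step size and the function name are illustrative values, not taken from the patent.

```python
def lms_max_residual(window, weights=None, order=4, step=0.01):
    """Run an LMS predictor over one window.

    Returns (largest absolute prediction error, updated weights) so that the
    weights can be carried over to the next window.
    """
    w = list(weights) if weights is not None else [0.0] * order
    worst = 0.0
    for t in range(order, len(window)):
        past = window[t - order:t][::-1]  # x[t-1], ..., x[t-order]
        error = window[t] - sum(wk * pk for wk, pk in zip(w, past))
        worst = max(worst, abs(error))
        # Step against the gradient of the instantaneous squared error.
        w = [wk + step * error * pk for wk, pk in zip(w, past)]
    return worst, w
```

  • Only the maximum error per window would be transmitted as the reference; the comparison with the value obtained on the degraded signal can again be a simple difference.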
  • the third method implements an autoregressive modeling of the basilar excitation.
  • this method allows to take into account the phenomena of psychoacoustics, in order to obtain an evaluation of the perceived quality. For this, the calculation of the parameter goes through a modeling of various hearing principles.
  • a linear prediction models the signal as a combination of its past values. Analysis of the residuals (or prediction errors) makes it possible to determine and estimate the presence of noise in a signal.
  • the major drawback when using these techniques is the fact that there is no consideration of the principles of psychoacoustics. Thus, it is not possible to estimate the amount of noise actually perceived.
  • the process takes up the general principle of classical linear prediction. It also incorporates the phenomena of psychoacoustics in order to adapt it to the non-linear sensitivity of the human ear in frequency (pitch) and in intensity (loudness).
  • the first part of the calculation of this parameter corresponds to the modeling of the principles of psychoacoustics using classical hearing models.
  • the second part is the calculation of the linear prediction coefficients.
  • the last part corresponds to the comparison of the prediction coefficients calculated for the reference signal and those obtained for the degraded signal.
  • Temporal windowing of the signal then calculation of an internal representation of the signal by modeling the phenomena of psychoacoustics.
  • This step corresponds to the calculation of the compressed loudness, which is in fact the excitation induced by the signal at the level of the inner ear.
  • This representation of the signals takes account of the phenomena of psychoacoustics, and is obtained from the signal spectrum using conventional models: attenuation of the outer and middle ear, integration according to critical bands, and frequency masking.
  • This calculation step is identical to that of the parameter described above. Autoregressive modeling of this compressed loudness is then carried out in order to obtain the coefficients of an FIR prediction filter, just as in a classic linear prediction.
  • the method used is that of autocorrelation, by solving the Yule-Walker equations.
  • the first step in obtaining the prediction coefficients is therefore the calculation of the signal autocorrelation.
  • the modeling of the phenomena of psychoacoustics makes it possible to obtain 24 basilar components.
  • the order N of the prediction filter is 32. From these, 32 autocorrelation coefficients are estimated, which gives 32 prediction coefficients of which only 5 to 10 coefficients are kept as an indicator vector of quality, for example the first 5 to 10 coefficients.
  • the main advantage of the parameter comes from taking into account the phenomena of psychoacoustics. To do this, it was necessary to increase the bit rate required for the transfer of the reference to 5 or 10 values for
  • the following methods can be used with or without reference. This makes it possible to keep the most significant degradation detection measures, even in the case where no reference parameter is available at the control point, at the time when the comparison should be carried out.
  • the first of these methods implements the detection of plateaus (flat zones) in the signal activity.
  • the notion of activity, which can be approximated by a differentiation operation on the audio signal, is used to identify breaks and interruptions in the time signal.
  • These types of faults are characteristic of decoding errors after transmission of the digital audio stream or during the broadcasting of sound sequences on the Internet. This occurs when the network speed becomes insufficient to ensure the arrival of all the frames necessary at the time of decoding, for example.
  • These degradations, which introduce zones of very low activity, are perceived by the listener as different sensations: mute, blur, impulse noise, etc.
  • the first step in calculating the parameter corresponds to estimating the temporal activity of the signal.
  • the second derivative operator is used. It makes it possible to have a sufficiently precise estimate of the activity and requires very few calculations.
  • the comparison step consists of a simple difference, which makes it possible to identify the zones where the signal has been replaced by decoding plateaus.
  • $\mathrm{Plats}_r(t)$ and $\mathrm{Plats}_d(t)$ are respectively the parameter calculated on the reference and on the degraded signal.
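  • The plateau parameter can be sketched in Python as follows: activity estimated with a second derivative operator, sliding average, then the minimum over successive values. The absolute value of the second difference, the block-wise minimum, the default sizes `n1` and `m1` and the function names are assumptions of this sketch.

```python
def activity(x):
    """Temporal activity: absolute second difference of the signal."""
    return [abs(x[t + 1] - 2.0 * x[t] + x[t - 1]) for t in range(1, len(x) - 1)]


def sliding_average(values, n):
    """Average of each run of n successive values."""
    return [sum(values[i:i + n]) / n for i in range(len(values) - n + 1)]


def plats(x, n1=8, m1=64):
    """Minimum of the averaged activity in each block of m1 values."""
    averaged = sliding_average(activity(x), n1)
    return [min(averaged[i:i + m1])
            for i in range(0, len(averaged) - m1 + 1, m1)]


def plateau_flags(plats_values, threshold):
    """Binary indicator for use without reference: True where activity is too low."""
    return [value < threshold for value in plats_values]


def plateau_difference(plats_reference, plats_degraded):
    """Comparison with the reference by simple difference."""
    return [r - d for r, d in zip(plats_reference, plats_degraded)]
```

  • A positive difference for a block indicates that the degraded signal is flatter there than the reference, which is the signature of a decoding plateau.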
  • the next step therefore consists in using correspondence curves from the binary parameter. These curves provide a quality score from the cumulative length and the number of impulse degradations detected per second. These curves are established from subjective tests. Different curves can be established depending on the type of audio signals (mainly speech or music). Once the estimate has been obtained, it is also possible to use a filter simulating the response of a panelist. This makes it possible to take into account the dynamic effect of the votes and the reaction times when faced with degradations.
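  • The two quantities fed to the correspondence curves, the cumulative length of the degradations and their number per second, can be derived from the binary parameter as in the Python sketch below. The correspondence curves themselves come from subjective tests and are not reproduced; the function name and the `flags_per_second` parameter are assumptions of this sketch.

```python
def degradation_stats(flags, flags_per_second):
    """Return (cumulative degraded time in seconds, events per second).

    `flags` is the binary parameter, one value per analysis block;
    an event is a run of consecutive True values.
    """
    flags = list(flags)
    if not flags:
        return 0.0, 0.0
    degraded_blocks = sum(1 for f in flags if f)
    events = sum(1 for previous, current in zip([False] + flags, flags)
                 if current and not previous)
    duration_s = len(flags) / flags_per_second
    return degraded_blocks / flags_per_second, events / duration_s
```

  • For example, with 4 blocks per second, the sequence False, True, True, False, False, True, False, False gives 0.75 s of cumulative degradation and 1 event per second.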
  • the main advantage of the parameter is the possibility of making measurements without reference. Another advantage is the bit rate necessary for the transfer of the reference. This makes it possible to reduce the reference to 1 real number, i.e. a bit rate of 1.5 kbits/s (or even 1 bit in the event of thresholding, i.e. a bit rate of 47 bits/s) for
  • the second of these methods implements activity peak detection.
  • This parameter is based on signal activity. It makes it possible to detect dropouts, breaks in part of the audio signal and outliers by looking for peaks in the signal activity.
  • $\mathrm{ActTemp}(t) = \max_{k}\, y(t-k)$ (11), where y(t) is the signal activity calculated by the filter.
  • $\mathrm{ActTemp}_r(t)$ and $\mathrm{ActTemp}_d(t)$ are respectively the parameter calculated on the reference and on the degraded signal.
  • In the case where the reference is not available, it is possible to use thresholding to detect whether the parameter is greater than a threshold S', which indicates the presence of degradations. To avoid false detections due to signals of an impulsive nature (attacks, percussion, etc.), the threshold must have a fairly large value, which can lead to missed detections.
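  • The peak parameter and its thresholding without reference can be sketched as follows in Python. As above, the absolute second difference as activity measure, the block-wise maximum, the default sizes `n2` and `m2` and the function names are assumptions of this sketch; the threshold value must be chosen by the user.

```python
def act_temp(x, n2=8, m2=64):
    """Maximum of the averaged activity in each block of m2 values."""
    activity = [abs(x[t + 1] - 2.0 * x[t] + x[t - 1])
                for t in range(1, len(x) - 1)]
    averaged = [sum(activity[i:i + n2]) / n2
                for i in range(len(activity) - n2 + 1)]
    return [max(averaged[i:i + m2])
            for i in range(0, len(averaged) - m2 + 1, m2)]


def peak_flags(act_values, threshold):
    """Detection without reference: True where the parameter exceeds the threshold."""
    return [value > threshold for value in act_values]
```

  • When the reference is available, the values of `act_temp` obtained on both signals can instead be compared block by block, for example by difference or ratio.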
  • the use of correspondence curves is possible to estimate a perceptual quality.
  • the method consists in integrating the degradations detected by this parameter, with the others found by the previous parameter for example, and thus obtaining a global perceptual estimate.
  • the advantage of the parameter lies in the possibility of making detections without reference.
  • the reduced complexity and the low bit rate necessary for transporting the reference limited to 1 value, i.e. a bit rate of 1.5 kbits / s (or even 1 bit in case of thresholding, or a bit rate of 47 bits / s) for 1024 signal samples sampled at 48 kHz, are also interesting points.
  • the following method implements the study of the minimum of the signal spectrum to locate the degradations.
  • the first step in calculating these parameters is to estimate the spectrum of the signal.
  • $\mathrm{MinSpe} = \min(X_i)$ for $1 \le i \le N$ (13), with $X_i$ the N components of the spectrum X in dB (by distance calculation).
  • $X_{r,i}$ is the i-th of the N components of the spectrum obtained on the reference
  • $X_{d,i}$ is the i-th of the N components of the spectrum obtained on the degraded signal
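  • A minimal Python sketch of the spectrum-minimum parameter follows. A naive DFT is used to keep the example self-contained; the normalization, the dB floor, the function names and the decision rule in `noise_suspected` (minimum on the degraded signal above the largest minimum on the reference) are assumptions of this sketch.

```python
import cmath
import math


def spectrum_db(block, floor=1e-12):
    """Power spectrum in dB of one block (naive DFT, bins 0..N/2)."""
    n = len(block)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(block[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(10.0 * math.log10(abs(acc) ** 2 / n + floor))
    return spectrum


def min_spe(block):
    """Minimum of the spectrum of one block, in dB."""
    return min(spectrum_db(block))


def block_minima(signal, n3):
    """MinSpe for each successive block of n3 samples."""
    return [min_spe(signal[i:i + n3])
            for i in range(0, len(signal) - n3 + 1, n3)]


def noise_suspected(reference_minima, degraded_minima):
    """Flag added noise over the same successive blocks of both signals."""
    return min(degraded_minima) > max(reference_minima)
```

  • Added broadband noise fills the spectral gaps of the signal, so the minimum of the spectrum rises; this is what the comparison looks for.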
  • the method can be summarized as follows by the following two diagrams Figure 9. Again, the main advantage of these parameters is the possibility of making measurements without reference. Another interesting point is the speed necessary for the transfer of the reference. This makes it possible to reduce the reference to 1 real number and even 1 integer, ie a bit rate of at most 1.5 kbit / s for N (for example 1024) signal samples. The reduced complexity of the algorithm is also an asset.
  • The statistical flattening coefficient known as "kurtosis" (or "concentration") is used.
  • It is estimated from the central moments of order 2 and 4, which characterize the shape of the spectrum relative to a normal distribution in the statistical sense.
  • The coefficient is the ratio between the central moment of order 4 and the square of the central moment of order 2 (the variance) of the spectrum coefficients.
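  • The formula used is as follows: $\mathrm{Kurtosis} = \dfrac{\frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})^4}{\left(\frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})^2\right)^2}$
  • $\bar{X}$ is the arithmetic mean of the N components $X_i$ of the spectrum X in dB.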
  • The higher the value obtained, the more concentrated the signal and the less noise it contains. The coefficient is calculated on the reference and on the degraded signal; comparing the two gives an estimate of the level of white noise inserted.
  • The diagram in Figure 10 presents the principle (valid for the two parameters above). When the reference is available, a simple distance (a difference or another type) is sufficient to detect the degradations. If no reference is available, degradations are found by detecting peaks in the variation of the parameters. This can be done with gray-level mathematical morphology (erosions and dilations), a classic image-processing technique.
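The two ideas above can be sketched as follows: the kurtosis as the fourth central moment over the squared variance, and a no-reference peak detector built from a 1-D gray-level erosion followed by a dilation. Combining them as a "top-hat" (signal minus its opening) is an assumption; the text only names erosions and dilations. The function names, window width and threshold are hypothetical.

```python
def kurtosis(values):
    """Fourth central moment divided by the squared second central moment.

    Assumes the values are not all equal (the variance must be non-zero).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2)


def erode(series, width):
    """Gray-level erosion: sliding minimum over a centered window."""
    half = width // 2
    return [min(series[max(0, i - half):i + half + 1]) for i in range(len(series))]


def dilate(series, width):
    """Gray-level dilation: sliding maximum over a centered window."""
    half = width // 2
    return [max(series[max(0, i - half):i + half + 1]) for i in range(len(series))]


def find_peaks(series, width=3, threshold=2.0):
    """Indices where the series exceeds its morphological opening by more than threshold."""
    opened = dilate(erode(series, width), width)
    return [i for i, (s, o) in enumerate(zip(series, opened)) if s - o > threshold]


# Hypothetical track of a parameter over successive frames, with one spike.
track = [1.0, 1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(find_peaks(track))  # [4]
```

The opening removes any bump narrower than the window, so subtracting it leaves only short spikes; wider plateaus, which are more likely to be genuine signal changes, are left untouched.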
  • The reference audio signal is the signal at the input of the broadcasting network.
  • The reference parameters are calculated on this signal, then transmitted over a dedicated data channel to the desired measurement point. There, the same parameters are calculated on the received signal and compared with the reference parameters to produce the reduced-reference measurements. No-reference measurements are also calculated. If the reference parameters are unavailable (missing, incorrect, ...), these no-reference measurements are sufficient to detect the most significant errors.
  • In that case, the dotted subsystems in Figure 11 are not used.
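The fallback logic at the measurement point can be sketched as below: when a valid reference parameter has arrived, it is compared with the locally computed one; otherwise a no-reference threshold is applied. The function name `assess_frame`, the validity test and both thresholds are hypothetical and serve only as an illustration.

```python
import math
from typing import Optional, Tuple


def assess_frame(measured: float,
                 reference: Optional[float],
                 max_distance: float,
                 no_ref_threshold: float) -> Tuple[str, bool]:
    """Return (mode, degraded) for one frame at the measurement point."""
    # A reference that is missing or not a finite number is treated as unavailable.
    if reference is not None and math.isfinite(reference):
        return "reduced-reference", abs(measured - reference) > max_distance
    return "no-reference", measured > no_ref_threshold


print(assess_frame(-40.0, -90.0, 10.0, -30.0))  # ('reduced-reference', True)
print(assess_frame(-40.0, None, 10.0, -30.0))   # ('no-reference', False)
```

The second call shows the trade-off noted earlier: with a deliberately high no-reference threshold, a degradation that the reference comparison catches can go undetected.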
  • The same diagram can be used to visualize (with or without reference) the performance of radio broadcasting over the Internet.
  • The data channel that transports the reference parameters, and that returns the estimated scores to the monitoring center, can be the network itself.
  • The reference signal is the signal sent by the server, and the degraded signal is the one decoded at the chosen measurement point. This can be used, for example, to choose the most appropriate server for a given connection location by accessing data from a monitoring center.
  • The diagram below (Figure 12) illustrates this embodiment in the case where the reference parameters are sent over the network and the resulting scores use a dedicated transmission channel.
  • A method according to the invention is applicable whenever faults must be identified on an audio signal that has been transmitted by any broadcasting network (cable, satellite, wireless, Internet, DVB, DAB, etc.).
  • The proposed method exploits two classes of techniques: those with reduced reference and those without reference. It is particularly advantageous when the bit rate available for transmitting the reference is limited.
  • The invention is applicable operationally to metrology equipment and to supervision systems for audio signal distribution networks.
  • One of its advantageous characteristics lies in the combination of the measurements carried out with and without reference.
  • The invention meets the requirements imposed in quality-of-service management systems.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
EP03715043.0A 2002-01-24 2003-01-23 Verfahren zur qualitativen bewertung eines digitalen audiosignals Expired - Lifetime EP1468416B1 (de)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
FR0200856A FR2835125B1 (fr) 2002-01-24 2002-01-24 Procede d'evaluation d'un signal audio numerique
FR0200856 2002-01-24
PCT/FR2003/000222 WO2003063134A1 (fr) 2002-01-24 2003-01-23 Procede d'evaluation qualitative d'un signal audio numerique.

Publications (2)

Publication Number Publication Date
EP1468416A1 (de) 2004-10-20
EP1468416B1 (de) 2015-12-23

Family

ID=27589574

Family Applications (1)

Application Number Title Priority Date Filing Date
EP03715043.0A Expired - Lifetime EP1468416B1 (de) 2002-01-24 2003-01-23 Verfahren zur qualitativen bewertung eines digitalen audiosignals

Country Status (5)

Country Link
US (2) US8036765B2 (de)
EP (1) EP1468416B1 (de)
CA (1) CA2474067C (de)
FR (1) FR2835125B1 (de)
WO (1) WO2003063134A1 (de)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112562714A (zh) * 2020-11-24 2021-03-26 潍柴动力股份有限公司 一种噪声评估方法及装置

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2833791B1 (fr) * 2001-12-13 2004-02-06 Telediffusion De France Tdf Dispositif de metrologie pour la surveillance automatique d'un reseau de diffusion de signaux numeriques et reseau de diffusion comprenant un tel dispositif de metrologie
CN101512935B (zh) * 2006-07-27 2013-10-30 艾利森电话股份有限公司 通过多个发射机进行分层广播发射
US8599704B2 (en) * 2007-01-23 2013-12-03 Microsoft Corporation Assessing gateway quality using audio systems
EP2115742B1 (de) * 2007-03-02 2012-09-12 Telefonaktiebolaget LM Ericsson (publ) Verfahren und anordnungen in einem telekommunikationsnetz
CN101608947B (zh) * 2008-06-19 2012-05-16 鸿富锦精密工业(深圳)有限公司 声音测试方法
US20100161779A1 (en) * 2008-12-24 2010-06-24 Verizon Services Organization Inc System and method for providing quality-referenced multimedia
EP2392003B1 (de) * 2009-01-30 2013-01-02 Telefonaktiebolaget LM Ericsson (publ) Tonsignalqualitätsvorhersage
ES2452170T3 (es) * 2009-05-14 2014-03-31 Koninklijke Philips N.V. Detección robusta de transmisiones de DVD-T/H
WO2010140940A1 (en) 2009-06-04 2010-12-09 Telefonaktiebolaget Lm Ericsson (Publ) A method and arrangement for estimating the quality degradation of a processed signal
US8560312B2 (en) * 2009-12-17 2013-10-15 Alcatel Lucent Method and apparatus for the detection of impulsive noise in transmitted speech signals for use in speech quality assessment
WO2012078142A1 (en) 2010-12-07 2012-06-14 Empire Technology Development Llc Audio fingerprint differences for end-to-end quality of experience measurement
US9779731B1 (en) * 2012-08-20 2017-10-03 Amazon Technologies, Inc. Echo cancellation based on shared reference signals
US9679555B2 (en) 2013-06-26 2017-06-13 Qualcomm Incorporated Systems and methods for measuring speech signal quality
US9619980B2 (en) 2013-09-06 2017-04-11 Immersion Corporation Systems and methods for generating haptic effects associated with audio signals
US9576445B2 (en) 2013-09-06 2017-02-21 Immersion Corp. Systems and methods for generating haptic effects associated with an envelope in audio signals
CN104681038B (zh) * 2013-11-29 2018-03-09 清华大学 音频信号质量检测方法及装置
US10147441B1 (en) 2013-12-19 2018-12-04 Amazon Technologies, Inc. Voice controlled system
US10224759B2 (en) 2014-07-15 2019-03-05 Qorvo Us, Inc. Radio frequency (RF) power harvesting circuit
US10566843B2 (en) * 2014-07-15 2020-02-18 Qorvo Us, Inc. Wireless charging circuit
US10559970B2 (en) 2014-09-16 2020-02-11 Qorvo Us, Inc. Method for wireless charging power control
CN105893515B (zh) * 2016-03-30 2021-02-05 腾讯科技(深圳)有限公司 一种信息处理方法及服务器
RU2700551C2 (ru) * 2018-01-22 2019-09-17 Российская Федерация, от имени которой выступает Министерство обороны Российской Федерации Способ контроля качества каналов передачи данных в автоматизированных системах управления реального масштаба времени
CN110570874B (zh) * 2018-06-05 2021-10-22 中国科学院声学研究所 一种用于监测野外鸟类鸣声强度及分布的系统及其方法
CN109147804A (zh) * 2018-06-05 2019-01-04 安克创新科技股份有限公司 一种基于深度学习的音质特性处理方法及系统
CN110211610A (zh) * 2019-06-20 2019-09-06 平安科技(深圳)有限公司 评估音频信号损失的方法、装置及存储介质
CN112929808A (zh) * 2021-02-05 2021-06-08 四川湖山电器股份有限公司 检测校园广播播音设备能否正常工作的方法、模块及系统
EP4084366A1 (de) * 2021-04-26 2022-11-02 Aptiv Technologies Limited Verfahren zum testen einer rundfunkempfängervorrichtung in einem fahrzeug
CN113409820B (zh) * 2021-06-09 2022-03-15 合肥群音信息服务有限公司 一种基于语音数据的质量评价方法
CN113488074B (zh) * 2021-08-20 2023-06-23 四川大学 一种用于检测合成语音的二维时频特征生成方法

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB9213459D0 (en) * 1992-06-24 1992-08-05 British Telecomm Characterisation of communications systems using a speech-like test stimulus
FR2737948B1 (fr) * 1995-08-16 1997-10-17 Alcatel Mobile Comm France Dispositif de commande de volume sonore pour recepteur de signaux de parole codes par blocs
FR2769777B1 (fr) * 1997-10-13 1999-12-24 Telediffusion Fse Procede et systeme d'evaluation, a la reception, de la qualite d'un signal numerique, tel qu'un signal audio/video numerique
CA2230188A1 (en) * 1998-03-27 1999-09-27 William C. Treurniet Objective audio quality measurement
SE517547C2 (sv) * 1998-06-08 2002-06-18 Ericsson Telefon Ab L M Signalsynkronisering vid signalkvalitetsmätning
EP0980064A1 (de) * 1998-06-26 2000-02-16 Ascom AG Verfahren zur Durchführung einer maschinengestützten Beurteilung der Uebertragungsqualität von Audiosignalen
US7006555B1 (en) * 1998-07-16 2006-02-28 Nielsen Media Research, Inc. Spectral audio encoding
EP1277295A1 (de) * 1999-10-27 2003-01-22 Nielsen Media Research, Inc. Verfahren und vorrichtung zur kodierung von tonsignalen für verwendung in programmidentifikationssystemen, durch hinzufügen eines unhörbaren kodes im tonsignal
NL1014075C2 (nl) * 2000-01-13 2001-07-16 Koninkl Kpn Nv Methode en inrichting voor het bepalen van de kwaliteit van een signaal.

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO03063134A1 *

Also Published As

Publication number Publication date
FR2835125B1 (fr) 2004-06-18
WO2003063134A1 (fr) 2003-07-31
US20050143974A1 (en) 2005-06-30
FR2835125A1 (fr) 2003-07-25
CA2474067A1 (en) 2003-07-31
EP1468416B1 (de) 2015-12-23
US20120099734A1 (en) 2012-04-26
US8606385B2 (en) 2013-12-10
US8036765B2 (en) 2011-10-11
CA2474067C (en) 2014-12-30

Similar Documents

Publication Publication Date Title
EP1468416B1 (de) Verfahren zur qualitativen bewertung eines digitalen audiosignals
EP2419900B1 (de) Verfahren und einrichtung zur objektiven evaluierung der sprachqualität eines sprachsignals unter berücksichtigung der klassifikation der in dem signal enthaltenen hintergrundgeräusche
EP0768770B1 (de) Verfahren und Vorrichtung zur Erzeugung von Hintergrundrauschen in einem digitalen Übertragungssystem
EP2691952B1 (de) Zuweisung von bits anhand von subbändern zur quantifizierung von rauminformationsparametern für parametrische codierung
EP1051703B1 (de) Verfahren zur dekodierung eines audiosignals mit korrektur von übertragungsfehlern
EP1997103B1 (de) Verfahren zur codierung eines quellenaudiosignals, entsprechende codierungseinrichtung, decodierungsverfahren und einrichtung, signal und computerprogrammprodukte
EP0906613B1 (de) Verfahren und vorrichtung zur kodierung eines audiosignals mittels "vorwärts"- und "rückwärts"-lpc-analyse
EP2080194B1 (de) Dämpfung von stimmüberlagerung, im besonderen zur erregungserzeugung bei einem decoder in abwesenheit von informationen
EP2795618B1 (de) Verfahren zur erkennung eines vorgegebenen frequenzbandes in einem audiodatensignal, erkennungsvorrichtung und computerprogramm dafür
FR2882458A1 (fr) Procede de mesure de la gene due au bruit dans un signal audio
EP1875465A1 (de) Verfahren zur anpassung für interoperabilität zwischen kurzzeit-korrelationsmodellen digitaler signale
CA2377808C (fr) Procede d'evaluation de la qualite de sequences audiovisuelles
US6549757B1 (en) Method and system for assessing, at reception level, the quality of a digital signal, such as a digital audio/video signal
EP1468573A1 (de) Verfahren zur synchronisierung von zwei digitalen datenströmen mit gleichem inhalt
CN108877816B (zh) 基于qmdct系数的aac音频重压缩检测方法
Organiściak et al. Single-ended quality measurement of a music content via convolutional recurrent neural networks
EP1159795B1 (de) Verfahren zur qualitätsüberwachung von einem digitaltonsignal das mit einem rundfunkprogramm übertragen wird
FR2790845A1 (fr) Procede de controle de la qualite d'un signal audionumerique distribue

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20040728

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT SE SI SK TR

RIN1 Information on inventor provided before grant (corrected)

Inventor name: JOLY, ALEXANDRE

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: TDF

17Q First examination report despatched

Effective date: 20080707

APBK Appeal reference recorded

Free format text: ORIGINAL CODE: EPIDOSNREFNE

APBN Date of receipt of notice of appeal recorded

Free format text: ORIGINAL CODE: EPIDOSNNOA2E

APBR Date of receipt of statement of grounds of appeal recorded

Free format text: ORIGINAL CODE: EPIDOSNNOA3E

APAV Appeal reference deleted

Free format text: ORIGINAL CODE: EPIDOSDREFNE

APBT Appeal procedure closed

Free format text: ORIGINAL CODE: EPIDOSNNOA9E

REG Reference to a national code

Ref country code: DE

Ref legal event code: R079

Ref document number: 60348361

Country of ref document: DE

Free format text: PREVIOUS MAIN CLASS: G10L0019000000

Ipc: G10L0025690000

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

RIC1 Information provided on ipc code assigned before grant

Ipc: G10L 25/69 20130101AFI20150625BHEP

INTG Intention to grant announced

Effective date: 20150416

GRAC Information related to communication of intention to grant a patent modified

Free format text: ORIGINAL CODE: EPIDOSCIGR1

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

INTG Intention to grant announced

Effective date: 20150716

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): DE FR GB NL

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

Free format text: NOT ENGLISH

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 14

REG Reference to a national code

Ref country code: DE

Ref legal event code: R096

Ref document number: 60348361

Country of ref document: DE

REG Reference to a national code

Ref country code: NL

Ref legal event code: FP

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 60348361

Country of ref document: DE

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed

Effective date: 20160926

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 15

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20161228

Year of fee payment: 15

Ref country code: NL

Payment date: 20161220

Year of fee payment: 15

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20161221

Year of fee payment: 15

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20161219

Year of fee payment: 15

REG Reference to a national code

Ref country code: DE

Ref legal event code: R119

Ref document number: 60348361

Country of ref document: DE

REG Reference to a national code

Ref country code: NL

Ref legal event code: MM

Effective date: 20180201

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20180123

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20180801

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20180131

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

Effective date: 20180928

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20180201

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20180123