EP1468416A1 - Method for the qualitative evaluation of a digital audio signal - Google Patents
Method for the qualitative evaluation of a digital audio signal
- Publication number
- EP1468416A1 (EP03715043A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- audio signal
- signal
- evaluated
- quality indicator
- implements
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/69—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for evaluating synthetic or decoded voice signals
Definitions
- the subject of the present invention is a method for evaluating a digital audio signal, in particular a digitally transmitted signal and / or a digital signal to which digital coding has been applied, in particular with rate reduction and / or decoding.
- a digitally transmitted signal can be a standalone audio signal (broadcasting) or an audio signal that accompanies a program such as an audiovisual program.
- The first directly compares the original signal to the degraded signal (after coding, broadcasting, multiplexing, etc.).
- the second compares only parameters extracted from two signals (called reduced reference).
- the faults generated by the diffusion chain are detected using their main known characteristics.
- This last class overcomes the constraints linked to the use of the reference signal. Indeed, in all other cases, the reference must be sent to the place of comparison and then perfectly synchronized with the degraded signal. This makes the system more complex and more expensive.
- Degradations due to transmission errors significantly reduce the signal quality. They appear, for example, during the broadcast of an MPEG digital stream, or during broadcasting, notably of radio, over the Internet.
- The full-reference methods, in which the signal to be evaluated is compared to the reference signal, correspond to the conventional techniques used, for example, to estimate the quality of audio coders. Their general principle is based on the calculation, via a perceptual hearing model, of an internal representation of the original signal and of the degraded signal, followed by a comparison of these two internal representations. Such a method is described in the article by John G. BEERENDS and Jan A. STEMERDINK entitled "A Perceptual Audio Quality Measure Based on a Psychoacoustic Sound Representation", published in the "Journal of the Audio Engineering Society", vol. 40, no. 12, December 1992, pages 963 to 978.
- the OBQ (Output-Based Objective Speech Quality) measurement is the most advanced of the techniques without reference. This method of estimating the quality of a speech signal only, without a reference signal, is based on the calculation of perceptual parameters representing the content of the signal, gathered in a vector. These vectors, calculated on non-degraded signals, will constitute a reference base. The quality will be estimated by comparing the same parameters, extracted from the degraded signals, with the vectors of the reference base.
- the main method using neural networks is the OSSQAR (Objective Scaling of Sound Quality And Reproduction) measurement. The general principle of this method is to use a hearing model in conjunction with a neural network.
- the network is trained to predict the subjective quality of a signal from its perceptual representation calculated by the hearing model, to simulate the phenomena of psychoacoustics. It should be noted that the results obtained by these methods are much better when the signals are part of the learning base or at least when they have close characteristics.
- Such methods are therefore not suitable for evaluating the quality of arbitrary signals, for example the audio signals of a radio or TV broadcast.
- the present invention proposes a method according to which the indicators are simpler and can be calculated in real time and in continuous time, and require a significantly lower bit rate. Since the degradations can only modify a few samples, while degrading the quality significantly, the proposed method allows the entire audio stream to be analyzed.
- the method according to the invention allows a reliable estimate of the quality of an audio signal having passed through a digital type transmission or coding. Indeed, the disturbances undergone by the transmission channels can induce the appearance of errors on the transmitted data; these errors result in degradations in the final audio signal.
- The proposed technological approach consists in making one measurement on the audio signal at the input, and another at the output, of the chain or of any other system to be studied. A comparison between these measurements makes it possible to ensure the "transparency" of the transmission channel and to assess the extent of the degradations introduced. Whether or not it is used in conjunction with methods without reference, which detect degradations based on the signature of the most important defects to be sought, the proposed approach allows a reliable estimate of the degradations introduced. It also makes it possible to compensate for the lack of a reference signal. This method makes it possible to reduce the reference bit rate necessary for estimating the quality in the case of measurements with reduced reference, and the number of parameters to be used in the case of measurements without reference.
- The invention thus relates to a method for evaluating a digital audio signal, characterized in that it implements in real time and in continuous time, in successive time windows, the calculation of a quality indicator constituted, for each time window, by a vector whose size is advantageously at least one hundred times smaller than the number of audio samples in a time window.
- This dimension is for example between 1 and 10 and preferably between 1 and 5.
- the digital audio signal to be evaluated can be a signal which has been transmitted digitally and / or which has been subjected to digital coding, in particular with reduction in bit rate, from a digital reference signal.
- The method is characterized in that the generation of a said quality indicator vector implements, for a reference audio signal and for the audio signal to be evaluated, the following steps: a) calculate the power spectral density of the audio signal for each time window and apply a filter representative of the attenuation of the inner and middle ear to obtain a filtered spectral density, b) calculate from this filtered spectral density individual excitations using the frequency spreading function in the basilar scale, c) determine from said individual excitations the compressed loudness using a function modeling the nonlinear frequency sensitivity of the ear, to obtain basilar components, d) separate the basilar components into classes, preferably three classes, and calculate for each class a number C representing the sum of the frequencies of this class, said vector consisting of said numbers C, e) calculate a distance between the vectors of the reference audio signal and of the audio signal to be evaluated associated with each time window to perform a so-called
- The method is characterized in that the generation of a said quality indicator vector implements, for the reference audio signal and for the audio signal to be evaluated, the following steps: a) calculating N coefficients of a prediction filter by autoregressive modeling, b) determining in each time window the maximum of the residue obtained by difference between the signal predicted using the prediction filter and the audio signal, said maximum of the prediction residue constituting said quality indicator vector, c) calculating a distance between said vectors of the reference audio signal and of the audio signal to be evaluated associated with each time window in order to carry out a so-called evaluation of the degradation of the audio signal.
- The method is characterized in that the generation of a said quality indicator vector implements, for the reference audio signal and for the audio signal to be evaluated, the following steps: a) calculate for each time window the power spectral density of the audio signal and apply to it a filter representative of the attenuation of the inner and middle ear, to obtain a frequency spreading function in the basilar scale, b) calculate individual excitations from the frequency spreading function in the basilar scale, c) obtain from said individual excitations the compressed loudness using a function modeling the nonlinear frequency sensitivity of the ear, to obtain basilar components, d) calculate from said basilar components N' prediction coefficients of a prediction filter by autoregressive modeling, e) generate for each time window a said quality indicator vector from only some of the N' prediction coefficients.
- the quality indicator vector comprises between 5 and 10 of said prediction coefficients.
- The method is characterized in that the generation of a said quality indicator vector implements, at least for the audio signal to be evaluated, the following steps: a) calculate a temporal activity of the signal in each time window, b) calculate a sliding average over N1 successive values of the temporal activity, c) keep the minimum value among M1 successive values of the sliding average.
- the quality indicator vector can be constituted by said minimum value, or alternatively by a binary value resulting from the comparison of said minimum value with a given threshold.
- the method can be characterized in that it implements the calculation of a quality score by determining a cumulative time interval during which said minimum value is less than a given threshold and / or by determining the number of times per second where said minimum value is less than a given threshold or else in that said minimum values are generated both for the reference audio signal and for the audio signal to be evaluated and in that a quality vector is generated by comparison between the corresponding minimum values of the reference audio signal and of the audio signal to be evaluated, for example by calculating the difference or the ratio between said minimum values.
- The method is characterized in that the generation of a said quality indicator vector implements, at least for the audio signal to be evaluated, the following steps: a) calculate a temporal activity of the signal in each time window, b) calculate a sliding average over N2 successive values of the temporal activity, c) keep the maximum value among M2 successive values of the sliding average.
- The quality indicator vector can be constituted by said maximum value or by a binary value resulting from the comparison of said maximum value with a given threshold.
- the method can be characterized in that a degradation indicator is generated by comparison between the maximum value obtained on the reference audio signal and its corresponding maximum value obtained on the audio signal to be evaluated, for example by calculating the difference or the ratio between these maximum values.
- The method is characterized in that the generation of a said quality indicator vector implements, at least for the audio signal to be evaluated, the calculation of the Fourier transform in successive blocks of N3 samples constituting said time windows, and the calculation of the minimum of the spectrum in M3 successive blocks, which constitutes a quality indicator vector.
- The method can be characterized in that it includes a step of evaluating the introduction of noise into the audio signal to be evaluated by comparing the value of said minimum of the spectrum in M3 successive blocks associated with the audio signal to be evaluated with the maximum value of the M3 minima obtained in the same M3 successive blocks associated with the reference audio signal.
- The method comprises a step of evaluating the introduction of noise into the audio signal to be evaluated by comparing the value of said minimum of the spectrum in M3 successive blocks with an average value of the spectrum minima obtained in blocks prior to the M3 successive blocks, for example by calculating the difference or the ratio between these values.
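The spectrum-minimum comparison against a reference can be sketched as follows in plain Python. This is only an illustration under stated assumptions: the naive DFT, the function names `block_spectrum_minimum` and `spectrum_minima`, the block size of 64 samples, M3 = 4 and the test signals are all hypothetical choices, not values taken from the patent.

```python
import cmath
import math
import random


def block_spectrum_minimum(block):
    """Smallest power-spectrum value of one block (naive DFT, bins 0..N/2)."""
    n = len(block)
    powers = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(block))
        powers.append(abs(acc) ** 2)
    return min(powers)


def spectrum_minima(signal, block_size=64, m3=4):
    """Spectrum minimum of each of the first m3 successive blocks."""
    return [block_spectrum_minimum(signal[start:start + block_size])
            for start in range(0, block_size * m3, block_size)]


# Hypothetical signals: a clean tone, and the same tone with weak added noise.
rng = random.Random(0)
reference = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
degraded = [x + rng.uniform(-0.1, 0.1) for x in reference]

ref_minima = spectrum_minima(reference)
deg_minima = spectrum_minima(degraded)

# Noise is flagged when the minimum over the degraded blocks exceeds
# the largest of the minima found in the same blocks of the reference.
noise_detected = min(deg_minima) > max(ref_minima)
```

Added broadband noise raises the floor of the spectrum, so the minimum is a cheap single-number indicator of it.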
- The method is characterized in that the generation of a said quality indicator vector implements, at least for the audio signal to be evaluated, the calculation of a spectrum flattening parameter, which is the ratio between an arithmetic mean and a geometric mean of the components of the signal spectrum.
- the method can then be characterized in that it implements an indicator for detecting a degradation of the audio signal by the introduction of broadband noise by comparing said spectrum flattening parameter between the reference audio signal and the audio signal to be evaluated, for example by calculating the difference or the ratio between these two parameters.
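A minimal sketch of the spectrum flattening parameter, assuming a naive DFT over one block. The function name `flattening_ratio`, the small `floor` value that guards against taking the logarithm of zero, and the test signals are assumptions made for this example only.

```python
import cmath
import math
import random


def flattening_ratio(block, floor=1e-12):
    """Arithmetic mean divided by geometric mean of the power spectrum.

    `floor` is an assumption of this sketch: it clamps empty bins so that
    the logarithm used for the geometric mean is always defined.
    """
    n = len(block)
    powers = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(block))
        powers.append(max(abs(acc) ** 2, floor))
    arithmetic = sum(powers) / len(powers)
    geometric = math.exp(sum(math.log(p) for p in powers) / len(powers))
    return arithmetic / geometric


# Hypothetical signals: a clean tone, and the same tone plus broadband noise.
rng = random.Random(1)
reference = [math.sin(2 * math.pi * 8 * n / 64) for n in range(64)]
degraded = [x + rng.uniform(-0.1, 0.1) for x in reference]

ratio_ref = flattening_ratio(reference)
ratio_deg = flattening_ratio(degraded)
```

The ratio is never below 1 and equals 1 for a perfectly flat spectrum, so broadband noise pulls it down toward 1; comparing `ratio_ref` with `ratio_deg` by difference or ratio gives the degradation indicator.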
- FIG. 1 is a flowchart illustrating a quality assessment with full reference.
- FIG. 2 illustrates an audio transmission with loss of quality.
- FIGS. 3 to 10 illustrate evaluation methods according to the present invention.
- FIGS. 11 and 12 illustrate an audio quality system implementing the present invention.
- the management and recovery of decoding errors is not standardized. The influence of these errors on the perceived quality therefore depends on the decoder used.
- the audibility of these faults is also linked to the type of element affected in the frame, for example MPEG, and to its audio content.
- the quality can be estimated in binary fashion: either the signal has not been degraded and the quality will depend on the initial coding used, or errors have been introduced and significant degradations appear.
- The quality can then be estimated by methods without reference, by counting the degradations detected over regular time intervals of the order of, for example, one second.
- Subjective tests have in fact made it possible to obtain a reliable estimate of the perceived quality, from the number and the length of the interruptions linked to impulse-type degradations in a signal.
- The proposed method makes it possible to reduce the bit rate required for transporting the reference. This allows the use of reserved channels with a relatively limited bit rate. These measurements make it possible to detect degradations other than those due to transmission errors.
- The present invention allows a reduction in the bit rate in the case of measurements with reduced reference. By adding simple measurements without reference, it also makes it possible to keep measurements of the significant degradations when, for example, the reference is lost, by locally generating a vector which simply characterizes the degradations and which can therefore be easily processed and transmitted to a control installation, in particular a centralized one.
- the measurements taken along the chain and at various points on the network inform the monitoring and management system for digital television broadcasting on its overall performance. Measurements of signal degradations inform the broadcasting operator about the quality of service delivered.
- the process is characterized by two complementary operating modes: With reduced reference.
- the technological approach proposed consists in making a measurement on the audio signal, at the input, and another at the output of the transmission chain or any other system to be studied (encoder, decoder, etc.). A comparison between these measures makes it possible to ensure the "transparency" of the chain or system and to assess the extent of the degradations introduced.
- the method performs an evaluation in real time and in continuous time.
- the reference measurements at the input of the chain represent a very small amount of data compared to the audio signal data, hence its classification as “reduced reference”.
- the reference data or measurements used are both a reduced representation of the content of the signal and a measure of the importance of a type of degradation.
- the invention makes it possible to compensate for a lack of reference signal.
- the method defines measures for the characteristic digital faults to be sought.
- the proposed approach allows an estimation of the degradations introduced on any signal, and in a reliable manner and this approach can be implemented both on the scale of a transmission network and locally on an equipment.
- the computation complexity according to the method is low, and the indicator obtained represents a small quantity of data compared to the digital audio stream.
- the method can be applied indifferently to purely digital signals or to signals having undergone after transmission a digital to analog conversion then analog to digital.
- Perceptual modeling: the principle of objective perceptual measurements is based on the transformation of the physical representation (sound pressure, level, time and frequency) into the psychoacoustic representation (loudness, masking level, time and critical bands or Barks) of two signals (the reference signal and the signal to be evaluated) in order to compare them. This transformation is carried out by modeling the human auditory system (generally, this modeling consists of a spectral analysis in the Bark domain followed by spreading phenomena). A distance can then be calculated between the psychoacoustic representations of the two signals, and this distance can be linked to the quality of the signal to be evaluated (the smaller the distance, the closer the signal to be evaluated is to the original signal and the better its quality).
- The first method implements a parameter called "Perceptual Count Difference".
- Windowing of the temporal signal in blocks then, for each of the blocks, calculation of the excitation induced by the signal using a hearing model.
- This representation of the signals takes into account the phenomena of psychoacoustics, and provides a histogram whose counts are the values of the basilar components.
- the attenuation filter for the outer and middle ear is applied to the power spectral density, obtained from the signal spectrum.
- This filter also takes into account the absolute hearing threshold.
- the notion of critical bands is modeled by a transformation of the frequency scale into a basilar scale.
- the following stage corresponds to the calculation of the individual excitations to take account of the masking phenomena, thanks to the frequency spreading function in the basilar scale and to a nonlinear addition.
- the last step makes it possible to obtain the compressed loudness, by a power function, to model the non-linear frequency sensitivity of the ear, by a histogram comprising the 109 basilar components.
- The histogram counts obtained are then grouped into three classes. This vectorization makes it possible to obtain a visual representation of the evolution of the structure of the signals. It also makes it possible to obtain a simple and concise characterization of the signal and therefore a particularly useful reference parameter.
- the second strategy takes into account the Beerends scaling zones.
- gain compensation between the excitation of the reference signal and that of the signal to be tested is carried out by the ear, the fixed limits are then the following:
- A point (X, Y) constituting a vector is therefore obtained for each time window of the signal, which corresponds to the transmission of two values per window of, for example, 1024 samples, i.e. a bit rate of 3 kbits / s for an audio signal sampled at 48 kHz.
- the associated representation is thus a trajectory parameterized by time, as shown in Figure 3.
- the distance (Euclidean) between the reference signal and the degraded signal is then calculated.
- the distance between the points makes it possible to estimate the extent of the degradations introduced between the reference signal and the degraded signal. This distance can be considered as a perceptual distance due to the use of models of psychoacoustics.
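The grouping into classes and the Euclidean distance can be sketched as below. This is a hedged illustration: the class limits `(36, 72)` are hypothetical (the fixed limits are not reproduced in this text), the function names are invented for the example, and the sketch keeps all three class sums because the text does not say how they map to the two transmitted values (X, Y).

```python
import math


def class_vector(basilar_components, limits=(36, 72)):
    """Sum the histogram counts of the basilar components in three classes.

    `limits` holds hypothetical class boundaries (component indices).
    """
    low, high = limits
    return [
        sum(basilar_components[:low]),
        sum(basilar_components[low:high]),
        sum(basilar_components[high:]),
    ]


def perceptual_distance(ref_vector, deg_vector):
    """Euclidean distance between the vectors of one time window."""
    return math.sqrt(sum((r - d) ** 2 for r, d in zip(ref_vector, deg_vector)))


# Hypothetical compressed-loudness values for the 109 basilar components.
reference_components = [1.0] * 109
degraded_components = list(reference_components)
for index in range(80, 109):          # extra excitation in the upper class
    degraded_components[index] += 0.5

ref_vec = class_vector(reference_components)
deg_vec = class_vector(degraded_components)
distance = perceptual_distance(ref_vec, deg_vec)
```

Computing one such distance per time window yields the trajectory comparison described in the text.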
- the main advantage of the parameter comes from the fact that psychoacoustic phenomena are taken into account without increasing the bit rate necessary for transferring the reference. This makes it possible to reduce the reference to 2 values for 1024 signal samples (3 kbits / s).
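The 3 kbits / s figure can be checked with a short calculation, assuming that each transmitted value is coded on 32 bits (an assumption; the text does not state the word length):

```latex
\[
\frac{48\,000\ \text{samples/s}}{1024\ \text{samples/window}}
\times 2\ \text{values/window}
\times 32\ \text{bits/value}
= 46.875 \times 64\ \text{bits/s}
= 3000\ \text{bits/s}
\]
```

For comparison, 16-bit stereo audio at 48 kHz represents 1536 kbits / s, so under this assumption the reference is several hundred times smaller than the audio stream.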
- the second method implements an autoregressive modeling of the signal.
- the general principle of linear prediction is to model the signal as being a combination of its past values. The idea is to calculate the
- Prediction errors or residuals are calculated by difference between these two signals.
- the presence and amount of noise in a signal can be determined by analyzing these residues.
- the comparison of the residues obtained on the reference signal and those calculated from the degraded signal, and therefore of the noise levels, makes it possible to estimate the importance of the modifications and defects inserted.
- The reference to be transmitted corresponds to the maximum of the residuals over a time window of given size. There is indeed no point in transmitting all the residuals if the bit rate of the reference is to be reduced.
- the gradient algorithm which is described for example in the aforementioned work of M. BELLANGER p. 371 and following.
- The main drawback of the previous parameter is the need, in the case of an implementation on a DSP, to store the N0 samples in order to estimate the autocorrelation, obtain the coefficients of the filter, and then calculate the residuals.
- This second parameter avoids this by using another algorithm to calculate the coefficients of the filter: the gradient algorithm. This algorithm uses the error made to update the coefficients.
- The filter coefficients are changed in the direction of the gradient of the instantaneous quadratic error, with the opposite sign.
- the reference vector to be transmitted can thus be reduced to a single number.
- the comparison consists of a simple calculation of the distance between the maxima of the reference and of the degraded signal, for example by difference.
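A minimal sketch of this idea, using a plain LMS update as one common form of the gradient algorithm. The filter order, the step size `mu`, the window length and the test signals are assumptions chosen for the example and are not taken from the patent.

```python
import math


def lms_max_residual(signal, order=4, mu=0.05, window=256):
    """Maximum absolute prediction residual per window, with LMS adaptation."""
    weights = [0.0] * order
    maxima = []
    current_max = 0.0
    for n, sample in enumerate(signal):
        # Past samples x[n-1] ... x[n-order], zero before the signal starts.
        past = [signal[n - k] if n - k >= 0 else 0.0
                for k in range(1, order + 1)]
        prediction = sum(w * p for w, p in zip(weights, past))
        error = sample - prediction
        # Step against the gradient of the instantaneous squared error.
        weights = [w + mu * error * p for w, p in zip(weights, past)]
        current_max = max(current_max, abs(error))
        if (n + 1) % window == 0:
            maxima.append(current_max)
            current_max = 0.0
    return maxima


# Hypothetical signals: a 1 kHz tone sampled at 8 kHz, and a copy with a click.
reference = [math.sin(2 * math.pi * n / 8) for n in range(1024)]
degraded = list(reference)
degraded[300] += 1.0

ref_maxima = lms_max_residual(reference)
deg_maxima = lms_max_residual(degraded)

# One number per window is compared, here by simple difference.
differences = [d - r for r, d in zip(ref_maxima, deg_maxima)]
```

Only one value per window has to be carried from the reference point to the comparison point, and no block of samples needs to be stored to estimate an autocorrelation.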
- Figure 5 summarizes the principle of parameter calculation. The main advantage of the two parameters is the bit rate required to transfer the reference: the reference is reduced to 1 real number for 1024 signal samples.
- the third method implements an autoregressive modeling of the basilar excitation.
- This method makes it possible to take into account the phenomena of psychoacoustics, in order to obtain an evaluation of the perceived quality. To this end, the calculation of the parameter involves modeling various principles of hearing.
- a linear prediction models the signal as a combination of its past values. Analysis of the residuals (or prediction errors) makes it possible to determine and estimate the presence of noise in a signal.
- the major drawback when using these techniques is the fact that there is no consideration of the principles of psychoacoustics. Thus, it is not possible to estimate the amount of noise actually perceived.
- The method takes up the general principle of classical linear prediction. It also incorporates the phenomena of psychoacoustics to adapt it to the non-linear sensitivity of the human ear in frequency (pitch) and intensity (loudness).
- the first part of the calculation of this parameter corresponds to the modeling of the principles of psychoacoustics using classical hearing models.
- the second part is the calculation of the linear prediction coefficients.
- the last part corresponds to the comparison of the prediction coefficients calculated for the reference signal and those obtained for the degraded signal.
- Temporal windowing of the signal then calculation of an internal representation of the signal by modeling the phenomena of psychoacoustics.
- This step corresponds to the calculation of the compressed loudness, which is in fact the excitation induced by the signal at the level of the inner ear.
- This representation of the signals takes account of the phenomena of psychoacoustics, and is obtained from the signal spectrum, using conventional models: attenuation of the outer and middle ear, integration according to critical bands and frequency masking.
- This calculation step is identical to that of the parameter described above. Autoregressive modeling of this compressed loudness is then performed in order to obtain the coefficients of an FIR prediction filter, just as in classic linear prediction.
- The method used is the autocorrelation method, which solves the Yule-Walker equations.
- the first step in obtaining the prediction coefficients is therefore the calculation of the signal autocorrelation.
- the modeling of the phenomena of psychoacoustics makes it possible to obtain 24 basilar components.
- the order N of the prediction filter is 32. From these, 32 autocorrelation coefficients are estimated, which gives 32 prediction coefficients of which only 5 to 10 coefficients are kept as an indicator vector of quality, for example the first 5 to 10 coefficients.
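The autocorrelation method can be sketched with the Levinson-Durbin recursion, which is one standard way of solving the Yule-Walker equations (the text does not say which solver is used). The function names, the pseudo-random stand-in for the 24 basilar components, and the choice of keeping the first 5 coefficients are assumptions of this example.

```python
import random


def autocorrelation(values, max_lag):
    """Autocorrelation r[0..max_lag]; lags beyond the data length give 0."""
    n = len(values)
    return [sum(values[i] * values[i - lag] for i in range(lag, n))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for the prediction coefficients."""
    coeffs = []
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(coeffs[j] * r[i - 1 - j] for j in range(i - 1))
        k = acc / error                      # reflection coefficient
        coeffs = [coeffs[j] - k * coeffs[i - 2 - j]
                  for j in range(i - 1)] + [k]
        error *= 1.0 - k * k                 # remaining prediction error
    return coeffs


# Hypothetical compressed-loudness values for 24 basilar components.
rng = random.Random(0)
basilar = [rng.random() for _ in range(24)]

r = autocorrelation(basilar, 32)
coefficients = levinson_durbin(r, 32)        # order 32, as stated in the text
indicator_vector = coefficients[:5]          # keep only the first 5 values
```

The reference and degraded indicator vectors obtained this way can then be compared window by window.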
- The main advantage of the parameter comes from taking into account the phenomena of psychoacoustics. To do this, it was necessary to increase the bit rate required for the transfer of the reference to 5 or 10 values for
- the following methods can be used with or without reference. This makes it possible to keep the most significant degradation detection measures, even in the case where no reference parameter is available at the control point, at the time when the comparison should be carried out.
- The first of these methods implements the detection of flats in the signal activity.
- The notion of activity, which can be approximated by a differentiation operation on the audio signal, is used to identify breaks and interruptions in the time signal.
- These types of faults are characteristic of decoding errors after transmission of the digital audio stream or during the broadcasting of sound sequences on the Internet. This occurs when the network speed becomes insufficient to ensure the arrival of all the frames necessary at the time of decoding, for example.
- These degradations, which introduce zones of very low activity, are perceived by the listener as different sensations: mute, blur, impulse noise, etc.
- the first step in calculating the parameter corresponds to estimating the temporal activity of the signal.
- the second derivative operator is used. It makes it possible to have a sufficiently precise estimate of the activity and requires very few calculations.
- The comparison step consists of a simple difference, which makes it possible to identify the zones where the signal has been replaced by flat segments at decoding.
- $\mathrm{Plats}_r(t)$ and $\mathrm{Plats}_d(t)$ are, respectively, the parameter calculated on the reference and on the degraded signal.
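The activity estimate and the comparison step can be sketched as follows. This is only an illustration: the tolerance `eps`, the minimum run length `min_len` and the function names are hypothetical, and the "simple difference" is interpreted here as keeping the flat zones present in the degraded signal but absent from the reference.

```python
import numpy as np

def activity(x):
    """Temporal activity approximated by the second-difference operator."""
    return np.abs(np.diff(np.asarray(x, dtype=float), n=2))

def flat_mask(x, eps=1e-6, min_len=32):
    """Binary parameter: True where activity stays below eps for a run of
    at least min_len samples (eps and min_len are assumed values)."""
    low = activity(x) < eps
    mask = np.zeros(low.shape, dtype=bool)
    start = None
    # Append a False sentinel so that a run reaching the end is closed
    for i, flag in enumerate(np.append(low, False)):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                mask[start:i] = True
            start = None
    return mask

def new_flat_zones(reference, degraded, **kwargs):
    """Flat zones found in the degraded signal only."""
    return flat_mask(degraded, **kwargs) & ~flat_mask(reference, **kwargs)
```

When no reference is available, `flat_mask` applied to the degraded signal alone still gives a usable detection, at the cost of also flagging genuine silences.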
- the next step therefore consists in using correspondence curves from the binary parameter. These curves provide a quality score from the cumulative length and the number of impulse degradations detected per second. These curves are established from subjective tests. Different curves can be established depending on the type of audio signals (mainly speech or music). Once the estimate has been obtained, it is also possible to use a filter simulating the response of a panelist. This makes it possible to take into account the dynamic effect of the votes and the reaction times when faced with degradations.
- The main advantage of this parameter is the possibility of making measurements without reference. Another interesting point is the bit rate necessary for the transfer of the reference. The reference can be reduced to 1 real number, i.e. a bit rate of 1.5 kbit/s (or even to 1 bit if thresholding is applied, i.e. a bit rate of 47 bit/s) for
- the second of these methods implements activity peak detection.
- This parameter is based on signal activity. It makes it possible to detect dropouts, breaks, interruptions of part of the audio signal and outliers by looking for peaks in signal activity.
- $\mathrm{ActTemp}(t) = \max_k \, y(t - k)$ (11), where $y(t)$ is the signal activity calculated by the filter.
- $\mathrm{ActTemp}_r(t)$ and $\mathrm{ActTemp}_d(t)$ are, respectively, the parameter calculated on the reference and on the degraded signal.
- In the case where the reference is not available, thresholding can be used to detect whether the parameter is greater than a threshold S', which indicates the presence of degradations. To avoid false detections due to signals of an impulsive nature (attacks, percussion, ...), the threshold must have a fairly large value, which can lead to missed detections.
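A minimal sketch of the activity-peak parameter and of its thresholded, reference-free use is given below. The window length over which the maximum is taken is not recoverable from the text, so `window` is a hypothetical parameter, as are the function names and the threshold value passed by the caller.

```python
import numpy as np

def act_temp(y, window):
    """Running maximum of the activity y over the last `window` samples,
    i.e. max over k = 0..window-1 of y(t - k). `window` is an assumption."""
    y = np.asarray(y, dtype=float)
    out = np.empty_like(y)
    for t in range(len(y)):
        out[t] = y[max(0, t - window + 1):t + 1].max()
    return out

def peaks_without_reference(y, window, threshold):
    """Reference-free detection: True where the parameter exceeds the
    threshold (called S' in the text)."""
    return act_temp(y, window) > threshold
```

A high threshold suppresses false alarms on percussive material but, as noted above, lets weaker degradations through.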
- the use of correspondence curves is possible to estimate a perceptual quality.
- the method consists in integrating the degradations detected by this parameter, with the others found by the previous parameter for example, and thus obtaining a global perceptual estimate.
- the advantage of the parameter lies in the possibility of making detections without reference.
- the reduced complexity and the low bit rate necessary for transporting the reference limited to 1 value, i.e. a bit rate of 1.5 kbits / s (or even 1 bit in case of thresholding, or a bit rate of 47 bits / s) for 1024 signal samples sampled at 48 kHz, are also interesting points.
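The bit-rate figures quoted above can be checked with a few lines of arithmetic. The sketch assumes that a "real number" is coded on 32 bits, which the text does not state; the function name is illustrative.

```python
def reference_bit_rate(sample_rate_hz, frame_len, values_per_frame, bits_per_value):
    """Bit rate (bit/s) needed to carry the reference parameters."""
    frames_per_second = sample_rate_hz / frame_len
    return frames_per_second * values_per_frame * bits_per_value

# One 32-bit value (assumed width) per 1024-sample frame at 48 kHz
real_rate = reference_bit_rate(48_000, 1024, 1, 32)
# One thresholded bit per frame
bit_rate = reference_bit_rate(48_000, 1024, 1, 1)
```

Under that assumption, 48000 / 1024 = 46.875 frames per second, which gives 1500 bit/s for one real value per frame and about 47 bit/s for one bit per frame.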
- the following method implements the study of the minimum of the signal spectrum to locate the degradations.
- the first step in calculating these parameters is to estimate the spectrum of the signal.
- $\mathrm{MinSpe} = \min_i (X_i)$ for $1 \le i \le N$ (13), with $X_i$ the N components of the spectrum X in dB (by distance calculation).
- $X_{r,i}$ is the i-th of the N components of the spectrum obtained on the reference.
- $X_{d,i}$ is the i-th of the N components of the spectrum obtained on the degraded signal.
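The spectral-minimum parameter can be sketched as below. The Hann window, the FFT-based spectrum estimate and the small `floor` that avoids taking the logarithm of zero are assumptions of this example; the text only requires a spectrum expressed in dB.

```python
import numpy as np

def spectrum_db(frame, floor=1e-12):
    """Magnitude spectrum of one frame in dB (Hann window assumed)."""
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hanning(len(frame))
    magnitude = np.abs(np.fft.rfft(windowed))
    return 20.0 * np.log10(magnitude + floor)

def min_spe(spectrum_in_db):
    """MinSpe: smallest of the N spectral components X_i."""
    return float(np.min(spectrum_in_db))

def min_spe_difference(reference_frame, degraded_frame):
    """Comparison with reference: difference of the two minima."""
    return min_spe(spectrum_db(degraded_frame)) - min_spe(spectrum_db(reference_frame))
```

Broadband noise added by a degradation fills the valleys of the spectrum, so the minimum rises and `min_spe_difference` becomes positive.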
- The method is summarized by the two diagrams of Figure 9. Again, the main advantage of these parameters is the possibility of making measurements without reference. Another interesting point is the bit rate necessary for the transfer of the reference: the reference can be reduced to 1 real number, or even 1 integer, i.e. a bit rate of at most 1.5 kbit/s for N (for example 1024) signal samples. The reduced complexity of the algorithm is also an asset.
- The statistical flatness coefficient called "kurtosis" or "concentration" is used.
- The estimation is made from the central moments of order 2 and 4. They allow the shape of the spectrum to be estimated with respect to a normal distribution in the statistical sense of the term.
- the calculation corresponds to the ratio between the centered moment of order 4 and the centered moment of order 2 (variance) squared of the coefficients of the spectrum.
- The formula used is as follows: $\mathrm{Kurtosis} = \dfrac{\frac{1}{N}\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^4}{\left(\frac{1}{N}\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2\right)^2}$
- $\bar{X}$ is the arithmetic mean of the N components $X_i$ of the spectrum X in dB.
- The higher the value obtained, the more concentrated the signal is and the less noise it contains. The coefficient is calculated on the reference and on the degraded signal. By comparison, the level of white noise inserted is estimated.
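The ratio described above, the central moment of order 4 divided by the squared central moment of order 2, can be written directly. The function name and the convention for a constant spectrum are assumptions of this sketch.

```python
import numpy as np

def spectral_kurtosis(spectrum_in_db):
    """Central moment of order 4 over the squared variance of the
    spectral components X_i (both moments normalised by N)."""
    x = np.asarray(spectrum_in_db, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)
    m4 = np.mean(centered ** 4)
    if m2 == 0.0:
        return float("nan")  # constant spectrum: ratio undefined
    return float(m4 / m2 ** 2)
```

For example, a spectrum whose components alternate between -1 and 1 gives 1.0, while a single strong component over a flat background, such as `[0, 0, 0, 0, 10]`, gives 3.25: the more concentrated the spectrum, the higher the value.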
- The diagram in Figure 10 presents the principle (valid for the two parameters above). In the case of a comparison with the reference, a simple distance (difference or other type) is sufficient to detect the degradations. If no reference is available, it is necessary to detect peaks in the variation of the parameters to search for degradations. This can be done using gray-level mathematical morphology (erosions and dilations), a classic technique in image processing.
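One way to apply gray-level morphology to a one-dimensional parameter track is a top-hat transform: the track minus its opening (an erosion followed by a dilation). The text names only erosions and dilations, so the top-hat combination, the flat structuring element of odd length `size` and the function names are assumptions of this sketch.

```python
import numpy as np

def _sliding(x, size, reducer):
    """Apply reducer (np.min or np.max) over a centred window of odd size."""
    pad = size // 2
    xp = np.pad(np.asarray(x, dtype=float), pad, mode="edge")
    return np.array([reducer(xp[i:i + size]) for i in range(len(x))])

def erode(x, size):
    return _sliding(x, size, np.min)

def dilate(x, size):
    return _sliding(x, size, np.max)

def top_hat(x, size):
    """Signal minus its opening: keeps peaks narrower than `size` and
    removes slower variations of the parameter."""
    return np.asarray(x, dtype=float) - dilate(erode(x, size), size)

def peak_positions(x, size, threshold):
    """Indices where the top-hat residue exceeds the threshold."""
    return np.flatnonzero(top_hat(x, size) > threshold)
```

A step change in the parameter, such as a change of programme content, survives the opening and is therefore not reported, whereas an isolated spike is.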
- the reference audio signal corresponds to the signal at the input of the broadcasting network.
- the reference parameters are calculated on this signal, then transmitted via a specific data channel, to the desired measurement point. It is at this point that the same parameters necessary for the comparison are calculated for the establishment of the measurements with reduced reference. Non-referenced measurements are also calculated. In the event that the reference parameters are not available (not present, incorrect, ...) these measurements are sufficient to detect the most significant errors.
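The fallback described above, using the reduced-reference comparison when the reference parameter arrives and a reference-free threshold otherwise, can be sketched as a single decision function. The function name and both threshold values are hypothetical placeholders, not values from the source.

```python
def detect_degradation(degraded_value, reference_value=None,
                       diff_threshold=1.0, abs_threshold=10.0):
    """Return True when a degradation is detected at the measurement point.

    reference_value is None when the reference parameter is missing or
    was rejected as incorrect; thresholds are assumed example values.
    """
    if reference_value is not None:
        # Reduced-reference measurement: distance to the reference parameter
        return abs(degraded_value - reference_value) > diff_threshold
    # Non-referenced measurement: plain thresholding of the parameter
    return degraded_value > abs_threshold
```

In practice each parameter would carry its own pair of thresholds, tuned from subjective tests.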
- the dotted subsystems in Figure 11 are no longer used.
- the same diagram as above can be used to visualize (with or without reference) the performance of radio broadcasting on the Internet.
- The data channel used to transport the reference parameters can be the network itself; the same channel can also be used to return the estimated scores to the monitoring center.
- the reference signal corresponds to the signal sent by the server, and the degraded signal is that decoded at the chosen measurement point. This can for example be used to choose the most appropriate server according to the connection location by accessing data from a monitoring center.
- The diagram (Figure 12) illustrates this embodiment in the case where the reference parameters are sent by the network and where the scores obtained use a specific transmission channel.
- A method according to the invention is applicable whenever it is necessary to identify faults on an audio signal which has been transmitted by any broadcasting network (cable, satellite, wireless, Internet, DVB, DAB, etc.).
- the proposed method exploits two classes of methods: techniques with reduced reference and those without reference. It is particularly advantageous when the bit rate available for transmitting the reference is limited.
- this invention is applicable for operational purposes for metrology equipment and for supervision systems of audio signal distribution networks.
- One of its advantageous characteristics lies in the combination of the measurements carried out with and without reference.
- this invention corresponds to the requirements imposed in service quality management systems.
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR0200856A FR2835125B1 (fr) | 2002-01-24 | 2002-01-24 | Procede d'evaluation d'un signal audio numerique |
FR0200856 | 2002-01-24 | ||
PCT/FR2003/000222 WO2003063134A1 (fr) | 2002-01-24 | 2003-01-23 | Procede d'evaluation qualitative d'un signal audio numerique. |
Publications (2)
Publication Number | Publication Date |
---|---|
EP1468416A1 (de) | 2004-10-20 |
EP1468416B1 (de) | 2015-12-23 |
Family
ID=27589574
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP03715043.0A Expired - Lifetime EP1468416B1 (de) | 2002-01-24 | 2003-01-23 | Verfahren zur qualitativen bewertung eines digitalen audiosignals |
Country Status (5)
Country | Link |
---|---|
US (2) | US8036765B2 (de) |
EP (1) | EP1468416B1 (de) |
CA (1) | CA2474067C (de) |
FR (1) | FR2835125B1 (de) |
WO (1) | WO2003063134A1 (de) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112562714A (zh) * | 2020-11-24 | 2021-03-26 | 潍柴动力股份有限公司 | 一种噪声评估方法及装置 |
Families Citing this family (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2833791B1 (fr) * | 2001-12-13 | 2004-02-06 | Telediffusion De France Tdf | Dispositif de metrologie pour la surveillance automatique d'un reseau de diffusion de signaux numeriques et reseau de diffusion comprenant un tel dispositif de metrologie |
CN101512935B (zh) * | 2006-07-27 | 2013-10-30 | 艾利森电话股份有限公司 | 通过多个发射机进行分层广播发射 |
US8599704B2 (en) * | 2007-01-23 | 2013-12-03 | Microsoft Corporation | Assessing gateway quality using audio systems |
EP2115742B1 (de) * | 2007-03-02 | 2012-09-12 | Telefonaktiebolaget LM Ericsson (publ) | Verfahren und anordnungen in einem telekommunikationsnetz |
CN101608947B (zh) * | 2008-06-19 | 2012-05-16 | 鸿富锦精密工业(深圳)有限公司 | 声音测试方法 |
US20100161779A1 (en) * | 2008-12-24 | 2010-06-24 | Verizon Services Organization Inc | System and method for providing quality-referenced multimedia |
EP2392003B1 (de) * | 2009-01-30 | 2013-01-02 | Telefonaktiebolaget LM Ericsson (publ) | Tonsignalqualitätsvorhersage |
ES2452170T3 (es) * | 2009-05-14 | 2014-03-31 | Koninklijke Philips N.V. | Detección robusta de transmisiones de DVD-T/H |
WO2010140940A1 (en) | 2009-06-04 | 2010-12-09 | Telefonaktiebolaget Lm Ericsson (Publ) | A method and arrangement for estimating the quality degradation of a processed signal |
US8560312B2 (en) * | 2009-12-17 | 2013-10-15 | Alcatel Lucent | Method and apparatus for the detection of impulsive noise in transmitted speech signals for use in speech quality assessment |
WO2012078142A1 (en) | 2010-12-07 | 2012-06-14 | Empire Technology Development Llc | Audio fingerprint differences for end-to-end quality of experience measurement |
US9779731B1 (en) * | 2012-08-20 | 2017-10-03 | Amazon Technologies, Inc. | Echo cancellation based on shared reference signals |
US9679555B2 (en) | 2013-06-26 | 2017-06-13 | Qualcomm Incorporated | Systems and methods for measuring speech signal quality |
US9619980B2 (en) | 2013-09-06 | 2017-04-11 | Immersion Corporation | Systems and methods for generating haptic effects associated with audio signals |
US9576445B2 (en) | 2013-09-06 | 2017-02-21 | Immersion Corp. | Systems and methods for generating haptic effects associated with an envelope in audio signals |
CN104681038B (zh) * | 2013-11-29 | 2018-03-09 | 清华大学 | 音频信号质量检测方法及装置 |
US10147441B1 (en) | 2013-12-19 | 2018-12-04 | Amazon Technologies, Inc. | Voice controlled system |
US10224759B2 (en) | 2014-07-15 | 2019-03-05 | Qorvo Us, Inc. | Radio frequency (RF) power harvesting circuit |
US10566843B2 (en) * | 2014-07-15 | 2020-02-18 | Qorvo Us, Inc. | Wireless charging circuit |
US10559970B2 (en) | 2014-09-16 | 2020-02-11 | Qorvo Us, Inc. | Method for wireless charging power control |
CN105893515B (zh) * | 2016-03-30 | 2021-02-05 | 腾讯科技(深圳)有限公司 | 一种信息处理方法及服务器 |
RU2700551C2 (ru) * | 2018-01-22 | 2019-09-17 | Российская Федерация, от имени которой выступает Министерство обороны Российской Федерации | Способ контроля качества каналов передачи данных в автоматизированных системах управления реального масштаба времени |
CN110570874B (zh) * | 2018-06-05 | 2021-10-22 | 中国科学院声学研究所 | 一种用于监测野外鸟类鸣声强度及分布的系统及其方法 |
CN109147804A (zh) * | 2018-06-05 | 2019-01-04 | 安克创新科技股份有限公司 | 一种基于深度学习的音质特性处理方法及系统 |
CN110211610A (zh) * | 2019-06-20 | 2019-09-06 | 平安科技(深圳)有限公司 | 评估音频信号损失的方法、装置及存储介质 |
CN112929808A (zh) * | 2021-02-05 | 2021-06-08 | 四川湖山电器股份有限公司 | 检测校园广播播音设备能否正常工作的方法、模块及系统 |
EP4084366A1 (de) * | 2021-04-26 | 2022-11-02 | Aptiv Technologies Limited | Verfahren zum testen einer rundfunkempfängervorrichtung in einem fahrzeug |
CN113409820B (zh) * | 2021-06-09 | 2022-03-15 | 合肥群音信息服务有限公司 | 一种基于语音数据的质量评价方法 |
CN113488074B (zh) * | 2021-08-20 | 2023-06-23 | 四川大学 | 一种用于检测合成语音的二维时频特征生成方法 |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB9213459D0 (en) * | 1992-06-24 | 1992-08-05 | British Telecomm | Characterisation of communications systems using a speech-like test stimulus |
FR2737948B1 (fr) * | 1995-08-16 | 1997-10-17 | Alcatel Mobile Comm France | Dispositif de commande de volume sonore pour recepteur de signaux de parole codes par blocs |
FR2769777B1 (fr) * | 1997-10-13 | 1999-12-24 | Telediffusion Fse | Procede et systeme d'evaluation, a la reception, de la qualite d'un signal numerique, tel qu'un signal audio/video numerique |
CA2230188A1 (en) * | 1998-03-27 | 1999-09-27 | William C. Treurniet | Objective audio quality measurement |
SE517547C2 (sv) * | 1998-06-08 | 2002-06-18 | Ericsson Telefon Ab L M | Signalsynkronisering vid signalkvalitetsmätning |
EP0980064A1 (de) * | 1998-06-26 | 2000-02-16 | Ascom AG | Verfahren zur Durchführung einer maschinengestützten Beurteilung der Uebertragungsqualität von Audiosignalen |
US7006555B1 (en) * | 1998-07-16 | 2006-02-28 | Nielsen Media Research, Inc. | Spectral audio encoding |
EP1277295A1 (de) * | 1999-10-27 | 2003-01-22 | Nielsen Media Research, Inc. | Verfahren und vorrichtung zur kodierung von tonsignalen für verwendung in programmidentifikationssystemen, durch hinzufügen eines unhörbaren kodes im tonsignal |
NL1014075C2 (nl) * | 2000-01-13 | 2001-07-16 | Koninkl Kpn Nv | Methode en inrichting voor het bepalen van de kwaliteit van een signaal. |
-
2002
- 2002-01-24 FR FR0200856A patent/FR2835125B1/fr not_active Expired - Fee Related
-
2003
- 2003-01-23 US US10/502,425 patent/US8036765B2/en not_active Expired - Fee Related
- 2003-01-23 WO PCT/FR2003/000222 patent/WO2003063134A1/fr active Application Filing
- 2003-01-23 EP EP03715043.0A patent/EP1468416B1/de not_active Expired - Lifetime
- 2003-01-23 CA CA2474067A patent/CA2474067C/en not_active Expired - Fee Related
-
2011
- 2011-08-26 US US13/219,391 patent/US8606385B2/en not_active Expired - Fee Related
Non-Patent Citations (1)
Title |
---|
See references of WO03063134A1 * |
Also Published As
Publication number | Publication date |
---|---|
FR2835125B1 (fr) | 2004-06-18 |
WO2003063134A1 (fr) | 2003-07-31 |
US20050143974A1 (en) | 2005-06-30 |
FR2835125A1 (fr) | 2003-07-25 |
CA2474067A1 (en) | 2003-07-31 |
EP1468416B1 (de) | 2015-12-23 |
US20120099734A1 (en) | 2012-04-26 |
US8606385B2 (en) | 2013-12-10 |
US8036765B2 (en) | 2011-10-11 |
CA2474067C (en) | 2014-12-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP1468416B1 (de) | Verfahren zur qualitativen bewertung eines digitalen audiosignals | |
EP2419900B1 (de) | Verfahren und einrichtung zur objektiven evaluierung der sprachqualität eines sprachsignals unter berücksichtigung der klassifikation der in dem signal enthaltenen hintergrundgeräusche | |
EP0768770B1 (de) | Verfahren und Vorrichtung zur Erzeugung von Hintergrundrauschen in einem digitalen Übertragungssystem | |
EP2691952B1 (de) | Zuweisung von bits anhand von subbändern zur quantifizierung von rauminformationsparametern für parametrische codierung | |
EP1051703B1 (de) | Verfahren zur dekodierung eines audiosignals mit korrektur von übertragungsfehlern | |
EP1997103B1 (de) | Verfahren zur codierung eines quellenaudiosignals, entsprechende codierungseinrichtung, decodierungsverfahren und einrichtung, signal und computerprogrammprodukte | |
EP0906613B1 (de) | Verfahren und vorrichtung zur kodierung eines audiosignals mittels "vorwärts"- und "rückwärts"-lpc-analyse | |
EP2080194B1 (de) | Dämpfung von stimmüberlagerung, im besonderen zur erregungserzeugung bei einem decoder in abwesenheit von informationen | |
EP2795618B1 (de) | Verfahren zur erkennung eines vorgegebenen frequenzbandes in einem audiodatensignal, erkennungsvorrichtung und computerprogramm dafür | |
FR2882458A1 (fr) | Procede de mesure de la gene due au bruit dans un signal audio | |
EP1875465A1 (de) | Verfahren zur anpassung für interoperabilität zwischen kurzzeit-korrelationsmodellen digitaler signale | |
CA2377808C (fr) | Procede d'evaluation de la qualite de sequences audiovisuelles | |
US6549757B1 (en) | Method and system for assessing, at reception level, the quality of a digital signal, such as a digital audio/video signal | |
EP1468573A1 (de) | Verfahren zur synchronisierung von zwei digitalen datenströmen mit gleichem inhalt | |
CN108877816B (zh) | 基于qmdct系数的aac音频重压缩检测方法 | |
Organiściak et al. | Single-ended quality measurement of a music content via convolutional recurrent neural networks | |
EP1159795B1 (de) | Verfahren zur qualitätsüberwachung von einem digitaltonsignal das mit einem rundfunkprogramm übertragen wird | |
FR2790845A1 (fr) | Procede de controle de la qualite d'un signal audionumerique distribue |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20040728 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT SE SI SK TR |
|
RIN1 | Information on inventor provided before grant (corrected) |
Inventor name: JOLY, ALEXANDRE |
|
RAP1 | Party data changed (applicant data changed or rights of an application transferred) |
Owner name: TDF |
|
17Q | First examination report despatched |
Effective date: 20080707 |
|
APBK | Appeal reference recorded |
Free format text: ORIGINAL CODE: EPIDOSNREFNE |
|
APBN | Date of receipt of notice of appeal recorded |
Free format text: ORIGINAL CODE: EPIDOSNNOA2E |
|
APBR | Date of receipt of statement of grounds of appeal recorded |
Free format text: ORIGINAL CODE: EPIDOSNNOA3E |
|
APAV | Appeal reference deleted |
Free format text: ORIGINAL CODE: EPIDOSDREFNE |
|
APBT | Appeal procedure closed |
Free format text: ORIGINAL CODE: EPIDOSNNOA9E |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R079 Ref document number: 60348361 Country of ref document: DE Free format text: PREVIOUS MAIN CLASS: G10L0019000000 Ipc: G10L0025690000 |
|
GRAP | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
|
RIC1 | Information provided on ipc code assigned before grant |
Ipc: G10L 25/69 20130101AFI20150625BHEP |
|
INTG | Intention to grant announced |
Effective date: 20150416 |
|
GRAC | Information related to communication of intention to grant a patent modified |
Free format text: ORIGINAL CODE: EPIDOSCIGR1 |
|
GRAS | Grant fee paid |
Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
|
INTG | Intention to grant announced |
Effective date: 20150716 |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): DE FR GB NL |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: FG4D Free format text: NOT ENGLISH |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: PLFP Year of fee payment: 14 |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R096 Ref document number: 60348361 Country of ref document: DE |
|
REG | Reference to a national code |
Ref country code: NL Ref legal event code: FP |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R097 Ref document number: 60348361 Country of ref document: DE |
|
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
26N | No opposition filed |
Effective date: 20160926 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: PLFP Year of fee payment: 15 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: GB Payment date: 20161228 Year of fee payment: 15 Ref country code: NL Payment date: 20161220 Year of fee payment: 15 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: FR Payment date: 20161221 Year of fee payment: 15 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: DE Payment date: 20161219 Year of fee payment: 15 |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R119 Ref document number: 60348361 Country of ref document: DE |
|
REG | Reference to a national code |
Ref country code: NL Ref legal event code: MM Effective date: 20180201 |
|
GBPC | Gb: european patent ceased through non-payment of renewal fee |
Effective date: 20180123 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: DE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20180801 Ref country code: FR Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20180131 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: ST Effective date: 20180928 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: NL Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20180201 Ref country code: GB Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20180123 |