US20050108006A1 - Method and device for determining the voice quality degradation of a signal - Google Patents

Method and device for determining the voice quality degradation of a signal

Publication number
US20050108006A1
Authority
US
United States
Prior art keywords
signal
sequences
periods
speech
samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/178,299
Inventor
Charles-Henry Jurd
Houmad Tighezza
Abdelkrim Moulehiawy
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alcatel Lucent SAS
Original Assignee
Alcatel SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alcatel SA
Assigned to ALCATEL (assignment of assignors' interest; see document for details). Assignors: JURD, CHARLES-HENRY; MOULEHIAWY, ABDELKRIM; TIGHEZZA, HOUMAD
Publication of US20050108006A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/69 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for evaluating synthetic or decoded voice signals

Abstract

The present invention concerns a method and a device for determining the voice or speech quality degradation of a signal without using any reference or initial signal. The method mainly consists in decomposing the signal to be analysed by means of a segmentation algorithm, then applying at least one metric to the resulting decomposed signal, and finally evaluating the signal degradation.

Description

    TECHNICAL FIELD
  • The present invention is generally related to the transmission of signals through communication means, more particularly the transmission of voice or speech carrying signals, and concerns a method and a device for determining the voice or speech quality degradation of a signal transmitted over and/or through at least one communication device, network or similar.
  • The invention is based on a priority application EP 01 440 189.7 which is hereby incorporated by reference.
    BACKGROUND OF THE INVENTION
  • When a signal is transmitted over and through several devices and bearers, a degradation of the informative content of said signal occurs inevitably.
  • The extent of such degradation can depend on several factors, such as the length of the transmission path, the quality of the bearers and of the signal processing devices, the quality of the connections and interfaces between the successive elements involved in the transmission, and possible interference or disturbance phenomena.
  • Such degradation is particularly annoying when the concerned signals are speech or voice carrying signals.
  • It is therefore a necessity to measure the level of voice quality degradation in order to evaluate the considered transmission path and to be able to propose solutions to improve said level.
  • Tools for objectively measuring voice quality degradation already exist, but they all need both the source signal and the degraded signal to perform the measurement.
  • This is in particular the case with the algorithm known as PSQM (Perceptual Speech Quality Measure), corresponding to Recommendation P.861 of the ITU (International Telecommunication Union), which is in fact dedicated to estimating the degradation due to voice coders/decoders.
  • Such tools work in laboratory conditions but generally cannot be applied in practice, i.e. in real or field conditions, because the source and degraded signals are rarely both available to the evaluation tool, in particular when network transmission is involved.
    SUMMARY OF THE INVENTION
  • Thus, the major aim of the invention is to propose a method and a device for objectively determining the degradation of the quality of a voice signal which needs only one signal.
  • Furthermore the proposed solution should be fully embeddable in existing systems, not too complex to implement and flexible in the ways of expressing the result of the degradation evaluation.
  • To that effect, the present invention concerns a method for determining the voice or speech quality degradation of a signal, without using any reference or initial signal, characterised in that it mainly consists in decomposing the signal to be analysed by means of a segmentation algorithm, then applying at least one metric to the resulting decomposed signal and finally evaluating the signal degradation.
  • The invention does also concern a device, mainly in the form of a software tool, which is able to carry out said method.
    BRIEF DESCRIPTION OF THE DRAWINGS
  • The present invention will be better understood from the following description of an embodiment, given as a non-limiting example and made with reference to the enclosed drawings, in which:
  • FIG. 1 represents a speech signal with annoying background noise;
  • FIG. 2 is a graphical representation of the energy contained in the successive frames (groups of samples) of the signal of FIG. 1;
  • FIG. 3 is a graphical representation of the energy variation between the frames of the signal of FIGS. 1 and 2;
  • FIGS. 4A to 4D are graphical representations of signals subjected to a segmentation algorithm, showing how the quality of the segmentation varies with the noise energy level;
  • FIGS. 5A to 5D are graphical representations of the signals of FIGS. 4A to 4D subjected to a segmentation algorithm with an automatically adjusted sensitivity according to the invention;
  • FIG. 6 shows the signal of FIG. 1 before (upper part) and after (lower part) a segmentation procedure with noise extraction has been applied to it; and
  • FIG. 7 is a graphical representation of the projection of the spectrum of the signal of FIGS. 1 and 6 (upper part) onto critical bands of the Bark scale.
    DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
  • According to the invention, the method for determining and measuring the degradation of the voice or speech component of a transmitted signal mainly consists in decomposing the signal to be analysed by means of a segmentation algorithm, then applying at least one metric to the resulting decomposed signal and finally evaluating the signal degradation.
  • The segmentation algorithm makes it possible to cut the signal precisely into temporally homogeneous areas, sequences or segments, in which for example the envelope has a relatively constant behaviour, allowing a deeper local study of said signal.
  • Advantageously, the segmentation algorithm is based on Burg's algorithm, which provides an AR2 type model of the signal (see in particular “Musical Signal Parameter Estimation”, Tristan Jehan, PhD thesis, Berkeley Univ., URL: http://www.cnmat.berkeley.edu/tristan/report/report.html).
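  • As an illustration of the AR2 modelling step only, the following minimal Python sketch estimates the coefficients of a second-order autoregressive model with Burg's recursion. The function name `burg_ar` and the sign convention x[t] + a1*x[t-1] + a2*x[t-2] = e[t] are assumptions made for this example; how segmentation points are derived from the model is not detailed in this description and is not shown.

```python
import math


def burg_ar(x, order=2):
    """Estimate AR coefficients [1, a1, ..., a_order] with Burg's recursion.

    Convention (an assumption of this sketch):
        x[t] + a1*x[t-1] + ... + a_order*x[t-order] = e[t]
    Returns the coefficient list and the final prediction error power.
    """
    ef = list(x[1:])   # forward prediction errors f[t]
    eb = list(x[:-1])  # backward prediction errors b[t-1]
    a = [1.0]
    err = sum(v * v for v in x) / len(x)
    for _ in range(order):
        num = -2.0 * sum(f * b for f, b in zip(ef, eb))
        den = sum(f * f for f in ef) + sum(b * b for b in eb)
        k = num / den if den > 0.0 else 0.0  # reflection coefficient
        # Levinson-style update of the coefficient vector.
        a_ext = a + [0.0]
        last = len(a_ext) - 1
        a = [a_ext[i] + k * a_ext[last - i] for i in range(len(a_ext))]
        # Update both error sequences, then re-align them by one sample.
        new_ef = [f + k * b for f, b in zip(ef, eb)]
        new_eb = [b + k * f for f, b in zip(ef, eb)]
        ef = new_ef[1:]
        eb = new_eb[:-1]
        err *= 1.0 - k * k
    return a, err


if __name__ == "__main__":
    # A pure sinusoid obeys x[t] - 2*cos(w)*x[t-1] + x[t-2] = 0.
    w = 0.3
    signal = [math.cos(w * t) for t in range(2000)]
    coeffs, power = burg_ar(signal, order=2)
    print(coeffs, power)
```

  • For a long, noise-free sinusoid of angular frequency w, the estimate should approach a1 = -2*cos(w) and a2 = 1, which gives a quick sanity check of the recursion.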
  • When the signal is only weakly affected by noise (clear signal), the resulting segmentation is representative of the type of information carried by the signal: a high density of segmentation points where the signal carries speech, and a very low density of segmentation points, or none at all, during the silence periods of the signal (periods with no speech).
  • Nevertheless, the noisier the signal, the less precise and efficient the segmentation algorithm becomes. This loss of performance can be clearly seen by comparing FIGS. 4A (clear signal) to 4D (heavily noisy signal) with one another.
  • The performance of said segmentation procedure can be enhanced by pretreating the signal to be analysed.
  • Thus, in accordance with the invention, the method can consist, before subjecting the signal to be analysed to the temporal segmentation algorithm, in sampling said signal, calculating energy related quantities for said signal samples (FIG. 2), thresholding said calculated quantities in order to identify the speech, silence and/or noise sequences or periods of said signal, and determining the average energy level of noise during the sequences or periods of the signal that carry no speech (silence sequences or periods), in order to perform a first signal degradation evaluation.
  • The preceding operation can consist in obtaining a PCM (Pulse Code Modulation) version of the signal and submitting said sampled signal, as successive groups or frames of samples, to a G.729 type coder in order to determine which groups or frames of samples, and which associated periods or sequences of the signal, comprise speech or voice activity.
  • Nevertheless, the energy related quantities preferably correspond to the squares of the sample values and to the sums of these squares over all samples of predetermined groups or frames of samples.
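  • The energy related quantity described above (the sum of squared sample values per frame) can be sketched as follows. The function name `frame_energies`, the non-overlapping framing and the choice to drop a trailing incomplete frame are assumptions of this example, not details given in the description.

```python
def frame_energies(samples, frame_len):
    """Return the sum of squared sample values for each complete frame.

    Frames are consecutive and non-overlapping (an assumption of this
    sketch); a trailing incomplete frame is ignored.
    """
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(s * s for s in frame))
    return energies


if __name__ == "__main__":
    # Two complete frames of two samples; the fifth sample is dropped.
    print(frame_energies([1, 2, 3, 4, 5], frame_len=2))
```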
  • As simple thresholding of the energy related quantities of the sample groups does not make it possible to distinguish the groups or frames carrying speech, the invention advantageously consists, in order to discriminate between sequences or periods of the signal with and without speech, in determining the variation of the energy related quantities within or between predetermined or consecutive groups of samples, spotting the sequences in which or between which the variation is of small magnitude, and identifying as sequences or periods of silence or without speech those sequences or periods which correspond to at least two consecutive groups of samples with small internal and/or mutual variation of the energy related quantities.
  • Indeed, the inventors have noticed that the energy differences between groups or frames are large when said signal contains speech, and small or null and relatively constant when said signal contains noise or silence (see FIG. 3).
  • By applying a threshold to this metric (energy variation between frames), it is easy to identify the speech frames on the one hand and the noise or silence frames on the other hand.
  • Then, by calculating the average energy level of noise during said identified noise or silence frames, one can make a first evaluation of the sound quality of the signal and allocate a first mark.
  • It should also be noted that real noise or silence frames are never isolated, but always occur as series of such frames. Therefore an isolated frame identified as a silence or noise frame is very likely not a real noise or silence frame and should be disregarded as an erroneous detection.
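  • The frame classification described in the preceding paragraphs can be sketched as below. This is one possible reading, under stated assumptions: the variation metric is taken as the absolute energy difference between consecutive frames, the names `find_silence_frames` and `average_noise_energy` and the threshold value are hypothetical, and a frame is marked as silence or noise only together with a neighbouring frame, so isolated detections never occur.

```python
def find_silence_frames(energies, variation_threshold):
    """Flag frames belonging to a run of at least two low-variation frames.

    A pair of consecutive frames whose energy difference is below the
    threshold marks both frames as silence/noise. A frame can therefore
    never be flagged on its own.
    """
    silent = [False] * len(energies)
    for i in range(len(energies) - 1):
        if abs(energies[i + 1] - energies[i]) < variation_threshold:
            silent[i] = True
            silent[i + 1] = True
    return silent


def average_noise_energy(energies, silent):
    """Mean energy over the frames flagged as silence/noise, or None."""
    noise = [e for e, s in zip(energies, silent) if s]
    return sum(noise) / len(noise) if noise else None


if __name__ == "__main__":
    energies = [1.0, 1.1, 0.9, 50.0, 80.0, 20.0, 1.0, 1.2]
    flags = find_silence_frames(energies, variation_threshold=0.5)
    print(flags)
    print(average_noise_energy(energies, flags))
```

  • The average returned by `average_noise_energy` plays the role of the first quality mark mentioned above; how that value is mapped to a mark is left open here.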
  • The pretreatment operation described above can thus be used to submit to the segmentation algorithm a signal comprising only speech frames.
  • According to a preferred embodiment of the invention, the method consists in using a variable triggering threshold for the temporal segmentation algorithm, in the form of a quantity which depends on the current average value of the energy, or of an energy related quantity, of the noise carried within said signal.
  • The use of such an automatically adaptive threshold (which can be infinitely variable in the theoretical range of the signal) makes it possible to provide a constant segmentation efficiency independently of the noise level of said signal (see FIGS. 5A to 5D).
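  • The idea of a triggering threshold that follows the noise level can be illustrated with the toy sketch below. The description does not give the function linking the threshold to the noise energy, so the linear rule `base + gain * noise_energy`, the default constants, and the use of frame-to-frame energy jumps as the trigger are all assumptions chosen for illustration; they stand in for the Burg-based detector, which is not reproduced here.

```python
def adaptive_threshold(noise_energy, base=1.0, gain=1.0):
    """Hypothetical linear rule: the threshold grows with the noise energy."""
    return base + gain * noise_energy


def segmentation_points(energies, noise_energy):
    """Return frame indices where the energy jump exceeds the threshold."""
    threshold = adaptive_threshold(noise_energy)
    return [
        i
        for i in range(1, len(energies))
        if abs(energies[i] - energies[i - 1]) > threshold
    ]


if __name__ == "__main__":
    clean = [0.0, 0.0, 10.0, 10.0, 0.0]
    noisy = [3.0, 5.0, 15.0, 13.0, 4.0]
    print(segmentation_points(clean, noise_energy=0.0))
    # With the threshold left at its clean-signal value, noise fluctuations
    # trigger spurious points on the noisy signal.
    print(segmentation_points(noisy, noise_energy=0.0))
    # Raising the threshold with the measured noise energy removes them.
    print(segmentation_points(noisy, noise_energy=4.0))
```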
  • In order to obtain a more precise view of the degradation which occurred to the signal, the inventive method further consists in performing a spectral analysis of the various homogeneous sequences or periods resulting from the decomposition of the signal to be analysed by the segmentation algorithm, said sequences or periods corresponding to one or several predetermined group(s) or frame(s) of samples extracted from the signal to be analysed (FIG. 6).
  • According to a preferred feature of the invention, said spectral analysis mainly consists in subjecting the groups of samples to a fast Fourier transform, then projecting the spectrum onto critical bands of the Bark scale, and finally analysing the resulting data.
  • Such a projection of a signal from a Hertz scale onto a Bark scale, which provides a psychoacoustic representation of the signal, is in particular described in “Bark and ERB Bilinear Transforms”, Julius O. Smith III et al., IEEE Transactions on Speech and Audio Processing, pp. 697-708, November 1999 (see FIG. 7).
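  • A minimal sketch of the transform-then-project step is given below. It is not the PSQM implementation nor the bilinear transform of the cited paper: the Hertz-to-Bark conversion uses the widely quoted Zwicker and Terhardt approximation, the 56 bands are obtained by splitting the Bark range up to the Nyquist frequency into equal parts, and a plain discrete Fourier transform replaces the fast Fourier transform to keep the example dependency-free. All function names are hypothetical.

```python
import cmath
import math


def hz_to_bark(freq_hz):
    """Zwicker and Terhardt approximation of the Bark scale."""
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan(
        (freq_hz / 7500.0) ** 2
    )


def power_spectrum(frame):
    """One-sided power spectrum via a direct DFT (bins 0 .. n/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(acc) ** 2)
    return spectrum


def bark_band_energies(frame, sample_rate, n_bands=56):
    """Accumulate spectral power into n_bands equal-width Bark bands."""
    n = len(frame)
    z_max = hz_to_bark(sample_rate / 2.0)
    bands = [0.0] * n_bands
    for k, power in enumerate(power_spectrum(frame)):
        z = hz_to_bark(k * sample_rate / n)
        index = min(int(z / z_max * n_bands), n_bands - 1)
        bands[index] += power
    return bands


if __name__ == "__main__":
    rate = 8000
    tone = [math.cos(2.0 * math.pi * 1000.0 * t / rate) for t in range(64)]
    bands = bark_band_energies(tone, rate)
    print(bands.index(max(bands)), sum(bands))
```

  • With the band layout assumed here, a 1 kHz tone sampled at 8 kHz falls in a single band near the middle of the 56 bands, since 1 kHz corresponds to roughly 8.5 Bark.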
  • Practically, said spectral analysis is advantageously at least partly performed by applying a PSQM type algorithm to the consecutive groups of samples forming the signal, said algorithm carrying out the fast Fourier transform and the spectral projection.
  • Said spectral analysis normally comprises two different types of treatment procedures depending on whether the considered group of samples to be analysed incorporates speech or not, and therefore has been identified as such by the combined previous operative steps of segmentation/voice activity detection.
  • Thus, the inventive method consists, for the groups of samples corresponding to sequences or periods comprising speech, and after performing the fast Fourier transform and projecting the resulting spectrum onto the bands of the Bark scale, in calculating for each group an energy ratio SNR defined as: SNR = Energy (in concerned bands) / Energy (outside concerned bands), wherein the concerned bands are the bands in which speech activity can be detected, preferably bands 14 to 41 of the 56 critical bands of the Bark scale.
  • Said SNR (Signal to Noise Ratio) provides a good estimation of the voice degradation and can be used as a quality mark.
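  • The energy ratio defined above can be computed from a list of per-band energies as in the following sketch. The function name `band_snr` is hypothetical, and the band numbers are assumed to be 1-based and inclusive, so that bands 14 to 41 cover 28 of the 56 bands.

```python
def band_snr(band_energies, first_band=14, last_band=41):
    """Energy inside bands first_band..last_band over energy outside them.

    Band numbers are treated as 1-based and inclusive (an assumption).
    Returns float('inf') when no energy lies outside the concerned bands.
    """
    inside = sum(band_energies[first_band - 1:last_band])
    outside = sum(band_energies) - inside
    if outside <= 0.0:
        return float("inf")
    return inside / outside


if __name__ == "__main__":
    flat = [1.0] * 56          # same energy in every band
    print(band_snr(flat))      # 28 bands inside, 28 bands outside
```

  • The ratio is returned as a plain number; converting it to decibels or to a quality mark is a separate choice that the description leaves open.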
  • Alternatively, said method consists, for the groups of samples corresponding to sequences or periods of the signal without speech, i.e. silence or noise sequences, in averaging the spectral features of the signal in order to characterise the existing noise and deduce its origin.
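  • The averaging of spectral features over the frames without speech can be sketched as follows. The per-band energies are taken as the spectral features, which is an assumption; the function name `average_noise_spectrum` is hypothetical, and the step of deducing the origin of the noise from the averaged profile is not specified in the description and is not attempted here.

```python
def average_noise_spectrum(band_frames, is_speech):
    """Average the per-band energies over all frames flagged as non-speech.

    band_frames: list of per-frame band energy lists of equal length.
    is_speech:   list of booleans, True where the frame carries speech.
    Returns the averaged band profile, or None if no noise frame exists.
    """
    noise_frames = [b for b, s in zip(band_frames, is_speech) if not s]
    if not noise_frames:
        return None
    n_bands = len(noise_frames[0])
    count = len(noise_frames)
    return [sum(f[i] for f in noise_frames) / count for i in range(n_bands)]


if __name__ == "__main__":
    frames = [[1.0, 2.0], [10.0, 20.0], [3.0, 4.0]]
    flags = [False, True, False]   # the middle frame carries speech
    print(average_noise_spectrum(frames, flags))
```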
  • The present invention also concerns a device for determining the noise or speech quality degradation of a signal, without using any reference or initial signal, characterised in that said device mainly comprises means for decomposing the signal to be analysed through a segmentation algorithm, means for applying at least one metric to the resulting decomposed signal and means for evaluating the signal degradation.
  • Advantageously, said device also comprises additional means for identifying the speech, silence and/or noise sequences or periods of the signal to be analysed and for determining the average energy level of noise during the sequences or periods of the signal without speech activity.

Claims (13)

1. Method for determining the voice or speech quality degradation of a signal, without using any reference or initial signal, said method comprising the steps of: decomposing the signal to be analysed by means of a segmentation algorithm, then applying at least one metric to the resulting decomposed signal and finally evaluating the signal degradation, while before subjecting the signal to be analysed to the temporal segmentation algorithm, sampling said signal, calculating energy related quantities for said signal samples, thresholding said plurality of calculated quantities in order to identify the speech, silence and/or noise sequences or periods of said signal, and determining the average energy level of noise during the sequences or periods of the signal carrying no speech or silence sequences or periods, in order to perform a first signal degradation evaluation.
2. Method according to claim 1, wherein the segmentation algorithm is based on the Burg's algorithm which provides a AR2 type model of the signal.
3. Method according to claim 1, wherein it consists, in order to discriminate sequences or periods with and without speech of the signal, of determining the variation of the energy related quantities within or between predetermined or consecutive groups of samples, spotting the sequences in which or between which the variation is of a small magnitude and identifying as sequences or periods of silence or without speech, sequences or periods which correspond to at least two consecutive groups of samples with small internal and/or mutual variation of the energy related quantities.
4. Method according to any one of claims 1 to 4, wherein obtaining a PCM version of the signal and submitting said sampled signal, as successive groups or frames of samples, to a G.729 type coder in order to determine the groups or frames of samples, and the associated periods or sequences of the signal, comprising speech or voice activity.
5. Method according to any one of claims 1 to 4, wherein using a variable triggering threshold for the temporal segmentation algorithm, in the form of a quantity which is dependent on the current average value of energy or of an energy related quantity of the noise carried within said signal.
6. Method according to any one of claims 1 to 5, wherein performing a spectral analysis of the various homogeneous sequences or periods resulting from the decomposition of the signal to be analysed by the segmentation algorithm, said sequences or periods corresponding to one or several predetermined group(s) or frame(s) of samples extracted from the signal to be analysed.
7. Method according to claim 6, wherein the spectral analysis mainly consists in subjecting the groups of samples to a fast Fourier transform, then in projecting the spectrum onto critical bands of the Bark scale and finally analysing the resulting data.
8. Method according to claim 7, wherein the spectral analysis is at least partly performed by applying a PSQM type algorithm to the consecutive groups of samples forming the signal, said algorithm carrying out the fast Fourier transform and the spectral projection.
9. Method according to claim 7 or 8, wherein for the groups of samples corresponding to sequences or periods comprising speech, and after performing the fast Fourier transform and projecting the resulting spectrum onto the bands of the Bark's scale, in calculating for each group an energy ratio SNR defined as: SNR=Energy (in concerned bands)/Energy (outside concerned bands), wherein the concerned bands correspond to the bands in which speech activity can be detected, preferably bands 14 to 41 of the 56 critical bands of the Bark's scale.
10. Method according to claim 7 or 8, wherein for the groups of samples corresponding to sequences or periods of the signal without speech, i.e. silence or noise sequences, in averaging the spectral features of the signal in order to characterise the existing noise and deduce its origin.
11. Device for determining the noise or speech quality degradation of a signal, without using any reference or initial signal, whereby said device mainly comprises means for decomposing the signal to be analysed through a segmentation algorithm, means for applying at least one metric to the resulting decomposed signal and means for evaluating the signal degradation.
12. Device according to claim 11, whereby it also comprises additional means for identifying the speech, silence and/or noise sequences or periods of the signal to be analysed and for determining the average energy level of noise during the sequences or periods of the signal without speech activity.
13. Device according to claims 11 and 12, whereby said means are adapted to perform the method according to any of claims 1 to 10.
US10/178,299 2001-06-25 2002-06-25 Method and device for determining the voice quality degradation of a signal Abandoned US20050108006A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP01440189.7 2001-06-25
EP01440189A EP1271470A1 (en) 2001-06-25 2001-06-25 Method and device for determining the voice quality degradation of a signal

Publications (1)

Publication Number Publication Date
US20050108006A1 (en) 2005-05-19

Family

ID=8183243

Family Applications (1)

Application Number Title Priority Date Filing Date
US10/178,299 Abandoned US20050108006A1 (en) 2001-06-25 2002-06-25 Method and device for determining the voice quality degradation of a signal

Country Status (2)

Country Link
US (1) US20050108006A1 (en)
EP (1) EP1271470A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE10327239A1 (en) 2003-06-17 2005-01-27 Opticom Dipl.-Ing. Michael Keyhl Gmbh Apparatus and method for extracting a test signal portion from an audio signal
DE102012000931A1 (en) 2012-01-19 2013-07-25 Volkswagen Ag Method for diagnosing audio system of motor vehicle, involves emitting test signal acoustically by loud speaker of audio system of motor vehicle, where test signal is received by microphone for generating microphone signal

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4696039A (en) * 1983-10-13 1987-09-22 Texas Instruments Incorporated Speech analysis/synthesis system with silence suppression
US5732390A (en) * 1993-06-29 1998-03-24 Sony Corp Speech signal transmitting and receiving apparatus with noise sensitive volume control
US6487535B1 (en) * 1995-12-01 2002-11-26 Digital Theater Systems, Inc. Multi-channel audio encoder
US6609092B1 (en) * 1999-12-16 2003-08-19 Lucent Technologies Inc. Method and apparatus for estimating subjective audio signal quality from objective distortion measures
US6898566B1 (en) * 2000-08-16 2005-05-24 Mindspeed Technologies, Inc. Using signal to noise ratio of a speech signal to adjust thresholds for extracting speech parameters for coding the speech signal

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040167773A1 (en) * 2003-02-24 2004-08-26 International Business Machines Corporation Low-frequency band noise detection
US7233894B2 (en) * 2003-02-24 2007-06-19 International Business Machines Corporation Low-frequency band noise detection
US20120215536A1 (en) * 2009-10-19 2012-08-23 Martin Sehlstedt Methods and Voice Activity Detectors for Speech Encoders
US9401160B2 (en) * 2009-10-19 2016-07-26 Telefonaktiebolaget Lm Ericsson (Publ) Methods and voice activity detectors for speech encoders
US20160322067A1 (en) * 2009-10-19 2016-11-03 Telefonaktiebolaget Lm Ericsson (Publ) Methods and Voice Activity Detectors for a Speech Encoders
CN103716470A (en) * 2012-09-29 2014-04-09 华为技术有限公司 Method and device for speech quality monitoring
EP2884493A4 (en) * 2012-09-29 2015-10-21 Huawei Tech Co Ltd Method and apparatus for voice quality monitoring
US20140163978A1 (en) * 2012-12-11 2014-06-12 Amazon Technologies, Inc. Speech recognition power management
US9704486B2 (en) * 2012-12-11 2017-07-11 Amazon Technologies, Inc. Speech recognition power management
US10325598B2 (en) 2012-12-11 2019-06-18 Amazon Technologies, Inc. Speech recognition power management
US11322152B2 (en) 2012-12-11 2022-05-03 Amazon Technologies, Inc. Speech recognition power management
US20160267923A1 (en) * 2015-03-09 2016-09-15 Tomoyuki Goto Communication apparatus, communication system, method of storing log data, and storage medium

Also Published As

Publication number Publication date
EP1271470A1 (en) 2003-01-02

Similar Documents

Publication Publication Date Title
US8195449B2 (en) Low-complexity, non-intrusive speech quality assessment
US7680655B2 (en) Method and apparatus for measuring the quality of speech transmissions that use speech compression
US7729275B2 (en) Method and apparatus for non-intrusive single-ended voice quality assessment in VoIP
EP1157377B1 (en) Speech enhancement with gain limitations based on speech activity
EP0856961B1 (en) Testing telecommunications apparatus
JP2004254329A5 (en)
CA2310491A1 (en) Noise suppression for low bitrate speech coder
US7024352B2 (en) Method and device for objective speech quality assessment without reference signal
US6246978B1 (en) Method and system for measurement of speech distortion from samples of telephonic voice signals
JPH06153244A (en) Method and apparatus for discrimination frequency signal existing in plurality of single-frequency signals
KR100655953B1 (en) Speech processing system and method using wavelet packet transform
US20050108006A1 (en) Method and device for determining the voice quality degradation of a signal
JP4759230B2 (en) Quality evaluation device
US7818168B1 (en) Method of measuring degree of enhancement to voice signal
US6490552B1 (en) Methods and apparatus for silence quality measurement
Mittag et al. Detecting Packet-Loss Concealment Using Formant Features and Decision Tree Learning.
US11132987B1 (en) Chroma detection among music, speech, and noise
CN117061039B (en) Broadcast signal monitoring device, method, system, equipment and medium
Zha et al. A data mining approach to objective speech quality measurement
Esquef et al. Quality assessment of audio: Increasing applicability scope of objective methods via prior identification of impairment types
Jo et al. Classification of pathological voice into normal/benign/malignant state
Somek et al. Speech quality assessment
WO2022139730A1 (en) Method enabling the detection of the speech signal activity regions
Benetazzo et al. Speech/voice-band data classification for data traffic measurements in telephone-type systems
Bertocco et al. Nonintrusive measurement of impulsive noise in telephone-type networks

Legal Events

Date Code Title Description
AS Assignment
Owner name: ALCATEL, FRANCE
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:JURD, CHARLES-HENRY;TIGHEZZA, HOUMAD;MOULEHIAWY, ABDELKRIM;REEL/FRAME:015226/0643
Effective date: 20020515

STCB Information on status: application discontinuation
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION