WO2001003316A1 - Coded domain echo control - Google Patents

Coded domain echo control

Info

Publication number
WO2001003316A1
Authority
WO
WIPO (PCT)
Prior art keywords
parameter
code
near end
digital signal
echo
Prior art date
Application number
PCT/US2000/018104
Other languages
English (en)
Inventor
Ravi Chandran
Daniel J. Marchok
Original Assignee
Tellabs Operations, Inc.
Priority date
Filing date
Publication date
Application filed by Tellabs Operations, Inc. filed Critical Tellabs Operations, Inc.
Priority to CA002378012A (CA2378012A1)
Priority to EP00948555A (EP1190495A1)
Priority to JP2001508063A (JP2003533902A)
Priority to AU62033/00A (AU6203300A)
Publication of WO2001003316A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M9/00Arrangements for interconnection not involving centralised switching
    • H04M9/08Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
    • H04M9/082Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using echo cancellers
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0316Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
    • G10L21/0364Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04BTRANSMISSION
    • H04B3/00Line transmission systems
    • H04B3/02Details
    • H04B3/20Reducing echo effects or singing; Opening or closing transmitting path; Conditioning for transmission in one direction or the other
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00Arrangements for detecting or preventing errors in the information received
    • H04L1/0001Systems modifying transmission characteristics according to link quality, e.g. power backoff
    • H04L1/0014Systems modifying transmission characteristics according to link quality, e.g. power backoff by adapting the source coding

Definitions

  • the present invention relates to coded domain enhancement of compressed speech and in particular to coded domain echo control.
  • "GSM 06.10 Digital cellular telecommunication system (Phase 2); Full rate speech; Part 2: Transcoding", ETS 300 580-2, March 1998, Second Edition.
  • "GSM 06.60 Digital cellular telecommunications system (Phase 2); Enhanced Full Rate (EFR) speech transcoding", June 1998.
  • "GSM 08.62 Digital cellular telecommunications system (Phase 2+); Inband Tandem Free Operation (TFO) of Speech Codecs", ETSI, March 2000.
  • "GSM 06.12 European digital cellular telecommunications system (Phase 2); Comfort noise aspect for full rate speech traffic channels", ETSI, September 1994.
  • Speech transmission between the mobile stations (handsets) and the base station is in compressed or coded form.
  • Speech coding techniques such as the GSM FR [1] and EFR [2] are used to compress the speech.
  • the devices used to compress speech are called vocoders.
  • the coded speech requires less than 2 bits per sample. This situation is depicted in Figure 1. Between the base stations, the speech is transmitted in an uncoded form (using PCM companding which requires 8 bits per sample).
  • coded speech and uncoded speech may be described as follows:
  • Uncoded speech refers to the digital speech signal samples typically used in telephony; these samples are either in linear 13-bits per sample form or companded form such as the 8-bits per sample µ-law or A-law PCM form; the typical bit-rate is 64 kbps.
  • Coded speech refers to the compressed speech signal parameters (also referred to as coded parameters) which use a bit rate typically well below 64kbps such as 13 kbps in the case of the GSM FR and 12.2 kbps in the case of GSM EFR; the compression methods are more extensive than the simple PCM companding scheme; examples of compression methods are linear predictive coding, code-excited linear prediction and multi-band excitation coding [4].
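The bit-rate arithmetic behind this distinction can be checked with a short sketch; the helper name is ours, while the rates (64 kbps PCM, 13 kbps GSM FR, 12.2 kbps GSM EFR) come from the text.

```python
SAMPLE_RATE_HZ = 8000  # narrowband telephony sampling rate

def bits_per_sample(bit_rate_bps: float) -> float:
    """Average bits spent per 8 kHz speech sample at a given bit rate."""
    return bit_rate_bps / SAMPLE_RATE_HZ

pcm_bps = 8 * SAMPLE_RATE_HZ       # uncoded A-law/u-law PCM: 64000 bps

print(bits_per_sample(13000))      # GSM FR: 1.625 bits/sample (< 2)
print(bits_per_sample(12200))      # GSM EFR: 1.525 bits/sample
print(pcm_bps / 13000)             # ~4.9x compression relative to PCM
```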
  • TFO: Tandem-Free Operation
  • the TFO standard applies to mobile-to- mobile calls.
  • the speech signal is conveyed between mobiles in a compressed form after a brief negotiation period.
  • the elimination of tandem codecs is known to improve speech quality in the case where the original signal is clean.
  • the key point to note is that the speech transmission remains coded between the mobile handsets and is depicted in Figure 2.
  • the echo problem and its traditional solution are shown in Figure 4.
  • echo occurs due to the impedance mismatch at the 4-wire-to-2- wire hybrids.
  • the mismatch results in electrical reflections of a portion of the far-end signal into the near-end signal.
  • the endpath impulse response is estimated using a network echo canceller (EC) and is used to produce an estimate of the echo signal.
  • the estimate is then subtracted from the near-end signal to remove the echo.
  • NLP: non-linear processor
  • the echo occurs due to the feedback from the speaker (earpiece) to the microphone (mouthpiece).
  • the acoustic feedback can be significant and the echo can be annoying, particularly in the case of hands-free phones.
  • Figure 5 shows the feedback path from the speaker to the microphone in a digital cellular handset.
  • the depicted handset does not have echo cancellation implemented in the handset.
  • a vocoder tandem (i.e. two encoder/decoder pairs placed in series) is known to degrade speech quality.
  • comfort noise generation may be used to mask the echo.
  • Comfort noise generation is used for silence suppression or discontinuous transmission purposes (e.g. [5]). It is possible to use such techniques to completely mask the echo whenever echo is detected. However, such techniques suffer from "choppiness" particularly during double-talk conditions, as well as poor and unnatural background transparency.
  • the proposed techniques are capable of performing echo control (acoustic or linear) directly on the coded speech (i.e. by direct modification of the coded parameters).
  • Low computational complexity and delay are achieved. Tandeming effects are avoided or minimized, resulting in better perceived quality after echo control. Excellent background transparency is also achieved.
  • Speech compression, which falls under the category of lossy source coding, is commonly referred to as speech coding.
  • Speech coding is performed to minimize the bandwidth necessary for speech transmission. This is especially important in wireless telephony where bandwidth is scarce. In the relatively bandwidth abundant packet networks, speech coding is still important to minimize network delay and jitter. This is because speech communication, unlike data, is highly intolerant of delay. Hence a smaller packet size eases the transmission through a packet network.
  • The four ETSI GSM standards of concern are listed in Table 1.
  • a set of consecutive digital speech samples is referred to as a speech frame.
  • the GSM coders operate on a frame size of 20ms (160 samples at 8kHz sampling rate). Given a speech frame, a speech encoder determines a small set of parameters for a speech synthesis model. With these speech parameters and the speech synthesis model, a speech frame can be reconstructed that appears and sounds very similar to the original speech frame. The reconstruction is performed by the speech decoder. In the GSM vocoders listed above, the encoding process is much more computationally intensive than the decoding process.
  • the speech parameters determined by the speech encoder depend on the speech synthesis model used.
  • the GSM coders in Table 1 utilize linear predictive coding (LPC) models.
  • a block diagram of a simplified view of a generic LPC speech synthesis model is shown in Figure 7. This model can be used to generate speech-like signals by specifying the model parameters appropriately.
  • the parameters include the time-varying filter coefficients, pitch periods, codebook vectors and the gain factors.
  • the synthetic speech is generated as follows.
  • An appropriate codebook vector, c(n), is first scaled by the codebook gain, G, where n denotes sample time.
  • the scaled codebook vector is then filtered by a pitch synthesis filter whose parameters include the pitch gain, g_p, and the pitch period, T; the result is sometimes referred to as the total excitation vector, u(n).
  • the pitch synthesis filter provides the harmonic quality of voiced speech.
  • the total excitation vector is then filtered by the LPC synthesis filter which specifies the broad spectral shape of the speech frame and the broad spectral shape of the corresponding audio signal. For each speech frame, the parameters are usually updated more than once.
  • the codebook vector, codebook gain and the pitch synthesis filter parameters are determined every subframe (5ms).
  • LPC synthesis filter parameters are determined twice per frame (every 10ms) in EFR and once per frame in FR.
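A minimal sketch of the Figure 7 synthesis model follows: the scaled codebook vector feeds the pitch synthesis filter (u(n) = G·c(n) + g_p·u(n−T)), whose output is shaped by the LPC synthesis filter 1/A(z). Every numeric value below is a hypothetical illustration, not a parameter of any GSM standard.

```python
import random

def synthesize_subframe(c, G, g_p, T, lpc, u_hist, s_hist):
    """One subframe of the generic LPC decoder of Figure 7:

        u(n) = G*c(n) + g_p*u(n - T)      (pitch synthesis filter)
        s(n) = u(n) - sum_i a_i*s(n - i)  (LPC synthesis filter 1/A(z))

    c: codebook vector, G: codebook gain, g_p/T: pitch gain/period,
    lpc: coefficients a_1..a_p, u_hist/s_hist: past excitation/output.
    """
    out = []
    for x in c:
        u = G * x + g_p * u_hist[-T]   # scaled codebook + pitch feedback
        u_hist.append(u)
        s = u - sum(a * s_hist[-i] for i, a in enumerate(lpc, start=1))
        s_hist.append(s)
        out.append(s)
    return out

# Hypothetical parameters for one 40-sample subframe (5 ms at 8 kHz).
rng = random.Random(0)
c = [rng.uniform(-1.0, 1.0) for _ in range(40)]
u_hist = [0.0] * 120                   # past excitation, covers the pitch lag
s_hist = [0.0] * 10                    # past output, LPC order 10 assumed
subframe = synthesize_subframe(c, G=0.5, g_p=0.8, T=60,
                               lpc=[-0.9] + [0.0] * 9,
                               u_hist=u_hist, s_hist=s_hist)
```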
  • a typical sequence of steps used in a speech encoder is as follows:
  • a typical sequence of steps used in a speech decoder is as follows:
  • As an example of the arrangement of coded parameters in the bit-stream transmitted by the encoder, the GSM FR vocoder is considered.
  • a frame is defined as 160 samples of speech sampled at 8kHz, i.e. a frame is 20ms long. With A-law PCM companding, 160 samples would require 1280 bits for transmission.
  • the encoder compresses the 160 samples into 260 bits.
  • the arrangement of the various coded parameters in the 260 bits of each frame is shown in Figure 8.
  • the first 36 bits of each coded frame consist of the log-area ratios which correspond to the LPC synthesis filter.
  • the remaining 224 bits can be grouped into 4 subframes of 56 bits each. Within each subframe, the coded parameter bits contain the pitch synthesis filter related parameters followed by the codebook vector and gain related parameters.
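The layout just described (36 log-area-ratio bits followed by four 56-bit subframes) can be expressed as a small splitting routine; the function and the flat bit-list representation are our sketch.

```python
def split_fr_frame(frame_bits):
    """Split the 260 bits of a GSM FR frame per Figure 8: 36 log-area-ratio
    bits (LPC synthesis filter), then four 56-bit subframes (pitch
    synthesis parameters followed by codebook vector/gain bits)."""
    assert len(frame_bits) == 260
    lar_bits = frame_bits[:36]
    subframes = [frame_bits[36 + 56 * k : 36 + 56 * (k + 1)]
                 for k in range(4)]
    return lar_bits, subframes

lar, subs = split_fr_frame([0] * 260)
```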
  • the preferred embodiment is useful in a communications system for transmitting a near end digital signal using a compression code comprising a plurality of parameters including a first parameter.
  • the parameters represent an audio signal comprising a plurality of audio characteristics.
  • the compression code is decodable by a plurality of decoding steps.
  • the communications system also transmits a far end digital signal using a compression code.
  • the echo in the near end digital signal can be reduced by reading at least the first parameter of the plurality of parameters in response to the near end digital signal.
  • At least one of the plurality of the decoding steps is performed on the near end digital signal and the far end digital signal to generate at least partially decoded near end signals and at least partially decoded far end signals.
  • the first parameter is adjusted in response to the at least partially decoded near end signals and at least partially decoded far end signals to generate an adjusted first parameter.
  • the first parameter is replaced with the adjusted first parameter in the near end digital signal.
  • the reading, generating and adjusting preferably are performed by a processor.
  • Another embodiment of the invention is useful in a communications system for transmitting a near end digital signal comprising code samples further comprising first bits using a compression code and second bits using a linear code.
  • the code samples represent an audio signal having a plurality of audio characteristics.
  • the system also transmits a far end digital signal. In such an environment, any echo in the near end digital signal can be reduced without decoding the compression code by adjusting the first bits and second bits in response to the near end digital signal and the far end digital signal.
  • Figure 1 is a schematic block diagram of a system for speech transmission in a GSM digital cellular network.
  • Figure 2 is a schematic block diagram of a system for speech transmission in a GSM network under tandem-free operation (TFO).
  • Figure 3 is a graph illustrating transmission of speech under tandem-free operation (TFO).
  • Figure 4 is a schematic block diagram of a traditional solution to an echo problem in a wireline network.
  • Figure 5 is a schematic block diagram illustrating acoustic feedback from a speaker to a microphone in a digital cellular telephone.
  • Figure 6 is a schematic block diagram of a traditional echo cancellation approach for coded speech.
  • Figure 7 is a schematic block diagram of a generic linear predictive code (LPC) speech synthesis model or speech decoder model.
  • Figure 8 is a diagram illustrating the arrangement of coded parameters in the bit stream for GSM FR.
  • Figure 9 is a schematic block diagram of a preferred form of coded domain echo control system for acoustic echo environments made in accordance with the invention.
  • Figure 10 is a schematic block diagram of another preferred form of coded domain echo control system for echo due to 4-wire-to-2-wire hybrids made in accordance with the invention.
  • Figure 11 is a schematic block diagram of a simplified end path model with flat delay and attenuation.
  • Figure 12 is a graph illustrating a preliminary echo likelihood versus near end to far end subframe power ratio.
  • Figure 13 is a flow diagram illustrating a preferred form of coded domain echo control methodology.
  • Figure 14 is a graph illustrating an exemplary pitch synthesis filter magnitude frequency response.
  • Figure 15 is a graph illustrating exemplary magnitude frequency responses of an original LPC synthesis filter and flattened versions of such a filter.
  • the codebook vector, c(n), is filtered by H(z) to result in the synthesized speech signal.
  • LPC-based vocoders use parameters similar to the above set, parameters that may be converted to the above forms, or parameters that are related to the above forms.
  • the LPC coefficients in LPC-based vocoders may be represented using log-area ratios (e.g. the GSM FR) or line spectral frequencies (e.g. GSM EFR); both of these forms can be converted to LPC coefficients.
  • An example of a case where a parameter is related to the above form is the block maximum parameter in the GSM FR vocoder; the block maximum can be considered to be directly proportional to the codebook gain in the model described by equation (1).
  • While the discussion of coded parameter modification methods is mostly limited to the generic speech decoder model, it is relatively straightforward to tailor these methods for any LPC-based vocoder, and possibly even other models.
  • By a linear code, we mean a compression technique that results in one coded parameter or coded sample for each sample of the audio signal.
  • Examples of linear codes are PCM (A-law and µ-law), ADPCM (adaptive differential pulse code modulation), and delta modulation.
  • By a compression code, we mean a technique that results in fewer than one coded parameter for each sample of the audio signal. Typically, compression codes result in a small set of coded parameters for each block or frame of audio signal samples. Examples of compression codes are linear predictive coding based vocoders such as the GSM vocoders (HR, FR, EFR).
  • Figure 9 shows a novel implementation of coded domain echo control (CDEC) for a situation where acoustic echo is present.
  • a communications system 10 transmits near end coded digital signals over a network 24 using a compression code, such as any of the codes used by the Codecs identified in Table 1.
  • the compression code is generated by an encoder 16 from linear audio signals generated by a near end microphone 14 within a near end speaker handset 12.
  • the compression code comprises parameters, such as those shown in Figure 8.
  • the parameters represent an audio signal comprising a plurality of audio characteristics, including audio level and power.
  • the compression code is decodable by various decoding steps.
  • system 10 controls echo in the near end digital signals due to the presence of far end digital signals transmitted by system 10 over a network 32. The echo is controlled with minimal delay and minimal, if any, decoding of the compression code parameters shown in Figure 8.
  • Near end digital signals using the compression code are received on a near end terminal 20, and digital signals using an adjusted compression code are transmitted by a near end terminal 22 over a network 24 to a far end handset (not shown) which includes a decoder (not shown) of the adjusted compression code.
  • the adjusted compression code is compatible with the original compression code. In other words, when the coded parameters are modified or adjusted, we term it the adjusted compression code, but it still is decodable using a standard decoder corresponding to the original compression code.
  • a linear far end audio signal is encoded by a far end encoder (not shown) to generate far end digital signals using a compression code compatible with decoder 18, and is transmitted over a network 32 to a far end terminal 34.
  • a decoder 18 of near end handset 12 decodes the far end digital signals. As shown in Figure 9, echo signals from the far end signals may find their way to encoder 16 of the near end handset 12 through acoustic feedback.
  • a processor 40 performs various operations on the near end and far end compression code.
  • Processor 40 may be a microprocessor, microcontroller, digital signal processor, or other type of logic unit capable of arithmetic and logical operations.
  • a different coded domain echo control algorithm 44 is executed by processor 40 at all times - under compressed mode and linear mode, during TFO as well as non-TFO.
  • a partial decoder 48 is executed by processor 40 to read at least a first of the parameters received at terminal 20.
  • Another partial decoder 46 is executed by processor 40 to generate at least partially decoded far end signals. Decoder 48 generates at least partially decoded near end signals. (Note that the compression codes used by the near end and far end signals may be different, and hence the partial decoders may also be different.)
  • algorithm 44 Based on the partial decoding, algorithm 44 generates an echo likelihood signal at least estimating the amount of echo in the near end digital signal.
  • the echo likelihood signal varies over time since the amount of echo depends on the far end speech signal.
  • the echo likelihood signal is used by algorithm 44 to adjust the parameter(s) read by algorithm 44.
  • the adjusted parameter is written into the near end digital signal to form an adjusted near end digital signal which is transmitted from terminal 22 to network 24. In other words, the adjusted parameter is substituted for the originally read parameter.
  • the partial decoders 46 and 48 shown within the Network ALC Device are algorithms executed by processor 40 and are codec-dependent.
  • the partial decoders operate on signals compressed using compression codes.
  • partial decoder 46 may decode the linear code rather than the compression code. Also, in this case, partial decoder 48 decodes the linear code and only determines the coded parameters from the compression code without actually synthesizing the audio signal from the compression code.
  • Blocks 44, 46 and 48 also may be implemented as hardwired circuits.
  • Figure 10 shows that the Figure 9 embodiment can be useful for a system in which the echo is due to a 4-wire-to-2-wire hybrid.
  • the CDEC device/algorithm removes the effects of echo from the near-end coded speech by directly modifying the coded parameters in the bit-stream received from the near-end. Decoding of the near-end and far-end signals is performed in order to determine the likelihood of echo being present in the near-end. Certain statistics are measured from the decoded signals to determine this likelihood value.
  • the decoding of near-end and far-end signals may be complete or partial depending on the vocoder being used for the encode and decode operations. Some examples of situations where partial decoding suffices are listed below:
  • CELP: code-excited linear prediction
  • the CDEC device may be placed between the base station and the switch (known as the A-interface) or between the two switches. Since the 6 MSBs of each 8-bit sample of the speech signal correspond to the PCM code as shown in Figure 3, it is possible to avoid decoding the coded speech altogether in this situation. A simple table-lookup is sufficient to convert the 8-bit companded samples to 13-bit linear speech samples using A-law companding tables. This provides an economical way to obtain a version of the speech signal without invoking the appropriate decoder. Note that the speech signal obtained in this manner is somewhat noisy, but has been found to be adequate for the measurement of the statistics necessary for determining the likelihood of echo.
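A sketch of this table lookup follows. The expansion routine is the standard ITU-T G.711 A-law construction (returning values on the common 16-bit scale, i.e. the 13-bit magnitude left-shifted by 3); masking off the 2 LSBs that carry the embedded coded stream under TFO is our reading of Figure 3.

```python
def alaw_to_linear(code: int) -> int:
    """ITU-T G.711 A-law expansion; result on the common 16-bit scale
    (the 13-bit magnitude left-shifted by 3)."""
    code ^= 0x55                       # undo even-bit inversion
    t = (code & 0x0F) << 4
    seg = (code & 0x70) >> 4
    if seg == 0:
        t += 8
    else:
        t = (t + 0x108) << (seg - 1)
    return t if code & 0x80 else -t

# Precomputed lookup table, as suggested in the text.
ALAW_TABLE = [alaw_to_linear(c) for c in range(256)]

def rough_pcm(sample: int) -> int:
    """Approximate speech sample from the 6 MSBs only; under TFO the
    2 LSBs carry the coded stream, so they are zeroed before lookup."""
    return ALAW_TABLE[sample & 0xFC]
```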
  • the far-end and near-end signals are available, certain statistics are measured and used to determine the likelihood of echo being present in the near-end signal.
  • the echo likelihood is estimated for each speech subframe, where the subframe duration is dependent on the vocoder being used. A preferred approach is described in this section.
  • A simplified model of the end-path is assumed as shown in Figure 11.
  • the end-path is assumed to consist of a flat delay of a fixed number of samples and an echo return loss (ERL).
  • ERL: echo return loss
  • s_NE(n) and s_FE(n) are the near-end and far-end uncoded signals, respectively.
  • P_NE is the power of the current subframe of the near-end signal.
  • P_FE(0) is the power of the current subframe of the far-end signal.
  • P_FE(m) is the power of the m-th subframe before the current subframe of the far-end signal.
  • N is the number of samples in a subframe.
  • R is the near-end to far-end subframe power ratio.
  • p is the echo likelihood obtained by smoothing the preliminary echo likelihood.
  • the echo likelihood is estimated for each subframe using the steps below.
  • the processing may be more appropriately performed frame-by- frame rather than subframe-by- subframe.
  • the denominator of R is essentially the maximum far-end subframe power measured during the expected end-path delay time period.
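The statistics above can be sketched as follows. The power definitions and the max-over-delay-window denominator follow the text; the exact curve mapping the power ratio R to the preliminary likelihood (Figure 12) and the smoothing constant are assumed stand-ins.

```python
def subframe_power(x):
    """P = (1/N) * sum of x(n)**2 over the N samples of a subframe."""
    return sum(v * v for v in x) / len(x)

def echo_likelihood(ne_subframe, fe_power_history, p_prev,
                    max_delay_subframes=7, smooth=0.7):
    """One echo-likelihood update for the current subframe.

    fe_power_history[m] holds P_FE(m), the far-end subframe power m
    subframes ago (m = 0 is current). The denominator of R is the
    maximum far-end power over the expected end-path delay window.
    """
    p_ne = subframe_power(ne_subframe)
    p_fe_max = max(fe_power_history[: max_delay_subframes + 1]) + 1e-12
    R = p_ne / p_fe_max
    # Assumed stand-in for the Figure 12 curve: a weak near end
    # relative to the recent far end suggests echo.
    p_pre = 1.0 if R < 0.1 else (0.0 if R > 1.0 else (1.0 - R) / 0.9)
    return smooth * p_prev + (1.0 - smooth) * p_pre
```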
  • the codebook gain parameter, G, for each subframe is reduced by a scale factor depending on the echo likelihood, p, for the subframe.
  • the new gain parameter, G_new, is then requantized according to the vocoder standard.
  • the codebook gain controls the overall level of the synthesized signal in the speech decoder model of Figure 7, and therefore controls the overall level of the corresponding audio signal. Attenuating the codebook gain in turn results in the attenuation of the echo.
  • the resulting 6 bit value is reinserted at the appropriate positions in the bit-stream.
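A sketch of the gain-attenuation step follows. The scale-factor mapping and the uniform 6-bit requantizer are assumed examples; the actual quantizer for the gain (the block maximum in GSM FR) is defined by the vocoder standard.

```python
def attenuate_codebook_gain(G, p, max_atten=0.9):
    """Scale the codebook gain down as the echo likelihood p rises.
    The linear mapping (1 - max_atten*p) is an assumed example."""
    return G * (1.0 - max_atten * p)

def requantize_6bit(value, step):
    """Stand-in for the vocoder's gain quantizer: round to the nearest
    of 64 uniform levels (GSM FR codes the block maximum in 6 bits)."""
    index = max(0, min(63, round(value / step)))
    return index, index * step

g_new = attenuate_codebook_gain(1000.0, p=0.8)   # ~280.0
idx, g_q = requantize_6bit(g_new, step=100.0)    # index 3 -> 300.0
```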
  • the codebook vector, c(n), is modified by randomizing the pulse positions.
  • Randomizing the codebook vector results in destroying the correlation properties of the echo. This has the effect of destroying much of the "speech-like" nature of the echo.
  • the randomization is performed whenever the likelihood of echo is determined to be high, preferably when p > 0.8.
  • randomization may be performed using any suitable pseudo-random bit generation technique.
  • the codebook vector for each subframe is determined by the RPE grid position parameter (2 bits) and 13 RPE pulses (3 bits each). These 41 bits are replaced with 41 random bits using a pseudo-random bit generator.
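For GSM FR, this randomization amounts to replacing those 41 bits, e.g.:

```python
import random

def randomize_fr_codebook_bits(subframe_bits, rng=None):
    """Replace the 41 codebook bits of a GSM FR subframe (2-bit RPE
    grid position + 13 RPE pulses of 3 bits each) with random bits,
    destroying the speech-like correlation of the echo."""
    rng = rng or random.Random()
    assert len(subframe_bits) == 41
    return [rng.getrandbits(1) for _ in range(41)]
```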
  • the pitch synthesis filter models any periodicity or long-term correlation in the speech signal, and is particularly important for modeling the harmonics of voiced speech.
  • the model of this filter discussed in Figure 7 uses only two parameters: the pitch period, T, and the pitch gain, g_p.
  • the pitch period is relatively constant over several subframes or frames.
  • the pitch gain in most vocoders ranges from zero to one or a small value above one (e.g. 1.2 in GSM EFR).
  • during strongly voiced speech, the pitch gain is at or near its maximum value.
  • the voiced harmonics of the echo are generally well modeled by the pitch synthesis filter when the likelihood of echo is detected to be high (p > 0.8).
  • the likelihood of echo is at moderate levels when 0.5 < p ≤ 0.8.
  • the encoding process generally results in modeling the stronger of the two signals. It is reasonable to assume that, in most cases, the near-end speech is stronger than the echo. If this is the case, then the encoding process, due to its nature, tends to model mostly the near-end speech harmonics and little or none of the echo harmonics with the pitch synthesis filter.
  • the pitch period is randomized so that long-term correlation in the echo is removed, hence destroying the voiced nature of the echo. Such randomization is performed only when the likelihood of echo is high, preferably when p > 0.8.
  • the pitch gain is reduced so as to control the strength of the harmonics or the strength of the long-term correlation in the audio signal.
  • Such gain attenuation is preferably performed only when the likelihood of echo is at least moderate (p > 0.5).
  • the pitch period is not randomized during moderate echo likelihood but the pitch gain may be attenuated so that the voicing quality of the signal is not as strong.
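The pitch-modification rules can be collected into one sketch; the 40–120 lag range is the GSM FR range, while the particular gain-attenuation rule is an assumed example.

```python
import random

def modify_pitch_params(T, g_p, p, rng=None):
    """Echo-likelihood-driven pitch modification:
    p > 0.8:        randomize the pitch period (GSM FR lag range 40..120)
                    and attenuate the pitch gain;
    0.5 < p <= 0.8: attenuate the pitch gain only, weakening the voiced
                    quality without destroying it.
    The attenuation rule g_p * (1 - p) is an assumed example."""
    rng = rng or random.Random()
    if p > 0.8:
        T = rng.randint(40, 120)
    if p > 0.5:
        g_p = g_p * (1.0 - p)
    return T, g_p
```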
  • the dotted line in Figure 14 is the response for a high pitch gain; the strength of the harmonics in the audio signal can be controlled by modifying this parameter in this manner.
  • in the GSM FR vocoder, the lag parameter N corresponds to the pitch period T of the model of Figure 7; N takes up 7 bits in the bit-stream and can range from 40 to 120, inclusive.
  • the LTP gain parameter of subframe j of the GSM FR vocoder, denoted by b_j, corresponds to the pitch gain g_p of the model.
  • the magnitude frequency response of this filter may be flattened by spectral morphing; the modified transfer function is H(z/γ), obtained by replacing each LPC coefficient a_i with γ^i a_i, where 0 < γ ≤ 1 is the spectral morphing factor.
  • the effect of such spectral morphing on echo is to reduce or remove any formant structure present in the signal.
  • the echo is blended or morphed to sound like background noise.
  • the LPC synthesis filter magnitude frequency response for a voiced speech segment and its flattened versions for several different values of γ are shown in Figure 15.
  • the spectral morphing factor γ is determined based on the echo likelihood, p, for the subframe.
  • a similar spectral morphing method is obtained for other representations of the LPC filter coefficients commonly used in vocoders, such as reflection coefficients, log-area ratios, inverse sine functions, and line spectral frequencies.
  • the GSM FR vocoder utilizes log-area ratios for representing the LPC synthesis filter coefficients.
  • the modified log-area ratios are then quantized according to the specifications in the standard. Note that these approaches to modification of the log-area ratios preserve the stability of the LPC synthesis filter.
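The coefficient-domain view of this flattening can be sketched directly: replacing each LPC coefficient a_i with γ^i·a_i realizes H(z/γ), and the response flattens as γ decreases. The single-coefficient filter and γ value below are illustrative.

```python
import cmath

def morph_lpc(lpc, gamma):
    """Replace each a_i with gamma**i * a_i, realizing H(z/gamma);
    as gamma approaches 0, A(z) approaches 1 and the response flattens."""
    return [a * gamma ** (i + 1) for i, a in enumerate(lpc)]

def lpc_magnitude(lpc, omega):
    """|H(e^{j*omega})| for H(z) = 1 / (1 + sum_i a_i z^{-i})."""
    A = 1.0 + sum(a * cmath.exp(-1j * omega * (i + 1))
                  for i, a in enumerate(lpc))
    return 1.0 / abs(A)

a = [-0.9]                # illustrative single strong pole near 0 Hz
flat = morph_lpc(a, 0.5)  # [-0.45]
```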
  • Figure 8 shows the order in which the coded parameters from the GSM FR encoder are received.
  • a straightforward approach involves buffering up the entire 260 bits for each frame and then processing these buffered bits for coded domain echo control purposes. However, this introduces a buffering delay of about 20ms plus the processing delay.
  • the entire first subframe can be decoded as soon as bit 92 is received.
  • the first subframe may be processed after about 7.1ms (20ms times 92/260) of buffering delay.
  • the buffering delay is reduced by almost 13ms.
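The delay arithmetic above, as a check:

```python
FRAME_MS = 20.0     # GSM frame duration
FRAME_BITS = 260    # GSM FR bits per frame

def buffering_delay_ms(bits_needed: int) -> float:
    """Delay until the first bits_needed bits of a frame have arrived,
    assuming the 260 bits are spread evenly over the 20 ms frame."""
    return FRAME_MS * bits_needed / FRAME_BITS

early = buffering_delay_ms(92)       # first subframe decodable at bit 92
print(round(early, 1))               # 7.1 (ms) instead of 20 ms
print(round(FRAME_MS - early, 1))    # 12.9 (ms) of buffering delay saved
```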
  • the coded LPC synthesis filter parameters are modified based on information available at the end of the first subframe of the frame.
  • the entire frame is affected by the echo likelihood computed based on the first subframe.
  • no noticeable artifacts were found due to this 'early' decision, particularly because the echo likelihood is a smoothed quantity based effectively on several previous subframes as well as the current subframe.
  • When applying the novel coded domain processing techniques described in this report for removing echo, some or all of the bits corresponding to the coded parameters are modified in the bit-stream. This may affect other error-correction or detection bits that may also be embedded in the bit-stream. For instance, a speech encoder may embed some checksums in the bit-stream for the decoder to verify to ensure that an error-free frame is received. Such checksums as well as any parity check bits, error correction or detection bits, and framing bits are updated in accordance with the appropriate standard, if necessary.
  • additional information is available in addition to the coded parameters.
  • This additional information is the 6 MSBs of the A-law PCM samples of the audio signal.
  • these PCM samples may be used to reconstruct a version of the audio signal for both the far end and near end without using the coded parameters. This results in computational savings.

Abstract

A communications system (10) transmits a near end digital signal using a compression code comprising a plurality of parameters, including a first parameter. The parameters represent an audio signal comprising a plurality of audio characteristics. The compression code is decodable by a plurality of decoding steps. The system also transmits a far end digital signal using a compression code. A terminal (20) receives the near end digital signal and a terminal (36) receives the far end digital signal. A processor (40) is responsive to the near end digital signal to read at least the first parameter. The processor generates at least partially decoded near end signals and at least partially decoded far end signals. The processor adjusts the first parameter in response to these signals and writes the adjusted first parameter into the near end digital signal. Another terminal (22) transmits the adjusted near end digital signal. This makes it possible to reduce the echo in the near end digital signal.
PCT/US2000/018104 1999-07-02 2000-06-30 Coded domain echo control WO2001003316A1 (fr)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CA002378012A CA2378012A1 (fr) 1999-07-02 2000-06-30 Coded domain echo control
EP00948555A EP1190495A1 (fr) 1999-07-02 2000-06-30 Coded domain echo control
JP2001508063A JP2003533902A (ja) 1999-07-02 2000-06-30 Coded domain echo control
AU62033/00A AU6203300A (en) 1999-07-02 2000-06-30 Coded domain echo control

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US14213699P 1999-07-02 1999-07-02
US60/142,136 1999-07-02

Publications (1)

Publication Number Publication Date
WO2001003316A1 (fr) 2001-01-11

Family

ID=22498680

Family Applications (3)

Application Number Title Priority Date Filing Date
PCT/US2000/018165 WO2001002929A2 (fr) 1999-07-02 2000-06-30 Coded domain noise management
PCT/US2000/018293 WO2001003317A1 (fr) 1999-07-02 2000-06-30 Coded domain adaptive level control of compressed speech
PCT/US2000/018104 WO2001003316A1 (fr) 1999-07-02 2000-06-30 Coded domain echo control

Family Applications Before (2)

Application Number Title Priority Date Filing Date
PCT/US2000/018165 WO2001002929A2 (fr) 1999-07-02 2000-06-30 Coded domain noise management
PCT/US2000/018293 WO2001003317A1 (fr) 1999-07-02 2000-06-30 Coded domain adaptive level control of compressed speech

Country Status (5)

Country Link
EP (3) EP1190494A1 (fr)
JP (3) JP2003533902A (fr)
AU (3) AU6067100A (fr)
CA (3) CA2378012A1 (fr)
WO (3) WO2001002929A2 (fr)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003295898A (ja) * 2002-04-05 2003-10-15 Nippon Telegr & Teleph Corp (NTT) Speech processing method, speech processing apparatus, and speech processing program
US8032365B2 (en) * 2007-08-31 2011-10-04 Tellabs Operations, Inc. Method and apparatus for controlling echo in the coded domain
US8571204B2 (en) 2011-07-25 2013-10-29 Huawei Technologies Co., Ltd. Apparatus and method for echo control in parameter domain
US8874437B2 (en) 2005-03-28 2014-10-28 Tellabs Operations, Inc. Method and apparatus for modifying an encoded signal for voice quality enhancement

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1301018A1 (fr) * 2001-10-02 2003-04-09 Alcatel Method and apparatus for modifying a digital signal in a coded domain
JP3876781B2 (ja) 2002-07-16 2007-02-07 Sony Corp Receiving apparatus and receiving method, recording medium, and program
EP1521242A1 (fr) * 2003-10-01 2005-04-06 Siemens Aktiengesellschaft Method for speech coding with noise reduction by means of codebook gain modification
US7613607B2 (en) 2003-12-18 2009-11-03 Nokia Corporation Audio enhancement in coded domain
EP1943730A4 (fr) * 2005-10-31 2017-07-26 Telefonaktiebolaget LM Ericsson (publ) Digital filter delay reduction
US7852792B2 (en) * 2006-09-19 2010-12-14 Alcatel-Lucent Usa Inc. Packet based echo cancellation and suppression
JP4915576B2 (ja) * 2007-05-28 2012-04-11 Panasonic Corp Voice transmission system
JP4915575B2 (ja) * 2007-05-28 2012-04-11 Panasonic Corp Voice transmission system
JP4915577B2 (ja) * 2007-05-28 2012-04-11 Panasonic Corp Voice transmission system
TWI469135B (zh) * 2011-12-22 2015-01-11 Univ Kun Shan Method of adaptive differential pulse code modulation encoding and decoding
JP6011188B2 (ja) * 2012-09-18 2016-10-19 Oki Electric Industry Co., Ltd. Echo path delay measurement apparatus, method, and program
JP6816277B2 (ja) 2017-07-03 2021-01-20 Pioneer Corp Signal processing device, control method, program, and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5140543A (en) * 1989-04-18 1992-08-18 Victor Company Of Japan, Ltd. Apparatus for digitally processing audio signal
US5828995A (en) * 1995-02-28 1998-10-27 Motorola, Inc. Method and apparatus for intelligible fast forward and reverse playback of time-scale compressed voice messages
US5847303A (en) * 1997-03-25 1998-12-08 Yamaha Corporation Voice processor with adaptive configuration by parameter setting
US6064693A (en) * 1997-02-28 2000-05-16 Data Race, Inc. System and method for handling underrun of compressed speech frames due to unsynchronized receive and transmit clock rates
US6112177A (en) * 1997-11-07 2000-08-29 At&T Corp. Coarticulation method for audio-visual text-to-speech synthesis

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0683114B2 (ja) * 1985-03-08 1994-10-19 Matsushita Electric Industrial Co., Ltd. Echo canceller
US4969192A (en) * 1987-04-06 1990-11-06 Voicecraft, Inc. Vector adaptive predictive coder for speech and audio
US5097507A (en) * 1989-12-22 1992-03-17 General Electric Company Fading bit error protection for digital cellular multi-pulse speech coder
US5680508A (en) * 1991-05-03 1997-10-21 Itt Corporation Enhancement of speech coding in background noise for low-rate speech coder
JP3353257B2 (ja) * 1993-08-30 2002-12-03 Nippon Telegraph & Telephone Corp Echo canceller combined with speech coding and decoding
JPH0954600A (ja) * 1995-08-14 1997-02-25 Toshiba Corp Speech coding communication apparatus
JPH0993132A (ja) * 1995-09-27 1997-04-04 Toshiba Corp Coding/decoding apparatus and method
JPH10143197A (ja) * 1996-11-06 1998-05-29 Matsushita Electric Ind Co Ltd Playback apparatus
US5943645A (en) * 1996-12-19 1999-08-24 Northern Telecom Limited Method and apparatus for computing measures of echo
JP3283200B2 (ja) * 1996-12-19 2002-05-20 KDDI Corp Method and apparatus for coding rate conversion of coded speech data
JP3346765B2 (ja) * 1997-12-24 2002-11-18 Mitsubishi Electric Corp Speech decoding method and speech decoding apparatus

Also Published As

Publication number Publication date
EP1208413A2 (fr) 2002-05-29
CA2378035A1 (fr) 2001-01-11
WO2001003317A1 (fr) 2001-01-11
JP2003504669A (ja) 2003-02-04
AU6203300A (en) 2001-01-22
CA2378062A1 (fr) 2001-01-11
WO2001002929A2 (fr) 2001-01-11
EP1190494A1 (fr) 2002-03-27
CA2378012A1 (fr) 2001-01-11
JP2003503760A (ja) 2003-01-28
JP2003533902A (ja) 2003-11-11
AU6063600A (en) 2001-01-22
WO2001002929A3 (fr) 2001-07-19
AU6067100A (en) 2001-01-22
EP1190495A1 (fr) 2002-03-27

Similar Documents

Publication Publication Date Title
RU2325707C2 (ru) Method and device for efficient frame erasure concealment in linear-prediction-based speech codecs
JP3842821B2 (ja) Method and apparatus for suppressing noise in a communication system
US20060215683A1 (en) Method and apparatus for voice quality enhancement
US20070160154A1 (en) Method and apparatus for injecting comfort noise in a communications signal
EP1190495A1 (fr) Coded domain echo control
JPH09204199A (ja) Method and apparatus for efficient coding of inactive speech
US20060217969A1 (en) Method and apparatus for echo suppression
JPH02155313A (ja) Coding method
US8874437B2 (en) Method and apparatus for modifying an encoded signal for voice quality enhancement
WO2000075919A1 (fr) Generation de bruit de confort a partir de statistiques de modeles de bruit parametriques et dispositif a cet effet
JP2003533902A5 (fr)
EP1869672A1 (fr) Methode et appareil pour modifier un signal code
US20060217970A1 (en) Method and apparatus for noise reduction
US20060217988A1 (en) Method and apparatus for adaptive level control
US20030065507A1 (en) Network unit and a method for modifying a digital signal in the coded domain
CA2244008A1 (fr) Filtre non-lineaire de suppression du bruit dans des dispositifs de traitement de la parole a prediction lineaire
US20060217971A1 (en) Method and apparatus for modifying an encoded signal
AU6533799A (en) Method for transmitting data in wireless speech channels
US6141639A (en) Method and apparatus for coding of signals containing speech and background noise
WO2000016313A1 (fr) Codage de la parole avec reproduction du bruit de fond
Chandran et al. Compressed domain noise reduction and echo suppression for network speech enhancement
CN105632504A (zh) ADPCM codec and packet loss concealment method for an ADPCM decoder
JP3163567B2 (ja) Speech coding communication system and apparatus therefor
Wada et al. Measurement of the effects of nonlinearities on the network-based linear acoustic echo cancellation
Kulakcherla Non linear adaptive filters for echo cancellation of speech coded signals

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 2378012

Country of ref document: CA

WWE Wipo information: entry into national phase

Ref document number: 2000948555

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 62033/00

Country of ref document: AU

WWP Wipo information: published in national office

Ref document number: 2000948555

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWE Wipo information: entry into national phase

Ref document number: 10019615

Country of ref document: US

WWW Wipo information: withdrawn in national office

Ref document number: 2000948555

Country of ref document: EP