WO2013017018A1 - Method and apparatus for performing adaptive discontinuous voice transmission - Google Patents

Method and apparatus for performing adaptive discontinuous voice transmission

Info

Publication number
WO2013017018A1
WO2013017018A1 PCT/CN2012/078878 CN2012078878W WO2013017018A1 WO 2013017018 A1 WO2013017018 A1 WO 2013017018A1 CN 2012078878 W CN2012078878 W CN 2012078878W WO 2013017018 A1 WO2013017018 A1 WO 2013017018A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
spectral energy
insertion description
mute insertion
speech signal
Prior art date
Application number
PCT/CN2012/078878
Other languages
English (en)
Chinese (zh)
Inventor
顾彩霞
袁浩
江东平
黎家力
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2013017018A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/012 Comfort noise or silence coding

Definitions

  • the present invention relates to the field of digital signal processing, and in particular, to a method and apparatus for performing speech adaptive discontinuous transmission (DTX).
  • DTX speech adaptive discontinuous transmission
  • The sender uses a Voice Activity Detector (VAD) algorithm for signal detection; when it detects an inactive segment of the call, it encodes the silence segment at a lower code rate.
  • VAD Voice Activity Detector
  • Only the important information of the signal is encoded; that is, the signal is coded into a Silence Insertion Descriptor (SID) frame, and SID frames are transmitted discontinuously.
  • SID Silence Insertion Descriptor
  • The decoding end decodes the received SID frames by means of Comfort Noise Generation (CNG).
  • CNG Comfort Noise Generation
  • In the fixed-interval scheme, a SID frame is transmitted once every fixed number of frames in the silence segment, using a parameter set in advance. For example, the 3GPP AMR and AMR-WB speech coding standards use this method, with a fixed interval of once every 8 frames.
  • the advantage of this method is that the calculation is simple and easy to implement, and the disadvantage is that the code rate cannot be automatically adjusted according to the signal characteristics.
  • When the sender detects a silence frame after a voice frame, it does not immediately enter the silence segment but applies a hangover mechanism.
  • During the hangover, frames are still encoded as normal speech.
  • If silence frames are still detected, the SID_FIRST frame (i.e., the first SID frame) is sent at the first silence frame position of the silence segment, and a SID update (SID_UPDATE) frame is sent at the third silence frame position.
  • Thereafter, a SID update frame is sent every 7 frames, so that after the buffering phase the parameters are updated by SID frames at a fixed low code rate.
  • Where the buffering phase is canceled, the SID update frame is transmitted directly.
  • This method is simple to calculate and can be implemented only by using a counter. No additional parameter calculation is required, and the code rate is controllable, and the algorithm is stable.
  • The disadvantage of this method is that the fixed interval fixes the code rate: the same code rate is used for every kind of noise and cannot be adjusted as the noise signal changes. For example, for white noise the parameters are very stable, yet SID frames are still sent frequently, so the code rate is not effectively reduced. For a fast-changing noise signal, the signal change cannot be tracked in time, which delays the information and causes large distortion of the noise signal when it is restored by CNG at the decoding end.
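
The fixed-interval behaviour described above can be illustrated with a counter-only sketch. The function name, the frame labels, and the default offsets (first update three frames after SID_FIRST, then one update every 8 frames) are assumptions chosen for illustration; they are not taken from any codec's reference code.

```python
def fixed_interval_schedule(num_frames, first_update_offset=3, update_interval=8):
    """Label each silence frame that follows the hangover period.

    Hypothetical helper: frame 0 carries SID_FIRST, the frame at
    first_update_offset carries the first SID_UPDATE, and later updates
    follow every update_interval frames regardless of the signal.
    """
    labels = []
    next_update = first_update_offset
    for n in range(num_frames):
        if n == 0:
            labels.append("SID_FIRST")
        elif n == next_update:
            labels.append("SID_UPDATE")
            next_update += update_interval
        else:
            labels.append("NO_DATA")  # nothing is transmitted for this frame
    return labels


schedule = fixed_interval_schedule(20)
```

Because only a counter is involved, the number of SID frames is the same for stationary white noise and for rapidly changing noise, which is the limitation the text points out.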
  • When the variable-interval transmission scheme (mode 2) is used, an algorithm evaluates the silence-segment signal in real time and, according to the real-time change of the signal, determines whether a SID frame needs to be transmitted.
  • the advantage of this method is flexibility, it can be changed according to the real-time change of the signal, the bandwidth is saved to the maximum, and the average code rate can be adjusted.
  • the disadvantage is that the calculation is relatively complicated.
  • One existing variable-interval method measures whether the signal has changed significantly by calculating parameters such as the LPC of the signal, in order to decide whether an update is needed.
  • Although this method can track the signal adaptively, its computational complexity is high.
  • This method is based on linear prediction.
  • LPC linear predictive coding
  • A mathematical representation of the LPC coefficients is compared with the same parameter of the last transmitted SID frame stored in memory.
  • If the signal is considered to have changed, the SID update frame is sent; otherwise it is not sent.
  • Embodiments of the present invention provide a method and apparatus for performing speech adaptive discontinuous transmission, which overcome two drawbacks of the related art: the fixed-interval method cannot flexibly track signal changes, and the variable-interval method must calculate multiple parameters such as linear prediction, which leads to high computational complexity.
  • an embodiment of the present invention provides a method for performing voice adaptive discontinuous transmission, including:
  • When performing voice adaptive discontinuous transmission, whether to send a mute insertion description (SID) frame is determined according to the spectrum information of the current speech signal frame and the spectrum information of the previous mute insertion description frame.
  • The spectrum information of the speech signal frame is either spectrum information calculated from the frequency domain signal of the speech signal frame, or spectrum information calculated from the smoothed frequency domain signal after the frequency domain signal of the speech signal frame has been smoothed.
  • The step of determining whether to send the mute insertion description frame according to the spectrum information of the current speech signal frame and of the previous mute insertion description frame includes:
  • when it is determined that the absolute value of the spectral energy of the speech signal frame and/or the absolute value of the spectral energy of the previous mute insertion description frame is greater than a single frame energy threshold, and that the difference between the spectral energy of the speech signal frame and the spectral energy of the previous mute insertion description frame is greater than a first preset limit, the mute insertion description frame is sent.
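  • Alternatively, the step of determining whether to send the mute insertion description frame according to the spectrum information of the current speech signal frame and of the previous mute insertion description frame includes:
  • when it is determined that the absolute value of the spectral energy of the speech signal frame and/or the absolute value of the spectral energy of the previous mute insertion description frame is greater than the single frame energy threshold, and that the difference between the spectral energy of the speech signal frame and the spectral energy of the previous mute insertion description frame is greater than the first preset limit, further determining whether that difference is greater than a second preset limit and, if so, sending two mute insertion description frames in succession, wherein the spectral energy difference corresponding to the second preset limit is greater than the spectral energy difference corresponding to the first preset limit.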
  • The difference between the spectral energy of the speech signal frame and the spectral energy of the previous mute insertion description frame being greater than a preset limit means either of the following:
  • the ratio of the spectral energy of the speech signal frame to the spectral energy of the previous mute insertion description frame is greater than a ratio threshold corresponding to the preset limit or less than the reciprocal of the ratio threshold, wherein the ratio threshold is a real number greater than 1; or
  • the absolute value of the difference between the spectral energy of the speech signal frame and the spectral energy of the previous mute insertion description frame is greater than a difference threshold.
  • The step of determining whether to send the mute insertion description frame according to the spectrum information of the current speech signal frame and of the previous mute insertion description frame includes:
  • when it is determined that the absolute value of the spectral energy of the speech signal frame and/or the absolute value of the spectral energy of the previous mute insertion description frame is greater than the single frame energy threshold, calculating a spectral correlation value of the spectral energy of the speech signal frame and of the previous mute insertion description frame;
  • when the calculated spectral correlation value is less than a spectral correlation threshold, the mute insertion description frame is sent.
  • an embodiment of the present invention further provides an apparatus for performing voice adaptive discontinuous transmission, including a mute insertion description frame processing unit and a mute insertion description frame storage unit;
  • the mute insertion description frame processing unit is configured to determine whether to send a mute insertion description frame according to the current speech signal frame and the spectrum information of the last mute insertion description frame;
  • the mute insertion description frame storage unit is configured to store the spectrum information of the mute insertion description frame after the mute insertion description frame processing unit transmits the mute insertion description frame.
  • the mute insertion description frame processing unit is further configured to perform smoothing processing on the frequency domain signal of the speech signal frame, and to calculate the spectrum information of the speech signal frame according to the smoothed frequency domain signal;
  • the mute insertion description frame storage unit is further arranged to store the smoothed frequency domain signal.
  • the mute insertion description frame processing unit is configured to decide whether to transmit a mute insertion description frame as follows: when it determines that the absolute value of the spectral energy of the speech signal frame and/or the absolute value of the spectral energy of the previous mute insertion description frame is greater than the single frame energy threshold, and that the difference between the spectral energy of the speech signal frame and the spectral energy of the previous mute insertion description frame is greater than the first preset limit, it sends the mute insertion description frame; or
  • when it determines that the absolute value of the spectral energy of the speech signal frame and/or the absolute value of the spectral energy of the previous mute insertion description frame is greater than the single frame energy threshold, and that the difference between the spectral energy of the speech signal frame and the spectral energy of the previous mute insertion description frame is greater than the first preset limit, it further determines whether that difference is greater than a second preset limit and, if so, sends two mute insertion description frames in succession, wherein the spectral energy difference corresponding to the second preset limit is greater than the spectral energy difference corresponding to the first preset limit;
  • the difference between the spectral energy of the speech signal frame and the spectral energy of the previous mute insertion description frame being greater than a preset limit means: the ratio of the spectral energy of the speech signal frame to the spectral energy of the previous mute insertion description frame is greater than a ratio threshold corresponding to the preset limit or less than the reciprocal of the ratio threshold, wherein the ratio threshold is a real number greater than 1; or the absolute value of the difference between the spectral energy of the speech signal frame and the spectral energy of the previous mute insertion description frame is greater than a difference threshold.
  • the mute insertion description frame processing unit is configured to decide whether to transmit a mute insertion description frame as follows: when it determines that the absolute value of the spectral energy of the speech signal frame and/or the absolute value of the spectral energy of the previous mute insertion description frame is greater than the single frame energy threshold, it calculates a spectral correlation value of the spectral energy of the speech signal frame and of the previous mute insertion description frame, and when the calculated spectral correlation value is less than the spectral correlation threshold, it sends the mute insertion description frame.
  • FIG. 1 is a schematic structural diagram of an apparatus for performing voice adaptive discontinuous transmission
  • FIG. 2 is a schematic structural diagram of another apparatus for performing voice adaptive discontinuous transmission
  • FIG. 3 is a schematic flowchart of performing voice adaptive discontinuous transmission in Embodiment 2
  • FIG. 4 is a schematic flowchart of performing voice adaptive discontinuous transmission in Embodiment 3.
  • Preferred embodiment of the invention
  • the apparatus for performing voice adaptive discontinuous transmission includes a mute insertion description frame processing unit and a mute insertion description frame storage unit.
  • the mute insertion description frame processing unit is configured to determine whether to send the mute insertion description frame according to the spectrum information of the current speech signal frame and of the previous mute insertion description frame;
  • the mute insertion description frame storage unit is arranged to store the spectrum information of the mute insertion description frame after the device transmits the mute insertion description frame.
  • the mute insertion description frame processing unit is configured to determine whether to send the mute insertion description frame as follows: when it determines that the absolute value of the spectral energy of the speech signal frame and/or the absolute value of the spectral energy of the previous mute insertion description frame is greater than the single frame energy threshold, and that the difference between the spectral energy of the speech signal frame and the spectral energy of the previous mute insertion description frame is greater than the first preset limit, it sends the mute insertion description frame.
  • the mute insertion description frame processing unit may be further configured to decide whether to transmit the mute insertion description frame as follows: when it determines that the absolute value of the spectral energy of the speech signal frame and/or the absolute value of the spectral energy of the previous mute insertion description frame is greater than the single frame energy threshold, and that the difference between the spectral energy of the speech signal frame and the spectral energy of the previous mute insertion description frame is greater than the first preset limit, it further determines whether that difference is greater than the second preset limit. If so, two mute insertion description frames are sent in succession, where the spectral energy difference corresponding to the second preset limit is greater than the spectral energy difference corresponding to the first preset limit.
  • The difference between the spectral energy of the speech signal frame and the spectral energy of the previous mute insertion description frame being greater than a preset limit means either of the following:
  • the ratio of the spectral energy of the speech signal frame to the spectral energy of the previous mute insertion description frame is greater than a ratio threshold corresponding to the preset limit or less than the reciprocal of the ratio threshold, wherein the ratio threshold is a real number greater than 1; or
  • the absolute value of the difference between the spectral energy of the speech signal frame and the spectral energy of the previous mute insertion description frame is greater than the difference threshold.
  • the mute insertion description frame processing unit is configured to decide whether to send the mute insertion description frame as follows: when it determines that the absolute value of the spectral energy of the speech signal frame and/or the absolute value of the spectral energy of the previous mute insertion description frame is greater than the single frame energy threshold, it calculates a spectral correlation value from the spectral energy of the current speech signal frame and of the previous mute insertion description frame, and when the spectral correlation value is less than the spectral correlation threshold, it sends the mute insertion description frame.
  • the mute insertion description frame processing unit may also be configured to determine whether to transmit the mute insertion description frame from both the difference of the spectral energies of the two frames and the spectral correlation value.
  • the apparatus may further include: a smoothing filtering unit; the smoothing filtering unit is configured to perform smoothing filtering on the frequency domain signal of the voice signal, and input to the mute insertion description frame processing unit, and the mute insertion description frame processing unit The above processing is performed on the smoothed frequency domain signal, and the mute insertion description frame storage unit also needs to save the smoothed frequency domain signal.
  • the method for performing voice adaptive discontinuous transmission includes: when performing voice adaptive discontinuous transmission, determining whether to send a mute insertion description frame according to the spectrum information of the current speech signal frame and of the previous mute insertion description frame.
  • The spectrum information of the speech signal frame is either spectrum information calculated from the frequency domain signal of the speech signal frame, or spectrum information calculated from the smoothed frequency domain signal after the frequency domain signal of the speech signal frame has been smoothed.
  • the smoothing process is mainly to more accurately compare the spectral changes of the signal, reduce the influence of the details of the spectrum on the overall comparison, eliminate the spectral spikes and burrs, and make the output spectrum smoother, making the spectral envelope more stable.
  • This spectral smoothing can be achieved using a smoothing filter. Take a 16 kHz sampling rate and a 20 ms frame length as an example. A fast Fourier transform (FFT) transforms the time domain signal into the frequency domain to obtain the spectral parameters of the frame signal; the FFT length is 320 points.
  • FFT fast Fourier transform
  • $H(z) = a_0 z^{2} + a_1 z + a_2 + a_3 z^{-1} + a_4 z^{-2}$
  • The coefficients $[a_0, a_1, a_2, a_3, a_4]$ are the smoothing coefficients, which can be [0.15, 0.15, 0.4, 0.15, 0.15]. After smoothing, the overall trend of the spectral lines is unchanged, but abrupt instantaneous changes are reduced, which makes it easier to observe changes in the spectral envelope of the signal.
  • the above spectral smoothing includes, but is not limited to, the above-described manner of using a filter. During the use of the filter, different adjustment effects can also be achieved by adjusting the coefficients or orders of the filter.
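
As a rough illustration of the smoothing step, the sketch below applies a symmetric five-tap filter with the example coefficients [0.15, 0.15, 0.4, 0.15, 0.15] along the frequency axis. The function name is hypothetical, and the edge handling (repeating the first and last bins) is an assumption, since the text does not say how spectrum boundaries are treated.

```python
def smooth_spectrum(spectrum, coeffs=(0.15, 0.15, 0.4, 0.15, 0.15)):
    """Smooth a list of spectral values with a centred FIR filter.

    Assumption: bins beyond either end are replaced by the nearest
    valid bin, so a flat spectrum stays flat after smoothing.
    """
    half = len(coeffs) // 2
    last = len(spectrum) - 1
    smoothed = []
    for i in range(len(spectrum)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            j = min(max(i + k - half, 0), last)  # clamp index at the edges
            acc += c * spectrum[j]
        smoothed.append(acc)
    return smoothed


spike = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
smoothed_spike = smooth_spectrum(spike)
flat = smooth_spectrum([2.0] * 10)
```

With these coefficients an isolated spike is reduced to 40% of its height and spread over its four neighbours, which is the kind of spike and burr suppression the text describes. Changing the coefficients or the filter order changes how strongly the spectrum is smoothed.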
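  • when it is determined that the absolute value of the spectral energy of the speech signal frame and/or the absolute value of the spectral energy of the previous mute insertion description frame is greater than the single frame energy threshold, and that the difference between the spectral energy of the speech signal frame and the spectral energy of the previous mute insertion description frame is greater than the first preset limit, the mute insertion description frame is sent.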
  • when it is determined that the absolute value of the spectral energy of the speech signal frame and/or the absolute value of the spectral energy of the previous mute insertion description frame is greater than the single frame energy threshold, and that the difference between the spectral energy of the speech signal frame and the spectral energy of the previous mute insertion description frame is greater than the first preset limit, it is further determined whether that difference is greater than a second preset limit. If so, two mute insertion description frames are sent in succession, wherein the spectral energy difference corresponding to the second preset limit is greater than the spectral energy difference corresponding to the first preset limit.
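
The two-level decision can be sketched as a function that returns how many SID frames to send. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, energies are treated as non-negative scalars, and both preset limits are expressed as ratio thresholds greater than 1 (the ratio form is one of the two criteria the text allows).

```python
def exceeds_limit(e_cur, e_last, ratio_thr):
    """True if e_cur / e_last > ratio_thr or < 1 / ratio_thr.

    Written without division so that e_last == 0 needs no special case
    (assumes non-negative energies and ratio_thr > 1).
    """
    return e_cur > ratio_thr * e_last or e_cur * ratio_thr < e_last


def sid_frames_to_send(e_cur, e_last, single_thr, first_ratio, second_ratio):
    """Return 0, 1 or 2 SID frames for the current silence frame."""
    # Both frames at low energy: nothing worth updating.
    if not (abs(e_cur) > single_thr or abs(e_last) > single_thr):
        return 0
    # Change is below the first preset limit: keep the previous SID.
    if not exceeds_limit(e_cur, e_last, first_ratio):
        return 0
    # Change also exceeds the larger second limit: send two frames in a row.
    if exceeds_limit(e_cur, e_last, second_ratio):
        return 2
    return 1
```

For example, with a single-frame threshold of 10, a first ratio of 2 and a second ratio of 4, energies of 250 against a stored 100 yield one SID frame, while 500 against 100 yields two.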
  • The difference between the spectral energy of the speech signal frame and the spectral energy of the previous mute insertion description frame being greater than a preset limit means: the ratio of the spectral energy of the speech signal frame to the spectral energy of the previous mute insertion description frame is greater than a ratio threshold corresponding to the preset limit or less than the reciprocal of the ratio threshold, wherein the ratio threshold is a real number greater than 1; or the absolute value of the difference between the spectral energy of the speech signal frame and the spectral energy of the previous mute insertion description frame is greater than a difference threshold.
  • Embodiment 2: when it is determined that the absolute value of the spectral energy of the speech signal frame and/or the absolute value of the spectral energy of the previous mute insertion description frame is greater than the single frame energy threshold, a spectral correlation value is calculated from the spectral energy of the current speech signal frame and of the previous mute insertion description frame, and when the spectral correlation value is less than the spectral correlation threshold, the mute insertion description frame is sent.
  • Whether the mute insertion description frame is sent may also be determined according to both the difference of the spectral energies of the two frames and the spectral correlation value.
  • In this embodiment, the spectral correlation value is used for the decision.
  • After a SID frame is sent, the device stores the spectral energy information of that SID frame in the SID frame storage unit; that is, the information held in the mute insertion description frame storage unit is the spectral energy information of the last transmitted SID frame.
  • When determining whether to send a SID frame, it is first determined whether at least one of the absolute value of the spectral energy of the current speech signal frame and the absolute value of the spectral energy of the previous mute insertion description frame is greater than the single frame energy threshold (THR1). If this condition is not satisfied, the signal is considered to remain at low energy and no SID frame needs to be transmitted. If the condition is satisfied, the correlation between the spectral energy of the current speech signal frame and the spectral energy of the previous mute insertion description frame is calculated according to the following formula:
  • S(i) represents the spectral energy of the current speech signal frame
  • S_last(i) represents the spectral energy of the previous SID frame of the current frame
  • N represents the spectral length, which is 320 in this embodiment.
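
The correlation formula itself is not reproduced in the text above, so the following sketch substitutes a standard normalized cross-correlation between S(i) and S_last(i) as a stand-in. That choice, the use of the summed spectrum as the frame energy compared against THR1, and all function and parameter names are assumptions made for illustration only.

```python
import math


def spectral_correlation(s, s_last):
    """Normalized cross-correlation of two equal-length spectra (assumed form)."""
    num = sum(a * b for a, b in zip(s, s_last))
    den = math.sqrt(sum(a * a for a in s) * sum(b * b for b in s_last))
    return num / den if den > 0.0 else 0.0


def should_send_sid_by_correlation(s, s_last, thr1, corr_thr):
    """Send a SID frame when energy is significant and the spectra decorrelate."""
    # Step 1: skip the update while both frames stay below THR1.
    if not (abs(sum(s)) > thr1 or abs(sum(s_last)) > thr1):
        return False
    # Step 2: a low correlation means the spectral shape has changed.
    return spectral_correlation(s, s_last) < corr_thr
```

Under this assumed form, identical spectra give a correlation of 1 and never trigger an update, whereas spectra whose energy sits in different bins give a value near 0 and do.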
  • In this embodiment, the ratio of the spectral energy is used for the decision.
  • After a SID frame is sent, the device stores the spectral energy information of that SID frame in the SID frame storage unit; that is, the information held in the mute insertion description frame storage unit is the spectral energy information of the last transmitted SID frame.
  • the ratio of the spectral energy of the current speech signal frame to the spectral energy of the last mute insertion description frame is calculated according to the following formula:
  • S(i) represents the spectral energy of the current speech signal frame
  • S_last(i) represents the spectral energy of the previous SID frame of the current frame
  • N represents the spectral length
  • When the ratio is greater than THR3 or less than the reciprocal of THR3, where THR3 is a real number greater than 1, the signal energy has changed greatly and a SID frame needs to be sent. Otherwise, no SID frame needs to be transmitted.
  • In this embodiment, the ratio of the spectral energy is used for the decision.
  • After a SID frame is sent, the device stores the spectral energy information of that SID frame in the SID frame storage unit; that is, the information held in the mute insertion description frame storage unit is the spectral energy information of the last transmitted SID frame.
  • the ratio of the spectral energy of the current speech signal frame to the spectral energy of the last mute insertion description frame is calculated according to the following formula:
  • S(i) represents the spectral energy of the current speech signal frame
  • S_last(i) represents the spectral energy of the previous SID frame of the current frame
  • N represents the spectral length
  • When the ratio is greater than THR3 or less than the reciprocal of THR3, where THR3 is a real number greater than 1, the signal energy has changed greatly and the next decision step is performed. Otherwise, no SID frame needs to be sent.
  • In this embodiment, the difference in spectral energy is used for the decision.
  • After a SID frame is sent, the device stores the spectral energy information of that SID frame in the SID frame storage unit; that is, the information held in the mute insertion description frame storage unit is the spectral energy information of the last transmitted SID frame.
  • the difference between the spectral energy of the current speech signal frame and the spectral energy of the last mute insertion description frame is calculated according to the following formula:
  • $R_3 = \sum_{i=0}^{N-1} S(i) \cdot S(i) - \sum_{i=0}^{N-1} S_{last}(i) \cdot S_{last}(i)$
  • S(i) represents the spectral energy of the current speech signal frame
  • S_last(i) represents the spectral energy of the previous SID frame of the current frame
  • N represents the spectral length
  • When the absolute value of the difference $R_3$ is greater than the threshold THR5, the signal energy has changed greatly, a SID frame needs to be sent, and the information in the SID frame storage unit is updated at the same time.
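
The difference-based decision, together with the storage update, might look like the sketch below. The printed formula is damaged in the source, so reading R_3 as the sum of squared current spectral values minus the sum of squared stored values is an assumption; the function names and the dictionary standing in for the SID frame storage unit are hypothetical, and the preceding THR1 check is left out to keep the example short.

```python
def energy_difference(s, s_last):
    """Assumed reading of R_3: sum of S(i)^2 minus sum of S_last(i)^2."""
    return sum(a * a for a in s) - sum(b * b for b in s_last)


def decide_and_update(s, stored, thr5):
    """Return True if a SID frame should be sent, refreshing the store if so.

    `stored` is a hypothetical stand-in for the SID frame storage unit:
    a dict whose "spectrum" entry holds the last transmitted spectrum.
    """
    r3 = energy_difference(s, stored["spectrum"])
    if abs(r3) > thr5:
        stored["spectrum"] = list(s)  # remember what was just transmitted
        return True
    return False


store = {"spectrum": [1.0, 1.0]}
small_change = decide_and_update([1.0, 1.1], store, thr5=1.0)
large_change = decide_and_update([3.0, 1.0], store, thr5=1.0)
repeat = decide_and_update([3.0, 1.0], store, thr5=1.0)
```

Updating the store only when a frame is sent means later frames are always compared against what the decoder last received, not against the immediately preceding frame.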
  • A hangover algorithm may be added to preserve sound quality at the end of speech and to allow the CNG algorithm to complete its initialization. That is, when a silence frame is detected after continuous speech frames, the discontinuous transmission mode is not entered directly; instead, the first few silence frames continue to be processed as voice frames, and only then is the discontinuous transmission mode entered. For example, when the first silence frame is detected after a voice frame, the first 7 silence frames continue to be processed in voice frame mode. Then, if the detected frame is still a silence frame, the SID_FIRST frame is transmitted, SID_UPDATE is transmitted in the third frame after SID_FIRST, and subsequent SID frames are sent according to the decision algorithm described above.
  • The hangover algorithm includes counting the continuous speech frames.
  • Depending on this count, either the buffering phase is set according to the hangover algorithm above or, otherwise, SID_UPDATE is sent directly and the automatic detection state is entered; the count of consecutive speech frames is then cleared.
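
The hangover example above (7 silence frames still coded as speech, then SID_FIRST, then SID_UPDATE three frames later, then adaptive decisions) can be sketched as a small labelling routine. The function name, the labels and the `decide_sid` callback are hypothetical; the consecutive-speech-frame counter is not modelled because the text does not state its threshold.

```python
def dtx_frame_types(vad_flags, decide_sid, hangover=7, first_update_offset=3):
    """Label each frame given VAD flags (True = speech, False = silence).

    decide_sid(frame_index) is a caller-supplied stand-in for the spectral
    decision; it is consulted only after the initial SID_UPDATE.
    """
    labels = []
    silence_run = 0
    for idx, is_speech in enumerate(vad_flags):
        if is_speech:
            silence_run = 0
            labels.append("SPEECH")
            continue
        silence_run += 1
        since_first = silence_run - hangover - 1  # 0 on the SID_FIRST frame
        if since_first < 0:
            labels.append("SPEECH")  # hangover: still coded as speech
        elif since_first == 0:
            labels.append("SID_FIRST")
        elif since_first == first_update_offset:
            labels.append("SID_UPDATE")
        elif since_first > first_update_offset and decide_sid(idx):
            labels.append("SID_UPDATE")
        else:
            labels.append("NO_DATA")
    return labels


flags = [True] * 2 + [False] * 14
quiet = dtx_frame_types(flags, decide_sid=lambda idx: False)
busy = dtx_frame_types(flags, decide_sid=lambda idx: True)
```

Any speech frame resets the silence counter, so a short pause inside a sentence never leaves the speech-coding path.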
  • A maximum SID interval threshold may also be set.
  • When the interval since the last SID frame reaches this threshold, a SID update is forced, which ensures system stability and reduces the adverse effects of abnormal conditions such as SID frame loss.
  • a minimum SID interval threshold value may also be set.
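
One way to combine the interval thresholds with the spectral decision is sketched below. The forced update at the maximum interval follows the text; treating the minimum interval as a period during which updates are suppressed is an assumption, as are the function and parameter names.

```python
def apply_interval_limits(wants_update, frames_since_sid, min_interval, max_interval):
    """Override the spectral decision with interval limits.

    wants_update: result of the spectral decision for the current frame.
    frames_since_sid: frames elapsed since the last transmitted SID frame.
    """
    if frames_since_sid >= max_interval:
        return True   # force a refresh so a lost SID frame is recovered
    if frames_since_sid < min_interval:
        return False  # assumed behaviour: avoid back-to-back updates
    return wants_update
```

For instance, with a minimum of 2 and a maximum of 50 frames, a frame that arrives 50 frames after the last SID always triggers an update, even if the spectrum has not changed.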
  • the solution can be used for real-time two-way communication, such as wireless, IP conferencing, television, and other areas of voice transmission, to effectively save bandwidth resources and improve network usage efficiency without substantially affecting sound quality.
  • The scheme has low computational complexity and tracks changes in the signal spectrum accurately: it tracks effectively when the noise changes quickly and saves bandwidth effectively when the noise is stationary. It does not depend on any specific speech or audio encoder, and is flexible and efficient.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)
  • Telephone Function (AREA)

Abstract

The present invention relates to a method and an apparatus for performing adaptive discontinuous voice transmission. The method comprises: during adaptive discontinuous voice transmission, determining whether a silence insertion descriptor frame needs to be sent according to frequency spectrum information of a current voice signal frame and frequency spectrum information of a previous silence insertion descriptor frame. This process can overcome drawbacks of the related art, namely that in fixed-interval mode signal changes cannot be tracked flexibly, and that in variable-interval mode a calculation involving multiple parameters, such as linear prediction, is required, which results in high computational complexity. The process is performed directly in the frequency domain and can track signal changes accurately, which ensures sound quality while maintaining a low average code rate.
PCT/CN2012/078878 2011-07-29 2012-07-19 Method and apparatus for performing adaptive discontinuous voice transmission WO2013017018A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201110216374.3A CN102903364B (zh) 2011-07-29 2011-07-29 Method and apparatus for performing speech adaptive discontinuous transmission
CN201110216374.3 2011-07-29

Publications (1)

Publication Number Publication Date
WO2013017018A1 (fr) 2013-02-07

Family

ID=47575567

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/078878 WO2013017018A1 (fr) 2011-07-29 2012-07-19 Method and apparatus for performing adaptive discontinuous voice transmission

Country Status (2)

Country Link
CN (1) CN102903364B (fr)
WO (1) WO2013017018A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10805191B2 (en) 2018-12-14 2020-10-13 At&T Intellectual Property I, L.P. Systems and methods for analyzing performance silence packets

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104217723B (zh) * 2013-05-30 2016-11-09 华为技术有限公司 Signal encoding method and device
EP2980790A1 (fr) * 2014-07-28 2016-02-03 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for comfort noise generation mode selection
CN104378474A (zh) * 2014-11-20 2015-02-25 惠州Tcl移动通信有限公司 Mobile terminal for reducing call input noise and method thereof
US9748929B1 (en) * 2016-10-24 2017-08-29 Analog Devices, Inc. Envelope-dependent order-varying filter control

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060149536A1 (en) * 2004-12-30 2006-07-06 Dunling Li SID frame update using SID prediction error
CN1964408A (zh) * 2005-11-12 2007-05-16 鸿富锦精密工业(深圳)有限公司 Silence processing device and method
CN101213591A (zh) * 2005-06-18 2008-07-02 诺基亚公司 System and method for adaptive transmission of comfort noise parameters during discontinuous speech transmission
WO2008121035A1 (fr) * 2007-03-29 2008-10-09 Telefonaktiebolaget Lm Ericsson (Publ) Method and speech encoder with length adjustment of the DTX hangover period
CN101335001A (zh) * 2007-11-02 2008-12-31 华为技术有限公司 DTX decision method and device

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060149536A1 (en) * 2004-12-30 2006-07-06 Dunling Li SID frame update using SID prediction error
CN101213591A (zh) * 2005-06-18 2008-07-02 诺基亚公司 System and method for adaptive transmission of comfort noise parameters during discontinuous speech transmission
CN1964408A (zh) * 2005-11-12 2007-05-16 鸿富锦精密工业(深圳)有限公司 Silence processing device and method
WO2008121035A1 (fr) * 2007-03-29 2008-10-09 Telefonaktiebolaget Lm Ericsson (Publ) Method and speech encoder with length adjustment of the DTX hangover period
CN101335001A (zh) * 2007-11-02 2008-12-31 华为技术有限公司 DTX decision method and device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10805191B2 (en) 2018-12-14 2020-10-13 At&T Intellectual Property I, L.P. Systems and methods for analyzing performance silence packets
US11323343B2 (en) 2018-12-14 2022-05-03 At&T Intellectual Property I, L.P. Systems and methods for analyzing performance silence packets
US11729076B2 (en) 2018-12-14 2023-08-15 At&T Intellectual Property I, L.P. Systems and methods for analyzing performance silence packets

Also Published As

Publication number Publication date
CN102903364B (zh) 2017-04-12
CN102903364A (zh) 2013-01-30

Similar Documents

Publication Publication Date Title
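JP7427752B2 (ja) Device and method for reducing quantization noise in a time-domain decoder
JP4025018B2 (ja) Complex signal activity detection for improved speech/noise classification of an audio signal
JP4995913B2 (ja) Systems, methods, and apparatus for signal change detection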
US11417354B2 (en) Method and device for voice activity detection
KR102299938B1 (ko) Time delay estimation method and device
WO2008148323A1 (fr) Method and device for voice activity detection
KR101427863B1 (ko) Audio signal coding method and apparatus
JP3273599B2 (ja) Speech coding rate selector and speech coding apparatus
WO2013060223A1 (fr) Frame loss compensation method and apparatus for speech frame signals
KR101648290B1 (ko) Generation of comfort noise
CN103854649A (zh) Transform-domain frame loss compensation method and device
WO2013017018A1 (fr) Method and apparatus for performing speech adaptive discontinuous transmission
WO2009115039A1 (fr) Method and apparatus for generating noise
JP2019527855A (ja) Method and encoder for encoding a multi-channel signal
US20140172420A1 (en) Audio or voice signal processor
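JP2019023742A (ja) Method for estimating noise in an audio signal, noise estimator, audio encoder, audio decoder, and system for transmitting audio signals
WO2014190641A1 (fr) Method, device, and system for transmitting multimedia data
CN112599140B (zh) Method, device, and storage medium for optimizing speech coding rate and computational load
WO2008089696A1 (fr) Method and device for speech decoding in a speech decoder
JP4437011B2 (ja) Speech coding apparatus
CN113643713B (zh) Bluetooth audio encoding method, device, and storage medium
CN115762547A (zh) Method and device for detecting and eliminating noise, encoding method, medium, and equipment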

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 12820724; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: PCT application non-entry in European phase. Ref document number: 12820724; Country of ref document: EP; Kind code of ref document: A1