WO2013097764A1 - Method, device, and system for processing audio data - Google Patents
Method, device, and system for processing audio data
- Publication number
- WO2013097764A1 (application PCT/CN2012/087812)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- noise
- sid
- frame
- band signal
- energy
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/012—Comfort noise or silence coding
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0204—Spectral analysis using subband decomposition
- G10L19/028—Noise substitution, i.e. substituting non-tonal spectral components by noisy source
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/22—Mode decision, i.e. based on audio signal content versus external parameters
- G10L19/26—Pre-filtering or post-filtering
- G10L19/265—Pre-filtering, e.g. high frequency emphasis prior to encoding
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques characterised by the type of extracted parameters
- G10L25/21—Speech or voice analysis techniques, the extracted parameters being power information
- G10L25/78—Detection of presence or absence of voice signals
Definitions
- The present invention relates to the field of communications technologies, and in particular to a method, device, and system for processing audio data.
- Voice, image, audio, and video transmission has a wide range of applications, such as mobile phone calls, audio and video conferencing, broadcast television, and multimedia entertainment.
- Voice is digitized and transmitted from one terminal to another through a voice communication network, where the terminal can be a mobile phone, a digital telephone terminal, or any other type of voice terminal, such as a VoIP telephone, an ISDN telephone, a computer, an e-book reader, or a cable communication telephone.
- The audio signal is compressed at the transmitting end and transmitted to the receiving end; the receiving end decompresses it to recover the audio signal and plays it.
- DTX/CNG: Discontinuous Transmission System / Comfort Noise Generation
- SID: Silence Insertion Descriptor
- The background noise that the decoder continuously reconstructs is not a faithful reproduction of the background noise at the encoding end; it aims to minimize the loss of auditory quality so that the user feels more comfortable. The reconstructed background noise is called comfort noise (CN), and this method of reconstructing CN at the decoder is called comfort noise generation (CNG).
- ITU-T G.718 is a relatively new standardized wideband codec that includes a wideband DTX/CNG system.
- The system can transmit the SID at a fixed interval, and can also adaptively adjust the transmission interval of the SID according to the estimated noise level.
- The G.718 SID frame consists of 16 ISP parameters and excitation energy parameters.
- The ISP (Immittance Spectral Pair) parameters characterize the spectral envelope of the noise over the entire wideband bandwidth, and the excitation energy is obtained through the analysis filter represented by this set of ISP parameters.
- In the CNG state, G.718 estimates the LPC coefficients required for CNG from the ISP parameters obtained by decoding the SID, and estimates the excitation energy required for CNG from the excitation energy obtained from the decoded SID frame. The reconstructed CN is obtained by exciting the CNG synthesis filter with gain-adjusted white noise.
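The last step above (exciting a synthesis filter with gain-adjusted white noise) can be sketched roughly as follows. This is a minimal illustration, not the G.718 implementation: the function name, the first-order filter, the frame length, and the energy value are all assumptions chosen for the example.

```python
import math
import random


def synthesize_comfort_noise(lpc, target_energy, frame_len, seed=0):
    """Excite an all-pole synthesis filter 1/A(z) with gain-adjusted white noise.

    lpc: [a1, ..., ap] of A(z) = 1 + a1*z^-1 + ... + ap*z^-p (illustrative values)
    target_energy: desired mean-square energy of the excitation (assumed input)
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(frame_len)]
    # Scale the white noise so that its mean-square energy equals target_energy.
    mean_sq = sum(x * x for x in noise) / frame_len
    gain = math.sqrt(target_energy / mean_sq)
    excitation = [gain * x for x in noise]
    # All-pole filtering: y[n] = e[n] - sum_k a_k * y[n-k]
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return excitation, out


excitation, cn = synthesize_comfort_noise([-0.9], 0.01, 320)
```

In a real decoder the LPC coefficients would come from the decoded ISP parameters and the filter memory would persist across frames; both are omitted here for brevity.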
- Embodiments of the present invention provide a method, device, and system for processing audio data.
- The technical solution is as follows:
- A method for processing audio data, comprising:
- acquiring a noise frame of an audio signal, and decomposing the noise frame into a noise low band signal and a noise high band signal; encoding and transmitting the noise low band signal by using a first discontinuous transmission mechanism, and encoding and transmitting the noise high band signal by using a second discontinuous transmission mechanism,
- wherein the transmission strategy of the first silence insertion descriptor (SID) frame of the first discontinuous transmission mechanism is different from the transmission strategy of the second SID of the second discontinuous transmission mechanism, or the encoding strategy of the first SID of the first discontinuous transmission mechanism is different from the encoding strategy of the second SID of the second discontinuous transmission mechanism.
- A method for processing audio data, comprising:
- obtaining a silence insertion descriptor (SID) frame, and determining whether the SID includes a low band parameter and/or a high band parameter; if the SID includes the low band parameter, decoding the SID to obtain a noise low band parameter, locally generating a noise high band parameter, and obtaining a first comfort noise (CN) frame according to the decoded noise low band parameter and the locally generated noise high band parameter;
- if the SID includes the high band parameter, decoding the SID to obtain a noise high band parameter, locally generating a noise low band parameter, and obtaining a second CN frame according to the decoded noise high band parameter and the locally generated noise low band parameter;
- if the SID includes both the high band parameter and the low band parameter, decoding the SID to obtain the noise high band parameter and the noise low band parameter, and obtaining a third CN frame according to the decoded noise high band parameter and noise low band parameter.
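The three decoding branches above can be sketched as a simple dispatch. This is only an illustration under stated assumptions: the SID is modeled as a dictionary with optional "low" and "high" entries, and the two callables standing in for local parameter generation are hypothetical helpers, not part of the described method.

```python
def build_cn_parameters(sid, local_low, local_high):
    """Return (cn_frame_kind, low_params, high_params) for one SID.

    sid: dict that may hold decoded "low" and/or "high" parameters
         (a hypothetical stand-in for a real SID bitstream parser).
    local_low / local_high: zero-argument callables that generate the
         missing band's parameters locally at the decoder (assumed helpers).
    """
    has_low = "low" in sid
    has_high = "high" in sid
    if has_low and has_high:
        # Both bands were transmitted: third CN frame, nothing generated locally.
        return "third", sid["low"], sid["high"]
    if has_low:
        # Only the low band was transmitted: first CN frame.
        return "first", sid["low"], local_high()
    if has_high:
        # Only the high band was transmitted: second CN frame.
        return "second", local_low(), sid["high"]
    raise ValueError("SID carries neither low band nor high band parameters")


kind, low, high = build_cn_parameters({"low": [0.1, 0.2]}, lambda: [0.0], lambda: [0.5])
```

Checking the combined case first avoids mistaking a SID that carries both bands for a low-band-only SID.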
- An encoding apparatus for audio data, comprising:
- an acquiring module, configured to acquire a noise frame of the audio signal and decompose the noise frame into a noise low band signal and a noise high band signal;
- a transmission module, configured to encode and transmit the noise low band signal by using a first discontinuous transmission mechanism, and to encode and transmit the noise high band signal by using a second discontinuous transmission mechanism, where the transmission policy of the first silence insertion descriptor (SID) frame of the first discontinuous transmission mechanism is different from the transmission policy of the second SID of the second discontinuous transmission mechanism, or the coding strategy of the first SID of the first discontinuous transmission mechanism is different from the coding strategy of the second SID of the second discontinuous transmission mechanism.
- A decoding apparatus for audio data, comprising:
- an obtaining module, configured to obtain a silence insertion descriptor (SID) frame and determine whether the SID includes a low band parameter and/or a high band parameter;
- a first decoding module, configured to: if the SID obtained by the obtaining module includes the low band parameter, decode the SID to obtain a noise low band parameter, locally generate a noise high band parameter, and obtain a first comfort noise (CN) frame according to the decoded noise low band parameter and the locally generated noise high band parameter;
- a second decoding module, configured to: if the SID obtained by the obtaining module includes the high band parameter, decode the SID to obtain a noise high band parameter, locally generate a noise low band parameter, and obtain a second CN frame according to the decoded noise high band parameter and the locally generated noise low band parameter;
- a third decoding module, configured to: if the SID obtained by the obtaining module includes the high band parameter and the low band parameter, decode the SID to obtain the noise high band parameter and the noise low band parameter, and obtain a third CN frame according to the decoded noise high band parameter and noise low band parameter.
- A processing system for audio data, comprising the encoding apparatus for audio data described above and the decoding apparatus for audio data described above.
- The current noise frame is decomposed into a noise low band signal and a noise high band signal.
- The noise low band signal is encoded and transmitted by the first discontinuous transmission mechanism.
- The noise high band signal is encoded and transmitted by the second discontinuous transmission mechanism.
- The decoder obtains a silence insertion descriptor (SID) frame, determines whether the SID includes a low band parameter and/or a high band parameter, and uses a different noise decoding mode for each determination result.
- By processing the high band signal and the low band signal differently, computational complexity and coding bits can be saved without reducing the subjective quality of the codec. The saved bits can be used either to reduce the transmission bandwidth or to improve the overall coding quality, which addresses the problem of encoding and transmitting ultra-wideband signals.
- FIG. 2 is a flowchart of a method for processing audio data provided in Embodiment 2 of the present invention.
- FIG. 3 is a flowchart of a method for processing audio data provided in Embodiment 3 of the present invention.
- FIG. 4 is a flowchart of a method for processing audio data provided in Embodiment 4 of the present invention.
- FIG. 5 is a schematic diagram of an apparatus for encoding audio data according to Embodiment 6 of the present invention.
- FIG. 6 is a schematic diagram of another encoding apparatus for audio data provided in Embodiment 6 of the present invention.
- FIG. 7 is a schematic diagram of an apparatus for decoding audio data according to Embodiment 7 of the present invention.
- FIG. 8 is a schematic diagram of another decoding apparatus for audio data provided in Embodiment 7 of the present invention.
- FIG. 9 is a schematic diagram of a processing system for audio data provided in Embodiment 8 of the present invention.
- Detailed description
- This embodiment provides a method for processing audio data, where the method includes:
- encoding and transmitting the noise low band signal by using a first discontinuous transmission mechanism;
- encoding and transmitting the noise high band signal by using a second discontinuous transmission mechanism,
- where the sending policy of the first silence insertion descriptor (SID) frame of the first discontinuous transmission mechanism is different from the sending policy of the second SID of the second discontinuous transmission mechanism, or the coding strategy of the first SID of the first discontinuous transmission mechanism is different from the coding strategy of the second SID of the second discontinuous transmission mechanism.
- The first SID includes a low band parameter of the noise frame.
- The second SID includes a low band parameter or a high band parameter of the noise frame.
- Encoding and transmitting the noise high band signal by using the second discontinuous transmission mechanism includes: determining whether the noise high band signal has a preset spectral structure; if so, and if the sending condition in the sending policy of the second SID is satisfied, encoding the SID of the noise high band signal by using the coding strategy of the second SID and sending it; if not, determining that the noise high band signal does not need to be encoded and transmitted.
- Determining whether the noise high band signal has a preset spectral structure includes:
- obtaining the spectrum of the noise high band signal and dividing the spectrum into at least two sub-bands; if the average energy of any first sub-band among the sub-bands is not less than the average energy of a second sub-band among the sub-bands, where the second sub-band lies in a higher frequency band than the first sub-band, it is determined that the noise high band signal does not have the preset spectral structure;
- otherwise, the noise high band signal has the preset spectral structure.
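The sub-band test above can be sketched as follows. The sketch rests on one reading of the translated text, which is an assumption: the high band is treated as lacking the preset structure when every lower sub-band has an average energy not less than every higher sub-band, that is, when the sub-band energies never rise with frequency. The function names and the number of sub-bands are hypothetical.

```python
def subband_average_energies(spectrum, num_subbands):
    """Split a list of spectral coefficients into equal sub-bands and
    return the average energy (mean squared magnitude) of each one."""
    size = len(spectrum) // num_subbands
    return [
        sum(abs(x) ** 2 for x in spectrum[i * size:(i + 1) * size]) / size
        for i in range(num_subbands)
    ]


def has_preset_spectral_structure(spectrum, num_subbands=4):
    """Assumed reading: no preset structure when energies are non-increasing
    from the lowest sub-band to the highest; otherwise the structure is present."""
    energies = subband_average_energies(spectrum, num_subbands)
    non_increasing = all(lo >= hi for lo, hi in zip(energies, energies[1:]))
    return not non_increasing


flat_decay = [4, 4, 3, 3, 2, 2, 1, 1]   # energy falls with frequency
with_peak = [1, 1, 1, 1, 5, 5, 1, 1]    # a higher sub-band carries more energy
```

Comparing adjacent sub-bands is enough here, because a sequence is non-increasing for every pair exactly when it is non-increasing for each adjacent pair.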
- Encoding and transmitting the noise high band signal by using the second discontinuous transmission mechanism includes: generating a deviation degree value according to a first ratio and a second ratio, where the first ratio is the ratio of the energy of the noise high band signal of the noise frame to the energy of the noise low band signal, and the second ratio is the ratio of the energy of the noise high band signal to the energy of the noise low band signal at the time when the SID containing the noise high band parameter was last transmitted before the noise frame;
- determining whether the deviation degree value reaches a preset threshold; if so, encoding the SID of the noise high band signal by using the coding strategy of the second SID and sending it; if not, determining that the noise high band signal does not need to be encoded and transmitted.
- The first ratio being the ratio of the energy of the noise high band signal of the noise frame to the energy of the noise low band signal includes:
- the first ratio is the ratio of the instantaneous energy of the noise high band signal of the noise frame to the instantaneous energy of the noise low band signal;
- the second ratio being the ratio of the energy of the noise high band signal to the energy of the noise low band signal at the time when the SID containing the noise high band parameter was last transmitted before the noise frame includes:
- the second ratio is the ratio of the instantaneous energy of the noise high band signal to the instantaneous energy of the noise low band signal at the time when the SID containing the noise high band parameter was last transmitted before the noise frame.
- Alternatively, the first ratio being the ratio of the energy of the noise high band signal of the noise frame to the energy of the noise low band signal includes:
- the first ratio is the ratio of the weighted average energy of the noise high band signal of the noise frame and its preceding noise frames to the weighted average energy of the noise low band signal of the noise frame and its preceding noise frames;
- the second ratio being the ratio of the energy of the noise high band signal to the energy of the noise low band signal at the time when the SID containing the noise high band parameter was last transmitted before the noise frame includes:
- the second ratio is the ratio of the weighted average energy of the noise high band signal to the weighted average energy of the noise low band signal at the time when the SID containing the noise high band parameter was last transmitted before the noise frame.
- Generating the deviation degree value according to the first ratio and the second ratio includes:
- calculating the absolute value of the difference between the logarithm of the first ratio and the logarithm of the second ratio to obtain the deviation degree value.
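The deviation degree computation above amounts to |log(first ratio) - log(second ratio)|. A minimal sketch follows; the base-10 logarithm, the function names, and the sample energies are assumptions, since the text fixes neither the logarithm base nor the threshold value.

```python
import math


def deviation_degree(high_energy, low_energy, ref_high_energy, ref_low_energy):
    """Absolute difference of the log energy ratios.

    high_energy / low_energy: energies of the current noise frame
    ref_high_energy / ref_low_energy: energies at the time the last SID
        containing a high band parameter was sent
    Base-10 logarithm is an assumption; the source does not name the base.
    """
    first_ratio = high_energy / low_energy
    second_ratio = ref_high_energy / ref_low_energy
    return abs(math.log10(first_ratio) - math.log10(second_ratio))


def should_send_high_band_sid(deviation, threshold):
    """Send a new high band SID once the deviation reaches the threshold."""
    return deviation >= threshold


d = deviation_degree(1.0, 10.0, 1.0, 100.0)
```

Working in the log domain makes the test symmetric: a high-to-low energy ratio that doubles and one that halves produce the same deviation.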
- Encoding and transmitting the noise high band signal by using the second discontinuous transmission mechanism includes: determining whether the spectral structure of the noise high band signal of the noise frame, compared with the average spectral structure of the noise high band signal before the noise frame, satisfies a preset condition; if so, encoding the SID of the noise high band signal of the noise frame by using the coding strategy of the second SID and sending it; if not, determining that the noise high band signal of the noise frame does not need to be encoded and transmitted.
- The average spectral structure of the noise high band signal before the noise frame includes: a weighted average of the spectrum of the noise high band signal before the noise frame.
- The sending condition in the sending policy of the second SID of the second discontinuous transmission mechanism further includes: the first discontinuous transmission mechanism satisfies the sending condition of the first SID.
- The beneficial effects of the method provided by this embodiment are as follows: the current noise frame of the audio signal is acquired and decomposed into a noise low band signal and a noise high band signal; the noise low band signal is encoded and transmitted by the first discontinuous transmission mechanism, and the noise high band signal is encoded and transmitted by the second discontinuous transmission mechanism. By processing the high band signal and the low band signal differently, computational complexity and coding bits can be saved without reducing the subjective quality of the codec.
- The saved bits can be used to reduce the transmission bandwidth or to improve the overall coding quality, which addresses the problem of encoding and transmitting ultra-wideband signals.
- A method for processing audio data includes:
- The decoder obtains a silence insertion descriptor (SID) frame and determines whether the SID includes a low band parameter and/or a high band parameter.
- If the SID includes the low band parameter, the SID is decoded to obtain a noise low band parameter, a noise high band parameter is generated locally, and a first comfort noise (CN) frame is obtained according to the decoded noise low band parameter and the locally generated noise high band parameter.
- If the SID includes the high band parameter, the SID is decoded to obtain a noise high band parameter, a noise low band parameter is generated locally, and a second CN frame is obtained according to the decoded noise high band parameter and the locally generated noise low band parameter.
- If the SID includes the high band parameter and the low band parameter, the SID is decoded to obtain the noise high band parameter and the noise low band parameter, and a third CN frame is obtained according to the decoded noise high band parameter and noise low band parameter.
- In the case where the SID includes the low band parameter, before the SID is decoded to obtain the noise low band parameter, the noise high band parameter is generated locally, and the first comfort noise (CN) frame is obtained, the method further includes:
- if the decoder is in a first comfort noise generation (CNG) state, the decoder enters a second CNG state.
- In the case where the SID includes the high band parameter and the low band parameter, before the SID is decoded to obtain the noise high band parameter and the noise low band parameter and the third CN frame is obtained, the method further includes:
- if the decoder is in the second CNG state, the decoder enters the first CNG state.
- Determining whether the SID includes a low band parameter and/or a high band parameter includes: if the number of bits of the SID is less than a preset first threshold, confirming that the SID includes a high band parameter; if the number of bits of the SID is greater than the preset first threshold and less than a preset second threshold, confirming that the SID includes a low band parameter; if the number of bits of the SID is greater than the preset second threshold and less than a preset third threshold, confirming that the SID includes a high band parameter and a low band parameter;
- alternatively, if the SID includes a first identifier, confirming that the SID includes a high band parameter; if the SID includes a second identifier, confirming that the SID includes a low band parameter; if the SID includes a third identifier, confirming that the SID includes a low band parameter and a high band parameter.
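The bit-count rule above can be sketched as a small classifier. The threshold values used in the example call are invented for illustration; the text only requires three preset thresholds in increasing order. Bit counts the text does not cover (for example, a count exactly equal to a threshold) return None here rather than a guess.

```python
def classify_sid_by_bits(num_bits, first_threshold, second_threshold, third_threshold):
    """Return the set of bands carried by a SID, judged from its size in bits.

    Thresholds are assumed to satisfy first < second < third.
    Returns None for bit counts the described rule does not cover.
    """
    if num_bits < first_threshold:
        return {"high"}
    if first_threshold < num_bits < second_threshold:
        return {"low"}
    if second_threshold < num_bits < third_threshold:
        return {"low", "high"}
    return None


# Hypothetical thresholds of 16, 48 and 80 bits.
bands = classify_sid_by_bits(35, 16, 48, 80)
```

The smallest SID maps to the high band only, which is consistent with the text's point that the high band can be described with few bits.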
- Locally generating the noise high band parameter includes:
- Obtaining the weighted average energy of the noise high band signal at the time corresponding to the SID includes:
- Calculating the ratio of the energy of the noise high band signal to the energy of the noise low band signal at the time when the SID containing the high band parameter was last received before the SID, to obtain a first ratio, includes:
- updating the energy of the high band signal of the locally buffered previous CN frame at a first rate;
- otherwise, updating the energy of the high band signal of the locally buffered previous CN frame at a second rate, where the first rate is greater than the second rate.
- Obtaining the weighted average energy of the noise high band signal at the time corresponding to the SID includes:
- Obtaining the synthesis filter coefficients of the noise high band signal at the time corresponding to the SID includes:
- obtaining M ISF (Immittance Spectral Frequency) coefficients, ISP coefficients, LSF (Line Spectral Frequency) coefficients, or LSP (Line Spectral Pair) coefficients distributed in the frequency range corresponding to the high band signal;
- performing randomization processing on the M coefficients, in which each of the M coefficients is gradually moved toward its corresponding target value, the target value being a value within a preset range adjacent to the value of the coefficient, and the target value of each of the M coefficients is changed every N frames, where M and N are natural numbers;
- obtaining the synthesis filter coefficients of the noise high band signal at the time corresponding to the SID according to the randomized filter coefficients.
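The randomization step above can be sketched as follows. The class name, the uniform target distribution, the per-frame step fraction, and the sample coefficient values are all assumptions made for illustration; the text only states that each of the M coefficients drifts toward a target near its own value and that targets change every N frames.

```python
import random


class CoefficientRandomizer:
    """Drift M spectral coefficients toward targets that change every N frames."""

    def __init__(self, base, max_offset, frames_per_target, step, seed=0):
        self.base = list(base)                      # the M nominal coefficients
        self.current = list(base)
        self.targets = list(base)
        self.max_offset = max_offset                # assumed "preset range" half-width
        self.frames_per_target = frames_per_target  # N
        self.step = step                            # fraction of the gap closed per frame
        self.rng = random.Random(seed)
        self.frame_index = 0

    def next_frame(self):
        # Pick fresh targets near the nominal values once every N frames.
        if self.frame_index % self.frames_per_target == 0:
            self.targets = [
                b + self.rng.uniform(-self.max_offset, self.max_offset)
                for b in self.base
            ]
        self.frame_index += 1
        # Move each coefficient part of the way toward its target.
        self.current = [
            c + self.step * (t - c) for c, t in zip(self.current, self.targets)
        ]
        return list(self.current)


randomizer = CoefficientRandomizer([0.55, 0.70, 0.85], 0.02, 4, 0.25)
frames = [randomizer.next_frame() for _ in range(12)]
```

A practical implementation would also need to keep the coefficients in ascending order, since ordered ISF or LSF values are required for a stable synthesis filter; this sketch relies on the offset being small relative to the spacing between coefficients.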
- Obtaining the synthesis filter coefficients of the noise high band signal at the time corresponding to the SID includes:
- obtaining the synthesis filter coefficients of the noise high band signal at the time corresponding to the SID according to the randomized filter coefficients.
- Before the first CN frame is obtained according to the decoded noise low band parameter and the locally generated noise high band parameter, the method further includes:
- multiplying the energy of the noise high band signal of the L frames following the SID by a smoothing coefficient less than 1, to obtain a new weighted average energy of the locally generated noise high band signal;
- obtaining the first CN frame according to the decoded noise low band parameter and the locally generated noise high band parameter includes:
- The decoder acquires a silence insertion descriptor (SID) frame and determines whether the SID includes a low band parameter and/or a high band parameter. If the SID includes the low band parameter, the SID is decoded to obtain a noise low band parameter, a noise high band parameter is generated locally, and a first comfort noise (CN) frame is obtained according to the decoded noise low band parameter and the locally generated noise high band parameter. If the SID includes the high band parameter, the SID is decoded to obtain a noise high band parameter, a noise low band parameter is generated locally, and a second CN frame is obtained according to the decoded noise high band parameter and the locally generated noise low band parameter. If the SID includes the high band parameter and the low band parameter, the SID is decoded to obtain the noise high band parameter and the noise low band parameter, and a third CN frame is obtained according to the decoded noise high band parameter and noise low band parameter.
- a method for processing audio data is provided.
- the low-band or high-band spectrum of CNG noise has usually lost its harmonic structure, so the effect of the CNG high-band signal on auditory perception is determined mainly by its energy rather than its spectral structure. Therefore, when performing DTX transmission on an ultra-wideband signal, in many cases it is not necessary to transmit the high-band signal spectrum in the SID; the high-band spectrum can be constructed locally at the decoder by an appropriate method, and a locally constructed high-band spectrum does not cause significant perceptual distortion. In this way, the computation and the bits that the encoder would otherwise spend on the high-band spectrum are saved.
- the method for processing audio data includes: analysis and classification of the noise high-band spectrum; blind construction of the high-band signal spectrum by the decoder; estimation of the high-band signal energy by the decoder when the SID does not include a high-band energy parameter; switching of the decoder between different CNG modules; and so on.
- the specific processing method of the audio data provided at the encoder end in this embodiment includes:
- the encoder obtains a noise frame of the audio signal, and decomposes the noise frame into a noise low band signal and a noise high band signal.
- the noise frame obtained by the encoder may be the current noise frame or a noise frame buffered at the encoder side, which is not specifically limited in this embodiment.
- an ultra-wideband input audio signal sampled at 32 kHz is taken as an example.
- the encoder first divides the input audio signal into frames of 20 ms (that is, 640 samples).
- for the current frame (in this embodiment, the current frame is the frame currently to be encoded), the encoder first performs high-pass filtering with a passband above 50 Hz.
- the high-pass filtered current frame is decomposed by a QMF (Quadrature Mirror Filter) analysis filter into a low-band signal $s_0$ and a high-band signal $s_1$.
- the low-band signal $s_0$ is sampled at 16 kHz and represents the 0 to 8 kHz spectrum of the current frame.
- the high-band signal $s_1$ is also sampled at 16 kHz and represents the 8 to 16 kHz spectrum of the current frame.
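The framing arithmetic above can be checked with a short sketch. It only covers the frame size and the per-band sample count implied by the 32 kHz rate and the 16 kHz bands; the function names are illustrative and the high-pass and QMF filters themselves are not modelled.

```python
SAMPLE_RATE_HZ = 32_000  # from the text: ultra-wideband input sampled at 32 kHz
FRAME_MS = 20            # from the text: 20 ms frames


def frame_length(sample_rate_hz: int = SAMPLE_RATE_HZ, frame_ms: int = FRAME_MS) -> int:
    """Number of samples in one frame."""
    return sample_rate_hz * frame_ms // 1000


def samples_per_band(frame_len: int) -> int:
    """Each of the two bands is sampled at half the input rate (16 kHz here)."""
    return frame_len // 2


def split_into_frames(samples, frame_len=None):
    """Cut a sample sequence into consecutive full frames; a trailing partial frame is dropped."""
    if frame_len is None:
        frame_len = frame_length()
    n_full = len(samples) // frame_len
    return [samples[k * frame_len:(k + 1) * frame_len] for k in range(n_full)]
```

With the values from the text, one frame holds 640 input samples and each band holds 320 samples per frame.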
- when the VAD (Voice Activity Detector) indicates that the current frame is a foreground signal frame, that is, a speech signal frame, the encoder performs speech coding on the current frame.
- the encoding of speech frames by the encoder belongs to the prior art and is not described in this embodiment.
- when the VAD indicates that the current frame is a noise frame, the encoder enters the DTX operating state.
- a noise frame refers to either a background noise frame or a silence frame.
- according to the SID transmission policy, the DTX controller determines whether the low-band signal of the current frame is to be encoded into an SID and transmitted.
- the SID transmission policy for the low-band signal in this embodiment is similar to the prior art and is not described in detail here.
- determining whether the high-band signal of the current noise frame satisfies the preset encoding and transmission condition includes: determining whether the noise high-band signal has a preset spectral structure; if so, and if the second SID transmission policy is satisfied, encoding the SID of the noise high-band signal with the second SID encoding strategy and transmitting it; if not, determining that the noise high-band signal does not need to be encoded and transmitted.
- determining whether the noise high-band signal has a preset spectral structure includes: obtaining the spectrum of the noise high-band signal and dividing it into at least two sub-bands; if the average energy of any first sub-band is not less than the average energy of a second sub-band, where the second sub-band lies in a higher frequency band than the first sub-band, the noise high-band signal is determined to have no preset spectral structure; otherwise, the noise high-band signal has a preset spectral structure.
- the encoder performs spectral analysis on the high-band signal $s_1$ of the current noise frame to determine whether $s_1$ has a pronounced spectral structure, that is, a preset spectral structure.
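The sub-band test can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the input is assumed to be a power spectrum given as a list of non-negative values, the sub-bands are assumed to be equal-width, and the condition is read as "every lower sub-band is at least as energetic as every higher one", which is the same as a non-increasing sequence of sub-band averages. The function names and the default of 4 sub-bands are hypothetical.

```python
def subband_average_energies(power_spectrum, num_subbands):
    """Average energy of each of `num_subbands` contiguous, roughly equal-width sub-bands."""
    n = len(power_spectrum)
    if num_subbands < 2 or num_subbands > n:
        raise ValueError("need at least 2 sub-bands and at least one bin per sub-band")
    edges = [k * n // num_subbands for k in range(num_subbands + 1)]
    return [sum(power_spectrum[a:b]) / (b - a) for a, b in zip(edges, edges[1:])]


def has_preset_spectral_structure(power_spectrum, num_subbands=4):
    """Return False when sub-band energy never rises with frequency, True otherwise.

    Assumed reading of the text: a monotonically non-increasing energy profile
    means "no preset spectral structure".
    """
    energies = subband_average_energies(power_spectrum, num_subbands)
    non_increasing = all(lo >= hi for lo, hi in zip(energies, energies[1:]))
    return not non_increasing
```

Under this reading, a smoothly decaying high-band spectrum needs no high-band SID, while a spectrum with a bump at higher frequencies does.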
- determining whether the high-band signal of the current noise frame satisfies a preset encoding and transmission condition includes: generating a deviation value from a first ratio and a second ratio, where the first ratio is the ratio of the energy of the noise high-band signal of the noise frame to the energy of its noise low-band signal, and the second ratio is the ratio of the energy of the noise high-band signal to the energy of the noise low-band signal at the time corresponding to the most recently transmitted SID containing noise high-band parameters before the noise frame; determining whether the deviation value reaches a preset threshold; if so, encoding the SID of the noise high-band signal with the second SID encoding strategy and transmitting it; if not, determining that the noise high-band signal does not need to be encoded and transmitted.
- the first ratio being the ratio of the energy of the noise high-band signal of the noise frame to the energy of the noise low-band signal includes: the first ratio is the ratio of the instantaneous energy of the noise high-band signal of the noise frame to the instantaneous energy of its noise low-band signal.
- the second ratio being the ratio of the energy of the noise high-band signal to the energy of the noise low-band signal at the time corresponding to the most recently transmitted SID containing noise high-band parameters before the noise frame includes: the second ratio is the ratio of the instantaneous energy of the noise high-band signal to the instantaneous energy of the noise low-band signal at that time; or, the first ratio being the ratio of the energy of the noise high-band signal of the noise frame to the energy of the noise low-band signal includes: the first ratio is the ratio of the weighted average energy of the noise high-band signal over the noise frame and the noise frames before it to ...
- generating the deviation value from the first ratio and the second ratio includes: calculating the logarithm of the first ratio and the logarithm of the second ratio separately; and taking the absolute value of the difference between the two logarithms as the deviation value.
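The deviation computation described above is small enough to sketch directly. The logarithm base is not stated in this passage, so base 10 is an assumption here, as are the function names and the use of "greater than or equal" for "reaches the threshold".

```python
import math


def deviation_value(e_high_now, e_low_now, e_high_ref, e_low_ref):
    """|log(first ratio) - log(second ratio)|.

    first ratio  : high/low energy of the current noise frame
    second ratio : high/low energy at the last SID that carried high-band parameters
    Base-10 logarithm is an assumption; all energies must be positive.
    """
    first_ratio = e_high_now / e_low_now
    second_ratio = e_high_ref / e_low_ref
    return abs(math.log10(first_ratio) - math.log10(second_ratio))


def should_send_highband_sid(deviation, threshold):
    """True when the deviation reaches the preset threshold (threshold value is application-specific)."""
    return deviation >= threshold
```

Because the deviation is taken in the log domain, a tenfold change in the high-to-low energy ratio yields a deviation of 1 regardless of the absolute signal level.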
- determining whether the deviation value reaches a preset threshold may be implemented as follows: in the DTX operating state, the encoder calculates the logarithmic energies $e_1$ and $e_0$ of the high-band signal $s_1$ and the low-band signal $s_0$ of the current frame, respectively.
- the long-term moving average $e_{xa}$ is initialized to $e_x$ of the current frame.
- the long-term moving average is one kind of weighted average calculation, which is not specifically limited in this embodiment.
- determining whether the deviation value reaches a preset threshold may serve as a second determination condition. In a specific implementation, satisfying either the first determination condition or the second determination condition is sufficient to confirm that the noise high-band signal needs to be encoded and transmitted, which is not specifically limited in this embodiment.
- the second determination condition is optional. Its purpose is to help the decoder estimate the noise high-band energy locally from the noise low-band energy and the ratio of the noise high-band energy to the noise low-band energy at the time the most recent SID containing high-band parameters was received. Specifically, if the deviation value is not calculated at the encoder, the decoder may take, among the speech frames in a period before the current noise frame, the speech frame whose high-band signal has the lowest energy,
- and estimate the current noise high-band energy locally from the high-band signal energy of that frame, for example by using that energy as the current noise high-band signal energy.
- from this weighted average energy, the weighted average energy of the noise high-band signal at the time corresponding to the SID is obtained.
- the specific implementation is not limited here.
- transform $isp_{SID}(i)$ into the ISF coefficients $isf_{SID}(i)$, quantize $isf_{SID}(i)$ to obtain a set of quantization indices $idx_{ISF}$, and encapsulate them into the SID.
- update the long-term moving average of the decoded ISP coefficients with the buffered $isp'(i)$:
- where $\alpha = 0.9$.
- $isp_a(i)$ is initialized to $isp'(i)$ of the first SID.
- transform $isp_a(i)$ into the LPC coefficients $lpc_a(i)$ to obtain the analysis filter $A(Z)$.
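The update formula itself did not survive in this passage; only the factor 0.9 and the initialization rule did. The sketch below therefore assumes the common first-order recursion avg = a * avg + (1 - a) * current, which is an assumption and not confirmed by the text. The function name is illustrative.

```python
ALPHA = 0.9  # the only constant stated in the text for this update


def update_long_term_average(avg, current, alpha=ALPHA):
    """One step of an assumed first-order (exponential) moving average over coefficient vectors.

    `avg` is None before the first SID, in which case the average is
    initialized to the current coefficients, as the text describes.
    """
    if avg is None:
        return list(current)
    if len(avg) != len(current):
        raise ValueError("coefficient vectors must have the same length")
    return [alpha * a + (1.0 - alpha) * c for a, c in zip(avg, current)]
```

With a factor of 0.9 the average moves one tenth of the way toward the newest coefficients on each update, so isolated outlier frames have little effect on the filter.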
- the cache is in this embodiment.
- the flag SID of the current noise frame is 1, according to the cached M history including the current noise frame
- the calculated weighted average logarithm energy e SID of the frame, e slD ⁇ j ⁇ 1.5, where Wl (k) is a set of M-dimensional positive coefficients,
- the quantization index $idx_e$ is obtained by quantizing $e_{SID}$.
- the encoding and transmission strategy for the noise low-band signal is similar to the prior-art strategy for noise wideband signals. It is only introduced briefly here, and the specific implementation is not described in detail in this embodiment.
- when the noise high-band signal of the current noise frame does not need to be encoded, only the noise low-band signal is encoded, which saves computation at the encoder and also saves transmission bits.
- the first discontinuous transmission mechanism encodes and transmits the noise low-band signal,
- and the second discontinuous transmission mechanism encodes and transmits the noise high-band signal.
- in this case the SID needs to encode the high-band parameters in addition to the low-band parameters.
- the encoding of the low-band parameters is the same as in step 303 and is not described again in this embodiment.
- $lsp_a(i)$ is quantized to obtain a set of quantization indices $idx_{LSP}$.
- the long-term moving average $e_{1a}$ of the high-band logarithmic energy at the encoder is quantized to obtain a quantization index $idx_E$.
- the SID is composed of $idx_{ISF}$, $idx_e$, $idx_{LSP}$ and $idx_E$; in this embodiment, an SID composed of these four indices is called a large SID.
- the encoding strategy for the noise high-band signal is similar in principle to the encoding strategy for the low-band signal. It is only introduced briefly here, and the specific implementation is not described in detail in this embodiment.
- in this embodiment, when the encoding and transmission condition of the noise high-band signal is satisfied, the noise high-band signal is always encoded and transmitted together with the noise low-band signal. In this case, the sending condition in the transmission policy of the second SID of the second discontinuous transmission mechanism further includes: the first discontinuous transmission mechanism satisfies the sending condition of the first SID.
- optionally, the noise high-band signal and the noise low-band signal may also be encoded and transmitted at different times, so that there are three possible cases when an SID is transmitted: 1) only the low-band signal of the current noise frame is encoded and transmitted; 2) only the high-band signal of the current noise frame is encoded and transmitted; 3) the low-band and high-band signals of the current noise frame are encoded and transmitted at the same time.
- which of the three cases is used when transmitting the SID is not specifically limited in this embodiment.
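The three transmission cases, plus the case where nothing is sent, can be written as a small decision function. This is a sketch: the two boolean inputs stand for "the first SID sending condition is met" and "the second SID sending condition is met", and the `high_requires_low` flag models the variant in which the second sending condition additionally requires the first. All names are hypothetical.

```python
from enum import Enum


class SidContent(Enum):
    NONE = "no SID"
    LOW_ONLY = "low-band parameters only"
    HIGH_ONLY = "high-band parameters only"
    LOW_AND_HIGH = "low-band and high-band parameters"


def choose_sid_content(low_due: bool, high_due: bool, high_requires_low: bool = False) -> SidContent:
    """Decide what the SID for the current noise frame carries."""
    if high_requires_low:
        # Variant from the text: high-band data is only ever sent together with low-band data.
        high_due = high_due and low_due
    if low_due and high_due:
        return SidContent.LOW_AND_HIGH
    if low_due:
        return SidContent.LOW_ONLY
    if high_due:
        return SidContent.HIGH_ONLY
    return SidContent.NONE
```

Keeping the two conditions as independent inputs mirrors the idea that the two discontinuous transmission mechanisms may follow different policies.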
- steps 302-304 are the specific steps of encoding and transmitting the noise low-band signal with a first discontinuous transmission mechanism and encoding and transmitting the noise high-band signal with a second discontinuous transmission mechanism, where the transmission policy of the first silence insertion descriptor (SID) frame of the first discontinuous transmission mechanism differs from the transmission policy of the second SID of the second discontinuous transmission mechanism, or the encoding strategy of the first SID of the first discontinuous transmission mechanism differs from the encoding strategy of the second SID of the second discontinuous transmission mechanism.
- the beneficial effects of the method provided by the present invention are as follows: the current noise frame of the audio signal is acquired and decomposed into a noise low-band signal and a noise high-band signal; the noise low-band signal is encoded and transmitted with the first discontinuous transmission mechanism, and the noise high-band signal is encoded and transmitted with the second discontinuous transmission mechanism. By processing the high-band and low-band signals differently, computational complexity and coding bits can be saved without reducing the subjective quality of the codec.
- the saved bits can be used to reduce the transmission bandwidth or to improve the overall coding quality, thereby solving the problem of encoding and transmitting ultra-wideband signals.
- a method for processing audio data is provided.
- from the received code stream, the decoder can determine whether the current frame is a speech-coded frame, an SID, or a NO_DATA frame.
- a NO_DATA frame is a frame within a noise period for which the encoder did not encode and transmit an SID.
- the decoder may further determine, from the number of bits of the SID, whether the SID includes low-band and/or high-band parameters.
- the decoder may also determine whether the SID includes low-band and/or high-band parameters from a specific identifier written into the SID, which requires adding extra identification bits when the SID is encoded. For example, a first identifier written into the SID indicates that the SID contains only high-band parameters, a second identifier indicates that the SID contains only low-band parameters, and a third identifier indicates that the SID contains both high-band and low-band parameters.
- if the current frame is a speech-coded frame, the decoder performs speech frame decoding.
- the specific processing is similar to the prior art and is not described in detail in this embodiment.
- the decoder selects the corresponding method for reconstructing the CN frame according to the current working state of the CNG.
- the CNG has two working states: the half-decoding CNG state, which corresponds to small SID frames and is the first CNG state, and the full-decoding CNG state, which corresponds to large SID frames and is the second CNG state.
- in the second CNG state, the decoder reconstructs the CN frame from the noise high-band and low-band parameters obtained by decoding the large SID frame.
- in the first CNG state, the decoder reconstructs the CN frame from the noise low-band parameters obtained by decoding the small SID frame and the locally estimated noise high-band parameters.
- the decoder acquires an SID. If the SID includes the high-band and low-band parameters, the decoder decodes the SID to obtain noise high-band and noise low-band parameters, and obtains a third CN frame from the decoded noise high-band and noise low-band parameters.
- the decoder first determines the type of the frame, so that a different decoding mode is used for each type. Specifically, if the number of bits of the SID is less than a preset first threshold, the SID is confirmed to include high-band parameters; if the number of bits is greater than the first threshold and less than a preset second threshold, the SID is confirmed to include low-band parameters; if the number of bits is greater than the second threshold and less than a preset third threshold, the SID is confirmed to include both high-band and low-band parameters. Alternatively, if the SID includes the first identifier, the SID is confirmed to include high-band parameters; if it includes the second identifier, the SID is confirmed to include low-band parameters; and if it includes the third identifier, the SID is confirmed to include both low-band and high-band parameters.
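The bit-count rule can be sketched as a simple range check. The thresholds are parameters because the text gives no numeric values; the function name and the returned labels are hypothetical, and the behaviour for bit counts that fall exactly on a threshold or above the third threshold is not specified in the text, so the sketch returns None there.

```python
def classify_sid_by_bits(num_bits, first_threshold, second_threshold, third_threshold):
    """Classify an SID by its size, following the three ranges in the text."""
    if not (first_threshold < second_threshold < third_threshold):
        raise ValueError("thresholds must be strictly increasing")
    if num_bits < first_threshold:
        return "high"        # only high-band parameters
    if first_threshold < num_bits < second_threshold:
        return "low"         # only low-band parameters
    if second_threshold < num_bits < third_threshold:
        return "low+high"    # both parameter sets (a large SID)
    return None              # not covered by the rule as stated
```

The alternative identifier-based rule would replace the range check with a lookup on the identification bits written by the encoder.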
- if the SID includes the high-band and low-band parameters, the SID is decoded to obtain noise high-band and noise low-band parameters, and the third CN frame is obtained from the decoded noise high-band and noise low-band parameters.
- the decoder decodes the SID to obtain the decoded low-band excitation logarithmic energy $e_D$, the low-band ISF coefficients $isf_d(i)$, the high-band logarithmic energy $E_D$, and the high-band LSP coefficients $lsp_d(i)$.
- transform $isf_d(i)$ into the ISP coefficients $isp_d(i)$, and convert the logarithmic energies $e_D$ and $E_D$ into the energies $e_d$ and $E_d$.
- the decoder passes $s'_0$ and $s'_1$ through the QMF synthesis filter to obtain the first CN frame, sampled at 32 kHz, that is finally reconstructed by the decoder.
- if the SID includes the low-band parameters, the SID is decoded to obtain noise low-band parameters, noise high-band parameters are generated locally, and the first CN frame is obtained from the decoded noise low-band parameters and the locally generated noise high-band parameters.
- the high-band signal of the first CN frame is still obtained by exciting the synthesis filter with white noise, except that the high-band signal energy and the synthesis filter coefficients of the first CN frame are obtained by local estimation.
- generating the noise high-band parameters locally includes: obtaining the weighted average energy of the noise high-band signal and the synthesis filter coefficients of the noise high-band signal at the time corresponding to the SID;
- the noise high-band signal is obtained from the weighted average energy of the noise high-band signal at the time corresponding to the SID and the synthesis filter coefficients of the noise high-band signal.
- in this embodiment, obtaining the weighted average energy of the noise high-band signal at the time corresponding to the SID includes: obtaining the energy of the low-band signal of the first CN frame from the decoded noise low-band parameters; calculating the ratio of the energy of the noise high-band signal to the energy of the noise low-band signal at the time the SID containing high-band parameters was received before this SID, to obtain a first ratio; obtaining the energy of the noise high-band signal at the time corresponding to the SID from the low-band signal energy of the first CN frame and the first ratio; and weighting the energy of the noise high-band signal at the time corresponding to the SID with the high-band signal energy of the locally buffered CN frames to obtain the weighted average energy of the noise high-band signal at the time corresponding to the SID, where this weighted average energy is the high-band signal energy of the first CN frame.
- calculating the ratio of the energy of the noise high-band signal to the energy of the noise low-band signal at the time the SID containing high-band parameters was received before this SID, to obtain the first ratio, includes: calculating the ratio of the instantaneous energy of the noise high-band signal to the instantaneous energy of the noise low-band signal at that time to obtain the first ratio; or, calculating the ratio of the weighted average energy of the noise high-band signal to the weighted average energy of the noise low-band signal at that time to obtain the first ratio.
- the instantaneous energy is the energy obtained by decoding. When the energy of the high-band signal at the time corresponding to the SID is greater than the high-band signal energy of the locally buffered previous CN frame, the high-band signal energy of the locally buffered previous CN frame is updated at a first rate; otherwise it is updated at a second rate, the first rate being greater than the second rate.
- the energy $E_0$ of the low-band signal $s'_0$ of the first CN frame is obtained from the decoded noise low-band parameters.
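The local energy estimate described above can be sketched in two steps: scale the decoded low-band energy by the remembered high-to-low ratio, then blend the result into the buffered high-band energy with an update rate that is larger when the energy rises than when it falls. The first-order update form and the two rate values (0.5 and 0.1) are assumptions chosen only for illustration; the text states only that the first rate exceeds the second. Function names are hypothetical.

```python
def estimate_highband_energy(low_energy, ref_high_energy, ref_low_energy):
    """High-band energy implied by the current low-band energy and the first ratio.

    The first ratio is the high/low energy ratio remembered from the last SID
    that carried high-band parameters.
    """
    first_ratio = ref_high_energy / ref_low_energy
    return low_energy * first_ratio


def update_buffered_energy(buffered, estimate, rate_up=0.5, rate_down=0.1):
    """Move the buffered CN high-band energy toward the new estimate.

    rate_up is used when the estimate exceeds the buffered value, rate_down
    otherwise; rate_up > rate_down as required by the text. Both values and the
    first-order update form are illustrative assumptions.
    """
    if not rate_up > rate_down:
        raise ValueError("the first (upward) rate must exceed the second rate")
    rate = rate_up if estimate > buffered else rate_down
    return buffered + rate * (estimate - buffered)
```

The updated buffered value plays the role of the weighted average energy that is used as the high-band signal energy of the first CN frame.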
- optionally, obtaining the weighted average energy of the noise high-band signal at the time corresponding to the SID includes: selecting, among the speech frames within a preset period before the SID, the speech frame whose high-band signal has the lowest energy, and obtaining the weighted average energy of the noise high-band signal at the time corresponding to the SID from the high-band signal energy of that speech frame; or, selecting, among the speech frames within a preset period before the SID, the N speech frames whose high-band signal energy is less than a preset threshold, and obtaining the weighted average energy of the noise high-band signal at the time corresponding to the SID from the weighted average energy of the high-band signals of those N speech frames; where the weighted average energy of the noise high-band signal at the time corresponding to the SID is the high-band signal energy of the first CN frame.
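Both selection rules can be sketched over a list of high-band energies of recent speech frames. Using the minimum directly as the estimate and using uniform weights in the second rule are assumptions; the text does not fix the weights. Function names are hypothetical.

```python
def min_energy_estimate(speech_high_energies):
    """Rule 1: use the lowest high-band energy among the recent speech frames."""
    if not speech_high_energies:
        raise ValueError("need at least one speech frame")
    return min(speech_high_energies)


def low_energy_average(speech_high_energies, threshold):
    """Rule 2: average the high-band energies that fall below a preset threshold.

    Uniform weights are an assumption; returns None when no frame qualifies.
    """
    selected = [e for e in speech_high_energies if e < threshold]
    if not selected:
        return None
    return sum(selected) / len(selected)
```

The idea behind both rules is that the quietest moments of recent speech approximate the background noise floor in the high band.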
- obtaining the synthesis filter coefficients of the noise high-band signal at the time corresponding to the SID includes: distributing M immittance spectral frequency (ISF) coefficients, immittance spectral pair (ISP) coefficients, line spectral frequency (LSF) coefficients or line spectral pair (LSP) coefficients in the frequency range corresponding to the high-band signal; randomizing the M coefficients, where the randomization makes each of the M coefficients gradually approach its own target value, the target value being a value within a preset range around the coefficient value; the target value of each of the M coefficients changes every N frames, where N may be a variable;
- the synthesis filter coefficients of the noise high-band signal at the time corresponding to the SID are obtained from the randomized filter coefficients.
- RND in equation (15) is a set of 9-dimensional random number sequences whose elements are all different and all lie in [-1, 1].
- the 10 in $\mathrm{mod}(cnt, 10)$ used when calculating $R_t(i)$ can also be a variable $N$, such as:
- $N = 10 + 5 \cdot RND, \quad \text{if } \mathrm{mod}(cnt, N^{(-1)}) = 0$
- where RND is a random number in the range [-1, 1], which is not specifically limited in this embodiment.
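The randomization can be illustrated with a sketch in which each coefficient drifts toward a target that is redrawn every fixed number of frames. Equation (15) is not recoverable from this passage, so the drift rule, the `spread`, `step` and `period` parameters, and the function name are all assumptions; only the two stated properties (gradual approach to a nearby target, target changed every N frames) are taken from the text.

```python
import random


def randomize_coefficients(base, num_frames, spread=0.02, step=0.1, period=10, seed=0):
    """Return one coefficient vector per frame.

    base   : nominal coefficients distributed over the high band
    spread : half-width of the range around each base value in which targets are drawn
    step   : fraction of the remaining distance covered per frame (gradual approach)
    period : number of frames between target changes (10 mirrors mod(cnt, 10))
    """
    rng = random.Random(seed)
    current = list(base)
    targets = list(base)
    history = []
    for cnt in range(num_frames):
        if cnt % period == 0:
            # Draw a new target near each base value, using RND in [-1, 1].
            targets = [b + spread * rng.uniform(-1.0, 1.0) for b in base]
        current = [c + step * (t - c) for c, t in zip(current, targets)]
        history.append(list(current))
    return history
```

Because every step is a convex combination of the current value and a target inside the allowed range, the coefficients never leave that range, which keeps the synthesized spectrum close to the nominal one while avoiding a static, artificial sound.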
- the randomized coefficients are transformed into the LPC coefficients $lpc(i)$, $i = 0, 1, \ldots, 9$.
- multiply $lpc(i)$ by a set of 10-dimensional weighting coefficients $W(i) = \{0.6699, 0.5862, 0.5129, 0.4488, 0.3927, 0.3436, 0.3007, 0.2631, 0.2302, 0.2014\}$ to obtain the weighted LPC coefficients $lpc_w(i)$, which give the estimated synthesis filter $1/A(Z)$.
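The weighting step is an element-wise product, sketched below with the table from the text. As an observation rather than a statement from the source, the listed values agree to four decimal places with $0.875^{\,i+3}$ for $i = 0, \ldots, 9$, which is the shape of a bandwidth-expansion weighting. The function name is illustrative.

```python
# Weighting coefficients W(i), i = 0..9, copied from the text.
W = [0.6699, 0.5862, 0.5129, 0.4488, 0.3927, 0.3436, 0.3007, 0.2631, 0.2302, 0.2014]


def weight_lpc(lpc):
    """Multiply each LPC coefficient lpc(i) by W(i)."""
    if len(lpc) != len(W):
        raise ValueError("expected 10 LPC coefficients")
    return [a * w for a, w in zip(lpc, W)]
```

Since every weight is below 1 and the weights decrease with the index, the weighted filter has a flatter spectral envelope than the unweighted one.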
- obtaining the synthesis filter coefficients of the noise high-band signal at the time corresponding to the SID includes: acquiring the M ISF, ISP, LSF or LSP coefficients of the locally buffered noise high-band signal; randomizing the M coefficients, where the randomization makes each coefficient gradually approach its own target value, the target value being a value within a preset range around the coefficient value; the target value of each coefficient changes every N frames; and obtaining the synthesis filter coefficients of the noise high-band signal at the time corresponding to the SID from the randomized filter coefficients.
- this is not specifically limited in this embodiment.
- after $s'_0$ is obtained, the first CN frame, sampled at 32 kHz and finally reconstructed by the decoder, is obtained through the QMF synthesis filter.
- the locally generated noise high-band parameters may also be optimized before the first CN frame is obtained from the decoded noise low-band parameters and the locally generated noise high-band parameters.
- the specific optimization step includes: when the history frame adjacent to the SID is a speech-coded frame, if the average energy of the high-band signal, or of a part of the high-band signal, decoded from that speech-coded frame is less than the average energy of the locally generated noise high-band signal, or of the corresponding part of it, the noise high-band signal of the L frames following the SID is multiplied by a smoothing coefficient less than 1 to obtain a new weighted average energy of the locally generated noise high-band signal; correspondingly, obtaining the first CN frame from the decoded noise low-band parameters and the locally generated noise high-band parameters includes: according to the decoded noise low-band parameters and the synthesis filter coefficients of the noise high-band signal at the time corresponding to ...
- the high-band signal energy of the current SID and of the following frames (50 frames in this embodiment) needs to be smoothed.
- the specific smoothing method is: multiply $s'_1$ of the current frame by the gain $G_s$ to obtain the smoothed $s'_{1s}$, where
- This smoothing process only takes up to 50 frames, and if the period occurs - greater than E, the smoothing process is terminated.
- - and may also only represent the energy of a part of the frame, which is not specifically limited in this embodiment.
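The smoothing step can be sketched as a per-frame gain applied for a bounded number of frames. The gain formula and the early-termination condition are not recoverable from this passage, so the sketch takes a fixed gain below 1 as a parameter and models only the 50-frame limit; both choices, and the function name, are assumptions.

```python
def smooth_highband(frames, gain, max_frames=50):
    """Scale the high-band signal of the first `max_frames` frames by `gain` (< 1).

    frames     : list of per-frame high-band sample lists, starting at the SID
    gain       : smoothing coefficient, assumed constant here
    max_frames : smoothing lasts at most this many frames (50 in the embodiment)
    """
    if not 0.0 < gain < 1.0:
        raise ValueError("the smoothing coefficient must lie strictly between 0 and 1")
    smoothed = []
    for index, frame in enumerate(frames):
        g = gain if index < max_frames else 1.0
        smoothed.append([g * x for x in frame])
    return smoothed
```

Attenuating the locally generated high band right after speech avoids a sudden jump in high-band noise energy when the preceding speech frames were quieter than the noise estimate.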
- $s'_0$ and $s'_1$ (or $s'_{1s}$) are passed through the QMF synthesis filter to obtain the final CN frame, sampled at 32 kHz, reconstructed by the decoder.
- if the SID includes the high-band parameters, the SID is decoded to obtain noise high-band parameters, noise low-band parameters are generated locally, and the second CN frame is obtained from the decoded noise high-band parameters and the locally generated noise low-band parameters.
- the method for decoding the high-band parameters is the same as in step 401 and is not described again in this embodiment; the method for locally generating the low-band parameters is the same as the prior-art method for locally generating wideband parameters, and the details are not described in this embodiment.
- the decoder acquires a silence insertion descriptor (SID) frame and determines whether the SID includes low-band parameters and/or high-band parameters. If the SID includes the low-band parameters, the decoder decodes the SID to obtain noise low-band parameters, locally generates noise high-band parameters, and obtains a first comfort noise (CN) frame from the decoded noise low-band parameters and the locally generated noise high-band parameters. If the SID includes the high-band parameters, the decoder decodes the SID to obtain noise high-band parameters, locally generates noise low-band parameters, and obtains a second CN frame from the decoded noise high-band parameters and the locally generated noise low-band parameters. If the SID includes both the high-band and the low-band parameters, the decoder decodes the SID to obtain noise high-band and noise low-band parameters, and obtains a third CN frame from them.
- in this way, computational complexity and coding bits can be saved without reducing the subjective quality of the codec, and the saved bits can be used to reduce the transmission bandwidth or to improve the overall coding quality, thereby solving the problem of encoding and transmitting ultra-wideband signals.
- the locally generated noise high-band parameters may also be optimized, so that better comfort noise is obtained, which further improves the performance of the decoder.
- a method for processing audio data is provided.
- the encoder acquires a noise frame of an audio signal and decomposes the noise frame into a noise low-band signal and a noise high-band signal.
- determining whether the high-band signal of the noise frame satisfies a preset encoding and transmission condition includes: determining whether the spectral structure of the noise high-band signal of the noise frame, compared with the average spectral structure of the noise high-band signals before the noise frame, satisfies a preset condition; if so, the SID of the noise high-band signal of the noise frame is encoded with the second encoding strategy and transmitted; if not, the noise high-band signal of the noise frame does not need to be encoded and transmitted.
- the average spectral structure of the noise high-band signals before the noise frame includes: a weighted average of the spectra of the noise high-band signals before the noise frame.
- the second determination condition may also be used to determine whether the noise high-band signal needs to be encoded and transmitted, which is not specifically limited in this embodiment.
- the LSP, LSF, ISF and ISP coefficients are merely representations in different domains; all of them represent the synthesis filter coefficients, which is not specifically limited in this embodiment. The moving average is updated with $lsp(i)$,
- where $lsp_a(i)$ is the long-term moving average of $lsp(i)$.
- obtaining the synthesis filter coefficients of the noise high-band signal at the time corresponding to the SID includes: acquiring the M ISF, ISP, LSF or LSP coefficients of the locally buffered noise high-band signal; randomizing the M coefficients, where the randomization makes each of the M coefficients gradually approach its own target value, the target value being a value within a preset range around the coefficient value; the target value of each of the M coefficients changes every N frames; and obtaining the synthesis filter coefficients of the noise high-band signal at the time corresponding to the SID from the randomized filter coefficients.
- specifically, the noise high-band synthesis filter coefficients at the time corresponding to the SID are obtained as follows.
- $lsp'(i)$ is randomized in the same manner as in the fourth embodiment to obtain $lsp_1(i)$.
- $lsp_1(i)$ is transformed into the LPC coefficients $lpc_1(i)$ and weighted by $w(i)$ in the same manner as in the fourth embodiment to obtain the synthesis filter $1/A(Z)$.
- when the current frame is an SID, the obtained $lsp_1(i)$ is not used to update the long-term moving average of the LSP coefficients of the high-band signal of the CN frames buffered by the decoder.
- when the encoder quantizes the long-term moving average $e_{1a}$ of the high-band logarithmic energy, this embodiment first attenuates $e_{1a}$ by a certain amount (that is, subtracts a certain value)
- and then quantizes it, so the decoder no longer needs to multiply s(i) by G2 or G4 as in Embodiment 4.
- the other steps of the decoding end in this embodiment are similar to the steps in the foregoing embodiment, and are not described in detail in this embodiment.
- the method provided by the present invention has the following beneficial effects: a current noise frame of the audio signal is acquired and decomposed into a noise low band signal and a noise high band signal; the noise low band signal is encoded and transmitted by using a first discontinuous transmission mechanism, and the noise high band signal is encoded and transmitted by using a second discontinuous transmission mechanism. The decoder acquires a silence insertion descriptor (SID) frame and determines whether the SID includes a low band parameter and/or a high band parameter. If the SID includes the low band parameter, the decoder decodes the SID to obtain a noise low band parameter, locally generates a noise high band parameter, and obtains a first comfort noise (CN) frame according to the decoded noise low band parameter and the locally generated noise high band parameter. If the SID includes the high band parameter, the decoder decodes the SID to obtain a noise high band parameter, locally generates a noise low band parameter, and obtains a second CN frame according to the decoded noise high band parameter and the locally generated noise low band parameter. If the SID includes both the high band parameter and the low band parameter, the decoder decodes the SID to obtain a noise high band parameter and a noise low band parameter, and obtains a third CN frame according to the decoded noise high band parameter and noise low band parameter.
- an apparatus for encoding audio data includes: an obtaining module 501, and a transmission module 502.
- the obtaining module 501 is configured to acquire a noise frame of the audio signal, and decompose the noise frame into a noise low band signal and a noise high band signal;
- the transmission module 502 is configured to encode and transmit the noise low band signal by using a first discontinuous transmission mechanism, and to encode and transmit the noise high band signal by using a second discontinuous transmission mechanism, where the transmission policy of the first silence insertion descriptor (SID) frame of the first discontinuous transmission mechanism is different from the transmission policy of the second SID of the second discontinuous transmission mechanism, or the coding strategy of the first SID of the first discontinuous transmission mechanism is different from the coding strategy of the second SID of the second discontinuous transmission mechanism.
- the first SID includes a low band parameter of the noise frame
- the second SID includes a low band parameter and/or a high band parameter of the noise frame.
- the transmission module 502 includes:
- the first transmission unit 502a is configured to determine whether the noise high band signal has a preset spectral structure; if it does, and the transmission condition in the second SID transmission policy is satisfied, encode the SID of the noise high band signal with the second SID coding strategy and transmit it; if not, determine that the noise high band signal does not need to be encoded.
- the first transmission unit 502a includes:
- a determining subunit configured to obtain the spectrum of the noise high band signal and divide the spectrum into at least two subbands; if the average energy of any first subband is not less than the average energy of a second subband, where the second subband lies in a higher frequency band than the first subband, confirm that the noise high band signal does not have the preset spectral structure; otherwise, confirm that the noise high band signal has the preset spectral structure.
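The subband test performed by the determining subunit can be illustrated with a short sketch. It follows the rule exactly as stated in the text (the structure is absent as soon as a lower subband has at least as much average energy as a higher one). The function name, the equal-width subband split, and the input format (a list of per-bin energies ordered from low to high frequency) are assumptions for illustration only.

```python
def has_preset_spectral_structure(bin_energies, num_subbands=2):
    """Apply the subband comparison described in the text.

    bin_energies: per-bin spectral energies, ordered low to high frequency
                  (hypothetical input format).
    num_subbands: number of contiguous subbands, at least two.
    """
    n = len(bin_energies)
    if num_subbands < 2 or n < num_subbands:
        raise ValueError("need at least two non-empty subbands")
    # Equal-width split (an assumption; the text does not say how to divide).
    bounds = [k * n // num_subbands for k in range(num_subbands + 1)]
    averages = [
        sum(bin_energies[bounds[k]:bounds[k + 1]]) / (bounds[k + 1] - bounds[k])
        for k in range(num_subbands)
    ]
    # If any lower subband is "not less than" a higher one, the structure is
    # absent. Checking adjacent pairs suffices because "<" is transitive.
    return all(low < high for low, high in zip(averages, averages[1:]))
```

Comparing only adjacent subbands is enough here: if every adjacent pair is strictly increasing, every lower/higher pair is as well.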
- the transmission module 502 includes:
- a second transmission unit 502b configured to generate a deviation value according to a first ratio and a second ratio, where the first ratio is the ratio of the energy of the noise high band signal of the noise frame to the energy of the noise low band signal, and the second ratio is the ratio of the energy of the noise high band signal to the energy of the noise low band signal at the time when a SID containing a noise high band parameter was most recently transmitted before the noise frame; and to determine whether the deviation value reaches a preset threshold; if so, encode the SID of the noise high band signal with the second SID coding strategy and transmit it; if not, determine that the noise high band signal does not need to be encoded.
- the first ratio is a ratio of an energy of a noise highband signal of the noise frame to an energy of the noise lowband signal, and includes:
- the first ratio is a ratio of an instantaneous energy of a noise high band signal of the noise frame to an instantaneous energy of the noise low band signal;
- the second ratio is a ratio of the energy of the noise highband signal and the energy of the noise lowband signal at the time when the SID containing the noisy highband parameter is transmitted last time before the noise frame, and includes:
- the second ratio is a ratio of the instantaneous energy of the noise high band signal and the instantaneous energy of the noise low band signal of the time corresponding to the last transmission of the SID including the noisy high band parameter before the noise frame;
- the first ratio is a ratio of the energy of the noise high band signal of the noise frame to the energy of the noise low band signal, and includes:
- the first ratio is a ratio of a weighted average energy of a noise highband signal of the noise frame and its previous noise frame to a weighted average energy of a noise lowband signal of the noise frame and its previous noise frame;
- the second ratio is a ratio of the energy of the noise highband signal and the energy of the noise lowband signal at the time when the SID containing the noisy highband parameter is transmitted last time before the noise frame, and includes:
- the second ratio is the ratio of the weighted average energy of the noise high band signal to the weighted average energy of the noise low band signal at the time when a SID containing a noise high band parameter was last transmitted before the noise frame.
- the second transmission unit 502b includes:
- a calculating subunit configured to separately calculate the logarithm of the first ratio and the logarithm of the second ratio, and to calculate the absolute value of the difference between the two logarithms to obtain the deviation value.
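The deviation computation of the second transmission unit can be written out as a minimal sketch. The function names are hypothetical, and the base-10 logarithm is an assumption: the text only says "logarithm" and does not give a base or a threshold value.

```python
import math


def deviation_value(high_energy, low_energy, ref_high_energy, ref_low_energy):
    """Absolute difference between the logarithms of the two energy ratios.

    high_energy / low_energy:         current noise frame (first ratio).
    ref_high_energy / ref_low_energy: time of the last SID that carried a
                                      high band parameter (second ratio).
    """
    energies = (high_energy, low_energy, ref_high_energy, ref_low_energy)
    if min(energies) <= 0.0:
        raise ValueError("energies must be positive")
    first_ratio = high_energy / low_energy
    second_ratio = ref_high_energy / ref_low_energy
    # Base 10 is an assumption; the text does not specify the base.
    return abs(math.log10(first_ratio) - math.log10(second_ratio))


def should_send_highband_sid(high_energy, low_energy,
                             ref_high_energy, ref_low_energy, threshold):
    """True when the deviation value reaches the preset threshold."""
    return deviation_value(high_energy, low_energy,
                           ref_high_energy, ref_low_energy) >= threshold
```

Working in the log domain makes the test symmetric: a high band that is ten times stronger relative to the low band gives the same deviation as one that is ten times weaker.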
- the transmission module 502 includes:
- a third transmission unit 502c configured to determine whether the spectral structure of the noise high band signal of the noise frame, compared with the average spectral structure of the noise high band signal before the noise frame, satisfies a preset condition; if so, encode the SID of the noise high band signal of the noise frame with the second SID coding strategy and transmit it; if not, determine that the noise high band signal of the noise frame does not need to be encoded and transmitted.
- the average spectral structure of the noise highband signal before the noise frame includes: a weighted average of the spectrum of the noise highband signal before the noise frame.
- the sending condition in the sending policy of the second SID of the second discontinuous transmission mechanism in the embodiment further includes: the first discontinuous transmission mechanism satisfies a sending condition of the first SID.
- the device provided by the present invention has the following beneficial effects: a current noise frame of the audio signal is acquired and decomposed into a noise low band signal and a noise high band signal; the noise low band signal is encoded and transmitted by using a first discontinuous transmission mechanism, and the noise high band signal is encoded and transmitted by using a second discontinuous transmission mechanism, so that the high band signal and the low band signal are processed in different ways. This saves computational complexity and coding bits without reducing the subjective quality of the codec, and the saved bits can be used to reduce the transmission bandwidth or to improve the overall coding quality, thereby solving the problem of encoding and transmitting ultra-wideband signals.
- Embodiment 7
- the apparatus includes: an obtaining module 601, a first decoding module 602, a second decoding module 603, and a third decoding module 604.
- the obtaining module 601 is configured to determine whether the received current silence insertion descriptor (SID) frame includes a high band parameter or a low band parameter;
- the first decoding module 602 is configured to: if the SID acquired by the obtaining module 601 includes the low band parameter, decode the SID to obtain a noise low band parameter, locally generate a noise high band parameter, and obtain a first comfort noise (CN) frame according to the decoded noise low band parameter and the locally generated noise high band parameter;
- the second decoding module 603 is configured to: if the SID acquired by the obtaining module 601 includes the high band parameter, decode the SID to obtain a noise high band parameter, locally generate a noise low band parameter, and obtain a second CN frame according to the decoded noise high band parameter and the locally generated noise low band parameter;
- the third decoding module 604 is configured to: if the SID acquired by the obtaining module 601 includes the high band parameter and the low band parameter, decode the SID to obtain a noise high band parameter and a noise low band parameter, and obtain a third CN frame according to the decoded noise high band parameter and noise low band parameter.
- the first decoding module 602 is further configured to: before decoding the SID to obtain the noise low band parameter, locally generating the noise high band parameter, and obtaining the first comfort noise (CN) frame, enter the second comfort noise generation (CNG) state if the decoder is in the first CNG state.
- the third decoding module 604 is further configured to: before decoding the SID to obtain the noise high band parameter and the noise low band parameter and obtaining the third CN frame, enter the first CNG state if the decoder is in the second CNG state.
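The two state changes just described can be summarized as a small transition function. This is a sketch under assumptions: the enum, the function name, and the boolean flags are hypothetical, and the case of a SID carrying only a high band parameter is left unchanged because the text does not describe a transition for it.

```python
from enum import Enum


class CngState(Enum):
    FIRST = 1   # "first CNG state" in the text
    SECOND = 2  # "second CNG state" in the text


def next_cng_state(state, has_low_band, has_high_band):
    """Return the decoder CNG state to use before building the CN frame."""
    if has_low_band and has_high_band:
        # Third decoding module: a decoder in the second state enters the first.
        return CngState.FIRST
    if has_low_band:
        # First decoding module: a decoder in the first state enters the second.
        return CngState.SECOND
    # No transition is described for a SID carrying only a high band
    # parameter, so the state is kept (an assumption).
    return state
```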
- the obtaining module 601 includes:
- a first confirming unit configured to: if the number of bits of the SID is less than a preset first threshold, confirm that the SID includes a high band parameter; if the number of bits of the SID is greater than the preset first threshold and less than a preset second threshold, confirm that the SID includes a low band parameter; if the number of bits of the SID is greater than the preset second threshold and less than a preset third threshold, confirm that the SID includes a high band parameter and a low band parameter; or
- a second confirming unit configured to: if the SID includes a first identifier, confirm that the SID includes a high band parameter; if the SID includes a second identifier, confirm that the SID includes a low band parameter; if the SID includes a third identifier, confirm that the SID includes a low band parameter and a high band parameter.
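The bit-count rule of the first confirming unit can be sketched as follows. The function name, the return format, and the threshold values used in the usage note are hypothetical; the text gives no concrete thresholds and does not say what happens when the bit count equals a threshold or reaches the third threshold, so those cases return `None` here.

```python
def classify_sid_by_size(num_bits, first_threshold, second_threshold,
                         third_threshold):
    """Return (has_low_band, has_high_band) for a SID of num_bits bits.

    Returns None for bit counts the text does not cover (equal to a
    threshold, or not below the third threshold).
    """
    if not first_threshold < second_threshold < third_threshold:
        raise ValueError("thresholds must be strictly increasing")
    if num_bits < first_threshold:
        return (False, True)   # high band parameter only
    if first_threshold < num_bits < second_threshold:
        return (True, False)   # low band parameter only
    if second_threshold < num_bits < third_threshold:
        return (True, True)    # both low band and high band parameters
    return None
```

With illustrative thresholds of 20, 40, and 60 bits, a 10-bit SID would be treated as high band only, a 30-bit SID as low band only, and a 50-bit SID as carrying both.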
- the first decoding module 602 includes:
- a first acquiring unit configured to respectively obtain a weighted average energy of a noise high band signal and a synthesis filter coefficient of a noise high band signal at a time corresponding to the SID;
- a second acquiring unit configured to obtain the noise high band signal according to the weighted average energy of the noise high band signal and the synthesis filter coefficient of the noise high band signal at the time corresponding to the obtained SID.
- the first acquiring unit includes:
- a first obtaining subunit configured to obtain the energy of the low band signal of the first CN frame according to the decoded low band parameter;
- a calculation subunit configured to calculate the ratio of the energy of the noise high band signal to the energy of the noise low band signal at the time when a SID containing a high band parameter was received before the SID, to obtain a first ratio;
- a second acquiring subunit configured to obtain the energy of the noise high band signal at the time corresponding to the SID according to the energy of the low band signal of the first CN frame and the first ratio;
- a third acquiring subunit configured to compute a weighted average of the energy of the noise high band signal at the time corresponding to the SID and the energy of the high band signal of the locally buffered CN frame, to obtain the weighted average energy of the noise high band signal at the time corresponding to the SID, where this weighted average energy is the high band signal energy of the first CN frame.
- the calculation subunit is specifically configured to: calculate the ratio of the weighted average energy of the noise high band signal to the weighted average energy of the noise low band signal at the time when a SID containing a high band parameter was received before the SID, to obtain the first ratio.
- the energy of the high band signal of the locally buffered previous CN frame is updated at a first rate in one case and otherwise at a second rate, the first rate being greater than the second rate.
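The chain of subunits above (low band energy, first ratio, scaled high band energy, weighted average with the buffered value) can be sketched in a few lines. The function name, parameter names, and the default `weight` are assumptions; the text does not give the weighting coefficients.

```python
def estimate_highband_energy(cn_low_energy, ref_high_energy, ref_low_energy,
                             buffered_high_energy, weight=0.5):
    """Estimate the high band energy of a CN frame whose SID has no high band.

    cn_low_energy:        low band energy of the current CN frame (decoded).
    ref_high_energy,
    ref_low_energy:       energies at the time of the last SID that carried
                          a high band parameter.
    buffered_high_energy: high band energy of the locally buffered CN frame.
    weight:               hypothetical weight of the new estimate.
    """
    if ref_low_energy <= 0.0:
        raise ValueError("reference low band energy must be positive")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    # First ratio: high band to low band energy at the reference time.
    first_ratio = ref_high_energy / ref_low_energy
    # Scale the current low band energy by that ratio.
    instantaneous = cn_low_energy * first_ratio
    # Weighted average with the buffered value smooths the estimate.
    return weight * instantaneous + (1.0 - weight) * buffered_high_energy
```

The two update rates mentioned in the text could be modeled by choosing between two values of `weight`; the condition for choosing is not given here, so the sketch keeps a single parameter.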
- the first acquiring unit includes:
- a first selection subunit configured to select, among the speech frames within a preset time period before the SID, the speech frame whose high band signal has the minimum energy, and to obtain the weighted average energy of the noise high band signal at the time corresponding to the SID according to the high band signal energy of that speech frame, where the weighted average energy of the noise high band signal at the time corresponding to the SID is the high band signal energy of the first CN frame;
- a second selection subunit configured to select, among the speech frames within a preset time period before the SID, the N speech frames whose high band signal energy is less than a preset threshold, and to obtain the weighted average energy of the noise high band signal at the time corresponding to the SID according to the weighted average energy of the high band signals of the N speech frames, where the weighted average energy of the noise high band signal at the time corresponding to the SID is the high band signal energy of the first CN frame.
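Both selection strategies can be illustrated on a list of per-frame high band energies taken from the speech frames preceding the SID. The function names are hypothetical, and the second function uses equal weights as a simplifying assumption, since the text does not specify the weights.

```python
def min_energy_estimate(frame_energies):
    """First strategy: use the smallest high band energy in the window."""
    if not frame_energies:
        raise ValueError("no speech frames in the window")
    return min(frame_energies)


def below_threshold_estimate(frame_energies, threshold):
    """Second strategy: average the energies that fall below the threshold.

    Equal weights are an assumption. Returns None when no frame qualifies.
    """
    selected = [e for e in frame_energies if e < threshold]
    if not selected:
        return None
    return sum(selected) / len(selected)
```

Both variants rest on the same idea: the quietest recent frames are the ones most likely to contain background noise rather than speech in the high band.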
- the first acquiring unit includes:
- a distribution subunit configured to distribute M immittance spectral frequency (ISF) coefficients, immittance spectral pair (ISP) coefficients, line spectral frequency (LSF) coefficients, or line spectral pair (LSP) coefficients within the frequency range corresponding to the high band signal;
- a first randomization processing subunit configured to randomize the M coefficients, where the randomization makes each of the M coefficients gradually approach its own target value, the target value being a value within a preset range around the coefficient value, and the target value of each of the M coefficients changing every N frames, M and N both being natural numbers;
- a fourth acquiring subunit configured to obtain a synthesis filter coefficient of the noise highband signal at the time corresponding to the SID according to the randomized processed filter coefficient.
- the first acquiring unit includes:
- a fifth obtaining subunit configured to obtain the M ISF, ISP, LSF, or LSP coefficients of the locally buffered noise high band signal;
- a second randomization processing subunit configured to randomize the M coefficients, where the randomization makes each of the M coefficients gradually approach its own target value, the target value being a value within a preset range around the coefficient value, and the target value of each of the M coefficients changing every N frames;
- a sixth acquiring subunit configured to obtain the synthesis filter coefficients of the noise high band signal at the time corresponding to the SID according to the randomized coefficients.
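The randomization described above (each coefficient drifts toward a target near its own value, and the targets are redrawn every N frames) can be sketched as a small stateful helper. The class name, the uniform distribution, and the parameters `spread` and `step` are assumptions; the text only requires that the target lie within a preset range around the coefficient value and that the approach be gradual.

```python
import random


class CoefficientRandomizer:
    """Let M spectral coefficients drift toward targets redrawn every N frames."""

    def __init__(self, base, spread, n_frames, step=0.25, seed=None):
        if n_frames < 1 or not 0.0 < step <= 1.0 or spread < 0.0:
            raise ValueError("invalid randomizer parameters")
        self.base = list(base)        # the M original coefficient values
        self.current = list(base)     # values used for the current frame
        self.targets = list(base)     # current target for each coefficient
        self.spread = spread          # half-width of the preset range
        self.n_frames = n_frames      # N: frames between target changes
        self.step = step              # fraction of the gap closed per frame
        self.frame_count = 0
        self.rng = random.Random(seed)

    def next_frame(self):
        """Return the randomized coefficients for the next frame."""
        if self.frame_count % self.n_frames == 0:
            # Redraw one target per coefficient within [base - spread, base + spread].
            self.targets = [b + self.rng.uniform(-self.spread, self.spread)
                            for b in self.base]
        self.frame_count += 1
        # Move each coefficient a fraction of the way toward its target.
        self.current = [c + self.step * (t - c)
                        for c, t in zip(self.current, self.targets)]
        return list(self.current)
```

Because each update is a step between the current value and a target inside the preset range, the coefficients stay within that range. A real decoder would additionally need to keep LSF-type coefficients ordered so that the synthesis filter remains stable; that check is omitted here.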
- the device further includes:
- the optimization module 605 is configured to: before the first decoding module 602 obtains the first CN frame, when the historical frame adjacent to the SID is a speech coded frame, if the average energy of the high band signal or of part of the high band signal decoded from the speech coded frame is less than the average energy of the locally generated noise high band signal or of part of that signal, multiply the noise high band signal of the subsequent L frames starting from the SID by a smoothing coefficient less than 1, to obtain a new weighted average energy of the locally generated noise high band signal;
- the first decoding module 602 is specifically configured to obtain a fourth CN frame according to the decoded noise low band parameter, the synthesis filter coefficients of the noise high band signal at the time corresponding to the SID, and the new weighted average energy of the locally generated noise high band signal.
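The smoothing performed by the optimization module can be illustrated as follows. This is a sketch with hypothetical names: `noise_frames` is a list of per-frame sample lists starting at the SID, and a single fixed coefficient is assumed, although the text only requires a smoothing coefficient smaller than 1.

```python
def smooth_highband(noise_frames, speech_high_energy, noise_high_energy,
                    num_frames, coeff=0.5):
    """Attenuate the first num_frames (L) noise high band frames after speech.

    The attenuation applies only when the high band of the preceding speech
    frame is quieter than the locally generated noise high band.
    """
    if not 0.0 < coeff < 1.0:
        raise ValueError("smoothing coefficient must be between 0 and 1")
    if speech_high_energy >= noise_high_energy:
        # Condition not met: return the frames unchanged (as copies).
        return [list(frame) for frame in noise_frames]
    return [
        [coeff * sample for sample in frame] if index < num_frames else list(frame)
        for index, frame in enumerate(noise_frames)
    ]
```

The intent is to avoid an audible jump in high band level when comfort noise generation takes over from a speech frame whose high band was quieter than the generated noise.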
- the decoder acquires a silence insertion descriptor (SID) frame and determines whether the SID includes a low band parameter and/or a high band parameter; if the SID includes the low band parameter, the decoder decodes the SID to obtain a noise low band parameter, locally generates a noise high band parameter, and obtains a first comfort noise (CN) frame according to the decoded noise low band parameter and the locally generated noise high band parameter; if the SID includes the high band parameter, the decoder decodes the SID to obtain a noise high band parameter, locally generates a noise low band parameter, and obtains a second CN frame according to the decoded noise high band parameter and the locally generated noise low band parameter; if the SID includes both the high band parameter and the low band parameter, the decoder decodes the SID to obtain a noise high band parameter and a noise low band parameter, and obtains a third CN frame according to the decoded noise high band parameter and noise low band parameter.
- a processing system for audio data comprising: an encoding device 500 for audio data as described above and a decoding device 600 for audio data as described above.
- the technical solution provided by the embodiments of the present invention has the following beneficial effects: a current noise frame of the audio signal is acquired and decomposed into a noise low band signal and a noise high band signal; the noise low band signal is encoded and transmitted by using a first discontinuous transmission mechanism, and the noise high band signal is encoded and transmitted by using a second discontinuous transmission mechanism. The decoder acquires a silence insertion descriptor (SID) frame and determines whether the SID includes a low band parameter and/or a high band parameter; if the SID includes the low band parameter, the decoder decodes the SID to obtain a noise low band parameter, locally generates a noise high band parameter, and obtains a first comfort noise (CN) frame according to the decoded noise low band parameter and the locally generated noise high band parameter; if the SID includes the high band parameter, the decoder decodes the SID to obtain a noise high band parameter, locally generates a noise low band parameter, and obtains a second CN frame according to the decoded noise high band parameter and the locally generated noise low band parameter. Computational complexity and coding bits can thereby be saved without reducing the subjective quality of the codec, and the saved bits can be used to reduce the transmission bandwidth or to improve the overall coding quality, thus solving the problem of encoding and transmitting ultra-wideband signals.
- the device and the system provided by this embodiment are based on the same concept as the method embodiments; the specific implementation process is described in detail in the method embodiments and is not repeated here.
- the method and apparatus for processing audio data in the above embodiments may be applied to an audio encoder or an audio decoder.
- Audio codecs can be used in a wide variety of electronic devices, such as mobile phones, wireless devices, personal data assistants (PDAs), handheld or portable computers, GPS receivers/navigators, cameras, audio/video players, camcorders, video recorders, and surveillance equipment.
- Such an electronic device typically includes an audio encoder or an audio decoder. The audio encoder or decoder may be implemented directly by a digital circuit or a chip such as a DSP (digital signal processor), or by a processor executing the procedures in software code.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Noise Elimination (AREA)
Priority Applications (21)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2012361423A AU2012361423B2 (en) | 2011-12-30 | 2012-12-28 | Method, apparatus, and system for processing audio data |
KR1020167036611A KR101770237B1 (ko) | 2011-12-30 | 2012-12-28 | 오디오 데이터 처리 방법, 장치 및 시스템 |
BR112014016153-4A BR112014016153B1 (pt) | 2011-12-30 | 2012-12-28 | método para um codificador processar dados de áudio, método para processar um sinal de áudio, codificador e decodificador |
ES12861377.5T ES2610783T3 (es) | 2011-12-30 | 2012-12-28 | Método y aparato para procesar datos de audio |
MX2014007968A MX338445B (es) | 2011-12-30 | 2012-12-28 | Metodo, aparato, y sistema para procesar datos de audio. |
CA2861916A CA2861916C (en) | 2011-12-30 | 2012-12-28 | Method, apparatus, and system for processing audio data |
EP12861377.5A EP2793227B1 (en) | 2011-12-30 | 2012-12-28 | Audio data processing method and apparatus |
MYPI2014001949A MY173976A (en) | 2011-12-30 | 2012-12-28 | Method, apparatus, and system for processing audio data |
SG11201403686SA SG11201403686SA (en) | 2011-12-30 | 2012-12-28 | Method, apparatus, and system for processing audio data |
KR1020147020836A KR101693280B1 (ko) | 2011-12-30 | 2012-12-28 | 오디오 데이터 처리 방법, 장치 및 시스템 |
RU2014131387/08A RU2579926C1 (ru) | 2011-12-30 | 2012-12-28 | Способ, устройство и система для обработки аудиоданных |
JP2014549344A JP6072068B2 (ja) | 2011-12-30 | 2012-12-28 | オーディオ・データを処理するための方法、装置、及びシステム |
US14/318,899 US9406304B2 (en) | 2011-12-30 | 2014-06-30 | Method, apparatus, and system for processing audio data |
IN1436KON2014 IN2014KN01436A (zh) | 2011-12-30 | 2014-07-08 | |
ZA2014/04996A ZA201404996B (en) | 2011-12-30 | 2014-07-08 | Method, apparatus , and system for processing audio data |
HK14113112.0A HK1199543A1 (zh) | 2011-12-30 | 2014-12-31 | 音頻數據的處理方法、裝置和系統 |
US15/188,518 US9892738B2 (en) | 2011-12-30 | 2016-06-21 | Method, apparatus, and system for processing audio data |
US15/867,977 US10529345B2 (en) | 2011-12-30 | 2018-01-11 | Method, apparatus, and system for processing audio data |
US16/697,822 US11183197B2 (en) | 2011-12-30 | 2019-11-27 | Method, apparatus, and system for processing audio data |
US17/507,200 US11727946B2 (en) | 2011-12-30 | 2021-10-21 | Method, apparatus, and system for processing audio data |
US18/344,445 US20230352035A1 (en) | 2011-12-30 | 2023-06-29 | Method, Apparatus, and System for Processing Audio Data |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110455836.7A CN103187065B (zh) | 2011-12-30 | 2011-12-30 | 音频数据的处理方法、装置和系统 |
CN201110455836.7 | 2011-12-30 |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/318,899 Continuation US9406304B2 (en) | 2011-12-30 | 2014-06-30 | Method, apparatus, and system for processing audio data |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013097764A1 (zh) | 2013-07-04 |
Family
ID=48678198
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2012/087812 WO2013097764A1 (zh) | 2011-12-30 | 2012-12-28 | 音频数据的处理方法、装置和系统 |
Country Status (18)
Country | Link |
---|---|
US (6) | US9406304B2 (zh) |
EP (1) | EP2793227B1 (zh) |
JP (2) | JP6072068B2 (zh) |
KR (2) | KR101770237B1 (zh) |
CN (1) | CN103187065B (zh) |
AU (1) | AU2012361423B2 (zh) |
BR (1) | BR112014016153B1 (zh) |
CA (3) | CA3059322C (zh) |
ES (1) | ES2610783T3 (zh) |
HK (1) | HK1199543A1 (zh) |
IN (1) | IN2014KN01436A (zh) |
MX (1) | MX338445B (zh) |
MY (1) | MY173976A (zh) |
PT (1) | PT2793227T (zh) |
RU (3) | RU2617926C1 (zh) |
SG (2) | SG10201609338SA (zh) |
WO (1) | WO2013097764A1 (zh) |
ZA (2) | ZA201404996B (zh) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103187065B (zh) * | 2011-12-30 | 2015-12-16 | 华为技术有限公司 | 音频数据的处理方法、装置和系统 |
CN106169297B (zh) * | 2013-05-30 | 2019-04-19 | 华为技术有限公司 | 信号编码方法及设备 |
US9136763B2 (en) * | 2013-06-18 | 2015-09-15 | Intersil Americas LLC | Audio frequency deadband system and method for switch mode regulators operating in discontinuous conduction mode |
CN111710342B (zh) * | 2014-03-31 | 2024-04-16 | 弗朗霍弗应用研究促进协会 | 编码装置、解码装置、编码方法、解码方法及程序 |
US10163453B2 (en) * | 2014-10-24 | 2018-12-25 | Staton Techiya, Llc | Robust voice activity detector system for use with an earphone |
GB2532041B (en) * | 2014-11-06 | 2019-05-29 | Imagination Tech Ltd | Comfort noise generation |
CN105681512B (zh) * | 2016-02-25 | 2019-02-01 | Oppo广东移动通信有限公司 | 一种降低语音通话功耗的方法及装置 |
CN105721656B (zh) * | 2016-03-17 | 2018-10-12 | 北京小米移动软件有限公司 | 背景噪声生成方法及装置 |
ES2745018T3 (es) | 2016-12-12 | 2020-02-27 | Kyynel Oy | Procedimiento versátil de selección de canal para red inalámbrica |
US10504538B2 (en) * | 2017-06-01 | 2019-12-10 | Sorenson Ip Holdings, Llc | Noise reduction by application of two thresholds in each frequency band in audio signals |
US10540983B2 (en) * | 2017-06-01 | 2020-01-21 | Sorenson Ip Holdings, Llc | Detecting and reducing feedback |
GB2595891A (en) * | 2020-06-10 | 2021-12-15 | Nokia Technologies Oy | Adapting multi-source inputs for constant rate encoding |
CN113571072B (zh) * | 2021-09-26 | 2021-12-14 | 腾讯科技(深圳)有限公司 | 一种语音编码方法、装置、设备、存储介质及产品 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101087319A (zh) * | 2006-06-05 | 2007-12-12 | 华为技术有限公司 | 一种发送和接收背景噪声的方法和装置及静音压缩系统 |
CN101246688A (zh) * | 2007-02-14 | 2008-08-20 | 华为技术有限公司 | 一种对背景噪声信号进行编解码的方法、系统和装置 |
CN101320563A (zh) * | 2007-06-05 | 2008-12-10 | 华为技术有限公司 | 一种背景噪声编码/解码装置、方法和通信设备 |
US20110228946A1 (en) * | 2010-03-22 | 2011-09-22 | Dsp Group Ltd. | Comfort noise generation method and system |
Family Cites Families (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7103065B1 (en) * | 1998-10-30 | 2006-09-05 | Broadcom Corporation | Data packet fragmentation in a cable modem system |
US6424938B1 (en) | 1998-11-23 | 2002-07-23 | Telefonaktiebolaget L M Ericsson | Complex signal activity detection for improved speech/noise classification of an audio signal |
EP1133886B1 (en) * | 1998-11-24 | 2008-03-12 | Telefonaktiebolaget LM Ericsson (publ) | Efficient in-band signaling for discontinuous transmission and configuration changes in adaptive multi-rate communications systems |
US6549587B1 (en) * | 1999-09-20 | 2003-04-15 | Broadcom Corporation | Voice and data exchange over a packet based network with timing recovery |
US6782360B1 (en) * | 1999-09-22 | 2004-08-24 | Mindspeed Technologies, Inc. | Gain quantization for a CELP speech coder |
AU1359601A (en) * | 1999-11-03 | 2001-05-14 | Tellabs Operations, Inc. | Integrated voice processing system for packet networks |
FI116643B (fi) * | 1999-11-15 | 2006-01-13 | Nokia Corp | Kohinan vaimennus |
US7920697B2 (en) | 1999-12-09 | 2011-04-05 | Broadcom Corp. | Interaction between echo canceller and packet voice processing |
US6691085B1 (en) | 2000-10-18 | 2004-02-10 | Nokia Mobile Phones Ltd. | Method and system for estimating artificial high band signal in speech codec using voice activity information |
US6615169B1 (en) * | 2000-10-18 | 2003-09-02 | Nokia Corporation | High frequency enhancement layer coding in wideband speech codec |
US6691805B2 (en) | 2001-08-27 | 2004-02-17 | Halliburton Energy Services, Inc. | Electrically conductive oil-based mud |
US7319703B2 (en) * | 2001-09-04 | 2008-01-15 | Nokia Corporation | Method and apparatus for reducing synchronization delay in packet-based voice terminals by resynchronizing during talk spurts |
US20030093270A1 (en) * | 2001-11-13 | 2003-05-15 | Domer Steven M. | Comfort noise including recorded noise |
CA2392640A1 (en) * | 2002-07-05 | 2004-01-05 | Voiceage Corporation | A method and device for efficient in-based dim-and-burst signaling and half-rate max operation in variable bit-rate wideband speech coding for cdma wireless systems |
FR2859566B1 (fr) * | 2003-09-05 | 2010-11-05 | Eads Telecom | Procede de transmission d'un flux d'information par insertion a l'interieur d'un flux de donnees de parole, et codec parametrique pour sa mise en oeuvre |
JP4572123B2 (ja) * | 2005-02-28 | 2010-10-27 | 日本電気株式会社 | 音源供給装置及び音源供給方法 |
US7809559B2 (en) * | 2006-07-24 | 2010-10-05 | Motorola, Inc. | Method and apparatus for removing from an audio signal periodic noise pulses representable as signals combined by convolution |
US8725499B2 (en) * | 2006-07-31 | 2014-05-13 | Qualcomm Incorporated | Systems, methods, and apparatus for signal change detection |
US8260609B2 (en) | 2006-07-31 | 2012-09-04 | Qualcomm Incorporated | Systems, methods, and apparatus for wideband encoding and decoding of inactive frames |
JP2008139447A (ja) * | 2006-11-30 | 2008-06-19 | Mitsubishi Electric Corp | 音声符号化装置及び音声復号装置 |
US8032359B2 (en) * | 2007-02-14 | 2011-10-04 | Mindspeed Technologies, Inc. | Embedded silence and background noise compression |
BRPI0818927A2 (pt) * | 2007-11-02 | 2015-06-16 | Huawei Tech Co Ltd | Method and apparatus for audio decoding |
CN100555414C (zh) * | 2007-11-02 | 2009-10-28 | 华为技术有限公司 | DTX decision method and apparatus |
DE102008009718A1 (de) * | 2008-02-19 | 2009-08-20 | Siemens Enterprise Communications Gmbh & Co. Kg | Method and means for encoding background noise information |
DE102008009719A1 (de) | 2008-02-19 | 2009-08-20 | Siemens Enterprise Communications Gmbh & Co. Kg | Method and means for encoding background noise information |
CN101483495B (zh) * | 2008-03-20 | 2012-02-15 | 华为技术有限公司 | Background noise generation method and noise processing apparatus |
CN101335000B (zh) | 2008-03-26 | 2010-04-21 | 华为技术有限公司 | Encoding method and apparatus |
CN102792760B (zh) * | 2010-02-25 | 2015-08-12 | 瑞典爱立信有限公司 | Turning off DTX for music |
JP2012215198A (ja) * | 2011-03-31 | 2012-11-08 | Showa Corp | Rotating structure |
CN103187065B (zh) * | 2011-12-30 | 2015-12-16 | 华为技术有限公司 | Method, apparatus, and system for processing audio data |
JP6180544B2 (ja) * | 2012-12-21 | 2017-08-16 | フラウンホーファー−ゲゼルシャフト・ツール・フェルデルング・デル・アンゲヴァンテン・フォルシュング・アインゲトラーゲネル・フェライン | Generation of comfort noise with high spectro-temporal resolution in discontinuous transmission of audio signals |
-
2011
- 2011-12-30 CN CN201110455836.7A patent/CN103187065B/zh active Active
-
2012
- 2012-12-28 CA CA3059322A patent/CA3059322C/en active Active
- 2012-12-28 KR KR1020167036611A patent/KR101770237B1/ko active IP Right Grant
- 2012-12-28 SG SG10201609338SA patent/SG10201609338SA/en unknown
- 2012-12-28 SG SG11201403686SA patent/SG11201403686SA/en unknown
- 2012-12-28 BR BR112014016153-4A patent/BR112014016153B1/pt active IP Right Grant
- 2012-12-28 RU RU2016100179A patent/RU2617926C1/ru active
- 2012-12-28 ES ES12861377.5T patent/ES2610783T3/es active Active
- 2012-12-28 EP EP12861377.5A patent/EP2793227B1/en active Active
- 2012-12-28 KR KR1020147020836A patent/KR101693280B1/ko active Application Filing
- 2012-12-28 AU AU2012361423A patent/AU2012361423B2/en active Active
- 2012-12-28 RU RU2014131387/08A patent/RU2579926C1/ru active
- 2012-12-28 CA CA2861916A patent/CA2861916C/en active Active
- 2012-12-28 MY MYPI2014001949A patent/MY173976A/en unknown
- 2012-12-28 MX MX2014007968A patent/MX338445B/es active IP Right Grant
- 2012-12-28 PT PT128613775T patent/PT2793227T/pt unknown
- 2012-12-28 WO PCT/CN2012/087812 patent/WO2013097764A1/zh active Application Filing
- 2012-12-28 CA CA3181066A patent/CA3181066A1/en active Pending
- 2012-12-28 JP JP2014549344A patent/JP6072068B2/ja active Active
-
2014
- 2014-06-30 US US14/318,899 patent/US9406304B2/en active Active
- 2014-07-08 IN IN1436KON2014 patent/IN2014KN01436A/en unknown
- 2014-07-08 ZA ZA2014/04996A patent/ZA201404996B/en unknown
- 2014-12-31 HK HK14113112.0A patent/HK1199543A1/zh unknown
-
2016
- 2016-01-12 ZA ZA2016/00247A patent/ZA201600247B/en unknown
- 2016-06-21 US US15/188,518 patent/US9892738B2/en active Active
- 2016-12-27 JP JP2016252612A patent/JP6462653B2/ja active Active
-
2017
- 2017-04-18 RU RU2017113357A patent/RU2641464C1/ru active
-
2018
- 2018-01-11 US US15/867,977 patent/US10529345B2/en active Active
-
2019
- 2019-11-27 US US16/697,822 patent/US11183197B2/en active Active
-
2021
- 2021-10-21 US US17/507,200 patent/US11727946B2/en active Active
-
2023
- 2023-06-29 US US18/344,445 patent/US20230352035A1/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
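CN101087319A (zh) * | 2006-06-05 | 2007-12-12 | 华为技术有限公司 | Method and apparatus for sending and receiving background noise, and silence compression system |
CN101246688A (zh) * | 2007-02-14 | 2008-08-20 | 华为技术有限公司 | Method, system, and apparatus for encoding and decoding a background noise signal |
CN101320563A (zh) * | 2007-06-05 | 2008-12-10 | 华为技术有限公司 | Background noise encoding/decoding apparatus and method, and communication device |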
US20110228946A1 (en) * | 2010-03-22 | 2011-09-22 | Dsp Group Ltd. | Comfort noise generation method and system |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US11727946B2 (en) | Method, apparatus, and system for processing audio data | |
RU2383943C2 (ru) | Coding of audio signals | |
US6539355B1 (en) | Signal band expanding method and apparatus and signal synthesis method and apparatus | |
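WO2009056027A1 (fr) | Audio decoding method and device | |
WO2014117458A1 (zh) | Method for predicting a high-frequency band signal, and encoding/decoding device | |
WO2023197809A1 (zh) | Encoding and decoding method for high-frequency audio signals, and related apparatus | |
JP2005241761A (ja) | Communication device and signal encoding/decoding method | |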
EP2774148A1 (en) | Bandwidth extension of audio signals | |
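JP6258522B2 (ja) | Apparatus and method for switching coding technologies in a device | |
CN116137151A (zh) | System and method for providing high-quality audio communication over low-bit-rate network connections |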
Legal Events
Code | Title | Description |
---|---|---|
121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 12861377; Country of ref document: EP; Kind code of ref document: A1 |
ENP | Entry into the national phase | Ref document number: 2861916, Country of ref document: CA; Ref document number: 2014549344, Country of ref document: JP, Kind code of ref document: A |
WWE | WIPO information: entry into national phase | Ref document number: MX/A/2014/007968; Country of ref document: MX |
NENP | Non-entry into the national phase | Ref country code: DE |
REEP | Request for entry into the European phase | Ref document number: 2012861377; Country of ref document: EP |
WWE | WIPO information: entry into national phase | Ref document number: 2012861377; Country of ref document: EP |
ENP | Entry into the national phase | Ref document number: 20147020836; Country of ref document: KR; Kind code of ref document: A |
ENP | Entry into the national phase | Ref document number: 2014131387; Country of ref document: RU; Kind code of ref document: A |
ENP | Entry into the national phase | Ref document number: 2012361423; Country of ref document: AU; Date of ref document: 20121228; Kind code of ref document: A |
REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112014016153; Country of ref document: BR |
ENP | Entry into the national phase | Ref document number: 112014016153; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20140627 |