WO2015184813A1 - Method and apparatus for processing a speech audio signal - Google Patents
Method and apparatus for processing a speech audio signal
- Publication number
- WO2015184813A1 (application PCT/CN2015/071017)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- value
- audio signal
- signal
- sampled
- length
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/028—Noise substitution, i.e. substituting non-tonal spectral components by noisy source
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/012—Comfort noise or silence coding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/167—Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/26—Pre-filtering or post-filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0316—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/038—Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
Definitions
- The present invention relates to the field of communications, and in particular, to a method and apparatus for processing a speech audio signal.
- At present, when decoding the encoded information of a speech audio signal, an electronic device recovers the noise component of the decoded speech audio signal.
- When an electronic device recovers the noise component of a speech audio signal, it generally does so by adding a random noise signal to the speech audio signal. Specifically, the speech audio signal and the random noise signal are weighted and added to obtain the signal with the noise component recovered. The speech audio signal may be a time domain signal, a frequency domain signal, or an excitation signal, and may be a low frequency signal, a high frequency signal, and so on.
- However, if the speech audio signal has a rising edge or a falling edge, this method causes the signal obtained after recovering the noise component to have an echo, which degrades the auditory quality of that signal.
- Embodiments of the present invention provide a method and an apparatus for processing a speech audio signal, so that when the noise component is recovered for a speech audio signal having a rising edge or a falling edge, the resulting signal does not have an echo, which improves the auditory quality of the signal after the noise component is recovered.
- an embodiment of the present invention provides a method for processing a voice signal, where the method includes:
- the first speech audio signal is a signal in the speech audio signal that needs to recover a noise component
- Determining the adjustment amplitude value of each of the sample values according to the adaptive normalization length and the amplitude value of each of the sample values includes:
- calculating the adjustment amplitude value of each of the sample values according to the amplitude value of each of the sample values and its corresponding amplitude disturbance value.
- Calculating, according to the amplitude value of each of the sample values and the adaptive normalization length, the amplitude average value corresponding to each of the sample values includes:
- calculating the average of the amplitude values of all sample values in the sub-band to which the sample value belongs, and using the calculated average as the amplitude average value corresponding to the sample value.
- Determining, according to the adaptive normalization length, the sub-band to which the sample value belongs includes:
- determining, as the sub-band to which the sample value belongs, a sub-band composed of the m sample values before the sample value, the sample value itself, and the n sample values after the sample value, where m and n are determined by the adaptive normalization length, m is an integer not less than 0, and n is an integer not less than 0.
- With reference to the first, second, and/or third possible implementation of the first aspect, in a fourth possible implementation of the first aspect, calculating the adjustment amplitude value of each of the sample values according to the amplitude value of each of the sample values and its corresponding amplitude disturbance value includes:
- subtracting the corresponding amplitude disturbance value from the amplitude value of each of the sample values, and using the resulting difference as the adjustment amplitude value of each of the sample values.
- Determining the adaptive normalization length includes:
- dividing the low frequency band signal in the speech audio signal into N sub-bands, where N is a natural number;
- calculating the adaptive normalization length according to the signal type of the high frequency band signal in the speech audio signal and the number of the sub-bands.
- Calculating the adaptive normalization length according to the signal type of the high frequency band signal in the speech audio signal and the number of the sub-bands includes calculating it according to the formula L = K + α × M, where:
- L is the adaptive normalization length;
- K is a value corresponding to the signal type of the high frequency band signal in the speech audio signal, and different signal types of the high frequency band signal correspond to different values of K;
- M is the number of sub-bands whose peak-to-average ratio is greater than a preset peak-to-average ratio threshold;
- α is a constant less than 1.
- Alternatively, determining the adaptive normalization length includes:
- calculating the peak-to-average ratio of the low frequency band signal and the peak-to-average ratio of the high frequency band signal in the speech audio signal; when the absolute value of the difference between the two is less than a preset difference threshold, determining the adaptive normalization length as a preset first length value; when the absolute value of the difference is not less than the preset difference threshold, determining the adaptive normalization length as a preset second length value, where the first length value is greater than the second length value; or
- when the peak-to-average ratio of the low frequency band signal is less than the peak-to-average ratio of the high frequency band signal, determining the adaptive normalization length as a preset first length value; when the peak-to-average ratio of the low frequency band signal is not less than the peak-to-average ratio of the high frequency band signal, determining the adaptive normalization length as a preset second length value; or
- determining the adaptive normalization length according to the signal type of the high frequency band signal in the speech audio signal, where different signal types of the high frequency band signal correspond to different adaptive normalization lengths.
- Determining the second speech audio signal according to the sign of each of the sample values and the adjustment amplitude value of each of the sample values includes:
- Calculating the correction factor includes:
- Performing, according to the correction factor, correction processing on the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sample values includes:
- the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sample values are corrected by using the following formula:
- Y is the adjustment amplitude value after the correction processing;
- y is an adjustment amplitude value that is greater than 0 among the adjustment amplitude values of the sample values;
- b is a constant, 0 < b < 2.
- an embodiment of the present invention provides an apparatus for recovering a noise component of a voice signal, including:
- a first determining unit, configured to determine the sign of each sample value in the first speech audio signal determined by the signal determining unit, and the amplitude value of each of the sample values;
- a second determining unit, configured to determine an adaptive normalization length;
- a third determining unit, configured to determine the adjustment amplitude value of each of the sample values according to the adaptive normalization length determined by the second determining unit and the amplitude value of each of the sample values determined by the first determining unit;
- a fourth determining unit, configured to determine a second speech audio signal according to the sign of each of the sample values determined by the first determining unit and the adjustment amplitude value of each of the sample values determined by the third determining unit, where the second speech audio signal is the signal obtained after the first speech audio signal recovers the noise component.
- the third determining unit includes:
- a determining subunit, configured to calculate the amplitude average value corresponding to each of the sample values according to the amplitude value of each of the sample values and the adaptive normalization length, and to determine the amplitude disturbance value corresponding to each of the sample values according to the amplitude average value corresponding to each of the sample values;
- an adjustment amplitude value calculation subunit, configured to calculate the adjustment amplitude value of each of the sample values according to the amplitude value of each of the sample values and its corresponding amplitude disturbance value.
- The determining subunit includes:
- a calculation module, configured to calculate the average of the amplitude values of all sample values in the sub-band to which the sample value belongs, and to use the calculated average as the amplitude average value corresponding to the sample value.
- The determining module is specifically configured to:
- determine, as the sub-band to which the sample value belongs, a sub-band composed of the m sample values before the sample value, the sample value itself, and the n sample values after the sample value, where m and n are determined by the adaptive normalization length, m is an integer not less than 0, and n is an integer not less than 0.
- The adjustment amplitude value calculation subunit is specifically configured to:
- subtract the corresponding amplitude disturbance value from the amplitude value of each of the sample values, and use the resulting difference as the adjustment amplitude value of each of the sample values.
- the second determining unit includes:
- a dividing subunit configured to divide the low frequency band signal in the speech audio signal into N sub-bands; N is a natural number;
- a length calculation subunit, configured to calculate the adaptive normalization length according to the signal type of the high frequency band signal in the speech audio signal and the number of the sub-bands.
- The length calculation subunit is specifically configured to calculate the adaptive normalization length according to the formula L = K + α × M, where:
- L is the adaptive normalization length;
- K is a value corresponding to the signal type of the high frequency band signal in the speech audio signal, and different signal types of the high frequency band signal correspond to different values of K;
- M is the number of sub-bands whose peak-to-average ratio is greater than a preset peak-to-average ratio threshold;
- α is a constant less than 1.
- the second determining unit is specifically configured to:
- calculate the peak-to-average ratio of the low frequency band signal and the peak-to-average ratio of the high frequency band signal in the speech audio signal; when the absolute value of the difference between the two is less than a preset difference threshold, determine the adaptive normalization length as a preset first length value; when the absolute value of the difference is not less than the preset difference threshold, determine the adaptive normalization length as a preset second length value, where the first length value is greater than the second length value; or
- when the peak-to-average ratio of the low frequency band signal is less than the peak-to-average ratio of the high frequency band signal, determine the adaptive normalization length as a preset first length value; when the peak-to-average ratio of the low frequency band signal is not less than the peak-to-average ratio of the high frequency band signal, determine the adaptive normalization length as a preset second length value; or
- determine the adaptive normalization length according to the signal type of the high frequency band signal in the speech audio signal, where different signal types of the high frequency band signal correspond to different adaptive normalization lengths.
- the fourth determining unit is specifically configured to:
- calculate a correction factor; perform, according to the correction factor, correction processing on the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sample values; and determine a new value of each sample value according to the sign of each sample value and the adjustment amplitude value after the correction processing, to obtain the second speech audio signal.
- The fourth determining unit is specifically configured to:
- Y is the adjustment amplitude value after the correction processing;
- y is an adjustment amplitude value that is greater than 0 among the adjustment amplitude values of the sample values;
- b is a constant, and 0 < b < 2.
- In the embodiments, a code stream is received and decoded to obtain a speech audio signal; a first speech audio signal is determined according to the speech audio signal; the sign of each sample value in the first speech audio signal and the amplitude value of each sample value are determined; an adaptive normalization length is determined; the adjustment amplitude value of each sample value is determined according to the adaptive normalization length and the amplitude value of each sample value; and a second speech audio signal is determined according to the sign and the adjustment amplitude value of each sample value. In this process, only the original signal of the first speech audio signal is processed and no new signal is added to it, so no new energy is added to the second speech audio signal obtained after the noise component is recovered. Therefore, if the first speech audio signal has a rising edge or a falling edge, the echo in the second speech audio signal is not increased, which improves the auditory quality of the second speech audio signal.
- FIG. 1A is a schematic diagram showing an example of sampling value grouping according to an embodiment of the present invention.
- FIG. 1B is another schematic diagram of an example of sampling value grouping according to an embodiment of the present invention.
- FIG. 2 is a schematic flow chart of another method for restoring a noise component of a speech audio signal according to an embodiment of the present invention
- FIG. 3 is a schematic flow chart of another method for restoring a noise component of a speech and audio signal according to an embodiment of the present invention
- FIG. 4 is a schematic structural diagram of an apparatus for restoring a noise component of a speech and audio signal according to an embodiment of the present invention
- FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
- FIG. 1 is a flowchart of a method for restoring a noise component of a speech and audio signal according to an embodiment of the present invention, where the method includes:
- Step 101 Receive a code stream, and decode the code stream to obtain a voice audio signal.
- Step 102 Determine a first speech audio signal according to the speech audio signal;
- the first speech audio signal is a signal that needs to recover a noise component in the decoded speech audio signal;
- the first speech audio signal may be a low frequency band signal, a high frequency band signal, or a full frequency band signal or the like in the decoded speech audio signal.
- the decoded speech audio signal may include one low frequency band signal and one high frequency band signal, or may also include one full frequency band signal.
- Step 103 Determine the sign of each sample value in the first speech audio signal and the amplitude value of each of the sample values
- Depending on the first speech audio signal, the implementation of the sample value may differ. For example, if the speech audio signal is a frequency domain signal, the sample value may be a spectral coefficient; if the speech audio signal is a time domain signal, the sample value may be a sample point value.
- Step 104 Determine an adaptive normalized length
- The adaptive normalization length may be determined according to related parameters of the low frequency band signal and/or the high frequency band signal of the decoded speech audio signal.
- the related parameters may include a signal type, a peak-to-average ratio, and the like.
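- Determining the adaptive normalization length may include:
- dividing the low frequency band signal in the speech audio signal into N sub-bands, where N is a natural number;
- calculating the adaptive normalization length according to the signal type of the high frequency band signal in the speech audio signal and the number of the sub-bands.
- Calculating the adaptive normalization length according to the signal type of the high frequency band signal in the speech audio signal and the number of the sub-bands may include calculating it according to the formula L = K + α × M, where L is the adaptive normalization length, K is a value corresponding to the signal type of the high frequency band signal, M is the number of sub-bands whose peak-to-average ratio is greater than a preset peak-to-average ratio threshold, and α is a constant less than 1.
- The adaptive normalization length may also be calculated according to the signal type of the low frequency band signal in the speech audio signal and the number of the sub-bands, again using the formula L = K + α × M. In this case, K is the value corresponding to the signal type of the low frequency band signal in the speech audio signal, and different signal types of the low frequency band signal correspond to different values of K.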
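The formula L = K + α × M can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the example values of K, α, and the threshold, and the rounding of L to an integer of at least 1 are all assumptions.

```python
def count_peaky_subbands(subband_pars, par_threshold):
    """M: number of sub-bands whose peak-to-average ratio exceeds the preset threshold."""
    return sum(1 for par in subband_pars if par > par_threshold)


def adaptive_normalization_length(k, m, alpha):
    """L = K + alpha * M. Rounding to an integer >= 1 is an assumption of this sketch."""
    if alpha >= 1:
        raise ValueError("alpha must be a constant less than 1")
    return max(1, round(k + alpha * m))


# Hypothetical values: K = 8 for the signal type, alpha = 0.5, threshold = 2.0.
m = count_peaky_subbands([1.5, 2.5, 3.0, 1.0, 4.2, 2.1], par_threshold=2.0)
length = adaptive_normalization_length(k=8, m=m, alpha=0.5)
```

With these hypothetical inputs, four sub-bands exceed the threshold, so the length grows from K by α × 4.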
- Determining the adaptive normalization length may include:
- calculating the peak-to-average ratio of the low frequency band signal and the peak-to-average ratio of the high frequency band signal in the speech audio signal; when the absolute value of the difference between the two is less than a preset difference threshold, the adaptive normalization length is determined as a preset first length value; when the absolute value of the difference is not less than the preset difference threshold, the adaptive normalization length is determined as a preset second length value.
- The first length value is greater than the second length value. The first length value and the second length value may also be calculated from the ratio or the difference between the peak-to-average ratio of the low frequency band signal and the peak-to-average ratio of the high frequency band signal; the specific calculation method is not limited.
- Determining the adaptive normalization length may include:
- when the peak-to-average ratio of the low frequency band signal is less than the peak-to-average ratio of the high frequency band signal, the adaptive normalization length is determined as a preset first length value; when the peak-to-average ratio of the low frequency band signal is not less than the peak-to-average ratio of the high frequency band signal, the adaptive normalization length is determined as a preset second length value.
- The first length value is greater than the second length value. The first length value and the second length value may also be calculated from the ratio or the difference between the peak-to-average ratio of the low frequency band signal and the peak-to-average ratio of the high frequency band signal; the specific calculation method is not limited.
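The direct comparison of the two peak-to-average ratios might look like the sketch below. The text does not define how the peak-to-average ratio is computed; here it is assumed to be the largest absolute sample value divided by the mean absolute sample value, and the function names and default length values (32 and 16) are illustrative only.

```python
def peak_to_average_ratio(samples):
    """Assumed definition: max |x| divided by mean |x| (0.0 for an all-zero band)."""
    amplitudes = [abs(s) for s in samples]
    mean = sum(amplitudes) / len(amplitudes) if amplitudes else 0.0
    return max(amplitudes) / mean if mean > 0 else 0.0


def select_length_by_par(low_band, high_band, first_length=32, second_length=16):
    """Pick the larger preset length when the low band is less peaky than the high band."""
    if first_length <= second_length:
        raise ValueError("first length value must be greater than second length value")
    if peak_to_average_ratio(low_band) < peak_to_average_ratio(high_band):
        return first_length
    return second_length


flat_band = [1, -1, 1, -1]   # peak-to-average ratio 1.0
peaky_band = [0, 0, 0, 4]    # peak-to-average ratio 4.0
```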
- Determining the adaptive normalization length may include: determining the adaptive normalization length according to the signal type of the high frequency band signal in the speech audio signal, where different signal types correspond to different adaptive normalization lengths.
- For example, when the signal type is a harmonic signal, the corresponding adaptive normalization length is 32; when the signal type is a normal signal, the corresponding adaptive normalization length is 16; when the signal type is a transient signal, the corresponding adaptive normalization length is 8; and so on.
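The example mapping from signal type to length can be written as a simple lookup. The string keys and the function name are assumptions for illustration; only the three lengths come from the example above.

```python
# Lengths from the example in the text; the string labels are hypothetical.
LENGTH_BY_SIGNAL_TYPE = {
    "harmonic": 32,
    "normal": 16,
    "transient": 8,
}


def length_for_signal_type(signal_type):
    """Return the adaptive normalization length for a high-band signal type."""
    try:
        return LENGTH_BY_SIGNAL_TYPE[signal_type]
    except KeyError:
        raise ValueError(f"no length configured for signal type {signal_type!r}")
```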
- Step 105 Determine the adjustment amplitude value of each of the sample values according to the adaptive normalization length and the amplitude value of each of the sample values;
- Determining the adjustment amplitude value of each of the sample values according to the adaptive normalization length and the amplitude value of each of the sample values may include:
- calculating the amplitude average value corresponding to each of the sample values according to the amplitude value of each of the sample values and the adaptive normalization length, determining the amplitude disturbance value corresponding to each of the sample values according to its amplitude average value, and calculating the adjustment amplitude value of each of the sample values according to the amplitude value of each of the sample values and its corresponding amplitude disturbance value.
- Calculating the amplitude average value corresponding to each of the sample values according to the amplitude value of each of the sample values and the adaptive normalization length may include:
- calculating the average of the amplitude values of all sample values in the sub-band to which the sample value belongs, and using the calculated average as the amplitude average value corresponding to the sample value.
- All sample values are divided into sub-bands according to the adaptive normalization length in a preset order; for each of the sample values, a sub-band including the sample value is determined as a sub-band to which the sample value belongs.
- the predetermined sequence may be, for example, a sequence from a low frequency to a high frequency or a sequence from a high frequency to a low frequency, and is not limited herein.
- For example, with an adaptive normalization length of 5, x1 to x5 can be grouped into one sub-band, x6 to x10 into another sub-band, and so on, yielding several sub-bands.
- Then, for each sample value in x1 to x5, the sub-band x1 to x5 is the sub-band to which it belongs; for each sample value in x6 to x10, the sub-band x6 to x10 is the sub-band to which it belongs.
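The fixed grouping in a preset order can be sketched as below: consecutive blocks of the adaptive normalization length form the sub-bands, and every sample value in a block shares that block's amplitude average. The function name is hypothetical, and letting a shorter final block stand on its own is an assumption.

```python
def block_amplitude_averages(amplitudes, length):
    """Average amplitude per sample, using consecutive sub-bands of `length` samples."""
    if length < 1:
        raise ValueError("length must be at least 1")
    averages = []
    for start in range(0, len(amplitudes), length):
        block = amplitudes[start:start + length]   # the last block may be shorter
        mean = sum(block) / len(block)
        averages.extend([mean] * len(block))       # same average for the whole sub-band
    return averages


# x1..x5 form one sub-band and x6..x10 the next (length 5).
example = block_amplitude_averages([1, 2, 3, 4, 5, 10, 10, 10, 10, 10], 5)
```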
- Determining the sub-band to which the sample value belongs according to the adaptive normalization length may include:
- determining, as the sub-band to which the sample value belongs, a sub-band composed of the m sample values before the sample value, the sample value itself, and the n sample values after the sample value, where m and n are determined by the adaptive normalization length, m is an integer not less than 0, and n is an integer not less than 0.
- For sample values near either end, such as x1, x2, x(n-1), and xn, which do not have enough sample values before or after them, the sub-band may be set as needed in practical applications, for example by adding sample values to supplement the ones missing from the sub-band. For example, for sample value x1, which has no sample value before it, x1, x1, x1, x2, x3 may be used as its sub-band.
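A sliding sub-band of m preceding and n following sample values, with the end sample repeated where neighbors are missing (as in the x1, x1, x1, x2, x3 example), could be sketched like this. Repeating the edge sample is only one of the padding choices the text allows, and the function name is hypothetical.

```python
def window_amplitude_averages(amplitudes, m, n):
    """Average over m samples before, the sample itself, and n samples after.

    Missing neighbors at either end are filled by repeating the edge sample.
    """
    if m < 0 or n < 0:
        raise ValueError("m and n must be integers not less than 0")
    last = len(amplitudes) - 1
    averages = []
    for i in range(len(amplitudes)):
        # Clamp out-of-range indices to the first or last sample.
        window = [amplitudes[min(max(j, 0), last)] for j in range(i - m, i + n + 1)]
        averages.append(sum(window) / len(window))
    return averages


# With m = n = 2, the sub-band of x1 is x1, x1, x1, x2, x3.
example = window_amplitude_averages([1, 2, 3, 4, 5], m=2, n=2)
```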
- When the amplitude disturbance value corresponding to each of the sample values is determined according to its amplitude average value, the amplitude average value may be used directly as the amplitude disturbance value. Alternatively, a preset operation may be performed on the amplitude average value to obtain the amplitude disturbance value; the preset operation may be, for example, multiplying the amplitude average value by a value, which is typically greater than 0.
- Calculating the adjustment amplitude value of each of the sample values according to the amplitude value of each of the sample values and its corresponding amplitude disturbance value may include:
- the amplitude value of each of the sampled values is subtracted from its corresponding amplitude disturbance value to obtain a difference between the two, and the obtained difference value is used as an adjustment amplitude value of each of the sampled values.
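The subtraction step can be sketched as follows, with the amplitude disturbance value taken as the amplitude average multiplied by a scale factor. The parameter name `scale` and its default of 1.0 (using the average directly) are assumptions of this sketch.

```python
def adjustment_amplitudes(amplitudes, averages, scale=1.0):
    """Adjustment amplitude = amplitude - disturbance, where disturbance = scale * average."""
    if len(amplitudes) != len(averages):
        raise ValueError("each sample value needs exactly one amplitude average")
    if scale <= 0:
        raise ValueError("scale is expected to be greater than 0")
    return [a - scale * avg for a, avg in zip(amplitudes, averages)]


# Amplitudes below the local average give a negative adjustment amplitude.
example = adjustment_amplitudes([4, 1, 6], [3.0, 3.0, 3.0])
```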
- Step 106 Determine a second speech audio signal according to the sign of each of the sample values and the adjustment amplitude value of each of the sample values; the second speech audio signal is the signal obtained after the first speech audio signal recovers the noise component.
- Determining the second speech audio signal according to the sign of each of the sample values and the adjustment amplitude value of each of the sample values may include:
- a new value of each sample value is determined according to the sign of each sampled value and the adjusted amplitude value after the correction process, to obtain a second speech audio signal.
- the obtained second speech audio signal may include a new value of all sample values.
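One way to form the new sample values is to multiply each sign by the corresponding adjustment amplitude value. The text only says the new value is determined from the two, so the multiplication and the function name are assumptions; signs are represented as +1 or -1.

```python
def rebuild_samples(signs, adjusted_amplitudes):
    """New sample value = sign * adjustment amplitude (assumed combination rule)."""
    if len(signs) != len(adjusted_amplitudes):
        raise ValueError("each sample value needs one sign and one amplitude")
    return [sign * amplitude for sign, amplitude in zip(signs, adjusted_amplitudes)]


# Signs of the original sample values, paired with their adjustment amplitudes.
second_signal = rebuild_samples([-1, 1, 1], [1.0, -2.0, 3.0])
```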
- The correction factor may be calculated according to the adaptive normalization length. Specifically, the correction factor β may be equal to a/L, where a is a constant greater than 1.
- In the correction formula, Y is the adjustment amplitude value after the correction processing, y is an adjustment amplitude value that is greater than 0 among the adjustment amplitude values of the sample values, and b is a constant with 0 < b < 2.
- The step of extracting the sign of each sample value in the first speech audio signal in step 103 may be performed at any time before step 106, and has no required execution order relative to steps 104 and 105.
- Within one frame, the time domain signal of a speech audio signal may have some parts whose sample values and energy are particularly large, and other parts whose sample values and energy are particularly small. If a random noise signal is added to the speech audio signal in the frequency domain to obtain the signal with the noise component recovered, then, because the energy of the random noise signal is roughly uniform in the time domain within one frame, converting the frequency domain signal back into a time domain signal tends to increase the energy of the parts whose original sample values were particularly small, and their sample values change accordingly. This gives the signal some echo after the noise component is recovered, which degrades its auditory quality.
- In this embodiment, the first speech audio signal is determined according to the speech audio signal, the sign of each sample value in the first speech audio signal and the amplitude value of each of the sample values are determined, an adaptive normalization length is determined, the adjustment amplitude value of each of the sample values is determined according to the adaptive normalization length and the amplitude value of each of the sample values, and the second speech audio signal is determined according to the sign and the adjustment amplitude value of each of the sample values. Only the original signal of the first speech audio signal is processed and no new signal is added to it, so no new energy is added to the second speech audio signal after the noise component is recovered. Therefore, if the first speech audio signal has a rising edge or a falling edge, the echo in the second speech audio signal is not increased, which improves the auditory quality of the second speech audio signal.
- FIG. 2 is a schematic flowchart of another method for restoring a noise component of a speech and audio signal according to an embodiment of the present invention, where the method includes:
- Step 201 Receive a code stream, decode the code stream to obtain a speech and audio signal, and the decoded speech and audio signal includes a low frequency band signal and a high frequency band signal, and determine the high frequency band signal as the first speech audio signal.
- Step 202 Determine the sign of each sample value in the high frequency band signal and the amplitude value of each sample value.
- For example, if the coefficient of a sample value in the high frequency band signal is -4, the sign of the sample value is "-" and its amplitude value is 4.
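Splitting sample values into signs and amplitude values can be sketched as below. Representing the sign as +1 or -1 and treating a zero sample as positive are assumptions of this illustration.

```python
def split_sign_and_amplitude(samples):
    """Return (signs, amplitudes); a sample of -4 gives sign -1 and amplitude 4."""
    signs = [-1 if s < 0 else 1 for s in samples]   # zero is treated as positive here
    amplitudes = [abs(s) for s in samples]
    return signs, amplitudes


signs, amplitudes = split_sign_and_amplitude([-4, 3, 0])
```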
- Step 203 Determine an adaptive normalized length.
- Step 204 Determine an amplitude average value corresponding to each sample value according to the amplitude value of each sample value and the adaptive normalization length, and determine an amplitude disturbance value corresponding to each sample value according to the amplitude average value corresponding to each sample value.
- Step 205 Calculate an adjustment amplitude value of each sample value according to the amplitude value of each sample value and its corresponding amplitude disturbance value.
- for how to calculate the adjustment amplitude value of each sample value, refer to the related description in step 105; details are not repeated here.
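Steps 204 and 205 can be sketched as follows, under stated assumptions. The sketch partitions the amplitude values into consecutive sub-bands whose size is the adaptive normalization length, averages each sub-band, derives a disturbance value from that average, and subtracts the disturbance value from each amplitude value. The text says only that the disturbance value is determined from the average; scaling the average by a hypothetical factor `gamma` is an assumption, as are the function and parameter names.

```python
def adjust_amplitudes(amplitudes, norm_length, gamma=0.5):
    """Return adjustment amplitude values for a list of amplitude values.

    norm_length: adaptive normalization length, used here as sub-band size.
    gamma: hypothetical scale factor (assumption, not from the text).
    """
    if norm_length < 1:
        raise ValueError("norm_length must be at least 1")
    adjusted = []
    for start in range(0, len(amplitudes), norm_length):
        band = amplitudes[start:start + norm_length]
        mean = sum(band) / len(band)          # amplitude average value
        disturbance = gamma * mean            # assumed disturbance rule
        adjusted.extend(a - disturbance for a in band)
    return adjusted


adjusted = adjust_amplitudes([4, 2, 1, 3], norm_length=2, gamma=0.5)
```

Because each value is reduced by a share of its local average, small amplitudes inside a loud sub-band can become zero or negative, which is why the later correction step treats values greater than 0 separately.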
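- Step 206 Determine the second speech audio signal according to the sign of each sample value and the adjustment amplitude value of each sample value; the second speech audio signal is the signal obtained after the noise component of the first speech audio signal is restored.
- for the specific implementation of this step, refer to the related description in step 106; details are not repeated here.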
- the step of determining the sign of each sample value in the first speech audio signal in step 202 may be performed at any time before step 206, and has no required execution order relative to steps 203, 204, and 205.
- the order of execution between step 202 and step 203 is not limited.
- Step 207 Combine the second speech audio signal and the low frequency band signal of the decoded speech audio signal to obtain an output signal.
- the first speech audio signal is a low frequency band signal of the decoded speech audio signal
- when the first speech audio signal is a high frequency band signal of the decoded speech audio signal, the second speech audio signal and the low frequency band signal of the decoded speech audio signal may be combined to obtain an output signal
- the noise component is restored in the high frequency band signal of the decoded speech audio signal, which yields the second speech audio signal.
- if the high frequency band signal has a rising edge or a falling edge, the echo in the second speech audio signal is not increased, which improves the auditory quality of the second speech audio signal and therefore of the final output signal.
- FIG. 3 is a schematic flowchart of another method for restoring a noise component of a speech and audio signal according to an embodiment of the present invention, where the method includes:
- Steps 301 to 305 are the same as steps 201 to 205, and are not described here.
- Step 306 Calculate a correction factor, and correct, according to the correction factor, the adjustment amplitude values of the sample values that are greater than 0.
- for the specific implementation of this step, refer to the related description in step 106; details are not repeated here.
- Step 307 Determine the second speech audio signal according to the sign of each sample value and the adjustment amplitude value after the correction processing.
- for the specific implementation of this step, refer to the related description in step 106; details are not repeated here.
- the step of determining the sign of each sample value in the first speech audio signal in step 302 may be performed at any time before step 307, and has no required execution order relative to steps 303, 304, 305, and 306.
- the order of execution between step 302 and step 303 is not limited.
- Step 308 Combine the second speech audio signal and the low frequency band signal of the decoded speech audio signal to obtain an output signal.
- the adjustment amplitude values greater than 0 are additionally corrected, which further improves the auditory quality of the second speech audio signal and, in turn, of the final output signal.
- the high frequency band signal in the decoded speech audio signal is determined as the first speech audio signal, and the noise component is restored in it, which yields the second speech audio signal.
- the method for restoring the noise component of the speech and audio signal according to the embodiment of the present invention may also restore the noise component of the full-band signal of the decoded speech and audio signal, or of the low frequency band signal of the decoded speech and audio signal, to obtain the second speech audio signal; for the implementation process, see the method examples shown in FIG. 2 and FIG. 3.
- the only difference from those examples is that the full-band signal or the low frequency band signal is determined as the first speech audio signal; details are not repeated here.
- FIG. 4 is a schematic structural diagram of an apparatus for restoring a noise component of a speech signal according to an embodiment of the present invention.
- the device may be disposed in an electronic device, and the device 400 may include:
- a code stream processing unit 410, configured to receive a code stream and decode the code stream to obtain a speech and audio signal;
- a signal determining unit 420, configured to determine a first speech audio signal according to the speech and audio signal obtained by the code stream processing unit 410, where the first speech audio signal is the signal in the decoded speech and audio signal whose noise component needs to be restored;
- a first determining unit 430, configured to determine the sign of each sample value in the first speech audio signal determined by the signal determining unit 420 and the amplitude value of each sample value;
- a second determining unit 440, configured to determine an adaptive normalization length;
- a third determining unit 450, configured to determine an adjustment amplitude value of each sample value according to the adaptive normalization length determined by the second determining unit 440 and the amplitude value of each sample value determined by the first determining unit 430;
- a fourth determining unit 460, configured to determine a second speech audio signal according to the sign of each sample value determined by the first determining unit 430 and the adjustment amplitude value of each sample value determined by the third determining unit 450, the second speech audio signal being the signal obtained after the noise component of the first speech audio signal is restored.
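- the third determining unit 450 may include:
- a determining subunit, configured to calculate an amplitude average value corresponding to each sample value according to the amplitude value of each sample value and the adaptive normalization length, and determine an amplitude disturbance value corresponding to each sample value according to the amplitude average value corresponding to each sample value;
- an adjustment amplitude value calculation subunit, configured to calculate an adjustment amplitude value of each sample value according to the amplitude value of each sample value and its corresponding amplitude disturbance value.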
- the determining subunit may include:
- a determining module, configured to determine, for each sample value, the sub-band to which the sample value belongs according to the adaptive normalization length;
- a calculation module, configured to calculate the average of the amplitude values of all sample values in the sub-band to which the sample value belongs, and use the calculated average as the amplitude average value corresponding to the sample value.
- the determining module is specifically configured to:
- divide all sample values into sub-bands in a preset order according to the adaptive normalization length, and, for each sample value, determine the sub-band that contains the sample value as the sub-band to which the sample value belongs; or, for each sample value, determine the sub-band formed by the m sample values before the sample value, the sample value itself, and the n sample values after the sample value as the sub-band to which the sample value belongs, where m and n are determined by the adaptive normalization length, m is an integer not less than 0, and n is an integer not less than 0.
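The per-sample sub-band made of m preceding samples, the sample itself, and n following samples amounts to a sliding window. A minimal Python sketch of the resulting amplitude averages follows. The function name is hypothetical, and clipping the window at the signal boundaries is an assumption; the text states only that m and n are determined by the adaptive normalization length, not how.

```python
def windowed_amplitude_means(amplitudes, m, n):
    """Average each amplitude over its own sub-band.

    The sub-band of sample i covers indices i-m .. i+n inclusive.
    Assumption: the window is clipped where it runs past either end.
    """
    if m < 0 or n < 0:
        raise ValueError("m and n must be integers not less than 0")
    means = []
    for i in range(len(amplitudes)):
        lo = max(0, i - m)
        hi = min(len(amplitudes), i + n + 1)
        window = amplitudes[lo:hi]
        means.append(sum(window) / len(window))
    return means


means = windowed_amplitude_means([1, 2, 3, 4], m=1, n=1)
```

Compared with a fixed partition into sub-bands, a sliding window gives every sample an average centered near itself, at the cost of more arithmetic per sample.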
- the adjustment amplitude value calculation subunit is specifically configured to:
- subtract the corresponding amplitude disturbance value from the amplitude value of each sample value, and use the resulting difference as the adjustment amplitude value of that sample value.
- the second determining unit 440 may include:
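- a dividing subunit, configured to divide the low frequency band signal in the speech audio signal into N sub-bands, where N is a natural number;
- a number determining subunit, configured to calculate the peak-to-average ratio of each sub-band and determine the number of sub-bands whose peak-to-average ratio is greater than a preset peak-to-average ratio threshold;
- a length calculation subunit, configured to calculate the adaptive normalization length according to the signal type of the high frequency band signal in the speech audio signal and the number of such sub-bands.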
- the length calculation subunit may be specifically configured to:
- calculate the adaptive normalization length according to the formula L = K + α × M, where:
- L is the adaptive normalization length
- K is a value corresponding to the signal type of the high frequency band signal in the speech and audio signal, and different signal types of the high frequency band signal correspond to different values of K
- M is the number of sub-bands whose peak-to-average ratio is greater than the preset peak-to-average ratio threshold
- α is a constant less than 1.
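The claims give the length as L = K + α × M, where M counts the low frequency band sub-bands whose peak-to-average ratio exceeds a preset threshold. The sketch below works through that calculation. Defining the peak-to-average ratio as peak amplitude over mean amplitude is an assumption (an energy-based definition is equally plausible), as are the equal-size split that drops leftover samples, the function names, and the example values of K, α, and the threshold.

```python
def peak_to_average_ratio(band):
    """Peak amplitude divided by mean amplitude (assumed definition)."""
    amps = [abs(x) for x in band]
    mean = sum(amps) / len(amps)
    return max(amps) / mean if mean > 0 else 0.0


def adaptive_normalization_length(low_band, n_subbands, k, alpha, papr_threshold):
    """Compute L = K + alpha * M for a low frequency band signal."""
    size = len(low_band) // n_subbands
    if size == 0:
        raise ValueError("not enough samples for the requested sub-bands")
    # Equal-size sub-bands; trailing samples that do not fit are ignored.
    bands = [low_band[i * size:(i + 1) * size] for i in range(n_subbands)]
    m = sum(1 for b in bands if peak_to_average_ratio(b) > papr_threshold)
    return k + alpha * m


low_band = [1, 1, 1, 1, 0, 0, 0, 8, 2, 2, 2, 2]
length = adaptive_normalization_length(
    low_band, n_subbands=3, k=8, alpha=0.5, papr_threshold=2.0
)
```

Here only the middle sub-band is peaky (ratio 4), so M is 1 and the length is 8.5. Whether and how a fractional result is rounded before it is used as a sub-band size is not stated in the text.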
- the second determining unit 440 is specifically configured to:
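- calculate the peak-to-average ratio of the low frequency band signal and the peak-to-average ratio of the high frequency band signal in the speech audio signal; when the absolute value of the difference between the two peak-to-average ratios is less than a preset difference threshold, determine the adaptive normalization length as a preset first length value, and when the absolute value of the difference is not less than the preset difference threshold, determine the adaptive normalization length as a preset second length value, where the first length value is greater than the second length value; or,
- calculate the peak-to-average ratio of the low frequency band signal and the peak-to-average ratio of the high frequency band signal in the speech audio signal; when the peak-to-average ratio of the low frequency band signal is less than the peak-to-average ratio of the high frequency band signal, determine the adaptive normalization length as a preset first length value, and when the peak-to-average ratio of the low frequency band signal is not less than the peak-to-average ratio of the high frequency band signal, determine the adaptive normalization length as a preset second length value; or,
- determine the adaptive normalization length according to the signal type of the high frequency band signal in the speech audio signal, where different signal types of the high frequency band signal correspond to different adaptive normalization lengths.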
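According to the claims, one alternative selects between two preset lengths by comparing the peak-to-average ratios of the low and high frequency bands: if the absolute difference is below a preset threshold, the larger first length is used, otherwise the smaller second length. A minimal sketch of that rule is below; the function name and all numeric values are hypothetical.

```python
def length_from_papr_gap(low_papr, high_papr, diff_threshold,
                         first_length, second_length):
    """Pick a preset length from the gap between two peak-to-average ratios.

    The source requires first_length > second_length.
    """
    if first_length <= second_length:
        raise ValueError("first_length must be greater than second_length")
    if abs(low_papr - high_papr) < diff_threshold:
        return first_length
    return second_length


similar = length_from_papr_gap(3.0, 3.5, 1.0, first_length=16, second_length=8)
different = length_from_papr_gap(3.0, 6.0, 1.0, first_length=16, second_length=8)
```

The second alternative in the claims has the same shape, with the condition replaced by a direct comparison of the low-band ratio against the high-band ratio.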
- the fourth determining unit 460 is specifically configured to:
- calculate a correction factor; correct, according to the correction factor, the adjustment amplitude values of the sample values that are greater than 0; and determine a new value of each sample value according to the sign of each sample value and the adjustment amplitude value after the correction processing, to obtain the second speech audio signal.
- the fourth determining unit 460 is specifically configured to:
- correct the adjustment amplitude values of the sample values that are greater than 0 by using the formula Y = y × (b − β), where:
- Y is the adjustment amplitude value after the correction processing
- y is an adjustment amplitude value, among the adjustment amplitude values of the sample values, that is greater than 0
- β is the correction factor, calculated as β = a / L, where L is the adaptive normalization length and a is a constant greater than 1
- b is a constant, and 0 < b < 2.
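The claims define the correction factor as β = a / L (a constant greater than 1) and the corrected value as Y = y × (b − β) for adjustment amplitude values greater than 0, with 0 < b < 2. The sketch below applies that correction and then rebuilds the sample values from their signs. Leaving non-positive adjustment amplitude values unchanged and forming each new value as sign times amplitude are assumptions, as are the function name and the example constants.

```python
def correct_and_rebuild(signs, adjusted, norm_length, a=2.0, b=1.5):
    """Correct positive adjustment amplitudes and reattach the signs.

    beta = a / norm_length; Y = y * (b - beta) for every y > 0.
    Assumption: values that are not greater than 0 pass through unchanged.
    """
    if a <= 1 or not 0 < b < 2:
        raise ValueError("expected a > 1 and 0 < b < 2")
    beta = a / norm_length
    rebuilt = []
    for sign, y in zip(signs, adjusted):
        if y > 0:
            y = y * (b - beta)
        rebuilt.append(sign * y)
    return rebuilt


new_values = correct_and_rebuild([-1, 1, 1], [2.0, -0.5, 4.0], norm_length=8)
```

With a = 2 and L = 8, β is 0.25, so each positive value is scaled by 1.25. A longer normalization length gives a smaller β and therefore a larger scale factor.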
- the first speech audio signal is determined according to the speech and audio signal; the sign of each sample value in the first speech audio signal and the amplitude value of each sample value are determined; an adaptive normalization length is determined; an adjustment amplitude value of each sample value is determined according to the adaptive normalization length and the amplitude value of each sample value; and the second speech audio signal is determined according to the sign of each sample value and the adjustment amplitude value of each sample value. In this process, only the original signal of the first speech audio signal is processed, and no new signal is added to the first speech audio signal, so no new energy is added to the second speech audio signal obtained after the noise component is restored. Consequently, if the first speech audio signal has a rising or falling edge, the echo in the second speech audio signal is not increased, which improves the auditory quality of the second speech audio signal.
- the processor 510, the memory 520, and the transceiver 530 are connected to each other through a bus 540.
- the bus 540 may be an ISA bus, a PCI bus, or an EISA bus.
- the bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is shown in Figure 5, but it does not mean that there is only one bus or one type of bus.
- the transceiver 530 is used to connect other devices and communicate with other devices. Specifically, the transceiver 530 can be configured to: receive a code stream;
- An adjustment amplitude value of each of the sample values is calculated according to an amplitude value of each of the sampled values and a corresponding amplitude disturbance value thereof.
- processor 510 is specifically configured to:
- An average value of amplitude values of all sample values in the sub-band to which the sampled value belongs is calculated, and the calculated average value is used as an average value of the amplitude corresponding to the sampled value.
- processor 510 is specifically configured to:
- for each sample value, determine the sub-band formed by the m sample values before the sample value, the sample value itself, and the n sample values after the sample value as the sub-band to which the sample value belongs, where m and n are determined by the adaptive normalization length, m is an integer not less than 0, and n is an integer not less than 0.
- processor 510 is specifically configured to:
- subtract the corresponding amplitude disturbance value from the amplitude value of each sample value, and use the resulting difference as the adjustment amplitude value of that sample value.
- processor 510 is specifically configured to:
- N is a natural number
- processor 510 is specifically configured to:
- calculate the adaptive normalization length according to the formula L = K + α × M, where:
- L is the adaptive normalization length
- K is a value corresponding to the signal type of the high frequency band signal in the speech and audio signal, and different signal types of the high frequency band signal correspond to different values of K
- M is the number of sub-bands whose peak-to-average ratio is greater than the preset peak-to-average ratio threshold
- α is a constant less than 1.
- processor 510 is specifically configured to:
- calculate a correction factor; correct, according to the correction factor, the adjustment amplitude values of the sample values that are greater than 0; and determine a new value of each sample value according to the sign of each sample value and the adjustment amplitude value after the correction processing, to obtain the second speech audio signal.
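- processor 510 is specifically configured to:
- calculate the correction factor by using the formula β = a / L, where β is the correction factor, L is the adaptive normalization length, and a is a constant greater than 1.
- processor 510 is specifically configured to:
- correct the adjustment amplitude values of the sample values that are greater than 0 by using the formula Y = y × (b − β), where:
- Y is the adjustment amplitude value after the correction processing
- y is an adjustment amplitude value, among the adjustment amplitude values of the sample values, that is greater than 0
- b is a constant, and 0 < b < 2.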
- the electronic device determines the first speech audio signal according to the speech and audio signal, determines the sign of each sample value in the first speech audio signal and the amplitude value of each sample value, determines an adaptive normalization length, determines an adjustment amplitude value of each sample value according to the adaptive normalization length and the amplitude value of each sample value, and determines the second speech audio signal according to the sign of each sample value and the adjustment amplitude value of each sample value. In this process, only the original signal of the first speech audio signal is processed, and no new signal is added to the first speech audio signal, so no new energy is added to the second speech audio signal obtained after the noise component is restored. Consequently, if the first speech audio signal has a rising or falling edge, the echo in the second speech audio signal is not increased, which improves the auditory quality of the second speech audio signal.
- because the system embodiment basically corresponds to the method embodiment, refer to the corresponding description of the method embodiment for related details.
- the system embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. Those of ordinary skill in the art can understand and implement the embodiments without creative effort.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Quality & Reliability (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Noise Elimination (AREA)
- Telephone Function (AREA)
- Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
Claims (22)
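- A method for processing a speech audio signal, wherein the method comprises: receiving a code stream, and decoding the code stream to obtain a speech audio signal; determining a first speech audio signal according to the speech audio signal, wherein the first speech audio signal is a signal in the speech audio signal whose noise component needs to be restored; determining a sign of each sample value in the first speech audio signal and an amplitude value of each sample value; determining an adaptive normalization length; determining an adjustment amplitude value of each sample value according to the adaptive normalization length and the amplitude value of each sample value; and determining a second speech audio signal according to the sign of each sample value and the adjustment amplitude value of each sample value, wherein the second speech audio signal is a signal obtained after the noise component of the first speech audio signal is restored.
- The method according to claim 1, wherein the determining an adjustment amplitude value of each sample value according to the adaptive normalization length and the amplitude value of each sample value comprises: calculating an amplitude average value corresponding to each sample value according to the amplitude value of each sample value and the adaptive normalization length, and determining an amplitude disturbance value corresponding to each sample value according to the amplitude average value corresponding to each sample value; and calculating the adjustment amplitude value of each sample value according to the amplitude value of each sample value and its corresponding amplitude disturbance value.
- The method according to claim 2, wherein the calculating an amplitude average value corresponding to each sample value according to the amplitude value of each sample value and the adaptive normalization length comprises: for each sample value, determining, according to the adaptive normalization length, a sub-band to which the sample value belongs; and calculating an average of the amplitude values of all sample values in the sub-band to which the sample value belongs, and using the calculated average as the amplitude average value corresponding to the sample value.
- The method according to claim 3, wherein the determining, for each sample value, a sub-band to which the sample value belongs according to the adaptive normalization length comprises: dividing all sample values into sub-bands in a preset order according to the adaptive normalization length, and, for each sample value, determining the sub-band that comprises the sample value as the sub-band to which the sample value belongs; or, for each sample value, determining a sub-band formed by m sample values before the sample value, the sample value, and n sample values after the sample value as the sub-band to which the sample value belongs, wherein m and n are determined by the adaptive normalization length, m is an integer not less than 0, and n is an integer not less than 0.
- The method according to any one of claims 2 to 4, wherein the calculating the adjustment amplitude value of each sample value according to the amplitude value of each sample value and its corresponding amplitude disturbance value comprises: subtracting the corresponding amplitude disturbance value from the amplitude value of each sample value to obtain a difference between the two, and using the obtained difference as the adjustment amplitude value of each sample value.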
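- The method according to any one of claims 1 to 5, wherein the determining an adaptive normalization length comprises: dividing a low frequency band signal in the speech audio signal into N sub-bands, wherein N is a natural number; calculating a peak-to-average ratio of each sub-band, and determining the number of sub-bands whose peak-to-average ratio is greater than a preset peak-to-average ratio threshold; and calculating the adaptive normalization length according to a signal type of a high frequency band signal in the speech audio signal and the number of sub-bands.
- The method according to claim 6, wherein the calculating the adaptive normalization length according to a signal type of a high frequency band signal in the speech audio signal and the number of sub-bands comprises: calculating the adaptive normalization length according to the formula L = K + α × M, wherein L is the adaptive normalization length; K is a value corresponding to the signal type of the high frequency band signal in the speech audio signal, and different signal types of the high frequency band signal correspond to different values of K; M is the number of sub-bands whose peak-to-average ratio is greater than the preset peak-to-average ratio threshold; and α is a constant less than 1.
- The method according to any one of claims 1 to 5, wherein the determining an adaptive normalization length comprises: calculating a peak-to-average ratio of a low frequency band signal in the speech audio signal and a peak-to-average ratio of a high frequency band signal in the speech audio signal; when an absolute value of a difference between the peak-to-average ratio of the low frequency band signal and the peak-to-average ratio of the high frequency band signal is less than a preset difference threshold, determining the adaptive normalization length as a preset first length value, and when the absolute value of the difference between the peak-to-average ratio of the low frequency band signal and the peak-to-average ratio of the high frequency band signal is not less than the preset difference threshold, determining the adaptive normalization length as a preset second length value, wherein the first length value is greater than the second length value; or, calculating a peak-to-average ratio of a low frequency band signal in the speech audio signal and a peak-to-average ratio of a high frequency band signal in the speech audio signal; when the peak-to-average ratio of the low frequency band signal is less than the peak-to-average ratio of the high frequency band signal, determining the adaptive normalization length as a preset first length value, and when the peak-to-average ratio of the low frequency band signal is not less than the peak-to-average ratio of the high frequency band signal, determining the adaptive normalization length as a preset second length value; or, determining the adaptive normalization length according to a signal type of a high frequency band signal in the speech audio signal, wherein different signal types of the high frequency band signal correspond to different adaptive normalization lengths.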
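- The method according to any one of claims 1 to 8, wherein the determining a second speech audio signal according to the sign of each sample value and the adjustment amplitude value of each sample value comprises: determining a new value of each sample value according to the sign and the adjustment amplitude value of each sample value, to obtain the second speech audio signal; or, calculating a correction factor; performing, according to the correction factor, correction processing on the adjustment amplitude values of the sample values that are greater than 0; and determining a new value of each sample value according to the sign of each sample value and the adjustment amplitude value after the correction processing, to obtain the second speech audio signal.
- The method according to claim 9, wherein the calculating a correction factor comprises: calculating the correction factor by using the formula β = a / L, wherein β is the correction factor, L is the adaptive normalization length, and a is a constant greater than 1.
- The method according to claim 9 or 10, wherein the performing, according to the correction factor, correction processing on the adjustment amplitude values of the sample values that are greater than 0 comprises: performing correction processing on the adjustment amplitude values of the sample values that are greater than 0 by using the following formula: Y = y × (b − β), wherein Y is the adjustment amplitude value after the correction processing, y is an adjustment amplitude value, among the adjustment amplitude values of the sample values, that is greater than 0, and b is a constant, 0 < b < 2.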
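- An apparatus for restoring a noise component of a speech audio signal, comprising: a code stream processing unit, configured to receive a code stream and decode the code stream to obtain a speech audio signal; a signal determining unit, configured to determine a first speech audio signal according to the speech audio signal obtained by the code stream processing unit, wherein the first speech audio signal is a signal in the decoded speech audio signal whose noise component needs to be restored; a first determining unit, configured to determine a sign of each sample value in the first speech audio signal determined by the signal determining unit and an amplitude value of each sample value; a second determining unit, configured to determine an adaptive normalization length; a third determining unit, configured to determine an adjustment amplitude value of each sample value according to the adaptive normalization length determined by the second determining unit and the amplitude value of each sample value determined by the first determining unit; and a fourth determining unit, configured to determine a second speech audio signal according to the sign of each sample value determined by the first determining unit and the adjustment amplitude value of each sample value determined by the third determining unit, wherein the second speech audio signal is a signal obtained after the noise component of the first speech audio signal is restored.
- The apparatus according to claim 12, wherein the third determining unit comprises: a determining subunit, configured to calculate an amplitude average value corresponding to each sample value according to the amplitude value of each sample value and the adaptive normalization length, and determine an amplitude disturbance value corresponding to each sample value according to the amplitude average value corresponding to each sample value; and an adjustment amplitude value calculation subunit, configured to calculate the adjustment amplitude value of each sample value according to the amplitude value of each sample value and its corresponding amplitude disturbance value.
- The apparatus according to claim 13, wherein the determining subunit comprises: a determining module, configured to determine, for each sample value, a sub-band to which the sample value belongs according to the adaptive normalization length; and a calculation module, configured to calculate an average of the amplitude values of all sample values in the sub-band to which the sample value belongs, and use the calculated average as the amplitude average value corresponding to the sample value.
- The apparatus according to claim 14, wherein the determining module is specifically configured to: divide all sample values into sub-bands in a preset order according to the adaptive normalization length, and, for each sample value, determine the sub-band that comprises the sample value as the sub-band to which the sample value belongs; or, for each sample value, determine a sub-band formed by m sample values before the sample value, the sample value, and n sample values after the sample value as the sub-band to which the sample value belongs, wherein m and n are determined by the adaptive normalization length, m is an integer not less than 0, and n is an integer not less than 0.
- The apparatus according to any one of claims 13 to 15, wherein the adjustment amplitude value calculation subunit is specifically configured to: subtract the corresponding amplitude disturbance value from the amplitude value of each sample value to obtain a difference between the two, and use the obtained difference as the adjustment amplitude value of each sample value.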
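- The apparatus according to any one of claims 12 to 16, wherein the second determining unit comprises: a dividing subunit, configured to divide a low frequency band signal in the speech audio signal into N sub-bands, wherein N is a natural number; a number determining subunit, configured to calculate a peak-to-average ratio of each sub-band and determine the number of sub-bands whose peak-to-average ratio is greater than a preset peak-to-average ratio threshold; and a length calculation subunit, configured to calculate the adaptive normalization length according to a signal type of a high frequency band signal in the speech audio signal and the number of sub-bands.
- The apparatus according to claim 17, wherein the length calculation subunit is specifically configured to: calculate the adaptive normalization length according to the formula L = K + α × M, wherein L is the adaptive normalization length; K is a value corresponding to the signal type of the high frequency band signal in the speech audio signal, and different signal types of the high frequency band signal correspond to different values of K; M is the number of sub-bands whose peak-to-average ratio is greater than the preset peak-to-average ratio threshold; and α is a constant less than 1.
- The apparatus according to any one of claims 12 to 16, wherein the second determining unit is specifically configured to: calculate a peak-to-average ratio of a low frequency band signal in the speech audio signal and a peak-to-average ratio of a high frequency band signal in the speech audio signal; when an absolute value of a difference between the peak-to-average ratio of the low frequency band signal and the peak-to-average ratio of the high frequency band signal is less than a preset difference threshold, determine the adaptive normalization length as a preset first length value, and when the absolute value of the difference between the peak-to-average ratio of the low frequency band signal and the peak-to-average ratio of the high frequency band signal is not less than the preset difference threshold, determine the adaptive normalization length as a preset second length value, wherein the first length value is greater than the second length value; or, calculate a peak-to-average ratio of a low frequency band signal in the speech audio signal and a peak-to-average ratio of a high frequency band signal in the speech audio signal; when the peak-to-average ratio of the low frequency band signal is less than the peak-to-average ratio of the high frequency band signal, determine the adaptive normalization length as a preset first length value, and when the peak-to-average ratio of the low frequency band signal is not less than the peak-to-average ratio of the high frequency band signal, determine the adaptive normalization length as a preset second length value; or, determine the adaptive normalization length according to a signal type of a high frequency band signal in the speech audio signal, wherein different signal types of the high frequency band signal correspond to different adaptive normalization lengths.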
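- The apparatus according to any one of claims 12 to 19, wherein the fourth determining unit is specifically configured to: determine a new value of each sample value according to the sign and the adjustment amplitude value of each sample value, to obtain the second speech audio signal; or, calculate a correction factor; perform, according to the correction factor, correction processing on the adjustment amplitude values of the sample values that are greater than 0; and determine a new value of each sample value according to the sign of each sample value and the adjustment amplitude value after the correction processing, to obtain the second speech audio signal.
- The apparatus according to claim 20, wherein the fourth determining unit is specifically configured to: calculate the correction factor by using the formula β = a / L, wherein β is the correction factor, L is the adaptive normalization length, and a is a constant greater than 1.
- The apparatus according to claim 20 or 21, wherein the fourth determining unit is specifically configured to: perform correction processing on the adjustment amplitude values of the sample values that are greater than 0 by using the following formula: Y = y × (b − β), wherein Y is the adjustment amplitude value after the correction processing, y is an adjustment amplitude value, among the adjustment amplitude values of the sample values, that is greater than 0, and b is a constant, 0 < b < 2.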
Priority Applications (19)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SG11201610141RA SG11201610141RA (en) | 2014-06-03 | 2015-01-19 | Method for processing speech/audio signal and apparatus |
EP23184053.9A EP4283614A3 (en) | 2014-06-03 | 2015-01-19 | Method for processing speech/audio signal and apparatus |
EP19190663.5A EP3712890B1 (en) | 2014-06-03 | 2015-01-19 | Method for processing speech/audio signal and apparatus |
EP15802508.0A EP3147900B1 (en) | 2014-06-03 | 2015-01-19 | Method and device for processing audio signal |
CA2951169A CA2951169C (en) | 2014-06-03 | 2015-01-19 | Method for processing speech/audio signal and apparatus |
RU2016152224A RU2651184C1 (ru) | 2014-06-03 | 2015-01-19 | Способ обработки речевого/звукового сигнала и устройство |
KR1020167035690A KR101943529B1 (ko) | 2014-06-03 | 2015-01-19 | 오디오 신호를 처리하기 위한 방법 및 장치 |
BR112016028375-9A BR112016028375B1 (pt) | 2014-06-03 | 2015-01-19 | Método para processar sinal de fala/áudio e aparelho |
KR1020207011385A KR102201791B1 (ko) | 2014-06-03 | 2015-01-19 | 오디오 신호를 처리하기 위한 방법 및 장치 |
JP2016570979A JP6462727B2 (ja) | 2014-06-03 | 2015-01-19 | 音声/オーディオ信号を処理するための方法および装置 |
KR1020197002091A KR102104561B1 (ko) | 2014-06-03 | 2015-01-19 | 오디오 신호를 처리하기 위한 방법 및 장치 |
NZ727567A NZ727567A (en) | 2014-06-03 | 2015-01-19 | Method for processing speech/audio signal and apparatus |
MX2016015950A MX362612B (es) | 2014-06-03 | 2015-01-19 | Metodo para procesar señal de voz/audio y aparato. |
AU2015271580A AU2015271580B2 (en) | 2014-06-03 | 2015-01-19 | Method for processing speech/audio signal and apparatus |
IL249337A IL249337B (en) | 2014-06-03 | 2016-12-01 | Method and apparatus for processing speech/audio signals |
US15/369,396 US9978383B2 (en) | 2014-06-03 | 2016-12-05 | Method for processing speech/audio signal and apparatus |
ZA2016/08477A ZA201608477B (en) | 2014-06-03 | 2016-12-08 | Method for processing speech/audio signal and apparatus |
US15/985,281 US10657977B2 (en) | 2014-06-03 | 2018-05-21 | Method for processing speech/audio signal and apparatus |
US16/877,389 US11462225B2 (en) | 2014-06-03 | 2020-05-18 | Method for processing speech/audio signal and apparatus |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410242233.2 | 2014-06-03 | ||
CN201410242233.2A CN105336339B (zh) | 2014-06-03 | 2014-06-03 | 一种语音频信号的处理方法和装置 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/369,396 Continuation US9978383B2 (en) | 2014-06-03 | 2016-12-05 | Method for processing speech/audio signal and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015184813A1 true WO2015184813A1 (zh) | 2015-12-10 |
Family
ID=54766052
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2015/071017 WO2015184813A1 (zh) | 2014-06-03 | 2015-01-19 | 一种语音频信号的处理方法和装置 |
Country Status (19)
Country | Link |
---|---|
US (3) | US9978383B2 (zh) |
EP (3) | EP3147900B1 (zh) |
JP (3) | JP6462727B2 (zh) |
KR (3) | KR102104561B1 (zh) |
CN (2) | CN110097892B (zh) |
AU (1) | AU2015271580B2 (zh) |
BR (1) | BR112016028375B1 (zh) |
CA (1) | CA2951169C (zh) |
CL (1) | CL2016003121A1 (zh) |
ES (1) | ES2964221T3 (zh) |
HK (1) | HK1220543A1 (zh) |
IL (1) | IL249337B (zh) |
MX (2) | MX362612B (zh) |
MY (1) | MY179546A (zh) |
NZ (1) | NZ727567A (zh) |
RU (1) | RU2651184C1 (zh) |
SG (1) | SG11201610141RA (zh) |
WO (1) | WO2015184813A1 (zh) |
ZA (1) | ZA201608477B (zh) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110097892B (zh) * | 2014-06-03 | 2022-05-10 | 华为技术有限公司 | 一种语音频信号的处理方法和装置 |
CN108133712B (zh) * | 2016-11-30 | 2021-02-12 | 华为技术有限公司 | 一种处理音频数据的方法和装置 |
CN106847299B (zh) * | 2017-02-24 | 2020-06-19 | 喜大(上海)网络科技有限公司 | 延时的估计方法及装置 |
RU2754497C1 (ru) * | 2020-11-17 | 2021-09-02 | федеральное государственное автономное образовательное учреждение высшего образования "Казанский (Приволжский) федеральный университет" (ФГАОУ ВО КФУ) | Способ передачи речевых файлов по зашумленному каналу и устройство для его реализации |
US20230300524A1 (en) * | 2022-03-21 | 2023-09-21 | Qualcomm Incorporated | Adaptively adjusting an input current limit for a boost converter |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020120439A1 (en) * | 2001-02-28 | 2002-08-29 | Telefonaktiebolaget Lm Ericsson (Publ) | Method and apparatus for providing comfort noise in communication system with discontinuous transmission |
WO2003042982A1 (en) * | 2001-11-13 | 2003-05-22 | Acoustic Technologies Inc. | Comfort noise including recorded noise |
CN101320563A (zh) * | 2007-06-05 | 2008-12-10 | 华为技术有限公司 | 一种背景噪声编码/解码装置、方法和通信设备 |
CN101335003A (zh) * | 2007-09-28 | 2008-12-31 | 华为技术有限公司 | 噪声生成装置、及方法 |
CN101366077A (zh) * | 2005-08-31 | 2009-02-11 | 摩托罗拉公司 | 在语音通信系统中产生舒适噪声的方法和设备 |
CN101483042A (zh) * | 2008-03-20 | 2009-07-15 | 华为技术有限公司 | 一种噪声生成方法以及噪声生成装置 |
US8139777B2 (en) * | 2007-10-31 | 2012-03-20 | Qnx Software Systems Co. | System for comfort noise injection |
JP2013015598A (ja) * | 2011-06-30 | 2013-01-24 | Zte Corp | オーディオ符号化/復号化方法、システム及びノイズレベルの推定方法 |
Family Cites Families (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6261312B1 (en) | 1998-06-23 | 2001-07-17 | Innercool Therapies, Inc. | Inflatable catheter for selective organ heating and cooling and method of using the same |
SE9803698L (sv) * | 1998-10-26 | 2000-04-27 | Ericsson Telefon Ab L M | Metoder och anordningar i ett telekommunikationssystem |
CA2252170A1 (en) * | 1998-10-27 | 2000-04-27 | Bruno Bessette | A method and device for high quality coding of wideband speech and audio signals |
US6687668B2 (en) * | 1999-12-31 | 2004-02-03 | C & S Technology Co., Ltd. | Method for improvement of G.723.1 processing time and speech quality and for reduction of bit rate in CELP vocoder and CELP vococer using the same |
US6631139B2 (en) * | 2001-01-31 | 2003-10-07 | Qualcomm Incorporated | Method and apparatus for interoperability between voice transmission systems during speech inactivity |
KR100935961B1 (ko) * | 2001-11-14 | 2010-01-08 | 파나소닉 주식회사 | 부호화 장치 및 복호화 장치 |
US7536298B2 (en) * | 2004-03-15 | 2009-05-19 | Intel Corporation | Method of comfort noise generation for speech communication |
US7831421B2 (en) * | 2005-05-31 | 2010-11-09 | Microsoft Corporation | Robust decoder |
US8255213B2 (en) | 2006-07-12 | 2012-08-28 | Panasonic Corporation | Speech decoding apparatus, speech encoding apparatus, and lost frame concealment method |
KR101396140B1 (ko) * | 2006-09-18 | 2014-05-20 | 코닌클리케 필립스 엔.브이. | 오디오 객체들의 인코딩과 디코딩 |
AU2009267518B2 (en) * | 2008-07-11 | 2012-08-16 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method for encoding/decoding an audio signal using an aliasing switch scheme |
PT2146344T (pt) * | 2008-07-17 | 2016-10-13 | Fraunhofer Ges Forschung | Esquema de codificação/descodificação de áudio com uma derivação comutável |
CN101483048B (zh) | 2009-02-06 | 2010-08-25 | 凌阳科技股份有限公司 | 光学储存装置及其回路增益值的自动校正方法 |
US9047875B2 (en) * | 2010-07-19 | 2015-06-02 | Futurewei Technologies, Inc. | Spectrum flatness control for bandwidth extension |
CN102436820B (zh) | 2010-09-29 | 2013-08-28 | 华为技术有限公司 | 高频带信号编码方法及装置、高频带信号解码方法及装置 |
JP6189831B2 (ja) * | 2011-05-13 | 2017-08-30 | サムスン エレクトロニクス カンパニー リミテッド | ビット割り当て方法及び記録媒体 |
US20130006644A1 (en) * | 2011-06-30 | 2013-01-03 | Zte Corporation | Method and device for spectral band replication, and method and system for audio decoding |
CN102208188B (zh) * | 2011-07-13 | 2013-04-17 | 华为技术有限公司 | 音频信号编解码方法和设备 |
US20130132100A1 (en) | 2011-10-28 | 2013-05-23 | Electronics And Telecommunications Research Institute | Apparatus and method for codec signal in a communication system |
LT2774145T (lt) * | 2011-11-03 | 2020-09-25 | Voiceage Evs Llc | Nekalbinio turinio gerinimas mažos spartos celp dekoderiui |
US20130282373A1 (en) | 2012-04-23 | 2013-10-24 | Qualcomm Incorporated | Systems and methods for audio signal processing |
CN110097892B (zh) * | 2014-06-03 | 2022-05-10 | 华为技术有限公司 | 一种语音频信号的处理方法和装置 |
US12044962B2 (en) | 2019-04-19 | 2024-07-23 | Canon Kabushiki Kaisha | Forming apparatus, forming method, and article manufacturing method |
2014
- 2014-06-03 CN CN201910358522.1A patent/CN110097892B/zh active Active
- 2014-06-03 CN CN201410242233.2A patent/CN105336339B/zh active Active
2015
- 2015-01-19 SG SG11201610141RA patent/SG11201610141RA/en unknown
- 2015-01-19 RU RU2016152224A patent/RU2651184C1/ru active
- 2015-01-19 MY MYPI2016704486A patent/MY179546A/en unknown
- 2015-01-19 EP EP15802508.0A patent/EP3147900B1/en active Active
- 2015-01-19 MX MX2016015950A patent/MX362612B/es active IP Right Grant
- 2015-01-19 AU AU2015271580A patent/AU2015271580B2/en active Active
- 2015-01-19 CA CA2951169A patent/CA2951169C/en active Active
- 2015-01-19 EP EP23184053.9A patent/EP4283614A3/en active Pending
- 2015-01-19 WO PCT/CN2015/071017 patent/WO2015184813A1/zh active Application Filing
- 2015-01-19 EP EP19190663.5A patent/EP3712890B1/en active Active
- 2015-01-19 BR BR112016028375-9A patent/BR112016028375B1/pt active IP Right Grant
- 2015-01-19 JP JP2016570979A patent/JP6462727B2/ja active Active
- 2015-01-19 KR KR1020197002091A patent/KR102104561B1/ko active IP Right Grant
- 2015-01-19 KR KR1020167035690A patent/KR101943529B1/ko active IP Right Grant
- 2015-01-19 ES ES19190663T patent/ES2964221T3/es active Active
- 2015-01-19 KR KR1020207011385A patent/KR102201791B1/ko active IP Right Grant
- 2015-01-19 NZ NZ727567A patent/NZ727567A/en unknown
2016
- 2016-07-15 HK HK16108374.1A patent/HK1220543A1/zh unknown
- 2016-12-01 IL IL249337A patent/IL249337B/en active IP Right Grant
- 2016-12-02 CL CL2016003121A patent/CL2016003121A1/es unknown
- 2016-12-02 MX MX2019001193A patent/MX2019001193A/es unknown
- 2016-12-05 US US15/369,396 patent/US9978383B2/en active Active
- 2016-12-08 ZA ZA2016/08477A patent/ZA201608477B/en unknown
2018
- 2018-05-21 US US15/985,281 patent/US10657977B2/en active Active
- 2018-12-26 JP JP2018242725A patent/JP6817283B2/ja active Active
2020
- 2020-05-18 US US16/877,389 patent/US11462225B2/en active Active
- 2020-12-23 JP JP2020213571A patent/JP7142674B2/ja active Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020120439A1 (en) * | 2001-02-28 | 2002-08-29 | Telefonaktiebolaget Lm Ericsson (Publ) | Method and apparatus for providing comfort noise in communication system with discontinuous transmission |
WO2003042982A1 (en) * | 2001-11-13 | 2003-05-22 | Acoustic Technologies Inc. | Comfort noise including recorded noise |
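CN101366077A (zh) * | 2005-08-31 | 2009-02-11 | Motorola, Inc. | Method and apparatus for generating comfort noise in a voice communication system |
CN101320563A (zh) * | 2007-06-05 | 2008-12-10 | Huawei Technologies Co., Ltd. | Background noise encoding/decoding apparatus, method, and communication device |
CN101335003A (zh) * | 2007-09-28 | 2008-12-31 | Huawei Technologies Co., Ltd. | Noise generation apparatus and method |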
US8139777B2 (en) * | 2007-10-31 | 2012-03-20 | Qnx Software Systems Co. | System for comfort noise injection |
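CN101483042A (zh) * | 2008-03-20 | 2009-07-15 | Huawei Technologies Co., Ltd. | Noise generation method and noise generation apparatus |
JP2013015598A (ja) * | 2011-06-30 | 2013-01-24 | Zte Corp | Audio encoding/decoding method, system, and noise level estimation method |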
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2015184813A1 (zh) | | Method and apparatus for processing a speech/audio signal |
KR101924767B1 (ko) | | Voice frequency code stream decoding method and device |
US20150036679A1 (en) | | Methods and apparatuses for transmitting and receiving audio signals |
JP6616470B2 (ja) | | Encoding method, decoding method, encoding apparatus, and decoding apparatus |
WO2014194625A1 (en) | | Systems and methods for audio encoding and decoding |
US9312893B2 (en) | | Systems, methods and devices for electronic communications having decreased information loss |
CN103456307A (zh) | | Spectrum substitution method and system for frame error concealment in an audio decoder |
WO2015165264A1 (zh) | | Signal processing method and device |
JP2003522981A (ja) | | Error correction method with pitch change detection |
US20150194157A1 (en) | | System, method, and computer program product for artifact reduction in high-frequency regeneration audio signals |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 15802508; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | WIPO information: entry into national phase | Ref document number: 249337; Country of ref document: IL |
| | ENP | Entry into the national phase | Ref document number: 2016570979; Country of ref document: JP; Kind code of ref document: A |
| | WWE | WIPO information: entry into national phase | Ref document number: MX/A/2016/015950; Country of ref document: MX |
| | ENP | Entry into the national phase | Ref document number: 2951169; Country of ref document: CA |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112016028375; Country of ref document: BR |
| | REEP | Request for entry into the European phase | Ref document number: 2015802508; Country of ref document: EP |
| | WWE | WIPO information: entry into national phase | Ref document number: 2015802508; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 20167035690; Country of ref document: KR; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 2016152224; Country of ref document: RU; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 2015271580; Country of ref document: AU; Date of ref document: 20150119; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 112016028375; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20161202 |