WO2015184813A1 - Method and apparatus for processing a speech/audio signal - Google Patents

Method and apparatus for processing a speech/audio signal

Info

Publication number
WO2015184813A1
WO2015184813A1 (application PCT/CN2015/071017, CN2015071017W)
Authority
WO
WIPO (PCT)
Prior art keywords: value, audio signal, signal, sampled, length
Prior art date
Application number
PCT/CN2015/071017
Other languages
English (en)
French (fr)
Inventor
刘泽新 (Liu Zexin)
苗磊 (Miao Lei)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to KR1020197002091A priority Critical patent/KR102104561B1/ko
Priority to EP23184053.9A priority patent/EP4283614A3/en
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to CA2951169A priority patent/CA2951169C/en
Priority to SG11201610141RA priority patent/SG11201610141RA/en
Priority to AU2015271580A priority patent/AU2015271580B2/en
Priority to EP19190663.5A priority patent/EP3712890B1/en
Priority to KR1020207011385A priority patent/KR102201791B1/ko
Priority to BR112016028375-9A priority patent/BR112016028375B1/pt
Priority to KR1020167035690A priority patent/KR101943529B1/ko
Priority to JP2016570979A priority patent/JP6462727B2/ja
Priority to RU2016152224A priority patent/RU2651184C1/ru
Priority to NZ727567A priority patent/NZ727567A/en
Priority to EP15802508.0A priority patent/EP3147900B1/en
Priority to MX2016015950A priority patent/MX362612B/es
Publication of WO2015184813A1 publication Critical patent/WO2015184813A1/zh
Priority to IL249337A priority patent/IL249337B/en
Priority to US15/369,396 priority patent/US9978383B2/en
Priority to ZA2016/08477A priority patent/ZA201608477B/en
Priority to US15/985,281 priority patent/US10657977B2/en
Priority to US16/877,389 priority patent/US11462225B2/en

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/028Noise substitution, i.e. substituting non-tonal spectral components by noisy source
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/012Comfort noise or silence coding
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16Vocoder architecture
    • G10L19/167Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/26Pre-filtering or post-filtering
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0316Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques

Definitions

  • The present invention relates to the field of communications, and in particular, to a method and an apparatus for processing a speech/audio signal.
  • To achieve better auditory quality, electronic devices currently recover the noise component of the decoded speech/audio signal when decoding the encoded information of the speech/audio signal.
  • At present, when an electronic device recovers the noise component of a speech/audio signal, this is generally done by adding a random noise signal to the speech/audio signal. Specifically, the speech/audio signal and the random noise signal are weighted and summed to obtain a signal in which the noise component of the speech/audio signal has been recovered; the speech/audio signal may be a time-domain signal, a frequency-domain signal, or an excitation signal, and may be a low-frequency signal or a high-frequency signal, and so on.
  • However, if the speech/audio signal is a signal having a rising edge or a falling edge, this way of recovering the noise component causes the resulting signal to contain echo, which degrades the auditory quality of the signal after the noise component is recovered.
  • Embodiments of the present invention provide a method and an apparatus for processing a speech/audio signal, so that for a speech/audio signal having a rising edge or a falling edge, recovering the noise component does not introduce echo into the resulting signal, thereby improving the auditory quality of the signal after the noise component is recovered.
  • According to a first aspect, an embodiment of the present invention provides a method for processing a speech/audio signal, where the method includes:
  • the first speech/audio signal is a signal, in the speech/audio signal, whose noise component needs to be recovered;
  • the determining an adjustment amplitude value of each of the sampled values according to the adaptive normalization length and the amplitude value of each of the sampled values includes:
  • calculating an amplitude average value corresponding to each of the sampled values according to the amplitude value of each of the sampled values and the adaptive normalization length, and determining an amplitude disturbance value corresponding to each of the sampled values according to the amplitude average value corresponding to each of the sampled values; and calculating an adjustment amplitude value of each of the sampled values according to the amplitude value of each of the sampled values and its corresponding amplitude disturbance value.
  • the calculating an amplitude average value corresponding to each of the sampled values according to the amplitude value of each of the sampled values and the adaptive normalization length includes:
  • for each of the sampled values, determining, according to the adaptive normalization length, the sub-band to which the sampled value belongs; calculating the average of the amplitude values of all sampled values in the sub-band to which the sampled value belongs; and using the calculated average as the amplitude average value corresponding to the sampled value.
  • the determining, for each of the sampled values and according to the adaptive normalization length, the sub-band to which the sampled value belongs includes:
  • dividing all sampled values into sub-bands in a preset order according to the adaptive normalization length and, for each of the sampled values, determining the sub-band that includes the sampled value as the sub-band to which the sampled value belongs; or determining a sub-band composed of the m sampled values before the sampled value, the sampled value itself, and the n sampled values after the sampled value as the sub-band to which the sampled value belongs, where m and n are determined by the adaptive normalization length, m is an integer not less than 0, and n is an integer not less than 0.
  • With reference to the first possible implementation of the first aspect, and/or the second possible implementation of the first aspect, and/or the third possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the calculating an adjustment amplitude value of each of the sampled values according to the amplitude value of each of the sampled values and its corresponding amplitude disturbance value includes:
  • subtracting, from the amplitude value of each of the sampled values, its corresponding amplitude disturbance value to obtain the difference between the two, and using the obtained difference as the adjustment amplitude value of each of the sampled values.
  • the determining the adaptive normalization length includes:
  • N is a natural number
  • the adaptive normalized length is calculated according to a signal type of the high frequency band signal and the number of the subbands in the speech audio signal.
  • the calculating, according to a signal type of the high frequency band signal and the number of the subbands in the voice audio signal, The adaptive normalized length includes:
  • L is the adaptive normalized length
  • K is a value corresponding to the signal type of the high-band signal in the speech and audio signal, and the value of K corresponding to the signal type of the different high-band signal is different
  • the peak-to-average ratio is greater than the preset peak-to-average ratio of the number of sub-bands
  • is a constant less than one.
  • the determining an adaptive normalization length includes:
  • calculating a peak-to-average ratio of the low-band signal in the speech/audio signal and a peak-to-average ratio of the high-band signal in the speech/audio signal; when the absolute value of the difference between the peak-to-average ratio of the low-band signal and the peak-to-average ratio of the high-band signal is less than a preset difference threshold, determining the adaptive normalization length as a preset first length value, and when the absolute value of the difference is not less than the preset difference threshold, determining the adaptive normalization length as a preset second length value, where the first length value is greater than the second length value; or,
  • when the peak-to-average ratio of the low-band signal is less than the peak-to-average ratio of the high-band signal, determining the adaptive normalization length as a preset first length value, and when the peak-to-average ratio of the low-band signal is not less than the peak-to-average ratio of the high-band signal, determining the adaptive normalization length as a preset second length value; or,
  • determining the adaptive normalization length according to a signal type of the high-band signal in the speech/audio signal, where different signal types of the high-band signal correspond to different adaptive normalization lengths.
  • the determining a second speech/audio signal according to the sign of each of the sampled values and the adjustment amplitude value of each of the sampled values includes:
  • determining a new value of each of the sampled values according to the sign of each of the sampled values and its adjustment amplitude value, to obtain the second speech/audio signal; or, calculating a correction factor, correcting, according to the correction factor, the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sampled values, and determining a new value of each of the sampled values according to the sign of each of the sampled values and the corrected adjustment amplitude values, to obtain the second speech/audio signal.
  • the calculating a correction factor includes: calculating the correction factor according to the formula β = a/L, where β is the correction factor, L is the adaptive normalization length, and a is a constant greater than 1.
  • the correcting, according to the correction factor, the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sampled values includes:
  • correcting the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sampled values by using the following formula:
  • Y = y × (b - β);
  • where Y is the adjustment amplitude value after correction, y is an adjustment amplitude value greater than 0 among the adjustment amplitude values of the sampled values, and b is a constant with 0 < b < 2.
  • According to a second aspect, an embodiment of the present invention provides an apparatus for recovering a noise component of a speech/audio signal, including:
  • a code stream processing unit, configured to receive a code stream and decode the code stream to obtain a speech/audio signal; a signal determining unit, configured to determine a first speech/audio signal according to the speech/audio signal, where the first speech/audio signal is a signal, in the decoded speech/audio signal, whose noise component needs to be recovered;
  • a first determining unit, configured to determine a sign of each sampled value in the first speech/audio signal determined by the signal determining unit and an amplitude value of each of the sampled values;
  • a second determining unit, configured to determine an adaptive normalization length;
  • a third determining unit, configured to determine an adjustment amplitude value of each of the sampled values according to the adaptive normalization length determined by the second determining unit and the amplitude value of each of the sampled values determined by the first determining unit;
  • a fourth determining unit, configured to determine a second speech/audio signal according to the sign of each of the sampled values determined by the first determining unit and the adjustment amplitude value of each of the sampled values determined by the third determining unit,
  • where the second speech/audio signal is a signal obtained after the noise component of the first speech/audio signal is recovered.
  • the third determining unit includes:
  • a determining sub-unit, configured to calculate an amplitude average value corresponding to each of the sampled values according to the amplitude value of each of the sampled values and the adaptive normalization length, and to determine an amplitude disturbance value corresponding to each of the sampled values according to the amplitude average value corresponding to each of the sampled values; and
  • an adjustment amplitude value calculation sub-unit, configured to calculate the adjustment amplitude value of each of the sampled values according to the amplitude value of each of the sampled values and its corresponding amplitude disturbance value.
  • the determining sub-unit includes:
  • a determining module, configured to determine, for each of the sampled values, the sub-band to which the sampled value belongs according to the adaptive normalization length; and
  • a calculation module, configured to calculate the average of the amplitude values of all sampled values in the sub-band to which the sampled value belongs, and to use the calculated average as the amplitude average value corresponding to the sampled value.
  • the determining module is specifically configured to:
  • divide all sampled values into sub-bands in a preset order according to the adaptive normalization length and, for each of the sampled values, determine the sub-band that includes the sampled value as the sub-band to which the sampled value belongs; or determine a sub-band composed of the m sampled values before the sampled value, the sampled value itself, and the n sampled values after the sampled value as the sub-band to which the sampled value belongs, where m and n are determined by the adaptive normalization length, m is an integer not less than 0, and n is an integer not less than 0.
  • the adjustment amplitude value calculation sub-unit is specifically configured to:
  • subtract, from the amplitude value of each of the sampled values, its corresponding amplitude disturbance value to obtain the difference between the two, and use the obtained difference as the adjustment amplitude value of each of the sampled values.
  • the second determining unit includes:
  • a dividing sub-unit, configured to divide the low-band signal in the speech/audio signal into N sub-bands, where N is a natural number; a number determining sub-unit, configured to calculate a peak-to-average ratio of each of the sub-bands and determine the number of sub-bands whose peak-to-average ratio is greater than a preset peak-to-average ratio threshold; and
  • a length calculation sub-unit, configured to calculate the adaptive normalization length according to a signal type of the high-band signal in the speech/audio signal and the number of the sub-bands.
  • the length calculation sub-unit is specifically configured to: calculate the adaptive normalization length according to the formula L = K + α × M, where
  • L is the adaptive normalization length;
  • K is a value corresponding to the signal type of the high-band signal in the speech/audio signal, and different signal types of the high-band signal correspond to different values of K;
  • M is the number of sub-bands whose peak-to-average ratio is greater than the preset peak-to-average ratio threshold; and
  • α is a constant less than 1.
  • the second determining unit is specifically configured to:
  • calculate a peak-to-average ratio of the low-band signal in the speech/audio signal and a peak-to-average ratio of the high-band signal in the speech/audio signal; when the absolute value of the difference between the peak-to-average ratio of the low-band signal and the peak-to-average ratio of the high-band signal is less than a preset difference threshold, determine the adaptive normalization length as a preset first length value, and when the absolute value of the difference is not less than the preset difference threshold, determine the adaptive normalization length as a preset second length value, where the first length value is greater than the second length value; or,
  • when the peak-to-average ratio of the low-band signal is less than the peak-to-average ratio of the high-band signal, determine the adaptive normalization length as a preset first length value, and when the peak-to-average ratio of the low-band signal is not less than the peak-to-average ratio of the high-band signal, determine the adaptive normalization length as a preset second length value; or,
  • determine the adaptive normalization length according to a signal type of the high-band signal in the speech/audio signal, where different signal types of the high-band signal correspond to different adaptive normalization lengths.
  • the fourth determining unit is specifically configured to:
  • determine a new value of each of the sampled values according to the sign of each of the sampled values and its adjustment amplitude value, to obtain the second speech/audio signal; or, calculate a correction factor, correct, according to the correction factor, the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sampled values, and determine a new value of each of the sampled values according to the sign of each of the sampled values and the corrected adjustment amplitude values, to obtain the second speech/audio signal.
  • the fourth determining unit is specifically configured to: calculate the correction factor according to the formula β = a/L, where β is the correction factor, L is the adaptive normalization length, and a is a constant greater than 1; and
  • correct the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sampled values by using the following formula: Y = y × (b - β),
  • where Y is the adjustment amplitude value after correction, y is an adjustment amplitude value greater than 0 among the adjustment amplitude values of the sampled values, and b is a constant with 0 < b < 2.
  • In the embodiments, a code stream is received and decoded to obtain a speech/audio signal; a first speech/audio signal is determined according to the speech/audio signal; a sign of each sampled value in the first speech/audio signal and an amplitude value of each of the sampled values are determined; an adaptive normalization length is determined; an adjustment amplitude value of each of the sampled values is determined according to the adaptive normalization length and the amplitude value of each of the sampled values; and a second speech/audio signal is determined according to the sign of each of the sampled values and the adjustment amplitude value of each of the sampled values. In this process, only the original first speech/audio signal is processed and no new signal is added to it, so no new energy is added to the second speech/audio signal after the noise component is recovered. Therefore, even if the first speech/audio signal has a rising edge or a falling edge, the echo in the second speech/audio signal is not increased, which improves the auditory quality of the second speech/audio signal.
  • FIG. 1A is a schematic diagram showing an example of sampled-value grouping according to an embodiment of the present invention.
  • FIG. 1B is another schematic diagram of an example of sampling value grouping according to an embodiment of the present invention.
  • FIG. 2 is a schematic flow chart of another method for restoring a noise component of a speech audio signal according to an embodiment of the present invention
  • FIG. 3 is a schematic flow chart of another method for restoring a noise component of a speech and audio signal according to an embodiment of the present invention
  • FIG. 4 is a schematic structural diagram of an apparatus for restoring a noise component of a speech and audio signal according to an embodiment of the present invention
  • FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
  • FIG. 1 is a flowchart of a method for restoring a noise component of a speech and audio signal according to an embodiment of the present invention, where the method includes:
  • Step 101 Receive a code stream, and decode the code stream to obtain a voice audio signal.
  • Step 102 Determine a first speech audio signal according to the speech audio signal;
  • the first speech audio signal is a signal that needs to recover a noise component in the decoded speech audio signal;
  • the first speech audio signal may be a low frequency band signal, a high frequency band signal, or a full frequency band signal or the like in the decoded speech audio signal.
  • the decoded speech audio signal may include one low frequency band signal and one high frequency band signal, or may also include one full frequency band signal.
  • Step 103: Determine a sign of each sampled value in the first speech/audio signal and an amplitude value of each of the sampled values;
  • When the first speech/audio signal is implemented differently, the sampled values may also be implemented differently. For example, if the first speech/audio signal is a frequency-domain signal, a sampled value may be a spectral coefficient; if the first speech/audio signal is a time-domain signal, a sampled value may be a sample point value.
  • Step 104: Determine an adaptive normalization length;
  • When determining the adaptive normalization length, it may be determined according to relevant parameters of the low-band signal and/or the high-band signal of the decoded speech/audio signal. Specifically, the relevant parameters may include a signal type, a peak-to-average ratio, and the like.
  • For example, in one possible implementation, the determining an adaptive normalization length may include:
  • dividing the low-band signal in the speech/audio signal into N sub-bands, where N is a natural number; calculating a peak-to-average ratio of each of the sub-bands, and determining the number of sub-bands whose peak-to-average ratio is greater than a preset peak-to-average ratio threshold; and
  • calculating the adaptive normalization length according to a signal type of the high-band signal in the speech/audio signal and the number of the sub-bands.
  • Optionally, the calculating the adaptive normalization length according to the signal type of the high-band signal and the number of the sub-bands may include: calculating the adaptive normalization length according to the formula L = K + α × M, where L is the adaptive normalization length; K is a value corresponding to the signal type of the high-band signal in the speech/audio signal, and different signal types of the high-band signal correspond to different values of K; M is the number of sub-bands whose peak-to-average ratio is greater than the preset peak-to-average ratio threshold; and α is a constant less than 1.
  • In another possible implementation, the adaptive normalization length may also be calculated according to a signal type of the low-band signal in the speech/audio signal and the number of the sub-bands. The calculation formula is still L = K + α × M; the only difference is that K is then the value corresponding to the signal type of the low-band signal in the speech/audio signal, and different signal types of the low-band signal correspond to different values of K.
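  • As a rough, non-authoritative illustration of the formula above, the following Python sketch computes L = K + α × M from a low-band spectrum. The peak-to-average-ratio definition (maximum magnitude over mean magnitude), the example value of α, and all function and parameter names are assumptions made for illustration only; they are not specified by this application.

```python
import numpy as np

def adaptive_normalization_length(low_band, K, N, par_threshold, alpha=0.5):
    """Sketch of L = K + alpha * M.

    low_band      : 1-D array of low-band spectral coefficients
    K             : value corresponding to the high-band signal type
    N             : number of sub-bands the low band is divided into
    par_threshold : preset peak-to-average-ratio threshold
    alpha         : constant smaller than 1 (example value)
    """
    M = 0  # number of sub-bands whose peak-to-average ratio exceeds the threshold
    for sub_band in np.array_split(np.abs(np.asarray(low_band, dtype=float)), N):
        if sub_band.size == 0:
            continue
        mean = sub_band.mean()
        if mean > 0 and sub_band.max() / mean > par_threshold:
            M += 1
    return K + alpha * M
```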
  • In a third possible implementation, determining the adaptive normalization length may include:
  • calculating the peak-to-average ratio of the low-band signal in the speech/audio signal and the peak-to-average ratio of the high-band signal in the speech/audio signal; when the absolute value of the difference between the two peak-to-average ratios is less than a preset difference threshold, determining the adaptive normalization length as a preset first length value, and when the absolute value of the difference is not less than the preset difference threshold, determining the adaptive normalization length as a preset second length value.
  • The first length value is greater than the second length value. The first length value and the second length value may also be calculated from the ratio or the difference of the peak-to-average ratio of the low-band signal and the peak-to-average ratio of the high-band signal; the specific calculation method is not limited.
  • In a fourth possible implementation, determining the adaptive normalization length may include:
  • calculating the peak-to-average ratio of the low-band signal and the peak-to-average ratio of the high-band signal; when the peak-to-average ratio of the low-band signal is less than the peak-to-average ratio of the high-band signal, determining the adaptive normalization length as a preset first length value, and when the peak-to-average ratio of the low-band signal is not less than the peak-to-average ratio of the high-band signal, determining the adaptive normalization length as a preset second length value.
  • The first length value is greater than the second length value. The first length value and the second length value may also be calculated from the ratio or the difference of the peak-to-average ratio of the low-band signal and the peak-to-average ratio of the high-band signal; the specific calculation method is not limited.
  • In a fifth possible implementation, determining the adaptive normalization length may include: determining the adaptive normalization length according to a signal type of the high-band signal in the speech/audio signal, where different signal types correspond to different adaptive normalization lengths. For example, when the signal type is a harmonic signal, the corresponding adaptive normalization length is 32; when the signal type is a normal signal, the corresponding adaptive normalization length is 16; when the signal type is a transient signal, the corresponding adaptive normalization length is 8; and so on.
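  • The alternative selection rules above can be sketched as follows. The type-to-length values (32/16/8) and the relation between the first and second length values follow the examples in the text, while the peak-to-average-ratio helper, the default length values, and the function names are illustrative assumptions.

```python
import numpy as np

def peak_to_average_ratio(band):
    mag = np.abs(np.asarray(band, dtype=float))
    mean = mag.mean()
    return float(mag.max() / mean) if mean > 0 else 0.0

def length_from_par_difference(low_band, high_band, diff_threshold,
                               first_length=32, second_length=16):
    # Third implementation: compare |PAR_low - PAR_high| with a preset threshold;
    # the first length value is the larger of the two preset lengths.
    diff = abs(peak_to_average_ratio(low_band) - peak_to_average_ratio(high_band))
    return first_length if diff < diff_threshold else second_length

def length_from_signal_type(high_band_type):
    # Fifth implementation: map the high-band signal type to a preset length.
    return {"harmonic": 32, "normal": 16, "transient": 8}.get(high_band_type, 16)
```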
  • Step 105 Determine an adjustment amplitude value of each of the sample values according to the adaptive normalization length and an amplitude value of each of the sample values;
  • The determining an adjustment amplitude value of each of the sampled values according to the adaptive normalization length and the amplitude value of each of the sampled values may include:
  • calculating an amplitude average value corresponding to each of the sampled values according to the amplitude value of each of the sampled values and the adaptive normalization length, and determining an amplitude disturbance value corresponding to each of the sampled values according to the amplitude average value corresponding to each of the sampled values; and calculating an adjustment amplitude value of each of the sampled values according to the amplitude value of each of the sampled values and its corresponding amplitude disturbance value.
  • The calculating an amplitude average value corresponding to each of the sampled values according to the amplitude value of each of the sampled values and the adaptive normalization length may include:
  • for each of the sampled values, determining, according to the adaptive normalization length, the sub-band to which the sampled value belongs; and calculating the average of the amplitude values of all sampled values in the sub-band to which the sampled value belongs, and using the calculated average as the amplitude average value corresponding to the sampled value.
  • Here, determining the sub-band to which a sampled value belongs may be done as follows: all sampled values are divided into sub-bands in a preset order according to the adaptive normalization length; for each of the sampled values, the sub-band that includes the sampled value is determined as the sub-band to which the sampled value belongs.
  • The preset order may be, for example, from low frequency to high frequency or from high frequency to low frequency, and is not limited here.
  • For example, referring to FIG. 1A, assume that the sampled values from low frequency to high frequency are x1, x2, x3, ..., xn and that the adaptive normalization length is 5. Then x1 to x5 may be divided into one sub-band, x6 to x10 into another sub-band, and so on, yielding several sub-bands. For each sampled value in x1 to x5, the sub-band x1 to x5 is the sub-band to which it belongs; for each sampled value in x6 to x10, the sub-band x6 to x10 is the sub-band to which it belongs.
  • Alternatively, for each of the sampled values, determining the sub-band to which the sampled value belongs according to the adaptive normalization length may include:
  • determining a sub-band composed of the m sampled values before the sampled value, the sampled value itself, and the n sampled values after the sampled value as the sub-band to which the sampled value belongs, where m and n are determined by the adaptive normalization length, m is an integer not less than 0, and n is an integer not less than 0.
  • For example, referring to FIG. 1B, assume that the sampled values are x1, x2, x3, ..., xn, the adaptive normalization length is 5, m is 2, and n is 2. Then, for sampled value x3, the sub-band formed by x1 to x5 is the sub-band to which x3 belongs; for sampled value x4, the sub-band formed by x2 to x6 is the sub-band to which x4 belongs; and so on. For sampled values x1 and x2 there are not enough preceding sampled values, and for x(n-1) and xn there are not enough following sampled values, so the sub-bands to which x1, x2, x(n-1), and xn belong can be set freely in practical applications, for example by adding copies of a sampled value to make up the missing sampled values in the sub-band. For example, for sampled value x1, which has no preceding sampled values, x1, x1, x1, x2, x3 may be used as the sub-band to which it belongs. Both grouping rules are sketched in the code below.
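  • A minimal sketch of the two grouping rules just described, assuming the sampled values are held in a NumPy array and that missing neighbours at the edges are filled by repeating the edge value (one of the options the text allows); the function names are hypothetical.

```python
import numpy as np

def fixed_subband_average(samples, L):
    """Rule of FIG. 1A: split the samples into consecutive groups of length L
    and give every sample the mean amplitude of its group."""
    amplitudes = np.abs(np.asarray(samples, dtype=float))
    averages = np.empty_like(amplitudes)
    for start in range(0, len(amplitudes), L):
        group = amplitudes[start:start + L]
        averages[start:start + L] = group.mean()
    return averages

def sliding_subband_average(samples, m, n):
    """Rule of FIG. 1B: each sample's sub-band is the m samples before it,
    the sample itself, and the n samples after it; edges are padded by
    repeating the edge sample (e.g. x1, x1, x1, x2, x3 for the first sample)."""
    amplitudes = np.abs(np.asarray(samples, dtype=float))
    padded = np.pad(amplitudes, (m, n), mode="edge")
    return np.array([padded[i:i + m + n + 1].mean()
                     for i in range(len(amplitudes))])
```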
  • When the amplitude disturbance value corresponding to each of the sampled values is determined according to the amplitude average value corresponding to each of the sampled values, the amplitude average value may be used directly as the amplitude disturbance value corresponding to that sampled value, or a preset operation may be performed on the amplitude average value to obtain the amplitude disturbance value corresponding to that sampled value. The preset operation may be, for example, multiplying the amplitude average value by a value, which is generally greater than 0.
  • The calculating an adjustment amplitude value of each of the sampled values according to the amplitude value of each of the sampled values and its corresponding amplitude disturbance value may include:
  • subtracting, from the amplitude value of each of the sampled values, its corresponding amplitude disturbance value to obtain the difference between the two, and using the obtained difference as the adjustment amplitude value of each of the sampled values.
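  • Putting the pieces of step 105 together, a hedged sketch: the amplitude disturbance value here is simply the amplitude average scaled by a factor (scale 1.0 means the average is used directly, the simplest option mentioned above), and `sliding_subband_average` is the hypothetical helper from the previous sketch.

```python
import numpy as np

def adjustment_amplitudes(samples, m, n, disturbance_scale=1.0):
    """Step 105 sketch: amplitude average per sub-band -> amplitude disturbance
    value -> adjustment amplitude = amplitude - disturbance."""
    samples = np.asarray(samples, dtype=float)
    signs = np.sign(samples)                     # kept for step 106
    amplitudes = np.abs(samples)
    averages = sliding_subband_average(samples, m, n)  # helper from the earlier sketch
    disturbances = disturbance_scale * averages  # preset operation: scale the average
    adjustments = amplitudes - disturbances      # may be positive or negative
    return signs, adjustments
```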
  • Step 106: Determine a second speech/audio signal according to the sign of each of the sampled values and the adjustment amplitude value of each of the sampled values; the second speech/audio signal is the signal obtained after the noise component of the first speech/audio signal is recovered.
  • In one possible implementation, a new value of each sampled value may be determined according to the sign of the sampled value and its adjustment amplitude value, to obtain the second speech/audio signal.
  • In another possible implementation, the determining a second speech/audio signal according to the sign of each of the sampled values and the adjustment amplitude value of each of the sampled values may include: calculating a correction factor; correcting, according to the correction factor, the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sampled values; and determining a new value of each sampled value according to the sign of each sampled value and the corrected adjustment amplitude values, to obtain the second speech/audio signal.
  • In one possible implementation, the obtained second speech/audio signal may include the new values of all sampled values.
  • The correction factor may be calculated according to the adaptive normalization length. Specifically, the correction factor β may be equal to a/L, where L is the adaptive normalization length and a is a constant greater than 1.
  • The adjustment amplitude values that are greater than 0 may be corrected by using the following formula: Y = y × (b - β), where Y is the adjustment amplitude value after correction, y is an adjustment amplitude value greater than 0 among the adjustment amplitude values of the sampled values, and b is a constant with 0 < b < 2.
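  • A sketch of step 106 with the optional correction, reusing the hypothetical `adjustment_amplitudes` helper above. The constants a and b are example choices within the stated ranges (a > 1, 0 < b < 2), not values fixed by the text.

```python
import numpy as np

def recover_noise_component(samples, L, m, n, a=2.0, b=1.0):
    """Step 106 sketch: new value = sign * (corrected) adjustment amplitude."""
    signs, adjustments = adjustment_amplitudes(samples, m, n)
    beta = a / L                                    # correction factor: beta = a / L
    corrected = np.where(adjustments > 0,
                         adjustments * (b - beta),  # Y = y * (b - beta) for y > 0
                         adjustments)               # non-positive values are left unchanged
    return signs * corrected                        # the second speech/audio signal
```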
  • The step of extracting the sign of each sampled value in the first speech/audio signal in step 103 may be performed at any time before step 106, and has no fixed execution order with respect to steps 104 and 105. The execution order between step 103 and step 104 is likewise not limited.
  • In the prior art, when the speech/audio signal is a signal having a rising edge or a falling edge, the time-domain signal of the speech/audio signal may, within one frame, contain a part whose sample values and energy are particularly large, while the sample values and energy of the other part of the speech/audio signal are particularly small. In this case, when a random noise signal is added to the speech/audio signal in the frequency domain to obtain the signal after the noise component is recovered, the energy of the random noise signal is roughly uniform over the time domain within one frame. Therefore, when the frequency-domain signal of the recovered signal is converted into a time-domain signal, the newly added random noise signal tends to increase the energy of the part of the converted time-domain signal whose sample values were originally particularly small, and the sample values of that part become correspondingly larger. This causes the signal after the noise component is recovered to contain some echo, which degrades the auditory quality of the signal after the noise component is recovered.
  • In this embodiment of the present invention, by contrast, the first speech/audio signal is determined according to the speech/audio signal; the sign of each sampled value in the first speech/audio signal and the amplitude value of each of the sampled values are determined; an adaptive normalization length is determined; an adjustment amplitude value of each of the sampled values is determined according to the adaptive normalization length and the amplitude value of each of the sampled values; and the second speech/audio signal is determined according to the sign of each of the sampled values and the adjustment amplitude value of each of the sampled values.
  • In this process, only the original first speech/audio signal is processed and no new signal is added to it, so that no new energy is added to the second speech/audio signal after the noise component is recovered. Therefore, even if the first speech/audio signal has a rising edge or a falling edge, the echo in the second speech/audio signal is not increased, which improves the auditory quality of the second speech/audio signal.
  • FIG. 2 is a schematic flowchart of another method for restoring a noise component of a speech and audio signal according to an embodiment of the present invention, where the method includes:
  • Step 201 Receive a code stream, decode the code stream to obtain a speech and audio signal, and the decoded speech and audio signal includes a low frequency band signal and a high frequency band signal, and determine the high frequency band signal as the first speech audio signal.
  • Step 202 Determine a symbol of each sample value in the high frequency band signal and an amplitude value of each sample value.
  • For example, if the spectral coefficient of a certain sampled value in the high-band signal is -4, the sign of that sampled value is "-" and its amplitude value is 4.
  • Step 203 Determine an adaptive normalized length.
  • Step 204: Determine an amplitude average value corresponding to each sampled value according to the amplitude value of each sampled value and the adaptive normalization length, and determine an amplitude disturbance value corresponding to each sampled value according to the amplitude average value corresponding to each sampled value.
  • Step 205 Calculate an adjustment amplitude value of each sample value according to the amplitude value of each sample value and its corresponding amplitude disturbance value;
  • how to calculate the adjustment amplitude value of each sample value may refer to the related description in step 105, and details are not described herein.
  • Step 206: Determine the second speech/audio signal according to the sign of each sampled value and the adjustment amplitude value of each sampled value; the second speech/audio signal is the signal obtained after the noise component of the first speech/audio signal is recovered.
  • For the specific implementation of this step, refer to the related description of step 106; details are not repeated here.
  • The step of determining the sign of each sampled value in the first speech/audio signal in step 202 may be performed at any time before step 206, and has no fixed execution order with respect to steps 203, 204, and 205.
  • The execution order between step 202 and step 203 is not limited.
  • Step 207 Combine the second speech audio signal and the low frequency band signal of the decoded speech audio signal to obtain an output signal.
  • Since the first speech/audio signal in this embodiment is the high-band signal of the decoded speech/audio signal, the second speech/audio signal and the low-band signal of the decoded speech/audio signal may be combined to obtain the output signal.
  • In this embodiment, the noise component is recovered in the high-band signal of the decoded speech/audio signal, so that the second speech/audio signal is finally obtained. Even if the high-band signal has a rising edge or a falling edge, the echo in the second speech/audio signal is not increased, which improves the auditory quality of the second speech/audio signal and thus the auditory quality of the final output signal. A usage sketch of this flow is given below.
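  • As a usage illustration of the flow of FIG. 2 under the assumptions of the earlier sketches (frequency-domain coefficients, the hypothetical `recover_noise_component` helper, and a plain concatenation of low-band and high-band coefficients standing in for the unspecified combining step):

```python
import numpy as np

# Placeholder decoder output standing in for steps 201-202.
low_band = np.random.randn(160)    # decoded low-band spectral coefficients
high_band = np.random.randn(160)   # decoded high-band coefficients = first speech/audio signal

L = 16                             # adaptive normalization length (example value)
second_signal = recover_noise_component(high_band, L=L, m=2, n=2)

# Step 207: combine the processed high band with the untouched low band.
output_signal = np.concatenate([low_band, second_signal])
```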
  • FIG. 3 is a schematic flowchart of another method for restoring a noise component of a speech and audio signal according to an embodiment of the present invention, where the method includes:
  • Steps 301 to 305 are the same as steps 201 to 205, and are not described here.
  • Step 306: Calculate a correction factor, and correct, according to the correction factor, the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of each sampled value;
  • For the specific implementation of this step, refer to the related description of step 106; details are not repeated here.
  • Step 307: Determine the second speech/audio signal according to the sign of each sampled value and the corrected adjustment amplitude values.
  • For the specific implementation of this step, refer to the related description of step 106; details are not repeated here.
  • The step of determining the sign of each sampled value in the first speech/audio signal in step 302 may be performed at any time before step 307, and has no fixed execution order with respect to steps 303, 304, 305, and 306.
  • The execution order between step 302 and step 303 is not limited.
  • Step 308 Combine the second speech audio signal and the low frequency band signal of the decoded speech audio signal to obtain an output signal.
  • In this embodiment, the adjustment amplitude values greater than 0 are additionally corrected, which further improves the auditory quality of the second speech/audio signal and, in turn, further enhances the auditory quality of the final output signal.
  • In the examples above, the high-band signal in the decoded speech/audio signal is determined as the first speech/audio signal and its noise component is recovered, so that the second speech/audio signal is finally obtained.
  • In practice, the method for recovering the noise component of a speech/audio signal provided in the embodiments of the present invention may also recover the noise component of the full-band signal of the decoded speech/audio signal, or of the low-band signal of the decoded speech/audio signal, to finally obtain the second speech/audio signal; for the implementation process, reference may be made to FIG. 2.
  • The difference from the method example shown in FIG. 3 is only that, when the first speech/audio signal is determined, the full-band signal or the low-band signal is determined as the first speech/audio signal; this is not illustrated here again.
  • FIG. 4 is a schematic structural diagram of an apparatus for restoring a noise component of a speech signal according to an embodiment of the present invention.
  • the device may be disposed in an electronic device, and the device 400 may include:
  • a code stream processing unit 410, configured to receive a code stream and decode the code stream to obtain a speech/audio signal;
  • a signal determining unit 420, configured to determine a first speech/audio signal according to the speech/audio signal obtained by the code stream processing unit 410, where the first speech/audio signal is a signal, in the decoded speech/audio signal, whose noise component needs to be recovered;
  • a first determining unit 430, configured to determine a sign of each sampled value in the first speech/audio signal determined by the signal determining unit 420 and an amplitude value of each of the sampled values;
  • a second determining unit 440, configured to determine an adaptive normalization length;
  • a third determining unit 450, configured to determine an adjustment amplitude value of each of the sampled values according to the adaptive normalization length determined by the second determining unit 440 and the amplitude value of each of the sampled values determined by the first determining unit 430;
  • a fourth determining unit 460, configured to determine a second speech/audio signal according to the sign of each of the sampled values determined by the first determining unit 430 and the adjustment amplitude value of each of the sampled values determined by the third determining unit 450,
  • where the second speech/audio signal is a signal obtained after the noise component of the first speech/audio signal is recovered.
  • the third determining unit 450 may include:
  • a determining sub-unit, configured to calculate an amplitude average value corresponding to each of the sampled values according to the amplitude value of each of the sampled values and the adaptive normalization length, and to determine an amplitude disturbance value corresponding to each of the sampled values according to the amplitude average value corresponding to each of the sampled values; and an adjustment amplitude value calculation sub-unit, configured to calculate the adjustment amplitude value of each of the sampled values according to the amplitude value of each of the sampled values and its corresponding amplitude disturbance value.
  • the determining sub-unit may include:
  • a determining module, configured to determine, for each of the sampled values, the sub-band to which the sampled value belongs according to the adaptive normalization length; and
  • a calculation module, configured to calculate the average of the amplitude values of all sampled values in the sub-band to which the sampled value belongs, and to use the calculated average as the amplitude average value corresponding to the sampled value.
  • the determining module is specifically configured to:
  • divide all sampled values into sub-bands in a preset order according to the adaptive normalization length and, for each of the sampled values, determine the sub-band that includes the sampled value as the sub-band to which the sampled value belongs; or determine a sub-band composed of the m sampled values before the sampled value, the sampled value itself, and the n sampled values after the sampled value as the sub-band to which the sampled value belongs, where m and n are determined by the adaptive normalization length, m is an integer not less than 0, and n is an integer not less than 0.
  • the adjustment amplitude value calculation sub-unit is specifically configured to:
  • subtract, from the amplitude value of each of the sampled values, its corresponding amplitude disturbance value to obtain the difference between the two, and use the obtained difference as the adjustment amplitude value of each of the sampled values.
  • the second determining unit 440 may include:
  • a dividing sub-unit, configured to divide the low-band signal in the speech/audio signal into N sub-bands, where N is a natural number; a number determining sub-unit, configured to calculate a peak-to-average ratio of each of the sub-bands and determine the number of sub-bands whose peak-to-average ratio is greater than a preset peak-to-average ratio threshold; and
  • a length calculation sub-unit, configured to calculate the adaptive normalization length according to a signal type of the high-band signal in the speech/audio signal and the number of the sub-bands.
  • the length calculation sub-unit may be specifically configured to: calculate the adaptive normalization length according to the formula L = K + α × M, where
  • L is the adaptive normalization length;
  • K is a value corresponding to the signal type of the high-band signal in the speech/audio signal, and different signal types of the high-band signal correspond to different values of K;
  • M is the number of sub-bands whose peak-to-average ratio is greater than the preset peak-to-average ratio threshold; and
  • α is a constant less than 1.
  • the second determining unit 440 may be specifically configured to:
  • calculate a peak-to-average ratio of the low-band signal in the speech/audio signal and a peak-to-average ratio of the high-band signal in the speech/audio signal; when the absolute value of the difference between the peak-to-average ratio of the low-band signal and the peak-to-average ratio of the high-band signal is less than a preset difference threshold, determine the adaptive normalization length as a preset first length value, and when the absolute value of the difference is not less than the preset difference threshold, determine the adaptive normalization length as a preset second length value, where the first length value is greater than the second length value; or,
  • when the peak-to-average ratio of the low-band signal is less than the peak-to-average ratio of the high-band signal, determine the adaptive normalization length as a preset first length value, and when the peak-to-average ratio of the low-band signal is not less than the peak-to-average ratio of the high-band signal, determine the adaptive normalization length as a preset second length value; or,
  • determine the adaptive normalization length according to a signal type of the high-band signal in the speech/audio signal, where different signal types of the high-band signal correspond to different adaptive normalization lengths.
  • the fourth determining unit 460 may be specifically configured to:
  • determine a new value of each of the sampled values according to the sign of each of the sampled values and its adjustment amplitude value, to obtain the second speech/audio signal; or, calculate a correction factor, correct, according to the correction factor, the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sampled values, and determine a new value of each of the sampled values according to the sign of each of the sampled values and the corrected adjustment amplitude values, to obtain the second speech/audio signal.
  • the fourth determining unit 460 is further specifically configured to: correct the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sampled values by using the following formula:
  • Y = y × (b - β),
  • where Y is the adjustment amplitude value after correction, β is the correction factor, y is an adjustment amplitude value greater than 0 among the adjustment amplitude values of the sampled values, and b is a constant with 0 < b < 2.
  • In this embodiment, the first speech/audio signal is determined according to the speech/audio signal; the sign of each sampled value in the first speech/audio signal and the amplitude value of each of the sampled values are determined; an adaptive normalization length is determined; an adjustment amplitude value of each of the sampled values is determined according to the adaptive normalization length and the amplitude value of each of the sampled values; and the second speech/audio signal is determined according to the sign of each of the sampled values and the adjustment amplitude value of each of the sampled values. In this process, only the original first speech/audio signal is processed and no new signal is added to it, so that no new energy is added to the second speech/audio signal after the noise component is recovered. Therefore, even if the first speech/audio signal has a rising edge or a falling edge, the echo in the second speech/audio signal is not increased, which improves the auditory quality of the second speech/audio signal.
  • the processor 510, the memory 520, and the transceiver 530 are connected to each other through a bus 540.
  • the bus 540 may be an ISA bus, a PCI bus, or an EISA bus.
  • the bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is shown in Figure 5, but it does not mean that there is only one bus or one type of bus.
  • the transceiver 530 is used to connect other devices and communicate with other devices. Specifically, the transceiver 530 can be configured to: receive a code stream;
  • An adjustment amplitude value of each of the sample values is calculated according to an amplitude value of each of the sampled values and a corresponding amplitude disturbance value thereof.
  • processor 510 is specifically configured to:
  • An average value of amplitude values of all sample values in the sub-band to which the sampled value belongs is calculated, and the calculated average value is used as an average value of the amplitude corresponding to the sampled value.
  • processor 510 is specifically configured to:
  • determine a sub-band composed of the m sampled values before the sampled value, the sampled value itself, and the n sampled values after the sampled value as the sub-band to which the sampled value belongs, where m and n are determined by the adaptive normalization length, m is an integer not less than 0, and n is an integer not less than 0.
  • the processor 510 is specifically configured to:
  • subtract, from the amplitude value of each of the sampled values, its corresponding amplitude disturbance value to obtain the difference between the two, and use the obtained difference as the adjustment amplitude value of each of the sampled values.
  • processor 510 is specifically configured to:
  • N is a natural number
  • the processor 510 is specifically configured to: calculate the adaptive normalization length according to the formula L = K + α × M, where
  • L is the adaptive normalization length;
  • K is a value corresponding to the signal type of the high-band signal in the speech/audio signal, and different signal types of the high-band signal correspond to different values of K;
  • M is the number of sub-bands whose peak-to-average ratio is greater than the preset peak-to-average ratio threshold; and
  • α is a constant less than 1.
  • processor 510 is specifically configured to:
  • calculate a correction factor; correct, according to the correction factor, the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sampled values; and determine a new value of each of the sampled values according to the sign of each of the sampled values and the corrected adjustment amplitude values, to obtain the second speech/audio signal.
  • processor 510 is specifically configured to:
  • the processor 510 is specifically configured to: correct the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sampled values by using the formula Y = y × (b - β), where
  • Y is the adjustment amplitude value after correction;
  • y is an adjustment amplitude value greater than 0 among the adjustment amplitude values of the sampled values; and
  • b is a constant with 0 < b < 2.
  • In this embodiment, the electronic device determines the first speech/audio signal according to the speech/audio signal, determines the sign of each sampled value in the first speech/audio signal and the amplitude value of each of the sampled values, determines an adaptive normalization length, determines an adjustment amplitude value of each of the sampled values according to the adaptive normalization length and the amplitude value of each of the sampled values, and determines the second speech/audio signal according to the sign of each of the sampled values and the adjustment amplitude value of each of the sampled values. In this process, only the original first speech/audio signal is processed and no new signal is added to it, so that no new energy is added to the second speech/audio signal after the noise component is recovered. Therefore, even if the first speech/audio signal has a rising edge or a falling edge, the echo in the second speech/audio signal is not increased, which improves the auditory quality of the second speech/audio signal.
  • Since the system embodiment basically corresponds to the method embodiment, reference may be made to the corresponding parts of the description of the method embodiment.
  • The system embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments. A person of ordinary skill in the art can understand and implement them without creative effort.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Quality & Reliability (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Noise Elimination (AREA)
  • Telephone Function (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)

Abstract

A method and an apparatus for recovering a noise component of a speech/audio signal. The method includes: receiving a code stream and decoding the code stream to obtain a speech/audio signal (101); determining a first speech/audio signal according to the speech/audio signal (102); determining a sign of each sampled value in the first speech/audio signal and an amplitude value of each of the sampled values (103); determining an adaptive normalization length (104); determining an adjustment amplitude value of each of the sampled values according to the adaptive normalization length and the amplitude value of each of the sampled values (105); and determining a second speech/audio signal according to the sign of each of the sampled values and the adjustment amplitude value of each of the sampled values (106).

Description

Method and apparatus for processing a speech/audio signal
This application claims priority to Chinese Patent Application No. 201410242233.2, filed with the Chinese Patent Office on June 3, 2014 and entitled "Method and apparatus for processing a speech/audio signal", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of communications, and in particular, to a method and an apparatus for processing a speech/audio signal.
Background
To achieve better auditory quality, electronic devices currently recover the noise component of the decoded speech/audio signal when decoding the encoded information of the speech/audio signal.
At present, when recovering the noise component of a speech/audio signal, an electronic device generally does so by adding a random noise signal to the speech/audio signal. Specifically, the speech/audio signal and the random noise signal are weighted and summed to obtain a signal in which the noise component of the speech/audio signal has been recovered; the speech/audio signal may be a time-domain signal, a frequency-domain signal, or an excitation signal, and may be a low-frequency signal or a high-frequency signal, and so on.
However, the inventors have found that if the speech/audio signal is a signal having a rising edge or a falling edge, this method of recovering the noise component of the speech/audio signal causes the signal obtained after the noise component is recovered to contain echo, which degrades the auditory quality of the signal after the noise component is recovered.
Summary of the Invention
Embodiments of the present invention provide a method and an apparatus for processing a speech/audio signal, so that for a speech/audio signal having a rising edge or a falling edge, recovering its noise component does not cause the resulting signal to contain echo, thereby improving the auditory quality of the signal after the noise component is recovered.
According to a first aspect, an embodiment of the present invention provides a method for processing a speech/audio signal, where the method includes:
receiving a code stream, and decoding the code stream to obtain a speech/audio signal;
determining a first speech/audio signal according to the speech/audio signal, where the first speech/audio signal is a signal, in the speech/audio signal, whose noise component needs to be recovered;
determining a sign of each sampled value in the first speech/audio signal and an amplitude value of each of the sampled values;
determining an adaptive normalization length;
determining an adjustment amplitude value of each of the sampled values according to the adaptive normalization length and the amplitude value of each of the sampled values; and
determining a second speech/audio signal according to the sign of each of the sampled values and the adjustment amplitude value of each of the sampled values, where the second speech/audio signal is a signal obtained after the noise component of the first speech/audio signal is recovered.
With reference to the first aspect, in a first possible implementation of the first aspect, the determining an adjustment amplitude value of each of the sampled values according to the adaptive normalization length and the amplitude value of each of the sampled values includes:
calculating an amplitude average value corresponding to each of the sampled values according to the amplitude value of each of the sampled values and the adaptive normalization length, and determining an amplitude disturbance value corresponding to each of the sampled values according to the amplitude average value corresponding to each of the sampled values; and
calculating the adjustment amplitude value of each of the sampled values according to the amplitude value of each of the sampled values and its corresponding amplitude disturbance value.
With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the calculating an amplitude average value corresponding to each of the sampled values according to the amplitude value of each of the sampled values and the adaptive normalization length includes:
for each of the sampled values, determining, according to the adaptive normalization length, the sub-band to which the sampled value belongs; and
calculating the average of the amplitude values of all sampled values in the sub-band to which the sampled value belongs, and using the calculated average as the amplitude average value corresponding to the sampled value.
With reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the determining, for each of the sampled values and according to the adaptive normalization length, the sub-band to which the sampled value belongs includes:
dividing all sampled values into sub-bands in a preset order according to the adaptive normalization length, and, for each of the sampled values, determining the sub-band that includes the sampled value as the sub-band to which the sampled value belongs; or,
for each of the sampled values, determining a sub-band composed of the m sampled values before the sampled value, the sampled value itself, and the n sampled values after the sampled value as the sub-band to which the sampled value belongs, where m and n are determined by the adaptive normalization length, m is an integer not less than 0, and n is an integer not less than 0.
With reference to the first possible implementation of the first aspect, and/or the second possible implementation of the first aspect, and/or the third possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the calculating the adjustment amplitude value of each of the sampled values according to the amplitude value of each of the sampled values and its corresponding amplitude disturbance value includes:
subtracting, from the amplitude value of each of the sampled values, its corresponding amplitude disturbance value to obtain the difference between the two, and using the obtained difference as the adjustment amplitude value of each of the sampled values.
With reference to the first aspect, and/or the first possible implementation of the first aspect, and/or the second possible implementation of the first aspect, and/or the third possible implementation of the first aspect, and/or the fourth possible implementation of the first aspect, in a fifth possible implementation of the first aspect, the determining an adaptive normalization length includes:
dividing the low-band signal in the speech/audio signal into N sub-bands, where N is a natural number;
calculating a peak-to-average ratio of each of the sub-bands, and determining the number of sub-bands whose peak-to-average ratio is greater than a preset peak-to-average ratio threshold; and
calculating the adaptive normalization length according to a signal type of the high-band signal in the speech/audio signal and the number of the sub-bands.
With reference to the fifth possible implementation of the first aspect, in a sixth possible implementation of the first aspect, the calculating the adaptive normalization length according to a signal type of the high-band signal in the speech/audio signal and the number of the sub-bands includes:
calculating the adaptive normalization length according to the formula L = K + α × M;
where L is the adaptive normalization length; K is a value corresponding to the signal type of the high-band signal in the speech/audio signal, and different signal types of the high-band signal correspond to different values of K; M is the number of sub-bands whose peak-to-average ratio is greater than the preset peak-to-average ratio threshold; and α is a constant less than 1.
With reference to the first aspect, and/or the first possible implementation of the first aspect, and/or the second possible implementation of the first aspect, and/or the third possible implementation of the first aspect, and/or the fourth possible implementation of the first aspect, in a seventh possible implementation of the first aspect, the determining an adaptive normalization length includes:
calculating a peak-to-average ratio of the low-band signal in the speech/audio signal and a peak-to-average ratio of the high-band signal in the speech/audio signal; when the absolute value of the difference between the peak-to-average ratio of the low-band signal and the peak-to-average ratio of the high-band signal is less than a preset difference threshold, determining the adaptive normalization length as a preset first length value, and when the absolute value of the difference between the peak-to-average ratio of the low-band signal and the peak-to-average ratio of the high-band signal is not less than the preset difference threshold, determining the adaptive normalization length as a preset second length value, where the first length value is greater than the second length value; or,
calculating a peak-to-average ratio of the low-band signal in the speech/audio signal and a peak-to-average ratio of the high-band signal in the speech/audio signal; when the peak-to-average ratio of the low-band signal is less than the peak-to-average ratio of the high-band signal, determining the adaptive normalization length as a preset first length value, and when the peak-to-average ratio of the low-band signal is not less than the peak-to-average ratio of the high-band signal, determining the adaptive normalization length as a preset second length value; or,
determining the adaptive normalization length according to a signal type of the high-band signal in the speech/audio signal, where different signal types of the high-band signal correspond to different adaptive normalization lengths.
With reference to the first aspect, and/or the first possible implementation of the first aspect, and/or the second possible implementation of the first aspect, and/or the third possible implementation of the first aspect, and/or the fourth possible implementation of the first aspect, and/or the fifth possible implementation of the first aspect, and/or the sixth possible implementation of the first aspect, and/or the seventh possible implementation of the first aspect, in an eighth possible implementation of the first aspect, the determining a second speech/audio signal according to the sign of each of the sampled values and the adjustment amplitude value of each of the sampled values includes:
determining a new value of each of the sampled values according to the sign of each of the sampled values and its adjustment amplitude value, to obtain the second speech/audio signal; or,
calculating a correction factor; correcting, according to the correction factor, the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sampled values; and determining a new value of each of the sampled values according to the sign of each of the sampled values and the corrected adjustment amplitude values, to obtain the second speech/audio signal.
With reference to the eighth possible implementation of the first aspect, in a ninth possible implementation of the first aspect, the calculating a correction factor includes:
calculating the correction factor by using the formula β = a/L, where β is the correction factor, L is the adaptive normalization length, and a is a constant greater than 1.
With reference to the eighth possible implementation of the first aspect, and/or the ninth possible implementation of the first aspect, in a tenth possible implementation of the first aspect, the correcting, according to the correction factor, the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sampled values includes:
correcting the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sampled values by using the following formula:
Y = y × (b - β);
where Y is the adjustment amplitude value after correction, y is an adjustment amplitude value greater than 0 among the adjustment amplitude values of the sampled values, and b is a constant with 0 < b < 2.
According to a second aspect, an embodiment of the present invention provides an apparatus for recovering a noise component of a speech/audio signal, including:
a code stream processing unit, configured to receive a code stream and decode the code stream to obtain a speech/audio signal;
a signal determining unit, configured to determine a first speech/audio signal according to the speech/audio signal obtained by the code stream processing unit, where the first speech/audio signal is a signal, in the decoded speech/audio signal, whose noise component needs to be recovered;
a first determining unit, configured to determine a sign of each sampled value in the first speech/audio signal determined by the signal determining unit and an amplitude value of each of the sampled values;
a second determining unit, configured to determine an adaptive normalization length;
a third determining unit, configured to determine an adjustment amplitude value of each of the sampled values according to the adaptive normalization length determined by the second determining unit and the amplitude value of each of the sampled values determined by the first determining unit; and
a fourth determining unit, configured to determine a second speech/audio signal according to the sign of each of the sampled values determined by the first determining unit and the adjustment amplitude value of each of the sampled values determined by the third determining unit, where the second speech/audio signal is a signal obtained after the noise component of the first speech/audio signal is recovered.
With reference to the second aspect, in a first possible implementation of the second aspect, the third determining unit includes:
a determining sub-unit, configured to calculate an amplitude average value corresponding to each of the sampled values according to the amplitude value of each of the sampled values and the adaptive normalization length, and to determine an amplitude disturbance value corresponding to each of the sampled values according to the amplitude average value corresponding to each of the sampled values; and
an adjustment amplitude value calculation sub-unit, configured to calculate the adjustment amplitude value of each of the sampled values according to the amplitude value of each of the sampled values and its corresponding amplitude disturbance value.
With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the determining sub-unit includes:
a determining module, configured to determine, for each of the sampled values, the sub-band to which the sampled value belongs according to the adaptive normalization length; and
a calculation module, configured to calculate the average of the amplitude values of all sampled values in the sub-band to which the sampled value belongs, and to use the calculated average as the amplitude average value corresponding to the sampled value.
With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the determining module is specifically configured to:
divide all sampled values into sub-bands in a preset order according to the adaptive normalization length and, for each of the sampled values, determine the sub-band that includes the sampled value as the sub-band to which the sampled value belongs; or,
for each of the sampled values, determine a sub-band composed of the m sampled values before the sampled value, the sampled value itself, and the n sampled values after the sampled value as the sub-band to which the sampled value belongs, where m and n are determined by the adaptive normalization length, m is an integer not less than 0, and n is an integer not less than 0.
With reference to the first possible implementation of the second aspect, and/or the second possible implementation of the second aspect, and/or the third possible implementation of the second aspect, in a fourth possible implementation of the second aspect, the adjustment amplitude value calculation sub-unit is specifically configured to:
subtract, from the amplitude value of each of the sampled values, its corresponding amplitude disturbance value to obtain the difference between the two, and use the obtained difference as the adjustment amplitude value of each of the sampled values.
With reference to the second aspect, and/or the first possible implementation of the second aspect, and/or the second possible implementation of the second aspect, and/or the third possible implementation of the second aspect, and/or the fourth possible implementation of the second aspect, in a fifth possible implementation of the second aspect, the second determining unit includes:
a dividing sub-unit, configured to divide the low-band signal in the speech/audio signal into N sub-bands, where N is a natural number;
a number determining sub-unit, configured to calculate a peak-to-average ratio of each of the sub-bands and determine the number of sub-bands whose peak-to-average ratio is greater than a preset peak-to-average ratio threshold; and
a length calculation sub-unit, configured to calculate the adaptive normalization length according to a signal type of the high-band signal in the speech/audio signal and the number of the sub-bands.
With reference to the fifth possible implementation of the second aspect, in a sixth possible implementation of the second aspect, the length calculation sub-unit is specifically configured to:
calculate the adaptive normalization length according to the formula L = K + α × M;
where L is the adaptive normalization length; K is a value corresponding to the signal type of the high-band signal in the speech/audio signal, and different signal types of the high-band signal correspond to different values of K; M is the number of sub-bands whose peak-to-average ratio is greater than the preset peak-to-average ratio threshold; and α is a constant less than 1.
With reference to the second aspect, and/or the first possible implementation of the second aspect, and/or the second possible implementation of the second aspect, and/or the third possible implementation of the second aspect, and/or the fourth possible implementation of the second aspect, in a seventh possible implementation of the second aspect, the second determining unit is specifically configured to:
calculate a peak-to-average ratio of the low-band signal in the speech/audio signal and a peak-to-average ratio of the high-band signal in the speech/audio signal; when the absolute value of the difference between the peak-to-average ratio of the low-band signal and the peak-to-average ratio of the high-band signal is less than a preset difference threshold, determine the adaptive normalization length as a preset first length value, and when the absolute value of the difference between the peak-to-average ratio of the low-band signal and the peak-to-average ratio of the high-band signal is not less than the preset difference threshold, determine the adaptive normalization length as a preset second length value, where the first length value is greater than the second length value; or,
calculate a peak-to-average ratio of the low-band signal in the speech/audio signal and a peak-to-average ratio of the high-band signal in the speech/audio signal; when the peak-to-average ratio of the low-band signal is less than the peak-to-average ratio of the high-band signal, determine the adaptive normalization length as a preset first length value, and when the peak-to-average ratio of the low-band signal is not less than the peak-to-average ratio of the high-band signal, determine the adaptive normalization length as a preset second length value; or,
determine the adaptive normalization length according to a signal type of the high-band signal in the speech/audio signal, where different signal types of the high-band signal correspond to different adaptive normalization lengths.
With reference to the second aspect, and/or the first possible implementation of the second aspect, and/or the second possible implementation of the second aspect, and/or the third possible implementation of the second aspect, and/or the fourth possible implementation of the second aspect, and/or the fifth possible implementation of the second aspect, and/or the sixth possible implementation of the second aspect, and/or the seventh possible implementation of the second aspect, in an eighth possible implementation of the second aspect, the fourth determining unit is specifically configured to:
determine a new value of each of the sampled values according to the sign of each of the sampled values and its adjustment amplitude value, to obtain the second speech/audio signal; or,
calculate a correction factor; correct, according to the correction factor, the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sampled values; and determine a new value of each of the sampled values according to the sign of each of the sampled values and the corrected adjustment amplitude values, to obtain the second speech/audio signal.
With reference to the eighth possible implementation of the second aspect, in a ninth possible implementation of the second aspect, the fourth determining unit is specifically configured to calculate the correction factor by using the formula β = a/L, where β is the correction factor, L is the adaptive normalization length, and a is a constant greater than 1.
With reference to the eighth possible implementation of the second aspect, and/or the ninth possible implementation of the second aspect, in a tenth possible implementation of the second aspect, the fourth determining unit is specifically configured to:
correct the adjustment amplitude values that are greater than 0 among the adjustment amplitude values of the sampled values by using the following formula:
Y = y × (b - β);
where Y is the adjustment amplitude value after correction, y is an adjustment amplitude value greater than 0 among the adjustment amplitude values of the sampled values, and b is a constant with 0 < b < 2.
In the embodiments of the present invention, a code stream is received and decoded to obtain a speech/audio signal; a first speech/audio signal is determined according to the speech/audio signal; a sign of each sampled value in the first speech/audio signal and an amplitude value of each of the sampled values are determined; an adaptive normalization length is determined; an adjustment amplitude value of each of the sampled values is determined according to the adaptive normalization length and the amplitude value of each of the sampled values; and a second speech/audio signal is determined according to the sign of each of the sampled values and the adjustment amplitude value of each of the sampled values. In this process, only the original first speech/audio signal is processed and no new signal is added to the first speech/audio signal, so that no new energy is added to the second speech/audio signal after the noise component is recovered. Therefore, if the first speech/audio signal has a rising edge or a falling edge, the echo in the second speech/audio signal is not increased, which improves the auditory quality of the second speech/audio signal.
It should be understood that the foregoing general description and the following detailed description are merely exemplary and do not limit the protection scope of the present invention.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Apparently, the accompanying drawings in the following description show merely some embodiments of the present invention, and a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative effort.
FIG. 1 is a schematic flowchart of a method for recovering a noise component of a speech/audio signal according to an embodiment of the present invention;
FIG. 1A is a schematic diagram of an example of sampled-value grouping according to an embodiment of the present invention;
FIG. 1B is another schematic diagram of an example of sampled-value grouping according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of another method for recovering a noise component of a speech/audio signal according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of another method for recovering a noise component of a speech/audio signal according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an apparatus for recovering a noise component of a speech/audio signal according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Specific embodiments of the present invention have been shown by the foregoing accompanying drawings and are described in more detail below. These drawings and written descriptions are not intended to limit the scope of the inventive concept in any manner, but to explain the concept of the present invention to a person skilled in the art with reference to specific embodiments.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are merely some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Numerous specific details are mentioned in the following detailed description to provide a thorough understanding of the present invention. However, a person skilled in the art should understand that the present invention may be implemented without these specific details. In other embodiments, well-known methods, procedures, components, and circuits are not described in detail so as not to obscure the embodiments unnecessarily.
参见图1,为本发明实施例恢复语音频信号噪声成分的方法流程图,该方法包括:
步骤101:接收码流,解码所述码流得到语音频信号;
其中,具体如何解码码流得到语音频信号,这里不再赘述。
步骤102:根据所述语音频信号确定第一语音频信号;所述第一语音频信号是解码得到的所述语音频信号中需要恢复噪声成分的信号;
其中,所述第一语音频信号可以是解码得到的语音频信号中的低频带信号、高频带信号、或者全频带信号等。
所述解码得到的语音频信号可以包括一路低频带信号和一路高频带信号,或者也可以包括一路全频带信号。
步骤103:确定所述第一语音频信号中每个采样值的符号和每个所述采样值的幅度值;
其中，所述第一语音频信号具有不同实现时，所述采样值的实现方式也可能不同。例如，如果所述第一语音频信号是频域信号，所述采样值可以为频谱系数；如果所述第一语音频信号是时域信号，所述采样值可以为样点值。
步骤104:确定自适应归一化长度;
其中,在确定自适应归一化长度时,可以根据所述解码得到的语音频信号的低频带信号和/或高频带信号的相关参数来确定。具体的,所述相关参数可以包括信号类型、峰均比等。例如,在一种可能的实现方式中,所述确定自适应归一化长度,可以包括:
将所述语音频信号中的低频带信号划分为N个子带;N为自然数;
计算每个所述子带的峰均比,并确定所述峰均比大于预设峰均比阈值的子带个数;
根据所述语音频信号中高频带信号的信号类型和所述子带个数,计算所述自适应归一化长度。
可选地,所述根据所述语音频信号中高频带信号的信号类型和所述子带个数,计算所述自适应归一化长度,可以包括:
根据公式L=K+α×M计算所述自适应归一化长度;
其中,L为所述自适应归一化长度;K为所述语音频信号中的高频带信号的信号类型对应的数值,不同高频带信号的信号类型对应的K的数值不同;M为峰均比大于预设峰均比阈值的子带个数;α为小于1的常数。
在另一种可能的实现方式中,也可以根据所述语音频信号中低频带信号的信号类型和所述子带个数,计算所述自适应归一化长度。具体的计算公式可以参见公式L=K+α×M,区别仅在于此时的K为所述语音频信号中的低频带信号的信号类型对应的数值,不同低频带信号的信号类型对应的K的数值不同。
在第三种可能的实现方式中,确定自适应归一化长度可以包括:
计算所述语音频信号中低频带信号的峰均比,和所述语音频信号中高频带信号的峰均比;当低频带信号的峰均比和高频带信号的峰均比差值的绝对值小于预设差值阈值时,将自适应归一化长度确定为预设第一长度值,当低频带信号的峰均比和高频带信号的峰均比差值的绝对值不小于预设差值阈值时,将自适应归一化长度确定为预设第二长度值。第一长度值大于第二长度值,第一长度值和第二长度值也可以通过低频带信号的峰均比和高频带信号的峰均比的比值或差值计算得到,具体计算方法不限定。
在第四种可能的实现方式中,确定自适应归一化长度可以包括:
计算所述语音频信号中低频带信号的峰均比,和所述语音频信号中高频带信号的峰均比;当低频带信号的峰均比小于高频带信号的峰均比时,将自适应归一化长度确定为预设第一长度值,当低频带信号的峰均比不小于高频带信号的峰均比时,将自适应归一化长度确定为预设第二长度值。第一长度值大于第二长度值,第一长度值和第二长度值也可以通过低频带信号的峰均比和高频带信号的峰均比的比值或差值计算得到,具体计算方法不限定。
在第五种可能的实现方式中,确定自适应归一化长度可以包括:根据语音频信号中高频带信号的信号类型确定自适应归一化长度,不同的信号类型对应不同的自适应归一化长度,如信号类型为谐波信号时,对应的自适应归一化长度为32,信号类型为普通信号时,对应的自适应归一化长度为16,信号类型为瞬态信号时,对应的自适应归一化长度为8等。
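为便于理解上述第一种实现方式，下面给出一个按照公式L=K+α×M计算自适应归一化长度的最小示例草图（Python）。其中信号类型对应的K值、常数α、峰均比阈值以及低频带子带的划分方式均为示意性假设，并非对本实施例取值的限定。

    def peak_to_average_ratio(samples):
        # 峰均比：一段采样值中最大幅度值与平均幅度值之比（示例实现）
        amps = [abs(x) for x in samples]
        avg = sum(amps) / len(amps)
        return max(amps) / avg if avg > 0 else 0.0

    def adaptive_normalization_length(low_band, high_band_type,
                                      num_subbands=8,      # 假设：低频带划分为8个子带
                                      par_threshold=2.0,   # 假设的峰均比阈值
                                      alpha=0.5,           # 假设的小于1的常数α
                                      k_table=None):
        # 按公式 L = K + alpha * M 计算自适应归一化长度（示意）。
        # K由高频带信号的信号类型决定，M为低频带中峰均比大于阈值的子带个数。
        if k_table is None:
            k_table = {"harmonic": 24, "normal": 12, "transient": 4}  # 假设的K取值
        n = max(1, len(low_band) // num_subbands)
        subbands = [low_band[i * n:(i + 1) * n] for i in range(num_subbands)]
        m = sum(1 for sb in subbands if sb and peak_to_average_ratio(sb) > par_threshold)
        return k_table[high_band_type] + alpha * m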
步骤105:根据所述自适应归一化长度和每个所述采样值的幅度值确定每个所述采样值的调整幅度值;
其中,所述根据所述自适应归一化长度和每个所述采样值的幅度值确定每个所述采样值的调整幅度值,可以包括:
根据每个所述采样值的幅度值以及所述自适应归一化长度计算每个所述采样值对应的幅度平均值,根据每个所述采样值对应的幅度平均值确定每个所述采样值对应的幅度扰动值;
根据每个所述采样值的幅度值及其对应的幅度扰动值计算每个所述采样值的调整幅度值。
其中,所述根据每个所述采样值的幅度值以及所述自适应归一化长度计算每个所述采样值对应的幅度平均值,可以包括:
对于每个所述采样值,根据所述自适应归一化长度确定所述采样值所属的子带;
计算所述采样值所属子带内所有采样值的幅度值的平均值,将计算得到的平均值作为所述采样值对应的幅度平均值。
其中,对于每个所述采样值,根据所述自适应归一化长度确定所述采样值所属的子带,可以包括:
将所有采样值按照预设顺序根据所述自适应归一化长度划分子带;对于每个所述采样值,将包括所述采样值的子带确定为所述采样值所属的子带。
其中,所述预设顺序例如可以为从低频到高频的顺序或者从高频到低频的顺序等,这里不限制。
例如,参见图1A所示,假设采样值从低到高分别为x1、x2、x3…xn,所述自适应归一化长度假设为5,则可以将x1~x5划分为一个子带,x6~x10划分为一个子带…以此类推,得到若干个子带,则对于x1~x5中的每个采样值而言,子带x1~x5就是每个采样值所属的子带,对于x6~x10中的每个采样值而言,子带x6~x10就是每个采样值所属的子带。
或者,对于每个所述采样值,根据所述自适应归一化长度确定所述采样值所属的子带,可以包括:
对于每个所述采样值,将所述采样值之前m个采样值、所述采样值、所述采样值之后n个采样值构成的子带确定为所述采样值所属的子带,m、n由所述自适应归一化长度确定,m是不小于0的整数,n是不小于0的整数。
例如，参见图1B所示，假设采样值从低到高分别为x1、x2、x3…xn，所述自适应归一化长度假设为5，m取值为2，n取值为2，则对于采样值x3而言，x1~x5构成的子带就是采样值x3所属的子带，对于采样值x4而言，x2~x6构成的子带就是采样值x4所属的子带，以此类推。其中，对于采样值x1、x2而言，由于其之前没有足够的采样值构成其所属子带，对于采样值x(n-1)、xn而言，由于其之后没有足够的采样值构成其所属子带，因此，可以在实际应用中自主设定x1、x2、x(n-1)、xn所属子带，例如添加采样值自身补充子带中缺少的采样值等。举例来说，对于采样值x1，其之前不存在采样值，则可以将x1、x1、x1、x2、x3作为其所属子带。
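下面给出与上述两种子带确定方式相对应的示意代码（Python），用于说明如何由自适应归一化长度得到每个采样值对应的幅度平均值：amplitude_averages_fixed对应图1A中按预设顺序划分子带的方式，amplitude_averages_sliding对应图1B中取前m个采样值、采样值自身、后n个采样值的方式。其中边界处用采样值自身补足缺少的采样值只是可选处理方式之一，函数名与参数均为示意性假设。

    def amplitude_averages_fixed(samples, norm_len):
        # 按预设顺序每norm_len个采样值划分为一个子带，
        # 子带内所有采样值幅度值的平均值即该子带中每个采样值对应的幅度平均值
        amps = [abs(x) for x in samples]
        averages = [0.0] * len(samples)
        for start in range(0, len(samples), norm_len):
            sub = amps[start:start + norm_len]   # 最后一个子带可能不足norm_len个采样值
            avg = sum(sub) / len(sub)
            for i in range(start, start + len(sub)):
                averages[i] = avg
        return averages

    def amplitude_averages_sliding(samples, m, n):
        # 每个采样值所属子带为[前m个采样值, 该采样值, 后n个采样值]，
        # 边界处缺少的采样值用该采样值自身补足（对应上文x1的子带为x1、x1、x1、x2、x3的例子）
        amps = [abs(x) for x in samples]
        length = len(samples)
        averages = []
        for i in range(length):
            window = [amps[j] if 0 <= j < length else amps[i]
                      for j in range(i - m, i + n + 1)]
            averages.append(sum(window) / len(window))
        return averages

例如，amplitude_averages_sliding(x, 2, 2)即对应图1B中m=n=2的举例。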
其中，根据每个所述采样值对应的幅度平均值确定每个所述采样值对应的幅度扰动值时，可以将每个所述采样值对应的幅度平均值直接作为每个所述采样值对应的幅度扰动值，也可以对每个所述采样值对应的幅度平均值做某一预设运算得到每个所述采样值对应的幅度扰动值，所述预设运算例如可以为将所述幅度平均值乘以一个数值，该数值一般大于0。
其中,所述根据每个所述采样值的幅度值及其对应的幅度扰动值计算每个所述采样值的调整幅度值,可以包括:
将每个所述采样值的幅度值与其对应的幅度扰动值相减得到两者的差值,将得到的差值作为每个所述采样值的调整幅度值。
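将幅度平均值映射为幅度扰动值、再由幅度值减去扰动值得到调整幅度值的过程，可示意如下；其中扰动系数perturbation_scale取1（即直接把幅度平均值作为幅度扰动值）仅为示例假设：

    def adjusted_amplitudes(samples, averages, perturbation_scale=1.0):
        # 幅度扰动值 = 幅度平均值 × perturbation_scale（取1时即直接使用幅度平均值），
        # 调整幅度值 = 幅度值 - 幅度扰动值
        return [abs(x) - avg * perturbation_scale
                for x, avg in zip(samples, averages)]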
步骤106：根据每个所述采样值的符号和每个所述采样值的调整幅度值确定第二语音频信号；所述第二语音频信号是所述第一语音频信号恢复噪声成分后得到的信号。
其中,在一种可能的实现方式中,可以根据每个采样值的符号和调整幅度值确定每个采样值的新取值,得到所述第二语音频信号;
在另一种可能的实现方式中,所述根据每个所述采样值的符号和每个所述采样值的调整幅度值确定第二语音频信号,可以包括:
计算修正因子;
根据所述修正因子对采样值的调整幅度值中大于0的调整幅度值进行修正处理;
根据每个采样值的符号和修正处理后的调整幅度值确定每个采样值的新取值,得到第二语音频信号。
在一种可能的实现方式中,得到的所述第二语音频信号可以包括所有采样值的新取值。
其中,所述修正因子可以根据所述自适应归一化长度计算,具体的,所述修正因子β可以等于a/L;其中,a为大于1的常数。
其中,所述根据所述修正因子对采样值的调整幅度值中大于0的调整幅度值进行修正处理,可以包括:
使用以下公式对采样值的调整幅度值中大于0的调整幅度值进行修正处理:
Y=y×(b-β);
其中,Y为修正处理后的调整幅度值,y为采样值的调整幅度值中大于0的调整幅度值,b为常数,0<b<2。
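结合修正因子β=a/L与修正公式Y=y×(b-β)，并利用每个采样值的符号得到第二语音频信号中各采样值的新取值，可以写成如下示意代码；其中a、b的具体取值仅为满足a>1、0<b<2的示例假设，对不大于0的调整幅度值保持不变也只是一种示意处理：

    def reconstruct_second_signal(samples, adjusted, norm_len, a=2.0, b=1.0):
        # 修正因子 beta = a / L；仅对大于0的调整幅度值按 Y = y * (b - beta) 修正，
        # 再根据原采样值的符号确定每个采样值的新取值
        beta = a / norm_len
        result = []
        for x, y in zip(samples, adjusted):
            amp = y * (b - beta) if y > 0 else y
            sign = -1 if x < 0 else 1
            result.append(sign * amp)
        return result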
其中,步骤103中提取所述第一语音频信号中每个采样值的符号的步骤可以在步骤106之前的任意时刻处理,与步骤104、105之间没有必然的执行顺序。
其中,步骤103与步骤104之间的执行顺序不限制。
在现有技术中，当语音频信号是具有上升沿或下降沿的信号时，上升沿或下降沿可能出现在一帧之内，此时语音频信号中部分信号的样点值特别大、能量特别大，而其他部分信号的样点值特别小、能量特别小。此时，如果在频域对语音频信号添加随机噪声信号得到恢复噪声成分后的信号，由于随机噪声信号在一帧内从时域上看能量是相当的，在将恢复噪声成分后的信号的频域信号转换为时域信号时，新添加的随机噪声信号往往会使得转换得到的时域信号中原来样点值特别小的那部分信号的能量增加，这部分信号的样点值也相应变得比较大，这样就会造成恢复噪声成分后的信号具有一些回声，影响恢复噪声成分后的信号的听觉质量。
而本实施例中，根据语音频信号确定第一语音频信号，确定所述第一语音频信号中每个采样值的符号和每个所述采样值的幅度值，确定自适应归一化长度，根据所述自适应归一化长度和每个所述采样值的幅度值确定每个所述采样值的调整幅度值，根据每个所述采样值的符号和每个所述采样值的调整幅度值确定第二语音频信号。这一过程中，只是对第一语音频信号这一原有信号进行处理，并未在第一语音频信号中增加新的信号，从而恢复噪声成分后的第二语音频信号中并未增加新的能量，从而如果第一语音频信号具有上升沿或下降沿，不会增加第二语音频信号中的回声，从而提高了第二语音频信号的听觉质量。
参见图2,为本发明实施例恢复语音频信号噪声成分的方法另一种流程示意图,该方法包括:
步骤201:接收码流,解码所述码流得到语音频信号,解码得到的语音频信号包括低频带信号和高频带信号,将高频带信号确定为第一语音频信号。
其中,如何对码流进行解码,本发明并不限制。
步骤202:确定所述高频带信号中每个采样值的符号以及每个采样值的幅度值。
例如,高频带信号中某一采样值的系数为-4,则该采样值的符号为“-”,幅度值为4。
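步骤202中对每个采样值提取符号和幅度值的操作可简单示意如下（以频谱系数或样点值的数值表示为例）：

    def split_sign_and_amplitude(samples):
        # 返回每个采样值的符号（+1/-1）与幅度值（绝对值），例如系数-4对应符号-1、幅度值4
        signs = [-1 if x < 0 else 1 for x in samples]
        amplitudes = [abs(x) for x in samples]
        return signs, amplitudes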
步骤203:确定自适应归一化长度;
其中,具体如何确定所述自适应归一化长度可以参考步骤104中的相关描述,这里不赘述。
步骤204:根据每个采样值的幅度值以及所述自适应归一化长度确定每个采样值对应的幅度平均值,根据每个采样值对应的幅度平均值确定每个采样值对应的幅度扰动值。
其中,如何确定每个采样值对应的幅度平均值请参考步骤105中的相关描述,这里不赘述。
步骤205:根据每个采样值的幅度值及其对应的幅度扰动值计算每个采样值的调整幅度值;
其中,如何计算每个采样值的调整幅度值可以参考步骤105中的相关描述,这里不赘述。
步骤206:根据每个采样值的符号和调整幅度值确定第二语音频信号。
所述第二语音频信号是所述第一语音频信号恢复噪声成分后得到的信号。
其中,本步骤的具体实现请参考步骤106中的相关描述,这里不赘述。
其中,步骤202中确定第一语音频信号中每个采样值的符号的步骤可以在步骤206之前的任意时刻执行,与步骤203、204、205之间没有必然的执行顺序。
其中,步骤202与步骤203之间的执行顺序不限制。
步骤207:将所述第二语音频信号和解码得到的语音频信号的低频带信号合并,得到输出信号。
其中，如果所述第一语音频信号为解码得到的语音频信号的低频带信号，则可以将所述第二语音频信号和所述解码得到的语音频信号的高频带信号合并，得到输出信号；
如果所述第一语音频信号为解码得到的语音频信号的高频带信号,则可以将所述第二语音频信号和所述解码得到的语音频信号的低频带信号合并,得到输出信号;
如果所述第一语音频信号为解码得到的语音频信号的全频带信号,则可以将所述第二语音频信号直接确定为所述输出信号。
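第二语音频信号与另一路频带信号的合并方式取决于具体编解码器的频带划分与合成流程，下面仅给出一个在频域上把低频带系数与恢复噪声成分后的高频带系数拼接为输出信号的示意，直接拼接只是示例假设，并非对实际合成方式的限定：

    def merge_bands(low_band_coeffs, second_signal_coeffs):
        # 将低频带信号与恢复噪声成分后的第二语音频信号（此处为高频带）拼接，得到输出信号（示意）
        return list(low_band_coeffs) + list(second_signal_coeffs)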
本实施例中,通过对解码得到的语音频信号的高频带信号恢复噪声成分,从而最终恢复高频带信号中的噪声成分,得到第二语音频信号。从而如果高频带信号具有上升沿或下降沿,不会增加第二语音频信号中的回声,提高了第二语音频信号的听觉质量,进而提高了最终输出的所述输出信号的听觉质量。
参见图3,为本发明实施例恢复语音频信号噪声成分的方法另一种流程示意图,该方法包括:
步骤301~步骤305与步骤201~步骤205相同,这里不赘述。
步骤306:计算修正因子,根据所述修正因子对每个采样值的调整幅度值中大于0的调整幅度值进行修正处理;
其中,本步骤的具体实现请参考步骤106中的相关描述,这里不赘述。
步骤307:根据每个采样值的符号和修正处理后的调整幅度值确定第二语音频信号。
其中,本步骤的具体实现请参考步骤106中的相关描述,这里不赘述。
其中,步骤302中确定第一语音频信号中每个采样值的符号的步骤可以在步骤307之前的任意时刻执行,与步骤303、304、305、306之间没有必然的执行顺序。
其中,步骤302与步骤303之间的执行顺序不限制。
步骤308:将所述第二语音频信号和解码得到的语音频信号的低频带信号合并,得到输出信号。
本实施例相对于图2所示的实施例,在得到每个采样值的调整幅度值后,对调整幅度值中大于0的调整幅度值进一步进行修正,从而进一步提高了第二语音频信号的听觉质量,进而也进一步提高了最终输出的所述输出信号的听觉质量。
在图2和图3给出的本发明实施例恢复语音频信号噪声成分的方法示例中，都是将解码得到的语音频信号中的高频带信号确定为第一语音频信号，在其中恢复噪声成分，从而最终得到第二语音频信号。在实际应用中，还可以按照本发明实施例恢复语音频信号噪声成分的方法对解码得到的语音频信号的全频带信号恢复噪声成分，或者对解码得到的语音频信号的低频带信号恢复噪声成分，最终得到第二语音频信号，其实现过程可以参见图2和图3所示的方法示例，区别仅在于在确定第一语音频信号时将全频带信号或者低频带信号确定为所述第一语音频信号，这里不一一举例说明。
参见图4,为本发明实施例一种恢复语音频信号噪声成分的装置结构示意图,该装置可以设置于电子设备中,该装置400可以包括:
码流处理单元410，用于接收码流，解码所述码流得到语音频信号；
信号确定单元420，用于根据所述码流处理单元410得到的所述语音频信号确定第一语音频信号，所述第一语音频信号是解码得到的所述语音频信号中需要恢复噪声成分的信号；
第一确定单元430,用于确定所述信号确定单元420确定的所述第一语音频信号中每个采样值的符号和每个所述采样值的幅度值;
第二确定单元440,用于确定自适应归一化长度;
第三确定单元450,用于根据所述第二确定单元440确定的所述自适应归一化长度和所述第一确定单元430确定的每个所述采样值的幅度值确定每个所述采样值的调整幅度值;
第四确定单元460,用于根据所述第一确定单元430确定的每个所述采样值的符号和所述第三确定单元450确定的每个所述采样值的调整幅度值确定第二语音频信号,所述第二语音频信号是所述第一语音频信号恢复噪声成分后得到的信号。
可选地,所述第三确定单元450可以包括:
确定子单元,用于根据每个所述采样值的幅度值以及所述自适应归一化长度计算每个所述采样值对应的幅度平均值,根据每个所述采样值对应的幅度平均值确定每个所述采样值对应的幅度扰动值;
调整幅度值计算子单元,用于根据每个所述采样值的幅度值及其对应的幅度扰动值计算每个所述采样值的调整幅度值。
可选地,所述确定子单元可以包括:
确定模块,用于对于每个所述采样值,根据所述自适应归一化长度确定所述采样值所属的子带;
计算模块,用于计算所述采样值所属子带内所有采样值的幅度值的平均值,将计算得到的平均值作为所述采样值对应的幅度平均值。
可选地,所述确定模块具体可以用于:
将所有采样值按照预设顺序根据所述自适应归一化长度划分子带;对于每个所述采样值,将包括所述采样值的子带确定为所述采样值所属的子带;或者,
对于每个所述采样值,将所述采样值之前m个采样值、所述采样值、所述采样值之后n个采样值构成的子带确定为所述采样值所属的子带,m、n由所述自适应归一化长度确定,m是不小于0的整数,n是不小于0的整数。
可选地,所述调整幅度值计算子单元具体用于:
将每个所述采样值的幅度值与其对应的幅度扰动值相减得到两者的差值,将得到的差值作为每个所述采样值的调整幅度值。
可选地,所述第二确定单元440可以包括:
划分子单元,用于将所述语音频信号中的低频带信号划分为N个子带;N为自然数;
个数确定子单元,用于计算每个所述子带的峰均比,并确定所述峰均比大于预设峰均比阈值的子带个数;
长度计算子单元,用于根据所述语音频信号中高频带信号的信号类型和所述子带个数,计算所述自适应归一化长度。
可选地,所述长度计算子单元具体可以用于:
根据公式L=K+α×M计算所述自适应归一化长度;
其中,L为所述自适应归一化长度;K为所述语音频信号中的高频带信号的信号类型对应的数值,不同高频带信号的信号类型对应的K的数值不同;M为峰均比大于预设峰均比阈值的子带个数;α为小于1的常数。
可选地,所述第二确定单元440具体可以用于:
计算所述语音频信号中低频带信号的峰均比,和所述语音频信号中高频带信号的峰均比;当所述低频带信号的峰均比和所述高频带信号的峰均比的差值的绝对值小于预设差值阈值时,将所述自适应归一化长度确定为预设第一长度值,当所述低频带信号的峰均比和所述高频带信号的峰均比的差值的绝对值不小于预设差值阈值时,将所述自适应归一化长度确定为预设第二长度值;所述第一长度值大于所述第二长度值;或者,
计算所述语音频信号中低频带信号的峰均比,和所述语音频信号中高频带信号的峰均比;当所述低频带信号的峰均比小于所述高频带信号的峰均比时,将所述自适应归一化长度确定为预设第一长度值,当所述低频带信号的峰均比不小于所述高频带信号的峰均比时,将所述自适应归一化长度确定为预设第二长度值;或者,
根据所述语音频信号中高频带信号的信号类型确定所述自适应归一化长度,不同高频带信号的信号类型对应的自适应归一化长度不同。
可选地,所述第四确定单元460具体可以用于:
根据每个所述采样值的符号和调整幅度值确定每个所述采样值的新取值,得到所述第二语音频信号;或者,
计算修正因子;根据所述修正因子对所述采样值的调整幅度值中大于0的调整幅度值进行修正处理;根据每个所述采样值的符号和修正处理后的调整幅度值确定每个所述采样值的新取值,得到第二语音频信号。
可选地,所述第四确定单元460具体可以用于:使用公式β=a/L计算所述修正因子;其中,β为所述修正因子,L为所述自适应归一化长度,a为大于1的常数。
可选地,所述第四确定单元460具体可以用于:
使用以下公式对所述采样值的调整幅度值中大于0的调整幅度值进行修正处理:
Y=y×(b-β);
其中，Y为修正处理后的调整幅度值，y为所述采样值的调整幅度值中大于0的调整幅度值，b为常数，0<b<2。
本实施例中,根据语音频信号确定第一语音频信号,确定所述第一语音频信号中每个采样值的符号和每个所述采样值的幅度值,确定自适应归一化长度,根据所述自适应归一化长度和每个所述采样值的幅度值确定每个所述采样值的调整幅度值,根据每个所述采样值的符号和每个所述采样值的调整幅度值确定第二语音频信号。这一过程中,只是对第一语音频信号这一原有信号进行处理,并未在第一语音频信号中增加新的信号,从而恢复噪声成分后的第二语音频信号中并未增加新的能量,从而如果第一语音频信号具有上升沿或下降沿,不会增加第二语音频信号中的回声,从而提高了第二语音频信号的听觉质量。
参见图5,为本发明实施例电子设备结构图,该电子设备500包括:处理器510、存储器520、收发器530和总线540;
处理器510、存储器520、收发器530通过总线540相互连接;总线540可以是ISA总线、PCI总线或EISA总线等。所述总线可以分为地址总线、数据总线、控制总线等。为便于表示,图5中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
存储器520,用于存放程序。具体地,程序可以包括程序代码,所述程序代码包括计算机操作指令。存储器520可能包含高速RAM存储器,也可能还包括非易失性存储器(non-volatile memory),例如至少一个磁盘存储器。
收发器530用于连接其他设备,并与其他设备进行通信。具体的所述收发器530可以用于:接收码流;
所述处理器510执行存储器520中存储的所述程序代码，用于解码所述码流得到语音频信号；根据所述语音频信号确定第一语音频信号；确定所述第一语音频信号中每个采样值的符号和每个所述采样值的幅度值；确定自适应归一化长度；根据所述自适应归一化长度和每个所述采样值的幅度值确定每个所述采样值的调整幅度值；根据每个所述采样值的符号和每个所述采样值的调整幅度值确定第二语音频信号。
可选地,所述处理器510具体可以用于:
根据每个所述采样值的幅度值以及所述自适应归一化长度计算每个所述采样值对应的幅度平均值,根据每个所述采样值对应的幅度平均值确定每个所述采样值对应的幅度扰动值;
根据每个所述采样值的幅度值及其对应的幅度扰动值计算每个所述采样值的调整幅度值。
可选地,所述处理器510具体可以用于:
对于每个所述采样值,根据所述自适应归一化长度确定所述采样值所属的子带;
计算所述采样值所属子带内所有采样值的幅度值的平均值,将计算得到的平均值作为所述采样值对应的幅度平均值。
可选地,所述处理器510具体可以用于:
将所有采样值按照预设顺序根据所述自适应归一化长度划分子带;对于每个所述采样值,将包括所述采样值的子带确定为所述采样值所属的子带;或者,
对于每个所述采样值,将所述采样值之前m个采样值、所述采样值、所述采样值之后n个采样值构成的子带确定为所述采样值所属的子带,m、n由所述自适应归一化长度确定,m是不小于0的整数,n是不小于0的整数。
可选地,所述处理器510具体可以用于:
将每个所述采样值的幅度值与其对应的幅度扰动值相减得到两者的差值,将得到的差值作为每个所述采样值的调整幅度值。
可选地,所述处理器510具体可以用于:
将所述语音频信号中的低频带信号划分为N个子带;N为自然数;
计算每个所述子带的峰均比,并确定所述峰均比大于预设峰均比阈值的子带个数;
根据所述语音频信号中高频带信号的信号类型和所述子带个数,计算所述自适应归一化长度。
可选地,所述处理器510具体可以用于:
根据公式L=K+α×M计算所述自适应归一化长度;
其中,L为所述自适应归一化长度;K为所述语音频信号中的高频带信号的信号类型对应的数值,不同高频带信号的信号类型对应的K的数值不同;M为峰均比大于预设峰均比阈值的子带个数;α为小于1的常数。
可选地,所述处理器510具体可以用于:
计算所述语音频信号中低频带信号的峰均比，和所述语音频信号中高频带信号的峰均比；当所述低频带信号的峰均比和所述高频带信号的峰均比的差值的绝对值小于预设差值阈值时，将所述自适应归一化长度确定为预设第一长度值，当所述低频带信号的峰均比和所述高频带信号的峰均比的差值的绝对值不小于预设差值阈值时，将所述自适应归一化长度确定为预设第二长度值；所述第一长度值大于所述第二长度值；或者，
计算所述语音频信号中低频带信号的峰均比,和所述语音频信号中高频带信号的峰均比;当所述低频带信号的峰均比小于所述高频带信号的峰均比时,将所述自适应归一化长度确定为预设第一长度值,当所述低频带信号的峰均比不小于所述高频带信号的峰均比时,将所述自适应归一化长度确定为预设第二长度值;或者,
根据所述语音频信号中高频带信号的信号类型确定所述自适应归一化长度,不同高频带信号的信号类型对应的自适应归一化长度不同。
可选地,所述处理器510具体可以用于:
根据每个所述采样值的符号和调整幅度值确定每个所述采样值的新取值,得到所述第二语音频信号;或者,
计算修正因子;根据所述修正因子对所述采样值的调整幅度值中大于0的调整幅度值进行修正处理;根据每个所述采样值的符号和修正处理后的调整幅度值确定每个所述采样值的新取值,得到第二语音频信号。
可选地,所述处理器510具体可以用于:
使用公式β=a/L计算所述修正因子;其中,β为所述修正因子,L为所述自适应归一化长度,a为大于1的常数。
可选地,所述处理器510具体可以用于:
使用以下公式对所述采样值的调整幅度值中大于0的调整幅度值进行修正处理:
Y=y×(b-β);
其中,Y为修正处理后的调整幅度值,y为所述采样值的调整幅度值中大于0的调整幅度值,b为常数,0<b<2。
本实施例中,电子设备根据语音频信号确定第一语音频信号,确定所述第一语音频信号中每个采样值的符号和每个所述采样值的幅度值,确定自适应归一化长度,根据所述自适应归一化长度和每个所述采样值的幅度值确定每个所述采样值的调整幅度值,根据每个所述采样值的符号和每个所述采样值的调整幅度值确定第二语音频信号。这一过程中,只是对第一语音频信号这一原有信号进行处理,并未在第一语音频信号中增加新的信号,从而恢复噪声成分后的第二语音频信号中并未增加新的能量,从而如果第一语音频信号具有上升沿或下降沿,不会增加第二语音频信号中的回声,从而提高了第二语音频信号的听觉质量。
对于系统实施例而言,由于其基本对应于方法实施例,所以相关之处参见方法实施例的部分说明即可。以上所描述的系统实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。本领域普通技术人员在不付出创造性劳动的情况下,即可以理解并实施。
本发明可以在由计算机执行的计算机可执行指令的一般上下文中描述,例如程序模块。一般地,程序模块包括执行特定任务或实现特定抽象数据类型的例程、程序、对象、组件、数据结构等等。也可以在分布式计算环境中实践本发明,在这些分布式计算环境中,由通过通信网络而被连接的远程处理设备来执行任务。在分布式计算环境中,程序模块可以位于包括存储设备在内的本地和远程计算机存储介质中。
本领域普通技术人员可以理解，实现上述方法实施方式中的全部或部分步骤可以通过程序指令相关的硬件来完成，所述的程序可以存储于计算机可读取存储介质中，这里所称的存储介质，如：ROM、RAM、磁碟、光盘等。
还需要说明的是,在本文中,诸如第一和第二等之类的关系术语仅仅用来将一个实体或者操作与另一个实体或操作区分开来,而不一定要求或者暗示这些实体或操作之间存在任何这种实际的关系或者顺序。而且,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者设备不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者设备所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括所述要素的过程、方法、物品或者设备中还存在另外的相同要素。
以上所述仅为本发明的较佳实施例而已，并非用于限定本发明的保护范围。本文中应用了具体个例对本发明的原理及实施方式进行了阐述，以上实施例的说明只是用于帮助理解本发明的方法及其核心思想；同时，对于本领域的一般技术人员，依据本发明的思想，在具体实施方式及应用范围上均会有改变之处。综上所述，本说明书内容不应理解为对本发明的限制。凡在本发明的精神和原则之内所作的任何修改、等同替换、改进等，均包含在本发明的保护范围内。

Claims (22)

  1. 一种语音频信号的处理方法,其特征在于,所述方法包括:
    接收码流,解码所述码流得到语音频信号;
    根据所述语音频信号确定第一语音频信号,所述第一语音频信号是所述语音频信号中需要恢复噪声成分的信号;
    确定所述第一语音频信号中每个采样值的符号和每个所述采样值的幅度值;
    确定自适应归一化长度;
    根据所述自适应归一化长度和每个所述采样值的幅度值确定每个所述采样值的调整幅度值;
    根据每个所述采样值的符号和每个所述采样值的调整幅度值确定第二语音频信号,所述第二语音频信号是所述第一语音频信号恢复噪声成分后得到的信号。
  2. 根据权利要求1所述的方法,其特征在于,所述根据所述自适应归一化长度和每个所述采样值的幅度值确定每个所述采样值的调整幅度值,包括:
    根据每个所述采样值的幅度值以及所述自适应归一化长度计算每个所述采样值对应的幅度平均值,根据每个所述采样值对应的幅度平均值确定每个所述采样值对应的幅度扰动值;
    根据每个所述采样值的幅度值及其对应的幅度扰动值计算每个所述采样值的调整幅度值。
  3. 根据权利要求2所述的方法,其特征在于,所述根据每个所述采样值的幅度值以及所述自适应归一化长度计算每个所述采样值对应的幅度平均值,包括:
    对于每个所述采样值,根据所述自适应归一化长度确定所述采样值所属的子带;
    计算所述采样值所属子带内所有采样值的幅度值的平均值,将计算得到的平均值作为所述采样值对应的幅度平均值。
  4. 根据权利要求3所述的方法,其特征在于,对于每个所述采样值,根据所述自适应归一化长度确定所述采样值所属的子带,包括:
    将所有采样值按照预设顺序根据所述自适应归一化长度划分子带;对于每个所述采样值,将包括所述采样值的子带确定为所述采样值所属的子带;或者,
    对于每个所述采样值，将所述采样值之前m个采样值、所述采样值、所述采样值之后n个采样值构成的子带确定为所述采样值所属的子带，m、n由所述自适应归一化长度确定，m是不小于0的整数，n是不小于0的整数。
  5. 根据权利要求2至4任一项所述的方法,其特征在于,所述根据每个所述采样值的幅度值及其对应的幅度扰动值计算每个所述采样值的调整幅度值,包括:
    将每个所述采样值的幅度值与其对应的幅度扰动值相减得到两者的差值,将得到的差值作为每个所述采样值的调整幅度值。
  6. 根据权利要求1至5任一项所述的方法,其特征在于,所述确定自适应归一化长度,包括:
    将所述语音频信号中的低频带信号划分为N个子带;N为自然数;
    计算每个所述子带的峰均比,并确定所述峰均比大于预设峰均比阈值的子带个数;
    根据所述语音频信号中高频带信号的信号类型和所述子带个数,计算所述自适应归一化长度。
  7. 根据权利要求6所述的方法,其特征在于,所述根据所述语音频信号中高频带信号的信号类型和所述子带个数,计算所述自适应归一化长度,包括:
    根据公式L=K+α×M计算所述自适应归一化长度;
    其中,L为所述自适应归一化长度;K为所述语音频信号中的高频带信号的信号类型对应的数值,不同高频带信号的信号类型对应的K的数值不同;M为峰均比大于预设峰均比阈值的子带个数;α为小于1的常数。
  8. 根据权利要求1至5任一项所述的方法,其特征在于,所述确定自适应归一化长度,包括:
    计算所述语音频信号中低频带信号的峰均比,和所述语音频信号中高频带信号的峰均比;当所述低频带信号的峰均比和所述高频带信号的峰均比的差值的绝对值小于预设差值阈值时,将所述自适应归一化长度确定为预设第一长度值,当所述低频带信号的峰均比和所述高频带信号的峰均比的差值的绝对值不小于预设差值阈值时,将所述自适应归一化长度确定为预设第二长度值;所述第一长度值大于所述第二长度值;或者,
    计算所述语音频信号中低频带信号的峰均比,和所述语音频信号中高频带信号的峰均比;当所述低频带信号的峰均比小于所述高频带信号的峰均比时,将所述自适应归一化长度确定为预设第一长度值,当所述低频带信号的峰均比不小于所述高频带信号的峰均比时,将所述自适应归一化长度确定为预设第二长度值;或者,
    根据所述语音频信号中高频带信号的信号类型确定所述自适应归一化长度，不同高频带信号的信号类型对应的自适应归一化长度不同。
  9. 根据权利要求1至8任一项所述的方法,其特征在于,所述根据每个所述采样值的符号和每个所述采样值的调整幅度值确定第二语音频信号,包括:
    根据每个所述采样值的符号和调整幅度值确定每个所述采样值的新取值,得到所述第二语音频信号;或者,
    计算修正因子;根据所述修正因子对所述采样值的调整幅度值中大于0的调整幅度值进行修正处理;根据每个所述采样值的符号和修正处理后的调整幅度值确定每个所述采样值的新取值,得到第二语音频信号。
  10. 根据权利要求9所述的方法,其特征在于,所述计算修正因子,包括:
    使用公式β=a/L计算所述修正因子;其中,β为所述修正因子,L为所述自适应归一化长度,a为大于1的常数。
  11. 根据权利要求9或10所述的方法,其特征在于,所述根据所述修正因子对所述采样值的调整幅度值中大于0的调整幅度值进行修正处理,包括:
    使用以下公式对所述采样值的调整幅度值中大于0的调整幅度值进行修正处理:
    Y=y×(b-β);
    其中,Y为修正处理后的调整幅度值,y为所述采样值的调整幅度值中大于0的调整幅度值,b为常数,0<b<2。
  12. 一种恢复语音频信号噪声成分的装置,其特征在于,包括:
    码流处理单元,用于接收码流,解码所述码流得到语音频信号;
    信号确定单元,用于根据所述码流处理单元得到的所述语音频信号确定第一语音频信号,所述第一语音频信号是解码得到的所述语音频信号中需要恢复噪声成分的信号;
    第一确定单元,用于确定所述信号确定单元确定的所述第一语音频信号中每个采样值的符号和每个所述采样值的幅度值;
    第二确定单元,用于确定自适应归一化长度;
    第三确定单元,用于根据所述第二确定单元确定的所述自适应归一化长度和所述第一确定单元确定的每个所述采样值的幅度值确定每个所述采样值的调整幅度值;
    第四确定单元，用于根据所述第一确定单元确定的每个所述采样值的符号和所述第三确定单元确定的每个所述采样值的调整幅度值确定第二语音频信号，所述第二语音频信号是所述第一语音频信号恢复噪声成分后得到的信号。
  13. 根据权利要求12所述的装置,其特征在于,所述第三确定单元包括:
    确定子单元,用于根据每个所述采样值的幅度值以及所述自适应归一化长度计算每个所述采样值对应的幅度平均值,根据每个所述采样值对应的幅度平均值确定每个所述采样值对应的幅度扰动值;
    调整幅度值计算子单元,用于根据每个所述采样值的幅度值及其对应的幅度扰动值计算每个所述采样值的调整幅度值。
  14. 根据权利要求13所述的装置,其特征在于,所述确定子单元包括:
    确定模块,用于对于每个所述采样值,根据所述自适应归一化长度确定所述采样值所属的子带;
    计算模块,用于计算所述采样值所属子带内所有采样值的幅度值的平均值,将计算得到的平均值作为所述采样值对应的幅度平均值。
  15. 根据权利要求14所述的装置,其特征在于,所述确定模块具体用于:
    将所有采样值按照预设顺序根据所述自适应归一化长度划分子带;对于每个所述采样值,将包括所述采样值的子带确定为所述采样值所属的子带;或者,
    对于每个所述采样值,将所述采样值之前m个采样值、所述采样值、所述采样值之后n个采样值构成的子带确定为所述采样值所属的子带,m、n由所述自适应归一化长度确定,m是不小于0的整数,n是不小于0的整数。
  16. 根据权利要求13至15任一项所述的装置,其特征在于,所述调整幅度值计算子单元具体用于:
    将每个所述采样值的幅度值与其对应的幅度扰动值相减得到两者的差值,将得到的差值作为每个所述采样值的调整幅度值。
  17. 根据权利要求12至16任一项所述的装置,其特征在于,所述第二确定单元包括:
    划分子单元,用于将所述语音频信号中的低频带信号划分为N个子带;N为自然数;
    个数确定子单元,用于计算每个所述子带的峰均比,并确定所述峰均比大于预设峰均比阈值的子带个数;
    长度计算子单元,用于根据所述语音频信号中高频带信号的信号类型和所述子带个数,计算所述自适应归一化长度。
  18. 根据权利要求17所述的装置,其特征在于,所述长度计算子单元具体用于:
    根据公式L=K+α×M计算所述自适应归一化长度;
    其中,L为所述自适应归一化长度;K为所述语音频信号中的高频带信号的信号类型对应的数值,不同高频带信号的信号类型对应的K的数值不同;M为峰均比大于预设峰均比阈值的子带个数;α为小于1的常数。
  19. 根据权利要求12至16任一项所述的装置,其特征在于,所述第二确定单元具体用于:
    计算所述语音频信号中低频带信号的峰均比,和所述语音频信号中高频带信号的峰均比;当所述低频带信号的峰均比和所述高频带信号的峰均比的差值的绝对值小于预设差值阈值时,将所述自适应归一化长度确定为预设第一长度值,当所述低频带信号的峰均比和所述高频带信号的峰均比的差值的绝对值不小于预设差值阈值时,将所述自适应归一化长度确定为预设第二长度值;所述第一长度值大于所述第二长度值;或者,
    计算所述语音频信号中低频带信号的峰均比,和所述语音频信号中高频带信号的峰均比;当所述低频带信号的峰均比小于所述高频带信号的峰均比时,将所述自适应归一化长度确定为预设第一长度值,当所述低频带信号的峰均比不小于所述高频带信号的峰均比时,将所述自适应归一化长度确定为预设第二长度值;或者,
    根据所述语音频信号中高频带信号的信号类型确定所述自适应归一化长度,不同高频带信号的信号类型对应的自适应归一化长度不同。
  20. 根据权利要求12至19任一项所述的装置,其特征在于,所述第四确定单元具体用于:
    根据每个所述采样值的符号和调整幅度值确定每个所述采样值的新取值,得到所述第二语音频信号;或者,
    计算修正因子;根据所述修正因子对所述采样值的调整幅度值中大于0的调整幅度值进行修正处理;根据每个所述采样值的符号和修正处理后的调整幅度值确定每个所述采样值的新取值,得到第二语音频信号。
  21. 根据权利要求20所述的装置,其特征在于,所述第四确定单元具体用于:使用公式β=a/L计算所述修正因子;其中,β为所述修正因子,L为所述自适应归一化长度,a为大于1的常数。
  22. 根据权利要求20或21所述的装置,其特征在于,所述第四确定单元具体用于:
    使用以下公式对所述采样值的调整幅度值中大于0的调整幅度值进行修正处理:
    Y=y×(b-β);
    其中，Y为修正处理后的调整幅度值，y为所述采样值的调整幅度值中大于0的调整幅度值，b为常数，0<b<2。
PCT/CN2015/071017 2014-06-03 2015-01-19 一种语音频信号的处理方法和装置 WO2015184813A1 (zh)

Priority Applications (19)

Application Number Priority Date Filing Date Title
RU2016152224A RU2651184C1 (ru) 2014-06-03 2015-01-19 Способ обработки речевого/звукового сигнала и устройство
JP2016570979A JP6462727B2 (ja) 2014-06-03 2015-01-19 音声/オーディオ信号を処理するための方法および装置
NZ727567A NZ727567A (en) 2014-06-03 2015-01-19 Method for processing speech/audio signal and apparatus
EP23184053.9A EP4283614A3 (en) 2014-06-03 2015-01-19 Method for processing speech/audio signal and apparatus
AU2015271580A AU2015271580B2 (en) 2014-06-03 2015-01-19 Method for processing speech/audio signal and apparatus
EP19190663.5A EP3712890B1 (en) 2014-06-03 2015-01-19 Method for processing speech/audio signal and apparatus
KR1020207011385A KR102201791B1 (ko) 2014-06-03 2015-01-19 오디오 신호를 처리하기 위한 방법 및 장치
BR112016028375-9A BR112016028375B1 (pt) 2014-06-03 2015-01-19 Método para processar sinal de fala/áudio e aparelho
KR1020167035690A KR101943529B1 (ko) 2014-06-03 2015-01-19 오디오 신호를 처리하기 위한 방법 및 장치
KR1020197002091A KR102104561B1 (ko) 2014-06-03 2015-01-19 오디오 신호를 처리하기 위한 방법 및 장치
SG11201610141RA SG11201610141RA (en) 2014-06-03 2015-01-19 Method for processing speech/audio signal and apparatus
CA2951169A CA2951169C (en) 2014-06-03 2015-01-19 Method for processing speech/audio signal and apparatus
EP15802508.0A EP3147900B1 (en) 2014-06-03 2015-01-19 Method and device for processing audio signal
MX2016015950A MX362612B (es) 2014-06-03 2015-01-19 Metodo para procesar señal de voz/audio y aparato.
IL249337A IL249337B (en) 2014-06-03 2016-12-01 Method and apparatus for processing speech/audio signals
US15/369,396 US9978383B2 (en) 2014-06-03 2016-12-05 Method for processing speech/audio signal and apparatus
ZA2016/08477A ZA201608477B (en) 2014-06-03 2016-12-08 Method for processing speech/audio signal and apparatus
US15/985,281 US10657977B2 (en) 2014-06-03 2018-05-21 Method for processing speech/audio signal and apparatus
US16/877,389 US11462225B2 (en) 2014-06-03 2020-05-18 Method for processing speech/audio signal and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410242233.2 2014-06-03
CN201410242233.2A CN105336339B (zh) 2014-06-03 2014-06-03 一种语音频信号的处理方法和装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/369,396 Continuation US9978383B2 (en) 2014-06-03 2016-12-05 Method for processing speech/audio signal and apparatus

Publications (1)

Publication Number Publication Date
WO2015184813A1 true WO2015184813A1 (zh) 2015-12-10

Family

ID=54766052

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/071017 WO2015184813A1 (zh) 2014-06-03 2015-01-19 一种语音频信号的处理方法和装置

Country Status (19)

Country Link
US (3) US9978383B2 (zh)
EP (3) EP3147900B1 (zh)
JP (3) JP6462727B2 (zh)
KR (3) KR102201791B1 (zh)
CN (2) CN110097892B (zh)
AU (1) AU2015271580B2 (zh)
BR (1) BR112016028375B1 (zh)
CA (1) CA2951169C (zh)
CL (1) CL2016003121A1 (zh)
ES (1) ES2964221T3 (zh)
HK (1) HK1220543A1 (zh)
IL (1) IL249337B (zh)
MX (2) MX362612B (zh)
MY (1) MY179546A (zh)
NZ (1) NZ727567A (zh)
RU (1) RU2651184C1 (zh)
SG (1) SG11201610141RA (zh)
WO (1) WO2015184813A1 (zh)
ZA (1) ZA201608477B (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110097892B (zh) * 2014-06-03 2022-05-10 华为技术有限公司 一种语音频信号的处理方法和装置
CN108133712B (zh) * 2016-11-30 2021-02-12 华为技术有限公司 一种处理音频数据的方法和装置
CN106847299B (zh) * 2017-02-24 2020-06-19 喜大(上海)网络科技有限公司 延时的估计方法及装置
RU2754497C1 (ru) * 2020-11-17 2021-09-02 федеральное государственное автономное образовательное учреждение высшего образования "Казанский (Приволжский) федеральный университет" (ФГАОУ ВО КФУ) Способ передачи речевых файлов по зашумленному каналу и устройство для его реализации
US20230300524A1 (en) * 2022-03-21 2023-09-21 Qualcomm Incorporated Adaptively adjusting an input current limit for a boost converter

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020120439A1 (en) * 2001-02-28 2002-08-29 Telefonaktiebolaget Lm Ericsson (Publ) Method and apparatus for providing comfort noise in communication system with discontinuous transmission
WO2003042982A1 (en) * 2001-11-13 2003-05-22 Acoustic Technologies Inc. Comfort noise including recorded noise
CN101320563A (zh) * 2007-06-05 2008-12-10 华为技术有限公司 一种背景噪声编码/解码装置、方法和通信设备
CN101335003A (zh) * 2007-09-28 2008-12-31 华为技术有限公司 噪声生成装置、及方法
CN101366077A (zh) * 2005-08-31 2009-02-11 摩托罗拉公司 在语音通信系统中产生舒适噪声的方法和设备
CN101483042A (zh) * 2008-03-20 2009-07-15 华为技术有限公司 一种噪声生成方法以及噪声生成装置
US8139777B2 (en) * 2007-10-31 2012-03-20 Qnx Software Systems Co. System for comfort noise injection
JP2013015598A (ja) * 2011-06-30 2013-01-24 Zte Corp オーディオ符号化/復号化方法、システム及びノイズレベルの推定方法

Family Cites Families (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6261312B1 (en) 1998-06-23 2001-07-17 Innercool Therapies, Inc. Inflatable catheter for selective organ heating and cooling and method of using the same
SE9803698L (sv) * 1998-10-26 2000-04-27 Ericsson Telefon Ab L M Metoder och anordningar i ett telekommunikationssystem
CA2252170A1 (en) * 1998-10-27 2000-04-27 Bruno Bessette A method and device for high quality coding of wideband speech and audio signals
US6687668B2 (en) * 1999-12-31 2004-02-03 C & S Technology Co., Ltd. Method for improvement of G.723.1 processing time and speech quality and for reduction of bit rate in CELP vocoder and CELP vococer using the same
US6631139B2 (en) * 2001-01-31 2003-10-07 Qualcomm Incorporated Method and apparatus for interoperability between voice transmission systems during speech inactivity
KR100935961B1 (ko) * 2001-11-14 2010-01-08 파나소닉 주식회사 부호화 장치 및 복호화 장치
US7536298B2 (en) * 2004-03-15 2009-05-19 Intel Corporation Method of comfort noise generation for speech communication
US7831421B2 (en) * 2005-05-31 2010-11-09 Microsoft Corporation Robust decoder
WO2008007700A1 (fr) 2006-07-12 2008-01-17 Panasonic Corporation Dispositif de décodage de son, dispositif de codage de son, et procédé de compensation de trame perdue
JP5281575B2 (ja) 2006-09-18 2013-09-04 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ オーディオオブジェクトのエンコード及びデコード
PL2301020T3 (pl) 2008-07-11 2013-06-28 Fraunhofer Ges Forschung Urządzenie i sposób do kodowania/dekodowania sygnału audio z użyciem algorytmu przełączania aliasingu
PL2146344T3 (pl) 2008-07-17 2017-01-31 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Sposób kodowania/dekodowania sygnału audio obejmujący przełączalne obejście
CN101483048B (zh) 2009-02-06 2010-08-25 凌阳科技股份有限公司 光学储存装置及其回路增益值的自动校正方法
US9047875B2 (en) 2010-07-19 2015-06-02 Futurewei Technologies, Inc. Spectrum flatness control for bandwidth extension
CN102436820B (zh) 2010-09-29 2013-08-28 华为技术有限公司 高频带信号编码方法及装置、高频带信号解码方法及装置
TWI576829B (zh) * 2011-05-13 2017-04-01 三星電子股份有限公司 位元配置裝置
US20130006644A1 (en) * 2011-06-30 2013-01-03 Zte Corporation Method and device for spectral band replication, and method and system for audio decoding
CN102208188B (zh) * 2011-07-13 2013-04-17 华为技术有限公司 音频信号编解码方法和设备
US20130132100A1 (en) 2011-10-28 2013-05-23 Electronics And Telecommunications Research Institute Apparatus and method for codec signal in a communication system
JP6239521B2 (ja) * 2011-11-03 2017-11-29 ヴォイスエイジ・コーポレーション 低レートcelpデコーダに関する非音声コンテンツの向上
US9305567B2 (en) 2012-04-23 2016-04-05 Qualcomm Incorporated Systems and methods for audio signal processing
CN110097892B (zh) * 2014-06-03 2022-05-10 华为技术有限公司 一种语音频信号的处理方法和装置
US20200333702A1 (en) 2019-04-19 2020-10-22 Canon Kabushiki Kaisha Forming apparatus, forming method, and article manufacturing method

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020120439A1 (en) * 2001-02-28 2002-08-29 Telefonaktiebolaget Lm Ericsson (Publ) Method and apparatus for providing comfort noise in communication system with discontinuous transmission
WO2003042982A1 (en) * 2001-11-13 2003-05-22 Acoustic Technologies Inc. Comfort noise including recorded noise
CN101366077A (zh) * 2005-08-31 2009-02-11 摩托罗拉公司 在语音通信系统中产生舒适噪声的方法和设备
CN101320563A (zh) * 2007-06-05 2008-12-10 华为技术有限公司 一种背景噪声编码/解码装置、方法和通信设备
CN101335003A (zh) * 2007-09-28 2008-12-31 华为技术有限公司 噪声生成装置、及方法
US8139777B2 (en) * 2007-10-31 2012-03-20 Qnx Software Systems Co. System for comfort noise injection
CN101483042A (zh) * 2008-03-20 2009-07-15 华为技术有限公司 一种噪声生成方法以及噪声生成装置
JP2013015598A (ja) * 2011-06-30 2013-01-24 Zte Corp オーディオ符号化/復号化方法、システム及びノイズレベルの推定方法

Also Published As

Publication number Publication date
ES2964221T3 (es) 2024-04-04
BR112016028375A2 (pt) 2017-08-22
US11462225B2 (en) 2022-10-04
EP3147900A1 (en) 2017-03-29
US20170084282A1 (en) 2017-03-23
EP3147900A4 (en) 2017-05-03
RU2651184C1 (ru) 2018-04-18
AU2015271580A1 (en) 2017-01-19
EP3712890B1 (en) 2023-08-30
KR20170008837A (ko) 2017-01-24
IL249337A0 (en) 2017-02-28
JP2021060609A (ja) 2021-04-15
MX2016015950A (es) 2017-04-05
AU2015271580B2 (en) 2018-01-18
MX2019001193A (es) 2019-06-12
KR102201791B1 (ko) 2021-01-11
JP6817283B2 (ja) 2021-01-20
JP2019061282A (ja) 2019-04-18
EP4283614A2 (en) 2023-11-29
KR20200043548A (ko) 2020-04-27
US9978383B2 (en) 2018-05-22
KR101943529B1 (ko) 2019-01-29
CN105336339A (zh) 2016-02-17
JP2017517034A (ja) 2017-06-22
MY179546A (en) 2020-11-10
CA2951169C (en) 2019-12-31
EP4283614A3 (en) 2024-02-21
CN110097892B (zh) 2022-05-10
EP3147900B1 (en) 2019-10-02
CL2016003121A1 (es) 2017-04-28
US20180268830A1 (en) 2018-09-20
KR102104561B1 (ko) 2020-04-24
US10657977B2 (en) 2020-05-19
US20200279572A1 (en) 2020-09-03
JP6462727B2 (ja) 2019-01-30
HK1220543A1 (zh) 2017-05-05
CN110097892A (zh) 2019-08-06
CN105336339B (zh) 2019-05-03
IL249337B (en) 2020-09-30
CA2951169A1 (en) 2015-12-10
KR20190009440A (ko) 2019-01-28
EP3712890A1 (en) 2020-09-23
SG11201610141RA (en) 2017-01-27
BR112016028375B1 (pt) 2022-09-27
JP7142674B2 (ja) 2022-09-27
MX362612B (es) 2019-01-28
ZA201608477B (en) 2018-08-29
NZ727567A (en) 2018-01-26

Similar Documents

Publication Publication Date Title
WO2015184813A1 (zh) 一种语音频信号的处理方法和装置
US9916837B2 (en) Methods and apparatuses for transmitting and receiving audio signals
KR101924767B1 (ko) 음성 주파수 코드 스트림 디코딩 방법 및 디바이스
JP6397082B2 (ja) 符号化方法、復号化方法、符号化装置及び復号化装置
WO2014194625A1 (en) Systems and methods for audio encoding and decoding
US9312893B2 (en) Systems, methods and devices for electronic communications having decreased information loss
CN103456307A (zh) 音频解码器中帧差错隐藏的谱代替方法及系统
JP2003522981A (ja) ピッチ変化検出を伴なう誤り訂正方法
US20150194157A1 (en) System, method, and computer program product for artifact reduction in high-frequency regeneration audio signals

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15802508

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 249337

Country of ref document: IL

ENP Entry into the national phase

Ref document number: 2016570979

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: MX/A/2016/015950

Country of ref document: MX

ENP Entry into the national phase

Ref document number: 2951169

Country of ref document: CA

NENP Non-entry into the national phase

Ref country code: DE

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112016028375

Country of ref document: BR

REEP Request for entry into the european phase

Ref document number: 2015802508

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2015802508

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 20167035690

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2016152224

Country of ref document: RU

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2015271580

Country of ref document: AU

Date of ref document: 20150119

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 112016028375

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20161202