WO2011134415A1 - Audio signal switching method and device - Google Patents

Info

Publication number
WO2011134415A1
WO2011134415A1 PCT/CN2011/073479 CN2011073479W WO2011134415A1 WO 2011134415 A1 WO2011134415 A1 WO 2011134415A1 CN 2011073479 W CN2011073479 W CN 2011073479W WO 2011134415 A1 WO2011134415 A1 WO 2011134415A1
Authority
WO
WIPO (PCT)
Prior art keywords
weight
signal
frequency band
band signal
speech
Prior art date
Application number
PCT/CN2011/073479
Other languages
French (fr)
Chinese (zh)
Inventor
刘泽新
苗磊
胡晨
吴文海
郎玥
张清
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to BR112012013306A (patent BR112012013306B8)
Priority to EP17151713.9A (patent EP3249648B1)
Priority to EP11774406.0A (patent EP2485029B1)
Priority to AU2011247719A (patent AU2011247719B2)
Priority to KR1020127012328A (patent KR101377547B1)
Priority to JP2012541316A (patent JP5667202B2)
Priority to ES11774406.0T (patent ES2635212T3)
Publication of WO2011134415A1

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16Vocoder architecture
    • G10L19/18Vocoders using multiple modes
    • G10L19/24Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
    • G10L19/08Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques

Definitions

  • The embodiments of the present invention relate to the field of communications technologies, and in particular to a method and apparatus for switching speech/audio signals.
  • Background
  • The network may truncate the bitstream of the speech/audio signal sent from the encoding end, so that the decoding end decodes speech/audio signals of different bandwidths from the truncated bitstream.
  • The narrowband signal referred to in the present invention is a signal that has been converted, by upsampling and low-pass filtering, into a wideband signal that has only a low-band component and an empty high-band component. A wideband speech/audio signal has both a low-band signal component and a high-band signal component.
  • The inventors have found at least the following problem in the prior art: because a narrowband speech/audio signal and a wideband speech/audio signal differ in the high-band signal, switching between speech/audio signals of different bandwidths causes an abrupt change in signal energy, which is uncomfortable for the listener and degrades the quality of the audio signal the user receives.
  • Embodiments of the present invention provide a method and device for switching speech/audio signals that switch smoothly between speech/audio signals of different bandwidths and thereby improve the quality of the audio signal received by the user.
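  • An embodiment of the present invention provides a method for switching speech/audio signals, including:
  • when a speech/audio signal switch occurs, weighting the first high-band signal of the current frame's speech/audio signal with the second high-band signal of the previous M frames' speech/audio signals to obtain a processed first high-band signal, where M is greater than or equal to 1; and
  • synthesizing the processed first high-band signal with the first low-band signal of the current frame's speech/audio signal into a wideband signal.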
  • An embodiment of the present invention provides a speech/audio signal switching apparatus, including:
  • a processing module, configured to, when a speech/audio signal switch occurs, weight the first high-band signal of the current frame's speech/audio signal with the second high-band signal of the previous M frames' speech/audio signals to obtain a processed first high-band signal, where M is greater than or equal to 1; and
  • a first synthesizing module, configured to synthesize the processed first high-band signal with the first low-band signal of the current frame's speech/audio signal into a wideband signal.
  • The speech/audio signal switching method and apparatus of the embodiments process the first high-band signal of the current frame's speech/audio signal according to the second high-band signal of the previous M frames' speech/audio signals, so that the second high-band signal of the previous M frames can transition smoothly to the processed first high-band signal. The processed first high-band signal is then combined with the first low-band signal into a wideband signal. As a result, switching between speech/audio signals of different bandwidths proceeds smoothly, the loss of subjective listening quality caused by an abrupt energy change is reduced, and the quality of the audio signal received by the user is improved.
  • FIG. 1 is a flowchart of Embodiment 1 of a speech/audio signal switching method according to the present invention;
  • FIG. 2 is a flowchart of Embodiment 2 of a speech/audio signal switching method according to the present invention;
  • FIG. 3 is a flowchart of one embodiment of step 201 in FIG. 2;
  • FIG. 4 is a flowchart of one embodiment of step 302 in FIG. 3;
  • FIG. 5 is a second flowchart, showing another embodiment of step 302 in FIG. 3;
  • FIG. 6 is a flowchart of step 202 in FIG. 2;
  • FIG. 7 is a second flowchart, showing another embodiment of step 201 in FIG. 2;
  • FIG. 8 is a third flowchart, showing another embodiment of step 201 in FIG. 2;
  • FIG. 9 is a schematic structural diagram of Embodiment 1 of a speech/audio signal switching apparatus according to the present invention;
  • FIG. 10 is a schematic structural diagram of Embodiment 2 of a speech/audio signal switching apparatus according to the present invention;
  • FIG. 11 is a schematic structural diagram of a processing module in Embodiment 2 of the speech/audio signal switching apparatus of the present invention;
  • FIG. 12 is a schematic structural diagram of a first module in Embodiment 2 of the speech/audio signal switching apparatus of the present invention;
  • FIG. 13a is a schematic structural diagram of the processing module in Embodiment 2 of the speech/audio signal switching apparatus of the present invention;
  • FIG. 13b is a schematic structural diagram of the processing module in Embodiment 2 of the speech/audio signal switching apparatus of the present invention.
  • FIG. 1 is a flowchart of Embodiment 1 of a speech/audio signal switching method according to the present invention. In the method of this embodiment, when a speech/audio signal switch occurs, each frame after the switching frame is processed as follows:
  • Step 101: When a speech/audio signal switch occurs, weight the first high-band signal of the current frame's speech/audio signal with the second high-band signal of the previous M frames' speech/audio signals to obtain a processed first high-band signal, where M is greater than or equal to 1.
  • Step 102: Synthesize the processed first high-band signal with the first low-band signal of the current frame's speech/audio signal into a wideband signal.
  • In the embodiments of the present invention, the previous M frames' speech/audio signals are the M frames of speech/audio signal that precede the current frame.
  • The pre-switch L frames' speech/audio signals are the L frames of speech/audio signal that precede the switching frame when a switch occurs.
  • If the current frame is a wideband signal and the previous frame is a narrowband signal, or the current frame is a narrowband signal and the previous frame is a wideband signal, then a speech/audio signal switch occurs and the current frame is the switching frame.
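  • The switching-frame rule above can be sketched as follows. This is only an illustration under assumptions: each decoded frame is assumed to carry a bandwidth label, and the labels and the function name `is_switching_frame` are hypothetical rather than taken from the patent.

```python
from typing import Optional


def is_switching_frame(prev_bandwidth: Optional[str], cur_bandwidth: str) -> bool:
    """Return True when the current frame's bandwidth differs from the previous frame's."""
    if prev_bandwidth is None:
        # First frame of the stream: there is nothing to switch from.
        return False
    return prev_bandwidth != cur_bandwidth


# Hypothetical sequence of per-frame bandwidth labels.
frames = ["wideband", "wideband", "narrowband", "narrowband", "wideband"]

prev = None
flags = []
for bandwidth in frames:
    flags.append(is_switching_frame(prev, bandwidth))
    prev = bandwidth

print(flags)  # [False, False, True, False, True]
```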
  • The speech/audio signal switching method of this embodiment processes the first high-band signal of the current frame's speech/audio signal according to the second high-band signal of the previous M frames' speech/audio signals, so that the second high-band signal of the previous M frames can transition smoothly to the processed first high-band signal. The high-band signals of speech/audio signals of different bandwidths therefore transition smoothly during a switch.
  • The processed first high-band signal and the first low-band signal are combined into a wideband signal, and the wideband signal is transmitted to the user terminal, so that the user receives a high-quality speech/audio signal.
  • As a result, switching between speech/audio signals of different bandwidths proceeds smoothly, the loss of subjective listening quality caused by an abrupt energy change is reduced, and the quality of the audio signal received by the user is improved.
  • FIG. 2 is a flowchart of Embodiment 2 of a speech/audio signal switching method according to the present invention. As shown in FIG. 2, the method in this embodiment includes:
  • Step 200: When no switch occurs, synthesize the first high-band signal of the current frame's speech/audio signal with the first low-band signal into a wideband signal.
  • The first-band speech/audio signal in this embodiment may be a wideband speech/audio signal or a narrowband speech/audio signal.
  • When the first-band speech/audio signal is not switched, one of two cases applies: (1) if it is a wideband speech/audio signal, its low-band signal and high-band signal are synthesized into a wideband signal; (2) if it is a narrowband speech/audio signal, its low-band signal and high-band signal are likewise synthesized into a wideband signal, but although the result is nominally a wideband signal, its high band is empty and carries no information.
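  • The two cases of step 200 can be illustrated with a small sketch. It assumes, purely for illustration, that a frame is a list of frequency-domain coefficients and that synthesis is a concatenation of the low band and the high band; a real decoder would typically use a synthesis filter bank instead. The function name `synthesize_wideband` and its parameters are hypothetical.

```python
def synthesize_wideband(low_band, high_band=None, high_band_size=4):
    """Combine low-band and high-band coefficients into one wideband frame.

    A narrowband frame has no high-band content, so its high band is filled
    with zeros: the result has wideband layout but an empty high band.
    """
    if high_band is None:
        high_band = [0.0] * high_band_size
    return list(low_band) + list(high_band)


# Case 1: wideband frame, both bands carry information.
wide = synthesize_wideband([1.0, 2.0], [0.5, 0.25])

# Case 2: narrowband frame, the high band is empty.
narrow = synthesize_wideband([1.0, 2.0], None, high_band_size=2)
```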
  • Step 201: When a speech/audio signal switch occurs, weight the first high-band signal of the current frame's speech/audio signal with the second high-band signal of the previous M frames' speech/audio signals to obtain a processed first high-band signal, where M is greater than or equal to 1.
  • Processing the first high-band signal of the current frame according to the second high-band signal of the previous M frames lets the second high-band signal transition smoothly to the processed first high-band signal. For example, when a wideband speech/audio signal is switched to a narrowband speech/audio signal, the high-band signal corresponding to the narrowband signal is empty, so the high-band components corresponding to the narrowband signal must be recovered for the switch to be smooth. Conversely, when a narrowband speech/audio signal is switched to a wideband speech/audio signal, the high-band signal of the wideband signal is not empty, so the energy of the high-band signal in several consecutive wideband frames after the switch must be attenuated, so that the high-band signal of the wideband speech/audio signal gradually transitions to the true high-band signal.
  • Processing the current frame in step 201 therefore lets the high-band signals of speech/audio signals of different bandwidths transition smoothly, and it avoids the listening discomfort caused by an abrupt energy change when switching between wideband and narrowband speech/audio signals, so that the user receives a high-quality audio signal.
  • The first high-band signal and the second high-band signal of the previous M frames' speech/audio signals may also be weighted directly, in which case the result of the weighting is the processed first high-band signal.
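  • A minimal sketch of the direct weighting in step 201 follows. The patent text here does not fix the weights or how the previous M frames are combined, so the averaging over the M frames, the weight `w_prev`, and the function name `weight_high_band` are all assumptions made for illustration.

```python
def weight_high_band(current_hb, previous_hbs, w_prev=0.5):
    """Blend the current frame's high band with the previous M frames' high bands.

    current_hb   : high-band coefficients of the current frame
    previous_hbs : list of M high-band coefficient lists (M >= 1)
    w_prev       : weight given to the previous frames (assumed value);
                   the current frame gets 1 - w_prev
    """
    if not previous_hbs:
        raise ValueError("at least one previous frame is required (M >= 1)")
    if not 0.0 <= w_prev <= 1.0:
        raise ValueError("w_prev must lie in [0, 1]")
    m = len(previous_hbs)
    blended = []
    for k, cur in enumerate(current_hb):
        # Assumption: the M previous frames are summarized by their mean.
        prev_mean = sum(frame[k] for frame in previous_hbs) / m
        blended.append(w_prev * prev_mean + (1.0 - w_prev) * cur)
    return blended


# Wideband -> narrowband switch: the current high band is empty, so part of
# the previous frame's high band is carried over instead of dropping to zero.
processed = weight_high_band([0.0, 0.0], [[4.0, 2.0]], w_prev=0.5)
print(processed)  # [2.0, 1.0]
```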
  • Step 202: Synthesize the processed first high-band signal with the first low-band signal of the current frame's speech/audio signal into a wideband signal.
  • After step 201, the second high-band signal of the previous M frames' speech/audio signals can transition smoothly to the processed first high-band signal of the current frame. Step 202 is then performed, so that the signal switches smoothly, which improves the quality of the audio signal received by the user.
  • The speech/audio signal switching method of this embodiment processes the first high-band signal of the current frame's speech/audio signal according to the second high-band signal of the previous M frames' speech/audio signals, so that the second high-band signal of the previous M frames can transition smoothly to the processed first high-band signal. The high-band signals of speech/audio signals of different bandwidths therefore transition smoothly during a switch.
  • The processed first high-band signal and the first low-band signal are combined into a wideband signal, and the wideband signal is transmitted to the user terminal, so that the user receives a high-quality speech/audio signal.
  • The method thus switches smoothly between speech/audio signals of different bandwidths, reduces the loss of subjective listening quality caused by an abrupt energy change, and improves the quality of the audio signal received by the user. In addition, when no switch between bandwidths occurs, synthesizing the first high-band signal of the current frame's speech/audio signal with the first low-band signal into a wideband signal also gives the user a high-quality audio signal.
  • Step 201 in this embodiment includes:
  • Step 301: Predict the fine structure information and the envelope information corresponding to the first high-band signal.
  • A speech/audio signal can be decomposed into two parts, fine structure information and envelope information, and can be restored from them.
  • When a wideband speech/audio signal is switched to a narrowband speech/audio signal, the narrowband signal contains only a low-band signal and its high-band signal is empty. To switch smoothly from the wideband signal to the narrowband signal, the high-band signal required by the current narrowband speech/audio signal must be recovered.
  • Step 301 in this embodiment therefore predicts the predicted fine structure information and the predicted envelope information corresponding to the first high-band signal of the narrowband speech/audio signal.
  • Step 301 may further classify the first low-band signal of the current frame's speech/audio signal, and then predict the predicted fine structure information and predicted envelope information corresponding to the first high-band signal according to the signal type of the first low-band signal.
  • For example, the narrowband speech/audio signal of the current frame may be a harmonic signal, a non-harmonic signal, a transient signal, and so on. Once the signal type of the narrowband signal is known, the fine structure information and envelope information typical of that type can be used, so that the fine structure information and envelope information corresponding to the high-band signal of the current frame are predicted more accurately.
  • The speech/audio signal switching method of the present invention does not limit the signal type of the narrowband speech/audio signal.
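  • To make the split into envelope information and fine structure information concrete, the sketch below decomposes a block of coefficients and restores it. The patent does not specify how the two parts are computed; the per-sub-band RMS envelope, the normalized fine structure, and the function names `decompose` and `reconstruct` are assumptions chosen only because they make the round trip easy to check.

```python
import math


def decompose(coeffs, band_size):
    """Split coefficients into a per-sub-band RMS envelope and a normalized fine structure."""
    envelope, fine = [], []
    for start in range(0, len(coeffs), band_size):
        band = coeffs[start:start + band_size]
        rms = math.sqrt(sum(c * c for c in band) / len(band))
        envelope.append(rms)
        # An all-zero sub-band has no fine structure to normalize.
        fine.extend(c / rms if rms > 0.0 else 0.0 for c in band)
    return envelope, fine


def reconstruct(envelope, fine, band_size):
    """Restore the coefficients by scaling the fine structure with its sub-band envelope."""
    return [f * envelope[i // band_size] for i, f in enumerate(fine)]


coeffs = [3.0, -4.0, 0.0, 0.0]
env, fine = decompose(coeffs, band_size=2)
rebuilt = reconstruct(env, fine, band_size=2)
```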
  • Step 302: Weight the predicted envelope information with the previous-M-frame envelope information corresponding to the second high-band signal of the previous M frames' speech/audio signals to obtain the first envelope information corresponding to the first high-band signal.
  • That is, the first envelope information corresponding to the first high-band signal is generated from the predicted envelope information and the previous-M-frame envelope information corresponding to the second high-band signal of the previous M frames' speech/audio signals.
  • The first envelope information in step 302 can be generated in either of the following two ways:
  • One embodiment of obtaining the first envelope information in step 302 may include:
  • Step 401: Calculate the correlation coefficient between the first low-band signal and the low-band signal of the previous N frames' speech/audio signals, where N is greater than or equal to 1.
  • The previous N frames' speech/audio signals may be narrowband speech/audio signals, wideband speech/audio signals, or a mixture of narrowband and wideband speech/audio signals.
  • Step 402: Determine whether the correlation coefficient is within a given first threshold range.
  • After the correlation coefficient is calculated in step 401, it is determined whether the coefficient lies within the given threshold range.
  • The correlation coefficient indicates whether the current frame's speech/audio signal evolves gradually from the previous N frames or changes abruptly, that is, whether their characteristics are the same. This in turn determines how much weight the earlier envelope information takes when the first envelope information of the current frame is predicted. If the first low-band signal of the current frame is similar to the low-band signal of the previous frame, the high-band envelope information or the transition envelope information corresponding to the previous frame takes a larger weight. Otherwise, if the first low-band signal of the current frame differs greatly in energy from the low-band signal of the previous frame and the signal types differ, the previous frame is only weakly correlated with the current frame. In that case, to restore the first envelope information of the current frame accurately, the high-band envelope information or the transition envelope information corresponding to the previous frame takes a smaller weight.
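  • Steps 401 and 402 can be sketched as follows. The patent does not say which correlation measure or threshold range is used, so the Pearson correlation coefficient, the range bounds `low` and `high`, and the function names are assumptions for illustration only.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences (assumed measure)."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("sequences must be non-empty and of equal length")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        # A constant signal carries no shape to correlate with.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def within_first_threshold(coeff, low=0.5, high=1.0):
    """Step 402: is the coefficient inside the (hypothetical) first threshold range?"""
    return low <= coeff <= high


# Current low band versus the low band of a previous frame.
similar = correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
opposite = correlation([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

  • With these sample inputs the first pair is judged similar and would go on to step 404, while the second pair falls outside the range and would go on to step 403.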
  • Step 403: If the correlation coefficient is not within the given first threshold range, perform weighting according to the set first weight one and first weight two to calculate the first envelope information.
  • The first weight one is the weight of the previous frame's envelope information, which corresponds to the high-band signal of the previous frame's speech/audio signal, and the first weight two is the weight of the predicted envelope information.
  • If step 402 determines that the correlation coefficient is not within the given first threshold range, the current frame's speech/audio signal is only weakly correlated with the previous N frames. The previous-M-frame envelope information, the transition envelope information, or the high-band envelope information corresponding to the previous frame therefore has little influence on the first envelope information, and takes a smaller weight when the first envelope information of the current frame is restored.
  • The first envelope information of the current frame can therefore be calculated from the set first weight one and first weight two.
  • The previous frame's speech/audio signal may be a wideband speech/audio signal or a processed narrowband speech/audio signal; at the first switch, the previous frame's speech/audio signal is the wideband speech/audio signal.
  • The product of the predicted envelope information and the first weight two is added to the product of the previous frame's envelope information and the first weight one, and the resulting weighted sum is the first envelope information of the current frame.
  • Subsequent speech/audio frames are restored in the same way and with the same weights, giving the first envelope information of each frame, until the speech/audio signal switches again.
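  • The weighted sum of step 403 can be written out as below. The envelope is assumed to be a list of per-sub-band values, the two weights are assumed to sum to one, and the numeric value of the first weight one and the function name `first_envelope_step_403` are hypothetical.

```python
def first_envelope_step_403(prev_frame_env, predicted_env, first_weight_one=0.25):
    """First envelope = first_weight_one * previous frame's envelope
                      + first_weight_two * predicted envelope."""
    # Assumption: the two first weights sum to one.
    first_weight_two = 1.0 - first_weight_one
    return [
        first_weight_one * prev + first_weight_two * pred
        for prev, pred in zip(prev_frame_env, predicted_env)
    ]


# Low correlation: the previous frame's envelope gets the small weight.
first_env = first_envelope_step_403([4.0, 8.0], [8.0, 0.0], first_weight_one=0.25)
print(first_env)  # [7.0, 2.0]
```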
  • Step 404: If the correlation coefficient is within the first threshold range, perform weighting according to the set second weight one and second weight two to calculate transition envelope information.
  • The second weight one is the weight of the pre-switch envelope information, that is, the envelope information of the L frames before the switch, and the second weight two is the weight of the previous-M-frame envelope information, where L is greater than or equal to 1.
  • If step 402 determines that the correlation coefficient is within the given threshold range, the current frame's speech/audio signal has characteristics similar to those of the previous N consecutive frames, and its first envelope information is strongly influenced by their envelope information. Taking into account the authenticity of the previous-M-frame envelope, the transition envelope information of the current frame is derived from the previous-M-frame envelope information and the pre-switch envelope information, both of which take a larger weight when the first envelope information of the current frame is restored. The first envelope information is then derived from the transition envelope information.
  • The product of the pre-switch envelope information and the second weight one is added to the product of the previous-M-frame envelope information and the second weight two, and the resulting weighted sum is the transition envelope information.
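  • A short sketch of the transition envelope calculation in step 404, under the same assumptions as before: envelopes are lists of per-sub-band values, the second weights sum to one, and the weight value and the function name `transition_envelope` are hypothetical.

```python
def transition_envelope(pre_switch_env, prev_m_env, second_weight_one):
    """Transition envelope = second_weight_one * pre-switch envelope
                           + second_weight_two * previous-M-frame envelope."""
    # Assumption: the two second weights sum to one.
    second_weight_two = 1.0 - second_weight_one
    return [
        second_weight_one * pre + second_weight_two * prev
        for pre, prev in zip(pre_switch_env, prev_m_env)
    ]


# Shortly after the switch the pre-switch envelope still dominates.
trans_env = transition_envelope([8.0, 4.0], [4.0, 0.0], second_weight_one=0.75)
print(trans_env)  # [7.0, 3.0]
```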
  • Step 405: Decrease the second weight one by the first weight step, and increase the second weight two by the first weight step.
  • Subsequent narrowband speech/audio signals are influenced progressively less by the wideband speech/audio signal of the L frames before the switch. To keep the calculated first envelope information accurate, the second weight one and the second weight two are adjusted adaptively: the second weight one gradually decreases and the second weight two gradually increases, which weakens the influence of the pre-switch envelope information on the first envelope information.
  • Step 405 may modify the second weight one and the second weight two as follows: the new second weight one equals the old second weight one minus the first weight step, and the new second weight two equals the old second weight two plus the first weight step, where the first weight step is a set value.
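  • The update rule of step 405 is simple enough to show directly. The starting weights and the step size below are hypothetical values picked so that the arithmetic is exact; the function name `step_second_weights` is likewise an assumption.

```python
def step_second_weights(second_weight_one, second_weight_two, first_weight_step):
    """Step 405: shift weight from the pre-switch envelope to the previous-M-frame envelope."""
    return (second_weight_one - first_weight_step,
            second_weight_two + first_weight_step)


# Hypothetical starting point: the pre-switch envelope has all the weight.
w_one, w_two = 1.0, 0.0
history = []
for _ in range(4):
    w_one, w_two = step_second_weights(w_one, w_two, first_weight_step=0.25)
    history.append((w_one, w_two))

print(history)  # [(0.75, 0.25), (0.5, 0.5), (0.25, 0.75), (0.0, 1.0)]
```

  • Because the same step is subtracted from one weight and added to the other, the sum of the two weights stays at one after every update.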
  • Step 406: Determine whether the set third weight one is greater than the first weight one.
  • The third weight one is the weight of the transition envelope information.
  • Comparing the third weight one with the first weight one shows how strongly the first envelope information of the current frame is influenced by the transition envelope information.
  • The transition envelope information is calculated from the previous-M-frame envelope information and the pre-switch envelope information, so the third weight one in effect represents the degree to which the pre-switch envelope information influences the first envelope information.
  • Step 407: If the third weight one is not greater than the first weight one, perform weighting according to the set first weight one and first weight two to calculate the first envelope information.
  • If the third weight one is less than or equal to the first weight one, the current frame's speech/audio signal is far from the L frames before the switch, and the first envelope information is influenced mainly by the previous-M-frame envelope information. The first envelope information of the current frame can therefore be calculated from the set first weight one and first weight two.
  • Step 408: If the third weight one is greater than the first weight one, perform weighting according to the set third weight one and third weight two to calculate the first envelope information.
  • The third weight one is the weight of the transition envelope information, and the third weight two is the weight of the predicted envelope information.
  • The product of the transition envelope information and the third weight one is added to the product of the predicted envelope information and the third weight two, and the resulting weighted sum is the first envelope information.
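  • The branch formed by steps 406 to 408 can be sketched as a single function. As in the text, the first weights apply to the previous frame's envelope and the predicted envelope, and the third weights apply to the transition envelope and the predicted envelope. The weight values, the assumption that each weight pair sums to one, and the function name `first_envelope_steps_406_408` are hypothetical.

```python
def first_envelope_steps_406_408(transition_env, prev_frame_env, predicted_env,
                                 first_weight_one, third_weight_one):
    """Choose between step 408 and step 407 by comparing the two weights (step 406)."""
    if third_weight_one > first_weight_one:
        # Step 408: transition envelope blended with the predicted envelope.
        third_weight_two = 1.0 - third_weight_one
        return [third_weight_one * t + third_weight_two * p
                for t, p in zip(transition_env, predicted_env)]
    # Step 407: previous frame's envelope blended with the predicted envelope.
    first_weight_two = 1.0 - first_weight_one
    return [first_weight_one * e + first_weight_two * p
            for e, p in zip(prev_frame_env, predicted_env)]


# Soon after the switch the third weight one is still large: step 408 applies.
early = first_envelope_steps_406_408([8.0], [4.0], [0.0], 0.25, 0.5)
# Later it has decayed to the first weight one: step 407 applies.
late = first_envelope_steps_406_408([8.0], [4.0], [0.0], 0.25, 0.25)
```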
  • Step 409: Decrease the third weight one by the second weight step and increase the third weight two by the second weight step, until the third weight one equals zero.
  • The purpose of modifying the third weight one and the third weight two in step 409 is the same as that of modifying the second weight one and the second weight two in step 405. Subsequent speech/audio signals are influenced progressively less by the L frames before the switch, so the third weight one and the third weight two are adjusted adaptively: the third weight one gradually decreases and the third weight two gradually increases, which likewise weakens the influence of the pre-switch envelope information on the first envelope information.
  • Step 409 may modify the third weight one and the third weight two as follows: the new third weight one equals the old third weight one minus the second weight step, and the new third weight two equals the old third weight two plus the second weight step, where the second weight step is a set value.
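  • The update of step 409, including its stopping condition, might look like the sketch below. Clamping the third weight one at zero and recomputing the third weight two as one minus the third weight one are assumptions made here to express "until the third weight one equals zero"; the step size, starting values, and the function name `step_third_weights` are hypothetical.

```python
def step_third_weights(third_weight_one, second_weight_step):
    """Step 409: decay the transition-envelope weight, stopping at zero."""
    # Assumption: clamp at zero so the weight never becomes negative.
    new_one = max(third_weight_one - second_weight_step, 0.0)
    # Assumption: the two third weights always sum to one.
    new_two = 1.0 - new_one
    return new_one, new_two


w3_one = 0.5  # hypothetical initial third weight one
trace = []
for _ in range(3):
    w3_one, w3_two = step_third_weights(w3_one, second_weight_step=0.25)
    trace.append((w3_one, w3_two))

print(trace)  # [(0.25, 0.75), (0.0, 1.0), (0.0, 1.0)]
```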
  • The sum of first weight one and first weight two is one, the sum of second weight one and second weight two is one, and the sum of third weight one and third weight two is one. The initial value of third weight one is greater than the initial value of first weight one. First weight one and first weight two are fixed constants.
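  • Steps 408 and 409 can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes the envelope information is a list of per-subband values, and the names `blend_envelope`, `step_weight`, `w1_one`, `w3_one` and all numeric values are hypothetical.

```python
def blend_envelope(transition_env, predicted_env, w3_one):
    """Step 408: weighted sum of transition and predicted envelope values."""
    w3_two = 1.0 - w3_one  # third weight one and third weight two sum to one
    return [w3_one * t + w3_two * p for t, p in zip(transition_env, predicted_env)]


def step_weight(w3_one, weight_step):
    """Step 409: lower third weight one by the weight step, never below zero."""
    return max(0.0, w3_one - weight_step)


# Hypothetical values, chosen only for illustration.
transition_env = [4.0, 2.0]
predicted_env = [1.0, 1.0]
w1_one = 0.25       # fixed first weight one
w3_one = 0.75       # initial third weight one, greater than first weight one
weight_step = 0.25  # second weight step

history = []
while w3_one > w1_one:  # condition of step 408
    history.append(blend_envelope(transition_env, predicted_env, w3_one))
    w3_one = step_weight(w3_one, weight_step)  # step 409
```

  • With these values the blended envelope moves from the transition envelope toward the predicted envelope, one weight step per frame, and blending stops once third weight one is no longer greater than first weight one.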
  • Weight one and weight two in this embodiment represent the proportions in which the pre-switch envelope information and the envelope information of the previous M frames contribute to the first envelope information of the current frame. The closer the current frame is to the L frames of speech and audio signal before the switch, and the greater their correlation, the higher the proportion of the pre-switch envelope information and the lower the proportion of the previous-M-frame envelope information.
  • When the current frame is far from the L frames of speech and audio signal before the switch, the speech and audio signal has been transmitted stably in the network; when the correlation between the current frame and the L frames before the switch is low, the characteristics of the current frame have changed. In either case the current frame is less affected by the L frames before the switch, so the proportion of the pre-switch envelope information is lower.
  • Step 404 and step 405 in this embodiment may be interchanged: second weight one and second weight two may be modified first, and the transition envelope information then calculated according to the modified second weight one and second weight two.
  • Step 408 and step 409 in this embodiment may likewise be interchanged: third weight one and third weight two may be modified first, and the first envelope information then calculated according to the modified third weight one and third weight two.
  • Manner 2: As shown in FIG. 5, another implementation of obtaining the first envelope information in step 302 may include:
  • Step 501 Calculate a correlation coefficient between the first low frequency band signal of the current frame speech and audio signal and the low frequency band signal of the previous frame speech and audio signal.
  • The correlation coefficient may be denoted corr. It is obtained from the energy relationship between the first low frequency band signal of the current frame speech and audio signal and the low frequency band signal of the previous frame speech and audio signal: the smaller the energy difference, the larger corr; the larger the energy difference, the smaller corr.
  • Step 502 Determine whether the correlation coefficient is within a given second threshold range.
  • This embodiment may denote the second threshold range as c1 to c2.
  • Step 503 If the correlation coefficient is not within the second threshold range, perform weighting according to the set first weight one and first weight two to calculate the first envelope information.
  • First weight one is the weight of the previous frame envelope information, which corresponds to the high frequency band signal of the previous frame speech and audio signal, and first weight two is the weight of the predicted envelope information. First weight one and first weight two are fixed constants.
  • If step 502 determines that corr is smaller than c1 or greater than c2, the first envelope information of the current frame speech and audio signal is only slightly affected by the envelope information of the previous frame speech and audio signal, so the first envelope information of the current frame can be calculated from the set first weight one and first weight two. It is the weighted sum obtained by adding the product of the predicted envelope information and first weight two to the product of the previous frame envelope information and first weight one.
  • Narrowband speech and audio signals transmitted afterwards recover their first envelope information in this manner, with these weights, until speech and audio signals of different bandwidths are switched again.
  • In this embodiment, first weight one may be denoted a1, first weight two b1, the previous frame envelope information pre_fenv, the predicted envelope information fenv, and the first envelope information of the current frame cur_fenv; that is, cur_fenv = a1 * pre_fenv + b1 * fenv.
  • Step 504 If the correlation coefficient is within the second threshold range, determine whether the set second weight one is greater than first weight one.
  • Second weight one is the weight of the pre-switch envelope information, which corresponds to the high frequency band signal of the frame of speech and audio signal before the switch.
  • Step 505 If second weight one is not greater than first weight one, calculate the first envelope information according to the set first weight one and first weight two.
  • Step 506 If second weight one is greater than first weight one, perform weighting according to the set second weight one and second weight two to calculate the first envelope information.
  • Second weight two is the weight of the predicted envelope information. Second weight one may be denoted a2, and second weight two b2.
  • When step 504 determines that second weight one is greater than first weight one, the current frame speech and audio signal is close to the previous frame of the first frequency band speech and audio signal, and the first envelope information is strongly influenced by the pre-switch envelope information of that frame. The first envelope information of the current frame can therefore be calculated from the set second weight one and second weight two: it is the weighted sum obtained by adding the product of the predicted envelope information and second weight two to the product of the pre-switch envelope information and second weight one.
  • The pre-switch envelope information may be denoted con_fenv.
  • Step 507 Decrease second weight one by the second weight step, and increase second weight two by the second weight step.
  • As speech and audio signals continue to be transmitted, subsequent current frames are less and less influenced by the pre-switch speech and audio signal, so second weight one and second weight two must be adapted to keep the calculated first envelope information accurate. Because the influence of the pre-switch frame gradually decreases while the influence of the frame immediately preceding the current frame grows, second weight one gradually decreases and second weight two gradually increases, which weakens the influence of the pre-switch envelope information on the first envelope information and strengthens the influence of the predicted envelope information.
  • Step 507 may modify second weight one and second weight two as follows: the new second weight one equals the old second weight one minus the first weight step, and the new second weight two equals the old second weight two plus the first weight step, where the first weight step is a set value.
  • The sum of first weight one and first weight two is one, and the sum of second weight one and second weight two is one. The initial value of second weight one is greater than the initial value of first weight one.
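  • The decision flow of steps 502 to 507 can be sketched for a single frame as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: envelope information is reduced to one scalar per frame, the threshold range is treated as inclusive, and the function name `first_envelope`, the parameter `weight_step` and the clamp at zero are hypothetical. The names corr, c1, c2, a1, a2, pre_fenv, con_fenv, fenv and cur_fenv follow the text above.

```python
def first_envelope(corr, c1, c2, pre_fenv, con_fenv, fenv, a1, a2, weight_step):
    """Return (cur_fenv, updated a2) for one frame, following steps 502 to 507."""
    in_range = c1 <= corr <= c2  # step 502 (inclusive bounds are an assumption)
    if in_range and a2 > a1:     # steps 504 and 506
        b2 = 1.0 - a2            # second weight one and two sum to one
        cur_fenv = a2 * con_fenv + b2 * fenv
        # Step 507: shrink a2; b2 grows implicitly because b2 = 1 - a2.
        # The clamp at zero is an added safeguard, not stated in the text.
        a2 = max(0.0, a2 - weight_step)
    else:                        # steps 503 and 505
        b1 = 1.0 - a1            # first weight one and two sum to one
        cur_fenv = a1 * pre_fenv + b1 * fenv
    return cur_fenv, a2


# Hypothetical values: correlated frame shortly after the switch.
cur_in, a2_in = first_envelope(0.9, 0.5, 1.5, 2.0, 4.0, 1.0, 0.25, 0.75, 0.25)
# Hypothetical values: uncorrelated frame, fixed weights are used instead.
cur_out, a2_out = first_envelope(0.1, 0.5, 1.5, 2.0, 4.0, 1.0, 0.25, 0.75, 0.25)
```

  • Calling the function once per frame and feeding the returned a2 into the next call reproduces the gradual hand-over from the pre-switch envelope to the predicted envelope.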
  • Step 303 Generate a processed first high frequency band signal according to the first envelope information and the predicted fine structure information.
  • The processed first high frequency band signal is generated from the first envelope information and the predicted fine structure information, so that the second high frequency band signal can transition smoothly to the processed first high frequency band signal.
  • In this embodiment, the processed first high frequency band signal of the current frame is obtained from the predicted fine structure information and the first envelope information, so that the second high frequency band signal of the wideband speech and audio signal before the switch can transition smoothly to the processed first high frequency band signal corresponding to the narrowband speech and audio signal, which helps improve the quality of the speech and audio signal the user receives.
  • step 202 in this embodiment includes:
  • Step 601 Determine, according to the current frame speech and audio signal and the frame of speech and audio signal before the switch, whether the processed first high frequency band signal needs to be attenuated.
  • If attenuation is needed, the energy of the processed first high frequency band signal is attenuated frame by frame until the attenuation factor reaches a given threshold.
  • The interval between the current frame speech and audio signal and the frame of speech and audio signal before the switch can be determined from the two signals; for example, a counter may record the number of frames of narrowband speech and audio signal transmitted. This number of frames may be a predetermined value greater than or equal to zero.
  • Step 602 If attenuation is not needed, synthesize the processed first high frequency band signal and the first low frequency band signal into a wideband signal.
  • If step 601 determines that the processed first high frequency band signal does not need to be attenuated, the processed first high frequency band signal and the first low frequency band signal are synthesized directly into a wideband signal.
  • Step 603 If attenuation is required, determine whether the attenuation factor corresponding to the processed first high frequency band signal is greater than a threshold.
  • the initial value of the attenuation factor is one; the threshold is less than one and greater than or equal to zero. If it is determined in step 601 that the processed first high frequency band signal needs to be attenuated, it is determined in step 603 whether the attenuation factor corresponding to the processed first high frequency band signal is greater than a given threshold.
  • Step 604 If the attenuation factor is not greater than the given threshold, multiply the processed first high frequency band signal by the threshold, and then synthesize it with the first low frequency band signal into a wideband signal.
  • If step 603 determines that the attenuation factor is not greater than the given threshold, the energy of the processed first high frequency band signal has already been attenuated to the point where it no longer has an adverse effect, and this attenuation ratio can be maintained from then on. The processed first high frequency band signal is therefore multiplied by the threshold and then synthesized with the first low frequency band signal into a wideband signal.
  • Step 605 If the attenuation factor is greater than the given threshold, multiply the processed first high frequency band signal by the attenuation factor, and then synthesize it with the first low frequency band signal into a wideband signal.
  • If step 603 determines that the attenuation factor is greater than the given threshold, the processed first high frequency band signal may still cause poor listening quality at this attenuation factor and needs to be attenuated further until the given threshold is reached. The processed first high frequency band signal is therefore multiplied by the attenuation factor and then synthesized with the first low frequency band signal into a wideband signal.
  • Step 606 Modify the attenuation factor so that it decreases. As speech and audio signals continue to be transmitted, subsequent narrowband speech and audio signals are less and less influenced by the speech and audio signal before the switch, so the attenuation factor should also decrease gradually.
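  • Steps 603 to 606 can be sketched for frames that need attenuation as follows. The text does not state how step 606 reduces the attenuation factor, so the multiplicative `decay` used here is an assumption, as are the function name `attenuate_frame` and all numeric values. Synthesis with the first low frequency band signal is omitted.

```python
def attenuate_frame(high_band, factor, threshold, decay):
    """Scale one frame of the processed high band and update the factor.

    Returns (scaled samples, attenuation factor for the next frame).
    """
    if factor > threshold:
        # Step 605: attenuate by the current factor.
        scaled = [factor * x for x in high_band]
        # Step 606: reduce the factor (multiplicative decay is an assumption).
        factor = factor * decay
    else:
        # Step 604: the factor has reached the threshold; hold that ratio.
        scaled = [threshold * x for x in high_band]
    return scaled, factor


# Hypothetical values: the factor starts at one, the threshold lies in [0, 1).
factor = 1.0
outputs = []
for _ in range(3):
    scaled, factor = attenuate_frame([2.0, -4.0], factor, 0.25, 0.5)
    outputs.append(scaled)
```

  • In this run the high band is passed at full level, then halved, and from the third frame on it is held at the threshold ratio.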
  • One implementation of obtaining the processed first high frequency band signal in step 201 of this embodiment includes:
  • Step 701 Perform weighting processing according to the set fourth weight one and fourth weight two to calculate the processed first high frequency band signal.
  • Fourth weight one is the weight of the second high frequency band signal, and fourth weight two is the weight of the first high frequency band signal of the current frame speech and audio signal.
  • Because the high frequency band signal corresponding to the narrowband speech and audio signal is either empty or a processed high frequency band signal, the energy of the high frequency band signal of the wideband speech and audio signal must be attenuated so that the narrowband speech and audio signal switches smoothly to the wideband speech and audio signal. The processed first high frequency band signal is the weighted sum obtained by adding the product of the second high frequency band signal and fourth weight one to the product of the first high frequency band signal and fourth weight two.
  • Step 702 Decrease fourth weight one by the third weight step, and increase fourth weight two by the third weight step, until fourth weight one is equal to zero. The sum of fourth weight one and fourth weight two is one.
  • Fourth weight one thus gradually decreases and fourth weight two gradually increases until fourth weight one is zero and fourth weight two is one, at which point the transmitted speech and audio signal is always the wideband speech and audio signal.
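  • Steps 701 and 702 amount to a frame-by-frame crossfade, sketched below. This is a minimal illustration, assuming each high frequency band signal is a list of samples of equal length and that the pre-switch (second) high frequency band signal is empty, represented by zeros. The names `crossfade_high_band` and `process_frames` and the numeric values are hypothetical.

```python
def crossfade_high_band(second_high, first_high, w4_one):
    """Step 701: weighted sum of the pre-switch and current high-band signals."""
    w4_two = 1.0 - w4_one  # fourth weight one and fourth weight two sum to one
    return [w4_one * s + w4_two * f for s, f in zip(second_high, first_high)]


def process_frames(second_high, wideband_frames, w4_one, weight_step):
    """Apply step 701 to each frame, then step 702 to update the weight."""
    out = []
    for first_high in wideband_frames:
        out.append(crossfade_high_band(second_high, first_high, w4_one))
        w4_one = max(0.0, w4_one - weight_step)  # step 702, stops at zero
    return out


# Hypothetical values: empty pre-switch high band, three identical wideband frames.
faded = process_frames([0.0, 0.0], [[8.0, -8.0]] * 3, 0.5, 0.25)
```

  • The high band of the wideband signal is ramped in over three frames, after which fourth weight one is zero and the signal passes through unchanged.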
  • another embodiment of obtaining the processed first high-band signal by step 201 in this embodiment may further include:
  • Step 801 Perform weighting processing according to the set fifth weight one and fifth weight two to calculate the processed first high frequency band signal.
  • Fifth weight one is the weight of the fixed parameter that has been set, and fifth weight two is the weight of the first high frequency band signal of the current frame speech and audio signal.
  • A fixed parameter may be set in place of the high frequency band signal of the narrowband speech and audio signal; the fixed parameter is a constant greater than or equal to zero and less than the energy of the first high frequency band signal.
  • The processed first high frequency band signal is the weighted sum obtained by adding the product of the fixed parameter and fifth weight one to the product of the first high frequency band signal and fifth weight two.
  • Step 802 Decrease fifth weight one by the fourth weight step, and increase fifth weight two by the fourth weight step, until fifth weight one is equal to zero. The sum of fifth weight one and fifth weight two is one.
  • As speech and audio signals continue to be transmitted, subsequent wideband speech and audio signals are less and less influenced by the narrowband speech and audio signal before the switch. Fifth weight one therefore gradually decreases and fifth weight two gradually increases until fifth weight one is zero and fifth weight two is one, at which point the transmitted speech and audio signal is always the true wideband speech and audio signal.
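  • Steps 801 and 802 can be summarized in formula form. The notation is introduced here only for illustration and does not appear in the original text: n is the frame index after the switch, C the fixed parameter, x_H(n) the first high frequency band signal, w_{5,1} and w_{5,2} fifth weight one and fifth weight two, and \Delta_4 the fourth weight step.

```latex
\begin{aligned}
\hat{x}_H(n) &= w_{5,1}(n)\,C + w_{5,2}(n)\,x_H(n), & w_{5,1}(n) + w_{5,2}(n) &= 1,\\
w_{5,1}(n+1) &= \max\bigl(0,\; w_{5,1}(n) - \Delta_4\bigr), & w_{5,2}(n+1) &= 1 - w_{5,1}(n+1).
\end{aligned}
```

  • Once w_{5,1} reaches zero, the processed signal equals x_H(n), which matches the statement that the transmitted signal is then the true wideband speech and audio signal.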
  • In this embodiment, the high frequency band signal of the wideband speech and audio signal is attenuated to obtain the processed high frequency band signal, so that the high frequency band signal corresponding to the narrowband speech and audio signal before the switch can transition smoothly to the processed high frequency band signal corresponding to the wideband speech and audio signal, which helps improve the quality of the speech and audio signal the user receives.
  • the envelope information in this embodiment may also be replaced by other parameters capable of representing a high-band signal, such as: Linear Predictive Coding (LPC) parameters, or amplitude parameters.
  • FIG. 9 is a schematic structural diagram of Embodiment 1 of a speech audio signal switching apparatus according to the present invention.
  • the audio signal switching apparatus of this embodiment includes: a processing module 91 and a first synthesizing module 92.
  • The processing module 91 is configured to, when a speech and audio signal switch occurs, weight the first high frequency band signal of the current frame speech and audio signal and the second high frequency band signal of the previous M frames of speech and audio signal to obtain a processed first high frequency band signal.
  • M is greater than or equal to 1.
  • the first synthesis module 92 is configured to synthesize the processed first high frequency band signal with the first low frequency band signal of the current frame speech audio signal into a wide frequency band signal.
  • In the speech and audio signal switching apparatus of this embodiment, the processing module processes the first high frequency band signal of the current frame speech and audio signal according to the second high frequency band signal of the previous M frames of speech and audio signal, so that the second high frequency band signal can transition smoothly to the processed first high frequency band signal and the high frequency band signals of speech and audio signals of different bandwidths are switched smoothly. The first synthesis module then synthesizes the processed first high frequency band signal and the first low frequency band signal into a wideband signal, which is transmitted to the user terminal so that the user receives a high-quality speech and audio signal.
  • This embodiment thus switches smoothly between speech and audio signals of different bandwidths, reduces the degradation in subjective listening quality caused by abrupt energy changes, and improves the quality of the speech and audio signal the user receives.
  • FIG. 10 is a schematic structural diagram of Embodiment 2 of a speech audio signal switching apparatus according to the present invention. As shown in FIG. 10, the speech and audio signal switching apparatus of this embodiment is based on Embodiment 1 of the apparatus, and differs in that it further includes a second synthesizing module 103.
  • The second synthesizing module 103 is configured to synthesize the first high frequency band signal and the first low frequency band signal into a wideband signal when no speech and audio signal switch occurs.
  • In this embodiment, when no switch between speech and audio signals of different bandwidths occurs, the second synthesizing module synthesizes the first low frequency band signal and the first high frequency band signal of the current frame speech and audio signal into a wideband signal, which helps improve the quality of the speech and audio signal the user receives.
  • the processing module 101 in this embodiment includes:
  • the prediction module 1011 is configured to predict predicted fine structure information and predicted envelope information corresponding to the first high-band signal.
  • The first generation module 1012 is configured to perform weighting according to the predicted envelope information and the previous-M-frame envelope information corresponding to the second high frequency band signal of the previous M frames of speech and audio signal, to obtain first envelope information corresponding to the first high frequency band signal.
  • the second generation module 1013 is configured to generate the processed first high frequency band signal according to the first envelope information and the predicted fine structure information.
  • The speech and audio signal switching apparatus of this embodiment may further include a classification module 1010, configured to perform signal classification on the first low frequency band signal of the current frame speech and audio signal. The prediction module 1011 is then further configured to predict, according to the signal type of the first low frequency band signal, the predicted fine structure information and predicted envelope information corresponding to the first low frequency band signal of the current frame speech and audio signal.
  • In this embodiment, the prediction module predicts the predicted fine structure information and predicted envelope information corresponding to the first high frequency band signal, so that the first generation module and the second generation module can accurately generate the processed first high frequency band signal and the transition to the processed first high frequency band signal is smooth, which helps improve the quality of the speech and audio signal the user receives.
  • In addition, the classification module classifies the first low frequency band signal of the current frame speech and audio signal, and the prediction module obtains the predicted fine structure information and predicted envelope information according to the signal type, which makes both predictions more accurate and raises the quality of the speech and audio signal the user receives.
  • the first synthesizing module 102 in this embodiment includes:
  • The first judging module 1021 is configured to judge, according to the current frame speech and audio signal and the frame of speech and audio signal before the switch, whether the processed first high frequency band signal needs to be attenuated.
  • The third synthesizing module 1022 is configured to, if the first judging module 1021 determines that the processed first high frequency band signal does not need to be attenuated, synthesize the processed first high frequency band signal and the first low frequency band signal into a wideband signal.
  • The second judging module 1023 is configured to, if the first judging module 1021 determines that the processed first high frequency band signal needs to be attenuated, judge whether the attenuation factor corresponding to the processed first high frequency band signal is greater than a given threshold.
  • The fourth synthesizing module 1024 is configured to, if the second judging module 1023 determines that the attenuation factor is not greater than the given threshold, multiply the processed first high frequency band signal by the threshold and then synthesize it with the first low frequency band signal into a wideband signal.
  • The fifth synthesizing module 1025 is configured to, if the second judging module 1023 determines that the attenuation factor is greater than the given threshold, multiply the processed first high frequency band signal by the attenuation factor and then synthesize it with the first low frequency band signal into a wideband signal.
  • the first modification module 1026 is for modifying the attenuation factor to reduce the attenuation factor.
  • the initial value of the attenuation factor is one; the threshold is less than one and greater than or equal to zero.
  • In this embodiment, the wideband signal obtained by processing the current frame speech and audio signal is more accurate, which helps improve the quality of the speech and audio signal the user receives.
  • the processing module 101 in this embodiment includes:
  • The first calculating module 1011a is configured to perform weighting according to the set fourth weight one and fourth weight two to calculate the processed first high frequency band signal, where fourth weight one is the weight of the second high frequency band signal and fourth weight two is the weight of the first high frequency band signal.
  • The second modification module 1012a is configured to decrease fourth weight one by the third weight step and increase fourth weight two by the third weight step until fourth weight one is equal to zero, where the sum of fourth weight one and fourth weight two is one.
  • the processing module 101 in this embodiment may further include:
  • The second calculation module 1011b is configured to perform weighting according to the set fifth weight one and fifth weight two to calculate the processed first high frequency band signal, where fifth weight one is the weight of the fixed parameter that has been set and fifth weight two is the weight of the first high frequency band signal.
  • The third modification module 1012b is configured to decrease fifth weight one by the fourth weight step and increase fifth weight two by the fourth weight step until fifth weight one is equal to zero, where the sum of fifth weight one and fifth weight two is one and the fixed parameter is a constant greater than or equal to zero and less than the energy value of the first high frequency band signal.
  • In the speech and audio signal switching apparatus of this embodiment, the high frequency band signal of the wideband speech and audio signal is attenuated to obtain the processed high frequency band signal, so that the high frequency band signal corresponding to the narrowband speech and audio signal before the switch can transition smoothly to the processed high frequency band signal corresponding to the wideband speech and audio signal, which helps improve the quality of the speech and audio signal the user receives.

Abstract

An audio signal switching method and a device are provided. The audio signal switching method comprises the following steps: when an audio signal switches, performing weighting process on a first high-frequency band signal of a current frame audio signal and a second high-frequency band signal of former M frames audio signals, so as to obtain a processed first high-frequency band signal (101); synthesizing the processed first high-frequency band signal and a first low-frequency band signal of the current frame audio signal into a broad band signal (102).

Description

语音频信号切换方法及装置 本申请要求于 2010 年 4 月 28 日提交中国专利局、 申请号为 201010163406.3 ,发明名称为"语音频信号切换方法及装置 "的中国专利申请 的优先权, 在先申请文件的内容通过弓 )用结合在本申请中。 技术领域  The present invention claims the priority of the Chinese patent application filed on April 28, 2010, the Chinese Patent Office, the application number is 201010163406.3, and the invention name is "Voice and audio signal switching method and device", the prior application The contents of the document are incorporated by reference in this application. Technical field
本发明实施例涉及通信技术领域, 尤其涉及一种语音频信号切换方法 及装置。 背景技术  The embodiments of the present invention relate to the field of communications technologies, and in particular, to a voice and audio signal switching method and apparatus. Background technique
目前, 语音频信号在网络状态传输过程中, 由于网络状态的不同, 网 络会对从编码端传输到网络的语音频信号的码流做不同码率的截断, 从而 解码端就会才艮据截断后的码流解码出不同带宽的语音频信号。  At present, in the process of network state transmission, due to the different network states, the network will cut off the code stream of the speech and audio signals transmitted from the encoding end to the network, so that the decoding end will intercept the data. The subsequent code stream decodes speech and audio signals of different bandwidths.
在现有技术中, 由于网络中传输的语音频信号的带宽不同, 在语音频 信号传输过程中, 存在窄频带语音频信号向宽频带语音频信号切换, 以及 宽频带语音频信号向窄频带语音频信号切换的现象。 本发明中提到的窄频 带信号为通过上采样和低通滤波, 切换为只有低频带成分而高频带成分为 空的宽频带信号, 而宽频带语音频信号既有低频带信号成分又有高频带信 号成分。 在实现本发明过程中, 发明人发现现有技术中至少存在如下问题: 由 于窄频带语音频信号与宽频带语音频信号之间相差高频带信号中的信息, 在切换不同带宽的语音频信号时, 会出现音频信号能量激变的现象, 从而 会导致用户听觉上感到不舒服, 造成用户接听音频信号的质量变差。  In the prior art, due to the different bandwidth of the speech and audio signals transmitted in the network, during the transmission of the speech and audio signals, there are switching between the narrowband speech and audio signals to the wideband speech and audio signals, and the wideband speech and audio signals to the narrowband speech. The phenomenon of audio signal switching. The narrowband signal mentioned in the present invention is a wideband signal which is switched to a low band component only and a high band component is empty by upsampling and low pass filtering, and the wideband speech and audio signal has both a low band signal component and a High frequency band signal component. In the process of implementing the present invention, the inventors have found that at least the following problems exist in the prior art: Since the narrowband speech audio signal and the wideband speech audio signal are different from each other in the high frequency band signal, the speech and audio signals of different bandwidths are switched. When the audio signal energy is excited, the user may feel uncomfortable and cause the quality of the user's audio signal to deteriorate.
发明内容 Summary of the invention
本发明实施例提供一种语音频信号切换方法及装置, 实现平滑的进行 不同带宽的语音频信号切换, 以提高用户接听音频信号的质量。 本发明实施例提供一种语音频信号切换方法, 包括: Embodiments of the present invention provide a method and device for switching voice and audio signals, which are implemented smoothly. The audio and video signals of different bandwidths are switched to improve the quality of the audio signals received by the user. The embodiment of the invention provides a method for switching voice and audio signals, including:
当语音频信号出现切换时, 将当前帧语音频信号的第一高频带信号和 前 M帧语音频信号的第二高频带信号进行加权处理, 以得到处理后的第一 高频带信号; 其中, M大于等于 1 ;  When the audio signal is switched, the first high frequency band signal of the current frame audio signal and the second high frequency band signal of the previous M frame voice signal are weighted to obtain the processed first high frequency band signal. Where M is greater than or equal to 1;
将所述处理后的第一高频带信号与所述当前帧语音频信号的第一低频 带信号合成宽频带信号。  And combining the processed first high frequency band signal with the first low frequency band signal of the current frame speech audio signal into a wide frequency band signal.
An embodiment of the present invention provides an audio signal switching apparatus, including:
a processing module, configured to, when a speech/audio signal is switched, perform weighting processing on a first high-band signal of a current-frame speech/audio signal and a second high-band signal of the previous M frames of speech/audio signals to obtain a processed first high-band signal, where M is greater than or equal to 1; and
a first synthesis module, configured to synthesize the processed first high-band signal and a first low-band signal of the current-frame speech/audio signal into a wideband signal.
In the speech/audio signal switching method and apparatus of the embodiments of the present invention, the first high-band signal of the current-frame speech/audio signal is processed according to the second high-band signal of the previous M frames of speech/audio signals, so that the second high-band signal of the previous M frames transitions smoothly to the processed first high-band signal, and the processed first high-band signal and the first low-band signal are synthesized into a wideband signal. As a result, switching between speech/audio signals of different bandwidths proceeds smoothly, the degradation of subjective listening quality caused by an abrupt energy change is reduced, and the quality of the speech/audio signal received by the user is improved.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the accompanying drawings in the following description show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of Embodiment 1 of the speech/audio signal switching method of the present invention;
FIG. 2 is a flowchart of Embodiment 2 of the speech/audio signal switching method of the present invention;
FIG. 3 is a flowchart of one embodiment of step 201 in FIG. 2;
FIG. 4 is a flowchart of one embodiment of step 302 in FIG. 3;
FIG. 5 is a second flowchart, showing another embodiment of step 302 in FIG. 3;
FIG. 6 is a flowchart of one embodiment of step 202 in FIG. 2;
FIG. 7 is a second flowchart, showing another embodiment of step 201 in FIG. 2;
FIG. 8 is a third flowchart, showing another embodiment of step 201 in FIG. 2;
FIG. 9 is a schematic structural diagram of Embodiment 1 of the speech/audio signal switching apparatus of the present invention;
FIG. 10 is a schematic structural diagram of Embodiment 2 of the speech/audio signal switching apparatus of the present invention;
FIG. 11 is a first schematic structural diagram of the processing module in Embodiment 2 of the speech/audio signal switching apparatus of the present invention;
FIG. 12 is a schematic structural diagram of the first module in Embodiment 2 of the speech/audio signal switching apparatus of the present invention;
FIG. 13a is a second schematic structural diagram of the processing module in Embodiment 2 of the speech/audio signal switching apparatus of the present invention; and
FIG. 13b is a third schematic structural diagram of the processing module in Embodiment 2 of the speech/audio signal switching apparatus of the present invention.
Detailed Description of the Embodiments
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
FIG. 1 is a flowchart of Embodiment 1 of the speech/audio signal switching method of the present invention. As shown in FIG. 1, in the method of this embodiment, when a speech/audio signal is switched, each frame after the switching frame is processed as follows:
Step 101: When a speech/audio signal is switched, perform weighting processing on the first high-band signal of the current-frame speech/audio signal and the second high-band signal of the previous M frames of speech/audio signals to obtain a processed first high-band signal, where M is greater than or equal to 1.
Step 102: Synthesize the processed first high-band signal and the first low-band signal of the current-frame speech/audio signal into a wideband signal.
In the embodiments provided by the present invention, the previous M frames of speech/audio signals are the M frames of speech/audio signals that precede the current frame. The L frames of speech/audio signals before switching are the L frames that precede the switching frame when a speech/audio signal is switched. If the current speech frame is a wideband signal and the previous speech frame is a narrowband signal, or if the current speech frame is a narrowband signal and the previous speech frame is a wideband signal, the speech/audio signal is switched and the current speech frame is the switching frame.
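The switching condition described above can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the per-frame boolean flag (True for wideband, False for narrowband) and the function names are assumptions introduced for the example.

```python
def is_switching_frame(current_is_wideband, previous_is_wideband):
    """Return True when the current frame's bandwidth differs from the previous frame's."""
    return current_is_wideband != previous_is_wideband


def find_switching_frames(bandwidth_flags):
    """Return the indices of switching frames.

    bandwidth_flags is a hypothetical per-frame list: True = wideband,
    False = narrowband. Frame 0 has no predecessor and is never reported.
    """
    return [
        i
        for i in range(1, len(bandwidth_flags))
        if is_switching_frame(bandwidth_flags[i], bandwidth_flags[i - 1])
    ]
```

For a flag sequence of wideband, wideband, narrowband, narrowband, wideband, the frames at indices 2 and 4 (counting from 0) are switching frames.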
In the speech/audio signal switching method of this embodiment of the present invention, the first high-band signal of the current-frame speech/audio signal is processed according to the second high-band signal of the previous M frames of speech/audio signals, so that the second high-band signal of the previous M frames transitions smoothly to the processed first high-band signal. The high-band signals of speech/audio signals of different bandwidths therefore transition smoothly during switching. Finally, the processed first high-band signal and the first low-band signal are synthesized into a wideband signal, which is transmitted to the user terminal so that the user receives a high-quality speech/audio signal. The method of this embodiment thus switches smoothly between speech/audio signals of different bandwidths, reduces the degradation of subjective listening quality caused by an abrupt energy change, and improves the quality of the speech/audio signal received by the user.
FIG. 2 is a flowchart of Embodiment 2 of the speech/audio signal switching method of the present invention. As shown in FIG. 2, the method of this embodiment includes:
Step 200: When the speech/audio signal is not switched, synthesize the first high-band signal and the first low-band signal of the current-frame speech/audio signal into a wideband signal.
Specifically, the first frequency band speech/audio signal in this embodiment may be a wideband speech/audio signal or a narrowband speech/audio signal. During transmission, when the first frequency band speech/audio signal is not switched, two cases are handled: (1) if the first frequency band speech/audio signal is a wideband speech/audio signal, the low-band signal and the high-band signal of the wideband speech/audio signal are synthesized into a wideband signal; (2) if the first frequency band speech/audio signal is a narrowband speech/audio signal, the low-band signal and the high-band signal of the narrowband speech/audio signal are synthesized into a wideband signal. In the second case the result is formally a wideband signal, but its high band is empty and carries no information.
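The two cases of step 200 can be sketched as follows, assuming each band is held as a list of frequency-domain coefficients and that synthesis amounts to concatenating the low band and the high band. That representation, the function name, and the `high_band_size` parameter are assumptions for illustration only; the text does not specify how the synthesis is carried out.

```python
def synthesize_wideband(low_band, high_band=None, high_band_size=0):
    """Combine low-band and high-band coefficients into one wideband frame.

    When high_band is None (narrowband input), the high band is filled with
    zeros, so the result is formally wideband but carries no high-band
    information. Concatenation is an assumed stand-in for the real synthesis.
    """
    if high_band is None:
        high_band = [0.0] * high_band_size
    return list(low_band) + list(high_band)
```

With a narrowband input, `synthesize_wideband([1.0, 2.0], None, 2)` yields a frame whose last two coefficients are zero, which corresponds to the empty high band described in case (2).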
Step 201: When a speech/audio signal is switched, perform weighting processing on the first high-band signal of the current-frame speech/audio signal and the second high-band signal of the previous M frames of speech/audio signals to obtain a processed first high-band signal, where M is greater than or equal to 1.
Specifically, when speech/audio signals of different bandwidths are switched, the first high-band signal of the current-frame speech/audio signal is processed according to the second high-band signal of the previous M frames of speech/audio signals, so that the second high-band signal of the previous M frames transitions smoothly to the processed first high-band signal. For example, when a wideband speech/audio signal is switched to a narrowband speech/audio signal, the high-band signal corresponding to the narrowband signal is empty, so the high-band component corresponding to the narrowband signal needs to be restored in order to switch smoothly. Conversely, when a narrowband speech/audio signal is switched to a wideband speech/audio signal, the high-band signal of the wideband signal is not empty, so the energy of the high-band signal in several consecutive wideband frames after the switch needs to be attenuated, so that the high-band signal of the wideband signal gradually transitions to the true high-band signal. Processing the current-frame speech/audio signal in step 201 allows the high-band signals of speech/audio signals of different bandwidths to transition smoothly, which avoids the listener discomfort caused by an abrupt energy change when switching between wideband and narrowband speech/audio signals, so that the user receives a high-quality audio signal. To simplify obtaining the processed first high-band signal, the first high-band signal and the second high-band signal of the previous M frames of speech/audio signals may be weighted directly, and the result of the weighting is the processed first high-band signal.
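The direct weighting mentioned at the end of the paragraph above can be sketched as follows. The text does not give the weights or say how the M previous frames are combined, so this sketch assumes that the previous high bands are averaged and that the two weights sum to 1; the function and parameter names are likewise hypothetical.

```python
def weight_high_band(current_high, previous_highs, previous_weight):
    """Blend the current high band with the mean of the previous M high bands.

    previous_highs holds M >= 1 frames, each the same length as current_high.
    previous_weight in [0, 1] is the assumed weight of the previous frames;
    the current frame receives 1 - previous_weight.
    """
    if not previous_highs:
        raise ValueError("at least one previous frame is required (M >= 1)")
    if not 0.0 <= previous_weight <= 1.0:
        raise ValueError("previous_weight must lie in [0, 1]")
    m = len(previous_highs)
    blended = []
    for k, current in enumerate(current_high):
        previous_mean = sum(frame[k] for frame in previous_highs) / m
        blended.append(previous_weight * previous_mean
                       + (1.0 - previous_weight) * current)
    return blended
```

In a wideband-to-narrowband switch the current high band is empty, so `weight_high_band([0.0, 0.0], [[4.0, 2.0]], 0.75)` carries 75% of the previous high band forward instead of dropping to zero at once.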
Step 202: Synthesize the processed first high-band signal and the first low-band signal of the current-frame speech/audio signal into a wideband signal.
Specifically, after the current-frame speech/audio signal is processed in step 201, the second high-band signal of the previous M frames of speech/audio signals transitions smoothly to the processed first high-band signal of the current frame. Step 202 then synthesizes the processed first high-band signal and the first low-band signal of the current-frame speech/audio signal into a wideband signal, so that every speech/audio signal the user receives is a wideband speech/audio signal. Speech/audio signals of different bandwidths are thus switched smoothly, which helps improve the quality of the audio signal received by the user.
In the speech/audio signal switching method of this embodiment of the present invention, the first high-band signal of the current-frame speech/audio signal is processed according to the second high-band signal of the previous M frames of speech/audio signals, so that the second high-band signal of the previous M frames transitions smoothly to the processed first high-band signal. The high-band signals of speech/audio signals of different bandwidths therefore transition smoothly during switching. Finally, the processed first high-band signal and the first low-band signal are synthesized into a wideband signal, which is transmitted to the user terminal so that the user receives a high-quality speech signal. The method of this embodiment thus switches smoothly between speech/audio signals of different bandwidths, reduces the degradation of subjective listening quality caused by an abrupt energy change, and improves the quality of the audio signal received by the user. In addition, when no switching between speech/audio signals of different bandwidths occurs, the first high-band signal and the first low-band signal of the current-frame speech/audio signal are synthesized into a wideband signal, so that the user obtains a high-quality audio signal.
Based on the foregoing technical solution, optionally, when a wideband speech/audio signal is switched to a narrowband speech/audio signal, as shown in FIG. 3, step 201 in this embodiment includes:
Step 301: Predict the predicted fine structure information and the predicted envelope information corresponding to the first high-band signal.
Specifically, a speech/audio signal can be decomposed into two parts, fine structure information and envelope information, and can therefore be restored from its fine structure information and envelope information. When a wideband speech/audio signal is switched to a narrowband speech/audio signal, the narrowband signal contains only a low-band signal and its corresponding high-band signal is empty. To switch smoothly from the wideband signal to the narrowband signal, the high-band signal needed by the current narrowband speech/audio signal must be restored. Step 301 in this embodiment predicts the predicted fine structure information and the predicted envelope information corresponding to the first high-band signal of the narrowband speech/audio signal.
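One common way to realize such a decomposition, shown here only as a sketch, is to take the RMS of each sub-band as the envelope and the coefficients divided by that RMS as the fine structure. The sub-band layout, the RMS definition, and the function names are assumptions; the text does not specify how the two parts are defined.

```python
import math


def decompose(coefficients, band_size):
    """Split coefficients into a per-sub-band RMS envelope and a normalized fine structure."""
    envelope, fine = [], []
    for start in range(0, len(coefficients), band_size):
        band = coefficients[start:start + band_size]
        rms = math.sqrt(sum(c * c for c in band) / len(band))
        envelope.append(rms)
        # A silent sub-band has no meaningful fine structure; use zeros.
        fine.extend(c / rms if rms > 0.0 else 0.0 for c in band)
    return envelope, fine


def reconstruct(envelope, fine, band_size):
    """Restore coefficients by scaling the fine structure with its sub-band envelope."""
    return [fine[i] * envelope[i // band_size] for i in range(len(fine))]
```

Applying `reconstruct` to the output of `decompose` returns the original coefficients up to rounding, which is the property the paragraph above relies on when it restores a high-band signal from predicted fine structure and envelope information.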
To predict the fine structure information and envelope information corresponding to the current-frame speech/audio signal more accurately, step 301 may further classify the first low-band signal of the current-frame speech/audio signal, and then predict the predicted fine structure information and the predicted envelope information corresponding to the first high-band signal according to the signal type of the first low-band signal. For example, the narrowband speech/audio signal of the current frame may be a harmonic signal, a non-harmonic signal, a transient signal, or another signal type. From the signal type of the narrowband speech/audio signal, the fine structure information and envelope information that a signal of that type should have can be determined, so that the fine structure information and envelope information corresponding to the high-band signal of the current frame are predicted more accurately. The speech/audio signal switching method of the present invention places no restriction on the signal type of the narrowband speech/audio signal.
Step 302: Perform weighting processing on the predicted envelope information and the previous-M-frame envelope information corresponding to the second high-band signal of the previous M frames of speech/audio signals to obtain the first envelope information corresponding to the first high-band signal.
Specifically, after step 301 predicts the predicted fine structure information and the predicted envelope information corresponding to the first high-band signal of the current frame, the first envelope information corresponding to the first high-band signal can be generated from the predicted envelope information and the previous-M-frame envelope information corresponding to the second high-band signal of the previous M frames of speech/audio signals.
More specifically, the first envelope information corresponding to the first high-band signal can be generated in step 302 in either of the following two manners:
Manner 1: As shown in FIG. 4, one embodiment of obtaining the first envelope information in step 302 may include:
Step 401: Calculate, from the first low-band signal and the low-band signals of the previous N frames of speech/audio signals, the correlation coefficient between the first low-band signal and the low-band signals of the previous N frames, where N is greater than or equal to 1.
Specifically, the first low-band signal of the current-frame speech/audio signal is compared with the low-band signals of the previous N frames of speech/audio signals to obtain the correlation coefficient between them. For example, the correlation can be determined, and the required correlation coefficient calculated, by examining the difference in energy or in signal type between a given frequency band of the first low-band signal of the current frame and the same frequency band of the low-band signals of the previous N frames. The previous N frames of speech/audio signals may be narrowband speech/audio signals, wideband speech/audio signals, or a mixture of narrowband and wideband speech/audio signals.
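The text leaves the exact correlation measure open. As one possible sketch, the coefficient below is the cosine similarity between the sub-band energies of the current low band and the mean sub-band energies of the previous N frames. This measure, the sub-band size, the threshold bounds, and all function names are assumptions chosen for illustration.

```python
import math


def band_energies(coefficients, band_size):
    """Return the energy (sum of squares) of each sub-band."""
    return [
        sum(c * c for c in coefficients[start:start + band_size])
        for start in range(0, len(coefficients), band_size)
    ]


def low_band_correlation(current_low, previous_lows, band_size):
    """Cosine similarity between current and mean previous sub-band energies.

    previous_lows holds the low bands of the previous N >= 1 frames.
    Returns a value in [0, 1]; 0.0 is returned if either side is silent.
    """
    if not previous_lows:
        raise ValueError("at least one previous frame is required (N >= 1)")
    current = band_energies(current_low, band_size)
    per_frame = [band_energies(frame, band_size) for frame in previous_lows]
    previous = [sum(column) / len(per_frame) for column in zip(*per_frame)]
    numerator = sum(a * b for a, b in zip(current, previous))
    denominator = (math.sqrt(sum(a * a for a in current))
                   * math.sqrt(sum(b * b for b in previous)))
    return numerator / denominator if denominator > 0.0 else 0.0


def in_threshold_range(value, lower, upper):
    """Return True when value lies within the given threshold range (inclusive)."""
    return lower <= value <= upper
```

A coefficient close to 1 suggests that the current frame evolved gradually from the preceding frames; a coefficient close to 0 suggests an abrupt change in the low band.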
Step 402: Determine whether the correlation coefficient is within a given first threshold range.
Specifically, after the correlation coefficient is calculated in step 401, it is determined whether the coefficient is within the given threshold range. The purpose of calculating the correlation coefficient is to learn whether the current-frame speech/audio signal evolved gradually from the previous N frames of speech/audio signals or changed abruptly, that is, whether their characteristics are the same, and thereby to determine the weight given to the high-band signal of the preceding frames when the high-band signal of the current frame is predicted. For example, if the first low-band signal of the current-frame speech/audio signal and the low-band signal of the previous frame have comparable energy and are of the same type, the previous frame is highly correlated with the current frame. In that case, to restore the first envelope information of the current frame accurately, the high-band envelope information or transition envelope information corresponding to the previous frame is given a larger weight. Otherwise, if the first low-band signal of the current frame and the low-band signal of the previous frame differ greatly in energy and are of different types, the previous frame is weakly correlated with the current frame. In that case, to restore the first envelope information of the current frame accurately, the high-band envelope information or transition envelope information corresponding to the previous frame is given a smaller weight.
Step 403: If the correlation coefficient is not within the given first threshold range, perform weighting processing according to the preset first weight 1 and first weight 2 to calculate the first envelope information. First weight 1 is the weight of the previous-frame envelope information corresponding to the high-band signal of the previous frame of the speech/audio signal, and first weight 2 is the weight of the predicted envelope information.
Specifically, when step 402 determines that the correlation coefficient is not within the given first threshold range, the current-frame speech/audio signal is only weakly correlated with the previous N frames of speech/audio signals. The previous-M-frame envelope information corresponding to the first frequency band speech/audio signals of the previous M frames, the transition envelope information, or the high-band envelope information corresponding to the previous frame therefore has little influence on the first envelope information, and is given a small weight when the first envelope information of the current frame is restored. The first envelope information of the current frame can then be calculated from the preset first weight 1 and first weight 2. First weight 1 is the weight of the envelope information corresponding to the high-band signal of the previous frame of the speech/audio signal; that previous frame may be a wideband speech/audio signal or an already processed narrowband speech/audio signal, and at the first frame after switching it is a wideband speech/audio signal. First weight 2 is the weight of the predicted envelope information. The first envelope information of the current frame is the weighted sum: the product of the predicted envelope information and first weight 2, plus the product of the previous-frame envelope information and first weight 1. Subsequently transmitted speech/audio signals are processed in the same manner and with the same weights to restore their first envelope information, until the speech/audio signal is switched again.
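The weighted sum of step 403 can be written out as the short sketch below. The envelopes are assumed to be lists of per-sub-band values of equal length, and the function name and the example weights are hypothetical; the text only states that the weights are preset.

```python
def first_envelope_low_correlation(previous_envelope, predicted_envelope,
                                   first_weight_1, first_weight_2):
    """Step 403 sketch: first envelope as a per-sub-band weighted sum.

    first_weight_1 weights the previous-frame envelope and first_weight_2
    weights the predicted envelope, as described in the text.
    """
    return [
        first_weight_1 * previous + first_weight_2 * predicted
        for previous, predicted in zip(previous_envelope, predicted_envelope)
    ]
```

With a small first weight 1, for example 0.25 against 0.75 for first weight 2, the result stays close to the predicted envelope, which matches the weak correlation assumed in this branch.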
Step 404: If the correlation coefficient is within the first threshold range, perform weighting processing according to the preset second weight 1 and second weight 2 to calculate transition envelope information. Second weight 1 is the weight of the envelope information of the L frames before switching, and second weight 2 is the weight of the previous-M-frame envelope information, where L is greater than or equal to 1.
Specifically, when step 402 determines that the correlation coefficient is within the given threshold range, the current-frame speech/audio signal has characteristics similar to those of the previous N consecutive frames of speech/audio signals, and the first envelope information of the current frame is strongly influenced by the envelope information of those frames. Taking into account also the authenticity of the previous-M-frame envelope, the transition envelope information for the current frame needs to be obtained from the previous-M-frame envelope information and the envelope information before switching, so that the previous-M-frame envelope information and the envelope information of the L frames before switching are given larger weights when the first envelope information of the current frame is restored. The first envelope information is then obtained from the transition envelope information. Second weight 1 is the weight of the envelope information before switching, and second weight 2 is the weight of the previous-M-frame envelope information. The transition envelope information is the weighted sum: the product of the envelope information before switching and second weight 1, plus the product of the previous-M-frame envelope information and second weight 2.
Step 405: Decrease second weight 1 by the first weight step, and increase second weight 2 by the first weight step.
Specifically, as transmission of the speech/audio signal continues, the influence of the wideband speech/audio signal before switching on subsequent narrowband speech/audio signals gradually diminishes. To make the calculated first envelope information more accurate, second weight 1 and second weight 2 need to be adjusted accordingly. Because the influence of the wideband speech/audio signal of the L frames before switching on subsequent audio signals gradually diminishes, the value of second weight 1 gradually decreases and the value of second weight 2 gradually increases, which weakens the influence of the envelope information before switching on the first envelope information. In step 405, second weight 1 and second weight 2 may be modified as follows: the new second weight 1 equals the old second weight 1 minus the first weight step, and the new second weight 2 equals the old second weight 2 plus the first weight step, where the first weight step is a preset value.
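Steps 404 and 405 can be sketched together as follows. The weighted sum and the step update follow the text; clamping the weights to the range [0, 1] is an added assumption, since the text does not say what happens once a weight reaches a bound, and the function names and example values are hypothetical.

```python
def transition_envelope(pre_switch_envelope, previous_m_envelope,
                        second_weight_1, second_weight_2):
    """Step 404 sketch: weighted sum of pre-switch and previous-M-frame envelopes."""
    return [
        second_weight_1 * pre_switch + second_weight_2 * previous
        for pre_switch, previous in zip(pre_switch_envelope, previous_m_envelope)
    ]


def update_second_weights(second_weight_1, second_weight_2, first_weight_step):
    """Step 405 sketch: shift weight from the pre-switch envelope to the previous M frames.

    Clamping to [0, 1] is an assumption, not stated in the text.
    """
    new_weight_1 = max(0.0, second_weight_1 - first_weight_step)
    new_weight_2 = min(1.0, second_weight_2 + first_weight_step)
    return new_weight_1, new_weight_2
```

Calling `update_second_weights` once per frame after the switch moves the transition envelope progressively away from the envelope before switching and toward the envelope of the most recent frames.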
Step 406: Determine whether the preset third weight 1 is greater than first weight 1.
Specifically, third weight 1 is the weight of the transition envelope information. Comparing third weight 1 with first weight 1 shows how strongly the first envelope information of the current frame is influenced by the transition envelope information. Because the transition envelope information is calculated from the previous-M-frame envelope information and the envelope information before switching, third weight 1 in effect represents the degree to which the first envelope information is influenced by the envelope information before switching.
Step 407: If third weight one is not greater than first weight one, perform weighting with the preset first weight one and first weight two to calculate the first envelope information.
Specifically, when step 406 determines that third weight one is less than or equal to first weight one, the current frame of the speech/audio signal is far from the L frames before the switch, and the first envelope information is mainly affected by the envelope information of the previous M frames. The first envelope information of the current frame can therefore be calculated from the preset first weight one and first weight two.
Step 408: If third weight one is greater than first weight one, perform weighting with the preset third weight one and third weight two to calculate the first envelope information. Third weight one is the weight of the transition envelope information, and third weight two is the weight of the predicted envelope information.
Specifically, when step 406 determines that third weight one is greater than first weight one, the current frame of the speech/audio signal is close to the L frames before the switch, and the first envelope information is strongly affected by the pre-switch envelope information. The first envelope information of the current frame therefore needs to be derived from the transition envelope information: the product of the transition envelope information and third weight one, plus the product of the predicted envelope information and third weight two, gives the weighted sum that is the first envelope information.
Step 409: Decrease third weight one by the second weight step and increase third weight two by the second weight step, until third weight one equals zero.
Specifically, the purpose of modifying third weight one and third weight two in step 409 is the same as that of modifying second weight one and second weight two in step 405: as the influence of the L frames before the switch on subsequently transmitted speech/audio signals gradually decreases, the weights are adjusted so that the calculated first envelope information is more accurate. The value of third weight one therefore gradually decreases while the value of third weight two gradually increases, which likewise weakens the influence of the pre-switch envelope information on the first envelope information. In step 409, third weight one and third weight two may be modified as follows: the new third weight one equals the old third weight one minus the second weight step, and the new third weight two equals the old third weight two plus the second weight step, where the second weight step is a preset value.
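Steps 406 to 409 can be sketched for a single frame as below. This is a hedged illustration, not the patent's implementation: envelopes are treated as plain numbers (in practice they would likely be per-band values), all names are chosen here, and it is assumed that first weight one weights the envelope of the previous M frames and first weight two weights the predicted envelope.

```python
def first_envelope_mode_one(prev_m_fenv, trans_fenv, fenv,
                            a1, b1, a3, b3, step2):
    """Return (first envelope, new a3, new b3) for one frame.

    a1, b1: first weight one / two (fixed constants)
    a3, b3: third weight one / two
    step2:  second weight step
    """
    if a3 <= a1:
        # Step 407: far from the switch, use the fixed first weights.
        # Assumption: a1 weights the previous-M-frame envelope.
        return prev_m_fenv * a1 + fenv * b1, a3, b3
    # Step 408: close to the switch, weight the transition envelope.
    cur_fenv = trans_fenv * a3 + fenv * b3
    # Step 409: shift weight from the transition envelope to the
    # predicted envelope; the clamp at zero is an assumption.
    dec = min(step2, a3)
    return cur_fenv, a3 - dec, b3 + dec
```

Once third weight one has dropped to first weight one or below, every later frame falls into the step 407 branch and the weights stop changing.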
First weight one and first weight two sum to one, second weight one and second weight two sum to one, and third weight one and third weight two sum to one. The initial value of third weight one is greater than the initial value of first weight one. First weight one and first weight two are fixed constants. Specifically, weight one and weight two in this embodiment represent the proportions that the pre-switch envelope information and the envelope information of the previous M frames contribute to the first envelope information of the current frame. The closer the current frame is to the L frames before the switch, and the higher its correlation with them, the higher the proportion of the pre-switch envelope information and the lower the proportion of the envelope information of the previous M frames. When the current frame is far from the L frames before the switch, the speech/audio signal is already being transmitted stably in the network; when the correlation between the current frame and the L frames before the switch is low, the characteristics of the current frame have changed. In both cases the current frame is only slightly affected by the L frames before the switch, and the proportion of the pre-switch envelope information is correspondingly lower.
In addition, the execution order of step 404 and step 405 in this embodiment may be interchanged: second weight one and second weight two may be modified first, and the transition envelope information then calculated from the modified weights. Likewise, the execution order of step 408 and step 409 may be interchanged: third weight one and third weight two may be modified first, and the first envelope information then calculated from the modified third weight one and third weight two.
Mode two: as shown in FIG. 5, another embodiment of obtaining the first envelope information in step 302 may include:
Step 501: Calculate a correlation coefficient between the first low-band signal of the current frame of the speech/audio signal and the low-band signal of the previous frame of the speech/audio signal.
Specifically, to obtain the first envelope information more accurately, the relationship between the energy of a given frequency range of the first low-band signal of the current frame and the energy of the same frequency range of the low-band signal of the previous frame is determined. In this embodiment the correlation coefficient may be denoted "corr". From the energy relationship, over the same frequency range, between the first low-band signal of the current frame and the low-band signal of the previous frame, the correlation coefficient corr between the two signals is obtained: the smaller the energy difference, the larger corr, and the larger the energy difference, the smaller corr. For the detailed procedure, see the description of the correlation calculation for the previous N frames in step 401.
Step 502: Determine whether the correlation coefficient is within a given second threshold range.
Specifically, after corr is calculated in step 501, it is determined whether the calculated corr lies within the given second threshold range. For example, in this embodiment the second threshold range may be denoted c1 to c2.
Step 503: If the correlation coefficient is not within the second threshold range, perform weighting with the preset first weight one and first weight two to calculate the first envelope information. First weight one is the weight of the previous-frame envelope information corresponding to the high-band signal of the previous frame of the speech/audio signal, and first weight two is the weight of the predicted envelope information. First weight one and first weight two are fixed constants.
Specifically, when step 502 finds that corr is less than c1 or greater than c2, the first envelope information of the current frame is only slightly affected by the envelope information of the frame before the switch, so it can be calculated from the preset first weight one and first weight two. The product of the predicted envelope information and first weight two, plus the product of the previous-frame envelope information and first weight one, gives the weighted sum that is the first envelope information of the current frame. Narrowband speech/audio signals transmitted afterwards recover their first envelope information in the same way and with the same weights, until speech/audio signals of different bandwidths are switched again. For example, in this embodiment first weight one may be denoted a1, first weight two b1, the previous-frame envelope information pre_fenv, the predicted envelope information fenv, and the first envelope information cur_fenv. Step 503 can then be expressed as: cur_fenv = pre_fenv * a1 + fenv * b1.
Step 504: If the correlation coefficient is within the second threshold range, determine whether the preset second weight one is greater than first weight one. Second weight one is the weight of the pre-switch envelope information corresponding to the high-band signal of the frame of the speech/audio signal before the switch.
Specifically, if c1 < corr < c2, comparing second weight one with first weight one shows how strongly the first envelope information of the current frame is affected by the pre-switch envelope information and by the previous-frame envelope information.
Step 505: If second weight one is not greater than first weight one, calculate the first envelope information from the preset first weight one and first weight two.
Specifically, when step 504 determines that second weight one is not greater than first weight one, the current frame of the speech/audio signal is far from the frame before the switch, and the first envelope information is only slightly affected by the pre-switch envelope information. The first envelope information of the current frame can therefore be calculated from the preset first weight one and first weight two. Step 505 can be expressed as: cur_fenv = pre_fenv * a1 + fenv * b1.
Step 506: If second weight one is greater than first weight one, perform weighting with second weight one and the preset second weight two to calculate the first envelope information. Second weight two is the weight of the predicted envelope information. For example, second weight one may be denoted a2 and second weight two b2.
Specifically, when step 504 determines that second weight one is greater than first weight one, the current frame of the speech/audio signal is close to the first-frequency-band speech/audio signal of the frame before the switch, and the first envelope information is strongly affected by the pre-switch envelope information corresponding to that frame. The first envelope information of the current frame can therefore be calculated from the preset second weight one and second weight two: the product of the predicted envelope information and second weight two, plus the product of the pre-switch envelope information and second weight one, gives the weighted sum that is the first envelope information of the current frame. If the pre-switch envelope information is denoted con_fenv, step 506 can be expressed as: cur_fenv = con_fenv * a2 + fenv * b2.
Step 507: Decrease second weight one by the second weight step and increase second weight two by the second weight step.
Specifically, as the speech/audio signal continues to be transmitted, the influence of the frame before the switch on subsequent current frames gradually decreases. To make the calculated first envelope information more accurate, second weight one and second weight two need to be adjusted accordingly. The influence of the frame before the switch on subsequent signals gradually decreases, while the influence of the frame immediately preceding the current frame increases. The value of second weight one therefore gradually decreases while the value of second weight two gradually increases, which weakens the influence of the pre-switch envelope information on the first envelope information and strengthens the influence of the predicted envelope information. In step 507, second weight one and second weight two may be modified as follows: the new second weight one equals the old second weight one minus the first weight step, and the new second weight two equals the old second weight two plus the first weight step, where the first weight step is a preset value.
First weight one and first weight two sum to one, and second weight one and second weight two sum to one. The initial value of second weight one is greater than the initial value of first weight one.
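Mode two (steps 502 to 507) can be sketched per frame as follows, reusing the symbols corr, c1, c2, a1, b1, a2, b2, pre_fenv, con_fenv, fenv and cur_fenv from the text. This is a sketch under assumptions: corr is taken as an input (its calculation is described under step 401), envelopes are plain numbers, the range test uses the strict form c1 < corr < c2, `step` stands for the weight step applied in step 507, and the clamp at zero is added here.

```python
def first_envelope_mode_two(corr, c1, c2, pre_fenv, con_fenv, fenv,
                            a1, b1, a2, b2, step):
    """Return (cur_fenv, new a2, new b2) for one frame."""
    in_range = c1 < corr < c2            # step 502
    if not in_range or a2 <= a1:
        # Steps 503 and 505: use the fixed first weights.
        cur_fenv = pre_fenv * a1 + fenv * b1
        return cur_fenv, a2, b2
    # Step 506: weight the pre-switch envelope and the predicted one.
    cur_fenv = con_fenv * a2 + fenv * b2
    # Step 507: move weight away from the pre-switch envelope.
    # Assumption: do not let a2 fall below zero.
    dec = min(step, a2)
    return cur_fenv, a2 - dec, b2 + dec
```

The weights only change in the step 506 branch, so frames that fail the correlation test leave second weight one and second weight two untouched.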
Step 303: Generate the processed first high-band signal from the first envelope information and the predicted fine structure information.
Specifically, after the first envelope information of the current frame is obtained in step 302, the required processed first high-band signal can be generated from the first envelope information and the predicted fine structure information, so that the second high-band signal transitions smoothly to the processed first high-band signal.
In the speech/audio signal switching method of this embodiment, when the speech/audio signal switches from a wideband signal to a narrowband signal, the processed first high-band signal of the current frame is obtained from the predicted fine structure information and the first envelope information. The second high-band signal of the pre-switch wideband signal can thus transition smoothly to the processed first high-band signal corresponding to the narrowband signal, which helps improve the quality of the audio signal heard by the user.
Based on the above technical solution, optionally, as shown in FIG. 6, step 202 in this embodiment includes:
Step 601: Determine, from the current frame of the speech/audio signal and the frame before the switch, whether the processed first high-band signal needs to be attenuated.
Specifically, the first high-band signal of a narrowband speech/audio signal is empty. During a switch from a wideband signal to a narrowband signal, to prevent the recovered processed first high-band signal corresponding to the narrowband signal from having an adverse effect, once the number of frames in which the narrowband signal has been extended to a wideband signal reaches a given number, the energy of the processed first high-band signal is attenuated frame by frame until the attenuation factor reaches a given threshold. The interval between the current frame and the frame before the switch can be determined from those two frames; for example, a counter may record the number of narrowband frames transmitted, and the given number of frames may be a predetermined value greater than or equal to 0.
Step 602: If no attenuation is needed, synthesize the processed first high-band signal and the first low-band signal into a wideband signal.
Specifically, if step 601 determines that the processed first high-band signal does not need to be attenuated, the processed first high-band signal and the first low-band signal are synthesized directly into a wideband signal.
Step 603: If attenuation is needed, determine whether the attenuation factor corresponding to the processed first high-band signal is greater than a threshold.
Specifically, the initial value of the attenuation factor is one, and the threshold is less than one and greater than or equal to zero. If step 601 determines that the processed first high-band signal needs to be attenuated, step 603 determines whether the attenuation factor corresponding to the processed first high-band signal is greater than the given threshold.
Step 604: If the attenuation factor is not greater than the given threshold, multiply the processed first high-band signal by the threshold and then synthesize it with the first low-band signal into a wideband signal.
Specifically, if step 603 finds that the attenuation factor is not greater than the given threshold, the energy of the processed first high-band signal has already been attenuated far enough that the signal no longer has an adverse effect, and this attenuation ratio can be kept from then on. The processed first high-band signal is multiplied by the threshold and then synthesized with the first low-band signal into a wideband signal.
Step 605: If the attenuation factor is greater than the given threshold, multiply the processed first high-band signal by the attenuation factor and then synthesize it with the first low-band signal into a wideband signal.
Specifically, if step 603 finds that the attenuation factor is greater than the given threshold, the processed first high-band signal could still have an adverse audible effect at this attenuation factor, and attenuation must continue until the given threshold is reached. The processed first high-band signal is multiplied by the attenuation factor and then synthesized with the first low-band signal into a wideband signal.
Step 606: Modify the attenuation factor so that it decreases.
Specifically, as the speech/audio signal continues to be transmitted, the influence of the pre-switch speech/audio signal on subsequent narrowband signals gradually decreases, and the attenuation factor should decrease accordingly.
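The attenuation logic of steps 601 to 606 can be sketched as below. Several points are assumptions made for illustration: the high band is a list of sample values, the decision of step 601 arrives as a boolean, the attenuation factor is reduced by a fixed amount `factor_step` per frame (the text only says it decreases), and the final synthesis with the low band is left out because the text does not describe the filter used.

```python
def attenuate_high_band(high, need_attenuation, factor, threshold,
                        factor_step):
    """Return (scaled high band, attenuation factor for the next frame)."""
    if not need_attenuation:
        # Step 602: pass the processed high band through unchanged.
        return list(high), factor
    if factor <= threshold:
        # Step 604: attenuation has reached its floor; keep using the
        # threshold as the gain from now on.
        return [x * threshold for x in high], factor
    # Step 605: scale by the current attenuation factor.
    scaled = [x * factor for x in high]
    # Step 606: reduce the factor for the next frame
    # (fixed decrement is a hypothetical choice).
    return scaled, factor - factor_step
```

With an initial factor of one, the first attenuated frame is passed through at full level and later frames fade toward the threshold gain, which is then held.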
Based on the above technical solution, optionally, when a narrowband speech/audio signal switches to a wideband speech/audio signal, as shown in FIG. 7, one embodiment of obtaining the processed first high-band signal in step 201 includes:
Step 701: Perform weighting with the preset fourth weight one and fourth weight two to calculate the processed first high-band signal. Fourth weight one is the weight of the second high-band signal, and fourth weight two is the weight of the first high-band signal of the current frame of the speech/audio signal.
Specifically, during a switch from a narrowband signal to a wideband signal, the high-band signal of the wideband signal is not empty, whereas the high-band signal corresponding to the narrowband signal is either empty or a processed high-band signal. For the narrowband signal to switch smoothly to the wideband signal, the energy of the high-band signal of the wideband signal needs to be attenuated. The product of the second high-band signal and fourth weight one, plus the product of the first high-band signal and fourth weight two, gives the weighted sum that is the processed first high-band signal.
Step 702: Decrease fourth weight one by the third weight step and increase fourth weight two by the third weight step, until fourth weight one equals zero. Fourth weight one and fourth weight two sum to one.
Specifically, as the speech/audio signal continues to be transmitted, the influence of the pre-switch narrowband signal on subsequent wideband signals gradually decreases. Fourth weight one therefore gradually decreases and fourth weight two gradually increases, until fourth weight one becomes zero and fourth weight two becomes one, that is, until the transmitted signal is consistently a wideband speech/audio signal.
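Steps 701 and 702 amount to a per-frame cross-fade between the two high-band signals, which can be sketched as follows. The names `w1`, `w2` and `step` stand for fourth weight one, fourth weight two and the third weight step; representing each high band as a list of values and clamping the weight at zero are assumptions of this sketch.

```python
def crossfade_high_band(second_high, first_high, w1, w2, step):
    """Return (processed first high band, new w1, new w2)."""
    # Step 701: weighted sum of the pre-switch (second) high band and
    # the current frame's (first) high band.
    processed = [s * w1 + f * w2
                 for s, f in zip(second_high, first_high)]
    # Step 702: shift weight toward the current wideband signal;
    # the clamp keeps w1 from going below zero (assumption).
    dec = min(step, w1)
    return processed, w1 - dec, w2 + dec
```

Once `w1` reaches zero and `w2` reaches one, the processed high band is simply the high band of the current wideband frame.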
Similarly, as shown in FIG. 8, another embodiment of obtaining the processed first high-band signal in step 201 may include:
Step 801: Perform weighting with the preset fifth weight one and fifth weight two to calculate the processed first high-band signal. Fifth weight one is the weight of a preset fixed parameter, and fifth weight two is the weight of the first high-band signal of the current frame of the speech/audio signal.
Specifically, because the first high-band signal of a narrowband speech/audio signal is empty, a fixed parameter can be set to stand in for the high-band signal of the narrowband signal. The fixed parameter is a constant greater than or equal to zero and less than the energy of the first high-band signal. The product of the fixed parameter and fifth weight one, plus the product of the first high-band signal and fifth weight two, gives the weighted sum that is the processed first high-band signal.
Step 802: Decrease fifth weight one by the fourth weight step and increase fifth weight two by the fourth weight step, until fifth weight one equals zero. Fifth weight one and fifth weight two sum to one.
Specifically, as the speech/audio signal continues to be transmitted, the influence of the pre-switch narrowband signal on subsequent wideband signals gradually decreases. Fifth weight one therefore gradually decreases and fifth weight two gradually increases, until fifth weight one becomes zero and fifth weight two becomes one, that is, until the transmitted signal is consistently the true wideband speech/audio signal.
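To show how steps 801 and 802 play out over several frames, the sketch below runs the fade-in across a short sequence. It is an illustration only: `fixed_param` is the fixed parameter, `w1` and `w2` are fifth weight one and two, `step` is the fourth weight step, and applying the weighting directly to a list of high-band values per frame is an assumption, since the text does not say which representation the weighting acts on.

```python
def fade_in_with_fixed_parameter(frames, fixed_param, w1, w2, step):
    """Return the processed high band for each frame after the switch."""
    processed_frames = []
    for first_high in frames:
        # Step 801: weighted sum of the fixed parameter and the
        # current frame's high band.
        processed_frames.append(
            [fixed_param * w1 + x * w2 for x in first_high])
        # Step 802: update the weights until w1 reaches zero
        # (the clamp is an assumption of this sketch).
        dec = min(step, w1)
        w1, w2 = w1 - dec, w2 + dec
    return processed_frames


# Hypothetical input: three identical frames with one high-band value.
result = fade_in_with_fixed_parameter([[4.0], [4.0], [4.0]],
                                      0.0, 0.5, 0.5, 0.25)
```

The number of frames needed for the fade depends only on the initial value of fifth weight one and the size of the weight step.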
In the speech/audio signal switching method of this embodiment, when the speech/audio signal switches from a narrowband signal to a wideband signal, the high-band signal of the wideband signal is attenuated to obtain the processed high-band signal. The high-band signal corresponding to the pre-switch narrowband signal can thus transition smoothly to the processed high-band signal corresponding to the wideband signal, which helps improve the quality of the audio signal heard by the user.
The envelope information in this embodiment may also be replaced by other parameters that can represent the high-band signal, for example linear predictive coding (Linear Predictive Coding, hereinafter LPC) parameters or amplitude parameters.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
FIG. 9 is a schematic structural diagram of Embodiment 1 of the speech/audio signal switching apparatus of the present invention. As shown in FIG. 9, the speech/audio signal switching apparatus of this embodiment includes a processing module 91 and a first synthesis module 92.
The processing module 91 is configured to, when the speech/audio signal is switched, weight the first high-band signal of the current frame of the speech/audio signal and the second high-band signal of the previous M frames of the speech/audio signal to obtain the processed first high-band signal, where M is greater than or equal to 1.
The first synthesis module 92 is configured to synthesize the processed first high-band signal and the first low-band signal of the current frame of the speech/audio signal into a wideband signal.
本发明实施例的语音频信号切换装置, 通过处理模块根据前 M帧语音 频信号中的第二高频带信号对当前帧语音频信号中的第一高频带信号进行 处理, 以使的第二高频带信号能平滑的过渡到处理后的第一高频带信号, 从而在切换不同带宽的语音频信号过程中, 使不同带宽的语音频信号的高 频带信号能够平滑的过渡切换; 最后, 处理后的第一高频带信号与第一低 频带信号通过第一合成模块合成宽频带信号, 将该宽频带信号传输到用户 终端, 使用户享受到高质量的语音频信号。 本实施例语音频信号切换方法 能够平滑的进行不同带宽的语音频信号切换, 减小了能量激变造成语音频 信号的主观听觉质量差的影响, 提高了用户接听语音频信号的质量。  The speech/audio signal switching device of the embodiment of the present invention processes, by the processing module, the first high-band signal in the current frame speech and audio signal according to the second high-band signal in the pre-M frame speech and audio signal, so that The two high-band signals can smoothly transition to the processed first high-band signal, so that the high-band signals of the different bandwidth speech and audio signals can be smoothly switched during the process of switching the speech and audio signals of different bandwidths; Finally, the processed first high frequency band signal and the first low frequency band signal are combined by the first synthesis module to synthesize a wideband signal, and the wideband signal is transmitted to the user terminal, so that the user enjoys a high quality speech and audio signal. The method for switching the speech and audio signals of the embodiment can smoothly perform the switching of the speech and audio signals of different bandwidths, reduce the influence of the subjective auditory quality difference of the speech and audio signals caused by the energy excitation, and improve the quality of the audio signal of the user.
FIG. 10 is a schematic structural diagram of Embodiment 2 of the speech/audio signal switching device according to the present invention. As shown in FIG. 10, the speech/audio signal switching device of this embodiment is based on Embodiment 1 of the speech/audio signal switching device and differs in that it further includes a second synthesis module 103.
The second synthesis module 103 is configured to, when the speech/audio signal is not switched, synthesize the first high-band signal and the first low-band signal into a wideband signal.
In the speech/audio signal switching device of this embodiment, the second synthesis module is provided so that, when no switching between speech/audio signals of different bandwidths occurs, the second synthesis module synthesizes the first low-band signal and the first high-band signal of the current frame's first-frequency-band speech/audio signal into a wideband signal, which helps improve the quality of the speech/audio signal heard by the user.
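The division of work between the processing module and the two synthesis modules can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names, the single preset weight `w_prev`, and the use of list concatenation as a stand-in for band synthesis are hypothetical and are not taken from the embodiment.

```python
def weight_high_band(current_hb, previous_hb, w_prev):
    """Blend the previous frames' high band into the current frame's high band."""
    w_cur = 1.0 - w_prev  # the two weights sum to one
    return [w_prev * p + w_cur * c for p, c in zip(previous_hb, current_hb)]


def synthesize(low_band, high_band):
    # Assumption: concatenating band coefficients stands in for the real
    # synthesis of a wideband signal from its low and high bands.
    return list(low_band) + list(high_band)


def process_frame(low_band, current_hb, previous_hb, switched, w_prev=0.5):
    """Return the wideband frame for one input frame."""
    if switched:
        # Processing module followed by the first synthesis module.
        high_band = weight_high_band(current_hb, previous_hb, w_prev)
    else:
        # Second synthesis module: the high band is used unchanged.
        high_band = current_hb
    return synthesize(low_band, high_band)
```

With `switched=True` the high band is a weighted mix of the two inputs; with `switched=False` the first high-band signal passes through unmodified.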
Based on the foregoing technical solution, optionally, when a wideband speech/audio signal is switched to a narrowband speech/audio signal, as shown in FIG. 10 and FIG. 11, the processing module 101 in this embodiment includes the following modules.
The prediction module 1011 is configured to predict the predicted fine structure information and the predicted envelope information corresponding to the first high-band signal.
The first generation module 1012 is configured to perform weighting on the predicted envelope information and the previous-M-frame envelope information corresponding to the second high-band signal of the previous M frames of the speech/audio signal to obtain the first envelope information corresponding to the first high-band signal.
The second generation module 1013 is configured to generate the processed first high-band signal according to the first envelope information and the predicted fine structure information.
Further, the speech/audio signal switching device of this embodiment may also include a classification module 1010, configured to perform signal classification on the first low-band signal of the current-frame speech/audio signal. The prediction module 1011 is then further configured to predict, according to the signal type corresponding to the first low-band signal, the predicted fine structure information and the predicted envelope information corresponding to the first low-band signal of the current-frame speech/audio signal.
In the speech/audio signal switching device of this embodiment, the prediction module predicts the predicted fine structure information and the predicted envelope information corresponding to the first high-band signal, so that the first generation module and the second generation module can accurately generate the processed first high-band signal. The first high-band signal can thus transition more smoothly to the processed first high-band signal, which helps improve the quality of the speech/audio signal heard by the user. In addition, the classification module classifies the first low-band signal of the current-frame speech/audio signal, and the prediction module then obtains the predicted fine structure information and the predicted envelope information according to the signal type. This makes the predicted fine structure information and the predicted envelope information more accurate and further improves the quality of the speech/audio signal heard by the user.
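The roles of the first and second generation modules can be illustrated with a small sketch. It assumes, purely for illustration, that the envelope is a list of per-sub-band gains and that the fine structure is a list of normalized coefficients per sub-band; the embodiment does not fix these representations, and the function names are hypothetical.

```python
def weight_envelope(predicted_env, previous_env, w_prev):
    """First generation module: mix the previous-M-frame envelope with the predicted one."""
    w_pred = 1.0 - w_prev  # the two weights sum to one
    return [w_prev * p + w_pred * q for p, q in zip(previous_env, predicted_env)]


def generate_high_band(envelope, fine_structure):
    """Second generation module: scale each sub-band's fine structure by its envelope gain."""
    # Assumption: fine_structure holds one list of normalized coefficients per sub-band.
    return [[gain * x for x in band] for gain, band in zip(envelope, fine_structure)]
```

For example, weighting a predicted envelope `[2.0, 4.0]` against a previous envelope `[6.0, 8.0]` with `w_prev = 0.5` gives `[4.0, 6.0]`, which is then applied to the predicted fine structure.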
Based on the foregoing technical solution, optionally, as shown in FIG. 10 and FIG. 12, the first synthesis module 102 in this embodiment includes the following modules.
The first judging module 1021 is configured to determine, according to the current-frame speech/audio signal and the speech/audio signal of the frame before the switch, whether the processed first high-band signal needs to be attenuated.
The third synthesis module 1022 is configured to, if the first judging module 1021 determines that the processed first high-band signal does not need to be attenuated, synthesize the processed first high-band signal and the first low-band signal into a wideband signal.
The second judging module 1023 is configured to, if the first judging module 1021 determines that the processed first high-band signal needs to be attenuated, determine whether the attenuation factor corresponding to the processed first high-band signal is greater than a given threshold.
The fourth synthesis module 1024 is configured to, if the second judging module 1023 determines that the attenuation factor is not greater than the given threshold, multiply the processed first high-band signal by the threshold and then synthesize the result and the first low-band signal into a wideband signal.
The fifth synthesis module 1025 is configured to, if the second judging module 1023 determines that the attenuation factor is greater than the given threshold, multiply the processed first high-band signal by the attenuation factor and then synthesize the result and the first low-band signal into a wideband signal.
The first modification module 1026 is configured to modify the attenuation factor so that the attenuation factor decreases.
The initial value of the attenuation factor is one, and the threshold is less than one and greater than or equal to zero.
In the speech/audio signal switching device of this embodiment, attenuating the processed first high-band signal makes the wideband signal obtained from the current-frame speech/audio signal more accurate, which helps improve the quality of the audio signal heard by the user.
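The decision logic of modules 1021 to 1026 can be sketched per frame as below. Several points are assumptions made only for illustration: the attenuation factor is decreased by subtracting a fixed `step` (the embodiment only states that it decreases), it is decreased only while it is still above the threshold (once it is at or below the threshold the applied gain is the threshold anyway), and list concatenation stands in for band synthesis. The function name and parameters are hypothetical.

```python
def attenuate_and_synthesize(low_band, processed_hb, needs_attenuation,
                             factor, threshold, step):
    """Return (wideband_frame, updated_factor) for one frame."""
    if not needs_attenuation:
        gain = 1.0                        # third synthesis module: no attenuation
    elif factor > threshold:
        gain = factor                     # fifth synthesis module
        factor = max(factor - step, 0.0)  # first modification module (assumed update rule)
    else:
        gain = threshold                  # fourth synthesis module
    high_band = [gain * x for x in processed_hb]
    # Assumption: concatenation stands in for the actual wideband synthesis.
    return list(low_band) + high_band, factor
```

Starting from a factor of one with a threshold of 0.5 and a step of 0.25, successive frames are scaled by 1.0, 0.75, 0.5, 0.5, and so on, so the high band settles at the threshold gain rather than dropping below it.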
Based on the foregoing technical solution, optionally, when a narrowband speech/audio signal is switched to a wideband speech/audio signal, as shown in FIG. 10 and FIG. 13a, the processing module 101 in this embodiment includes the following modules.
The first calculation module 1011a is configured to perform weighting according to a preset fourth weight 1 and a preset fourth weight 2 to calculate the processed first high-band signal, where the fourth weight 1 is the weight value of the second high-band signal and the fourth weight 2 is the weight value of the first high-band signal.
The second modification module 1012a is configured to decrease the fourth weight 1 by a third weight step and increase the fourth weight 2 by the third weight step, until the fourth weight 1 equals zero, where the sum of the fourth weight 1 and the fourth weight 2 is one.
Similarly, when a narrowband speech/audio signal is switched to a wideband speech/audio signal, as shown in FIG. 10 and FIG. 13b, the processing module 101 in this embodiment may also include the following modules.
The second calculation module 1011b is configured to perform weighting according to a preset fifth weight 1 and a preset fifth weight 2 to calculate the processed first high-band signal, where the fifth weight 1 is the weight value of a preset fixed parameter and the fifth weight 2 is the weight value of the first high-band signal.
The third modification module 1012b is configured to decrease the fifth weight 1 by a fourth weight step and increase the fifth weight 2 by the fourth weight step, until the fifth weight 1 equals zero, where the sum of the fifth weight 1 and the fifth weight 2 is one, and the fixed parameter is a constant greater than or equal to zero and less than the energy value of the first high-band signal.
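Both narrowband-to-wideband variants share the same structure: a reference term whose weight is stepped down to zero while the weight of the first high-band signal is stepped up. The sketch below illustrates that structure only. It assumes the reference is held fixed across the transition frames and that the weights are updated once per frame; the initial weight, the step, and the function name are hypothetical values chosen for illustration.

```python
def fade_in_high_band(current_frames, reference, w_ref_init, step):
    """Weight each current high-band frame against a reference, stepping the weights per frame.

    reference is either the second high-band signal (fourth-weight variant)
    or a list filled with the preset fixed parameter (fifth-weight variant).
    """
    w_ref = w_ref_init
    processed = []
    for current in current_frames:
        w_cur = 1.0 - w_ref  # the two weights always sum to one
        processed.append([w_ref * r + w_cur * c for r, c in zip(reference, current)])
        # Decrease the reference weight by one step until it reaches zero.
        w_ref = max(w_ref - step, 0.0)
    return processed
```

Once the reference weight reaches zero, the processed first high-band signal equals the first high-band signal itself, so the transition is complete.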
In the speech/audio signal switching device of this embodiment, when the speech/audio signal switches from a narrowband speech/audio signal to a wideband speech/audio signal, the high-band signal of the wideband speech/audio signal is attenuated to obtain a processed high-band signal. The high-band signal corresponding to the narrowband speech/audio signal before the switch can therefore transition smoothly to the processed high-band signal corresponding to the wideband speech/audio signal, which helps improve the quality of the audio signal heard by the user.
Finally, it should be noted that the foregoing embodiments are merely intended to describe the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A speech/audio signal switching method, comprising:
when a speech/audio signal is switched, performing weighting on a first high-band signal of a current-frame speech/audio signal and a second high-band signal of the previous M frames of the speech/audio signal to obtain a processed first high-band signal, where M is greater than or equal to 1; and
synthesizing the processed first high-band signal and a first low-band signal of the current-frame speech/audio signal into a wideband signal.
2. The speech/audio signal switching method according to claim 1, further comprising:
when the speech/audio signal is not switched, synthesizing the first high-band signal and the first low-band signal into the wideband signal.
3. The speech/audio signal switching method according to claim 1 or 2, wherein, when a wideband speech/audio signal is switched to a narrowband speech/audio signal,
the weighting of the first high-band signal of the current-frame speech/audio signal and the second high-band signal of the previous M frames of the speech/audio signal to obtain the processed first high-band signal specifically comprises:
predicting predicted fine structure information and predicted envelope information corresponding to the first high-band signal of the current-frame speech/audio signal;
performing weighting on the predicted envelope information and the previous-M-frame envelope information corresponding to the second high-band signal of the previous M frames of the speech/audio signal to obtain first envelope information corresponding to the first high-band signal; and
generating the processed first high-band signal according to the first envelope information and the predicted fine structure information.
4. The speech/audio signal switching method according to claim 3, wherein the predicting of the predicted fine structure information and the predicted envelope information corresponding to the first high-band signal specifically comprises:
performing signal classification on the first low-band signal of the current-frame speech/audio signal; and
predicting the predicted fine structure information and the predicted envelope information according to the signal type corresponding to the first low-band signal.
5. The speech/audio signal switching method according to claim 3, wherein the weighting of the predicted envelope information and the previous-M-frame envelope information corresponding to the second high-band signal of the previous M frames of the speech/audio signal to obtain the first envelope information corresponding to the first high-band signal specifically comprises:
calculating, according to the first low-band signal and the low-band signal of the previous N frames of the speech/audio signal, a correlation coefficient between the first low-band signal and the low-band signal of the previous N frames of the speech/audio signal, where N is greater than or equal to 1;
determining whether the correlation coefficient is within a given first threshold range;
if the correlation coefficient is not within the first threshold range, performing weighting according to a preset first weight 1 and a preset first weight 2 to calculate the first envelope information, where the first weight 1 is the weight value of the previous-frame envelope information corresponding to the high-band signal of the previous frame of the speech/audio signal, and the first weight 2 is the weight value of the envelope information;
if the correlation coefficient is within the first threshold range, performing weighting according to a preset second weight 1 and a preset second weight 2 to calculate transition envelope information, where the second weight 1 is the weight value of the pre-switch envelope information corresponding to the high-band signal of the L frames of the speech/audio signal before the switch, the second weight 2 is the weight value of the previous-M-frame envelope information, and L is greater than or equal to 1;
decreasing the second weight 1 by a first weight step, and increasing the second weight 2 by the first weight step;
determining whether a preset third weight 1 is greater than the first weight 1;
if the third weight 1 is not greater than the first weight 1, performing weighting according to the preset first weight 1 and first weight 2 to calculate the first envelope information;
if the third weight 1 is greater than the first weight 1, performing weighting according to the preset third weight 1 and a preset third weight 2 to calculate the first envelope information, where the third weight 1 is the weight value of the transition envelope information, and the third weight 2 is the weight value of the predicted envelope information; and
decreasing the third weight 1 by a second weight step, and increasing the third weight 2 by the second weight step, until the third weight 1 equals zero;
where the sum of the first weight 1 and the first weight 2 is one, the sum of the second weight 1 and the second weight 2 is one, and the sum of the third weight 1 and the third weight 2 is one; the initial value of the third weight 1 is greater than the initial value of the first weight 1; and the first weight 1 and the first weight 2 are fixed constants.
6. The speech/audio signal switching method according to claim 3, wherein the weighting of the predicted envelope information and the previous-M-frame envelope information corresponding to the second high-band signal of the previous M frames of the speech/audio signal to obtain the first envelope information corresponding to the first high-band signal specifically comprises:
calculating, according to the first low-band signal of the current frame and the low-band signal of the previous frame of the speech/audio signal, a correlation coefficient between the first low-band signal and the low-band signal of the previous frame of the speech/audio signal;
determining whether the correlation coefficient is within a given second threshold range;
if the correlation coefficient is not within the second threshold range, performing weighting according to a preset first weight 1 and a preset first weight 2 to calculate the first envelope information, where the first weight 1 is the weight value of the previous-frame envelope information corresponding to the high-band signal of the previous frame of the speech/audio signal, the first weight 2 is the weight value of the predicted envelope information, and the first weight 1 and the first weight 2 are fixed constants;
if the correlation coefficient is within the second threshold range, determining whether a preset second weight 1 is greater than the first weight 1, where the second weight 1 is the weight value of the pre-switch envelope information corresponding to the high-band signal of the frame of the speech/audio signal before the switch;
if the second weight 1 is not greater than the first weight 1, performing weighting according to the preset first weight 1 and first weight 2 to calculate the first envelope information;
if the second weight 1 is greater than the first weight 1, performing weighting according to the second weight 1 and a preset second weight 2 to calculate the first envelope information, where the second weight 2 is the weight value of the predicted envelope information; and
decreasing the second weight 1 by a second weight step, and increasing the second weight 2 by the second weight step;
where the sum of the first weight 1 and the first weight 2 is one, and the sum of the second weight 1 and the second weight 2 is one; and the initial value of the second weight 1 is greater than the initial value of the first weight 1.
7. The speech/audio signal switching method according to claim 3, wherein the synthesizing of the processed first high-band signal and the first low-band signal of the current-frame speech/audio signal into a wideband signal specifically comprises:
determining, according to the current-frame speech/audio signal and the speech/audio signal of the frame before the switch, whether the processed first high-band signal needs to be attenuated;
if no attenuation is needed, synthesizing the processed first high-band signal and the first low-band signal into the wideband signal;
if attenuation is needed, determining whether the attenuation factor corresponding to the processed first high-band signal is greater than a given threshold;
if the attenuation factor is not greater than the given threshold, multiplying the processed first high-band signal by the threshold, and then synthesizing the result and the first low-band signal into the wideband signal;
if the attenuation factor is greater than the given threshold, multiplying the processed first high-band signal by the attenuation factor, and then synthesizing the result and the first low-band signal into the wideband signal; and
modifying the attenuation factor so that the attenuation factor decreases;
where the initial value of the attenuation factor is one, and the threshold is less than one and greater than or equal to zero.
8. The speech/audio signal switching method according to claim 1 or 2, wherein, when a narrowband speech/audio signal is switched to a wideband speech/audio signal,
the weighting of the first high-band signal of the current-frame speech/audio signal and the second high-band signal of the previous M frames of the speech/audio signal to obtain the processed first high-band signal specifically comprises:
performing weighting according to a preset fourth weight 1 and a preset fourth weight 2 to calculate the processed first high-band signal, where the fourth weight 1 is the weight value of the second high-band signal, and the fourth weight 2 is the weight value of the first high-band signal; and
decreasing the fourth weight 1 by a third weight step, and increasing the fourth weight 2 by the third weight step, until the fourth weight 1 equals zero, where the sum of the fourth weight 1 and the fourth weight 2 is one.
9. The speech/audio signal switching method according to claim 1 or 2, wherein, when a narrowband speech/audio signal is switched to a wideband speech/audio signal,
the weighting of the first high-band signal of the current-frame speech/audio signal and the second high-band signal of the previous M frames of the speech/audio signal to obtain the processed first high-band signal specifically comprises:
performing weighting according to a preset fifth weight 1 and a preset fifth weight 2 to calculate the processed first high-band signal, where the fifth weight 1 is the weight value of a preset fixed parameter, and the fifth weight 2 is the weight value of the first high-band signal; and
decreasing the fifth weight 1 by a fourth weight step, and increasing the fifth weight 2 by the fourth weight step, until the fifth weight 1 equals zero, where the sum of the fifth weight 1 and the fifth weight 2 is one;
where the fixed parameter is a constant greater than or equal to zero and less than the energy value of the first high-band signal.
10. A speech/audio signal switching device, comprising:
a processing module, configured to, when a speech/audio signal is switched, perform weighting on a first high-band signal of a current-frame speech/audio signal and a second high-band signal of the previous M frames of the speech/audio signal to obtain a processed first high-band signal, where M is greater than or equal to 1; and
a first synthesis module, configured to synthesize the processed first high-band signal and a first low-band signal of the current-frame speech/audio signal into a wideband signal.
11. The speech/audio signal switching device according to claim 10, further comprising:
a second synthesis module, configured to, when the speech/audio signal is not switched, synthesize the first high-band signal and the first low-band signal into the wideband signal.
12. The speech/audio signal switching device according to claim 10 or 11, wherein, when a wideband speech/audio signal is switched to a narrowband speech/audio signal,
the processing module comprises:
a prediction module, configured to predict predicted fine structure information and predicted envelope information corresponding to the first high-band signal of the current-frame speech/audio signal;
a first generation module, configured to perform weighting on the predicted envelope information and the previous-M-frame envelope information corresponding to the second high-band signal of the previous M frames of the speech/audio signal to obtain first envelope information corresponding to the first high-band signal; and
a second generation module, configured to generate the processed first high-band signal according to the first envelope information and the predicted fine structure information.
13. The speech/audio signal switching device according to claim 12, further comprising: a classification module, configured to perform signal classification on the first low-band signal of the current-frame speech/audio signal;
wherein the prediction module is further configured to predict the predicted fine structure information and the predicted envelope information according to the signal type corresponding to the first low-band signal.
14. The apparatus for switching speech or audio signals according to claim 12, wherein the first synthesis module comprises:
a first judging module, configured to judge, according to the current-frame speech or audio signal and the speech or audio signal of the frame preceding the switch, whether the processed first high-frequency band signal needs to be attenuated;
a third synthesis module, configured to synthesize the processed first high-frequency band signal and the first low-frequency band signal into the wideband signal if the first judging module determines that the processed first high-frequency band signal does not need to be attenuated;
a second judging module, configured to judge, if the first judging module determines that the processed first high-frequency band signal needs to be attenuated, whether the attenuation factor corresponding to the processed first high-frequency band signal is greater than a given threshold;
a fourth synthesis module, configured to, if the second judging module determines that the attenuation factor is not greater than the given threshold, multiply the processed first high-frequency band signal by the threshold and then synthesize the result and the first low-frequency band signal into the wideband signal;
a fifth synthesis module, configured to, if the second judging module determines that the attenuation factor is greater than the given threshold, multiply the processed first high-frequency band signal by the attenuation factor and then synthesize the result and the first low-frequency band signal into the wideband signal; and
a first modification module, configured to modify the attenuation factor so that the attenuation factor decreases; wherein the initial value of the attenuation factor is one, and the threshold is less than one and greater than or equal to zero.
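The branching in claim 14 can be illustrated with a minimal Python sketch. The function name, the per-sample list representation, and the fixed decrement `step` are assumptions made for illustration; the claim only requires that the attenuation factor starts at one, that it is reduced, and that the threshold lies in [0, 1). Synthesis with the low-frequency band is left out.

```python
def attenuate_high_band(high_band, needs_attenuation, factor, threshold, step):
    """Scale one frame of the processed high band as described in claim 14.

    Returns (scaled_high_band, next_factor). The caller would then combine
    scaled_high_band with the low band to form the wideband signal.
    """
    if not needs_attenuation:
        # Third synthesis module: use the high band unchanged.
        return list(high_band), factor

    if factor > threshold:
        gain = factor      # fifth synthesis module
    else:
        gain = threshold   # fourth synthesis module

    scaled = [gain * x for x in high_band]

    # First modification module: reduce the factor. A fixed decrement is
    # an assumption; the claim does not say how the factor is reduced.
    next_factor = max(factor - step, 0.0)
    return scaled, next_factor
```

Starting from a factor of one, the gain in this sketch falls frame by frame until it reaches the threshold and then stays at the threshold, so the high band is never scaled below that level.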
15. The apparatus for switching speech or audio signals according to claim 10 or 11, wherein, when a narrowband speech or audio signal is switched to a wideband speech or audio signal,
the processing module comprises:
a first calculation module, configured to perform weighting according to a preset fourth weight 1 and a preset fourth weight 2 to calculate the processed first high-frequency band signal, wherein the fourth weight 1 is the weight of the second high-frequency band signal and the fourth weight 2 is the weight of the first high-frequency band signal; and
a second modification module, configured to decrease the fourth weight 1 by a third weight step and increase the fourth weight 2 by the third weight step until the fourth weight 1 equals zero; wherein the sum of the fourth weight 1 and the fourth weight 2 is one.
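The weighting in claim 15 amounts to a frame-by-frame crossfade between two high-band signals. The sketch below is one possible reading, not the patented implementation: the function name and the sample-wise list representation are assumptions, `w_second` stands for the fourth weight 1, and the fourth weight 2 is derived as `1 - w_second` so that the two weights always sum to one.

```python
def crossfade_high_band(second_hb, first_hb, w_second, step):
    """Weight one frame as in claim 15 and update the weights.

    second_hb: second high-frequency band signal (weighted by fourth weight 1)
    first_hb:  first high-frequency band signal (weighted by fourth weight 2)
    Returns (processed_first_hb, next_w_second).
    """
    w_first = 1.0 - w_second  # fourth weight 2
    processed = [w_second * s + w_first * f
                 for s, f in zip(second_hb, first_hb)]
    # Second modification module: move one step toward w_second == 0.
    next_w_second = max(w_second - step, 0.0)
    return processed, next_w_second
```

Once `w_second` reaches zero the output equals the first high-frequency band signal, which corresponds to the end of the transition in the claim.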
16. The apparatus for switching speech or audio signals according to claim 13, wherein, when a narrowband speech or audio signal is switched to a wideband speech or audio signal, the processing module comprises:
a second calculation module, configured to perform weighting according to a preset fifth weight 1 and a preset fifth weight 2 to calculate the processed first high-frequency band signal, wherein the fifth weight 1 is the weight of a preset fixed parameter and the fifth weight 2 is the weight of the first high-frequency band signal; and
a third modification module, configured to decrease the fifth weight 1 by a fourth weight step and increase the fifth weight 2 by the fourth weight step until the fifth weight 1 equals zero; wherein the sum of the fifth weight 1 and the fifth weight 2 is one, and the fixed parameter is a constant greater than or equal to zero and less than the energy value of the first high-frequency band signal.
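Claim 16 replaces the second high-frequency band signal with a fixed constant. The sketch below shows the resulting ramp over several frames. Applying the weighting to a per-frame energy value is an assumption (the claim only says the fixed parameter is smaller than the energy value of the first high-frequency band signal), and the function and parameter names are hypothetical.

```python
def ramp_from_fixed_parameter(frame_energies, fixed_param, w_fixed, step):
    """Weight successive high-band energy values against a fixed constant.

    w_fixed plays the role of the fifth weight 1; the fifth weight 2 is
    1 - w_fixed, so the weights sum to one on every frame.
    """
    processed = []
    for energy in frame_energies:
        processed.append(w_fixed * fixed_param + (1.0 - w_fixed) * energy)
        # Third modification module: step the fixed-parameter weight to zero.
        w_fixed = max(w_fixed - step, 0.0)
    return processed
```

Because the fixed parameter is below the high-band energy, the weighted value rises toward the true energy as the weight of the constant shrinks.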
PCT/CN2011/073479 2010-04-28 2011-04-28 Audio signal switching method and device WO2011134415A1 (en)

Priority Applications (7)

Application Number Priority Date Filing Date Title
BR112012013306A BR112012013306B8 (en) 2010-04-28 2011-04-28 method and apparatus for switching speech or audio signals
EP17151713.9A EP3249648B1 (en) 2010-04-28 2011-04-28 Method and apparatus for switching speech or audio signals
EP11774406.0A EP2485029B1 (en) 2010-04-28 2011-04-28 Audio signal switching method and device
AU2011247719A AU2011247719B2 (en) 2010-04-28 2011-04-28 Method and apparatus for switching speech or audio signals
KR1020127012328A KR101377547B1 (en) 2010-04-28 2011-04-28 Method and apparatus for switching speech or audio signals
JP2012541316A JP5667202B2 (en) 2010-04-28 2011-04-28 Audio signal switching method and device
ES11774406.0T ES2635212T3 (en) 2010-04-28 2011-04-28 Procedure and device for switching audio signals

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201010163406.3 2010-04-28
CN2010101634063A CN101964189B (en) 2010-04-28 2010-04-28 Audio signal switching method and device

Publications (1)

Publication Number Publication Date
WO2011134415A1 (en) 2011-11-03

Family

ID=43517042

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/073479 WO2011134415A1 (en) 2010-04-28 2011-04-28 Audio signal switching method and device

Country Status (8)

Country Link
EP (2) EP3249648B1 (en)
JP (3) JP5667202B2 (en)
KR (1) KR101377547B1 (en)
CN (1) CN101964189B (en)
AU (1) AU2011247719B2 (en)
BR (1) BR112012013306B8 (en)
ES (2) ES2635212T3 (en)
WO (1) WO2011134415A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8214218B2 (en) 2010-04-28 2012-07-03 Huawei Technologies Co., Ltd. Method and apparatus for switching speech or audio signals
US10720170B2 (en) 2016-02-17 2020-07-21 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Post-processor, pre-processor, audio encoder, audio decoder and related methods for enhancing transient processing
US11929084B2 (en) 2014-07-28 2024-03-12 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder and decoder using a frequency domain processor with full-band gap filling and a time domain processor

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101110800B1 (en) * 2003-05-28 2012-07-06 도꾸리쯔교세이호진 상교기쥬쯔 소고겡뀨죠 Process for producing hydroxyl group-containing compound
US11431312B2 (en) 2004-08-10 2022-08-30 Bongiovi Acoustics Llc System and method for digital signal processing
US8284955B2 (en) 2006-02-07 2012-10-09 Bongiovi Acoustics Llc System and method for digital signal processing
US10158337B2 (en) 2004-08-10 2018-12-18 Bongiovi Acoustics Llc System and method for digital signal processing
US10848118B2 (en) 2004-08-10 2020-11-24 Bongiovi Acoustics Llc System and method for digital signal processing
US10848867B2 (en) 2006-02-07 2020-11-24 Bongiovi Acoustics Llc System and method for digital signal processing
US10701505B2 (en) 2006-02-07 2020-06-30 Bongiovi Acoustics Llc. System, method, and apparatus for generating and digitally processing a head related audio transfer function
CN101964189B (en) * 2010-04-28 2012-08-08 华为技术有限公司 Audio signal switching method and device
CN105469805B (en) 2012-03-01 2018-01-12 华为技术有限公司 A kind of voice frequency signal treating method and apparatus
CN105761724B (en) * 2012-03-01 2021-02-09 华为技术有限公司 Voice frequency signal processing method and device
CN103516440B (en) * 2012-06-29 2015-07-08 华为技术有限公司 Audio signal processing method and encoding device
CN103971693B (en) * 2013-01-29 2017-02-22 华为技术有限公司 Forecasting method for high-frequency band signal, encoding device and decoding device
US9883318B2 (en) 2013-06-12 2018-01-30 Bongiovi Acoustics Llc System and method for stereo field enhancement in two-channel audio systems
US9906858B2 (en) 2013-10-22 2018-02-27 Bongiovi Acoustics Llc System and method for digital signal processing
US9397629B2 (en) * 2013-10-22 2016-07-19 Bongiovi Acoustics Llc System and method for digital signal processing
US9524720B2 (en) * 2013-12-15 2016-12-20 Qualcomm Incorporated Systems and methods of blind bandwidth extension
CN103714822B (en) * 2013-12-27 2017-01-11 广州华多网络科技有限公司 Sub-band coding and decoding method and device based on SILK coder decoder
KR101864122B1 (en) * 2014-02-20 2018-06-05 삼성전자주식회사 Electronic apparatus and controlling method thereof
US10639000B2 (en) 2014-04-16 2020-05-05 Bongiovi Acoustics Llc Device for wide-band auscultation
US10820883B2 (en) 2014-04-16 2020-11-03 Bongiovi Acoustics Llc Noise reduction assembly for auscultation of a body
AU2019252524A1 (en) 2018-04-11 2020-11-05 Bongiovi Acoustics Llc Audio enhanced hearing protection system
CN110556116B (en) * 2018-05-31 2021-10-22 华为技术有限公司 Method and apparatus for calculating downmix signal and residual signal
US10959035B2 (en) 2018-08-02 2021-03-23 Bongiovi Acoustics Llc System, method, and apparatus for generating and digitally processing a head related audio transfer function
CN112002333B (en) * 2019-05-07 2023-07-18 海能达通信股份有限公司 Voice synchronization method and device and communication terminal
CN117373465B (en) * 2023-12-08 2024-04-09 富迪科技(南京)有限公司 Voice frequency signal switching system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050027516A1 (en) * 2003-07-16 2005-02-03 Samsung Electronics Co., Ltd. Wide-band speech signal compression and decompression apparatus, and method thereof
CN101335002A (en) * 2007-11-02 2008-12-31 华为技术有限公司 Method and apparatus for audio decoding
CN101425292A (en) * 2007-11-02 2009-05-06 华为技术有限公司 Decoding method and device for audio signal
CN101964189A (en) * 2010-04-28 2011-02-02 华为技术有限公司 Audio signal switching method and device

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4330689A (en) * 1980-01-28 1982-05-18 The United States Of America As Represented By The Secretary Of The Navy Multirate digital voice communication processor
US4769833A (en) * 1986-03-31 1988-09-06 American Telephone And Telegraph Company Wideband switching system
US5019910A (en) * 1987-01-29 1991-05-28 Norsat International Inc. Apparatus for adapting computer for satellite communications
FI115329B (en) * 2000-05-08 2005-04-15 Nokia Corp Method and arrangement for switching the source signal bandwidth in a communication connection equipped for many bandwidths
US7113522B2 (en) * 2001-01-24 2006-09-26 Qualcomm, Incorporated Enhanced conversion of wideband signals to narrowband signals
JP2005080079A (en) * 2003-09-02 2005-03-24 Sony Corp Sound reproduction device and its method
FI119533B (en) * 2004-04-15 2008-12-15 Nokia Corp Coding of audio signals
JPWO2005106848A1 (en) * 2004-04-30 2007-12-13 松下電器産業株式会社 Scalable decoding apparatus and enhancement layer erasure concealment method
WO2006011444A1 (en) * 2004-07-28 2006-02-02 Matsushita Electric Industrial Co., Ltd. Relay device and signal decoding device
JP4989971B2 (en) * 2004-09-06 2012-08-01 パナソニック株式会社 Scalable decoding apparatus and signal loss compensation method
WO2006075663A1 (en) * 2005-01-14 2006-07-20 Matsushita Electric Industrial Co., Ltd. Audio switching device and audio switching method
US8249861B2 (en) * 2005-04-20 2012-08-21 Qnx Software Systems Limited High frequency compression integration
EP1898397B1 (en) * 2005-06-29 2009-10-21 Panasonic Corporation Scalable decoder and disappeared data interpolating method
US8194865B2 (en) * 2007-02-22 2012-06-05 Personics Holdings Inc. Method and device for sound detection and audio control
JP5547081B2 (en) * 2007-11-02 2014-07-09 華為技術有限公司 Speech decoding method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2485029A4 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8214218B2 (en) 2010-04-28 2012-07-03 Huawei Technologies Co., Ltd. Method and apparatus for switching speech or audio signals
US11929084B2 (en) 2014-07-28 2024-03-12 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder and decoder using a frequency domain processor with full-band gap filling and a time domain processor
US10720170B2 (en) 2016-02-17 2020-07-21 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Post-processor, pre-processor, audio encoder, audio decoder and related methods for enhancing transient processing
US11094331B2 (en) 2016-02-17 2021-08-17 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Post-processor, pre-processor, audio encoder, audio decoder and related methods for enhancing transient processing

Also Published As

Publication number Publication date
CN101964189A (en) 2011-02-02
ES2635212T3 (en) 2017-10-02
JP2017033015A (en) 2017-02-09
EP3249648A1 (en) 2017-11-29
JP2015045888A (en) 2015-03-12
BR112012013306A2 (en) 2016-03-01
JP5667202B2 (en) 2015-02-12
CN101964189B (en) 2012-08-08
AU2011247719A1 (en) 2012-06-07
BR112012013306B1 (en) 2020-11-10
BR112012013306B8 (en) 2021-02-17
KR101377547B1 (en) 2014-03-25
AU2011247719B2 (en) 2013-07-11
ES2718947T3 (en) 2019-07-05
EP2485029A1 (en) 2012-08-08
JP2013512468A (en) 2013-04-11
EP2485029A4 (en) 2013-01-02
EP3249648B1 (en) 2019-01-09
JP6410777B2 (en) 2018-10-24
JP6027081B2 (en) 2016-11-16
EP2485029B1 (en) 2017-06-14
KR20120074303A (en) 2012-07-05

Similar Documents

Publication Publication Date Title
WO2011134415A1 (en) Audio signal switching method and device
US10559313B2 (en) Speech/audio signal processing method and apparatus
US9779749B2 (en) Audio signal coding method and apparatus
US8214218B2 (en) Method and apparatus for switching speech or audio signals
WO2021052285A1 (en) Frequency band expansion method and apparatus, electronic device, and computer readable storage medium
KR101924767B1 (en) Voice frequency code stream decoding method and device
JP2022526403A (en) Frequency band expansion methods, devices, electronic devices and computer programs
JP2022548299A (en) Audio encoding method and apparatus
WO2014000559A1 (en) Processing method for speech or audio signals and encoding apparatus thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 11774406; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 3907/CHENP/2012; Country of ref document: IN)
REEP Request for entry into the european phase (Ref document number: 2011774406; Country of ref document: EP)
WWE Wipo information: entry into national phase (Ref document number: 2011774406; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 20127012328; Country of ref document: KR; Kind code of ref document: A)
WWE Wipo information: entry into national phase (Ref document number: 2011247719; Country of ref document: AU)
WWE Wipo information: entry into national phase (Ref document number: 2012541316; Country of ref document: JP)
ENP Entry into the national phase (Ref document number: 2011247719; Country of ref document: AU; Date of ref document: 20110428; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
REG Reference to national code (Ref country code: BR; Ref legal event code: B01A; Ref document number: 112012013306; Country of ref document: BR)
ENP Entry into the national phase (Ref document number: 112012013306; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20120601)