WO2024021747A1 - Sound encoding and decoding method, related apparatus, and system


Info

Publication number
WO2024021747A1
Authority
WO
WIPO (PCT)
Prior art keywords
gain, codebook, algebraic codebook, linear, algebraic
Application number
PCT/CN2023/092547
Other languages
English (en)
French (fr)
Inventor
许剑峰
Original Assignee
荣耀终端有限公司 (Honor Device Co., Ltd.)
Application filed by 荣耀终端有限公司 (Honor Device Co., Ltd.)


Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04: using predictive techniques
    • G10L19/08: Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12: the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • G10L2019/0001: Codebooks
    • G10L2019/0016: Codebook for LPC parameters

Definitions

  • This application relates to the field of sound coding and decoding, and in particular to sound coding and decoding technology based on code-excited linear prediction (CELP).
  • In CELP coding, what the encoding side transmits to the decoding side is not the original sound but a set of encoding parameters, such as the gain of the algebraic codebook and the gain of the adaptive codebook. These encoding parameters are calculated so as to minimize the error between the reconstructed speech signal and the original speech signal. How to reduce the computational complexity of this calculation is a research hotspot in this field.
  • Various embodiments of the present application provide a voice encoding and decoding method that can reduce the complexity of calculating codebook gain.
  • a first aspect provides a method for encoding a sound signal, applied to the first subframe in the current frame.
  • the method may include: receiving a frame classification parameter index of the current frame, and, according to the frame classification parameter index of the current frame, searching a first mapping table for a linear estimate of the algebraic codebook gain in the linear domain, where each entry in the first mapping table includes two values: a frame classification parameter index and a linear estimate of the algebraic codebook gain in the linear domain.
  • the method provided in the first aspect may further include: transmitting encoding parameters.
  • the coding parameters may include: the frame classification parameter index of the current frame, and the index of the winning codebook vector in the gain codebook.
  • a second aspect provides a method of decoding a sound signal, which is also applied to the first subframe in the current frame.
  • the method may include: receiving encoding parameters, where the encoding parameters may include: the frame classification parameter index of the current frame and the index of the winning codebook vector; and searching the first mapping table, based on the frame classification parameter index of the current frame, for the linear estimate of the algebraic codebook gain in the linear domain.
  • Each entry in the first mapping table includes two values: a frame classification parameter index and a linear estimate of the algebraic codebook gain in the linear domain.
  • the energy of the algebraic codebook vector from the algebraic codebook is calculated, and the linear estimate of the algebraic codebook gain in the linear domain is divided by the square root of the energy of the algebraic codebook vector to obtain the estimated gain of the algebraic codebook.
  • the quantized gain of the algebraic codebook is obtained by multiplying the estimated gain of the algebraic codebook by a correction factor.
  • the correction factor is derived from the winning codebook vector.
  • the winning codebook vector is selected from the gain codebook based on the index of the winning codebook vector.
  • the methods provided by the first aspect and the second aspect have at least the following beneficial effects: when encoding and decoding the estimated gain of the algebraic codebook of the first subframe, highly complex operations such as logarithms and base-10 exponentials can be completely avoided, significantly reducing algorithm complexity. Moreover, the codec can directly look up the table to obtain the value corresponding to the frame classification parameter CT of the current frame, which avoids recalculating that value while the codec is running, further reducing algorithm complexity.
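  • As an illustrative sketch only (the table values, codebook vector, and correction factor below are assumptions, not data from this application), the table-lookup estimation of the first and second aspects can be expressed as:

```python
import math

# Hypothetical first mapping table: frame classification parameter index (CT) ->
# linear-domain estimate of the algebraic codebook gain. Values are made up.
LINEAR_GAIN_TABLE = [1.12, 2.85, 7.24, 18.4]

def quantized_algebraic_gain(ct_index, codebook_vector, correction_factor):
    """First/second aspects: no logarithm or exponential is needed at run time."""
    g_lin = LINEAR_GAIN_TABLE[ct_index]            # direct table lookup
    energy = sum(c * c for c in codebook_vector)   # Ec: energy of the codebook vector
    g_est = g_lin / math.sqrt(energy)              # estimated gain of the algebraic codebook
    return g_est * correction_factor               # quantized gain
```

For example, quantized_algebraic_gain(2, [1.0, 0.0, -1.0, 0.0], 0.9) divides the looked-up value 7.24 by the square root of the vector energy 2 and scales the result by the correction factor 0.9.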
  • a third aspect provides a method for encoding a sound signal, applied to the first subframe in the current frame.
  • the method may include: receiving a frame classification parameter index of the current frame, and, according to the frame classification parameter index of the current frame, searching a first mapping table for a linear estimate of the algebraic codebook gain in the logarithmic domain, where each entry in the first mapping table includes two values: a frame classification parameter index and a linear estimate of the algebraic codebook gain in the logarithmic domain.
  • the linear estimate of the algebraic codebook gain in the logarithmic domain is converted into the linear domain through an exponential operation to obtain the linear estimate of the algebraic codebook gain in the linear domain; the energy of the algebraic codebook vector from the algebraic codebook is calculated; and the linear estimate of the algebraic codebook gain in the linear domain is divided by the square root of the energy of the algebraic codebook vector to obtain the estimated gain of the algebraic codebook.
  • the quantized gain of the algebraic codebook is obtained by multiplying the estimated gain of the algebraic codebook by a correction factor; the correction factor is derived from the winning codebook vector, and the winning codebook vector is selected from the gain codebook.
  • the method provided in the third aspect may further include: transmitting coding parameters, where the coding parameters include: the frame classification parameter index of the current frame, and the index of the winning codebook vector in the gain codebook.
  • a fourth aspect provides a method of decoding a sound signal, which is also applied to the first subframe in the current frame.
  • the method may include: receiving encoding parameters, where the encoding parameters include: the frame type of the current frame, the linear estimation constant in the first subframe, and the index of the winning codebook vector; performing linear estimation using the linear estimation constant in the first subframe and the frame type of the current frame to obtain the linear estimate of the algebraic codebook gain in the logarithmic domain; converting the linear estimate of the algebraic codebook gain in the logarithmic domain into the linear domain through an exponential operation to obtain the linear estimate of the algebraic codebook gain in the linear domain; calculating the energy of the algebraic codebook vector from the algebraic codebook; dividing the linear estimate of the algebraic codebook gain in the linear domain by the square root of the energy of the algebraic codebook vector to obtain the estimated gain of the algebraic codebook; and multiplying the estimated gain of the algebraic codebook by a correction factor to obtain the quantized gain of the algebraic codebook.
  • the methods provided by the third aspect and the fourth aspect have at least the following beneficial effects: when encoding and decoding the estimated gain of the algebraic codebook of the first subframe, the logarithmic and exponential operations involving the energy Ec of the algebraic codebook vector can be avoided, reducing algorithm complexity. Moreover, the codec can directly look up the table to obtain the value of a0 + a1*CT corresponding to the frame classification parameter CT of the current frame, which avoids calculating that value again while the codec is running, saving computation.
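  • As a sketch under the same assumptions (the stored values stand for a0 + a1*CT and are made up for illustration), the log-domain table variant of the third and fourth aspects might look like:

```python
import math

# Hypothetical table of precomputed log-domain estimates a0 + a1*CT, indexed by
# the frame classification parameter index. Values are illustrative only.
LOG_GAIN_TABLE = [0.05, 0.45, 0.86, 1.27]

def quantized_gain_log_table(ct_index, codebook_vector, correction_factor):
    """Third/fourth aspects: one table lookup plus a single base-10 exponential;
    no logarithm of the codebook-vector energy Ec is ever taken."""
    g_lin = 10.0 ** LOG_GAIN_TABLE[ct_index]       # convert the estimate to the linear domain
    energy = sum(c * c for c in codebook_vector)   # Ec
    g_est = g_lin / math.sqrt(energy)              # estimated gain
    return g_est * correction_factor               # quantized gain
```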
  • a fifth aspect provides a method of encoding a sound signal, applied to the first subframe in the current frame.
  • the method may include: performing linear estimation using the linear estimation constant in the first subframe and the frame type of the current frame to obtain the linear estimate of the algebraic codebook gain in the logarithmic domain; converting the linear estimate of the algebraic codebook gain in the logarithmic domain into the linear domain through an exponential operation to obtain the linear estimate of the algebraic codebook gain in the linear domain; calculating the energy of the algebraic codebook vector from the algebraic codebook; dividing the linear estimate of the algebraic codebook gain in the linear domain by the square root of the energy of the algebraic codebook vector to obtain the estimated gain of the algebraic codebook; and multiplying the estimated gain of the algebraic codebook by a correction factor to obtain the quantized gain of the algebraic codebook. The correction factor is derived from the winning codebook vector, which is selected from the gain codebook.
  • the method provided in the fifth aspect may further include: transmitting coding parameters, where the coding parameters include: the frame type of the current frame, the linear estimation constant, and the index of the winning codebook vector in the gain codebook.
  • a sixth aspect, corresponding to the method of encoding the sound signal of the fifth aspect, provides a method of decoding the sound signal, which is also applied to the first subframe in the current frame.
  • the method may include: receiving encoding parameters, where the encoding parameters include: the frame type of the current frame, the linear estimation constant in the first subframe, and the index of the winning codebook vector; performing linear estimation using the linear estimation constant in the first subframe and the frame type of the current frame to obtain the linear estimate of the algebraic codebook gain in the logarithmic domain; converting the linear estimate of the algebraic codebook gain in the logarithmic domain into the linear domain through an exponential operation to obtain the linear estimate of the algebraic codebook gain in the linear domain; calculating the energy of the algebraic codebook vector from the algebraic codebook; dividing the linear estimate of the algebraic codebook gain in the linear domain by the square root of the energy of the algebraic codebook vector to obtain the estimated gain of the algebraic codebook; and multiplying the estimated gain of the algebraic codebook by a correction factor to obtain the quantized gain of the algebraic codebook.
  • the methods provided in the fifth and sixth aspects have at least the following beneficial effects: when encoding and decoding the estimated gain of the algebraic codebook of the first subframe, the logarithmic and exponential operations involving the energy Ec of the algebraic codebook vector can be avoided, reducing algorithm complexity.
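  • The direct linear-estimation path of the fifth and sixth aspects can be sketched as below; the constants a0 and a1 and the integer frame-type code are assumptions for illustration:

```python
import math

def quantized_gain_linear_estimation(a0, a1, frame_type, codebook_vector, correction_factor):
    """Fifth/sixth aspects: estimate the gain in the logarithmic domain from the
    linear estimation constants, convert once, then scale by 1/sqrt(Ec)."""
    g_log = a0 + a1 * frame_type                   # linear estimation in the log domain
    g_lin = 10.0 ** g_log                          # single exponential operation
    energy = sum(c * c for c in codebook_vector)   # Ec
    g_est = g_lin / math.sqrt(energy)              # log10(sqrt(Ec)) is never computed
    return g_est * correction_factor               # quantized gain
```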
  • a seventh aspect provides a method for encoding a sound signal, applied to the first subframe in the current frame.
  • the method may include: receiving a frame classification parameter index of the current frame; searching the first mapping table, according to the frame classification parameter index of the current frame, for the linear estimate of the algebraic codebook gain in the linear domain, where each entry in the first mapping table includes two values: a frame classification parameter index and a linear estimate of the algebraic codebook gain in the linear domain; calculating the energy Ec of the algebraic codebook vector from the algebraic codebook; performing a base-10 logarithmic operation on the square root of the energy, taking the negative of the result, and performing a base-10 exponential operation to obtain 10^(-log10(√Ec)), which equals 1/√Ec; multiplying the linear estimate of the algebraic codebook gain in the linear domain by 1/√Ec to obtain the estimated gain of the algebraic codebook; and multiplying the estimated gain of the algebraic codebook by a correction factor to obtain the quantized gain of the algebraic codebook. The correction factor is derived from the winning codebook vector, which is selected from the gain codebook.
  • the method provided in the seventh aspect may further include: transmitting coding parameters, where the coding parameters include: the frame classification parameter index of the current frame, and the index of the winning codebook vector in the gain codebook.
  • An eighth aspect, corresponding to the method of encoding a sound signal in the seventh aspect, provides a method of decoding a sound signal, which is also applied to the first subframe in the current frame.
  • the method may include: receiving encoding parameters, where the encoding parameters include: the frame classification parameter index of the current frame and the index of the winning codebook vector; searching the first mapping table, according to the frame classification parameter index of the current frame, for the linear estimate of the algebraic codebook gain in the linear domain, where each entry in the first mapping table consists of two values: a frame classification parameter index and a linear estimate of the algebraic codebook gain in the linear domain; calculating the energy Ec of the algebraic codebook vector from the algebraic codebook; performing a base-10 logarithmic operation on the square root of the energy, taking the negative of the result, and performing a base-10 exponential operation to obtain 10^(-log10(√Ec)) = 1/√Ec; multiplying the linear estimate of the algebraic codebook gain in the linear domain by 1/√Ec to obtain the estimated gain of the algebraic codebook; and multiplying the estimated gain of the algebraic codebook by a correction factor to obtain the quantized gain of the algebraic codebook.
  • the methods provided by the seventh aspect and the eighth aspect have at least the following beneficial effects: the codec can directly look up the table to obtain the value corresponding to the frame classification parameter CT of the current frame, which avoids the base-10 exponential operation involved in the linear estimation of the algebraic codebook gain, reducing algorithm complexity.
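  • The seventh and eighth aspects keep the reciprocal square root in its log/exp form; the short check below (an illustration, not code from this application) confirms that 10^(-log10(√Ec)) equals 1/√Ec:

```python
import math

def inv_sqrt_via_log(energy):
    # Seventh/eighth aspects' formulation: base-10 logarithm of sqrt(Ec),
    # negate the result, then apply a base-10 exponential.
    return 10.0 ** (-math.log10(math.sqrt(energy)))

def inv_sqrt_direct(energy):
    # Algebraically identical value: 1 / sqrt(Ec).
    return 1.0 / math.sqrt(energy)

# The two formulations agree to floating-point precision.
for ec in (0.5, 2.0, 36.0, 1e6):
    assert math.isclose(inv_sqrt_via_log(ec), inv_sqrt_direct(ec))
```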
  • the methods for encoding and decoding sound signals provided in the first to eighth aspects above may further include: multiplying the algebraic codebook vector from the algebraic codebook by the quantized gain of the algebraic codebook to obtain the excitation contribution of the algebraic codebook; and multiplying the adaptive codebook vector from the adaptive codebook by the quantized gain of the adaptive codebook included in the winning codebook vector selected from the gain codebook to obtain the excitation contribution of the adaptive codebook. Finally, the excitation contribution of the algebraic codebook and the excitation contribution of the adaptive codebook are added to obtain the total excitation.
  • the total excitation can be used to reconstruct the speech signal through a synthesis filter.
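  • The total-excitation and synthesis step can be sketched as follows; the plain all-pole synthesis recursion and any example coefficients are illustrative assumptions rather than this application's exact filter:

```python
def total_excitation(alg_vec, g_alg, adp_vec, g_adp):
    # Excitation contribution of the algebraic codebook plus that of the
    # adaptive codebook, each scaled by its quantized gain.
    return [g_alg * c + g_adp * v for c, v in zip(alg_vec, adp_vec)]

def synthesis_filter(excitation, lpc):
    # Plain all-pole recursion 1/A(z): y[n] = u[n] - sum_i lpc[i] * y[n-1-i]
    out = []
    for n, u in enumerate(excitation):
        y = u
        for i, a in enumerate(lpc):
            if n - 1 - i >= 0:
                y -= a * out[n - 1 - i]
        out.append(y)
    return out
```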
  • a ninth aspect provides a device with a speech coding function, which can be used to implement the method provided in the first aspect.
  • the device may include: a search component, such as the table lookup module 601 shown in Figure 6, configured to look up a linear estimate of the algebraic codebook gain in the linear domain in the first mapping table according to the frame classification parameter index of the current frame.
  • Each entry in the first mapping table includes two values: a frame classification parameter index and a linear estimate of the algebraic codebook gain in the linear domain.
  • a first calculator, including the square summer 603 and the square root calculator 604 shown in Figure 6, is used to calculate the energy of the algebraic codebook vector from the algebraic codebook.
  • a first multiplier, such as the multiplier 602 shown in Figure 6, is used to multiply the linear estimate of the algebraic codebook gain in the linear domain by the reciprocal of the square root of the energy of the algebraic codebook vector to obtain the estimated gain of the algebraic codebook.
  • a second multiplier, such as the multiplier 605 shown in Figure 6, is used to multiply the estimated gain of the algebraic codebook by the correction factor included in the winning codebook vector to obtain the quantized gain of the algebraic codebook; the winning codebook vector is selected from the gain codebook.
  • the device provided in the ninth aspect may also include: a communication component, used to transmit encoding parameters; the encoding parameters include: the frame classification parameter index of the current frame and the index of the winning codebook vector in the gain codebook.
  • a tenth aspect provides a device with a speech decoding function, which can be used to implement the method provided in the second aspect.
  • the device may include: a communication component, and the communication component is used to receive encoding parameters.
  • the coding parameters include: the frame classification parameter index of the current frame, and the index of the winning codebook vector.
  • a first search component, such as the table lookup module 601 shown in Figure 6, is used to search the first mapping table for the linear estimate of the algebraic codebook gain in the linear domain based on the frame classification parameter index of the current frame; each entry in the first mapping table contains two values: a frame classification parameter index and a linear estimate of the algebraic codebook gain in the linear domain.
  • a first calculator, including the square summer 603 and the square root calculator 604 shown in Figure 6, is used to calculate the energy of the algebraic codebook vector from the algebraic codebook.
  • a first multiplier, such as the multiplier 602 shown in Figure 6, is used to multiply the linear estimate of the algebraic codebook gain in the linear domain by the reciprocal of the square root of the energy of the algebraic codebook vector to obtain the estimated gain of the algebraic codebook.
  • a second multiplier, such as the multiplier 605 shown in Figure 6, is used to multiply the estimated gain of the algebraic codebook by a correction factor to obtain the quantized gain of the algebraic codebook; the correction factor is derived from the winning codebook vector.
  • the second search component is used to search for the winning codebook vector from the gain codebook based on the index of the winning codebook vector.
  • an eleventh aspect provides a device with a speech coding function, which can be used to implement the method provided in the third aspect.
  • the device may include: a search component, such as the table lookup module 701 shown in Figure 7, used to look up the linear estimate of the algebraic codebook gain in the logarithmic domain in the first mapping table according to the frame classification parameter index of the current frame; each entry in the first mapping table includes two values: a frame classification parameter index and a linear estimate of the algebraic codebook gain in the logarithmic domain.
  • a converter, such as the exponential calculator 702 shown in Figure 7, is used to convert the linear estimate of the algebraic codebook gain in the logarithmic domain into the linear domain through an exponential operation to obtain the linear estimate of the algebraic codebook gain in the linear domain.
  • a first calculator, including the square summer 704 and the square root calculator 705 shown in Figure 7, is used to calculate the energy of the algebraic codebook vector from the algebraic codebook.
  • a first multiplier, such as the multiplier 703 shown in Figure 7, is used to multiply the linear estimate of the algebraic codebook gain in the linear domain by the reciprocal of the square root of the energy of the algebraic codebook vector to obtain the estimated gain of the algebraic codebook.
  • a second multiplier, such as the multiplier 707 shown in Figure 7, is used to multiply the estimated gain of the algebraic codebook by the correction factor included in the winning codebook vector to obtain the quantized gain of the algebraic codebook; the winning codebook vector is selected from the gain codebook.
  • the apparatus provided in the eleventh aspect may further include: a communication component, used to transmit coding parameters; the coding parameters include: the frame classification parameter index of the current frame and the index of the winning codebook vector in the gain codebook.
  • a twelfth aspect provides a device with a speech decoding function, which can be used to implement the method provided in the fourth aspect.
  • the device may include: a communication component, and the communication component is used to receive encoding parameters.
  • the coding parameters include: the frame classification parameter index of the current frame, and the index of the winning codebook vector.
  • a first search component, such as the table lookup module 701 shown in Figure 7, is used to search the first mapping table for the linear estimate of the algebraic codebook gain in the logarithmic domain based on the frame classification parameter index of the current frame; each entry in the first mapping table consists of two values: a frame classification parameter index and a linear estimate of the algebraic codebook gain in the logarithmic domain.
  • a converter, such as the exponential calculator 702 shown in Figure 7, is used to convert the linear estimate of the algebraic codebook gain in the logarithmic domain into the linear domain through an exponential operation to obtain the linear estimate of the algebraic codebook gain in the linear domain.
  • a first calculator, including the square summer 704 and the square root calculator 705 shown in Figure 7, is used to calculate the energy of the algebraic codebook vector from the algebraic codebook.
  • a first multiplier, such as the multiplier 703 shown in Figure 7, is used to multiply the linear estimate of the algebraic codebook gain in the linear domain by the reciprocal of the square root of the energy of the algebraic codebook vector to obtain the estimated gain of the algebraic codebook.
  • a second multiplier, such as the multiplier 707 shown in Figure 7, is used to multiply the estimated gain of the algebraic codebook by a correction factor to obtain the quantized gain of the algebraic codebook; the correction factor is derived from the winning codebook vector.
  • the second search component is used to search for the winning codebook vector from the gain codebook based on the index of the winning codebook vector.
  • a thirteenth aspect provides a device with a speech coding function, which can be used to implement the method provided in the fifth aspect.
  • the device may include: a linear prediction component, such as the linear estimation module 801 shown in Figure 8, used to perform linear estimation using the linear estimation constant in the first subframe and the frame type of the current frame to obtain the linear estimate of the algebraic codebook gain in the logarithmic domain.
  • a converter, such as the exponential calculator 802 shown in Figure 8, is used to convert the linear estimate of the algebraic codebook gain in the logarithmic domain into the linear domain through an exponential operation to obtain the linear estimate of the algebraic codebook gain in the linear domain.
  • a first calculator, including the square summer 804 and the square root calculator 805 shown in Figure 8, is used to calculate the energy of the algebraic codebook vector from the algebraic codebook.
  • a first multiplier, such as the multiplier 803 shown in Figure 8, is used to multiply the linear estimate of the algebraic codebook gain in the linear domain by the reciprocal of the square root of the energy of the algebraic codebook vector to obtain the estimated gain of the algebraic codebook.
  • a second multiplier, such as the multiplier 806 shown in Figure 8, is used to multiply the estimated gain of the algebraic codebook by the correction factor included in the winning codebook vector to obtain the quantized gain of the algebraic codebook; the winning codebook vector is selected from the gain codebook.
  • the device provided in the thirteenth aspect may further include: a communication component, which is used to transmit coding parameters.
  • the coding parameters include: the frame type of the current frame, the linear estimation constant, and the index of the winning codebook vector in the gain codebook.
  • a fourteenth aspect provides a device with a speech decoding function, which can be used to implement the method provided in the sixth aspect.
  • the device may include: a communication component, and the communication component is used to receive encoding parameters.
  • the coding parameters include: the frame type of the current frame, the linear estimation constant in the first subframe, and the index of the winning codebook vector.
  • a linear prediction component, such as the linear estimation module 801 shown in Figure 8, is used to perform linear estimation using the linear estimation constant in the first subframe and the frame type of the current frame to obtain the linear estimate of the algebraic codebook gain in the logarithmic domain.
  • a converter, such as the exponential calculator 802 shown in Figure 8, is used to convert the linear estimate of the algebraic codebook gain in the logarithmic domain into the linear domain through an exponential operation to obtain the linear estimate of the algebraic codebook gain in the linear domain.
  • a first calculator, including the square summer 804 and the square root calculator 805 shown in Figure 8, is used to calculate the energy of the algebraic codebook vector from the algebraic codebook.
  • a first multiplier, such as the multiplier 803 shown in Figure 8, is used to multiply the linear estimate of the algebraic codebook gain in the linear domain by the reciprocal of the square root of the energy of the algebraic codebook vector to obtain the estimated gain of the algebraic codebook.
  • a second multiplier, such as the multiplier 806 shown in Figure 8, is used to multiply the estimated gain of the algebraic codebook by a correction factor to obtain the quantized gain of the algebraic codebook; the correction factor is derived from the winning codebook vector.
  • a search component configured to find the winning codebook vector from the gain codebook based on the index of the winning codebook vector.
  • a fifteenth aspect provides a device with a speech coding function, which can be used to implement the method provided in the seventh aspect.
  • the device may include: a first search component, such as the table lookup module 901 shown in Figure 9, used to search the first mapping table for the linear estimate of the algebraic codebook gain in the linear domain according to the frame classification parameter index of the current frame. Each entry in the first mapping table includes two values: a frame classification parameter index and a linear estimate of the algebraic codebook gain in the linear domain.
  • a first calculator, including the square summer 903 and the square root calculator 904 shown in Figure 9, is used to calculate the energy Ec of the algebraic codebook vector from the algebraic codebook.
  • a logarithmic operator, such as the calculator 905 shown in Figure 9, is used to perform a base-10 logarithmic operation on the square root of the energy Ec.
  • an exponent operator, such as the calculator 906 shown in Figure 9, is used to take the negative of the value after the logarithmic operation and then perform a base-10 exponential operation to obtain 10^(-log10(√Ec)), i.e. 1/√Ec.
  • a first multiplier, such as the multiplier 902 shown in Figure 9, is used to multiply the linear estimate of the algebraic codebook gain in the linear domain by 1/√Ec to obtain the estimated gain of the algebraic codebook.
  • a second multiplier, such as the multiplier 907 shown in Figure 9, is used to multiply the estimated gain of the algebraic codebook by the correction factor to obtain the quantized gain of the algebraic codebook; the correction factor is derived from the winning codebook vector, which is selected from the gain codebook.
  • the device provided in the fifteenth aspect may further include: a communication component, the communication component is used to transmit coding parameters, and the coding parameters include: the frame classification parameter index of the current frame, and the index of the winning codebook vector in the gain codebook.
  • a sixteenth aspect provides a device with a speech decoding function, which can be used to implement the method provided in the eighth aspect.
  • the device may include: a communication component for receiving encoding parameters.
  • the coding parameters include: the frame classification parameter index of the current frame, and the index of the winning codebook vector in the gain codebook.
  • a first search component, such as the table lookup module 901 shown in Figure 9, is used to search the first mapping table for the linear estimate of the algebraic codebook gain in the linear domain according to the frame classification parameter index of the current frame; each entry in the first mapping table contains two values: a frame classification parameter index and a linear estimate of the algebraic codebook gain in the linear domain.
  • a first calculator, including the square summer 903 and the square root calculator 904 shown in Figure 9, is used to calculate the energy Ec of the algebraic codebook vector from the algebraic codebook.
  • a logarithmic operator, such as the calculator 905 shown in Figure 9, is used to perform a base-10 logarithmic operation on the square root of the energy Ec.
  • an exponent operator, such as the calculator 906 shown in Figure 9, is used to take the negative of the value after the logarithmic operation and then perform a base-10 exponential operation to obtain 10^(-log10(√Ec)), i.e. 1/√Ec.
  • a first multiplier, such as the multiplier 902 shown in Figure 9, is used to multiply the linear estimate of the algebraic codebook gain in the linear domain by 1/√Ec to obtain the estimated gain of the algebraic codebook.
  • a second multiplier, such as the multiplier 907 shown in Figure 9, is used to multiply the estimated gain of the algebraic codebook by the correction factor to obtain the quantized gain of the algebraic codebook; the correction factor is derived from the winning codebook vector.
  • the second search component is used to search for the winning codebook vector from the gain codebook based on the index of the winning codebook vector.
  • the device with speech encoding and decoding functions provided in the ninth aspect to the sixteenth aspect above may further include:
  • the third multiplier is used to multiply the algebraic codebook vector from the algebraic codebook by the quantized gain of the algebraic codebook to obtain the excitation contribution of the algebraic codebook;
  • a fourth multiplier for multiplying the adaptive codebook vector from the adaptive codebook by the quantized gain of the adaptive codebook included in the winning codebook vector selected from the gain codebook to obtain the excitation contribution of the adaptive codebook
  • an adder is used to add the excitation contribution of the algebraic codebook and the excitation contribution of the adaptive codebook to obtain the total excitation.
  • a voice communication system may include a first device and a second device, wherein the first device may be used to perform the sound signal encoding method provided in the first, third, fifth, and seventh aspects above;
  • the second device can be used to perform the sound signal decoding method provided in the second, fourth, sixth, and eighth aspects above;
  • the first device may be a device with a speech encoding function provided in the ninth, eleventh, thirteenth, and fifteenth aspects above;
  • the second device may be a device with a speech decoding function provided in the tenth, twelfth, fourteenth, and sixteenth aspects above.
  • Figure 1 shows a block diagram of an existing CELP coding algorithm
  • Figure 2 shows a block diagram of an existing CELP decoding algorithm
  • Figure 3 shows the gain quantization process in the encoder of memoryless joint gain coding
  • Figure 4 shows the process of calculating the algebraic codebook gain in the first subframe
  • Figure 5 shows the process of calculating the algebraic codebook gain in the second subframe and subsequent subframes
  • Figure 6 shows the process of calculating the algebraic codebook gain in the first subframe provided by Embodiment 1 of the present application
  • Figure 7 shows the process of calculating the algebraic codebook gain in the first subframe provided in Embodiment 2 of the present application.
  • Figure 8 shows the process of calculating the algebraic codebook gain in the first subframe provided in Embodiment 3 of the present application.
  • Figure 9 shows the process of calculating the algebraic codebook gain in the first subframe provided in Embodiment 4 of the present application.
  • Figure 10 shows the process of utilizing calculated estimation constants for gain codebook design
  • Figure 11 shows a voice communication system including a voice coding device and a voice decoding device provided by an embodiment of the present application.
  • the embodiments of the present application improve existing CELP encoding and decoding technology, reducing the complexity of calculating the codebook gain while realizing memoryless joint gain encoding.
  • Figure 1 shows a block diagram of an existing CELP coding algorithm.
  • the input speech signal is first preprocessed. In preprocessing, the input speech signal can be sampled and pre-emphasized.
  • the preprocessed signal is further output to the LPC analysis quantization interpolation module 101 and the adder 102.
  • the LPC analysis, quantization, and interpolation module 101 performs linear prediction analysis on the input speech signal, quantizes and interpolates the analysis results, and calculates the linear prediction coding (LPC) parameters.
  • the LPC parameters are used to construct a synthesis filter 103.
  • the result of multiplying the algebraic codebook vector from the algebraic codebook by the algebraic codebook gain g c and the result of multiplying the adaptive codebook vector from the adaptive codebook by the adaptive codebook gain g p are output to the adder 104 for addition; the sum is output to the synthesis filter 103, producing the reconstructed speech signal generated when the excitation signal passes through the synthesis filter 103.
  • the reconstructed speech signal is also output to the adder 102, and is subtracted from the input speech signal to obtain an error signal.
  • the error signal is processed by a perceptual weighting filter 105, which shapes its spectrum according to auditory perception, and is fed back to the pitch analysis module 106 and the algebraic codebook search module 107.
  • the perceptual weighting filter 105 is also constructed based on the LPC parameters.
  • the excitation signal and codebook gain are determined based on the principle of minimizing the mean square error of the perceptually weighted error signal.
  • the pitch analysis module 106 derives the pitch period through autocorrelation analysis, searches the adaptive codebook accordingly to determine the best adaptive codebook vector, and obtains an excitation signal with quasi-periodic characteristics in speech.
  • the algebraic codebook search module 107 searches the algebraic codebook, determines the best algebraic codebook vector according to the criterion of minimizing the weighted mean square error, and obtains the random excitation signal of the speech model. Then, the gain of the best adaptive codebook vector and the gain of the best algebraic codebook vector are determined.
  • the quantization gains of the codebooks, the index of the best adaptive codebook vector in the adaptive codebook, the index of the best algebraic codebook vector in the algebraic codebook, and the linear prediction coding parameters form a bit stream that is transmitted to the decoding side.
  • Figure 2 shows a block diagram of an existing CELP decoding algorithm.
  • the decoding side obtains each encoding parameter from the compressed bit stream.
  • the decoding side uses these encoding parameters to generate an excitation signal.
  • Each subframe is processed as follows: the adaptive codebook vector and the algebraic codebook vector are multiplied by their respective quantization gains to obtain an excitation signal; the excitation signal is passed through the linear prediction synthesis filter 201 to obtain a reconstructed speech signal.
  • the linear prediction synthesis filter 201 is also constructed based on the LPC parameters.
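The per-subframe reconstruction described above can be sketched in Python. This is an illustrative sketch only, not the patent's implementation: the function name, argument layout, and direct-form filter loop are assumptions for demonstration.

```python
def synthesize_subframe(adaptive_vec, algebraic_vec, g_p, g_c, lpc_a, mem):
    """Reconstruct one subframe: the total excitation is passed through the
    LP synthesis filter 1/A(z). lpc_a = [1, a1, ..., aM] are direct-form
    coefficients; mem holds the last M output samples, most recent first."""
    # Total excitation: adaptive and algebraic vectors scaled by their gains.
    excitation = [g_p * v + g_c * c for v, c in zip(adaptive_vec, algebraic_vec)]
    out = []
    M = len(lpc_a) - 1
    for n, x in enumerate(excitation):
        acc = x
        for k in range(1, M + 1):          # y[n] = x[n] - sum_k a_k * y[n-k]
            past = out[n - k] if n - k >= 0 else mem[k - n - 1]
            acc -= lpc_a[k] * past
        out.append(acc)
    return out
```

With `lpc_a = [1.0]` the filter is transparent and the output equals the total excitation g_p·v + g_c·c.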
  • memoryless joint gain coding can be implemented for adaptive codebook gain and algebraic codebook gain in each subframe, especially at low bit rates (such as 7.2kbps, 8kbps).
  • the index of the quantized gain g p of the adaptive codebook in the gain codebook is transmitted with the bit stream.
  • Gain quantization in the encoder is achieved by searching the designed gain codebook based on the minimum mean square error MMSE principle.
  • Each entry in the gain codebook includes two values: the quantization gain g p of the adaptive codebook and the correction factor γ for the algebraic codebook gain.
  • the algebraic codebook gain is estimated in advance, and the result g c0 is multiplied by the correction factor γ selected from the gain codebook.
  • the gain quantization process can be shown in Figure 3.
  • Gain quantization is achieved by minimizing the energy of the error signal e(i).
  • the constants c 0 , c 1 , c 2 , c 3 , c 4 and c 5 and the estimated gain g c0 are calculated before searching the gain codebook.
  • the error energy E is calculated for each codebook entry.
  • the codebook vector [g p ; γ] that results in the smallest error energy is selected as the winning codebook vector; its entries correspond to the quantization gain g p of the adaptive codebook and the correction factor γ.
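The MMSE gain search described above can be sketched as an exhaustive loop over the codebook entries. In the real encoder the error energy is evaluated from the precomputed constants c 0 ... c 5 rather than by forming the error vector; the direct form below is an equivalent but simplified illustration, and all names are assumptions.

```python
def search_gain_codebook(target, adap_filt, alg_filt, g_c0, codebook):
    """Exhaustive MMSE search over gain-codebook entries (g_p, gamma):
    pick the entry minimizing the energy of e = target - g_p*y - gamma*g_c0*z,
    where y and z are the filtered adaptive and algebraic codebook vectors."""
    best_idx, best_err = -1, float("inf")
    for idx, (g_p, gamma) in enumerate(codebook):
        g_c = gamma * g_c0                       # quantized algebraic gain
        err = sum((t - g_p * y - g_c * z) ** 2   # error energy for this entry
                  for t, y, z in zip(target, adap_filt, alg_filt))
        if err < best_err:
            best_idx, best_err = idx, err
    return best_idx
```

Only the index of the winning entry needs to be transmitted, since the decoder holds the same codebook.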
  • the received index is used to obtain the quantized gain g p for the adaptive excitation and the quantization correction factor γ for the estimated gain of the algebraic excitation.
  • the estimated gain for the algebraic part of the excitation is computed in the same way as in the encoder.
  • the estimated (prediction) gain of the algebraic codebook is given by the following formula (4):
  • CT is the coding classification parameter (coding mode), which is the type selected for the current frame in the encoder preprocessing part.
  • E c is the energy of the filtered algebraic codebook vector, in dB, calculated in equation (5) below.
  • the estimated constants a 0 and a 1 are determined by MSE minimization on the large signal database.
  • the coding mode parameter CT in the above formula is constant for all subframes of the current frame.
  • the superscript [0] indicates the first subframe of the current frame.
  • c(n) is the filtered algebraic codebook vector.
  • the estimation process of the algebraic codebook gain is summarized as follows: the algebraic codebook gain is estimated according to the classification parameter CT of the current frame, with the energy of the algebraic codevector from the algebraic codebook excluded from the estimated algebraic codebook gain. Finally, the estimated gain of the algebraic codebook is multiplied by the correction factor γ selected from the gain codebook to obtain the quantized algebraic codebook gain g c .
  • the algebraic codebook gain is first linearly estimated in the logarithmic domain according to the classification parameter CT, giving the linear estimate a 0 + a 1 CT. Then, the energy term E c /20 of the algebraic codebook vector from the algebraic codebook is subtracted from this linear estimate, giving the estimated gain of the algebraic codebook in the logarithmic domain, a 0 + a 1 CT − E c /20, where E c is obtained by the above formula (5). Afterwards, the estimated gain of the algebraic codebook in the logarithmic domain is converted to the linear domain to obtain the estimated gain of the algebraic codebook in the linear domain; refer to formula (4) above. Finally, the estimated gain of the algebraic codebook is multiplied by the correction factor γ selected from the gain codebook to obtain the quantized algebraic codebook gain g c .
  • the quantization gain g p of the adaptive codebook is selected directly from the gain codebook; specifically, the gain codebook is searched based on the minimum mean square error (MMSE) principle; refer to the above formula (2).
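The log-domain estimate of formula (4) and the linear-domain form are numerically identical, since 10^(−E c /20) with E c = 10·log10(Σ c²(n)) equals 1/√(Σ c²(n)). A small Python check with made-up values for the trained constants a 0 and a 1 (the energy expression is a simplified stand-in for formula (5)):

```python
import math

# Made-up values: a0 and a1 are trained constants in the codec, CT a frame
# class index, c a filtered algebraic codebook vector. Illustrative only.
a0, a1, CT = 1.2, 0.4, 3
c = [0.3, -0.7, 0.5, 0.1]

E_lin = sum(x * x for x in c)        # linear-domain energy of the codevector
E_dB = 10.0 * math.log10(E_lin)      # the same energy expressed in dB

# Baseline form (cf. formula (4)): estimate in the log domain, one 10^x.
g_c0_log = 10.0 ** (a0 + a1 * CT - E_dB / 20.0)

# Linear-domain form (cf. formula (9)): estimate divided by sqrt(energy).
g_c0_lin = (10.0 ** (a0 + a1 * CT)) / math.sqrt(E_lin)
# The two agree to machine precision.
```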
  • the first and second summation terms in the exponent represent the quantization gains of the algebraic codebook and of the adaptive codebook of the previous subframes, respectively.
  • the estimated constants b 0 ,..., b 2k+1 are also determined by MSE minimization on the large signal database.
  • Figure 5 shows the gain estimation process of the second subframe and subsequent subframes.
  • the difference from the first subframe is that, in addition to the classification parameter CT of the current frame, the algebraic codebook gain of a subsequent subframe is also estimated based on the quantization gains of the adaptive codebook and the algebraic codebook of the previous subframes.
  • the estimated gain of the algebraic codebook is first calculated in the logarithmic domain, that is, the exponent term in formula (6). Then, the estimated gain of the algebraic codebook in the logarithmic domain is transformed to the linear domain. Finally, the algebraic codebook gain estimated in the linear domain is multiplied by the correction factor γ selected from the gain codebook to form the quantized algebraic codebook gain.
  • the quantization gain of the adaptive codebook is also selected directly from the gain codebook.
  • the estimation process for the second and subsequent subframes also differs from the first subframe in that the energy of the algebraic codebook vector from the algebraic codebook is not subtracted from the estimated gain of the algebraic codebook in the logarithmic domain.
  • the reason is that the gain estimation of the subsequent subframe is based on the algebraic codebook gain of the previous subframe, and the energy has been subtracted from the algebraic codebook gain of the first subframe. Gain estimates no longer need to account for the effects of removing this energy.
  • the winning codebook vector [g p ; γ] that results in the minimum error energy is found from the gain codebook according to the index, where g p is the quantization gain of the adaptive codebook; multiplying the estimated gain g c0 of the algebraic codebook by the correction factor γ yields the quantized gain g c of the algebraic codebook.
  • g c0 is calculated in the same way as in the encoder.
  • the adaptive codebook vector and the algebraic codebook vector are obtained by decoding the bit stream and are respectively multiplied by the quantization gains of the adaptive and algebraic codebooks to obtain the adaptive and algebraic excitation contributions. Finally, the two excitation contributions are added together to form the total excitation, which is filtered through a linear prediction synthesis filter to reconstruct the speech signal.
  • various embodiments of the present application improve the calculation process of the algebraic codebook gain in the first subframe, which can eliminate highly complex operations such as logarithmic operations and base-10 exponential operations, reducing the computational complexity of the codebook gain.
  • the calculation process of the estimated gain of the algebraic codebook in the first subframe represented by formula (9) can be summarized as follows: first, look up the table according to the index CT index of the classification parameter CT of the current frame to obtain the linear estimate of the algebraic codebook gain in the linear domain. Then, the linear estimate is divided by the square root of the energy E c of the algebraic codebook vector from the algebraic codebook in the linear domain (expressed as √E c ) to obtain the estimated gain of the algebraic codebook in the first subframe. In this way, when encoding and decoding the estimated gain of the algebraic codebook of the first subframe, highly complex operations such as logarithmic operations and base-10 exponential operations can be completely avoided, significantly reducing algorithm complexity.
  • the encoding side can transmit the following encoding parameters to the decoding side: the index CT index of the classification parameter CT of the current frame, and the index of the winning codebook vector [g p ; γ] in the gain codebook.
  • Figure 6 shows the calculation process of the algebraic codebook gain in the first subframe provided in Embodiment 1.
  • the table lookup module 601 looks up the table according to the index CTindex to obtain the linear estimate of the algebraic codebook gain in the linear domain (hereinafter represented as b[CT index ]), and outputs b[CT index ] to the multiplier 602.
  • the square summer 603 calculates E c through the above formula (5), and outputs the reciprocal of √E c to the multiplier 602, which divides b[CT index ] by √E c to obtain the estimated gain g c0 of the algebraic codebook. The estimated gain is then output to the multiplier 605, where it is multiplied by the correction factor γ from the gain codebook to obtain the quantized gain g c of the algebraic codebook. The quantized gain of the algebraic codebook can be further output to the multiplier 606, where it is multiplied by the algebraic codebook vector (filtered algebraic excitation) from the algebraic codebook to obtain the algebraic codebook contribution to the excitation. Finally, the algebraic codebook contribution is output to the adder 607, where it is added to the adaptive codebook contribution to obtain the total filtered excitation.
  • the adaptive codebook contribution to the excitation is obtained by outputting the adaptive codebook vector (filtered adaptive excitation) from the adaptive codebook to the multiplier 608 and multiplying it by the quantization gain of the adaptive codebook from the gain codebook.
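The Embodiment 1 path of Figure 6 can be sketched as follows: the table is built offline so no logarithm or exponentiation remains at run time. The constants, table size, and function names are illustrative assumptions, not values from the patent.

```python
import math

# Hypothetical precomputed mapping table b: CT_index -> 10^(a0 + a1*CT),
# built offline so that no base-10 exponentiation is needed at run time.
a0, a1 = 1.2, 0.4                                 # illustrative constants
b = [10.0 ** (a0 + a1 * ct) for ct in range(8)]   # one entry per frame class

def estimate_gain_embodiment1(ct_index, alg_vec):
    """Figure 6 path: table lookup, then division by the square root of the
    linear-domain energy of the algebraic codebook vector."""
    E_lin = sum(x * x for x in alg_vec)           # square summer 603
    return b[ct_index] / math.sqrt(E_lin)         # multiplier 602

def quantized_gain(g_c0, gamma):
    """Multiplier 605: apply the correction factor from the gain codebook."""
    return g_c0 * gamma
```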
  • Embodiment 2 provides the calculation process of the algebraic codebook gain in the first subframe, and the calculation formula of the estimated gain of the algebraic codebook in the first subframe has also been optimized as follows (same as the above formula (7)) to reduce the complexity:
  • a 0 +a 1 CT represents the linear estimate of the algebraic codebook gain in the logarithmic domain.
  • each entry in the mapping table b maintained by the codec may include two values: the index CT index of the classification parameter CT and the linear estimate a 0 +a 1 CT of the algebraic codebook gain in the logarithmic domain.
  • the codec can directly look up the table to obtain the value of a 0 + a 1 CT corresponding to the parameter CT of the current frame, which can avoid calculating the value again when the codec is running, saving the amount of calculation.
  • the calculation process of the estimated gain of the algebraic codebook in the first subframe represented by formula (11) can be summarized as follows: first, look up the table according to the index CT index of the classification parameter CT of the current frame to obtain the linear estimate of the algebraic codebook gain in the logarithmic domain, and convert it to the linear domain through a base-10 exponential operation. Then, the linear-domain estimate is divided by the square root of the energy E c of the algebraic codebook vector from the algebraic codebook (expressed as √E c ) to obtain the estimated gain of the algebraic codebook in the first subframe. In this way, when encoding and decoding the estimated gain of the algebraic codebook of the first subframe, highly complex operations such as logarithmic operations and base-10 exponential operations can be reduced, thereby reducing algorithm complexity.
  • the encoding side can transmit the following encoding parameters to the decoding side: the index CT index of the classification parameter CT of the current frame, and the index of the winning codebook vector [g p ; γ] in the gain codebook.
  • Figure 7 shows the calculation process of the algebraic codebook gain in the first subframe provided in Embodiment 2.
  • the table lookup module 701 looks up the table according to the index CT index to obtain the linear estimate of the algebraic codebook gain in the logarithmic domain (hereinafter represented as b[CT index ]), and outputs b[CT index ] to the base-10 exponential calculator 702, which obtains the linear estimate of the algebraic codebook gain in the linear domain: 10^b[CT index ] . This linear-domain estimate is then output to the multiplier 703.
  • the square root value of the energy of the algebraic codebook vector from the algebraic codebook, √E c , is calculated through the square summer 704 and the square root calculator 705 in sequence.
  • the square summer 704 calculates E c through the above formula (5), and outputs the reciprocal of √E c to the multiplier 703, which divides the linear-domain estimate by √E c to obtain the estimated gain g c0 of the algebraic codebook. The estimated gain is then output to the multiplier 706, where it is multiplied by the correction factor γ from the gain codebook to obtain the quantized gain g c of the algebraic codebook. The quantized gain of the algebraic codebook can be further output to the multiplier 707, where it is multiplied by the algebraic codebook vector (filtered algebraic excitation) from the algebraic codebook to obtain the algebraic codebook contribution to the excitation. Finally, the algebraic codebook contribution is output to the adder 708, where it is added to the adaptive codebook contribution to obtain the total filtered excitation.
  • the adaptive codebook contribution to the excitation is obtained by outputting the adaptive codebook vector (filtered adaptive excitation) from the adaptive codebook to the multiplier 709 and multiplying it by the quantization gain of the adaptive codebook from the gain codebook.
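The Embodiment 2 path of Figure 7 differs from Embodiment 1 only in that the table stores the log-domain estimate, so a single base-10 exponentiation remains at run time. A sketch under the same illustrative constants (all names assumed):

```python
import math

a0, a1 = 1.2, 0.4                           # illustrative constants
# The table stores the log-domain linear estimate a0 + a1*CT directly.
b_log = [a0 + a1 * ct for ct in range(8)]   # one entry per frame class

def estimate_gain_embodiment2(ct_index, alg_vec):
    """Figure 7 path: lookup in the log domain, one 10^x (calculator 702),
    then division by sqrt of the linear energy (multiplier 703)."""
    lin_est = 10.0 ** b_log[ct_index]       # the single remaining 10^x
    E_lin = sum(x * x for x in alg_vec)     # square summer 704
    return lin_est / math.sqrt(E_lin)
```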
  • Embodiment 3 provides the calculation process of the algebraic codebook gain in the first subframe, and the calculation formula of the estimated gain of the algebraic codebook in the first subframe has also been optimized as follows (same as the above formula (7)) to reduce the complexity:
  • in Embodiment 3, neither the linear estimate of the algebraic codebook gain in the linear domain nor the linear estimate a 0 + a 1 CT of the algebraic codebook gain in the logarithmic domain is enumerated in advance through a table; instead, the linear-domain estimate is obtained at run time through the base-10 exponential operation.
  • the calculation process of the estimated gain of the algebraic codebook in the first subframe represented by formula (7) can be summarized as follows: first, the linear estimate a 0 + a 1 CT of the algebraic codebook gain in the logarithmic domain is calculated according to the classification parameter CT of the current frame.
  • the linear estimate a 0 + a 1 CT of the algebraic codebook gain in the logarithmic domain is then converted into the linear domain through the base-10 exponential operation, giving 10^(a 0 + a 1 CT) . This linear-domain estimate is divided by the square root of the energy E c of the algebraic codebook vector from the algebraic codebook (expressed as √E c ) to obtain the estimated gain of the algebraic codebook in the first subframe. In this way, when encoding and decoding the estimated gain of the algebraic codebook of the first subframe, part of the logarithmic operations and base-10 exponential operations involved can be avoided, reducing the complexity of the algorithm.
  • the encoding side can transmit the following encoding parameters to the decoding side: the frame type CT of the current frame, the linear estimation constants a 0 and a 1 , and the index of the winning codebook vector [g p ; γ] in the gain codebook.
  • Figure 8 shows the calculation process of the algebraic codebook gain in the first subframe provided in Embodiment 3.
  • the linear estimation module 801 calculates the linear estimate a 0 + a 1 CT of the algebraic codebook gain in the logarithmic domain according to the parameter CT, and outputs a 0 + a 1 CT to the base-10 exponential calculator 802 to obtain the linear estimate of the algebraic codebook gain in the linear domain: 10^(a 0 + a 1 CT) .
  • the square root value of the energy of the algebraic codebook vector from the algebraic codebook, √E c , is calculated through the square summer 804 and the square root calculator 805 in sequence.
  • the square summer 804 calculates E c through the above formula (5), and outputs the reciprocal of √E c to the multiplier 803, which divides the linear-domain estimate by √E c to obtain the estimated gain g c0 of the algebraic codebook. The estimated gain is then output to the multiplier 806, where it is multiplied by the correction factor γ from the gain codebook to obtain the quantized gain g c of the algebraic codebook. The quantized gain of the algebraic codebook can be further output to the multiplier 807, where it is multiplied by the algebraic codebook vector (filtered algebraic excitation) from the algebraic codebook to obtain the algebraic codebook contribution to the excitation. Finally, the algebraic codebook contribution is output to the adder 808, where it is added to the adaptive codebook contribution to obtain the total filtered excitation.
  • the adaptive codebook contribution to the excitation is obtained by outputting the adaptive codebook vector (filtered adaptive excitation) from the adaptive codebook to the multiplier 809 and multiplying it by the quantization gain of the adaptive codebook from the gain codebook.
  • Embodiment 4 provides the calculation process of the algebraic codebook gain in the first subframe, and the calculation formula of the estimated gain of the algebraic codebook in the first subframe has been optimized as follows to reduce complexity:
  • as in Embodiment 1, since the linear estimate of the algebraic codebook gain in the linear domain is related only to the classification parameter CT and the log-domain estimation constants a 0 and a 1 , its values can be enumerated in advance through a table, each value representing a linear estimate of the algebraic codebook gain in the linear domain.
  • the codec can maintain a mapping table b; each entry in table b includes two values: the index CT index of the classification parameter CT and the linear estimate of the algebraic codebook gain in the linear domain. In this way, the codec can directly look up the table to obtain the value corresponding to the classification parameter CT of the current frame, avoiding recalculation while the codec is running and thereby reducing algorithm complexity.
  • the calculation process of the estimated gain of the algebraic codebook in the first subframe represented by formula (13) can be summarized as follows: first, look up the table according to the index CT index of the classification parameter CT of the current frame to obtain the linear estimate of the algebraic codebook gain in the linear domain. Also, take the base-10 logarithm of the square root of the energy of the algebraic codebook vector from the algebraic codebook, negate it, and perform a base-10 exponential operation on the result to obtain 10^(−log 10 √E c ) . Finally, multiply the linear estimate of the algebraic codebook gain in the linear domain by 10^(−log 10 √E c ) to obtain the estimated gain of the algebraic codebook. In this way, when encoding and decoding the estimated gain of the algebraic codebook of the first subframe, the base-10 exponential operation involved in the linear estimate of the algebraic codebook gain can be avoided, thereby reducing the complexity of the algorithm.
  • the encoding side can transmit the following encoding parameters to the decoding side: the index CT index of the classification parameter CT of the current frame, and the index of the winning codebook vector [g p ; γ] in the gain codebook.
  • Figure 9 shows the calculation process of the algebraic codebook gain in the first subframe provided in Embodiment 4.
  • the table lookup module 901 looks up the table according to the index CTindex to obtain the linear estimate of the algebraic codebook gain in the linear domain (hereinafter represented as b[CT index ]), and outputs b[CT index ] to the multiplier 902.
  • the square root value of the energy of the algebraic codebook vector from the algebraic codebook, √E c , is calculated through the square summer 903 and the square root calculator 904 in sequence.
  • the square summer 903 calculates Ec through the above formula (5).
  • the adaptive codebook contribution to the excitation is obtained by outputting the adaptive codebook vector (filtered adaptive excitation) from the adaptive codebook to the multiplier 910 and multiplying it by the quantization gain of the adaptive codebook from the gain codebook.
  • the calculation process of formula (12) can be further simplified as:
  • the calculation process of the estimated gain of the algebraic codebook in the first subframe represented by formula (14) can be summarized as follows: first, look up the table according to the index CT index of the classification parameter CT of the current frame to obtain the linear estimate of the algebraic codebook gain in the logarithmic domain. Then, the energy term log 10 √E c of the algebraic codebook vector from the algebraic codebook is subtracted from the log-domain linear estimate a 0 + a 1 CT, and a base-10 exponential operation is performed on the difference to obtain the estimated gain of the algebraic codebook. In the calculation process represented by formula (14), the linear estimate of the algebraic codebook in the logarithmic domain is obtained by looking up the table rather than computed by the codec at run time, thus saving computation.
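The formula (14) variant can be sketched as follows: subtract the log-domain energy term from the tabled estimate, then apply a single base-10 exponentiation. Constants and table contents are illustrative assumptions, not values from the patent.

```python
import math

a0, a1 = 1.2, 0.4                           # illustrative constants
# Log-domain table built offline: CT_index -> a0 + a1*CT.
b_log = [a0 + a1 * ct for ct in range(8)]

def estimate_gain_formula14(ct_index, alg_vec):
    """Formula (14) sketch: subtract log10(sqrt(E)) from the tabled
    log-domain estimate, then perform one base-10 exponentiation."""
    E_lin = sum(x * x for x in alg_vec)     # linear-domain energy
    return 10.0 ** (b_log[ct_index] - math.log10(math.sqrt(E_lin)))
```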
  • the value of the classification parameter CT of the current frame can be selected according to the signal type. For example, for narrowband signals, set the values of parameter CT to 1, 3, 5, and 7 for unvoiced, voiced, normal, or transition frames, respectively, while for wideband signals, set them to 0, 2, 4, and 6, respectively. Signal classification will be introduced below and will not be expanded upon here.
  • the methods used to determine the classification (parameter CT) of a frame may differ. For example, a basic classification is made based only on voiced or unvoiced sounds. As another example, more categories such as strong voiced sounds or strong unvoiced sounds can be added.
  • Signal classification can be performed in the following three steps.
  • first, a speech activity detector (SAD) distinguishes between active and inactive speech frames; if an inactive frame is detected, the classification ends and the frame is encoded with comfort noise generation (CNG) parameters. If an active speech frame is detected, the frame proceeds to the second step.
  • the frame is further classified to distinguish unvoiced frames. If the frame is further classified as an unvoiced signal, the classification is terminated and the frame is encoded using the encoding method most suitable for the unvoiced signal. Otherwise, further determine whether the frame is a stable voiced sound. If the frame is classified as a stable voiced frame, the frame is encoded using the encoding method most suitable for stable voiced signals. Otherwise, the frame may contain unstable signal segments such as voiced onsets or rapidly evolving voiced signals.
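The three-step classification described above can be summarized as a decision skeleton. The boolean predicates stand in for the SAD decision and the unvoiced / stable-voiced discriminators, whose parameters and thresholds are codec-specific and not modeled here.

```python
def classify_frame(is_active, is_unvoiced, is_stable_voiced):
    """Skeleton of the three-step classification. The three predicates are
    placeholders for the SAD decision and the unvoiced / stable-voiced
    discriminators described in the text."""
    if not is_active:
        return "INACTIVE"    # encoded with comfort-noise (CNG) parameters
    if is_unvoiced:
        return "UNVOICED"    # coding mode best suited to unvoiced signals
    if is_stable_voiced:
        return "VOICED"      # coding mode best suited to stable voiced signals
    return "GENERIC"         # onsets / rapidly evolving voiced segments
```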
  • discrimination of unvoiced signals can be based on the following parameters: the voicing metrics, the average spectral tilt, the maximum short-term energy increment dE0, and the maximum short-term energy deviation dE at low level.
  • This application does not limit the algorithm for distinguishing unvoiced signals.
  • the algorithm mentioned in the following document can be used: Jelinek, M., et al., "Advances in source-controlled variable bitrate wideband speech coding", Special Workshop in MAUI (SWIM): Lectures by masters in speech processing, Maui, January 12-24, 2004, the entire contents of which are hereby incorporated by reference.
  • if a frame is not classified as inactive or unvoiced, it is tested to see whether it is a stable voiced frame.
  • the differentiation of stable voiced frames can be based on the following parameters: the normalized correlation per subframe (with 1/4 subsample resolution), the average spectral tilt, and the open-loop pitch estimation for all subframes (with 1/4 subsample resolution). This application also does not limit the algorithm for distinguishing stable voiced frames.
  • the linear estimation gain of the algebraic codebook in the first subframe of the current frame is also related to the estimation constant a i .
  • the estimated constant a i can be determined through large sample data training.
  • Large sample training data can include a large number of diverse speech signals in different languages, different genders, different ages, different environments, etc. Moreover, it is assumed that the large sample training data includes N+1 frames.
  • the estimated coefficients are found by minimizing the mean square error between the estimated gain of the algebraic codebook and the optimal gain in the logarithmic domain over all frames in a large sample of training data.
  • the mean square error energy is given by: E = Σ n=0..N (G est (n) − G opt (n))², where G opt (n) is the optimal gain of the nth frame in the logarithmic domain.
  • the estimated gain of the algebraic codebook in the logarithmic domain is obtained by the following formula: G est (n) = a 0 + a 1 CT(n) − E i (n)/20.
  • L is the subframe length, in samples, of the input speech signal.
  • E i (n) represents the energy of the algebraic codebook vector from the algebraic codebook in the nth frame, and can be calculated with reference to the above formula (5).
  • the optimal values of the estimated constants a 0 and a 1 can be determined.
  • the expressions for the optimal values of the estimation constants a 0 and a 1 , that is, the solution to formula (21), are not provided here because they are relatively complicated. In practical applications, their values can be calculated in advance using computational software such as MATLAB.
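For the two constants a 0 and a 1 , the MSE minimization of formula (21) is ordinary least squares and can be solved directly from the 2×2 normal equations, as an alternative to an offline MATLAB run. A sketch, where `ct` holds the per-frame classification parameters and `g_opt_log` the optimal log-domain gains (with the energy term already removed); all names are assumptions:

```python
def fit_estimation_constants(ct, g_opt_log):
    """Least-squares fit of a0, a1 minimizing sum_n (a0 + a1*CT_n - G_n)^2,
    solved in closed form via the 2x2 normal equations."""
    n = len(ct)
    s_x = sum(ct)
    s_y = sum(g_opt_log)
    s_xx = sum(x * x for x in ct)
    s_xy = sum(x * y for x, y in zip(ct, g_opt_log))
    det = n * s_xx - s_x * s_x          # determinant of the normal matrix
    a1 = (n * s_xy - s_x * s_y) / det
    a0 = (s_y - a1 * s_x) / n
    return a0, a1
```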
  • the mean square error energy is given by:
  • the optimal values of the estimated constants b 0 , b 1 ,..., b k can be determined.
  • the expressions for the optimal values of the estimation constants b 0 , b 1 , ..., b k are likewise not provided here because they are relatively complicated. In practical applications, their values can be calculated in advance using computational software such as MATLAB.
  • the calculated estimation constant can be used to design a gain codebook for each subframe, and then the gain quantization of each subframe can be determined based on the estimation constant and the gain codebook as described above.
  • the estimate of the algebraic codebook gain is slightly different in each subframe, and the estimated coefficients are obtained by the minimum mean square error MSE.
  • the gain codebook can be designed by referring to the KMEANS algorithm in the following literature: MacQueen, J.B. (1967), "Some Methods for Classification and Analysis of Multivariate Observations", Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, pp. 281-297, the entire contents of which are hereby incorporated by reference.
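The codebook-design idea can be illustrated with a toy k-means: cluster training pairs (g p , γ) into k centroids that become the codebook entries. This generic sketch is a stand-in for, not a reproduction of, the KMEANS procedure in the cited reference.

```python
import random

def design_gain_codebook(samples, k, iters=20, seed=0):
    """Toy k-means for gain-codebook design: `samples` are training pairs
    (g_p, gamma); returns k centroids to serve as codebook entries."""
    rng = random.Random(seed)
    centroids = rng.sample(samples, k)           # initial centroids
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for gp, ga in samples:                   # assign to nearest centroid
            j = min(range(k), key=lambda i: (gp - centroids[i][0]) ** 2
                                            + (ga - centroids[i][1]) ** 2)
            buckets[j].append((gp, ga))
        for i, bucket in enumerate(buckets):     # recompute centroid means
            if bucket:
                centroids[i] = (sum(p[0] for p in bucket) / len(bucket),
                                sum(p[1] for p in bucket) / len(bucket))
    return centroids
```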
  • Figure 11 shows a voice communication system 110 including a voice encoding device 113 and a voice decoding device 117 provided by an embodiment of the present application.
  • Voice communication system 110 supports transmission of voice signals over communication channel 115 .
  • the communication channel 115 may be a wireless communication link or a wired link formed by wires or the like.
  • Communication channel 115 may support multiple simultaneous voice communications requiring shared bandwidth resources.
  • the communication channel 115 can also be extended to a storage unit within a single device for realizing shared communication. For example, the storage unit of a single device records and stores an encoded voice signal for later playback.
  • a voice collection device 111 such as a microphone converts the voice into an analog voice signal 120 that is provided to an analog-to-digital (A/D) converter 112 .
  • the function of the A/D converter 112 is to convert the analog voice signal 120 into a digital voice signal 121 .
  • the speech encoding device 113 encodes the digital speech signal 121 to generate a set of encoding parameters 122 in binary form, and transmits them to the channel encoder 114 through the communication component.
  • the channel encoder 114 performs channel encoding operations such as adding redundancy on the encoding parameters 122 to form a bit stream 123, which is transmitted through the communication channel 115.
  • the channel decoder 116 performs channel decoding operations on the bitstream 124 received through the communication component, such as utilizing redundant information in the bitstream 124 to detect and correct channel errors that occur during transmission.
  • the speech decoding device 117 converts the bitstream 125 received from the channel decoder back into coding parameters for creating a synthesized speech signal 126.
  • the synthesized speech signal 126 reconstructed in the speech decoding device 117 is converted back to an analog speech signal 127 in a digital-to-analog (D/A) converter 108 .
  • the analog voice signal 127 is played through a sound playing device such as the speaker unit 119.
  • the speech encoding device 113 encodes the speech signal to obtain encoding parameters, and the speech decoding device 117 reconstructs the speech signal using the encoding parameters carried in the bit stream.
  • the above content can be referred to and will not be described again.
  • Each component at the transmitter end can be integrated into one electronic device, and each component at the receiver end can be integrated into another electronic device.
  • the two electronic devices communicate through a communication channel formed by a wired or wireless link, such as transmitting encoding parameters.
  • the components on the transmitter side and the components on the receiver side can also be integrated into the same electronic device.
  • data exchange that is, communication, is implemented between the transmitter end and the receiver end through a shared storage unit, such as transmitting encoding parameters.
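The MSE-minimizing fit of the estimation constants described in the bullets above can be sketched as a small least-squares problem. Everything in this sketch (the synthetic CT values, the "optimal" log-domain gains, the noise level) is illustrative data rather than values from this application; only the fitting step itself mirrors the described procedure.

```python
# Sketch: fit the estimation constants a0, a1 by minimizing the mean square
# error sum_n (a0 + a1*CT(n) - G(n))^2 over a (synthetic) training set,
# using the normal equations via numpy's least-squares solver.
import numpy as np

rng = np.random.default_rng(0)
N = 1000
CT = rng.integers(0, 8, size=N).astype(float)       # per-frame classification parameter
G = 0.3 * CT + 1.5 + rng.normal(0.0, 0.1, size=N)   # synthetic "optimal" normalized gains

A = np.column_stack([np.ones(N), CT])               # design matrix [1, CT]
a0, a1 = np.linalg.lstsq(A, G, rcond=None)[0]       # solves the MSE minimization

mse = np.mean((a0 + a1 * CT - G) ** 2)              # residual mean square error
```

In a real design, G would be the optimal algebraic-codebook gains (normalized by the codevector energy, in the logarithmic domain) measured over a large multi-speaker database, and the resulting a0, a1 would be frozen into the codec.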


Abstract

声音编解码方法以及相关装置、系统，改进了CELP编解码关于第一子帧中代数码本增益的计算过程。首先，可根据当前帧的分类参数CT的索引CTindex查表获得线性域中代数码本增益的线性估计值。之后，利用线性域中来自代数码本的代数码本矢量的能量的平方根（表示为√Ec）去除线性域中代数码本增益的线性估计值，得到第一子帧中代数码本的估计增益gc0。这样，在编解码计算第一子帧的代数码本的估计增益时，可以完全避免对数log运算和以10为底的指数运算这类复杂度高的运算，显著降低算法复杂度。

Description

声音编解码方法以及相关装置、系统
本申请要求于2022年07月29日提交中国专利局、申请号为202210908196.9、申请名称为“声音编解码方法以及相关装置、系统”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及声音编解码领域,特别涉及基于码激励线性预测(code-excited linear prediction,CELP)的声音编解码技术。
背景技术
码激励线性预测（CELP）技术能够实现良好的质量和比特率折衷，最早由Manfred R. Schroeder和Bishnu S. Atal在1985年提出。在CELP编码器中，以帧为单位处理输入语音或音频信号（声音信号）。帧进一步被划分成更小的块，该更小的块被称为子帧。编解码器中，激励信号在每个子帧内被确定，包括两种分量：一种是来自过去的激励（也称为自适应码本），另一种是来自代数码本（也称为固定码本或创新码本）。编码侧向解码侧传输的不是原始声音，而是编码参数，例如代数码本的增益、自适应码本的增益，这些编码参数是在重构语音信号与原始语音信号之间的误差为最小值时计算出来的。如何降低计算复杂度是本领域的一个研究热点。
发明内容
本申请的各种实施例提供了一种声音编解码方法,能够降低计算码本增益的复杂度。
第一方面,提供了一种编码声音信号的方法,应用于当前帧中的第一子帧,该方法可包括:接收当前帧的帧分类参数索引,根据当前帧的帧分类参数索引在第一映射表中查找线性域中代数码本增益的线性估计值,其中第一映射表中的每一个条目包括两个值:帧分类参数索引和线性域中代数码本增益的线性估计值。计算来自代数码本的代数码本矢量的能量,将线性域中代数码本增益的线性估计值除以代数码本矢量的能量的平方根得到代数码本的估计增益,然后将代数码本的估计增益乘以校正因子得到代数码本的量化增益。校正因子来自获胜码本矢量,获胜码本矢量选自增益码本。
第一方面提供的方法还可以包括:传输编码参数。该编码参数可包括:当前帧的帧分类参数索引,获胜码本矢量在增益码本中的索引。
第二方面,相对于第一方面的编码声音信号的方法,提供了一种解码声音信号的方法,同样应用于当前帧中的第一子帧,该方法可包括:接收编码参数,编码参数可包括:当前帧的帧分类参数索引,获胜码本矢量的索引,基于当前帧的帧分类参数索引在第一映射表中查找线性域中代数码本增益的线性估计值。其中第一映射表中的每一个条目包括两个值:帧分类参数索引和线性域中代数码本增益的线性估计值。并且,计算来自代数码本的代数码本矢量的能量,将线性域中代数码本增益的线性估计值除以代数码本矢量的能量的平方根得到代数码本的估计增益。最后,将代数码本的估计增益乘以校正因子得到代数码本的量化增益,校正因子来自获胜码本矢量,获胜码本矢量是基于获胜码本矢量的索引选自于增益码本。
第一方面和第二方面提供的方法至少具有如下有益效果：在编解码计算第一子帧的代数码本的估计增益时，可以完全避免对数log运算和以10为底的指数运算这类复杂度高的运算，显著降低算法复杂度。并且，编解码器可直接查表获得当前帧的参数CT对应的线性估计值，避免编解码器运行时再计算，从而降低了算法复杂度。
第三方面,提供了一种编码声音信号的方法,应用于当前帧中的第一子帧,该方法可包括:接收当前帧的帧分类参数索引,根据当前帧的帧分类参数索引在第一映射表中查找对数域中代数码本增益的线性估计值,其中第一映射表中的每一个条目包括两个值:帧分类参数索引和对数域中代数码本增益的线性估计值。并且,将对数域中代数码本增益的线性估计值通过指数运算转换到线性域中得到线性域中代数码本增益的线性估计值,计算来自代数码本的代数码本矢量的能量,将线性域中代数码本增益的线性估计值除以代数码本矢量的能量的平方根,得到代数码本的估计增益。最后,将代数码本的估计增益乘以校正因子得到代数码本的量化增益;校正因子来自获胜码本矢量,获胜码本矢量选自增益码本。
第三方面提供的方法还可以包括:传输编码参数,编码参数包括:当前帧的帧分类参数索引,获胜码本矢量在增益码本中的索引。
第四方面,相对于第三方面的编码声音信号的方法,提供了一种解码声音信号的方法,同样应用于当前帧中的第一子帧,该方法可包括:接收编码参数;编码参数包括:当前帧的帧类型,第一子帧中的线性估计常数,获胜码本矢量的索引;利用第一子帧中的线性估计常数和当前帧的帧类型进行线性估计,得到对数域中代数码本增益的线性估计值;将对数域中代数码本增益的线性估计值通过指数运算转换到线性域中,得到线性域中代数码本增益的线性估计值;计算来自代数码本的代数码本矢量的能量;将线性域中代数码本增益的线性估计值除以代数码本矢量的能量的平方根,得到代数码本的估计增益;将代数码本的估计增益乘以校正因子得到代数码本的量化增益,校正因子来自获胜码本矢量,获胜码本矢量是基于获胜码本矢量的索引选自于增益码本。
第三方面和第四方面提供的方法至少具有如下有益效果:在编解码计算第一子帧的代数码本的估计增益时,可以避免代数码本矢量的能量Ec涉及的对数运算和指数运算,降低算法复杂度。并且,编解码器可直接查表获得当前帧的参数CT对应的a0+a1CT的值,可以避免编解码器运行时再计算该值,节约了计算量。
第五方面,提供了一种编码声音信号的方法,应用于当前帧中的第一子帧,该方法可包括:利用第一子帧中的线性估计常数和当前帧的帧类型进行线性估计,得到对数域中代数码本增益的线性估计值;将对数域中代数码本增益的线性估计值通过指数运算转换到线性域中,得到线性域中代数码本增益的线性估计值;计算来自代数码本的代数码本矢量的能量;将线性域中代数码本增益的线性估计值除以代数码本矢量的能量的平方根,得到代数码本的估计增益;将代数码本的估计增益乘以校正因子得到代数码本的量化增益;校正因子来自获胜码本矢量,获胜码本矢量选自增益码本。
第五方面提供的方法还可以包括:传输编码参数,编码参数包括:当前帧的帧类型,线性估计常数,获胜码本矢量在增益码本中的索引。
第六方面,相对于第五方面的编码声音信号的方法,提供了一种解码声音信号的方法,同样应用于当前帧中的第一子帧,该方法可包括:接收编码参数;编码参数包括:当前帧的帧类型,第一子帧中的线性估计常数,获胜码本矢量的索引;利用第一子帧中的线性估计常数和当前帧的帧类型进行线性估计,得到对数域中代数码本增益的线性估计值;将对数域中代数码本增益的线性估计值通过指数运算转换到线性域中,得到线性域中代数码本增益的线性估计值;计算来自代数码本的代数码本矢量的能量;将线性域中代数码本增益的线性估计 值除以代数码本矢量的能量的平方根,得到代数码本的估计增益;将代数码本的估计增益乘以校正因子得到代数码本的量化增益,校正因子来自获胜码本矢量,获胜码本矢量是基于获胜码本矢量的索引选自于增益码本。
第五方面和第六方面提供的方法至少具有如下有益效果:在编解码计算第一子帧的代数码本的估计增益时,可以避免代数码本矢量的能量Ec涉及的对数运算和指数运算,降低算法复杂度。
第七方面，提供了一种编码声音信号的方法，应用于当前帧中的第一子帧，该方法可包括：接收当前帧的帧分类参数索引；根据当前帧的帧分类参数索引在第一映射表中查找线性域中代数码本增益的线性估计值；第一映射表中的每一个条目包括两个值：帧分类参数索引和线性域中代数码本增益的线性估计值；计算来自代数码本的代数码本矢量的能量Ec，并对能量的平方根进行以10为底的对数运算，再对对数运算后的值取相反数后进行以10为底的指数运算得到10^(-log10(√Ec))；将线性域中代数码本增益的线性估计值乘以10^(-log10(√Ec))得到代数码本的估计增益；将代数码本的估计增益乘以校正因子得到代数码本增益的量化增益，校正因子来自获胜码本矢量，获胜码本矢量选自增益码本。
第七方面提供的方法还可以包括:传输编码参数,编码参数包括:当前帧的帧分类参数索引,获胜码本矢量在增益码本中的索引。
第八方面，相对于第七方面的编码声音信号的方法，提供了一种解码声音信号的方法，同样应用于当前帧中的第一子帧，该方法可包括：接收编码参数；编码参数包括：当前帧的帧分类参数索引，获胜码本矢量的索引；根据当前帧的帧分类参数索引在第一映射表中查找线性域中代数码本增益的线性估计值；第一映射表中的每一个条目包括两个值：帧分类参数索引和线性域中代数码本增益的线性估计值；计算来自代数码本的代数码本矢量的能量Ec，并对能量的平方根进行以10为底的对数运算，再对对数运算后的值取相反数后进行以10为底的指数运算得到10^(-log10(√Ec))；将线性域中代数码本增益的线性估计值乘以10^(-log10(√Ec))得到代数码本的估计增益；将代数码本的估计增益乘以校正因子得到代数码本增益的量化增益，校正因子来自获胜码本矢量，获胜码本矢量是基于获胜码本矢量的索引选自于增益码本。
第七方面和第八方面提供的方法至少具有如下有益效果：编解码器可直接查表获得当前帧的参数CT对应的线性估计值，可以避免代数码本增益的线性估计值涉及的以10为底的指数运算，降低了算法复杂度。
以上第一方面至第八方面提供的编解码声音信号的方法，还可以包括：利用代数码本增益的量化增益乘以来自代数码本的代数码本矢量，得到代数码本的激励贡献。并且，利用选自增益码本的获胜码本矢量包括的自适应码本的量化增益乘以来自自适应码本的自适应码本矢量，得到自适应码本的激励贡献。最后，将代数码本的激励贡献与自适应码本的激励贡献相加，得到总的激励。总的激励可以通过合成滤波器重构出语音信号。
第九方面,提供了一种具有语音编码功能的装置,可用于实现上述第一方面提供的方法,该装置可包括:查找部件,如图6中示出的查表模块601,用于根据当前帧的帧分类参数索引在第一映射表中查找线性域中代数码本增益的线性估计值。其中,第一映射表中的每一个条目包括两个值:帧分类参数索引和线性域中代数码本增益的线性估计值。第一计算器,包括如图6中示出的平方求和器603、平方根计算器604,用于计算来自代数码本的代数码本矢量的能量。第一乘法器,如图6中示出的乘法器602,用于将线性域中代数码本增益的线性估计值乘以代数码本矢量的能量的平方根的倒数,得到代数码本的估计增益。第二乘法器,如图6中示出的乘法器605,用于将代数码本的估计增益乘以获胜码本矢量包括的校正因子,得到代数码本的量化增益;获胜码本矢量选自增益码本。
第九方面提供的装置还可以包括:通信部件,通信部件用于传输编码参数,编码参数包 括:当前帧的帧分类参数索引,获胜码本矢量在增益码本中的索引。
第十方面,提供了一种具有语音解码功能的装置,可用于实现上述第二方面提供的方法,该装置可包括:通信部件,通信部件用于接收编码参数。编码参数包括:当前帧的帧分类参数索引,获胜码本矢量的索引。第一查找部件,如图6中示出的查表模块601,用于基于当前帧的帧分类参数索引在第一映射表中查找线性域中代数码本增益的线性估计值;第一映射表中的每一个条目包括两个值:帧分类参数索引和线性域中代数码本增益的线性估计值。第一计算器,包括如图6中示出的平方求和器603、平方根计算器604,用于计算来自代数码本的代数码本矢量的能量。第一乘法器,如图6中示出的乘法器602,用于将线性域中代数码本增益的线性估计值乘以代数码本矢量的能量的平方根的倒数,得到代数码本的估计增益。第二乘法器,如图6中示出的乘法器605,用于将代数码本的估计增益乘以校正因子得到代数码本的量化增益,校正因子来自获胜码本矢量。第二查找部件,用于基于获胜码本矢量的索引从增益码本查找出获胜码本矢量。
第十一方面,提供了一种具有语音编码功能的装置,可用于实现上述第三方面提供的方法,该装置可包括:查找部件,如图7中示出的查表模块701,用于根据当前帧的帧分类参数索引在第一映射表中查找对数域中代数码本增益的线性估计值;第一映射表中的每一个条目包括两个值:帧分类参数索引和对数域中代数码本增益的线性估计值。转换器,如图7中示出的指数计算器702,用于将对数域中代数码本增益的线性估计值通过指数运算转换到线性域中,得到线性域中代数码本增益的线性估计值。第一计算器,包括如图7中示出的平方求和器704、平方根计算器705,用于计算来自代数码本的代数码本矢量的能量。第一乘法器,如图7中示出的乘法器703,用于将线性域中代数码本增益的线性估计值乘以代数码本矢量的能量的平方根的倒数,得到代数码本的估计增益。第二乘法器,如图7中示出的乘法器707,用于将代数码本的估计增益乘以获胜码本矢量包括的校正因子,得到代数码本的量化增益;获胜码本矢量选自增益码本。
第十一方面提供的装置还可以包括:通信部件,通信部件用于传输编码参数,编码参数包括:当前帧的帧分类参数索引,获胜码本矢量在增益码本中的索引。
第十二方面,提供了一种具有语音解码功能的装置,可用于实现上述第四方面提供的方法,该装置可包括:通信部件,通信部件用于接收编码参数。其中编码参数包括:当前帧的帧分类参数索引,获胜码本矢量的索引。第一查找部件,如图7中示出的查表模块701,用于基于当前帧的帧分类参数索引在第一映射表中查找对数域中代数码本增益的线性估计值;第一映射表中的每一个条目包括两个值:帧分类参数索引和对数域中代数码本增益的线性估计值。转换器,如图7中示出的指数计算器702,用于将对数域中代数码本增益的线性估计值通过指数运算转换到线性域中,得到线性域中代数码本增益的线性估计值。
第一计算器,包括如图7中示出的平方求和器704、平方根计算器705,用于计算来自代数码本的代数码本矢量的能量。第一乘法器,如图7中示出的乘法器703,用于将线性域中代数码本增益的线性估计值乘以代数码本矢量的能量的平方根的倒数,得到代数码本的估计增益。第二乘法器,如图7中示出的乘法器707,用于将代数码本的估计增益乘以校正因子得到代数码本的量化增益,校正因子来自获胜码本矢量。第二查找部件,用于基于获胜码本矢量的索引从增益码本查找出获胜码本矢量。
第十三方面,提供了一种具有语音编码功能的装置,可用于实现上述第五方面提供的方法,该装置可包括:线性预测部件,如图8中示出的线性估计模块801,用于利用第一子帧 中的线性估计常数和当前帧的帧类型进行线性估计,得到对数域中代数码本增益的线性估计值。转换器,如图8中示出的指数计算器802,用于将对数域中代数码本增益的线性估计值通过指数运算转换到线性域中,得到线性域中代数码本增益的线性估计值。第一计算器,包括如图8中示出的平方求和器804、平方根计算器805,用于计算来自代数码本的代数码本矢量的能量。第一乘法器,如图8中示出的乘法器803,用于将线性域中代数码本增益的线性估计值乘以代数码本矢量的能量的平方根的倒数,得到代数码本的估计增益。第二乘法器,如图8中示出的乘法器806,用于将代数码本的估计增益乘以获胜码本矢量包括的校正因子,得到代数码本的量化增益;获胜码本矢量选自增益码本。
第十三方面提供的装置还可以包括:通信部件,通信部件用于传输编码参数,编码参数包括:当前帧的帧类型,线性估计常数,获胜码本矢量在增益码本中的索引。
第十四方面,提供了一种具有语音解码功能的装置,可用于实现上述第六方面提供的方法,该装置可包括:通信部件,通信部件用于接收编码参数。编码参数包括:当前帧的帧类型,第一子帧中的线性估计常数,获胜码本矢量的索引。线性预测部件,如图8中示出的线性估计模块801,用于利用第一子帧中的线性估计常数和当前帧的帧类型进行线性估计,得到对数域中代数码本增益的线性估计值。转换器,如图8中示出的指数计算器802,用于将对数域中代数码本增益的线性估计值通过指数运算转换到线性域中,得到线性域中代数码本增益的线性估计。第一计算器,包括如图8中示出的平方求和器804、平方根计算器805,用于计算来自代数码本的代数码本矢量的能量。第一乘法器,如图8中示出的乘法器803,用于将线性域中代数码本增益的线性估计值乘以代数码本矢量的能量的平方根的倒数,得到代数码本的估计增益。第二乘法器,如图8中示出的乘法器806,用于将代数码本的估计增益乘以校正因子得到代数码本的量化增益,校正因子来自获胜码本矢量。查找部件,用于基于获胜码本矢量的索引从增益码本查找出获胜码本矢量。
第十五方面,提供了一种具有语音编码功能的装置,可用于实现上述第七方面提供的方法,该装置可包括:第一查找部件,如图9中使出的查表模块901,用于根据当前帧的帧分类参数索引在第一映射表中查找线性域中代数码本增益的线性估计值。第一映射表中的每一个条目包括两个值:帧分类参数索引和线性域中代数码本增益的线性估计值。第一计算器,包括如图9中示出的平方求和器903、平方根计算器904,用于计算来自代数码本的代数码本矢量的能量Ec。对数运算器,如图9中示出的计算器905,用于对能量的平方根进行以10为底的对数运算。指数运算器,如图9中示出的计算器906,用于对对数运算后的值取相反数后进行以10为底的指数运算得到第一乘法器,如图9中示出的乘法器902,用于将线性域中代数码本增益的线性估计值乘以得到代数码本的估计增益。第二乘法器,如图9中示出的乘法器907,用于将代数码本的估计增益乘以校正因子得到代数码本增益的量化增益,校正因子来自获胜码本矢量,获胜码本矢量选自增益码本。
第十五方面提供的装置还可以包括:通信部件,通信部件用于传输编码参数,编码参数包括:当前帧的帧分类参数索引,获胜码本矢量在增益码本中的索引。
第十六方面,提供了一种具有语音解码功能的装置,可用于实现上述第八方面提供的方法,该装置可包括:通信部件,用于接收编码参数。编码参数包括:当前帧的帧分类参数索引,获胜码本矢量在增益码本中的索引。第一查找部件,如图9中使出的查表模块901,用于根据当前帧的帧分类参数索引在第一映射表中查找线性域中代数码本增益的线性估计值;第一映射表中的每一个条目包括两个值:帧分类参数索引和线性域中代数码本增益的线性估计值。第一计算器,包括如图9中示出的平方求和器903、平方根计算器904,用于计算来自代数码本的代数码本矢量的能量Ec。对数运算器,如图9中示出的计算器905,用于对能量 的平方根进行以10为底的对数运算。指数运算器,如图9中示出的计算器906,用于对对数运算后的值取相反数后进行以10为底的指数运算得到第一乘法器,如图9中示出的乘法器902,用于将线性域中代数码本增益的线性估计值乘以得到代数码本的估计增益。第二乘法器,如图9中示出的乘法器907,用于将代数码本的估计增益乘以校正因子得到代数码本增益的量化增益,校正因子来自获胜码本矢量。第二查找部件,用于基于获胜码本矢量的索引从增益码本查找出获胜码本矢量。
以上第九方面至第十六方面提供的具有语音编、解码功能的装置,还可以进一步包括:
第三乘法器,用于利用代数码本增益的量化增益乘以来自代数码本的代数码本矢量,得到代数码本的激励贡献;
第四乘法器,用于利用选自增益码本的获胜码本矢量包括的自适应码本的量化增益乘以来自自适应码本的自适应码本矢量,得到自适应码本的激励贡献;
加法器,用于将代数码本的激励贡献与自适应码本的激励贡献相加,得到总的激励。
第十七方面，提供了一种语音通信系统，可包括：第一装置、第二装置，其中：第一装置可用于执行上述第一、第三、第五、第七方面提供的编码声音信号的方法，第二装置可用于执行上述第二、第四、第六、第八方面提供的解码声音信号的方法。第一装置可以是上述第九、第十一、第十三、第十五方面提供的具有语音编码功能的装置，第二装置可以是上述第十、第十二、第十四、第十六方面提供的具有语音解码功能的装置。
附图说明
为了更清楚地说明本申请实施例中的技术方案,下面将对本申请实施例中所需要使用的附图进行说明。
图1示出了一种现有的CELP编码算法框图;
图2示出了一种现有的CELP解码算法框图;
图3示出了无记忆联合增益编码的编码器中的增益量化过程;
图4示出了第一子帧中计算代数码本增益的过程;
图5示出了第二子帧及以后子帧中计算代数码本增益的过程;
图6示出了本申请实施例一提供的第一子帧中计算代数码本增益的过程;
图7示出了本申请实施例二提供的第一子帧中计算代数码本增益的过程;
图8示出了本申请实施例三提供的第一子帧中计算代数码本增益的过程;
图9示出了本申请实施例四提供的第一子帧中计算代数码本增益的过程;
图10示出了利用计算的估计常数用于增益码本设计的流程;
图11示出了本申请实施例提供的包括语音编码装置和语音解码装置的语音通信系统。
具体实施方式
本申请实施例对CELP编解码的现有技术进行了改进,可以在实现无记忆联合增益编码的同时,降低计算码本增益的复杂度。
现有的CELP编解码技术
图1示出了一种现有的CELP编码算法框图。
输入语音信号首先被进行预处理。在预处理中,可以对输入语音信号进行采样和预加重等。预处理后的信号进一步输出给LPC分析量化插值模块101和加法器102。LPC分析量化插值模块101对输入语音信号进行线性预测分析,并对分析结果进行量化、插值,计算出线性 预测编码(linear prediction coding,LPC)参数。LPC参数用于构造合成滤波器(synthesis filter)103。来自代数码本的代数码本矢量乘以代数码本增益gc的结果,与来自自适应码本的自适应码本矢量乘以自适应码本增益gp的结果都输出加法器104相加,相加结果输出到合成滤波器103,以此构造激励信号经合成滤波器103后所生成的重构语音信号。重构语音信号也输出到加法器102,与输入语音信号相减得到误差信号。
该误差信号经感知加权(perceptual weighting)滤波器105处理,根据听觉感受改变频谱,反馈给基音分析(pitch analysis)模块106和代数码本搜索(algebra codebook search)模块107。感知加权滤波器105也是依据LPC参数构造的。
根据使感知加权的误差信号均方差最小的原则确定激励信号及码本增益。基音分析模块106通过自相关分析推得基音周期,据此搜索自适应码本以确定最佳自适应码本矢量,得到语音中具有准周期特性的激励信号。代数码本搜索模块107搜索代数码本,根据最小化加权均方差的准则确定最佳代数码本矢量,得到语音模型的随机激励信号。然后,确定最佳自适应码本矢量的增益和最佳代数码本矢量的增益。码本的量化增益、最佳自适应码本矢量在自适应码本中的索引、最佳代数码本矢量在代数码本中的索引,以及线性预测编码参数等这些编码参数形成比特流传输至解码侧。
图2示出了一种现有的CELP解码算法框图。
首先,解码侧从压缩比特流中取得各编码参数。然后,解码侧利用这些编码参数生成激励信号。对每个子帧进行如下处理:自适应码本矢量和代数码本矢量乘以各自的量化增益得到激励信号;激励信号经过线性预测合成滤波器201得到重构语音信号。解码侧,线性预测合成滤波器201也同样是依据LPC参数构造的。
无记忆联合增益编码(memory-less joint gain coding)
进一步的,可以在每一个子帧对自适应码本增益和代数码本增益可实施无记忆联合增益编码,尤其是在低比特速率(如7.2kbps、8kbps)下。实施联合增益编码后,随比特流传输的是自适应码本的量化增益gp在增益码本(gain codebook)中的索引。
在增益量化过程之前,假设滤波的自适应码本和代数码本已经知道。编码器中的增益量化通过基于最小均方差MMSE原则搜索已设计好的增益码本来实现。增益码本中的每一个条目包括两个值:自适应码本的量化增益gp和用于代数码本增益的校正因子γ。代数码本增益的估计提前完成,其结果gc0被用来乘以从增益码本中选择出的校正因子γ。在每一个子帧内,增益码本都会被完全的搜索一遍,例如索引q=0,...,Q-1。如果激励的自适应部分的量化增益被强制要求低于特定阈值,则限制搜索范围是可能的。为允许减小搜索范围,增益码本中的码本条目可以以gp的取值进行升序排列。增益量化过程可如图3所示。
增益量化通过最小化误差信号e(i)的能量来实现。误差能量通过如下公式表示:
E = eᵀe = (x − gp·y − gc·z)ᵀ(x − gp·y − gc·z)     公式(1)
将gc替换成γ·gc0，该公式可展开成：E = c5 + gp²·c0 − 2gp·c1 + γ²·gc0²·c2 − 2γ·gc0·c3 + 2gp·γ·gc0·c4     公式(2)
常量c0,c1,c2,c3,c4和c5以及估计的增益gc0在搜索增益码本之前被计算。针对每一个码本条目计算误差能量E。导致最小误差能量的码本矢量[gp;γ]被选为获胜码本矢量,其条目对应自适应码本的量化增益gp和γ。
然后,固定码本的量化增益gc可如下计算:
gc=gc0*γ   公式(3)
在解码器中，接收到的索引被用来获取自适应激励的量化增益gp以及代数激励的估计增益的量化校正因子γ。激励的代数部分的增益估计与编码器中采用的方式相同。
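上述增益码本搜索流程可以用如下示意代码概括。这只是一个示意性实现：其中的码本条目、矢量数据均为随意构造的假设值，误差能量按公式(1)展开后的常量c0~c5计算。

```python
# 最小均方误差(MMSE)增益码本搜索的示意实现，对应上文公式(1)~(3)。
import numpy as np

def search_gain_codebook(x, y, z, g_c0, codebook):
    """逐条目计算误差能量E，返回获胜条目的索引及其(gp, gc)。"""
    c0, c1 = y @ y, x @ y
    c2, c3 = z @ z, x @ z
    c4, c5 = y @ z, x @ x
    best_q, best_E = -1, float("inf")
    for q, (gp, gamma) in enumerate(codebook):
        gc = gamma * g_c0                       # 公式(3): gc = gc0 * γ
        E = (c5 + gp * gp * c0 - 2.0 * gp * c1  # 公式(2)展开的误差能量
             + gc * gc * c2 - 2.0 * gc * c3 + 2.0 * gp * gc * c4)
        if E < best_E:
            best_q, best_E = q, E
    gp, gamma = codebook[best_q]
    return best_q, gp, gamma * g_c0

rng = np.random.default_rng(1)
L = 64
y = rng.normal(size=L)                          # 滤波后的自适应码本矢量（示例数据）
z = rng.normal(size=L)                          # 滤波后的代数码本矢量（示例数据）
x = 0.8 * y + 0.4 * z                           # 目标信号：构造为恰好可被(0.8, 0.4)重建
codebook = [(0.2, 0.5), (0.8, 1.0), (1.2, 2.0)] # 假设的[gp; γ]增益码本
q, gp, gc = search_gain_codebook(x, y, z, g_c0=0.4, codebook=codebook)
```

由于目标信号恰好由gp=0.8、gc=0.4合成，搜索应选中第二个条目（γ=1.0时gc=γ·gc0=0.4），误差能量接近零。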
在当前帧的第一个子帧中,代数码本的估计(预测)增益由下面公式(4)给出:
其中,CT是编码分类参数(编码模式),是在编码器预处理部分为当前帧选择的类型。Ec是滤波后的代数码本矢量的能量,以dB为单位,在下面的公式(5)中计算得到。估计常数a0和a1是在大信号数据库上通过MSE最小化确定的。上式中的编码模式参数CT,它对于当前帧的所有子帧都是恒定的。上标[0]表示当前帧的第一个子帧。
其中,c(n)是滤波后的代数码矢量。
图4中示出了第一个子帧的增益估计过程。
代数码本增益的估计过程概述如下:根据当前帧的分类参数CT估计代数码本增益,估计的代数码本增益中已经排除了来自代数码本的代数码矢量的能量。最后,代数码本的估计增益乘以从增益码本中选择的校正因子γ,得到量化后的代数码本增益gc
具体的，如图4所示，首先在对数域中根据分类参数CT对代数码本增益进行线性估计，得到代数码本增益的线性估计值：a0+a1CT。然后，从该线性估计值中减去来自代数码本的代数码本矢量的能量参数，得到对数域中代数码本的估计增益，其中Ec通过上面公式(5)得到。之后，将对数域中代数码本的估计增益转换到线性域，得到线性域中代数码本的估计增益gc0，参考上面公式(4)。最后，将代数码本的估计增益乘以从增益码本中选择的校正因子γ，得到量化后的代数码本增益gc。
自适应码本的量化增益直接从增益码本中选择,具体基于最小均方差MMSE原则搜索增益码本,参考上面公式(2)。
在当前帧的第一个子帧之后的所有子帧使用略有不同的估计方案。不同之处在于,在这些子帧中,来自先前子帧的自适应码本和代数码本的量化增益都用作辅助估计参数以提高效率。第k个子帧中代数码本的估计增益,k>0,由下面公式(6)给出:
其中k=1,2,3。指数中的第一个和第二个求和项∑,分别表示先前子帧的代数码本的量化增益和自适应码本的量化增益。估计常数b0,…,b2k+1也是在大信号数据库上通过MSE最小化确定的。
图5中示出了第二个子帧以及之后子帧的增益估计过程。与第一帧中不同的是,除了当前帧的分类参数CT,还要根据之前子帧的自适应和代数码本的量化增益估计在后子帧的代数码本增益。
具体的，如图5所示，首先在对数域中计算得到代数码本的估计增益，即公式(6)中的指数项。然后，将对数域中代数码本的估计增益转换到线性域。最后，将线性域估计的代数码本增益乘以来自增益码本中选择的校正因子γ，形成量化后的代数码本增益gc。
第二个子帧以及之后子帧中,自适应码本的量化增益同样直接从增益码本中选择。
第二个子帧以及之后子帧与第一子帧的估计过程的不同之处还在于,没有从对数域的代数码本的估计增益中减去来自代数码本的代数码本矢量的能量。其原因在于,在后子帧的增益估计依据了在前子帧的代数码本增益,而在前的第一个子帧的代数码本增益中已经减去了该能量,在后子帧的增益估计无需再考虑移除该能量的影响。
关于无记忆联合增益编码，可进一步参考如下文献：3GPP TS 26.445, "Codec for Enhanced Voice Services (EVS); Detailed algorithmic description"，在此通过引用将其全部内容并入本文中。
在解码器中，根据索引从增益码本中找到导致最小误差能量的获胜码本矢量[gp;γ]，其中gp为自适应码本的量化增益，并利用校正因子γ乘以代数码本的估计增益gc0得到代数码本的量化增益gc。gc0的计算方式与编码器中采用的方式相同。从比特流中解码获得自适应码本矢量和代数码本矢量，分别乘以自适应、代数码本的量化增益，得到自适应、代数激励贡献。最后，将这两种激励贡献加在一起以形成总激励，并通过线性预测合成滤波器滤波总激励重构出语音信号。
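解码端按上述流程重构激励的步骤，可以用如下示意代码概括。其中的矢量与增益均为假设的示例数据，函数名也是为说明而取的。

```python
# 解码端激励重构的示意代码：用校正因子γ乘以估计增益得到量化增益gc，
# 再与自适应码本贡献相加得到总激励。
import numpy as np

def build_excitation(adaptive_vec, algebraic_vec, g_p, g_c0, gamma):
    g_c = g_c0 * gamma                                   # 代数码本的量化增益
    return g_p * adaptive_vec + g_c * algebraic_vec      # 总激励

v = np.array([1.0, 0.0, -1.0, 0.5])   # 自适应码本矢量（示例数据）
c = np.array([0.0, 1.0, 0.0, -1.0])   # 代数码本矢量（示例数据）
u = build_excitation(v, c, g_p=0.5, g_c0=2.0, gamma=0.75)
```

重构出的总激励u随后送入线性预测合成滤波器得到重构语音信号。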
上述CELP编解码的现有技术存在计算复杂度较高的问题。例如,上面公式(4),在第一子帧中估算代数码本增益需要同时进行对数log运算和以10为底的指数运算,计算量较大。
为此,本申请各个实施例改进了第一子帧中代数码本增益的计算过程,可节约对数log运算和以10为底的指数运算这类复杂度高的运算,降低码本增益的计算复杂度。
实施例一
实施例一提供的第一子帧中代数码本增益的计算过程，对第一子帧中代数码本的估计增益的计算公式进行了如下优化，以实现复杂度的降低（保证计算结果不变，不影响效果）：gc0 = 10^(a0+a1CT)/√Ec      公式(7)，其中Ec为线性域中来自代数码本的代数码本矢量的能量。
由于线性域中代数码本增益的线性估计值10^(a0+a1CT)只与分类参数CT以及对数域中的估计常数a0、a1有关，因此，10^(a0+a1CT)的取值可以提前通过表格枚举出来。10^(a0+a1CT)表示线性域中代数码本增益的线性估计值。编解码器可维护着一张映射表b，表b中每一个条目包括两个值：分类参数CT的索引CTindex和线性域中代数码本增益的线性估计值b[CTindex]=10^(a0+a1CT)。这样，编解码器可直接查表获得当前帧的参数CT对应的10^(a0+a1CT)的值，避免编解码器运行时再计算，从而降低了算法复杂度。
公式(7)计算过程可进一步简化为:
公式(9)所表示的第一子帧中代数码本的估计增益的计算过程可概述如下：首先，根据当前帧的分类参数CT的索引CTindex查表获得线性域中代数码本增益的线性估计值b[CTindex]。之后，利用线性域中来自代数码本的代数码本矢量的能量Ec的平方根（表示为√Ec）去除线性域中代数码本增益的线性估计值，得到第一子帧中代数码本的估计增益gc0=b[CTindex]/√Ec。这样，在编解码计算第一子帧的代数码本的估计增益时，可以完全避免对数log运算和以10为底的指数运算这类复杂度高的运算，显著降低算法复杂度。
编码侧向解码侧可以传输以下编码参数:当前帧的分类参数CT的索引CTindex,获胜码本矢量[gp;γ]在增益码本中的索引。
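实施例一的查表方案可以用如下示意代码验证其与对数域计算路径（公式(7)）在数值上等价。其中a0、a1以及码矢量均为假设的示例值，并非本申请训练得到的常数。

```python
# 实施例一的示意实现：查表得到b[CTindex] = 10^(a0+a1*CT)，再除以
# 代数码本矢量能量的平方根，得到估计增益（公式(9)，无log/指数运算）。
import math

a0, a1 = -1.0, 0.4                                        # 假设的估计常数
TABLE_B = {ct: 10.0 ** (a0 + a1 * ct) for ct in range(8)}  # 离线枚举的映射表b

def estimate_gain_lookup(ct_index, code_vec):
    Ec = sum(s * s for s in code_vec)          # 线性域中代数码矢量的能量
    return TABLE_B[ct_index] / math.sqrt(Ec)   # 公式(9)

def estimate_gain_reference(ct_index, code_vec):
    # 参考路径：对数域线性估计后再做以10为底的指数运算（对应公式(7)）
    Ec = sum(s * s for s in code_vec)
    return 10.0 ** (a0 + a1 * ct_index - 0.5 * math.log10(Ec))

c = [0.5, -1.0, 0.25, 2.0]                     # 假设的代数码本矢量
g_fast = estimate_gain_lookup(3, c)
g_ref = estimate_gain_reference(3, c)
```

两条路径给出相同的估计增益，但查表路径在运行时不含任何对数或指数运算。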
图6示出了实施例一提供的第一子帧中代数码本增益的计算过程。
如图6所示，首先向查表模块601输入当前帧的分类参数CT的索引CTindex。然后，查表模块601根据索引CTindex查表获得线性域中代数码本增益的线性估计值（下面表示为b[CTindex]），并输出b[CTindex]到乘法器602。并且，依次通过平方求和器603、平方根计算器604计算来自代数码本的代数码本矢量的能量的平方根值√Ec：平方求和器603通过上面公式(5)计算得到Ec。之后输出√Ec的倒数值至乘法器602，用以去除b[CTindex]进而获得代数码本的估计增益gc0。然后，输出代数码本的估计增益gc0至乘法器605，用以乘以来自增益码本的校正因子γ得到代数码本的量化增益gc。代数码本的量化增益gc可进一步输出到乘法器606，用以乘以来自代数码本的代数码本矢量（滤波后的代数激励）得到激励的代数码本贡献。最终，激励的代数码本贡献输出到加法器607，用以和激励的自适应码本贡献相加得到总的滤波后的激励。
其中,激励的自适应码本贡献由输出到乘法器608的来自自适应码本的自适应码本矢量(滤波后的自适应激励)乘以来自增益码本的自适应码本的量化增益得到。自适应码本的量化增益直接从增益码本中选择,具体基于最小均方差MMSE原则搜索增益码本,参考上面公式(2)。
实施例二
实施例二提供的第一子帧中代数码本增益的计算过程,对第一子帧中代数码本的估计增益的计算公式也进行了如下优化(同上述公式(7)),以实现复杂度的降低:
与实施例一稍有不同的是,只将公式(7)中a0+a1CT的取值提前通过表格枚举出来。a0+a1CT表示对数域中代数码本增益的线性估计值。此时,编解码器维护的映射表b中每一个条目可包括两个值:分类参数CT的索引CTindex和对数域中代数码本增益的线性估计值a0+a1CT。这样,编解码器可直接查表获得当前帧的参数CT对应的a0+a1CT的值,可以避免编解码器运行时再计算该值,节约了计算量。
令b[CTindex]=a0+a1CT      公式(10)
公式(7)计算过程可进一步简化为:
公式(11)所表示的第一子帧中代数码本的估计增益的计算过程可概述如下：首先，根据当前帧的分类参数CT的索引CTindex查表获得对数域中代数码本增益的线性估计值b[CTindex]，并通过以10为底的指数运算将其转换到线性域。之后，利用线性域中来自代数码本的代数码本矢量的能量Ec的平方根（表示为√Ec）去除线性域中代数码本增益的线性估计值，得到第一子帧中代数码本的估计增益gc0=10^b[CTindex]/√Ec。这样，在编解码计算第一子帧的代数码本的估计增益时，可以减少对数log运算和以10为底的指数运算这类复杂度高的运算，降低算法复杂度。
编码侧向解码侧可以传输以下编码参数:当前帧的分类参数CT的索引CTindex,获胜码本矢量[gp;γ]在增益码本中的索引。
图7示出了实施例二提供的第一子帧中代数码本增益的计算过程。
如图7所示，首先向查表模块701输入当前帧的分类参数CT的索引CTindex。然后，查表模块701根据索引CTindex查表获得对数域中代数码本增益的线性估计值（下面表示为b[CTindex]），并输出b[CTindex]到以10为底的指数计算器702获得线性域中代数码本增益的线性估计值10^b[CTindex]。接着，输出线性域中代数码本增益的线性估计值到乘法器703。并且，依次通过平方求和器704、平方根计算器705计算来自代数码本的代数码本矢量的能量的平方根值√Ec：平方求和器704通过上面公式(5)计算得到Ec。之后输出√Ec的倒数值至乘法器703，用以去除10^b[CTindex]进而获得代数码本的估计增益gc0。然后，输出代数码本的估计增益gc0至乘法器706，用以乘以来自增益码本的校正因子γ得到代数码本的量化增益gc。代数码本的量化增益gc可进一步输出到乘法器707，用以乘以来自代数码本的代数码本矢量（滤波后的代数激励）得到激励的代数码本贡献。最终，激励的代数码本贡献输出到加法器708，用以和激励的自适应码本贡献相加得到总的滤波后的激励。
其中,激励的自适应码本贡献由输出到乘法器709的来自自适应码本的自适应码本矢量(滤波后的自适应激励)乘以来自增益码本的自适应码本的量化增益得到。自适应码本的量化增益直接从增益码本中选择,具体基于最小均方差MMSE原则搜索增益码本,参考上面公式(2)。
实施例三
实施例三提供的第一子帧中代数码本增益的计算过程,对第一子帧中代数码本的估计增益的计算公式也进行了如下优化(同上述公式(7)),以实现复杂度的降低:
与前面实施例不同的是,并不将线性域中代数码本增益的线性估计值或对数域中代数码本增益的线性估计值a0+a1CT的取值提前通过表格枚举出来,而是通过以10为底的指数运算得到
公式(7)所表示的第一子帧中代数码本的估计增益的计算过程可概述如下：首先，根据当前帧的分类参数CT计算得到对数域中代数码本增益的线性估计值a0+a1CT。之后，通过以10为底的指数运算将对数域中代数码本增益的线性估计值a0+a1CT转换到线性域中，得到10^(a0+a1CT)。利用线性域中来自代数码本的代数码本矢量的能量Ec的平方根（表示为√Ec）去除线性域中代数码本增益的线性估计值10^(a0+a1CT)，得到第一子帧中代数码本的估计增益gc0。这样，在编解码计算第一子帧的代数码本的估计增益时，可以避免部分因式涉及的对数log运算和以10为底的指数运算，降低了算法复杂度。
编码侧向解码侧可以传输以下编码参数:当前帧的帧类型CT,线性估计常数a0、a1,获胜码本矢量[gp;γ]在增益码本中的索引。
图8示出了实施例三提供的第一子帧中代数码本增益的计算过程。
如图8所示，首先向线性估计模块801输入当前帧的分类参数CT。然后，线性估计模块801根据参数CT计算出对数域中代数码本增益的线性估计值a0+a1CT，并输出a0+a1CT到以10为底的指数计算器802获得线性域中代数码本增益的线性估计值10^(a0+a1CT)。接着，输出线性域中代数码本增益的线性估计值到乘法器803。并且，依次通过平方求和器804、平方根计算器805计算来自代数码本的代数码本矢量的能量的平方根值√Ec：平方求和器804通过上面公式(5)计算得到Ec。之后输出√Ec的倒数值至乘法器803，用以去除10^(a0+a1CT)进而获得代数码本的估计增益gc0。然后，输出代数码本的估计增益gc0至乘法器806，用以乘以来自增益码本的校正因子γ得到代数码本的量化增益gc。代数码本的量化增益gc可进一步输出到乘法器807，用以乘以来自代数码本的代数码本矢量（滤波后的代数激励）得到激励的代数码本贡献。最终，激励的代数码本贡献输出到加法器808，用以和激励的自适应码本贡献相加得到总的滤波后的激励。
其中,激励的自适应码本贡献由输出到乘法器809的来自自适应码本的自适应码本矢量(滤波后的自适应激励)乘以来自增益码本的自适应码本的量化增益得到。自适应码本的量化增益直接从增益码本中选择,具体基于最小均方差MMSE原则搜索增益码本,参考上面公式(2)。
实施例四
实施例四提供的第一子帧中代数码本增益的计算过程,对第一子帧中代数码本的估计增益的计算公式进行了如下优化,以实现复杂度的降低:
可以和实施例一相同的地方是,由于线性域中代数码本增益的线性估计值只与分类参数CT以及对数域中的估计常数a0、a1有关,因此,的取值可以提前通过表格枚举出来。表示线性域中代数码本增益的线性估计值。编解码器可维护着一张映射表b,表b中每一个条目包括两个值:分类参数CT的索引CTindex和线性域中代数码本增益的线性估计值这样,编解码器可直接查表获得当前帧的参数CT对应的的值,避免编解码器运行时再计算,从而降低了算法复杂度。
公式(12)计算过程可进一步简化为:
公式(13)所表示的第一子帧中代数码本的估计增益的计算过程可概述如下：首先，根据当前帧的分类参数CT的索引CTindex查表获得线性域中代数码本增益的线性估计值b[CTindex]。并且，对来自代数码本的代数码本矢量的能量参数log10(√Ec)取相反数，对该相反数进行以10为底的指数运算得到10^(-log10(√Ec))。最后，将线性域中代数码本增益的线性估计值b[CTindex]乘以10^(-log10(√Ec))得到代数码本的估计增益gc0。这样，在编解码计算第一子帧的代数码本的估计增益时，可以避免代数码本增益的线性估计值涉及的以10为底的指数运算，降低了算法复杂度。
编码侧向解码侧可以传输以下编码参数:当前帧的分类参数CT的索引CTindex,获胜码本矢量[gp;γ]在增益码本中的索引。
图9示出了实施例四提供的第一子帧中代数码本增益的计算过程。
如图9所示，首先向查表模块901输入当前帧的分类参数CT的索引CTindex。然后，查表模块901根据索引CTindex查表获得线性域中代数码本增益的线性估计值（下面表示为b[CTindex]），并输出b[CTindex]到乘法器902。并且，依次通过平方求和器903、平方根计算器904计算来自代数码本的代数码本矢量的能量的平方根值√Ec：平方求和器903通过上面公式(5)计算得到Ec。之后将√Ec输出到计算器905，用以对√Ec进行以10为底的对数log运算后取相反数，进而得到-log10(√Ec)。之后输出到计算器906，用以对-log10(√Ec)进行以10为底的指数运算得到10^(-log10(√Ec))，再输出到乘法器902，用以乘以b[CTindex]进而获得代数码本的估计增益gc0。然后，输出代数码本的估计增益gc0至乘法器907，用以乘以来自增益码本的校正因子γ得到代数码本的量化增益gc。代数码本的量化增益gc可进一步输出到乘法器908，用以乘以来自代数码本的代数码本矢量（滤波后的代数激励）得到激励的代数码本贡献。最终，激励的代数码本贡献输出到加法器909，用以和激励的自适应码本贡献相加得到总的滤波后的激励。
其中,激励的自适应码本贡献由输出到乘法器910的来自自适应码本的自适应码本矢量(滤波后的自适应激励)乘以来自增益码本的自适应码本的量化增益得到。自适应码本的量化增益直接从增益码本中选择,具体基于最小均方差MMSE原则搜索增益码本,参考上面公式(2)。
实施例四中,也可以令b[CTindex]=a0+a1CT,即通过表b仅将a0+a1CT的取值提前通过表格枚举出来。此时,公式(12)计算过程可进一步简化为:
公式(14)所表示的第一子帧中代数码本的估计增益的计算过程可概述如下：首先，根据当前帧的分类参数CT的索引CTindex查表获得对数域中代数码本增益的线性估计值。并且，在对数域中从代数码本增益的线性估计值a0+a1CT中减去来自代数码本的代数码本矢量的能量参数log10(√Ec)，得到a0+a1CT-log10(√Ec)。最后，对其进行以10为底的指数运算得到代数码本的估计增益gc0。公式(14)所表示的计算过程能通过查表获得对数域中代数码本增益的线性估计值，而无需编解码器计算，节约了计算量。
以上实施例中，当前帧的分类参数CT的数值可以依据信号类型来选择。例如，对于窄带信号的清音、浊音、一般或过渡帧，将参数CT的数值分别设置成1,3,5和7，而对于宽带信号，将它们分别设置成0,2,4和6。下面会介绍信号分类，这里先不展开。
信号类型
用于确定帧的分类(参数CT)的方法可以不同。例如,仅按照浊音或清音地进行基础分类。又例如,可以增加强浊音或强清音那样的更多类别。
信号分类可以通过以下三个步骤进行。首先,语音活动检测器(speech active detector,SAD)区分有效和无效语音帧。如果检测到无效语音帧,例如本底噪声(ground noise),则分类终止,利用舒适噪声生成(comfort noise generator,CNG)编码帧。如果检测到有效语音帧,则对该帧实施进一步分类,以区分出清音帧。如果该帧进一步被分类成清音信号,则分类终止,使用最适合清音信号的编码方法编码该帧。否则,再进一步确定该帧是否为稳定浊音。如果该帧被分类成稳定浊音帧,则使用最适合稳定浊音信号的编码方法编码该帧。否则,该帧有可能包含像浊音发端或迅速演变浊音信号那样的非稳定信号段。
清音信号的区分可以基于以下参数：发声度量、平均频谱倾斜、低水平上的最大短时能量增量dE0和最大短时能量偏差dE。本申请对区分出清音信号的算法不做限制，可例如采用如下文献中提及的算法：Jelinek, M., et al., "Advances in source-controlled variable bitrate wideband speech coding", Special Workshop in MAUI (SWIM); Lectures by masters in speech processing, Maui, January 12-24, 2004，在此通过引用将其全部内容并入本文中。
如果一个帧未被分类成无效帧或清音帧，则测试其是否为稳定浊音帧。稳定浊音帧的区分可以基于以下参数：每个子帧（具有1/4子样本分辨率）的归一化相关性、平均频谱倾斜，和所有子帧（具有1/4子样本分辨率）的开环音调估计。本申请对区分稳定浊音帧的算法也不做限制。
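上述三步分类流程可以用如下示意代码表达其分级判决结构。其中的特征与阈值均为假设值，并非本申请或EVS实际采用的判决参数。

```python
# 信号分类级联判决的示意实现：SAD → 清音判别 → 稳定浊音判别。
def classify_frame(energy, voicing, tilt, sad_thresh=0.01):
    if energy < sad_thresh:
        return "INACTIVE"        # 无效帧：走舒适噪声生成(CNG)
    if voicing < 0.3 and tilt < 0.0:
        return "UNVOICED"        # 清音帧：用最适合清音的方式编码
    if voicing > 0.7:
        return "VOICED"          # 稳定浊音帧
    return "GENERIC"             # 过渡/一般帧

labels = [classify_frame(0.001, 0.5, 0.1),
          classify_frame(0.5, 0.1, -0.5),
          classify_frame(0.5, 0.9, 0.2),
          classify_frame(0.5, 0.5, 0.2)]
```

判决得到的类别即对应为当前帧选择的编码模式参数CT。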
以上实施例中,除了分类参数CT,当前帧的第一子帧中代数码本的线性估计增益还与估计常数ai有关。估计常数ai可以通过大样本的数据训练确定。
估计常数的计算
大样本的训练数据可包含不同语言、不同性别、不同年龄、不同环境等等的多样的大量语音信号。并且,假设该大样本的训练数据包括N+1个帧。
通过在大样本的训练数据中的所有帧上使代数码本的估计增益与对数域中的最佳增益之间的均方误差最小来寻找估计系数。
对于第n帧中的第一子帧,均方误差能量通过下式给出:
其中,在第n帧中的第一子帧中,对数域中代数码本的估计增益通过下面公式得出:
其中,i=0,...,L-1。L表示输入语音信号的L个样本。Ei(n)表示第n帧中来自代数码本的代数码本矢量的能量,可参考上面公式(5)计算。
代入公式(16)后,公式(15)变为:
在上面的公式(17)中,表示第一子帧中的最佳代数码本增益,其可以通过下面公式(18)和(19)计算得到:
其中,常数或相关性系数c0,c1,c2,c3,c4和c5通过下面公式计算得到:
c0=yᵀy, c1=xᵀy, c2=zᵀz, c3=xᵀz, c4=yᵀz, c5=xᵀx    公式(19)
通过在对数域中定义代数码本的归一化增益来简化最小均方误差MSE的计算过程:
其中,
上面定义的最小均方误差MSE的解(即估计常数a0、a1的最佳值)通过如下一对偏导数求出:
至此，估计常数a0、a1的最佳值便能确定出。这里不提供估计常数a0、a1的最佳值的表达式（即不展示公式(21)的求解），因为表达式比较复杂。实际应用中，可以通过MATLAB等计算软件提前计算出其值。
对于第n帧中的第二子帧及以后子帧,均方误差能量通过下式给出:
其中,k≥2,的值同样通过上面公式(18)和(19)计算得到;表示在第k个子帧中对数域代数码本的估计增益。通过下面公式(23)计算得到:
为了求解实现最小均方误差MSE的估计常数b0,b1,...,bk的最佳值,类似第一子帧中的求解方法,可以对公式(22)求偏导得出。
至此,估计常数b0,b1,...,bk的最佳值便能确定出。这里不提供估计常数b0,b1,...,bk的最佳值的表达式,表达式比较复杂。实际应用中,可以通过MATLAB等计算软件提前计算出其值。
如图10所示，计算得到的估计常数可用于为每一个子帧设计增益码本，进而可如上述内容根据估计常数和增益码本确定每个子帧的增益量化。如上所述，代数码本增益的估计在每个子帧中稍有不同，估计系数通过最小均方误差MSE求出。增益码本可以参考下面文献中的KMEANS算法来设计：MacQueen, J.B. (1967), "Some Methods for classification and Analysis of Multivariate Observations", Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, pp. 281-297，在此通过引用将其全部内容并入本文中。
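按照上引文献的KMEANS思想设计增益码本的过程，可以用如下示意代码概括：交替进行最近质心分配与质心更新。其中训练样本[gp; γ]为随意生成的二维数据，初始质心的选取方式也是为演示而设。

```python
# KMEANS风格的增益码本设计示意：对训练得到的(gp, γ)样本对聚类，
# 聚类中心即构成增益码本的条目。
import numpy as np

def kmeans_codebook(samples, init_centers, iters=20):
    centers = init_centers.astype(float).copy()
    for _ in range(iters):
        # 分配：每个样本归入最近的质心
        d = np.linalg.norm(samples[:, None, :] - centers[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        # 更新：质心移动到所属样本的均值
        for j in range(len(centers)):
            if np.any(assign == j):
                centers[j] = samples[assign == j].mean(axis=0)
    return centers

rng = np.random.default_rng(2)
gp_gamma = np.vstack([rng.normal([0.3, 0.8], 0.05, size=(100, 2)),
                      rng.normal([0.9, 1.5], 0.05, size=(100, 2))])  # 两簇示例样本
codebook = kmeans_codebook(gp_gamma, gp_gamma[[0, -1]])  # 每簇取一个样本作初始质心
```

实际设计中，训练样本来自大样本语料上逐子帧求得的最优(gp, γ)对，码本规模由目标比特数决定。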
图11示出了本申请实施例提供的包括语音编码装置113和语音解码装置117的语音通信系统110。语音通信系统110支持通过通信信道115传输语音信号。通信信道115可以是无线通信链路,也可以是导线等形成的有线链路。通信信道115可以支持需要共享带宽资源的多个同时进行的语音通信。不限于装置与装置之间的通信,通信信道115还可以扩展至单个装置内部用于实现共享通信的存储单元,例如单个装置的存储单元记录和存储编码的语音信号以便之后播放。
在发射器端,话筒等语音采集装置111将语音转换为提供给模数(A/D)转换器112的模拟语音信号120。A/D转换器112的功能是将模拟语音信号120转换为数字语音信号121。语音编码装置113对数字语音信号121进行编码,以产生一组二进制形式的编码参数122,并通过通信部件将其传送到信道编码器114。信道编码器114对编码参数122执行添加冗余等信道编码操作形成比特流123,通过通信信道115对其进行传输。
在接收器端,信道解码器116对通过通信部件所接收的比特流124执行信道解码操作,例如利用比特流124中的冗余信息来检测和校正传输期间发生的信道错误。语音解码装置117将从信道解码器中接收的比特流125转换回编码参数,用以创建合成的语音信号126。在数模(D/A)转换器108中将在语音解码装置117中重构的合成语音信号126转换回模拟语音信号127。最后,通过扬声器单元119等声音播放装置播放模拟语音信号127。
语音编码装置113对语音信号进行编码得到编码参数、语音解码装置117利用比特流中携带的编码参数重构语音信号,可以参考以上内容,不再赘述。
发射器端的各个器件可以集成于一个电子装置中，接收器端的各个器件可以集成于另一个电子装置中。此时，这两个电子装置之间通过有线或无线链路形成的通信信道进行通信，例如传输编码参数。发射器端的各个器件和接收器端的各个器件也可以集成于同一个电子装置中。此时，在该电子装置内部，发射器端和接收器端之间通过共享的存储单元实现数据交换，即通信，例如传输编码参数。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (27)

  1. 一种编码声音信号的方法,应用于当前帧中的第一子帧,其特征在于,所述方法包括:
    接收所述当前帧的帧分类参数索引;
    根据所述当前帧的帧分类参数索引在第一映射表中查找所述线性域中代数码本增益的线性估计值;所述第一映射表中的每一个条目包括两个值:帧分类参数索引和线性域中代数码本增益的线性估计值;
    计算来自代数码本的代数码本矢量的能量;
    将所述线性域中代数码本增益的线性估计值除以所述代数码本矢量的能量的平方根,得到代数码本的估计增益;
    将所述代数码本的估计增益乘以校正因子得到代数码本的量化增益;所述校正因子来自获胜码本矢量,所述获胜码本矢量选自增益码本。
  2. 如权利要求1所述的方法,其特征在于,所述方法还包括:
    传输编码参数,所述编码参数包括:所述当前帧的帧分类参数索引,所述获胜码本矢量在所述增益码本中的索引。
  3. 一种编码声音信号的方法,应用于当前帧中的第一子帧,其特征在于,所述方法包括:
    接收所述当前帧的帧分类参数索引;
    根据所述当前帧的帧分类参数索引在第一映射表中查找对数域中代数码本增益的线性估计值;所述第一映射表中的每一个条目包括两个值:帧分类参数索引和对数域中代数码本增益的线性估计值;
    将所述对数域中代数码本增益的线性估计值通过指数运算转换到线性域中,得到所述线性域中代数码本增益的线性估计值;
    计算来自代数码本的代数码本矢量的能量;
    将所述线性域中代数码本增益的线性估计值除以所述代数码本矢量的能量的平方根,得到代数码本的估计增益;
    将所述代数码本的估计增益乘以校正因子得到代数码本的量化增益;所述校正因子来自获胜码本矢量,所述获胜码本矢量选自增益码本。
  4. 如权利要求3所述的方法,其特征在于,所述方法还包括:
    传输编码参数,所述编码参数包括:所述当前帧的帧分类参数索引,所述获胜码本矢量在所述增益码本中的索引。
  5. 一种编码声音信号的方法,应用于当前帧中的第一子帧,其特征在于,所述方法包括:
    利用所述第一子帧中的线性估计常数和所述当前帧的帧类型进行线性估计,得到对数域中代数码本增益的线性估计值;
    将所述对数域中代数码本增益的线性估计值通过指数运算转换到线性域中,得到所述线性域中代数码本增益的线性估计值;
    计算来自代数码本的代数码本矢量的能量;
    将所述线性域中代数码本增益的线性估计值除以所述代数码本矢量的能量的平方根,得到代数码本的估计增益;
    将所述代数码本的估计增益乘以校正因子得到代数码本的量化增益;所述校正因子来自获胜码本矢量,所述获胜码本矢量选自增益码本。
  6. 如权利要求5所述的方法,其特征在于,所述方法还包括:
    传输编码参数,所述编码参数包括:所述当前帧的帧类型,所述线性估计常数,所述获胜码本矢量在所述增益码本中的索引。
  7. 一种解码声音信号的方法,应用于当前帧中的第一子帧,其特征在于,所述方法包括:
    接收编码参数;所述编码参数包括:所述当前帧的帧分类参数索引,获胜码本矢量的索引;
    基于所述当前帧的帧分类参数索引在第一映射表中查找所述线性域中代数码本增益的线性估计值;所述第一映射表中的每一个条目包括两个值:帧分类参数索引和线性域中代数码本增益的线性估计值;
    计算来自代数码本的代数码本矢量的能量;
    将所述线性域中代数码本增益的线性估计值除以所述代数码本矢量的能量的平方根,得到代数码本的估计增益;
    将所述代数码本的估计增益乘以校正因子得到代数码本的量化增益,所述校正因子来自所述获胜码本矢量,所述获胜码本矢量是基于所述获胜码本矢量的索引选自于增益码本。
  8. 一种解码声音信号的方法,应用于当前帧中的第一子帧,其特征在于,所述方法包括:
    接收编码参数;所述编码参数包括:所述当前帧的帧分类参数索引,获胜码本矢量的索引;
    基于所述当前帧的帧分类参数索引在第一映射表中查找对数域中代数码本增益的线性估计值;所述第一映射表中的每一个条目包括两个值:帧分类参数索引和对数域中代数码本增益的线性估计值;
    将所述对数域中代数码本增益的线性估计值通过指数运算转换到线性域中,得到所述线性域中代数码本增益的线性估计值;
    计算来自代数码本的代数码本矢量的能量;
    将所述线性域中代数码本增益的线性估计值除以所述代数码本矢量的能量的平方根,得到代数码本的估计增益;
    将所述代数码本的估计增益乘以校正因子得到代数码本的量化增益,所述校正因子来自所述获胜码本矢量,所述获胜码本矢量是基于所述获胜码本矢量的索引选自于增益码本。
  9. 一种解码声音信号的方法,应用于当前帧中的第一子帧,其特征在于,所述方法包括:
    接收编码参数;所述编码参数包括:所述当前帧的帧类型,所述第一子帧中的线性估计常数,获胜码本矢量的索引;
    利用所述第一子帧中的线性估计常数和所述当前帧的帧类型进行线性估计,得到对数域中代数码本增益的线性估计值;
    将所述对数域中代数码本增益的线性估计值通过指数运算转换到线性域中,得到所述线性域中代数码本增益的线性估计值;
    计算来自代数码本的代数码本矢量的能量;
    将所述线性域中代数码本增益的线性估计值除以所述代数码本矢量的能量的平方根,得到代数码本的估计增益;
    将所述代数码本的估计增益乘以校正因子得到代数码本的量化增益,所述校正因子来自所述获胜码本矢量,所述获胜码本矢量是基于所述获胜码本矢量的索引选自于增益码本。
  10. 一种编码声音信号的方法,应用于当前帧中的第一子帧,其特征在于,所述方法包括:
    接收所述当前帧的帧分类参数索引;
    根据所述当前帧的帧分类参数索引在第一映射表中查找所述线性域中代数码本增益的线性估计值;所述第一映射表中的每一个条目包括两个值:帧分类参数索引和线性域中代数码本增益的线性估计值;
    计算来自代数码本的代数码本矢量的能量Ec，并对所述能量的平方根进行以10为底的对数运算，再对所述对数运算后的值取相反数后进行以10为底的指数运算得到10^(-log10(√Ec))；
    将线性域中代数码本增益的线性估计值乘以10^(-log10(√Ec))，得到代数码本的估计增益；
    将所述代数码本的估计增益乘以校正因子得到代数码本增益的量化增益,所述校正因子来自获胜码本矢量,所述获胜码本矢量选自增益码本。
  11. 如权利要求10所述的方法,其特征在于,所述方法还包括:
    传输编码参数,所述编码参数包括:所述当前帧的帧分类参数索引,所述获胜码本矢量在所述增益码本中的索引。
  12. 一种解码声音信号的方法,应用于当前帧中的第一子帧,其特征在于,所述方法包括:
    接收编码参数;所述编码参数包括:所述当前帧的帧分类参数索引,所述获胜码本矢量的索引;
    根据所述当前帧的帧分类参数索引在第一映射表中查找所述线性域中代数码本增益的线性估计值;所述第一映射表中的每一个条目包括两个值:帧分类参数索引和线性域中代数码本增益的线性估计值;
    计算来自代数码本的代数码本矢量的能量Ec，并对所述能量的平方根进行以10为底的对数运算，再对所述对数运算后的值取相反数后进行以10为底的指数运算得到10^(-log10(√Ec))；
    将线性域中代数码本增益的线性估计值乘以10^(-log10(√Ec))，得到代数码本的估计增益；
    将所述代数码本的估计增益乘以校正因子得到代数码本增益的量化增益,所述校正因子来自获胜码本矢量,所述获胜码本矢量是基于所述获胜码本矢量的索引选自于增益码本。
  13. 如权利要求1-12中任一项所述的方法,其特征在于,所述方法还包括:
    利用所述代数码本增益的量化增益乘以来自代数码本的代数码本矢量,得到代数码本的激励贡献;
    利用选自增益码本的获胜码本矢量包括的自适应码本的量化增益乘以来自自适应码本的自适应码本矢量,得到自适应码本的激励贡献;
    将所述代数码本的激励贡献与所述自适应码本的激励贡献相加,得到总的激励。
  14. 一种语音通信系统,其特征在于,包括:第一装置、第二装置,其中:所述第一装置用于执行权利要求1-6,10-11,13中任一项所述方法,所述第二装置用于执行权利要求7-9,12-13中任一项所述方法。
  15. 一种具有语音编码功能的装置,其特征在于,包括:
    查找部件,用于根据当前帧的帧分类参数索引在第一映射表中查找所述线性域中代数码本增益的线性估计值;所述第一映射表中的每一个条目包括两个值:帧分类参数索引和线性域中代数码本增益的线性估计值;
    第一计算器,用于计算来自代数码本的代数码本矢量的能量;
    第一乘法器,用于将所述线性域中代数码本增益的线性估计值乘以所述代数码本矢量的能量的平方根的倒数,得到代数码本的估计增益;
    第二乘法器,用于将所述代数码本的估计增益乘以获胜码本矢量包括的校正因子,得到代数码本的量化增益;所述获胜码本矢量选自增益码本。
  16. 如权利要求15所述的装置,其特征在于,还包括:通信部件,所述通信部件用于传输编码参数,所述编码参数包括:所述当前帧的帧分类参数索引,所述获胜码本矢量在所述增益码本中的索引。
  17. 一种具有语音编码功能的装置,其特征在于,包括:
    查找部件,用于根据所述当前帧的帧分类参数索引在第一映射表中查找对数域中代数码本增益的线性估计值;所述第一映射表中的每一个条目包括两个值:帧分类参数索引和对数域中代数码本增益的线性估计值;
    转换器,用于将所述对数域中代数码本增益的线性估计值通过指数运算转换到线性域中,得到所述线性域中代数码本增益的线性估计值;
    第一计算器,用于计算来自代数码本的代数码本矢量的能量;
    第一乘法器,用于将所述线性域中代数码本增益的线性估计值乘以所述代数码本矢量的能量的平方根的倒数,得到代数码本的估计增益;
    第二乘法器,用于将所述代数码本的估计增益乘以获胜码本矢量包括的校正因子,得到代数码本的量化增益;所述获胜码本矢量选自增益码本。
  18. 如权利要求15所述的装置,其特征在于,还包括:通信部件,所述通信部件用于传输编码参数,所述编码参数包括:所述当前帧的帧分类参数索引,所述获胜码本矢量在所述增益码本中的索引。
  19. 一种具有语音编码功能的装置,其特征在于,包括:
    线性预测部件,用于利用所述第一子帧中的线性估计常数和所述当前帧的帧类型进行线性估计,得到对数域中代数码本增益的线性估计值;
    转换器,用于将所述对数域中代数码本增益的线性估计值通过指数运算转换到线性域中,得到所述线性域中代数码本增益的线性估计值;
    第一计算器,用于计算来自代数码本的代数码本矢量的能量;
    第一乘法器,用于将所述线性域中代数码本增益的线性估计值乘以所述代数码本矢量的能量的平方根的倒数,得到代数码本的估计增益;
    第二乘法器,用于将所述代数码本的估计增益乘以获胜码本矢量包括的校正因子,得到代数码本的量化增益;所述获胜码本矢量选自增益码本。
  20. 如权利要求19所述的装置,其特征在于,还包括:通信部件,所述通信部件用于传输编码参数,所述编码参数包括:所述当前帧的帧类型,所述线性估计常数,所述获胜码本矢量在所述增益码本中的索引。
  21. 一种具有语音解码功能的装置,其特征在于,包括:
    通信部件,所述通信部件用于接收编码参数;所述编码参数包括:所述当前帧的帧分类参数索引,获胜码本矢量的索引;
    第一查找部件,用于基于所述当前帧的帧分类参数索引在第一映射表中查找所述线性域中代数码本增益的线性估计值;所述第一映射表中的每一个条目包括两个值:帧分类参数索引和线性域中代数码本增益的线性估计值;
    第一计算器,用于计算来自代数码本的代数码本矢量的能量;
    第一乘法器,用于将所述线性域中代数码本增益的线性估计值乘以所述代数码本矢量的能量的平方根的倒数,得到代数码本的估计增益;
    第二乘法器,用于将所述代数码本的估计增益乘以校正因子得到代数码本的量化增益,所述校正因子来自所述获胜码本矢量;
    第二查找部件,用于基于所述获胜码本矢量的索引从增益码本查找出所述获胜码本矢量。
  22. 一种具有语音解码功能的装置,其特征在于,包括:
    通信部件,所述通信部件用于接收编码参数;所述编码参数包括:所述当前帧的帧分类参数索引,获胜码本矢量的索引;
    第一查找部件,用于基于所述当前帧的帧分类参数索引在第一映射表中查找对数域中代数码本增益的线性估计值;所述第一映射表中的每一个条目包括两个值:帧分类参数索引和对数域中代数码本增益的线性估计值;
    转换器,用于将所述对数域中代数码本增益的线性估计值通过指数运算转换到线性域中,得到所述线性域中代数码本增益的线性估计值;
    第一计算器,用于计算来自代数码本的代数码本矢量的能量;
    第一乘法器,用于将所述线性域中代数码本增益的线性估计值乘以所述代数码本矢量的能量的平方根的倒数,得到代数码本的估计增益;
    第二乘法器,用于将所述代数码本的估计增益乘以校正因子得到代数码本的量化增益,所述校正因子来自所述获胜码本矢量;
    第二查找部件,用于基于所述获胜码本矢量的索引从增益码本查找出所述获胜码本矢量。
  23. 一种具有语音解码功能的装置,其特征在于,包括:
    通信部件,所述通信部件用于接收编码参数;所述编码参数包括:所述当前帧的帧类型, 所述第一子帧中的线性估计常数,获胜码本矢量的索引;
    线性预测部件,用于利用所述第一子帧中的线性估计常数和所述当前帧的帧类型进行线性估计,得到对数域中代数码本增益的线性估计值;
    转换器,用于将所述对数域中代数码本增益的线性估计值通过指数运算转换到线性域中,得到所述线性域中代数码本增益的线性估计值;
    第一计算器,用于计算来自代数码本的代数码本矢量的能量;
    第一乘法器,用于将所述线性域中代数码本增益的线性估计值乘以所述代数码本矢量的能量的平方根的倒数,得到代数码本的估计增益;
    第二乘法器,用于将所述代数码本的估计增益乘以校正因子得到代数码本的量化增益,所述校正因子来自所述获胜码本矢量;
    查找部件,用于基于所述获胜码本矢量的索引从增益码本查找出所述获胜码本矢量。
  24. 一种具有语音编码功能的装置,其特征在于,包括:
    第一查找部件,用于根据所述当前帧的帧分类参数索引在第一映射表中查找所述线性域中代数码本增益的线性估计值;所述第一映射表中的每一个条目包括两个值:帧分类参数索引和线性域中代数码本增益的线性估计值;
    第一计算器,用于计算来自代数码本的代数码本矢量的能量Ec;
    对数运算器,用于对所述能量的平方根进行以10为底的对数运算;
    指数运算器，用于对所述对数运算后的值取相反数后进行以10为底的指数运算，得到10^(-log10(√Ec))；
    第一乘法器，用于将线性域中代数码本增益的线性估计值乘以10^(-log10(√Ec))，得到代数码本的估计增益；
    第二乘法器,用于将所述代数码本的估计增益乘以校正因子得到代数码本增益的量化增益,所述校正因子来自获胜码本矢量,所述获胜码本矢量选自增益码本。
  25. 如权利要求24所述的装置,其特征在于,还包括:通信部件,所述通信部件用于传输编码参数,所述编码参数包括:所述当前帧的帧分类参数索引,所述获胜码本矢量在所述增益码本中的索引。
  26. 一种具有语音解码功能的装置,其特征在于,包括:
    通信部件,用于接收编码参数;所述编码参数包括:所述当前帧的帧分类参数索引,所述获胜码本矢量在所述增益码本中的索引;
    第一查找部件,用于根据所述当前帧的帧分类参数索引在第一映射表中查找所述线性域中代数码本增益的线性估计值;所述第一映射表中的每一个条目包括两个值:帧分类参数索引和线性域中代数码本增益的线性估计值;
    第一计算器,用于计算来自代数码本的代数码本矢量的能量Ec;
    对数运算器,用于对所述能量的平方根进行以10为底的对数运算;
    指数运算器，用于对所述对数运算后的值取相反数后进行以10为底的指数运算，得到10^(-log10(√Ec))；
    第一乘法器，用于将线性域中代数码本增益的线性估计值乘以10^(-log10(√Ec))，得到代数码本的估计增益；
    第二乘法器,用于将所述代数码本的估计增益乘以校正因子得到代数码本增益的量化增益,所述校正因子来自获胜码本矢量;
    第二查找部件,用于基于所述获胜码本矢量的索引从增益码本查找出所述获胜码本矢量。
  27. 如权利要求15-26中任一项所述的装置,其特征在于,还包括:
    第三乘法器,用于利用所述代数码本增益的量化增益乘以来自代数码本的代数码本矢量,得到代数码本的激励贡献;
    第四乘法器,用于利用选自增益码本的获胜码本矢量包括的自适应码本的量化增益乘以来自自适应码本的自适应码本矢量,得到自适应码本的激励贡献;
    加法器,用于将所述代数码本的激励贡献与所述自适应码本的激励贡献相加,得到总的激励。
PCT/CN2023/092547 2022-07-29 2023-05-06 声音编解码方法以及相关装置、系统 WO2024021747A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210908196.9 2022-07-29
CN202210908196.9A CN116052700B (zh) 2022-07-29 2022-07-29 声音编解码方法以及相关装置、系统

Publications (1)

Publication Number Publication Date
WO2024021747A1 true WO2024021747A1 (zh) 2024-02-01

Family

ID=86127993

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/092547 WO2024021747A1 (zh) 2022-07-29 2023-05-06 声音编解码方法以及相关装置、系统

Country Status (2)

Country Link
CN (2) CN117476022A (zh)
WO (1) WO2024021747A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117476022A (zh) * 2022-07-29 2024-01-30 荣耀终端有限公司 声音编解码方法以及相关装置、系统

Citations (7)

Publication number Priority date Publication date Assignee Title
KR20010073396A (ko) * 2000-01-14 2001-08-01 Seo Seung-Mo Fast search method for the LSP codebook of a speech coder
CN101118748A (zh) * 2006-08-04 2008-02-06 Beijing University of Technology Algebraic codebook search method and apparatus, and speech coder
CN101266795A (zh) * 2007-03-12 2008-09-17 Huawei Technologies Co., Ltd. Method and apparatus for implementing lattice vector quantization coding and decoding
CN101572092A (zh) * 2008-04-30 2009-11-04 Huawei Technologies Co., Ltd. Fixed codebook excitation search method and apparatus for the coding and decoding ends
CN103392203A (zh) * 2011-02-15 2013-11-13 VoiceAge Corporation Device and method for quantizing the gains of the adaptive and fixed contributions of the excitation in a code-excited linear prediction codec
CN104517612A (zh) * 2013-09-30 2015-04-15 Shanghai Ailiao Information Technology Co., Ltd. Variable bit-rate coder and decoder based on AMR-NB speech signals, and coding and decoding methods thereof
CN116052700A (zh) * 2022-07-29 2023-05-02 Honor Device Co., Ltd. Sound coding and decoding method, and related apparatus and system

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US6947888B1 (en) * 2000-10-17 2005-09-20 Qualcomm Incorporated Method and apparatus for high performance low bit-rate coding of unvoiced speech
EP1464047A4 (en) * 2002-01-08 2005-12-07 Dilithium Networks Pty Ltd TRANSCODE SCHEME BETWEEN CELP-BASED LANGUAGE CODES
JP3981399B1 (ja) * 2006-03-10 2007-09-26 Matsushita Electric Industrial Co., Ltd. Fixed codebook search apparatus and fixed codebook search method
US9275644B2 (en) * 2012-01-20 2016-03-01 Qualcomm Incorporated Devices for redundant frame coding and decoding


Non-Patent Citations (2)

Title
JELINEK, M.: "Advances in source-controlled variable bitrate wideband speech coding", Special Workshop in Maui (SWIM)
MACQUEEN, J.: "Some Methods for Classification and Analysis of Multivariate Observations", Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, 1967, pages 281-297

Also Published As

Publication number Publication date
CN116052700A (zh) 2023-05-02
CN117476022A (zh) 2024-01-30
CN116052700B (zh) 2023-09-29

Similar Documents

Publication Publication Date Title
JP6316398B2 (ja) Apparatus and method for quantizing the gains of the adaptive and fixed contributions of the excitation signal in a CELP codec
US8346544B2 (en) Selection of encoding modes and/or encoding rates for speech compression with closed loop re-decision
JP3254687B2 (ja) Speech coding method
JPH0736118B2 (ja) Speech compression apparatus using CELP
JP2004526213A (ja) Method and system for line spectral frequency vector quantization in a speech codec
JPH08263099A (ja) Coding apparatus
JPH04270400A (ja) Speech coding method
US20070219787A1 (en) Selection of encoding modes and/or encoding rates for speech compression with open loop re-decision
WO2024021747A1 (zh) Sound coding and decoding method, and related apparatus and system
CA2342353C (en) An adaptive criterion for speech coding
EP1672619A2 (en) Speech coding apparatus and method therefor
JP2002268686A (ja) Speech coding apparatus and speech decoding apparatus
JP3684751B2 (ja) Signal coding method and apparatus
US10115408B2 (en) Device and method for quantizing the gains of the adaptive and fixed contributions of the excitation in a CELP codec
Ozerov et al. Flexible quantization of audio and speech based on the autoregressive model
JP3194930B2 (ja) Speech coding apparatus
JP3089967B2 (ja) Speech coding apparatus
JP3192051B2 (ja) Speech coding apparatus
JPH09269799A (ja) Speech coding circuit with a noise suppression function
KR100318335B1 (ko) Method for improving pitch post-filter performance in a speech decoder through energy-level normalization of the residual signal
JPH08160996A (ja) Speech coding apparatus
Chui et al. A hybrid input/output spectrum adaptation scheme for LD-CELP coding of speech
JPH07334195A (ja) Speech coding apparatus with variable subframe length
JPH05341800A (ja) Speech coding apparatus
JPH04243300A (ja) Speech coding method

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23844962

Country of ref document: EP

Kind code of ref document: A1