CN112927703A - Method and apparatus for quantizing linear prediction coefficients and method and apparatus for dequantizing linear prediction coefficients - Google Patents
- Publication number: CN112927703A
- Application number: CN202110189590.7
- Authority
- CN
- China
- Prior art keywords: vector, prediction, quantization, quantizer, quantized
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G10L19/04 — Speech or audio signal analysis-synthesis coding using predictive techniques
- G10L19/06 — Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
- G10L19/07 — Line spectrum pair [LSP] vocoders
- G10L19/022 — Blocking, i.e. grouping of samples in time; choice of analysis windows; overlap factoring
- G10L19/032 — Quantisation or dequantisation of spectral components
- G10L19/038 — Vector quantisation, e.g. TwinVQ audio
- G10L2019/0004 — Design or structure of the codebook
- G10L2019/0016 — Codebook for LPC parameters
Abstract
A quantization apparatus comprising: a trellis-structured vector quantizer that quantizes a first error vector between an N-dimensional sub-vector (where N is two or more) and a first prediction vector; and an intra predictor that generates the first prediction vector from a quantized N-dimensional sub-vector, wherein the intra predictor uses a prediction coefficient comprising an N×N matrix and performs intra prediction using the quantized N-dimensional sub-vector of a previous stage.
Description
Divisional application statement
This application is a divisional application of patent application No. 201580037280.6, entitled "method and apparatus for quantizing and method and apparatus for dequantizing linear prediction coefficients", filed on May 7, 2015.
Technical Field
One or more exemplary embodiments relate to quantization and inverse quantization of linear prediction coefficients, and more particularly, to a method and apparatus for efficiently quantizing linear prediction coefficients with low complexity, and a method and apparatus for inverse quantization.
Background
In a system for encoding a sound, such as speech or audio, Linear Predictive Coding (LPC) coefficients are used to represent the short-term frequency characteristics of the sound. The LPC coefficients are obtained by dividing the input sound into frames and minimizing the energy of the prediction error for each frame. However, the LPC coefficients have a large dynamic range, and the characteristics of the LPC filter are very sensitive to quantization errors in the LPC coefficients, so the stability of the filter cannot be guaranteed.
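The frame-wise minimization of prediction-error energy described above is classically carried out with the Levinson-Durbin recursion on the frame's autocorrelation. The sketch below is illustrative only (function name and windowing choices are assumptions, not the patent's implementation):

```python
import numpy as np

def lpc_levinson_durbin(frame, order):
    """Compute LPC coefficients a = [1, a1, ..., ap] for one frame by
    minimizing prediction-error energy (illustrative sketch)."""
    n = len(frame)
    # Autocorrelation of the (optionally windowed) frame.
    r = np.array([float(np.dot(frame[:n - k], frame[k:]))
                  for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]  # prediction-error energy, reduced at each stage
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a_prev = a.copy()
        a[i] = k
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        err *= 1.0 - k * k
    return a, err
```

For a purely exponential frame y[t] = 0.5**t, a first-order analysis recovers the generating coefficient, a[1] ≈ -0.5.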
Therefore, the LPC coefficients are quantized after being converted into other coefficients for which filter stability is easy to verify, interpolation is advantageous, and the quantization characteristics are good. Typically, the LPC coefficients are converted into Line Spectral Frequencies (LSFs) or Immittance Spectral Frequencies (ISFs) and then quantized. In particular, a scheme that quantizes LSF coefficients can exploit their high inter-frame correlation in the frequency and time domains, thereby increasing the quantization gain.
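The LPC-to-LSF conversion mentioned above can be sketched via the sum and difference polynomials P(z) = A(z) + z^-(p+1)·A(z^-1) and Q(z) = A(z) - z^-(p+1)·A(z^-1), whose roots lie on the unit circle for a stable A(z); the LSFs are the root angles in (0, π). `lpc_to_lsf` is a hypothetical helper using generic root-finding, not a codec's fixed-point routine:

```python
import numpy as np

def lpc_to_lsf(a):
    """Convert LPC coefficients a = [1, a1, ..., ap] to line spectral
    frequencies in radians (illustrative, root-finding based)."""
    a_ext = np.concatenate([np.asarray(a, dtype=float), [0.0]])
    P = a_ext + a_ext[::-1]   # sum polynomial
    Q = a_ext - a_ext[::-1]   # difference polynomial
    roots = np.concatenate([np.roots(P), np.roots(Q)])
    ang = np.angle(roots)
    # Keep one angle per conjugate pair; drop the trivial roots at z = +/-1.
    return np.sort(ang[(ang > 1e-6) & (ang < np.pi - 1e-6)])
```

For the trivial predictor A(z) = 1 with p = 2, P(z) = z^3 + 1 and Q(z) = z^3 - 1, giving the two LSFs π/3 and 2π/3.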
The LSF coefficients represent the short-term frequency characteristics of the sound, so for a frame in which the frequency characteristics of the input sound change sharply, the LSF coefficients of that frame also change sharply. However, a quantizer that includes an inter-frame predictor exploiting the high inter-frame correlation of LSF coefficients cannot perform proper prediction for such sharply changing frames, and its quantization performance is degraded. Therefore, an optimized quantizer needs to be selected according to the signal characteristics of each frame of the input sound.
Disclosure of Invention
Technical problem
One or more exemplary embodiments include a method and apparatus for efficiently quantizing Linear Predictive Coding (LPC) coefficients with low complexity and a method and apparatus for inverse quantization.
Technical scheme
According to one or more exemplary embodiments, a quantization apparatus includes: a trellis-structured vector quantizer configured to quantize a first error vector between a first prediction vector and an N-dimensional sub-vector, where N is a natural number greater than or equal to 2; and an intra predictor configured to generate the first prediction vector from a quantized N-dimensional sub-vector, wherein the intra predictor is configured to perform intra prediction using a prediction coefficient having an N × N matrix and the quantized N-dimensional sub-vector of a previous stage.
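The intra-prediction loop just described can be sketched as follows: the prediction for the current sub-vector is an N × N matrix applied to the *quantized* sub-vector of the previous stage, and only the error vector is quantized. This is a minimal illustration with a stand-in `quantize` callable in place of the trellis-structured vector quantizer; names are illustrative, not from the patent:

```python
import numpy as np

def intra_predictive_quantize(subvectors, A, quantize):
    """Quantize a sequence of N-dimensional sub-vectors, predicting each
    one from the quantized sub-vector of the previous stage via A (N x N)."""
    quantized = []
    prev = np.zeros_like(subvectors[0])
    for z in subvectors:
        p = A @ prev               # first prediction vector
        e_hat = quantize(z - p)    # first error vector -> TCVQ in the patent
        z_hat = p + e_hat          # reconstructed sub-vector of this stage
        quantized.append(z_hat)
        prev = z_hat
    return np.array(quantized)
```

With a lossless identity stand-in for `quantize`, the reconstruction is exact, which makes the feedback structure easy to verify.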
The apparatus may further include a vector quantizer configured to quantize a quantization error of the N-dimensional sub-vector.
The apparatus may further include an inter-frame predictor configured to generate a prediction vector of a current frame from a quantized N-dimensional sub-vector of a previous frame, when the trellis-structured vector quantizer is configured to quantize a second error vector corresponding to a difference between a prediction error vector and a second prediction vector, the prediction error vector being obtained from the N-dimensional sub-vector of the current frame and the prediction vector.
The apparatus may further include an inter-frame predictor configured to generate a prediction vector of a current frame from a quantized N-dimensional sub-vector of a previous frame, and a vector quantizer configured to quantize a quantization error of a prediction error vector, when the trellis-structured vector quantizer is configured to quantize a second error vector corresponding to a difference between the prediction error vector and a second prediction vector, the prediction error vector being obtained from the N-dimensional sub-vector of the current frame and the prediction vector.
According to one or more exemplary embodiments, a quantization apparatus includes: a first quantization module that performs quantization without inter-frame prediction; and a second quantization module that performs quantization with inter-frame prediction, wherein the first quantization module comprises: a first trellis-structured vector quantizer configured to quantize a first error vector between a first prediction vector and an N-dimensional sub-vector, where N is a natural number greater than or equal to 2; and a first intra predictor configured to generate the first prediction vector from a quantized N-dimensional sub-vector, wherein the first intra predictor is configured to perform intra prediction using a prediction coefficient having an N × N matrix and the quantized N-dimensional sub-vector of a previous stage.
The apparatus may further include an error vector quantizer configured to generate a quantized quantization error vector by quantizing a quantization error vector corresponding to a difference between the quantized N-dimensional linear vector of a current stage and the input N-dimensional linear vector.
When the vector quantizer is configured to quantize a prediction error vector between the N-dimensional linear vector of the current stage and the prediction vector of the current frame, the intra predictor may be configured to generate a prediction vector from the quantized prediction error vector.
When the vector quantizer is configured to quantize the prediction error vector between the N-dimensional linear vector of the current stage and the prediction vector of the current frame, the apparatus may further include an error vector quantizer configured to quantize a quantization error of the prediction error vector.
According to one or more exemplary embodiments, an inverse quantization apparatus includes: a trellis-structured vector inverse quantizer configured to inverse-quantize a first quantization index of an N-dimensional sub-vector, where N is a natural number greater than or equal to 2; and an intra predictor configured to generate a prediction vector from a quantized N-dimensional sub-vector corresponding to the result of adding an error vector from the trellis-structured vector inverse quantizer to the prediction vector, the intra predictor being configured to perform intra prediction using a prediction coefficient having an N × N matrix and the quantized N-dimensional sub-vector of a previous stage.
The inverse quantization apparatus may further include a vector inverse quantizer configured to inverse-quantize a second quantization index of a quantization error of the N-dimensional sub-vector.
The inverse quantization apparatus may further include an inter-frame predictor configured to generate the prediction vector of a current frame from the quantized N-dimensional sub-vector of a previous frame, when the trellis-structured vector inverse quantizer is configured to inverse-quantize a third quantization index of a prediction error vector between the N-dimensional sub-vector and the prediction vector of the current frame.
The inverse quantization apparatus may further include: an inter-frame predictor configured to generate a prediction vector of a current frame from a quantized N-dimensional sub-vector of a previous frame; and a vector inverse quantizer configured to inverse-quantize a fourth quantization index of a quantization error of the prediction error vector, when the trellis-structured vector inverse quantizer is configured to inverse-quantize a third quantization index of the prediction error vector between the N-dimensional sub-vector and the prediction vector of the current frame.
Advantageous effects
According to an exemplary embodiment, when a voice or audio signal is classified into a plurality of coding modes according to its signal characteristics and quantized with different numbers of bits allocated according to the compression ratio applied to each coding mode, the signal can be quantized more efficiently by designing quantizers that perform well at a low bit rate.
In addition, when a quantization apparatus providing various bit rates is designed, memory usage can be minimized by sharing the codebooks of some quantizers.
Drawings
These and/or other aspects will become apparent and more readily appreciated from the following description of the exemplary embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a block diagram of a sound encoding apparatus according to an exemplary embodiment.
Fig. 2 is a block diagram of a sound encoding apparatus according to another exemplary embodiment.
Fig. 3 is a block diagram of a Linear Predictive Coding (LPC) quantization unit according to an example embodiment.
Fig. 4 is a detailed block diagram of the weighting function determination unit of fig. 3 according to an exemplary embodiment.
Fig. 5 is a detailed block diagram of the first weighting function generation unit of fig. 4 according to an exemplary embodiment.
Fig. 6 is a block diagram of an LPC coefficient quantization unit according to an exemplary embodiment.
Fig. 7 is a block diagram of a selection unit of fig. 6 according to an example embodiment.
Fig. 8 is a flowchart describing the operation of the selection unit of fig. 6 according to an exemplary embodiment.
Fig. 9A-9E are block diagrams illustrating examples of various implementations of the first quantization module shown in fig. 6.
Fig. 10A-10D are block diagrams illustrating examples of various implementations of the second quantization module shown in fig. 6.
Fig. 11A-11F are block diagrams illustrating examples of various implementations of quantizers in which weights are applied to a block constrained trellis coded vector quantizer (BC-TCVQ).
Fig. 12 is a block diagram of a quantization apparatus having a switching structure of an open loop scheme at a low rate according to an exemplary embodiment.
Fig. 13 is a block diagram of a quantization apparatus having a switching structure of an open loop scheme at a high rate according to an exemplary embodiment.
Fig. 14 is a block diagram of a quantization apparatus having a switching structure of an open loop scheme at a low rate according to another exemplary embodiment.
Fig. 15 is a block diagram of a quantization apparatus having a switching structure of an open loop scheme at a high rate according to another exemplary embodiment.
Fig. 16 is a block diagram of an LPC coefficient quantization unit according to an exemplary embodiment.
Fig. 17 is a block diagram of a quantization apparatus having a switching structure of a closed-loop scheme according to an exemplary embodiment.
Fig. 18 is a block diagram of a quantization apparatus having a switching structure of a closed-loop scheme according to another exemplary embodiment.
Fig. 19 is a block diagram of an inverse quantization apparatus according to an exemplary embodiment.
Fig. 20 is a detailed block diagram of an inverse quantization apparatus according to an exemplary embodiment.
Fig. 21 is a detailed block diagram of an inverse quantization apparatus according to another exemplary embodiment.
Detailed Description
The inventive concept is susceptible to various changes or modifications and alternative forms, and specific embodiments have been shown in the drawings and have been described in detail in the specification. However, it should be understood that the specific embodiments do not limit the inventive concept to the specifically disclosed forms but include each modified, equivalent, or alternative embodiment within the spirit and technical scope of the inventive concept. In the description of the inventive concept, when it is determined that detailed description of related well-known features may obscure the nature of the inventive concept, detailed description thereof will be omitted.
Although terms such as 'first' and 'second' may be used to describe various elements, the elements should not be limited by these terms. These terms are only used to distinguish one element from another.
The terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the inventive concepts in any way. Terms used in the present specification are those general terms which are currently widely used in the art, but they may be changed according to the intention of a person having ordinary skill in the art, precedent, or new technology in the art. Further, a specific term may be selected by the applicant, and in this case, its detailed meaning will be described in the detailed description. Therefore, the terms used in the specification should be understood not as simple names but based on the meanings and overall descriptions of the terms.
Expressions in the singular include expressions in the plural unless they are clearly different from each other in context. In this application, it should be understood that terms such as 'including' and 'having' are used to indicate the presence of the implemented features, numbers, steps, operations, elements, parts, or combinations thereof, without precluding the possibility of the presence or addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof.
Hereinafter, embodiments of the inventive concept will be described in detail with reference to the accompanying drawings, and like reference numerals in the drawings denote like elements, and thus, a repetitive description thereof will be omitted.
In general, a trellis-coded quantizer (TCQ) quantizes an input vector by assigning one element to each TCQ stage, whereas a trellis-coded vector quantizer (TCVQ) divides the entire input vector into sub-vectors and assigns each sub-vector to a TCQ stage. That is, a TCQ results when the quantizer operates on single elements, and a TCVQ results when it operates on sub-vectors formed by combining several elements. Therefore, when two-dimensional (2D) sub-vectors are used, the total number of TCQ stages equals the dimension of the input vector divided by 2. In general, a speech/audio codec encodes the input signal in units of frames and extracts Line Spectral Frequency (LSF) coefficients for each frame. The LSF coefficients form a vector, typically of dimension 10 or 16. In this case, with 2D TCVQ, the number of sub-vectors is 5 or 8.
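The grouping described above can be sketched in a few lines (helper name is illustrative): a 16-dimensional LSF vector with 2D TCVQ yields 8 sub-vectors, one per trellis stage.

```python
import numpy as np

def split_into_subvectors(lsf, dim=2):
    """Split a full LSF vector into dim-dimensional sub-vectors,
    one per TCVQ stage (sketch of the grouping described above)."""
    lsf = np.asarray(lsf)
    assert len(lsf) % dim == 0, "vector length must be a multiple of dim"
    return lsf.reshape(-1, dim)

# 16-dimensional LSF vector -> 8 two-dimensional sub-vectors (stages)
stages = split_into_subvectors(np.arange(16.0))
```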
Fig. 1 is a block diagram of a sound encoding apparatus according to an exemplary embodiment.
The sound encoding apparatus 100 shown in fig. 1 may include an encoding mode selection unit 110, a Linear Predictive Coding (LPC) coefficient quantization unit 130, and an excitation signal encoding unit 150. Each component may be integrated into at least one module and implemented as at least one processor (not shown). In the embodiments, since the sound may be audio, voice, or a mixed signal of audio and voice, it is hereinafter referred to as voice for convenience of description.
Referring to fig. 1, the coding mode selection unit 110 may select one of a plurality of coding modes corresponding to a plurality of rates. The encoding mode selection unit 110 may determine the encoding mode of the current frame by using a signal characteristic of a previous frame, Voice Activity Detection (VAD) information, or an encoding mode.
The LPC coefficient quantization unit 130 may quantize the LPC coefficients by using a quantizer corresponding to the selected coding mode, and determine a quantization index representing the quantized LPC coefficients. The LPC coefficient quantization unit 130 may perform quantization by converting the LPC coefficients into another coefficient suitable for quantization.
The excitation signal encoding unit 150 may perform excitation signal encoding according to the selected encoding mode. For excitation signal coding, Code-Excited Linear Prediction (CELP) or Algebraic CELP (ACELP) algorithms may be used. Representative parameters for encoding an excitation signal by the CELP scheme are the adaptive codebook index, adaptive codebook gain, fixed codebook index, fixed codebook gain, and the like. Excitation signal encoding may be performed based on an encoding mode corresponding to the characteristics of the input signal. For example, four coding modes may be used: an Unvoiced Coding (UC) mode, a Voiced Coding (VC) mode, a Generic Coding (GC) mode, and a Transition Coding (TC) mode. The UC mode may be selected when the speech signal is unvoiced or is noise with characteristics similar to unvoiced speech. The VC mode may be selected when the speech signal is voiced. The TC mode may be used to encode a signal in a transition period in which the characteristics of the speech signal change sharply. The GC mode may be used to encode other signals. The UC, VC, TC, and GC modes follow the definitions and classification criteria drafted in ITU-T G.718, but are not limited thereto. The excitation signal encoding unit 150 may include an open-loop pitch search unit (not shown), a fixed codebook search unit (not shown), or a gain quantization unit (not shown), and components may be added to or omitted from the excitation signal encoding unit 150 according to the encoding mode. For example, in the VC mode all of the above components are included, whereas in the UC mode the open-loop pitch search unit is not used. When the number of bits allocated to quantization is large, i.e., at a high bit rate, the set of coding modes can be simplified to the GC mode and the VC mode.
That is, the UC mode and the TC mode can be subsumed into the GC mode. At a high bit rate, an Inactive Coding (IC) mode and an Audio Coding (AC) mode may additionally be included. When the number of bits allocated to quantization is small, i.e., at a low bit rate, the excitation signal encoding unit 150 may classify the coding modes into the GC, UC, VC, and TC modes. At a low bit rate, an IC mode and an AC mode may also be included. The IC mode may be selected for silence, and the AC mode may be selected when the characteristics of the speech signal are close to audio.
The coding modes may be further subdivided according to the bandwidth of the speech signal. The bandwidth of a voice signal can be classified into, for example, a Narrow Band (NB), a Wide Band (WB), an ultra wide band (SWB), and a Full Band (FB). NB can have a bandwidth of 300-3400Hz or 50-4000Hz, WB can have a bandwidth of 50-7000Hz or 50-8000Hz, SWB can have a bandwidth of 50-14000Hz or 50-16000Hz, and FB can have a bandwidth of up to 20000 Hz. Here, the numerical values related to the bandwidth are set for convenience, and are not limited thereto. Furthermore, the classification of the bandwidth may also be set to be simpler or more complex.
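The bandwidth classes listed above can be captured in a small lookup. This sketch keeps only the upper band edges named in the text (the helper name and the selection rule, narrowest class that covers the signal's upper frequency, are illustrative assumptions):

```python
# Upper band edges in Hz, as listed in the text (values are for
# convenience, not normative).
BAND_UPPER_HZ = {"NB": 4000, "WB": 8000, "SWB": 16000, "FB": 20000}

def classify_bandwidth(upper_hz):
    """Return the narrowest bandwidth class whose upper edge covers
    the given upper frequency of the signal."""
    for name in ("NB", "WB", "SWB", "FB"):
        if upper_hz <= BAND_UPPER_HZ[name]:
            return name
    raise ValueError("frequency beyond full band")
```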
When the type and number of coding modes are determined, the codebook needs to be trained again using a speech signal corresponding to the determined coding mode.
The excitation signal encoding unit 150 may additionally use a transform coding algorithm according to the coding mode. The excitation signal may be encoded in frame or subframe units.
Fig. 2 is a block diagram of a sound encoding apparatus according to another exemplary embodiment.
The sound encoding apparatus 200 shown in fig. 2 may include a preprocessing unit 210, an LP analysis unit 220, a weighted signal calculation unit 230, an open loop pitch search unit 240, a signal analysis and Voice Activity Detection (VAD) unit 250, an encoding unit 260, a memory update unit 270, and a parametric coding unit 280. Each component may be implemented as at least one processor (not shown) by being integrated into at least one module. In the embodiment, since the sound may indicate audio or voice, or a mixed signal of audio and voice, the sound is hereinafter referred to as voice for convenience of description.
Referring to fig. 2, the preprocessing unit 210 may preprocess an input voice signal. Through the processing of the pre-processing, undesired frequency components may be removed from the speech signal or the frequency characteristics of the speech signal may be adjusted to facilitate encoding. In detail, the preprocessing unit 210 may perform high-pass filtering, pre-emphasis, sample conversion, and the like.
The LP analysis unit 220 may extract LPC coefficients by performing LP analysis on the preprocessed voice signal. Typically, one LP analysis is performed per frame, but two or more LP analyses may be performed per frame for additional sound-quality enhancement. In this case, one analysis is the LP analysis for the frame end, which is the conventional LP analysis, and the other may be an LP analysis for a mid subframe to enhance sound quality. Herein, the end of the current frame indicates the last subframe among the subframes constituting the current frame, and the end of the previous frame indicates the last subframe among the subframes constituting the previous frame. A mid subframe is one or more of the subframes lying between the last subframe of the previous frame and the last subframe of the current frame. For example, one frame may consist of four subframes. Dimension 10 is used for the LPC coefficients when the input signal is NB, and dimensions 16-20 are used when the input signal is WB, but the embodiment is not limited thereto.
The weighted signal calculation unit 230 may receive the pre-processed speech signal and the extracted LPC coefficients and calculate a perceptually weighted filtered signal based on the perceptually weighted filter. The perceptual weighting filter may reduce quantization noise of the pre-processed speech signal within a masking range in order to use a masking effect of a human auditory structure.
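A perceptual weighting filter in CELP-type coders commonly takes the form W(z) = A(z/γ1) / A(z/γ2), obtained by bandwidth-expanding the LP polynomial. The patent text does not specify the filter, so the following sketch, including the γ values, is an assumption for illustration only:

```python
import numpy as np

def perceptual_weighting_coeffs(a, gamma1=0.92, gamma2=0.68):
    """Return (numerator, denominator) coefficients of a typical
    CELP-style perceptual weighting filter W(z) = A(z/g1)/A(z/g2).
    The coefficient of z^-i in A(z/g) is a[i] * g**i."""
    a = np.asarray(a, dtype=float)
    g = np.arange(len(a))
    return a * gamma1 ** g, a * gamma2 ** g
```

De-emphasizing the filter this way shapes the quantization noise so that it is stronger near formants, where it is masked, matching the masking effect described above.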
The open-loop pitch search unit 240 may search for open-loop pitches by using the perceptually weighted filtered signal.
The signal analysis and VAD unit 250 may determine whether the input signal is an active speech signal by analyzing various characteristics including frequency characteristics of the input signal.
The encoding unit 260 may determine an encoding mode of the current frame by using the signal characteristics, the VAD information, or the encoding mode of the previous frame; quantize LPC coefficients by using a quantizer corresponding to the selected encoding mode; and encode the excitation signal according to the selected encoding mode. The encoding unit 260 may include the components shown in fig. 1.
The memory update unit 270 may store the encoded current frame and parameters for encoding of subsequent frames during encoding.
The parameter encoding unit 280 may encode parameters to be used for decoding at a decoding end and include the encoded parameters in a bitstream. Preferably, a parameter corresponding to the encoding mode may be encoded. The bit stream generated by the parameter encoding unit 280 may be used for storage or transmission purposes.
Table 1 below shows examples of quantization schemes and structures for four coding modes. A scheme in which quantization is performed without inter prediction may be referred to as a safety net scheme, and a scheme in which quantization is performed with inter prediction may be referred to as a prediction scheme. Further, VQ stands for vector quantizer and BC-TCQ stands for block constrained trellis coded quantizer.
[ Table 1]
BC-TCVQ represents a block-constrained trellis-coded vector quantizer. TCVQ generalizes TCQ by allowing a vector codebook and branch labels. The principal characteristic of TCVQ is to partition the VQ symbols of an extended set into subsets and to label the trellis branches with these subsets. TCVQ is based on a rate-1/2 convolutional code, which has N = 2^ν trellis states and two branches entering and leaving each trellis state. Given M source vectors, the minimum-distortion path is searched for using the Viterbi algorithm, so the optimal trellis path may start at any of the N initial states and end at any of the N terminal states. The codebook in TCVQ has 2^((R+R')L) L-dimensional vector codewords. Since the codebook has 2^(R'L) times as many codewords as a nominal-rate-R VQ, R' may be called the codebook expansion factor. The encoding operation is briefly described as follows. First, for each input vector, the distortion corresponding to the closest codeword in each subset is found, and the Viterbi algorithm is used, with the distortion found for a subset S as the metric of any branch labeled with S, to search for the minimum-distortion path through the trellis. BC-TCVQ has low complexity since it requires one bit per source sample to specify the trellis path. For 0 ≤ k ≤ ν, the BC-TCVQ structure may have 2^k allowed initial trellis states and 2^(ν−k) allowed terminal states for each allowed initial trellis state. A single Viterbi encoding starts from the allowed initial trellis states and terminates at vector stage m−k. Specifying the initial state requires k bits, and specifying the path to vector stage m−k requires m−k bits. A unique terminating path, which may depend on the initial trellis state, is pre-assigned to each trellis state from vector stage m−k through vector stage m. Regardless of the value of k, m bits are required to specify an initial trellis state and the path through the trellis.
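As a numerical check of the bit accounting above, the following sketch uses illustrative values (ν = 4, i.e. 16 trellis states; k = 2; m = 8 vector stages):

```python
# Bit accounting for specifying a BC-TCVQ trellis path (illustrative values).
def bc_tcvq_path_bits(nu: int, k: int, m: int) -> dict:
    """nu: log2 of the number of trellis states, k: initial-state bits
    (0 <= k <= nu), m: number of vector stages in the trellis."""
    assert 0 <= k <= nu <= m
    return {
        "initial_states": 2 ** k,          # allowed initial trellis states
        "terminal_states": 2 ** (nu - k),  # terminal states per initial state
        "initial_bits": k,                 # bits to pick the initial state
        "path_bits": m - k,                # bits for the path to stage m-k
        "total_bits": k + (m - k),         # = m, independent of k
    }

info = bc_tcvq_path_bits(nu=4, k=2, m=8)
```

Whatever k is chosen, `total_bits` stays equal to m, which is the property the text emphasizes.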
The BC-TCVQ for VC mode at an internal sampling frequency of 16KHz may use a 16-state, 8-stage TCVQ with 2-dimensional (2-D) vectors. An LSF sub-vector with two elements may be assigned to each stage. Table 2 below shows the initial states and terminal states of the 16-state BC-TCVQ. Herein, k and ν denote 2 and 4, respectively, and four bits in total are used for the initial state and the terminal state.
[ Table 2]
Initial state | Terminal states
0 | 0, 1, 2, 3
4 | 4, 5, 6, 7
8 | 8, 9, 10, 11
12 | 12, 13, 14, 15
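The mapping of Table 2 can be expressed as a small lookup; this sketch assumes the grouping shown in the table, where each allowed initial state (0, 4, 8, 12) maps to four consecutive terminal states:

```python
# Table 2 as a lookup: 16-state BC-TCVQ with k = 2 and nu = 4, so four
# allowed initial states, each with four pre-assigned terminal states.
def termination_states(initial_state: int) -> list:
    assert initial_state in (0, 4, 8, 12), "allowed initial states only"
    return [initial_state + j for j in range(4)]
```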
The coding mode may vary depending on the applied bit rate. As described above, in order to quantize LPC coefficients at a high bit rate using two coding modes, 40 or 41 bits per frame may be used in the GC mode, and 46 bits per frame may be used in the TC mode.
Fig. 3 is a block diagram of an LPC coefficient quantization unit according to an exemplary embodiment.
The LPC coefficient quantization unit 300 shown in fig. 3 may include a first coefficient conversion unit 310, a weighting function determination unit 330, an ISF/LSF quantization unit 350, and a second coefficient conversion unit 370. Each component may be implemented as at least one processor (not shown) by being integrated into at least one module. The unquantized LPC coefficients and coding mode information may be provided as inputs to the LPC coefficient quantization unit 300.
Referring to fig. 3, the first coefficient conversion unit 310 may convert LPC coefficients extracted by LP analysis on the frame end of a current or previous frame of a speech signal into coefficients of a different form. For example, the first coefficient conversion unit 310 may convert LPC coefficients of the end of frame of the current frame or the previous frame into any one of LSF coefficients and ISF coefficients. In this case, the ISF coefficients or LSF coefficients indicate examples of forms in which LPC coefficients can be quantized more easily.
The weighting function determination unit 330 may determine the weighting function of the ISF/LSF quantization unit 350 by using the ISF coefficients or LSF coefficients converted from the LPC coefficients. The determined weighting function may be used in an operation of selecting a quantization path or a quantization scheme or searching codebook indices with which a weighting error is minimized in quantization. For example, the weighting function determination unit 330 may determine a final weighting function by combining an amplitude weighting function, a frequency weighting function, and a weighting function based on the position of the ISF/LSF coefficient.
Further, the weighting function determination unit 330 may determine the weighting function by considering at least one of a frequency bandwidth, a coding mode, and spectral analysis information. For example, the weighting function determination unit 330 may derive an optimal weighting function for each encoding mode. Alternatively, the weighting function determination unit 330 may derive an optimal weighting function according to the frequency bandwidth of the speech signal. Alternatively, the weighting function determination unit 330 may derive an optimal weighting function from frequency analysis information of the voice signal. In this case, the frequency analysis information may include spectral tilt information. The weighting function determination unit 330 is described in detail below.
The ISF/LSF quantization unit 350 may obtain an optimal quantization index according to an input encoding mode. Specifically, the ISF/LSF quantizing unit 350 may quantize the ISF coefficients or the LSF coefficients converted from the LPC coefficients of the frame end of the current frame. The ISF/LSF quantizing unit 350 may quantize the input signal by using only a safety net scheme without inter-prediction when the input signal is the UC mode or the TC mode corresponding to the non-stationary signal, and the ISF/LSF quantizing unit 350 may determine an optimal quantizing scheme considering a frame error by switching the prediction scheme and the safety net scheme when the input signal is the VC mode or the GC mode corresponding to the stationary signal.
The ISF/LSF quantizing unit 350 may quantize the ISF coefficients or the LSF coefficients by using the weighting function determined by the weighting function determining unit 330. The ISF/LSF quantizing unit 350 may quantize the ISF coefficients or the LSF coefficients by using the weighting function determined by the weighting function determining unit 330 to select one of a plurality of quantization paths. The index obtained as a result of quantization may be used to obtain a quantized isf (qisf) coefficient or a quantized lsf (qlsf) coefficient through an inverse quantization operation.
The second coefficient conversion unit 370 may convert the QISF coefficients or QLSF coefficients into quantized lpc (qlpc) coefficients.
In the following, the relation between the vector quantization of the LPC coefficients and the weighting function is described.
Vector quantization refers to the operation of selecting the codebook index with the smallest error by using a squared-error distance measure, on the assumption that all entries in a vector have the same importance. However, the LPC coefficients all have different importance, so the perceptual quality of the final synthesized signal can be improved when the errors of the important coefficients are reduced. Thus, when quantizing LSF coefficients, the encoding apparatus may select an optimal codebook index by applying a weighting function, which represents the importance of each LPC coefficient, to the squared-error distance measure, thereby improving the performance of the synthesized signal.
According to one embodiment, the frequency information of the ISF or LSF coefficients and the actual spectral magnitudes may be used to determine an amplitude weighting function based on the actual effect of each ISF or LSF coefficient on the spectral envelope. According to one embodiment, additional quantization efficiency may be obtained by combining a frequency weighting function, in which the perceptual characteristics and formant distribution of the frequency domain are considered, with the amplitude weighting function. In this case, since actual magnitudes in the frequency domain are used, the envelope information of the entire frequency range can be well reflected, and the weight of each ISF or LSF coefficient can be accurately derived. According to one embodiment, additional quantization efficiency may be obtained by further combining a weighting function based on the position information of the LSF or ISF coefficients with the amplitude weighting function and the frequency weighting function.
According to one embodiment, when vector quantizing an ISF or LSF converted from LPC coefficients, if the importance of each coefficient is different, a weighting function may be determined indicating which entry is relatively more important in the vector. Furthermore, by determining a weighting function capable of assigning higher weights to higher energy portions by analyzing the spectrum of the frame to be encoded, the accuracy of encoding can be improved. High energy in the spectrum indicates high correlation in the time domain.
In Table 1, the optimal quantization index for the VQ applied to all modes may be determined as the index that minimizes the weighted error E_werr(p) of Equation 1.
[Equation 1]

E_werr(p) = Σ_{i=0}^{M-1} w(i) · (r(i) - c^p(i))²
In Equation 1, w(i) represents the weighting function, r(i) represents the input of the quantizer, and c^p(i) represents the p-th codeword of the quantizer; the index p that minimizes the weighted distortion between r(i) and c^p(i) is obtained.
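The weighted search just described can be sketched as follows; the input, weights, and two-entry codebook are made-up values for illustration:

```python
# Weighted squared-error codebook search: pick the index p whose codeword
# c_p minimizes sum_i w(i) * (r(i) - c_p(i))^2.
def search_codebook(r, w, codebook):
    best_index, best_err = -1, float("inf")
    for p, c in enumerate(codebook):
        err = sum(wi * (ri - ci) ** 2 for wi, ri, ci in zip(w, r, c))
        if err < best_err:
            best_index, best_err = p, err
    return best_index

r = [0.10, 0.20]
w = [1.0, 4.0]          # second element weighted more heavily
codebook = [[0.10, 0.40], [0.30, 0.20]]
```

With these values the unweighted search would be a near tie, but the heavy weight on the second element makes codeword 1 (exact in that element) the winner.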
Next, the distortion measurement used by BC-TCQ substantially follows the method disclosed in US 7,630,890. In this case, the distortion measure d (x, y) may be represented by equation 2.
[Equation 2]

d(x, y) = (1/n) · Σ_{i=1}^{n} (x_i - y_i)²
According to one embodiment, a weighting function may be applied to the distortion measure d (x, y). The weighted distortion may be obtained by extending the distortion measure for BC-TCQ in US 7,630,890 to a measure for a vector and then applying a weighting function to the extended measure. That is, the optimal index may be determined by obtaining weighted distortion as represented in equation 3 below at all stages of the BC-TCVQ.
[Equation 3]

d_w(x, y) = (1/n) · Σ_{i=1}^{n} w_i · (x_i - y_i)²
The ISF/LSF quantization unit 350 may perform quantization according to the input coding mode, for example, by switching between a lattice vector quantizer (LVQ) and BC-TCVQ. If the coding mode is the GC mode, the LVQ may be used, and if the coding mode is the VC mode, the BC-TCVQ may be used. The operation of selecting a quantizer when both LVQ and BC-TCVQ are available is described below. First, a bit rate for encoding may be selected. After the bit rate for encoding is selected, the number of bits of the LPC quantizer corresponding to each bit rate may be determined. Thereafter, the bandwidth of the input signal may be determined. The quantization scheme may vary depending on whether the input signal is NB or WB. Further, when the input signal is WB, it additionally needs to be determined whether the upper limit of the bandwidth to be actually encoded is 6.4KHz or 8KHz. That is, since the quantization scheme may vary according to whether the internal sampling frequency is 12.8KHz or 16KHz, it is necessary to check the bandwidth. Next, an optimal coding mode within the limits of the available coding modes may be determined according to the determined bandwidth. For example, four coding modes (UC, VC, GC, and TC) may be used, but only three modes (VC, GC, and TC) may be used at a high bit rate (e.g., 9.6Kbit/s or higher). A quantization scheme, i.e., one of LVQ and BC-TCVQ, is selected based on the bit rate for encoding, the bandwidth of the input signal, and the coding mode, and an index quantized based on the selected quantization scheme is output.
According to one embodiment, it is determined whether the bit rate corresponds to between 24.4Kbps and 65Kbps, and if the bit rate does not correspond to between 24.4Kbps and 65Kbps, the LVQ may be selected. Otherwise, if the bit rate corresponds to between 24.4Kbps and 65Kbps, it is determined whether the bandwidth of the input signal is NB, and if the bandwidth of the input signal is NB, the LVQ may be selected. Otherwise, if the bandwidth of the input signal is not NB, it is determined whether the coding mode is the VC mode, and if the coding mode is the VC mode, BC-TCVQ may be used, and if the coding mode is not the VC mode, LVQ may be used.
According to another embodiment, it is determined whether the bit rate corresponds to between 13.2Kbps and 32Kbps, and if the bit rate does not correspond to between 13.2Kbps and 32Kbps, the LVQ may be selected. Otherwise, if the bit rate corresponds to between 13.2Kbps and 32Kbps, it is determined whether the bandwidth of the input signal is WB, and if the bandwidth of the input signal is not WB, the LVQ may be selected. Otherwise, if the bandwidth of the input signal is WB, it is determined whether the coding mode is VC mode, and if the coding mode is VC mode, BC-TCVQ may be used, and if the coding mode is not VC mode, LVQ may be used.
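Both selection embodiments above are simple decision trees. The first one (BC-TCVQ only for VC mode, non-NB input, and bit rates between 24.4Kbps and 65Kbps) can be sketched as follows; the function name and string labels are illustrative:

```python
# Sketch of the first quantizer-selection embodiment described above.
def select_quantizer(bit_rate_kbps: float, bandwidth: str, coding_mode: str) -> str:
    if not (24.4 <= bit_rate_kbps <= 65):
        return "LVQ"            # outside the 24.4-65 Kbps range
    if bandwidth == "NB":
        return "LVQ"            # narrowband input
    # wideband-or-wider input within the bit-rate range:
    return "BC-TCVQ" if coding_mode == "VC" else "LVQ"
```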
According to one embodiment, the encoding apparatus may determine the optimal weighting function by combining an amplitude weighting function using a spectral amplitude corresponding to a frequency of an ISF coefficient or an LSF coefficient converted from an LPC coefficient, a frequency weighting function in which perceptual features of an input signal and a formant distribution are considered, and a weighting function based on a position of the LSF coefficient or the ISF coefficient.
Fig. 4 is a block diagram of the weighting function determination unit of fig. 3 according to an exemplary embodiment.
The weighting function determining unit 400 shown in fig. 4 may include a spectrum analyzing unit 410, an LP analyzing unit 430, a first weighting function generating unit 450, a second weighting function generating unit 470, and a combining unit 490. Each component may be integrated and implemented as at least one processor.
Referring to fig. 4, the spectrum analysis unit 410 may analyze frequency-domain characteristics of an input signal through a time-to-frequency mapping operation. Herein, the input signal may be a preprocessed signal, and the time-to-frequency mapping operation may be performed using a Fast Fourier Transform (FFT), but the embodiment is not limited thereto. The spectrum analysis unit 410 may provide spectral analysis information, such as the spectral magnitudes obtained as a result of the FFT. Herein, the spectral magnitudes may have a linear scale. In detail, the spectrum analysis unit 410 may generate spectral magnitudes by performing a 128-point FFT. In this case, the bandwidth of the spectral magnitudes may correspond to the range of 0-6400Hz. The number of spectral magnitudes may be extended to 160 when the internal sampling frequency is 16KHz. In this case, the spectral magnitudes for the range of 6400-8000Hz are missing, since the 128-point FFT covers only 0-6400Hz. In detail, the missing spectral magnitudes for the range of 6400-8000Hz may be replaced with the last 32 spectral magnitudes corresponding to the bandwidth of 4800-6400Hz. For example, an average of the last 32 spectral magnitudes may be used.
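The magnitude-extension step just described can be sketched as follows, using the averaging variant mentioned above:

```python
# Extend 128 spectral magnitudes (0-6400 Hz) to 160 (0-8000 Hz) by filling
# the missing 6400-8000 Hz band with the average of the last 32 magnitudes
# (the 4800-6400 Hz band).
def extend_magnitudes(mag128):
    assert len(mag128) == 128
    avg = sum(mag128[-32:]) / 32.0
    return list(mag128) + [avg] * 32
```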
The LP analysis unit 430 may generate LPC coefficients by performing LP analysis on the input signal. The LP analysis unit 430 may generate ISF or LSF coefficients from the LPC coefficients.
The first weighting function generating unit 450 may obtain an amplitude weighting function and a frequency weighting function based on spectral analysis information for the ISF or LSF coefficients, and may generate the first weighting function by combining the amplitude weighting function and the frequency weighting function. The first weighting function may be obtained based on the FFT, and a larger weight may be assigned where the spectral magnitude is larger. For example, the first weighting function may be determined by normalizing the spectral analysis information (i.e., the spectral magnitudes) to the range of the ISF or LSF coefficients and then using the magnitudes of the frequencies corresponding to each ISF or LSF coefficient.
The second weighting function generation unit 470 may determine the second weighting function based on the interval or position information of the adjacent ISF or LSF coefficients. According to one embodiment, the second weighting function relating to spectral sensitivity may be generated from two ISF or LSF coefficients adjacent to each ISF or LSF coefficient. Generally, ISF or LSF coefficients are located on a unit circle of a Z domain, and are characterized in that a spectral peak occurs when an interval between adjacent ISF or LSF coefficients is narrower than a surrounding interval. Thus, the second weighting function may be used to approximate the spectral sensitivity of the LSF coefficients based on the location of neighboring LSF coefficients. That is, by measuring how closely adjacent LSF coefficients are located, the density of the LSF coefficients can be predicted, and larger weights can be assigned since the signal spectrum can have peaks near the frequencies where dense LSF coefficients exist. Herein, in order to improve accuracy in approximating spectral sensitivity, various parameters of the LSF coefficients may be additionally used when determining the second weighting function.
As described above, the weights may have an inversely proportional relationship with the intervals between adjacent ISF or LSF coefficients. Various implementations may be devised using this relationship between the intervals and the weighting function. For example, the interval may be expressed as a negative term or placed in a denominator. As another example, to further emphasize the obtained weights, each element of the weighting function may be multiplied by a constant or squared. As another example, a weighting function obtained secondarily by performing an additional calculation (e.g., a square or a cube) on the primarily obtained weighting function may be further reflected.
An example of deriving the weighting function by using the interval between ISF or LSF coefficients is as follows.
According to one embodiment, the second weighting function W_s(n) may be obtained by Equation 4 below.

[Equation 4]

W_s(i) = 1 / d_i, where d_i = lsf_{i+1} - lsf_{i-1}
In Equation 4, lsf_{i-1} and lsf_{i+1} indicate the LSF coefficients adjacent to the current LSF coefficient.
According to another embodiment, the second weighting function W_s(n) may be obtained by Equation 5 below.

[Equation 5]

W_s(n) = 1 / (lsf_n - lsf_{n-1}) + 1 / (lsf_{n+1} - lsf_n)
In Equation 5, lsf_n represents the current LSF coefficient, lsf_{n-1} and lsf_{n+1} represent the neighboring LSF coefficients, and M is the dimension of the LP model, which may be 16. Since the LSF coefficients span the range between 0 and π, the first and last weights may be calculated based on lsf_0 = 0 and lsf_M = π.
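An interval-based second weighting function can be sketched as follows. This is a hedged illustration: it assumes the classic inverse-interval form (each weight is the sum of the reciprocals of the distances to the two neighbors), with boundary values lsf_0 = 0 and lsf_M = π as stated above, which may differ in detail from the exact formula of the embodiment:

```python
import math

# Interval-based weighting sketch: narrower spacing between neighboring LSF
# coefficients (a likely spectral peak) yields a larger weight.
def second_weighting(lsf):
    ext = [0.0] + list(lsf) + [math.pi]   # boundary values lsf_0 = 0, lsf_M = pi
    return [1.0 / (ext[n] - ext[n - 1]) + 1.0 / (ext[n + 1] - ext[n])
            for n in range(1, len(ext) - 1)]
```

For equally spaced coefficients every weight is identical; a coefficient squeezed between close neighbors receives a much larger weight.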
The combining unit 490 may determine the final weighting function to be used for quantizing the LSF coefficients by combining the first weighting function with the second weighting function. In this case, various combination schemes may be used, such as a scheme of multiplying the first weighting function by the second weighting function, a scheme of multiplying each weighting function by an appropriate ratio and then adding the results, and a scheme of multiplying each weight by a predetermined value from a look-up table or the like and then adding the results.
Fig. 5 is a detailed block diagram of the first weighting function generation unit of fig. 4 according to an exemplary embodiment.
The first weighting function generating unit 500 shown in fig. 5 may include a normalizing unit 510, a magnitude weighting function generating unit 530, a frequency weighting function generating unit 550, and a combining unit 570. Here, for convenience of description, the LSF coefficient is used as an example of the input signal of the first weighting function generating unit 500.
Referring to fig. 5, the normalization unit 510 may normalize the LSF coefficient in the range of 0 to K-1. The LSF coefficients may typically have a range of 0 to pi. For an internal sampling frequency of 12.8KHz, K may be 128, and for an internal sampling frequency of 16.4KHz, K may be 160.
The magnitude weighting function generation unit 530 may generate the magnitude weighting function W_1(n) based on spectral analysis information for the normalized LSF coefficients. According to one embodiment, the magnitude weighting function may be determined based on the spectral magnitudes of the normalized LSF coefficients.
In detail, the magnitude weighting function may be determined using the spectral bin corresponding to the frequency of each normalized LSF coefficient and the two neighboring spectral bins located to the left and right of (i.e., immediately before and after) the corresponding spectral bin. The magnitude weighting function W_1(n), which is associated with the spectral envelope, may be determined based on Equation 6 below by extracting the maximum of the magnitudes of the three spectral bins.
[ equation 6]
In Equation 6, Min represents the minimum value of w_f(n), and w_f(n) may be obtained from 10·log(E_max(n)), where n = 0, ..., M-1. Herein, M represents 16, and E_max(n) represents the maximum of the magnitudes of the three spectral bins for each LSF coefficient.
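The per-coefficient magnitude extraction described above can be sketched as follows. This is only a partial, hedged illustration: bin indices and spectrum values are invented, and the final normalization of Equation 6 (which involves the minimum of w_f(n)) is not reproduced here:

```python
import math

# For each normalized LSF position, take the maximum of the spectral bin at
# that position and its two neighbors, then w_f(n) = 10 * log10(E_max(n)).
def wf_from_bins(spectrum, lsf_bins):
    out = []
    for b in lsf_bins:
        e_max = max(spectrum[max(b - 1, 0):b + 2])  # up to 3 bins around b
        out.append(10.0 * math.log10(e_max))
    return out
```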
The frequency weighting function generation unit 550 may generate the frequency weighting function W_2(n) based on frequency information of the normalized LSF coefficients. According to one embodiment, the frequency weighting function may be determined using the perceptual characteristics and the formant distribution of the input signal. The frequency weighting function generation unit 550 may extract perceptual characteristics of the input signal according to the Bark scale. Further, the frequency weighting function generation unit 550 may determine a weight for each frequency based on the first formant of the formant distribution. The frequency weighting function may exhibit relatively low weights at very low and very high frequencies, and weights of the same magnitude over a certain low-frequency interval (e.g., the interval corresponding to the first formant). The frequency weighting function generation unit 550 may determine the frequency weighting function according to the input bandwidth and the coding mode.
The combining unit 570 may combine the magnitude weighting function W_1(n) and the frequency weighting function W_2(n) to determine the FFT-based weighting function W_f(n). The combining unit 570 may determine the final weighting function by multiplying or adding the magnitude weighting function and the frequency weighting function. For example, the FFT-based weighting function W_f(n) for frame-end LSF quantization may be calculated based on Equation 7 below.
[ equation 7]
W_f(n) = W_1(n) · W_2(n), for n = 0, ..., M-1
Fig. 6 is a block diagram of an LPC coefficient quantization unit according to an exemplary embodiment.
The LPC coefficient quantization unit 600 shown in fig. 6 may include a selection unit 610, a first quantization module 630 and a second quantization module 650.
Referring to fig. 6, the selection unit 610 may select one of quantization without inter prediction and quantization with inter prediction based on a predetermined criterion. Herein, as a predetermined criterion, a prediction error of an unquantized LSF may be used. The prediction error may be obtained based on the inter prediction value.
When quantization without inter prediction is selected, the first quantization module 630 may quantize the input signal provided through the selection unit 610.
When quantization with inter prediction is selected, the second quantization module 650 may quantize the input signal provided through the selection unit 610.
The first quantization module 630 may perform quantization without inter prediction and may be referred to as a safety net scheme. The second quantization module 650 may perform quantization using inter prediction and may be referred to as a prediction scheme.
Accordingly, an optimal quantizer may be selected corresponding to various bit rates from a low bit rate for an efficient interactive voice service to a high bit rate for a service providing differentiated quality.
Fig. 7 is a block diagram of a selection unit of fig. 6 according to an example embodiment.
The selection unit 700 shown in fig. 7 may include a prediction error calculation unit 710 and a quantization scheme selection unit 730. Herein, the prediction error calculation unit 710 may be included in the second quantization module 650 of fig. 6.
Referring to fig. 7, the prediction error calculation unit 710 may calculate the prediction error in various ways by receiving, as inputs, the inter-frame prediction value p(n), the weighting function w(n), and the LSF coefficients z(n) from which the DC value has been removed. First, the same inter-frame predictor as used in the prediction scheme of the second quantization module 650 may be used. Herein, either an autoregressive (AR) method or a moving average (MA) method may be used. As the signal z(n) of the previous frame for inter-frame prediction, either a quantized or an unquantized value may be used. Further, a weighting function may or may not be applied when the prediction error is obtained. Accordingly, a total of eight combinations are available; four of them are as follows.
First, a weighted AR prediction error using a quantized signal z (n) of a previous frame may be represented by equation 8 below.
[Equation 8]

E_p = Σ_{i=0}^{M-1} w(i) · ( z_k(i) - ρ(i) · ẑ_{k-1}(i) )²

where z_k(i) denotes the coefficients of the current frame and ẑ_{k-1}(i) the quantized coefficients of the previous frame.
Second, the AR prediction error using the quantized signal z (n) of the previous frame may be represented by equation 9 below.
[Equation 9]

E_p = Σ_{i=0}^{M-1} ( z_k(i) - ρ(i) · ẑ_{k-1}(i) )²
Third, a weighted AR prediction error using a signal z (n) of a previous frame can be represented by equation 10 below.
[Equation 10]

E_p = Σ_{i=0}^{M-1} w(i) · ( z_k(i) - ρ(i) · z_{k-1}(i) )²
Fourth, an AR prediction error using a signal z (n) of a previous frame may be represented by equation 11 below.
[Equation 11]

E_p = Σ_{i=0}^{M-1} ( z_k(i) - ρ(i) · z_{k-1}(i) )²
Here, M denotes the dimension of the LSF; when the bandwidth of the input speech signal is WB, M is typically 16; and ρ(i) denotes the prediction coefficients of the AR method. As described above, using information on the immediately previous frame is the common case, and a quantization scheme may be determined using a prediction error obtained as described above.
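The weighted AR prediction error described in the cases above can be sketched as follows; symbol names are illustrative, and passing no weights yields the unweighted variants:

```python
# AR prediction error between the current frame's mean-removed LSFs z and
# the element-wise prediction rho(i) * z_prev(i) from the previous frame
# (z_prev may be the quantized or the unquantized previous-frame signal).
def ar_prediction_error(z, z_prev, rho, w=None):
    if w is None:                       # unweighted variants
        w = [1.0] * len(z)
    return sum(wi * (zi - ri * zpi) ** 2
               for wi, zi, ri, zpi in zip(w, z, rho, z_prev))
```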
If the prediction error is greater than a predetermined threshold, this may imply that the current frame tends to be non-stationary. In this case, a safety net scheme may be used. Otherwise, a prediction scheme is used, and in this case, the prediction scheme may be constrained such that the prediction scheme is not continuously selected.
According to one embodiment, in order to prepare for a case in which there is no information on a previous frame due to a frame error occurring on the previous frame, a second prediction error may be obtained using a previous frame of the previous frame, and a quantization scheme may be determined using the second prediction error. In this case, the second prediction error may be expressed by equation 12 below, compared to the first case described above.
[Equation 12]

E_p2 = Σ_{i=0}^{M-1} w(i) · ( z_k(i) - ρ(i) · ẑ_{k-2}(i) )²

where ẑ_{k-2}(i) denotes the quantized coefficients of the frame preceding the previous frame.
The quantization scheme selection unit 730 may determine a quantization scheme of the current frame by using the prediction error obtained by the prediction error calculation unit 710. In this case, the encoding mode obtained by the encoding mode determining unit (110 of fig. 1) may be further considered. According to one embodiment, the quantization scheme selection unit 730 is operable in a VC mode or a GC mode.
FIG. 8 is a flow diagram describing the operation of the selection unit of FIG. 6 according to one embodiment. When the prediction mode has a value of 0, this indicates that the safety net scheme is always used, and when the prediction mode has a value other than 0, this indicates that the quantization scheme is determined by switching the safety net scheme and the prediction scheme. Examples of the encoding mode that always uses the security net scheme may be the UC mode and the TC mode. Further, examples of the encoding mode in which the security net scheme and the prediction scheme are switched and used may be a VC mode and a GC mode.
Referring to fig. 8, in operation 810, it is determined whether a prediction mode of a current frame is 0. As a result of the determination in operation 810, if the prediction mode is 0, for example, if the current frame has high variability as in the UC mode or the TC mode, since prediction between frames is difficult, a safety net scheme (i.e., the first quantization module 630) may always be selected in operation 850.
Otherwise, as a result of the determination in operation 810, if the prediction mode is not 0, one of the safety net scheme and the prediction scheme may be determined as a quantization scheme considering the prediction error. To this end, in operation 830, it is determined whether the prediction error is greater than a predetermined threshold. Herein, the threshold value may be predetermined by experiment or simulation. For example, for WB of dimension 16, the threshold may be determined as 3,784,536.3, for example. However, the prediction scheme may be constrained such that the prediction scheme is not continuously selected.
As a result of the determination in operation 830, if the prediction error is greater than or equal to the threshold, a safety net scheme may be selected in operation 850. Otherwise, as a result of the determination in operation 830, if the prediction error is less than the threshold, a prediction scheme may be selected in operation 870.
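The decision flow of operations 810 through 870 can be sketched as follows; the threshold value is the WB dimension-16 example quoted above:

```python
# Sketch of the Fig. 8 selection flow: prediction mode 0 always uses the
# safety-net scheme; otherwise the prediction error is compared against a
# predetermined threshold (example value for WB, dimension 16).
THRESHOLD = 3_784_536.3

def select_scheme(prediction_mode: int, prediction_error: float,
                  threshold: float = THRESHOLD) -> str:
    if prediction_mode == 0:
        return "safety_net"            # UC/TC-like modes: always safety net
    if prediction_error >= threshold:
        return "safety_net"            # non-stationary frame
    return "prediction"
```

The additional constraint that the prediction scheme must not be selected continuously is omitted here, since it requires state across frames.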
Fig. 9A-9E are block diagrams illustrating examples of various implementations of the first quantization module shown in fig. 6. According to one embodiment, it is assumed that a 16-dimensional LSF vector is used as input to the first quantization module.
The first quantization module 900 shown in fig. 9A may include a first quantizer 911 for roughly quantizing the entire input vector by using TCQ, and a second quantizer 913 for additionally quantizing the resulting quantization error signal. The first quantizer 911 may be implemented using a trellis-structured quantizer such as TCQ, TCVQ, BC-TCQ, or BC-TCVQ. The second quantizer 913 may be implemented using a vector quantizer or a scalar quantizer, but is not limited thereto. To improve performance while minimizing memory size, a split vector quantizer (SVQ) may be used, or to further improve performance, a multi-stage vector quantizer (MSVQ) may be used. When the second quantizer 913 is implemented using SVQ or MSVQ, and if spare complexity is available, two or more candidates may be stored and a soft-decision technique may then be used to search for the optimal codebook index.
The first and second quantizers 911 and 913 operate as follows.
First, the signal z(n) may be obtained by removing a predefined average value from the unquantized LSF coefficients. The first quantizer 911 may quantize and dequantize the entire vector of the signal z(n). The quantizer used herein may be, for example, TCQ, TCVQ, BC-TCQ, or BC-TCVQ. The quantization error signal r(n) may be obtained as the difference between the signal z(n) and the dequantized signal, and r(n) may be provided as an input to the second quantizer 913. The second quantizer 913 may be implemented using SVQ, MSVQ, or the like. The signal quantized by the second quantizer 913 is dequantized and added to the result dequantized by the first quantizer 911 to obtain the quantized value of z(n), and a quantized LSF value may then be obtained by adding the average value to this quantized value.
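The two-stage data flow just described can be sketched as follows. This is a minimal illustration of the structure of fig. 9A only: the trellis quantizer (TCQ/BC-TCVQ) is replaced by plain rounding and the second stage (SVQ/MSVQ) by a nearest-codevector search, purely to show how the error of the first stage feeds the second stage; the codebook and signals are made up.

```python
import numpy as np

def two_stage_quantize(lsf, mean, codebook):
    """Sketch of fig. 9A: first-stage quantization plus error refinement."""
    z = lsf - mean                     # remove the predefined average value
    z_q1 = np.round(z)                 # stand-in for the trellis quantizer 911
    r = z - z_q1                       # quantization error signal r(n)
    # Second quantizer 913: nearest codevector to the error signal.
    idx = int(np.argmin(np.sum((codebook - r) ** 2, axis=1)))
    z_q = z_q1 + codebook[idx]         # add the dequantized second stage
    return z_q + mean                  # quantized LSF value
```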
The first quantization module 900 illustrated in fig. 9B may further include an intra predictor 932 in addition to the first quantizer 931 and the second quantizer 933. The first and second quantizers 931 and 933 may correspond to the first and second quantizers 911 and 913 of fig. 9A. Since the LSF coefficients are encoded for each frame, prediction can be performed using the 10-dimensional or 16-dimensional LSF coefficients within one frame. According to fig. 9B, the signal z(n) may be quantized by the first quantizer 931 and the intra predictor 932. As the past signal to be used for intra prediction, the value t(n) of the previous stage that has already been quantized by the TCQ is used. The prediction coefficients to be used for intra prediction may be predefined by a codebook training operation. For TCQ, first-order one-dimensional prediction is typically used, and a higher order or dimension may be used as the case may be. Since TCVQ processes vectors, the prediction coefficients may have the form of an N-dimensional vector or an N × N matrix corresponding to the dimension N of the vector. Herein, N may be a natural number greater than or equal to 2. For example, when the dimension of the VQ is 2, the prediction coefficients need to be obtained in advance in the form of a 2 × 2 matrix. According to one embodiment, the TCVQ uses two dimensions, and the intra predictor 932 uses prediction coefficients of size 2 × 2.
The intra prediction operation of the TCQ is as follows. The input signal t_j(n) of the first quantizer 931 (i.e., the first TCQ) can be obtained by Equation 13 below.
[Equation 13]

t_j(n) = z_j(n) - ρ_j ẑ_(j-1)(n), j = 1, ..., M-1, with t_0(n) = z_0(n)

Herein, M denotes the dimension of the LSF coefficients, ρ_j denotes the 1-D prediction coefficient, and ẑ_(j-1)(n) denotes the quantized value of the (j-1)-th element.
The first quantizer 931 may quantize the prediction error vector t(n). According to one embodiment, the first quantizer 931 may be implemented using a TCQ (in detail, BC-TCQ, BC-TCVQ, TCQ, or TCVQ). The intra predictor 932 used with the first quantizer 931 may repeat the quantization operation and the prediction operation in units of an element or sub-vector of the input vector. The operation of the second quantizer 933 is the same as that of the second quantizer 913 of fig. 9A.
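The element-wise intra-predictive quantization of Equation 13 can be sketched as follows. The trellis quantizer is again replaced by rounding, and the 1-D prediction coefficients `rho` are illustrative values, not the trained coefficients of the patent; the sketch only shows how each element is predicted from the previously quantized element.

```python
def intra_pred_quantize(z, rho):
    """Element-wise intra prediction (Equation 13) with rounding as a
    stand-in for the TCQ."""
    M = len(z)
    z_hat = [0.0] * M
    for j in range(M):
        pred = rho[j] * z_hat[j - 1] if j > 0 else 0.0
        t = z[j] - pred              # prediction error t_j(n)
        t_hat = round(t)             # stand-in for TCQ quantization
        z_hat[j] = t_hat + pred      # reconstruct for the next prediction
    return z_hat
```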
When the first quantizer 931 is implemented based on the N-dimensional TCVQ or the N-dimensional BC-TCVQ, the first quantizer 931 may quantize an error vector between the N-dimensional sub-vector and the prediction vector. Herein, N may be a natural number greater than or equal to 2. The intra predictor 932 may generate a prediction vector from the quantized N-dimensional sub-vector. The intra predictor 932 may use prediction coefficients having an N × N matrix, and may perform intra prediction by using a quantized N-dimensional sub-vector of a previous stage. The second quantizer 933 may quantize the quantization error of the N-dimensional sub-vector.
In more detail, the intra predictor 932 may generate a prediction vector of a current stage by a quantized N-dimensional linear vector of a previous stage and a prediction matrix of the current stage. The first quantizer 931 may generate a quantized error vector corresponding to a difference between the prediction vector of the current stage and the N-dimensional linear vector of the current stage by quantizing the error vector. The linear vector of the previous stage may be generated based on the error vector of the previous stage and the prediction vector of the previous stage. The second quantizer 933 may generate a quantized quantization error vector corresponding to a difference between the quantized N-dimensional linear vector of the current stage and the input N-dimensional linear vector by quantizing the quantization error vector.
Fig. 9C illustrates a first quantization module 900 that shares a codebook, in addition to the structure of fig. 9A. The first quantization module 900 may include a first quantizer 951 and a second quantizer 953. When a speech/audio encoder supports multi-rate coding, a technique of quantizing the same LSF input vector with various numbers of bits is required. In this case, in order to exhibit effective performance while minimizing the codebook memory of the quantizer to be used, two types of bit allocation sharing one codebook structure may be implemented. In fig. 9C, f_H(n) denotes the high-rate output, and f_L(n) denotes the low-rate output. In fig. 9C, when only BC-TCQ/BC-TCVQ is used, low-rate quantization can be performed using only the number of bits for the BC-TCQ/BC-TCVQ. If more accurate quantization is required in addition to the above quantization, the additional second quantizer 953 may be used to quantize the error signal of the first quantizer 951.
In addition to the structure of fig. 9C, fig. 9D also includes an intra predictor 972. The first quantization module 900 may include an intra predictor 972 in addition to the first and second quantizers 971 and 973. The first and second quantizers 971 and 973 may correspond to the first and second quantizers 951 and 953 of fig. 9C.
Fig. 9E illustrates the configuration of the input vector when the first quantizer 911, 931, 951, or 971 of figs. 9A to 9D is implemented by 2-dimensional TCVQ. In general, when the input vector has 16 dimensions, the input vector 990 of the 2-dimensional TCVQ consists of 8 two-dimensional sub-vectors.
Hereinafter, when the first quantizer 931 is implemented by the 2-dimensional TCVQ in fig. 9B, the intra prediction process will be described in detail.
First, the input signal t_k(i), i.e., the prediction residual vector of the first quantizer 931, may be obtained as expressed in Equation 14 below.
[Equation 14]

t_k(0) = z_k(0)
t_k(i) = z_k(i) - A_i ẑ_k(i-1), i = 1, ..., M/2 - 1

Herein, M denotes the dimension of the LSF coefficients, ẑ_k(i) denotes the quantized value of the i-th error sub-vector z_k(i), ẑ_k(i-1) denotes the quantized value of the (i-1)-th error sub-vector z_k(i-1), and A_i denotes a 2 × 2 prediction matrix.
A_i can be expressed as in Equation 15 below.

[Equation 15]

A_i = [a_i(1,1) a_i(1,2); a_i(2,1) a_i(2,2)], i = 1, ..., M/2 - 1
That is, the first quantizer 931 may quantize the prediction residual vector t_k(i), and the first quantizer 931 together with the intra predictor 932 may quantize z_k(i). Thus, the quantized vector ẑ_k(i) of the i-th error sub-vector z_k(i) can be represented by Equation 16 below.

[Equation 16]

ẑ_k(0) = t̂_k(0)
ẑ_k(i) = t̂_k(i) + A_i ẑ_k(i-1), i = 1, ..., M/2 - 1
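The 2-dimensional intra-predictive quantization of Equations 14 to 16 can be sketched as follows. Each 2-D sub-vector is predicted from the previously quantized sub-vector through a 2 × 2 matrix; the trellis quantizer is replaced by rounding, and the matrices used here are illustrative, not the trained coefficients of Table 3.

```python
import numpy as np

def tcvq_intra_quantize(z, A):
    """Sketch of Equations 14-16.
    z: list of 2-D sub-vectors; A: list of 2x2 prediction matrices."""
    z_hat = []
    for i, z_i in enumerate(z):
        pred = A[i] @ z_hat[i - 1] if i > 0 else np.zeros(2)
        t = z_i - pred                # Eq. 14: prediction residual t_k(i)
        t_hat = np.round(t)           # stand-in for BC-TCVQ quantization
        z_hat.append(t_hat + pred)    # Eq. 16: quantized sub-vector
    return z_hat
```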
Table 3 below shows an example of intra prediction coefficients for a BC-TCVQ (e.g., the first quantizer 931 used in the safety net scheme).
[ Table 3]
Hereinafter, when the first quantizer 1031 is implemented by the 2-dimensional TCVQ in fig. 10B, an intra prediction process will be described in detail.
In this case, the first quantizer 1031 and the intra predictor 1032 may quantize r_k(i). When the first quantizer 1031 is implemented by BC-TCVQ, the optimal index for each stage of the BC-TCVQ may be obtained by searching for the index that minimizes E_werr(p) of Equation 17.
[Equation 17]

E_werr(p) = Σ_{i=0}^{1} w_end(2(j-1)+i) [t_k(i) - c_j^p(i)]^2, for p = 1, ..., P_j and j = 1, ..., M/2

Herein, P_j denotes the number of codevectors in the j-th sub-codebook, c_j^p denotes the p-th codevector in the j-th sub-codebook, and w_end(i) denotes the weighting function.
That is, the first quantizer 1031 may quantize the prediction residual vector t_k(i), and the first quantizer 1031 together with the intra predictor 1032 may quantize r_k(i). Accordingly, the quantized vector r̂_k(i) of r_k(i) can be expressed by Equation 18 below.

[Equation 18]

r̂_k(0) = t̂_k(0)
r̂_k(i) = t̂_k(i) + A_i r̂_k(i-1), i = 1, ..., M/2 - 1
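The weighted codevector search of Equation 17 can be sketched as follows for a single stage j. The sub-codebook layout and weights are illustrative data, not trained values; the sketch only shows picking the codevector index that minimizes the weighted squared error.

```python
import numpy as np

def best_index(t_j, codebook_j, w_j):
    """Return the index p minimizing the weighted distortion E_werr(p).
    t_j: 2-D target sub-vector; codebook_j: (P_j, 2) codevectors;
    w_j: the two weighting-function values for this stage."""
    dist = np.sum(w_j * (t_j - codebook_j) ** 2, axis=1)  # E_werr(p)
    return int(np.argmin(dist))
```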
Table 4 below shows an example of intra prediction coefficients for BC-TCVQ (e.g., the first quantizer 1031 used in the prediction scheme).
[ Table 4]
Even in the case where the first quantizer 931 is implemented by the 2-dimensional TCVQ, the above-described intra prediction process of each embodiment may be similarly applied, and may be applied regardless of the presence or absence of the second quantizer 933. According to an implementation, the intra prediction process may use an AR method, but is not limited thereto.
The first quantization module 900 shown in fig. 9A and 9B may be implemented without the second quantizer 913 or 933. In this case, a quantization index for a quantization error of a one-dimensional or N-dimensional sub-vector may not be included in the bitstream.
Fig. 10A-10D are block diagrams illustrating examples of various implementations of the second quantization module shown in fig. 6.
The second quantization module 1000 shown in fig. 10A includes an inter predictor 1014 in addition to the structure of fig. 9B; that is, it may include the inter predictor 1014 in addition to the first quantizer 1011 and the second quantizer 1013. The inter predictor 1014 predicts the current frame by using the LSF coefficients quantized for the previous frame. In the inter prediction operation, the contribution of the quantized value of the previous frame is subtracted from the current frame, and the contribution is added back after quantization. In this case, a prediction coefficient is obtained for each element.
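The subtract-then-add-back inter prediction just described can be sketched as follows. This is a minimal illustration assuming first-order per-element AR prediction with illustrative coefficients `rho`; the quantizer is supplied by the caller as a stand-in for the BC-TCVQ.

```python
import numpy as np

def interframe_step(z, z_hat_prev, rho, quantize):
    """One frame of inter prediction: subtract the previous frame's
    contribution, quantize the prediction error, then add it back."""
    p = rho * z_hat_prev          # per-element prediction from previous frame
    r = z - p                     # prediction error to be quantized
    r_hat = quantize(r)           # stand-in for BC-TCVQ (caller-supplied)
    z_hat = r_hat + p             # add the contribution back
    return z_hat
```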
In addition to the structure of fig. 10A, the second quantization module 1000 shown in fig. 10B includes an intra predictor 1032; that is, it may include the intra predictor 1032 in addition to the first quantizer 1031, the second quantizer 1033, and the inter predictor 1034. When the first quantizer 1031 is implemented based on an N-dimensional TCVQ or an N-dimensional BC-TCVQ, the first quantizer 1031 may quantize an error vector corresponding to the difference between a prediction error vector (i.e., the difference between an N-dimensional sub-vector and the inter prediction vector of the current frame) and the intra prediction vector. Herein, N may be a natural number greater than or equal to 2. The intra predictor 1032 may generate a prediction vector from the quantized prediction error vector. The inter predictor 1034 may generate the prediction vector of the current frame from the quantized N-dimensional sub-vector of the previous frame. The second quantizer 1033 may quantize the quantization error of the prediction error vector.
In more detail, the first quantizer 1031 may quantize an error vector corresponding to a difference between the prediction error vector of the current stage and the prediction vector. The prediction error vector may correspond to a difference between a prediction vector of the current frame and an N-dimensional linear vector of the current stage. The intra predictor 1032 may generate a prediction vector of a current stage from a quantized prediction error vector of a previous stage and a prediction matrix of the current stage. The second quantizer 1033 may generate a quantized quantization error vector by quantizing a quantization error vector corresponding to a difference between a quantized prediction error vector of a current stage and a prediction error vector corresponding to a difference between a prediction vector of a current stage and an N-dimensional linear vector of the current stage.
Fig. 10C illustrates a second quantization module 1000 that shares a codebook, in addition to the structure of fig. 10B. That is, in addition to the structure of fig. 10B, the structure shares the codebook of the BC-TCQ/BC-TCVQ between the low rate and the high rate. In fig. 10C, the upper circuit diagram indicates the low-rate output obtained without using the second quantizer (not shown), and the lower circuit diagram indicates the high-rate output obtained using the second quantizer 1063.
Fig. 10D illustrates an example in which the second quantization module 1000 is implemented by omitting an intra predictor from the structure of fig. 10C.
Even in the case where the quantizer is implemented by the 2-dimensional TCVQ, the above-described intra prediction process of each embodiment may be similarly applied, and it may be applied regardless of the presence or absence of the second quantizer 1033. According to an implementation, the intra prediction process may use an AR method, but is not limited thereto.
The second quantization module 1000 shown in figs. 10A and 10B may be implemented without the second quantizer 1013 or 1033. In this case, a quantization index for the quantization error of the one-dimensional or N-dimensional sub-vector may not be included in the bitstream.
Fig. 11A-11F are block diagrams illustrating examples of various implementations of a quantizer 1100 in which weights are applied to a BC-TCVQ.
Fig. 11A illustrates a basic BC-TCVQ, and may include a weighting function calculation unit 1111 and a BC-TCVQ part 1112. When the BC-TCVQ obtains the best index, an index by which weighted distortion is minimized is obtained. Fig. 11B illustrates a structure in which the intra predictor 1123 is added to fig. 11A. For the intra prediction used in fig. 11B, an AR method or an MA method may be used. According to one embodiment, an AR method is used, and prediction coefficients to be used may be predefined.
Fig. 11C shows a structure that adds an interframe predictor 1134 to fig. 11B for additional performance improvement. Fig. 11C shows an example of a quantizer used in the prediction scheme. For the inter prediction used in fig. 11C, an AR method or an MA method may be used. According to one embodiment, an AR method is used, and prediction coefficients to be used may be predefined. The quantization operation is described below. First, a prediction error value predicted using inter prediction may be quantized by means of BC-TCVQ using inter prediction. The quantization index value is sent to a decoder. The decoding operation is described as follows. The quantized value r (n) is obtained by adding an intra prediction value to the quantized result of the BC-TCVQ. The final quantized LSF value is obtained by adding the predicted value of the interframe predictor 1134 to the quantized value r (n) and then adding the average value to the addition result.
Fig. 11D illustrates a structure in which the intra predictor is omitted from fig. 11C. Fig. 11E shows a structure of how the weight is applied when the second quantizer 1153 is added. The weighting function obtained by the weighting function calculation unit 1151 is used for both the first quantizer 1152 and the second quantizer 1153, and the best index is obtained using the weighting distortion. The first quantizer 1152 may be implemented using BC-TCQ, BC-TCVQ, TCQ, or TCVQ. The second quantizer 1153 may be implemented using SQ, VQ, SVQ, or MSVQ. Fig. 11F shows a structure in which the interframe predictor is omitted from fig. 11E.
The quantizer of the switching structure may be realized by combining the quantizer forms of the various structures already described with reference to fig. 11A to 11F.
Fig. 12 is a block diagram of a quantization apparatus having a switching structure of an open-loop scheme at a low rate according to an exemplary embodiment. The quantization apparatus 1200 shown in fig. 12 may include a selection unit 1210, a first quantization module 1230, and a second quantization module 1250.
The selection unit 1210 may select one of a safety net scheme and a prediction scheme as a quantization scheme based on the prediction error.
When the safety net scheme is selected, the first quantization module 1230 performs quantization without inter prediction, and may include a first quantizer 1231 and a first intra predictor 1232. In detail, the LSF vector may be quantized into 30 bits by the first quantizer 1231 and the first intra predictor 1232.
The second quantization module 1250 performs quantization using inter prediction when a prediction scheme is selected, and may include a second quantizer 1251, a second intra predictor 1252, and an inter predictor 1253. In detail, a prediction error corresponding to a difference between the LSF vector from which the average value has been removed and the prediction vector may be quantized into 30 bits by the second quantizer 1251 and the second intra predictor 1252.
The quantization apparatus shown in fig. 12 shows an example of LSF coefficient quantization using 31 bits in the VC mode. The first and second quantizers 1231 and 1251 in the quantization apparatus of fig. 12 may share a codebook with the first and second quantizers 1331 and 1351 in the quantization apparatus of fig. 13. The operation of the quantization apparatus shown in fig. 12 is described as follows. The signal z(n) may be obtained by removing the mean value from the input LSF value f(n). The selection unit 1210 may select or determine an optimal quantization scheme by using z(n), the inter prediction value p(n) obtained from the decoded value of the previous frame, the weighting function, and the prediction mode pred_mode. According to the selected or determined result, quantization may be performed using one of the safety net scheme and the prediction scheme. The selected or determined quantization scheme may be encoded with one bit.
When the safety net scheme is selected by the selection unit 1210, the entire input vector of the LSF coefficients z(n) from which the average value has been removed may be quantized by the first intra predictor 1232 and the first quantizer 1231 using 30 bits. However, when the prediction scheme is selected by the selection unit 1210, the prediction error signal obtained from the LSF coefficients z(n) from which the average value has been removed by using the inter predictor 1253 may be quantized by the second intra predictor 1252 and the second quantizer 1251 using 30 bits. The first and second quantizers 1231 and 1251 may be quantizers, for example, in the form of TCQ or TCVQ. In detail, BC-TCQ, BC-TCVQ, etc. may be used. In this case, the quantizer uses a total of 31 bits. The quantization result is used as the output of the low-rate quantizer, and the main outputs of the quantizer are the quantized LSF vector and the quantization index.
Fig. 13 is a block diagram of a quantization apparatus having a switching structure of an open loop scheme at a high rate according to an exemplary embodiment. The quantization apparatus 1300 shown in fig. 13 may include a selection unit 1310, a first quantization module 1330, and a second quantization module 1350. When compared to fig. 12, there is a difference in that a third quantizer 1333 is added to the first quantization module 1330, and a fourth quantizer 1353 is added to the second quantization module 1350. In fig. 12 and 13, the first quantizers 1231 and 1331 and the second quantizers 1251 and 1351 may use the same codebook, respectively. That is, the 31-bit LSF quantizing apparatus 1200 of fig. 12 and the 41-bit LSF quantizing apparatus 1300 of fig. 13 may use the same codebook for BC-TCVQ. Therefore, although the codebook cannot be referred to as an optimal codebook, the memory size can be significantly saved.
The selection unit 1310 may select one of a safety net scheme and a prediction scheme as a quantization scheme based on the prediction error.
When the safety net scheme is selected, the first quantization module 1330 may perform quantization without inter prediction, and may include a first quantizer 1331, a first intra predictor 1332, and a third quantizer 1333.
The second quantization module 1350 may perform quantization using inter prediction when a prediction scheme is selected, and may include a second quantizer 1351, a second intra predictor 1352, a fourth quantizer 1353, and an inter predictor 1354.
The quantization apparatus shown in fig. 13 shows an example of LSF coefficient quantization using 41 bits in the VC mode. The first and second quantizers 1331 and 1351 in the quantization apparatus 1300 of fig. 13 may share a codebook with the first and second quantizers 1231 and 1251, respectively, in the quantization apparatus 1200 of fig. 12. The operation of the quantization apparatus 1300 is described as follows. The signal z(n) may be obtained by removing the mean value from the input LSF value f(n). The selection unit 1310 may select or determine an optimal quantization scheme by using z(n), the inter prediction value p(n) obtained from the decoded value of the previous frame, the weighting function, and the prediction mode pred_mode. According to the selected or determined result, quantization may be performed using one of the safety net scheme and the prediction scheme. The selected or determined quantization scheme may be encoded with one bit.
When the safety net scheme is selected by the selection unit 1310, the entire input vector of the LSF coefficient z (n) from which the average value has been removed may be quantized and dequantized by the first intra predictor 1332 and the first quantizer 1331 using 30 bits. A second error vector indicative of the difference between the original signal and the dequantized result may be provided as an input to the third quantizer 1333. The third quantizer 1333 may quantize the second error vector by using 10 bits. The third quantizer 1333 may be, for example, SQ, VQ, SVQ, or MSVQ. After quantization and inverse quantization, the final quantized vector may be stored for subsequent frames.
However, when the prediction scheme is selected by the selection unit 1310, the prediction error signal obtained by subtracting p(n) of the inter predictor 1354 from the LSF coefficients z(n) from which the average value has been removed may be quantized and dequantized by the second quantizer 1351 and the second intra predictor 1352 using 30 bits. The first quantizer 1331 and the second quantizer 1351 may be quantizers, for example, in the form of TCQ or TCVQ. In detail, BC-TCQ, BC-TCVQ, etc. may be used. A second error vector indicating the difference between the original signal and the dequantized result may be provided as an input to the fourth quantizer 1353. The fourth quantizer 1353 may quantize the second error vector by using 10 bits. Herein, the second error vector may be divided into two 8-dimensional sub-vectors and then quantized by the fourth quantizer 1353. Since the low frequency band is perceptually more important than the high frequency band, the second error vector may be encoded by allocating different numbers of bits to the first and second VQs. The fourth quantizer 1353 may be, for example, SQ, VQ, SVQ, or MSVQ. After quantization and dequantization, the final quantized vector may be stored for subsequent frames.
In this case, the quantizer uses a total of 41 bits. The quantization result is used as the output of the high-rate quantizer, and the main outputs of the quantizer are the quantized LSF vector and the quantization index.
Accordingly, when fig. 12 and 13 are used, the first quantizer 1231 of fig. 12 and the first quantizer 1331 of fig. 13 may share a quantization codebook, and the second quantizer 1251 of fig. 12 and the second quantizer 1351 of fig. 13 may share a quantization codebook, thereby remarkably saving the entire codebook memory. To additionally save codebook memory, the third quantizer 1333 and the fourth quantizer 1353 may also share a quantization codebook. In this case, since the input distribution of the third quantizer 1333 is different from that of the fourth quantizer 1353, a scaling factor may be used to compensate for the difference between the input distributions. The scaling factor may be calculated by considering the input of the third quantizer 1333 and the input distribution of the fourth quantizer 1353. According to one embodiment, an input signal of the third quantizer 1333 may be divided by a scaling factor, and a signal obtained by the division result may be quantized by the third quantizer 1333. The signal quantized by the third quantizer 1333 may be obtained by multiplying the output of the third quantizer 1333 by a scaling factor. As described above, if the input of the third quantizer 1333 or the fourth quantizer 1353 is appropriately scaled and then quantized, a codebook may be shared while maintaining performance at most.
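The scaling-factor codebook sharing between the third quantizer 1333 and the fourth quantizer 1353 can be sketched as follows. This is a minimal illustration: the input of one quantizer is divided by a scale factor so that the other quantizer's codebook can be reused, and the dequantized output is multiplied back. The shared codebook, scale value, and nearest-codevector search are illustrative assumptions.

```python
import numpy as np

def quantize_scaled(x, shared_codebook, scale):
    """Quantize x with a codebook trained for a different input
    distribution by compensating with a scaling factor."""
    y = x / scale                      # map x into the codebook's range
    idx = int(np.argmin(np.sum((shared_codebook - y) ** 2, axis=1)))
    return shared_codebook[idx] * scale  # map the dequantized value back
```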
Fig. 14 is a block diagram of a quantization apparatus having a switching structure of an open-loop scheme at a low rate according to another exemplary embodiment. In the quantization apparatus 1400 of fig. 14, the low-rate parts of figs. 9C and 9D may be applied to the first and second quantizers 1431 and 1451 used by the first and second quantization modules 1430 and 1450. The operation of the quantization apparatus 1400 is described as follows. The weighting function calculation unit may obtain the weighting function w(n) by using the input LSF value. The obtained weighting function w(n) may be used by the first quantizer 1431 and the second quantizer 1451. The signal z(n) may be obtained by removing the mean value from the LSF value f(n). The selection unit 1410 may determine an optimal quantization scheme by using z(n), the inter prediction value p(n) obtained from the decoded value of the previous frame, the weighting function, and the prediction mode pred_mode. According to the selected or determined result, quantization may be performed using one of the safety net scheme and the prediction scheme. The selected or determined quantization scheme may be encoded with one bit.
When the safety net scheme is selected by the selection unit 1410, the LSF coefficient z (n) from which the average value has been removed may be quantized by the first quantizer 1431. As described with reference to fig. 9C and 9D, the first quantizer 1431 may use intra prediction for high performance or may not use intra prediction for low complexity. When an intra predictor is used, the entire input vector may be provided to the first quantizer 1431 for quantizing the entire input vector by using TCQ or TCVQ through intra prediction.
When the prediction scheme is selected by the selection unit 1410, the LSF coefficient z (n) from which the average value has been removed may be provided to the second quantizer 1451 for quantizing a prediction error signal obtained using inter prediction by using TCQ or TCVQ through intra prediction. The first quantizer 1431 and the second quantizer 1451 may be, for example, quantizers in the form of TCQ or TCVQ. In detail, BC-TCQ, BC-TCVQ, etc. can be used. The quantization result is used as the output of the low rate quantizer.
Fig. 15 is a block diagram of a quantization apparatus having a switching structure of an open-loop scheme at a high rate according to another embodiment. The quantization apparatus 1500 shown in fig. 15 may include a selection unit 1510, a first quantization module 1530, and a second quantization module 1550. When compared to fig. 14, there is a difference in that a third quantizer 1532 is added to the first quantization module 1530 and a fourth quantizer 1552 is added to the second quantization module 1550. In fig. 14 and 15, the first quantizers 1431 and 1531 and the second quantizers 1451 and 1551 may use the same codebook, respectively. Therefore, although the codebook cannot be referred to as an optimal codebook, the memory size can be significantly saved. The operation of the quantization apparatus 1500 is described as follows. When the safety net scheme is selected by the selection unit 1510, the first quantizer 1531 performs first quantization and inverse quantization, and a second error vector indicating a difference between an original signal and an inverse quantization result may be provided as an input of the third quantizer 1532. The third quantizer 1532 may quantize the second error vector. The third quantizer 1532 may be, for example, SQ, VQ, SVQ, or MSVQ. After quantization and inverse quantization, the final quantized vector may be stored for subsequent frames.
However, when the prediction scheme is selected by the selection unit 1510, the second quantizer 1551 performs quantization and inverse quantization, and a second error vector indicating a difference between the original signal and the inverse quantization result may be provided as an input of the fourth quantizer 1552. A fourth quantizer 1552 may quantize the second error vector. The fourth quantizer 1552 may be, for example, SQ, VQ, SVQ, or MSVQ. After quantization and inverse quantization, the final quantized vector may be stored for subsequent frames.
Fig. 16 is a block diagram of an LPC coefficient quantization unit according to another exemplary embodiment.
The LPC coefficient quantization unit 1600 shown in fig. 16 may include a selection unit 1610, a first quantization module 1630, a second quantization module 1650, and a weighting function calculation unit 1670. When compared with the LPC coefficient quantization unit 600 shown in fig. 6, there is a difference in that a weighting function calculation unit 1670 is further included. Detailed implementation examples are shown in fig. 11A to 11F.
Fig. 17 is a block diagram of a quantization apparatus having a switching structure of a closed-loop scheme according to an embodiment. The quantization apparatus 1700 shown in fig. 17 may include a first quantization module 1710, a second quantization module 1730, and a selection unit 1750. The first quantization module 1710 may include a first quantizer 1711, a first intra predictor 1712, and a third quantizer 1713, and the second quantization module 1730 may include a second quantizer 1731, a second intra predictor 1732, a fourth quantizer 1733, and an inter predictor 1734.
Referring to fig. 17, in the first quantization module 1710, the first quantizer 1711 may quantize the entire input vector by using BC-TCVQ or BC-TCQ through the first intra predictor 1712. The third quantizer 1713 may quantize the quantization error signal by using VQ.
In the second quantization module 1730, the second quantizer 1731 may quantize the prediction error signal by using the BC-TCVQ or BC-TCQ passed through the second intra predictor 1732. The fourth quantizer 1733 may quantize the quantization error signal by using VQ.
The selection unit 1750 may select one of the output of the first quantization module 1710 and the output of the second quantization module 1730.
In fig. 17, the safety net scheme is the same as that of fig. 9B, and the prediction scheme is the same as that of fig. 10B. Herein, for inter prediction, one of an AR method and an MA method may be used. According to one embodiment, an example using a first-order AR method is shown. The prediction coefficients are predefined, and as the past vector for prediction, the best vector between the two schemes in the previous frame is selected.
Fig. 18 is a block diagram of a quantization apparatus having a switching structure of a closed-loop scheme according to another exemplary embodiment. When compared to fig. 17, the intra predictor is omitted. The quantization apparatus 1800 shown in fig. 18 may include a first quantization module 1810, a second quantization module 1830 and a selection unit 1850. The first quantization module 1810 may include a first quantizer 1811 and a third quantizer 1812, and the second quantization module 1830 may include a second quantizer 1831, a fourth quantizer 1832, and an inter-predictor 1833.
Referring to fig. 18, the selection unit 1850 may select or determine an optimal quantization scheme by using, as an input, the weighted distortion obtained from the output of the first quantization module 1810 and the output of the second quantization module 1830. The operation of determining the optimal quantization scheme is described as follows.
Herein, when the prediction mode (predmode) is 0, this indicates a mode in which the safety net scheme is always used, and when the prediction mode is not 0, this indicates switching between the safety net scheme and the prediction scheme. Examples of modes that always use the safety net scheme are the TC mode and the UC mode. WDist[0] represents the weighted distortion of the safety net scheme, and WDist[1] represents the weighted distortion of the prediction scheme. Further, abs_threshold represents a preset threshold. When the prediction mode is not 0, the optimal quantization scheme may be selected by giving higher priority to the weighted distortion of the safety net scheme in consideration of frame errors. That is, if the value of WDist[0] is less than the predefined threshold, the safety net scheme may be selected regardless of the value of WDist[1]. Even otherwise, for comparable weighted distortions, the safety net scheme may be chosen instead of simply choosing the smaller weighted distortion, since the safety net scheme is more robust to frame errors. Thus, the prediction scheme may be selected only when WDist[0] is greater than PREFERSFNET × WDist[1]. Herein, PREFERSFNET = 1.15 may be used, but the value is not limited thereto. When a quantization scheme is selected in this way, bit information indicating the selected quantization scheme and a quantization index obtained by performing quantization using the selected scheme may be transmitted.
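The selection rule above can be sketched as follows. The function name and the value of ABS_THRESHOLD are assumptions for illustration; only PREFERSFNET = 1.15 is stated in the text.

```python
PREFER_SFNET = 1.15       # bias factor favoring the safety net scheme (from the text)
ABS_THRESHOLD = 0.003     # preset threshold; this value is an illustrative assumption

def select_scheme(pred_mode, wdist):
    """Return 0 for the safety net scheme, 1 for the prediction scheme.

    wdist[0]: weighted distortion of the safety net scheme
    wdist[1]: weighted distortion of the prediction scheme
    """
    if pred_mode == 0:
        return 0  # modes such as TC/UC always use the safety net
    if wdist[0] < ABS_THRESHOLD:
        return 0  # safety net wins outright below the absolute threshold
    # Prediction is chosen only when the safety net distortion exceeds
    # the biased prediction distortion.
    if wdist[0] > PREFER_SFNET * wdist[1]:
        return 1
    return 0
```

Note that for equal distortions the function still returns the safety net scheme, reflecting its greater robustness to frame errors.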
Fig. 19 is a block diagram of an inverse quantization apparatus according to an exemplary embodiment.
The inverse quantization apparatus 1900 shown in fig. 19 may include a selection unit 1910, a first inverse quantization module 1930, and a second inverse quantization module 1950.
Referring to fig. 19, the selection unit 1910 may provide encoded LPC parameters (e.g., a prediction residual) to one of the first and second inverse quantization modules 1930 and 1950 based on quantization scheme information included in a bitstream. For example, the quantization scheme information may be represented by one bit.
The first inverse quantization module 1930 may inverse quantize the encoded LPC parameters without inter prediction.
The second inverse quantization module 1950 may inverse quantize the encoded LPC parameters using inter prediction.
The first and second inverse quantization modules 1930 and 1950 may be implemented based on inverse processes of the first and second quantization modules according to each of the above-described various embodiments of an encoding apparatus corresponding to a decoding apparatus.
The inverse quantization apparatus of fig. 19 can be applied regardless of whether the quantizer structure is an open-loop scheme or a closed-loop scheme.
The VC mode at an internal sampling frequency of 16 kHz may have two coding rates of, for example, 31 bits per frame or 40 or 41 bits per frame, and may be decoded by a 16-state, 8-stage BC-TCVQ.
Fig. 20 is a block diagram of an inverse quantization apparatus according to an exemplary embodiment, which may correspond to a coding rate of 31 bits. The inverse quantization apparatus 2000 shown in fig. 20 may include a selection unit 2010, a first inverse quantization module 2030, and a second inverse quantization module 2050. The first inverse quantization module 2030 may include a first inverse quantizer 2031 and a first intra predictor 2032, and the second inverse quantization module 2050 may include a second inverse quantizer 2051, a second intra predictor 2052, and an inter predictor 2053. The inverse quantization apparatus of fig. 20 may correspond to the quantization apparatus of fig. 12.
Referring to fig. 20, the selection unit 2010 may provide the encoded LPC parameters to one of the first inverse quantization module 2030 and the second inverse quantization module 2050 based on quantization scheme information included in the bitstream.
When the quantization scheme information indicates the safety net scheme, the first inverse quantizer 2031 of the first inverse quantization module 2030 may perform inverse quantization by using TCQ, TCVQ, BC-TCQ, or BC-TCVQ. The quantized LSF coefficients may be obtained by the first inverse quantizer 2031 and the first intra predictor 2032. The finally decoded LSF coefficients are generated by adding an average value, which is a predetermined DC value, to the quantized LSF coefficients.
However, when the quantization scheme information indicates the prediction scheme, the second inverse quantizer 2051 of the second inverse quantization module 2050 may perform inverse quantization by using TCQ, TCVQ, BC-TCQ, or BC-TCVQ. The inverse quantization operation starts from the lowest vector among the LSF vectors, and the second intra predictor 2052 generates a prediction value for the vector elements of the next stage by using each decoded vector. The inter predictor 2053 generates a prediction value through inter-frame prediction by using the LSF coefficients decoded in the previous frame. The finally decoded LSF coefficients are generated by adding the inter prediction value obtained by the inter predictor 2053 to the quantized LSF coefficients obtained by the second inverse quantizer 2051 and the second intra predictor 2052, and then adding an average value, which is a predetermined DC value, to the addition result.
The decoding process in fig. 20 will be described as follows.
[Equation 19]
Herein, the prediction residual t_k(i) may be decoded by the first inverse quantizer 2031.
When the prediction scheme is used, the prediction vector p_k(i) may be obtained by Equation 20 below.
[Equation 20]
Herein, ρ(i) denotes the AR prediction coefficient selected for a particular coding mode at a particular internal sampling frequency (e.g., the VC mode at 16 kHz), and M denotes the LPC dimension. It may further be derived that:
[Equation 21]
[Equation 22]
Herein, for i = 0, ..., M-1, m(i) denotes the average vector in a particular coding mode (e.g., the VC mode). It may further be derived that:
[Equation 23]
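Because the equation images (Equations 19 to 23) are not reproduced in this text, the decoding flow just described can be restated as a numerical sketch: the safety net scheme adds the mean vector to the decoded mean-removed vector, while the prediction scheme first adds a first-order AR inter-frame prediction ρ(i)·z_{k-1}(i). The values of rho and m below are placeholder assumptions, not trained codec values, and intra prediction is folded into the decoded residual t for brevity.

```python
import numpy as np

M = 16                           # LPC dimension (assumed)
rho = np.full(M, 0.4)            # assumed AR prediction coefficients rho(i)
m = np.linspace(0.05, 0.95, M)   # assumed mean (DC) LSF vector m(i)

def decode_safety_net(t):
    """Safety net scheme: decoded LSF = mean + decoded mean-removed vector."""
    return m + t

def decode_predictive(t, z_prev):
    """Prediction scheme: p_k(i) = rho(i) * z_{k-1}(i); z_k = t_k + p_k;
    decoded LSF = mean + z_k. Returns the LSF vector and z_k for the
    next frame's inter prediction."""
    z = t + rho * z_prev
    return m + z, z
```

The memory z carried between calls to `decode_predictive` is what makes the prediction scheme vulnerable to frame errors, motivating the safety net priority described earlier.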
Fig. 21 is a detailed block diagram of an inverse quantization apparatus according to another embodiment, which may correspond to a coding rate of 41 bits. The inverse quantization apparatus 2100 shown in fig. 21 may include a selection unit 2110, a first inverse quantization module 2130, and a second inverse quantization module 2150. The first inverse quantization module 2130 may include a first inverse quantizer 2131, a first intra predictor 2132, and a third inverse quantizer 2133, and the second inverse quantization module 2150 may include a second inverse quantizer 2151, a second intra predictor 2152, a fourth inverse quantizer 2153, and an inter predictor 2154. The inverse quantization apparatus of fig. 21 may correspond to the quantization apparatus of fig. 13.
Referring to fig. 21, the selection unit 2110 may provide the encoded LPC parameters to one of the first inverse quantization module 2130 and the second inverse quantization module 2150 based on quantization scheme information included in the bitstream.
When the quantization scheme information indicates the safety net scheme, the first inverse quantizer 2131 of the first inverse quantization module 2130 may perform inverse quantization by using BC-TCVQ. The third inverse quantizer 2133 may perform inverse quantization by using SVQ. The quantized LSF coefficients may be obtained by the first inverse quantizer 2131 and the first intra predictor 2132. The finally decoded LSF coefficients are generated by adding the quantized LSF coefficients obtained by the third inverse quantizer 2133 to the quantized LSF coefficients obtained by the first inverse quantizer 2131 and the first intra predictor 2132, and then adding an average value, which is a predetermined DC value, to the addition result.
However, when the quantization scheme information indicates the prediction scheme, the second inverse quantizer 2151 of the second inverse quantization module 2150 may perform inverse quantization by using BC-TCVQ. The inverse quantization operation starts from the lowest vector among the LSF vectors, and the second intra predictor 2152 generates a prediction value for the vector elements of the next stage by using each decoded vector. The fourth inverse quantizer 2153 may perform inverse quantization by using SVQ. The quantized LSF coefficients provided from the fourth inverse quantizer 2153 may be added to the quantized LSF coefficients obtained by the second inverse quantizer 2151 and the second intra predictor 2152. The inter predictor 2154 may generate a prediction value through inter-frame prediction by using the LSF coefficients decoded in the previous frame. The finally decoded LSF coefficients are generated by adding the inter prediction value obtained by the inter predictor 2154 to the addition result and then adding an average value, which is a predetermined DC value, to the result.
Here, the third inverse quantizer 2133 and the fourth inverse quantizer 2153 may share a codebook.
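The way the two decoded stage outputs are combined in the 41-bit case can be sketched as a single addition chain. The function and variable names are assumptions for illustration; the actual codebooks, intra predictors, and inter predictor are those of the apparatus described above, and the inter prediction term is zero under the safety net scheme.

```python
import numpy as np

def combine_stages(t_bctcvq, r_svq, inter_pred, mean):
    """Combine the BC-TCVQ stage output with the SVQ stage output, add the
    inter prediction value (all-zero for the safety net scheme), and finally
    add the mean (DC) vector to obtain the decoded LSF coefficients."""
    return t_bctcvq + r_svq + inter_pred + mean
```

Because both schemes reduce to this same addition chain, the third and fourth inverse quantizers can share one SVQ codebook, as noted above.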
The decoding process in fig. 21 will be described as follows.
The scheme selection and decoding processes of the first and second inverse quantizers 2131 and 2151 are the same as those of fig. 20. Inverse quantization may also be performed by the third and fourth inverse quantizers 2133 and 2153.
[Equation 24]
Herein, the quantized vector may be obtained by the second inverse quantizer 2151 and the second intra predictor 2152.
[Equation 25]
Herein, the quantized vector may be obtained by the first inverse quantizer 2131 and the first intra predictor 2132.
Although not shown, the inverse quantization apparatus of fig. 19 to 21 may be used as a component corresponding to the decoding apparatus of fig. 2.
In each equation, k may represent a frame and i or j may represent a stage.
The content related to BC-TCVQ used in connection with quantization/dequantization of LPC coefficients is described in detail in "Block-Constrained Trellis Coded Vector Quantization of LSF Parameters for Wideband Speech Codecs" (Jungeun Park and Sangwon Kang, ETRI Journal, vol. 30, no. 5, October 2008). Furthermore, details relating to TCVQ are described in "Trellis Coded Vector Quantization" (Thomas R. Fischer et al., IEEE Transactions on Information Theory, vol. 37, no. 6, November 1991).
The methods according to the embodiments can be written as computer-executable programs and implemented in a general-purpose digital computer that executes the programs by using a computer-readable recording medium. In addition, data structures, program commands, or data files usable in the embodiments of the present invention can be recorded in a computer-readable recording medium by various means. The computer-readable recording medium may include all types of storage devices that store data readable by a computer system. Examples of the computer-readable recording medium include magnetic media such as hard disks, floppy disks, and magnetic tape, optical media such as compact disc read-only memory (CD-ROM) and digital versatile discs (DVD), magneto-optical media such as floptical disks, and hardware devices specially configured to store and execute program commands, such as ROM, RAM, and flash memory. The computer-readable recording medium may also be a transmission medium for transmitting signals that specify program commands, data structures, and the like. Examples of program commands include machine language code produced by a compiler and high-level language code executable by a computer using an interpreter.
Although the embodiments of the present invention have been described with reference to limited embodiments and drawings, the present invention is not limited to the above-described embodiments, and various changes and modifications may be made therefrom by one of ordinary skill in the art. Therefore, the scope of the present invention is defined not by the above description but by the appended claims, and all modifications within the scope of the technical idea of the present invention, or equivalents thereof, fall within that scope.
Claims (14)
1. A quantization apparatus comprising:
an intra predictor configured to generate a prediction vector by estimating a current-level sub-vector of the prediction vector based on a prediction matrix of a current level and a previous-level sub-vector of a quantized input vector obtained based on the prediction vector and a quantized prediction error vector; and
a trellis-structured vector quantizer configured to quantize a prediction error vector corresponding to a difference between the prediction vector and an input vector to generate the quantized prediction error vector.
2. The apparatus of claim 1, wherein the intra predictor is configured to estimate an N-dimensional sub-vector of the prediction vector by using an N×N prediction matrix and an N-dimensional sub-vector of the quantized input vector, N being a natural number greater than or equal to 2.
3. The apparatus of claim 1, wherein the trellis-structured vector quantizer is configured to divide the prediction error vector into N-dimensional sub-vectors, N being a natural number greater than or equal to 2, and to assign the N-dimensional sub-vectors to a plurality of stages.
4. The apparatus of claim 1, wherein the prediction matrix is predefined by codebook training.
5. The apparatus of claim 1, further comprising a vector quantizer configured to quantize a quantization error vector corresponding to a difference between the input vector and the quantized input vector.
6. The apparatus of claim 1, wherein the trellis-structured vector quantizer is configured to search for a best index based on a weighting function.
7. The apparatus of claim 5, wherein the vector quantizer is configured to search for a best index based on a weighting function.
8. A quantization method comprising:
generating, by an intra predictor, a prediction vector by estimating a current-level sub-vector of the prediction vector based on a prediction matrix of a current level and a previous-level sub-vector of a quantized input vector obtained based on the prediction vector and a quantized prediction error vector; and
quantizing, by a trellis-structured vector quantizer, a prediction error vector corresponding to a difference between the prediction vector and an input vector to generate the quantized prediction error vector.
9. The method of claim 8, wherein generating the prediction vector comprises estimating an N-dimensional sub-vector of the prediction vector by using an N×N prediction matrix and an N-dimensional sub-vector of the quantized input vector, N being a natural number greater than or equal to 2.
10. The method of claim 8, wherein quantizing the prediction error vector comprises dividing the prediction error vector into N-dimensional sub-vectors, and allocating the N-dimensional sub-vectors to a plurality of stages, N being a natural number greater than or equal to 2.
11. The method of claim 8, wherein the prediction matrix is predefined by codebook training.
12. The method of claim 8, further comprising:
a quantization error vector corresponding to a difference between the input vector and the quantized input vector is quantized by a vector quantizer.
13. The method of claim 8, wherein quantizing the prediction error vector comprises searching for a best index based on a weighting function.
14. The method of claim 12, wherein quantizing the quantization error vector comprises searching for a best index based on a weighting function.
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201461989725P | 2014-05-07 | 2014-05-07 | |
US61/989,725 | 2014-05-07 | ||
US201462029687P | 2014-07-28 | 2014-07-28 | |
US62/029,687 | 2014-07-28 | ||
CN201580037280.6A CN107077857B (en) | 2014-05-07 | 2015-05-07 | Method and apparatus for quantizing linear prediction coefficients and method and apparatus for dequantizing linear prediction coefficients |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201580037280.6A Division CN107077857B (en) | 2014-05-07 | 2015-05-07 | Method and apparatus for quantizing linear prediction coefficients and method and apparatus for dequantizing linear prediction coefficients |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112927703A true CN112927703A (en) | 2021-06-08 |
Family
ID=54392696
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110189314.0A Pending CN112927702A (en) | 2014-05-07 | 2015-05-07 | Method and apparatus for quantizing linear prediction coefficients and method and apparatus for dequantizing linear prediction coefficients |
CN202110189590.7A Pending CN112927703A (en) | 2014-05-07 | 2015-05-07 | Method and apparatus for quantizing linear prediction coefficients and method and apparatus for dequantizing linear prediction coefficients |
CN201580037280.6A Active CN107077857B (en) | 2014-05-07 | 2015-05-07 | Method and apparatus for quantizing linear prediction coefficients and method and apparatus for dequantizing linear prediction coefficients |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110189314.0A Pending CN112927702A (en) | 2014-05-07 | 2015-05-07 | Method and apparatus for quantizing linear prediction coefficients and method and apparatus for dequantizing linear prediction coefficients |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201580037280.6A Active CN107077857B (en) | 2014-05-07 | 2015-05-07 | Method and apparatus for quantizing linear prediction coefficients and method and apparatus for dequantizing linear prediction coefficients |
Country Status (5)
Country | Link |
---|---|
US (3) | US10504532B2 (en) |
EP (3) | EP3142110B1 (en) |
KR (3) | KR102593442B1 (en) |
CN (3) | CN112927702A (en) |
WO (1) | WO2015170899A1 (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3869506A1 (en) | 2014-03-28 | 2021-08-25 | Samsung Electronics Co., Ltd. | Method and device for quantization of linear prediction coefficient and method and device for inverse quantization |
CN112927702A (en) * | 2014-05-07 | 2021-06-08 | 三星电子株式会社 | Method and apparatus for quantizing linear prediction coefficients and method and apparatus for dequantizing linear prediction coefficients |
US11270187B2 (en) * | 2017-11-07 | 2022-03-08 | Samsung Electronics Co., Ltd | Method and apparatus for learning low-precision neural network that combines weight quantization and activation quantization |
US11451840B2 (en) * | 2018-06-18 | 2022-09-20 | Qualcomm Incorporated | Trellis coded quantization coefficient coding |
CN111899748B (en) * | 2020-04-15 | 2023-11-28 | 珠海市杰理科技股份有限公司 | Audio coding method and device based on neural network and coder |
KR20210133554A (en) * | 2020-04-29 | 2021-11-08 | 한국전자통신연구원 | Method and apparatus for encoding and decoding audio signal using linear predictive coding |
CN115277323A (en) * | 2022-07-25 | 2022-11-01 | Oppo广东移动通信有限公司 | Data frame transmission method, device, chip, storage medium and Bluetooth equipment |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020138260A1 (en) * | 2001-03-26 | 2002-09-26 | Dae-Sik Kim | LSF quantizer for wideband speech coder |
US20040230429A1 (en) * | 2003-02-19 | 2004-11-18 | Samsung Electronics Co., Ltd. | Block-constrained TCQ method, and method and apparatus for quantizing LSF parameter employing the same in speech coding system |
KR20060068278A (en) * | 2004-12-16 | 2006-06-21 | 한국전자통신연구원 | Apparatus and method for quantization of mel-cepstrum parameters in dispersed voice recognition system |
CN101089951A (en) * | 2006-06-16 | 2007-12-19 | 徐光锁 | Band spreading coding method and device and decode method and device |
KR20080092770A (en) * | 2007-04-13 | 2008-10-16 | 한국전자통신연구원 | The quantizer and method of lsf coefficient in wide-band speech coder using trellis coded quantization algorithm |
CN101911185A (en) * | 2008-01-16 | 2010-12-08 | 松下电器产业株式会社 | Vector quantizer, vector inverse quantizer, and methods therefor |
CN103325375A (en) * | 2013-06-05 | 2013-09-25 | 上海交通大学 | Coding and decoding device and method of ultralow-bit-rate speech |
CN103620676A (en) * | 2011-04-21 | 2014-03-05 | 三星电子株式会社 | Method of quantizing linear predictive coding coefficients, sound encoding method, method of de-quantizing linear predictive coding coefficients, sound decoding method, and recording medium |
Family Cites Families (41)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE69334349D1 (en) | 1992-09-01 | 2011-04-21 | Apple Inc | Improved vector quatization |
US5596659A (en) | 1992-09-01 | 1997-01-21 | Apple Computer, Inc. | Preprocessing and postprocessing for vector quantization |
IT1271959B (en) * | 1993-03-03 | 1997-06-10 | Alcatel Italia | LINEAR PREDICTION SPEAKING CODEC EXCITED BY A BOOK OF CODES |
JP3042886B2 (en) | 1993-03-26 | 2000-05-22 | モトローラ・インコーポレーテッド | Vector quantizer method and apparatus |
JP3557255B2 (en) | 1994-10-18 | 2004-08-25 | 松下電器産業株式会社 | LSP parameter decoding apparatus and decoding method |
US5774839A (en) | 1995-09-29 | 1998-06-30 | Rockwell International Corporation | Delayed decision switched prediction multi-stage LSF vector quantization |
US6904404B1 (en) | 1996-07-01 | 2005-06-07 | Matsushita Electric Industrial Co., Ltd. | Multistage inverse quantization having the plurality of frequency bands |
JP3246715B2 (en) * | 1996-07-01 | 2002-01-15 | 松下電器産業株式会社 | Audio signal compression method and audio signal compression device |
US6055496A (en) * | 1997-03-19 | 2000-04-25 | Nokia Mobile Phones, Ltd. | Vector quantization in celp speech coder |
US5974181A (en) * | 1997-03-20 | 1999-10-26 | Motorola, Inc. | Data compression system, method, and apparatus |
TW408298B (en) | 1997-08-28 | 2000-10-11 | Texas Instruments Inc | Improved method for switched-predictive quantization |
US6125149A (en) | 1997-11-05 | 2000-09-26 | At&T Corp. | Successively refinable trellis coded quantization |
US6324218B1 (en) | 1998-01-16 | 2001-11-27 | At&T | Multiple description trellis coded quantization |
US7072832B1 (en) | 1998-08-24 | 2006-07-04 | Mindspeed Technologies, Inc. | System for speech encoding having an adaptive encoding arrangement |
AU7486200A (en) * | 1999-09-22 | 2001-04-24 | Conexant Systems, Inc. | Multimode speech encoder |
US6959274B1 (en) | 1999-09-22 | 2005-10-25 | Mindspeed Technologies, Inc. | Fixed rate speech compression system and method |
JP3404024B2 (en) | 2001-02-27 | 2003-05-06 | 三菱電機株式会社 | Audio encoding method and audio encoding device |
JP2003140693A (en) * | 2001-11-02 | 2003-05-16 | Sony Corp | Device and method for decoding voice |
CA2388358A1 (en) | 2002-05-31 | 2003-11-30 | Voiceage Corporation | A method and device for multi-rate lattice vector quantization |
JP2007506986A (en) * | 2003-09-17 | 2007-03-22 | 北京阜国数字技術有限公司 | Multi-resolution vector quantization audio CODEC method and apparatus |
KR100728056B1 (en) | 2006-04-04 | 2007-06-13 | 삼성전자주식회사 | Method of multi-path trellis coded quantization and multi-path trellis coded quantizer using the same |
US8589151B2 (en) | 2006-06-21 | 2013-11-19 | Harris Corporation | Vocoder and associated method that transcodes between mixed excitation linear prediction (MELP) vocoders with different speech frame rates |
US7414549B1 (en) | 2006-08-04 | 2008-08-19 | The Texas A&M University System | Wyner-Ziv coding based on TCQ and LDPC codes |
KR101412255B1 (en) | 2006-12-13 | 2014-08-14 | 파나소닉 인텔렉츄얼 프로퍼티 코포레이션 오브 아메리카 | Encoding device, decoding device, and method therof |
WO2008072736A1 (en) * | 2006-12-15 | 2008-06-19 | Panasonic Corporation | Adaptive sound source vector quantization unit and adaptive sound source vector quantization method |
CN101399041A (en) * | 2007-09-30 | 2009-04-01 | 华为技术有限公司 | Encoding/decoding method and device for noise background |
KR101671005B1 (en) | 2007-12-27 | 2016-11-01 | 삼성전자주식회사 | Method and apparatus for quantization encoding and de-quantization decoding using trellis |
CN101609682B (en) | 2008-06-16 | 2012-08-08 | 向为 | Encoder and method for self adapting to discontinuous transmission of multi-rate wideband |
EP2139000B1 (en) | 2008-06-25 | 2011-05-25 | Thomson Licensing | Method and apparatus for encoding or decoding a speech and/or non-speech audio input signal |
RU2519027C2 (en) | 2009-02-13 | 2014-06-10 | Панасоник Корпорэйшн | Vector quantiser, vector inverse quantiser and methods therefor |
US9269366B2 (en) | 2009-08-03 | 2016-02-23 | Broadcom Corporation | Hybrid instantaneous/differential pitch period coding |
WO2011087333A2 (en) * | 2010-01-15 | 2011-07-21 | 엘지전자 주식회사 | Method and apparatus for processing an audio signal |
WO2011126340A2 (en) | 2010-04-08 | 2011-10-13 | 엘지전자 주식회사 | Method and apparatus for processing an audio signal |
KR101660843B1 (en) | 2010-05-27 | 2016-09-29 | 삼성전자주식회사 | Apparatus and method for determining weighting function for lpc coefficients quantization |
KR101747917B1 (en) * | 2010-10-18 | 2017-06-15 | 삼성전자주식회사 | Apparatus and method for determining weighting function having low complexity for lpc coefficients quantization |
CN105244034B (en) * | 2011-04-21 | 2019-08-13 | 三星电子株式会社 | For the quantization method and coding/decoding method and equipment of voice signal or audio signal |
CN103050121A (en) * | 2012-12-31 | 2013-04-17 | 北京迅光达通信技术有限公司 | Linear prediction speech coding method and speech synthesis method |
CN103236262B (en) * | 2013-05-13 | 2015-08-26 | 大连理工大学 | A kind of code-transferring method of speech coder code stream |
CN103632673B (en) * | 2013-11-05 | 2016-05-18 | 无锡北邮感知技术产业研究院有限公司 | A kind of non-linear quantization of speech linear predictive model |
EP3869506A1 (en) * | 2014-03-28 | 2021-08-25 | Samsung Electronics Co., Ltd. | Method and device for quantization of linear prediction coefficient and method and device for inverse quantization |
CN112927702A (en) * | 2014-05-07 | 2021-06-08 | 三星电子株式会社 | Method and apparatus for quantizing linear prediction coefficients and method and apparatus for dequantizing linear prediction coefficients |
-
2015
- 2015-05-07 CN CN202110189314.0A patent/CN112927702A/en active Pending
- 2015-05-07 CN CN202110189590.7A patent/CN112927703A/en active Pending
- 2015-05-07 KR KR1020227016454A patent/KR102593442B1/en active Application Filing
- 2015-05-07 EP EP15789302.5A patent/EP3142110B1/en active Active
- 2015-05-07 KR KR1020237035370A patent/KR20230149335A/en not_active Application Discontinuation
- 2015-05-07 US US15/309,334 patent/US10504532B2/en active Active
- 2015-05-07 WO PCT/KR2015/004577 patent/WO2015170899A1/en active Application Filing
- 2015-05-07 KR KR1020167031128A patent/KR102400540B1/en active IP Right Grant
- 2015-05-07 CN CN201580037280.6A patent/CN107077857B/en active Active
- 2015-05-07 EP EP24167632.9A patent/EP4375992A3/en active Pending
- 2015-05-07 EP EP24167654.3A patent/EP4418266A2/en active Pending
-
2019
- 2019-12-02 US US16/700,246 patent/US11238878B2/en active Active
-
2022
- 2022-01-10 US US17/571,597 patent/US11922960B2/en active Active
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020138260A1 (en) * | 2001-03-26 | 2002-09-26 | Dae-Sik Kim | LSF quantizer for wideband speech coder |
KR20020075592A (en) * | 2001-03-26 | 2002-10-05 | 한국전자통신연구원 | LSF quantization for wideband speech coder |
US20040230429A1 (en) * | 2003-02-19 | 2004-11-18 | Samsung Electronics Co., Ltd. | Block-constrained TCQ method, and method and apparatus for quantizing LSF parameter employing the same in speech coding system |
KR20060068278A (en) * | 2004-12-16 | 2006-06-21 | 한국전자통신연구원 | Apparatus and method for quantization of mel-cepstrum parameters in dispersed voice recognition system |
CN101089951A (en) * | 2006-06-16 | 2007-12-19 | 徐光锁 | Band spreading coding method and device and decode method and device |
KR20080092770A (en) * | 2007-04-13 | 2008-10-16 | 한국전자통신연구원 | The quantizer and method of lsf coefficient in wide-band speech coder using trellis coded quantization algorithm |
CN101911185A (en) * | 2008-01-16 | 2010-12-08 | 松下电器产业株式会社 | Vector quantizer, vector inverse quantizer, and methods therefor |
CN103620676A (en) * | 2011-04-21 | 2014-03-05 | 三星电子株式会社 | Method of quantizing linear predictive coding coefficients, sound encoding method, method of de-quantizing linear predictive coding coefficients, sound decoding method, and recording medium |
CN103325375A (en) * | 2013-06-05 | 2013-09-25 | 上海交通大学 | Coding and decoding device and method of ultralow-bit-rate speech |
Also Published As
Publication number | Publication date |
---|---|
US20220130403A1 (en) | 2022-04-28 |
EP3142110C0 (en) | 2024-06-26 |
US10504532B2 (en) | 2019-12-10 |
EP3142110A1 (en) | 2017-03-15 |
EP4418266A2 (en) | 2024-08-21 |
EP3142110B1 (en) | 2024-06-26 |
EP4375992A3 (en) | 2024-07-10 |
KR102400540B1 (en) | 2022-05-20 |
CN107077857A (en) | 2017-08-18 |
CN107077857B (en) | 2021-03-09 |
KR20170007280A (en) | 2017-01-18 |
US11238878B2 (en) | 2022-02-01 |
US20200105285A1 (en) | 2020-04-02 |
US20170154632A1 (en) | 2017-06-01 |
US11922960B2 (en) | 2024-03-05 |
EP3142110A4 (en) | 2017-11-29 |
KR102593442B1 (en) | 2023-10-25 |
KR20230149335A (en) | 2023-10-26 |
CN112927702A (en) | 2021-06-08 |
WO2015170899A1 (en) | 2015-11-12 |
KR20220067003A (en) | 2022-05-24 |
EP4375992A2 (en) | 2024-05-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11848020B2 (en) | Method and device for quantization of linear prediction coefficient and method and device for inverse quantization | |
US11922960B2 (en) | Method and device for quantizing linear predictive coefficient, and method and device for dequantizing same | |
US10249308B2 (en) | Weight function determination device and method for quantizing linear prediction coding coefficient |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |