WO2016121826A1 - Encoding device, decoding device, methods therefor, program, and recording medium - Google Patents
- Publication number
- WO2016121826A1 (PCT/JP2016/052365)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- parameter
- unit
- code
- decoding
- encoding
- Prior art date
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/002—Dynamic bit allocation
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
- G10L19/035—Scalar quantisation
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/06—Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/12—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/22—Mode decision, i.e. based on audio signal content versus external parameters
Definitions
- The present invention relates to a technique for encoding or decoding a time-series signal such as a sound signal.
- Parameters such as the LSP are known as parameters representing the characteristics of a time-series signal such as a sound signal (see, for example, Non-Patent Document 1).
- Because the LSP is a multi-order parameter, it can be difficult to use directly for sound classification or interval estimation. For example, threshold-based processing using the LSP is not straightforward.
- The parameter η is a shape parameter that determines the probability distribution to which the encoding target of arithmetic coding belongs, in a coding scheme that arithmetically codes quantized values of frequency-domain coefficients using a linear prediction envelope, as used in, for example, the 3GPP EVS (Enhanced Voice Services) standard.
- The parameter η is related to the distribution of the encoding target, and if η is determined appropriately, efficient encoding and decoding can be performed.
- The parameter η can also serve as an index representing the characteristics of the time-series signal. For this reason, although this is not publicly known, it is conceivable to specify an appropriate configuration of the encoding or decoding process based on η and to perform encoding or decoding with the specified configuration.
- An object of the present invention is to provide an encoding device, a decoding device, methods therefor, a program, and a recording medium that specify an appropriate configuration of the encoding or decoding process based on the parameter η and perform encoding or decoding with the specified configuration.
- An encoding device according to one aspect of the present invention encodes a time-series signal for each predetermined time interval in the frequency domain. The parameter η is a positive number. The parameter η corresponding to the time-series signal is a shape parameter of the generalized Gaussian distribution that approximates the histogram of a whitened spectrum sequence, which is the sequence obtained by dividing the frequency-domain sample sequence corresponding to the time-series signal by a spectral envelope estimated by regarding the η-th power of the absolute values of that frequency-domain sample sequence as a power spectrum. Any one of a plurality of parameters η can be selected for each predetermined time interval, or the parameter η is variable. The device includes an encoding unit that encodes the time-series signal for each predetermined time interval by an encoding process whose configuration is specified based on at least the parameter η for that time interval.
- An encoding device according to another aspect of the present invention encodes a time-series signal for each predetermined time interval in the frequency domain. The parameter η is a positive number, and any one of a plurality of parameters η can be selected for each predetermined time interval, or the parameter η is variable. The device includes an encoding unit that, for each predetermined time interval, obtains and outputs a code by encoding the frequency-domain sample sequence corresponding to the time-series signal by an encoding process in which the bit allocation is changed, or substantially changed, based on a spectral envelope estimated by regarding the η-th power of the absolute values of the frequency-domain sample sequence as a power spectrum. A parameter code representing the parameter η corresponding to the output code is also output.
- A decoding device according to one aspect of the present invention, with the parameter η being a positive number, includes: a parameter code decoding unit that obtains the parameter η by decoding an input parameter code as a code representing the shape parameter of the generalized Gaussian distribution that approximates the histogram of a whitened spectrum sequence, which is the sequence obtained by dividing the frequency-domain sample sequence by a spectral envelope estimated by regarding the η-th power of the absolute values of the frequency-domain sample sequence as a power spectrum; a specifying unit that specifies the configuration of the decoding process based on at least the obtained parameter η; and a decoding unit that decodes an input code by the decoding process of the specified configuration.
- A decoding device according to another aspect of the present invention obtains a frequency-domain sample sequence corresponding to a time-series signal by decoding in the frequency domain, and includes: a parameter code decoding unit that obtains the parameter η by decoding an input parameter code; a linear prediction coefficient decoding unit that obtains coefficients convertible to linear prediction coefficients by decoding an input linear prediction coefficient code; an unsmoothed spectral envelope sequence generation unit that, using the obtained parameter η, obtains an unsmoothed spectral envelope sequence, which is the sequence obtained by raising the amplitude spectral envelope sequence corresponding to the coefficients convertible to linear prediction coefficients to the 1/η power; and a decoding unit that obtains the frequency-domain sample sequence corresponding to the time-series signal by decoding an input integer signal code according to a bit allocation that changes, or substantially changes, based on the unsmoothed spectral envelope sequence.
- A block diagram illustrating an example of a conventional encoding device. A block diagram illustrating an example of a conventional encoding unit. A diagram illustrating the generalized Gaussian distribution.
- A block diagram illustrating an example of an encoding device. A flowchart illustrating an example of an encoding method.
- A block diagram illustrating an example of an encoding unit. A block diagram illustrating an example of an encoding unit.
- A block diagram illustrating an example of a decoding device. A flowchart illustrating an example of a decoding method.
- A flowchart illustrating an example of the processing of a decoding unit. A block diagram illustrating an example of an encoding device.
- A flowchart illustrating an example of an encoding method. A block diagram illustrating an example of a parameter determination device. A flowchart illustrating an example of a parameter determination method. A histogram for explaining the technical background.
- A block diagram illustrating an example of an encoding device. A flowchart illustrating an example of an encoding method.
- A block diagram illustrating an example of a decoding device. A flowchart illustrating an example of a decoding method.
- A block diagram illustrating an example of a parameter determination unit. A flowchart illustrating an example of the processing of a parameter determination unit.
- A sound signal, which is a time-series signal in the time domain, is input to the frequency domain transform unit 11.
- The sound signal is, for example, a speech signal or an acoustic signal.
- The frequency domain transform unit 11 converts the input time-domain sound signal into an N-point MDCT coefficient sequence X(0), X(1), ..., X(N-1) in the frequency domain. N is a positive integer.
- The resulting MDCT coefficient sequence X(0), X(1), ..., X(N-1) is output to the envelope normalization unit 15.
- The linear prediction analysis unit 12 receives the sound signal, which is a time-series signal in the time domain.
- The linear prediction analysis unit 12 generates linear prediction coefficients α_1, α_2, ..., α_p by performing linear prediction analysis on the sound signal input in units of frames. The linear prediction analysis unit 12 also encodes the generated linear prediction coefficients α_1, α_2, ..., α_p to generate a linear prediction coefficient code. An example of the linear prediction coefficient code is an LSP code, which is a code corresponding to the sequence of quantized values of the LSP (Line Spectrum Pairs) parameter sequence corresponding to the linear prediction coefficients α_1, α_2, ..., α_p. p is an integer of 2 or more.
- The linear prediction analysis unit 12 also generates quantized linear prediction coefficients ^α_1, ^α_2, ..., ^α_p, which are the linear prediction coefficients corresponding to the generated linear prediction coefficient code.
- The generated quantized linear prediction coefficients ^α_1, ^α_2, ..., ^α_p are output to the smoothed amplitude spectral envelope sequence generation unit 14 and the unsmoothed amplitude spectral envelope sequence generation unit 13.
- The generated linear prediction coefficient code is output to the decoding device.
- For the linear prediction analysis, for example, a method is used in which the autocorrelation of the sound signal input in units of frames is obtained and the Levinson-Durbin algorithm is applied to the obtained autocorrelation to obtain the linear prediction coefficients.
- Alternatively, the MDCT coefficient sequence obtained by the frequency domain transform unit 11 may be input to the linear prediction analysis unit 12, and the linear prediction coefficients may be obtained by applying the Levinson-Durbin algorithm to the inverse Fourier transform of the sequence of squared values of the coefficients of the MDCT coefficient sequence.
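The autocorrelation method described above can be sketched as follows. This is a minimal illustration, not the codec's implementation: the function names `autocorrelation` and `levinson_durbin` are hypothetical, windowing and lag weighting are omitted, and the sign convention (prediction error x[t] + a[1]x[t-1] + ... + a[p]x[t-p]) is an assumption.

```python
def autocorrelation(x, order):
    """Return the autocorrelation r[0..order] of one frame x."""
    n = len(x)
    return [sum(x[t] * x[t - lag] for t in range(lag, n))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations recursively.

    Returns (a, err) where a[0] == 1, a[1..order] are the prediction
    coefficients (assumed sign convention: the prediction error is
    x[t] + a[1]*x[t-1] + ... + a[order]*x[t-order]) and err is the
    prediction residual energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input (e.g. an all-zero frame)
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient of stage i
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


# Autocorrelation of an ideal first-order process with correlation 0.5.
a, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(a, err)
```

For this input the second-order coefficient vanishes, because a first-order predictor already explains the given autocorrelation.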
- <Smoothed amplitude spectral envelope sequence generation unit 14> The quantized linear prediction coefficients ^α_1, ^α_2, ..., ^α_p generated by the linear prediction analysis unit 12 are input to the smoothed amplitude spectral envelope sequence generation unit 14.
- The smoothed amplitude spectral envelope sequence generation unit 14 uses the quantized linear prediction coefficients ^α_1, ^α_2, ..., ^α_p to generate the smoothed amplitude spectral envelope sequence ^W_γ(0), ^W_γ(1), ..., ^W_γ(N-1) defined by the following equation (B1).
- exp(·) is the exponential function with Napier's constant as its base, and j is the imaginary unit.
- γ is a positive constant of 1 or less.
- γ is a coefficient that blunts the unevenness of the amplitude spectral envelope sequence ^W(0), ^W(1), ..., ^W(N-1) defined by the following equation (B2); in other words, it is a coefficient that smooths the amplitude spectral envelope sequence.
- The generated smoothed amplitude spectral envelope sequence ^W_γ(0), ^W_γ(1), ..., ^W_γ(N-1) is output to the envelope normalization unit 15 and to the dispersion parameter determination unit 163 of the encoding unit 16.
- <Unsmoothed amplitude spectral envelope sequence generation unit 13> The quantized linear prediction coefficients ^α_1, ^α_2, ..., ^α_p generated by the linear prediction analysis unit 12 are input to the unsmoothed amplitude spectral envelope sequence generation unit 13.
- The unsmoothed amplitude spectral envelope sequence generation unit 13 uses the quantized linear prediction coefficients ^α_1, ^α_2, ..., ^α_p to generate the unsmoothed amplitude spectral envelope sequence ^W(0), ^W(1), ..., ^W(N-1) defined by the above equation (B2).
- The generated unsmoothed amplitude spectral envelope sequence ^W(0), ^W(1), ..., ^W(N-1) is output to the dispersion parameter determination unit 163 of the encoding unit 16.
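The relationship between the unsmoothed and smoothed envelopes can be illustrated with a short sketch. Equations (B1) and (B2) are not reproduced in this text, so the code below assumes the common all-pole form 1/|1 + Σ a_n γ^n exp(-j2πkn/N)|, with γ = 1 giving the unsmoothed envelope. The function name `amplitude_envelope`, the frequency grid 2πk/N, and the omission of any constant scale factor are all assumptions.

```python
import cmath
import math


def amplitude_envelope(a, n_bins, gamma=1.0):
    """All-pole amplitude envelope on n_bins frequency points.

    a      : coefficients [1, a_1, ..., a_p] (a[0] must be 1)
    gamma  : smoothing coefficient, 0 < gamma <= 1; gamma == 1 means
             no smoothing (assumed form, constant factors omitted)
    """
    envelope = []
    for k in range(n_bins):
        w = 2.0 * math.pi * k / n_bins  # assumed frequency grid
        denom = sum(a[n] * gamma ** n * cmath.exp(-1j * w * n)
                    for n in range(len(a)))
        envelope.append(1.0 / abs(denom))
    return envelope


coeffs = [1.0, -0.5]
unsmoothed = amplitude_envelope(coeffs, n_bins=4)             # gamma = 1
smoothed = amplitude_envelope(coeffs, n_bins=4, gamma=0.5)    # blunted
print(unsmoothed)
print(smoothed)
```

With γ below 1 the peaks are lowered and the valleys raised, which is the blunting effect attributed to γ above.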
- The envelope normalization unit 15 receives the MDCT coefficient sequence X(0), X(1), ..., X(N-1) generated by the frequency domain transform unit 11 and the smoothed amplitude spectral envelope sequence ^W_γ(0), ^W_γ(1), ..., ^W_γ(N-1) output by the smoothed amplitude spectral envelope sequence generation unit 14.
- The generated normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1) is output to the encoding unit 16.
- The envelope normalization unit 15 normalizes the MDCT coefficient sequence X(0), X(1), ..., X(N-1) in units of frames using the smoothed amplitude spectral envelope sequence ^W_γ(0), ^W_γ(1), ..., ^W_γ(N-1), which is a sequence in which the amplitude spectral envelope has been blunted.
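A minimal sketch of this normalization follows, assuming it is a per-bin division X_N(k) = X(k) / ^W_γ(k). The function name `normalize_by_envelope` and the toy envelope values are hypothetical and chosen only for illustration.

```python
def normalize_by_envelope(x, smoothed_env):
    """Divide each MDCT coefficient by the smoothed envelope value."""
    if len(x) != len(smoothed_env):
        raise ValueError("coefficient and envelope lengths differ")
    return [xk / wk for xk, wk in zip(x, smoothed_env)]


x = [4.0, -2.0, 1.0, -0.5]          # toy MDCT coefficients
unsmoothed = [4.0, 2.0, 1.0, 0.5]   # toy unsmoothed envelope
smoothed = [2.0, 1.5, 1.0, 0.8]     # toy blunted envelope

x_n = normalize_by_envelope(x, smoothed)
# Unevenness that remains after normalizing by the blunted envelope.
residual = [w / ws for w, ws in zip(unsmoothed, smoothed)]
print(x_n)
print(residual)
```

In this toy case the magnitudes of the normalized coefficients follow the ratio of the unsmoothed to the smoothed envelope, so the sequence is not fully whitened.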
- The encoding unit 16 receives the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1) generated by the envelope normalization unit 15, the smoothed amplitude spectral envelope sequence ^W_γ(0), ^W_γ(1), ..., ^W_γ(N-1) generated by the smoothed amplitude spectral envelope sequence generation unit 14, and the unsmoothed amplitude spectral envelope sequence ^W(0), ^W(1), ..., ^W(N-1) generated by the unsmoothed amplitude spectral envelope sequence generation unit 13.
- The encoding unit 16 generates a code corresponding to the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1).
- The generated code corresponding to the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1) is output to the decoding device.
- The code obtained by encoding the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1) is the integer signal code.
- The encoding unit 16 determines a gain g such that the number of bits of the integer signal code is at most the allocated number of bits B, which is the number of bits allocated in advance, and is as large as possible. The encoding unit 16 then generates a gain code corresponding to the determined gain g and an integer signal code corresponding to the determined gain g.
- The generated gain code and integer signal code are output to the decoding device as the code corresponding to the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1).
- The encoding unit 16 includes a gain acquisition unit 161, a quantization unit 162, a dispersion parameter determination unit 163, an arithmetic encoding unit 164, a gain encoding unit 165, a determination unit 166, and a gain update unit 167.
- Each part of FIG. 2 is described below.
- The gain acquisition unit 161 determines, from the input normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1), a global gain g such that the number of bits of the integer signal code is at most the allocated number of bits B, which is the number of bits allocated in advance, and is as large as possible, and outputs it.
- The global gain g obtained by the gain acquisition unit 161 is the initial value of the global gain used in the quantization unit 162.
- The quantization unit 162 obtains and outputs the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1), which is the sequence of the integer parts of the results of dividing each coefficient of the input normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1) by the global gain g obtained by the gain acquisition unit 161 or the gain update unit 167.
- The global gain g used when the quantization unit 162 is executed for the first time is the global gain g obtained by the gain acquisition unit 161, that is, the initial value of the global gain.
- The global gain g used when the quantization unit 162 is executed for the second or a later time is the global gain g obtained by the gain update unit 167, that is, the updated value of the global gain.
- The dispersion parameter determination unit 163 obtains and outputs the dispersion parameters φ(0), φ(1), ..., φ(N-1) from the input unsmoothed amplitude spectral envelope sequence ^W(0), ^W(1), ..., ^W(N-1) and the input smoothed amplitude spectral envelope sequence ^W_γ(0), ^W_γ(1), ..., ^W_γ(N-1).
- The arithmetic encoding unit 164 arithmetically encodes the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1) obtained by the quantization unit 162, using the dispersion parameters φ(0), φ(1), ..., φ(N-1) obtained by the dispersion parameter determination unit 163, to obtain an integer signal code. It outputs the integer signal code and the number of consumed bits C, which is the number of bits of the integer signal code.
- When the number of gain updates has reached a predetermined number, the determination unit 166 outputs the integer signal code and outputs to the gain encoding unit 165 an instruction signal to encode the global gain g obtained by the gain update unit 167. When the number of gain updates is less than the predetermined number, it outputs the number of consumed bits C measured by the arithmetic encoding unit 164 to the gain update unit 167.
- <Gain update unit 167> When the number of consumed bits C measured by the arithmetic encoding unit 164 is larger than the allocated number of bits B, the gain update unit 167 updates the global gain g to a larger value and outputs it. When the number of consumed bits C is smaller than the allocated number of bits B, it updates the global gain g to a smaller value and outputs the updated global gain g.
- The gain encoding unit 165 encodes the global gain g obtained by the gain update unit 167 in accordance with the instruction signal output from the determination unit 166, obtains a gain code, and outputs it.
- The integer signal code output from the determination unit 166 and the gain code output from the gain encoding unit 165 are output to the decoding device as the code corresponding to the normalized MDCT coefficient sequence.
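The interplay of the quantization unit 162, the determination unit 166, and the gain update unit 167 can be sketched as a simple rate loop. This is an illustration under stated assumptions: `estimate_bits` is a hypothetical stand-in for the arithmetic encoder (it ignores the dispersion parameters), the bisection-style update rule is one possible choice rather than the rule used by the codec, and the loop keeps the last gain that met the budget, which is a design choice of this sketch.

```python
def quantize(x_norm, g):
    """Integer part of X_N(k) / g; int() truncates toward zero."""
    return [int(x / g) for x in x_norm]


def estimate_bits(q):
    """Hypothetical bit counter standing in for the arithmetic coder."""
    total = 0
    for v in q:
        magnitude_bits = 2 * ((abs(v) + 1).bit_length() - 1) + 1
        sign_bit = 1 if v != 0 else 0
        total += magnitude_bits + sign_bit
    return total


def rate_loop(x_norm, bit_budget, g_init, max_updates=16):
    """Search for a gain whose bit count fits the budget as tightly as
    the limited number of updates allows. Returns (g, q, bits) or None."""
    g = g_init
    g_low = None    # largest gain seen that used too many bits
    g_high = None   # smallest gain seen that met the budget
    best = None
    for _ in range(max_updates):
        q = quantize(x_norm, g)
        c = estimate_bits(q)
        if c > bit_budget:
            # Too many bits: make the gain larger (coarser quantization).
            g_low = g
            g = (g + g_high) / 2 if g_high is not None else g * 2
        else:
            # Budget met: remember this result, then try a smaller gain.
            best = (g, q, c)
            g_high = g
            if c == bit_budget:
                break
            g = (g_low + g) / 2 if g_low is not None else g / 2
    return best


x_norm = [10.0, -7.5, 3.2, 0.4, -1.1, 6.6, 0.0, -2.9]
result = rate_loop(x_norm, bit_budget=24, g_init=1.0)
print(result)
```

A larger gain produces smaller integers and therefore fewer bits, which is why the update direction follows the comparison between C and B.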
- In this way, the normalized MDCT coefficient sequence is encoded.
- This encoding method is adopted in MPEG-4 USAC and elsewhere.
- Normalizing the MDCT coefficient sequence X(0), X(1), ..., X(N-1) by the smoothed amplitude spectral envelope sequence whitens it less than normalizing by the unsmoothed amplitude spectral envelope sequence would.
- Specifically, compared with the sequence X(0)/^W(0), X(1)/^W(1), ..., X(N-1)/^W(N-1) obtained by normalizing the MDCT coefficient sequence X(0), X(1), ..., X(N-1) by the unsmoothed amplitude spectral envelope sequence ^W(0), ^W(1), ..., ^W(N-1), the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1) retains the envelope unevenness represented by the sequence ^W(0)/^W_γ(0), ^W(1)/^W_γ(1), ..., ^W(N-1)/^W_γ(N-1) (hereinafter referred to as the normalized amplitude spectral envelope sequence ^W_N(0), ^W_N(1), ..., ^W_N(N-1)).
- FIG. 16 shows the frequency of occurrence of the values of the normalized MDCT coefficients for each range of the envelope unevenness ^W(k)/^W_γ(k) of the normalized MDCT sequence.
- The curve labeled "envelope: 0.2-0.3" represents the frequency of the values of the normalized MDCT coefficients X_N(k) for the samples k whose envelope unevenness ^W(k)/^W_γ(k) is 0.2 or more and less than 0.3.
- The curve labeled "envelope: 0.3-0.4" represents the frequency of the values of the normalized MDCT coefficients X_N(k) for the samples k whose envelope unevenness ^W(k)/^W_γ(k) is 0.3 or more and less than 0.4.
- The curve labeled "envelope: 0.4-0.5" represents the frequency of the values of the normalized MDCT coefficients X_N(k) for the samples k whose envelope unevenness ^W(k)/^W_γ(k) is 0.4 or more and less than 0.5.
- The mean of the coefficients in the normalized MDCT coefficient sequence is almost 0, but their variance is related to the envelope value. That is, the larger the envelope unevenness of the normalized MDCT sequence, the wider the base of the curve representing the frequency, and hence the larger the variance of the normalized MDCT coefficients.
- Encoding that exploits this relationship is performed. Specifically, for each coefficient of the frequency-domain coefficient sequence to be encoded, encoding is performed such that the bit allocation is changed, or substantially changed, based on the spectral envelope.
- However, if a bit allocation that is optimal for an encoding target assumed to follow a certain probability distribution (for example, the Laplace distribution) is applied to an encoding target that follows a distribution deviating from that assumption, the compression efficiency may decrease.
- Therefore, a generalized Gaussian distribution represented by the following formula, which is a distribution that can express various probability distributions, is used as the probability distribution to which the encoding target belongs.
- The generalized Gaussian distribution can express various distributions by changing the shape parameter η (> 0); for example, it is the Laplace distribution when η = 1 and the Gaussian distribution when η = 2.
- η is a predetermined number greater than zero.
- The value of η may be determined in advance, or may be selected or varied for each frame, which is a predetermined time interval.
- φ in the above formula is a value corresponding to the dispersion of the distribution, and information on the unevenness of the spectral envelope is incorporated by using this value as a dispersion parameter.
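The role of the shape parameter η and the dispersion parameter φ can be illustrated numerically. The formula itself is not reproduced in this text, so the sketch below assumes a common parameterization of the generalized Gaussian density, f(x | φ, η) = (A(η)/φ) exp(-|B(η)x/φ|^η) with B(η) = sqrt(Γ(3/η)/Γ(1/η)) and A(η) = ηB(η)/(2Γ(1/η)), under which φ equals the standard deviation. The function names `gg_scale` and `gg_pdf` are hypothetical.

```python
import math


def gg_scale(eta):
    """B(eta), chosen so that phi is the standard deviation (assumed)."""
    return math.sqrt(math.gamma(3.0 / eta) / math.gamma(1.0 / eta))


def gg_pdf(x, phi, eta):
    """Generalized Gaussian density with dispersion phi and shape eta."""
    b = gg_scale(eta)
    a = eta * b / (2.0 * math.gamma(1.0 / eta))
    return (a / phi) * math.exp(-abs(b * x / phi) ** eta)


# eta = 2 gives the Gaussian density, eta = 1 the Laplace density.
print(gg_pdf(0.0, 1.0, 2.0))   # peak of a unit-variance Gaussian
print(gg_pdf(0.0, 1.0, 1.0))   # peak of a unit-variance Laplace

# A larger dispersion parameter puts more probability on large values.
print(gg_pdf(3.0, 1.0, 1.0), gg_pdf(3.0, 2.0, 1.0))
```

Under this assumed form, a coefficient at a frequency with a large φ(k) is expected to take large values more often, which is how the envelope information enters the arithmetic code.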
- Specifically, the dispersion parameters φ(0), φ(1), ..., φ(N-1) are generated from the spectral envelope, and for the quantized normalized coefficient X_Q(k) at each frequency k, an arithmetic code that is optimal when the coefficient follows f_GG(X | φ(k), η) is configured; encoding is performed using the arithmetic code based on this configuration.
- The dispersion parameter for each coefficient of the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1) is calculated by the following equation (A1).
- σ is the square root of σ^2.
- Specifically, the Levinson-Durbin algorithm is applied to the inverse Fourier transform of the sequence of values obtained by raising the absolute values of the MDCT coefficients to the power of η, and the quantized values ^β_1, ^β_2, ..., ^β_p of the resulting linear prediction coefficients are used in place of the quantized linear prediction coefficients ^α_1, ^α_2, ..., ^α_p. The unsmoothed amplitude spectral envelope sequence ^H(0), ^H(1), ..., ^H(N-1) and the smoothed amplitude spectral envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1) are obtained by the following equations (A2) and (A3), respectively.
- σ^(2/η)/g in equation (A1) is a value closely related to the entropy, and if the bit rate is fixed, its variation from frame to frame is small. For this reason, a predetermined fixed value may be used as σ^(2/η)/g. When a fixed value is used in this way, the method of the present invention requires no new additional information.
- The above technique is based on a minimization problem of the code length when the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1) is arithmetically coded. The derivation of the technique is described below.
- The minimization problem of the code length L with respect to the dispersion parameter sequence is φ^η(k)/(ηB^η(η)) and
- If a correspondence is established between the dispersion parameter sequence φ(0), φ(1), ..., φ(N-1) on the one hand and the linear prediction coefficients β_1, β_2, ..., β_p and the prediction residual energy σ^2 on the other, an optimization problem for obtaining the linear prediction coefficients that minimize the code length can be set up. Here, the following correspondence is made so that a conventional fast solution method can be used.
- Conventional linear prediction analysis, that is, applying the Levinson-Durbin algorithm to the inverse Fourier transform of the power spectrum, is known to be an operation that obtains the linear prediction coefficients minimizing the Itakura-Saito distance between the power spectrum and the all-pole spectral envelope. Therefore, for the above code length minimization problem, an optimal solution can be obtained in the same way as in the conventional method by applying the Levinson-Durbin algorithm to the η-th power of the amplitude spectrum, that is, to the η-th power of the absolute values of the MDCT coefficient sequence.
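The idea of treating the η-th power of the amplitude spectrum as a power spectrum can be sketched as follows. This is a minimal illustration under assumptions: the helper names are hypothetical, the inverse Fourier transform is replaced by a cosine sum on the half-sample-shifted grid π(k+0.5)/N, and the envelope is taken as the all-pole power envelope raised to the 1/η power without the constant factors of the equations referred to above.

```python
import cmath
import math


def levinson_durbin(r, order):
    """Return (a, err) with a[0] == 1 from autocorrelation r[0..order]."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def pseudo_autocorrelation(x_mdct, eta, order):
    """Cosine transform of |X(k)|**eta, regarded as a power spectrum."""
    n = len(x_mdct)
    p = [abs(x) ** eta for x in x_mdct]
    return [sum(p[k] * math.cos(math.pi * (k + 0.5) * lag / n)
                for k in range(n)) / n
            for lag in range(order + 1)]


def eta_envelope(x_mdct, eta, order):
    """Amplitude envelope: all-pole fit to |X|**eta, raised to 1/eta."""
    n = len(x_mdct)
    a, err = levinson_durbin(pseudo_autocorrelation(x_mdct, eta, order),
                             order)
    envelope = []
    for k in range(n):
        w = math.pi * (k + 0.5) / n
        denom = abs(sum(a[m] * cmath.exp(-1j * w * m)
                        for m in range(order + 1))) ** 2
        envelope.append((err / denom) ** (1.0 / eta))
    return envelope


flat = eta_envelope([2.0] * 16, eta=0.8, order=2)
tilted = eta_envelope([8.0, 4.0, 2.0, 1.0, 0.5, 0.25, 0.125, 0.0625],
                      eta=1.0, order=1)
print(flat[0], tilted[0], tilted[-1])
```

For a flat spectrum of magnitude 2 the sketch returns an envelope of 2 at every bin, and for a spectrum that decays with frequency the fitted envelope decays as well.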
- A configuration example of the encoding apparatus according to the first embodiment is shown in FIG. 4.
- As shown in FIG. 4, the encoding apparatus of the first embodiment includes, for example, a frequency domain transform unit 21, a linear prediction analysis unit 22, an unsmoothed amplitude spectrum envelope sequence generation unit 23, a smoothed amplitude spectrum envelope sequence generation unit 24, an envelope normalization unit 25, an encoding unit 26, and a parameter determination unit 27.
- An example of each process of the encoding method according to the first embodiment realized by this encoding apparatus is shown in FIG.
- The parameter determination unit 27 can select any one of a plurality of parameters η for each predetermined time interval.
- The parameter determination unit 27 stores a plurality of parameters η as candidates for the parameter η.
- The parameter determination unit 27 reads the candidate parameters η one at a time and outputs each to the linear prediction analysis unit 22, the unsmoothed amplitude spectrum envelope sequence generation unit 23, and the encoding unit 26 (step A0).
- Based on each parameter η read in turn by the parameter determination unit 27, the frequency domain transform unit 21, the linear prediction analysis unit 22, the unsmoothed amplitude spectrum envelope sequence generation unit 23, the smoothed amplitude spectrum envelope sequence generation unit 24, the envelope normalization unit 25, and the encoding unit 26 perform, for example, the processing of steps A1 to A6 described below, and generate a code for the frequency domain sample sequence corresponding to the time-series signal in the same predetermined time interval.
- In general, for a given parameter η, two or more codes may be obtained for the frequency domain sample sequence corresponding to the time-series signal in the same predetermined time interval.
- In that case, the code for the frequency domain sample sequence corresponding to the time-series signal in that time interval is the combination of these two or more codes.
- In this example, the code is the combination of a linear prediction coefficient code, a gain code, and an integer signal code.
- The parameter determination unit 27 selects one code from the codes obtained for the respective parameters η for the frequency domain sample sequence corresponding to the time-series signal in the same predetermined time interval, and determines the parameter η corresponding to the selected code (step A7). The determined parameter η becomes the parameter η for the frequency domain sample sequence corresponding to the time-series signal in that time interval. The parameter determination unit 27 then outputs the selected code and a code representing the determined parameter η to the decoding device. Details of step A7 are described later.
- In the following description, it is assumed that the parameter determination unit 27 has read one parameter η and that the processing is performed for that parameter η.
- The frequency domain transform unit 21 receives a sound signal, which is a time-series signal in the time domain.
- Examples of sound signals are speech digital signals and acoustic digital signals.
- The frequency domain transform unit 21 converts the input time-domain sound signal into an N-point MDCT coefficient sequence X(0), X(1), ..., X(N-1) in the frequency domain (step A1). N is a positive integer.
- The obtained MDCT coefficient sequence X(0), X(1), ..., X(N-1) is output to the linear prediction analysis unit 22 and the envelope normalization unit 25.
- The subsequent processing is performed in units of frames.
- In this way, the frequency domain transform unit 21 obtains a frequency domain sample sequence corresponding to the sound signal, for example an MDCT coefficient sequence.
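- As an illustration of step A1, the following is a minimal sketch of a direct (unoptimized) MDCT that maps a frame of 2N time samples to N coefficients X(0), ..., X(N-1). The frame length of 2N samples, the absence of an analysis window, and the function name `mdct` are assumptions made for this sketch; the text above does not specify them.

```python
import math

def mdct(frame):
    """Direct O(N^2) MDCT: 2N time-domain samples -> N coefficients.

    Assumption for illustration: no analysis window is applied and the
    caller supplies frames that overlap by 50 percent.
    """
    n2 = len(frame)
    if n2 == 0 or n2 % 2:
        raise ValueError("frame must contain 2N samples with N >= 1")
    n = n2 // 2
    coeffs = []
    for k in range(n):
        acc = 0.0
        for t in range(n2):
            # Standard MDCT kernel with phase offset N/2 + 1/2.
            acc += frame[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
        coeffs.append(acc)
    return coeffs

if __name__ == "__main__":
    x = [1.0, 0.5, -0.25, 0.75, 0.0, -1.0, 0.3, 0.2]
    print(len(mdct(x)))  # number of coefficients N for a frame of 2N samples
```

- A practical encoder would normally apply a window and a fast algorithm; the direct form is used here only because it makes the definition of X(k) easy to read.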
- The linear prediction analysis unit 22 receives the MDCT coefficient sequence X(0), X(1), ..., X(N-1) obtained by the frequency domain transform unit 21.
- Using the MDCT coefficient sequence X(0), X(1), ..., X(N-1), the linear prediction analysis unit 22 performs linear prediction analysis on the sequence ~R(0), ~R(1), ..., ~R(N-1) defined by the following equation (A7) to generate linear prediction coefficients β_1, β_2, ..., β_p, and encodes the generated linear prediction coefficients β_1, β_2, ..., β_p to generate a linear prediction coefficient code and quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p, which are the quantized linear prediction coefficients corresponding to that code (step A2).
- The generated quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p are output to the unsmoothed amplitude spectrum envelope sequence generation unit 23 and the smoothed amplitude spectrum envelope sequence generation unit 24.
- The energy σ^2 of the prediction residual is calculated in the course of the linear prediction analysis.
- The calculated prediction residual energy σ^2 is output to the dispersion parameter determination unit 268 of the encoding unit 26.
- The generated linear prediction coefficient code is sent to the parameter determination unit 27.
- Specifically, the linear prediction analysis unit 22 first performs an inverse Fourier transform in which the η-th power of the absolute values of the MDCT coefficient sequence X(0), X(1), ..., X(N-1) is regarded as a power spectrum, and thereby obtains the pseudo-correlation function signal sequence ~R(0), ~R(1), ..., ~R(N-1), which is the time-domain signal sequence corresponding to the η-th power of the absolute values of the MDCT coefficient sequence. The linear prediction analysis unit 22 then performs linear prediction analysis using the obtained pseudo-correlation function signal sequence ~R(0), ~R(1), ..., ~R(N-1) to generate the linear prediction coefficients β_1, β_2, ..., β_p.
- The linear prediction analysis unit 22 then encodes the generated linear prediction coefficients β_1, β_2, ..., β_p to obtain the linear prediction coefficient code and the quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p corresponding to that code.
- The linear prediction coefficients β_1, β_2, ..., β_p are the linear prediction coefficients corresponding to the time-domain signal obtained when the η-th power of the absolute values of the MDCT coefficient sequence X(0), X(1), ..., X(N-1) is regarded as a power spectrum.
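- The two stages described above (pseudo-correlation from |X(k)|^η, then linear prediction analysis) can be sketched as follows. This is a hedged illustration, not a reproduction of equation (A7): it assumes the N bins span 0 to the Nyquist frequency and extends the spectrum symmetrically, so the inverse transform reduces to a cosine sum. The names `pseudo_correlation`, `levinson_durbin`, and `lpc_from_mdct` are hypothetical.

```python
import math

def pseudo_correlation(X, eta, lags):
    """Inverse transform of |X(k)|**eta treated as a power spectrum.

    Assumption: bin k sits at frequency pi*(k+0.5)/N and the spectrum is
    mirrored, which keeps the result real-valued.
    """
    n = len(X)
    power = [abs(x) ** eta for x in X]
    return [sum(p * math.cos(math.pi * lag * (i + 0.5) / n)
                for i, p in enumerate(power)) / n
            for lag in range(lags)]

def levinson_durbin(r, order):
    """Levinson-Durbin recursion for A(z) = 1 + sum_j a[j] z^-j.

    Returns (a, err), where a[0] == 1 and err is the residual energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input (for example an all-zero spectrum)
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err

def lpc_from_mdct(X, eta, order):
    """Coefficients a[1..order] play the role of beta_1..beta_p; err that of sigma^2."""
    r = pseudo_correlation(X, eta, order + 1)
    return levinson_durbin(r, order)
```

- With η = 2 this reduces to ordinary linear prediction on the power spectrum; other values of η change only the spectrum that is fed to the recursion.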
- The linear prediction analysis unit 22 generates the linear prediction coefficient code using, for example, a conventional encoding technique.
- Conventional encoding techniques include, for example, a technique in which a code corresponding to the linear prediction coefficients themselves is used as the linear prediction coefficient code, a technique in which the linear prediction coefficients are converted into LSP parameters and a code corresponding to the LSP parameters is used as the linear prediction coefficient code, and a technique in which the linear prediction coefficients are converted into PARCOR coefficients and a code corresponding to the PARCOR coefficients is used as the linear prediction coefficient code.
- For example, in the technique in which a code corresponding to the linear prediction coefficients themselves is used as the linear prediction coefficient code, a plurality of candidate quantized linear prediction coefficients are determined in advance, each candidate is stored in advance in association with a linear prediction coefficient code, and one of the candidates is chosen as the quantized linear prediction coefficients for the generated linear prediction coefficients, which yields both the quantized linear prediction coefficients and the linear prediction coefficient code.
- In this way, the linear prediction analysis unit 22 performs linear prediction analysis using the pseudo-correlation function signal sequence obtained by, for example, an inverse Fourier transform in which the η-th power of the absolute values of the frequency domain sample sequence, for example an MDCT coefficient sequence, is regarded as a power spectrum, and generates coefficients that can be converted into linear prediction coefficients.
- The unsmoothed amplitude spectrum envelope sequence generation unit 23 receives the quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p generated by the linear prediction analysis unit 22.
- The unsmoothed amplitude spectrum envelope sequence generation unit 23 generates the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1), which is the amplitude spectrum envelope sequence corresponding to the quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p (step A3).
- The generated unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1) is output to the encoding unit 26.
- Using the quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p, the unsmoothed amplitude spectrum envelope sequence generation unit 23 generates, as the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1), the sequence defined by equation (A2).
- In this way, the unsmoothed amplitude spectrum envelope sequence generation unit 23 estimates the spectral envelope by obtaining the unsmoothed spectral envelope sequence, which is the sequence obtained by raising to the power 1/η the amplitude spectrum envelope sequence corresponding to the coefficients, generated by the linear prediction analysis unit 22, that can be converted into linear prediction coefficients.
- Here, for an arbitrary number c, the sequence obtained by raising a sequence of values to the power c is the sequence consisting of each of those values raised to the power c.
- For example, the sequence obtained by raising the amplitude spectrum envelope sequence to the power 1/η is the sequence consisting of each coefficient of the amplitude spectrum envelope raised to the power 1/η.
- The 1/η-power processing in the unsmoothed amplitude spectrum envelope sequence generation unit 23 stems from the processing in the linear prediction analysis unit 22 in which the η-th power of the absolute values of the frequency domain sample sequence is regarded as a power spectrum. That is, the 1/η-power processing is performed to return the values that were raised to the power η by that processing in the linear prediction analysis unit 22 to their original scale.
- <Smoothed amplitude spectrum envelope sequence generation unit 24> The quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p generated by the linear prediction analysis unit 22 are input to the smoothed amplitude spectrum envelope sequence generation unit 24.
- The generated smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1) is output to the envelope normalization unit 25 and the encoding unit 26.
- Using the quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p and the correction coefficient γ, the smoothed amplitude spectrum envelope sequence generation unit 24 generates, as the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1), the sequence defined by equation (A3).
- Here, the correction coefficient γ is a predetermined constant less than 1. It is a coefficient that blunts the amplitude unevenness of the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1), in other words a coefficient that smooths the unsmoothed amplitude spectrum envelope sequence.
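- The relationship between the unsmoothed and smoothed envelopes can be sketched with a single function. Equations (A2) and (A3) are not reproduced in this text, so the form below is an assumption: an all-pole power envelope 1/(2π|A(ω)|^2) raised to the power 1/η, with each coefficient ^β_n scaled by γ^n for smoothing, and bins placed at ω_k = π(k+0.5)/N. The name `amplitude_envelope` is hypothetical.

```python
import cmath
import math

def amplitude_envelope(beta_hat, n_bins, eta, gamma=1.0):
    """All-pole amplitude envelope raised to the power 1/eta.

    gamma == 1.0 gives the unsmoothed envelope; gamma < 1.0 damps the n-th
    coefficient by gamma**n, which flattens peaks and valleys.
    Assumed form only; the patent's (A2)/(A3) may use another frequency grid
    or scale factor.
    """
    env = []
    for k in range(n_bins):
        w = math.pi * (k + 0.5) / n_bins
        a = 1.0 + sum(b * gamma ** (n + 1) * cmath.exp(-1j * w * (n + 1))
                      for n, b in enumerate(beta_hat))
        power = 1.0 / (2.0 * math.pi * abs(a) ** 2)
        env.append(power ** (1.0 / eta))
    return env

if __name__ == "__main__":
    h = amplitude_envelope([-0.9], 8, eta=1.0)              # unsmoothed
    h_gamma = amplitude_envelope([-0.9], 8, eta=1.0, gamma=0.5)  # smoothed
    # Ratio of the two, bin by bin (normalized envelope).
    h_norm = [u / s for u, s in zip(h, h_gamma)]
    print(max(h) / min(h), max(h_gamma) / min(h_gamma))
```

- The dynamic range of the smoothed sequence is smaller than that of the unsmoothed one, which is the "blunting" effect attributed to γ above.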
- The envelope normalization unit 25 receives the MDCT coefficient sequence X(0), X(1), ..., X(N-1) obtained by the frequency domain transform unit 21 and the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1) generated by the smoothed amplitude spectrum envelope sequence generation unit 24.
- The envelope normalization unit 25 generates the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1) by normalizing each coefficient of the MDCT coefficient sequence X(0), X(1), ..., X(N-1) by the corresponding value of the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1) (step A5).
- The generated normalized MDCT coefficient sequence is output to the encoding unit 26.
- The encoding unit 26 receives the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1) generated by the envelope normalization unit 25, the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1) generated by the unsmoothed amplitude spectrum envelope sequence generation unit 23, the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1) generated by the smoothed amplitude spectrum envelope sequence generation unit 24, and the average residual energy σ^2 calculated by the linear prediction analysis unit 22.
- The encoding unit 26 performs encoding by carrying out, for example, the processing of steps A61 to A65 shown in FIG. 8 (step A6).
- The encoding unit 26 obtains the global gain g corresponding to the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1) (step A61); obtains the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1), which is the sequence of integer values obtained by quantizing the results of dividing each coefficient of the normalized MDCT coefficient sequence by the global gain g (step A62); obtains the dispersion parameters φ(0), φ(1), ..., φ(N-1) corresponding to the coefficients of the quantized normalized coefficient sequence from the global gain g, the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1), the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1), and the prediction residual energy σ^2 according to equation (A1) (step A63); arithmetically encodes the quantized normalized coefficient sequence using the dispersion parameters to obtain the integer signal code (step A64); and obtains the gain code corresponding to the global gain g (step A65).
- The normalized amplitude spectrum envelope sequence ^H_N(0), ^H_N(1), ..., ^H_N(N-1) in equation (A1) above is obtained by dividing each value of the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1) by the corresponding value of the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1), that is, by the following equation (A8).
- The generated integer signal code and gain code are output to the parameter determination unit 27 as the code corresponding to the normalized MDCT coefficient sequence.
- Through steps A61 to A65, the encoding unit 26 realizes the function of determining a global gain g such that the number of bits of the integer signal code is no greater than the allocated bit number B, which is the number of bits allocated in advance, and is as large as possible within that limit, and of generating the gain code and the integer signal code corresponding to the determined global gain g.
- Of steps A61 to A65 performed by the encoding unit 26, the characteristic processing is contained in step A63. For the encoding processing itself, in which the global gain g and the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1) are each encoded to obtain the code corresponding to the normalized MDCT coefficient sequence, various known techniques exist, including the technique described in Non-Patent Document 1. Two specific examples of the encoding processing performed by the encoding unit 26 are described below.
- FIG. 6 shows a configuration example of the encoding unit 26 of the first specific example.
- As shown in FIG. 6, the encoding unit 26 of the first specific example includes, for example, a gain acquisition unit 261, a quantization unit 262, a dispersion parameter determination unit 268, an arithmetic coding unit 269, and a gain encoding unit 265.
- The gain acquisition unit 261 receives the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1) generated by the envelope normalization unit 25.
- From the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1), the gain acquisition unit 261 determines and outputs a global gain g such that the number of bits of the integer signal code is no greater than the allocated bit number B, which is the number of bits allocated in advance, and is as large as possible within that limit (step S261).
- For example, the gain acquisition unit 261 obtains and outputs, as the global gain g, the product of the square root of the total energy of the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1) and a constant that is negatively correlated with the allocated bit number B.
- Alternatively, the gain acquisition unit 261 may hold in advance a table relating the total energy of the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1), the allocated bit number B, and the global gain g, and may obtain and output the global gain g by referring to that table.
- In this way, the gain acquisition unit 261 obtains a gain by which all samples of the normalized frequency domain sample sequence, for example the normalized MDCT coefficient sequence, are divided.
- The obtained global gain g is output to the quantization unit 262 and the dispersion parameter determination unit 268.
- The quantization unit 262 receives the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1) generated by the envelope normalization unit 25 and the global gain g obtained by the gain acquisition unit 261.
- The quantization unit 262 obtains and outputs the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1), which is the sequence of the integer parts of the results of dividing each coefficient of the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1) by the global gain g (step S262).
- In this way, the quantization unit 262 obtains the quantized normalized coefficient sequence by dividing each sample of the normalized frequency domain sample sequence, for example the normalized MDCT coefficient sequence, by the gain and quantizing the result.
- The obtained quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1) is output to the arithmetic coding unit 269.
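- Envelope normalization (step A5) followed by gain division and quantization (step S262) can be sketched as below. The function name is hypothetical, and "integer part" is interpreted here as truncation toward zero; if rounding to the nearest integer is intended, `int(...)` would be replaced by `round(...)`.

```python
def normalize_and_quantize(X, smoothed_env, gain):
    """Return (X_N, X_Q) for one frame.

    X_N(k) = X(k) / H_gamma(k); X_Q(k) = integer part of X_N(k) / g.
    Assumption: the integer part is taken by truncation toward zero.
    """
    if len(X) != len(smoothed_env):
        raise ValueError("X and the envelope must have the same length")
    if gain <= 0:
        raise ValueError("gain must be positive")
    x_norm = [x / h for x, h in zip(X, smoothed_env)]
    x_q = [int(v / gain) for v in x_norm]
    return x_norm, x_q

if __name__ == "__main__":
    x_norm, x_q = normalize_and_quantize([4.0, -3.0, 1.0], [2.0, 1.0, 4.0], 0.5)
    print(x_norm, x_q)
```

- A larger gain g maps more coefficients to zero and so lowers the bit count, which is why the gain is the quantity adjusted to meet the allocated bit number B.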
- The dispersion parameter determination unit 268 receives the parameter η read by the parameter determination unit 27, the global gain g obtained by the gain acquisition unit 261, the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1) generated by the unsmoothed amplitude spectrum envelope sequence generation unit 23, the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1) generated by the smoothed amplitude spectrum envelope sequence generation unit 24, and the prediction residual energy σ^2 obtained by the linear prediction analysis unit 22.
- From the global gain g, the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1), the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1), and the prediction residual energy σ^2, the dispersion parameter determination unit 268 obtains and outputs each dispersion parameter of the dispersion parameter sequence φ(0), φ(1), ..., φ(N-1) according to equations (A1) and (A8) above (step S268).
- The obtained dispersion parameter sequence φ(0), φ(1), ..., φ(N-1) is output to the arithmetic coding unit 269.
- The arithmetic coding unit 269 receives the parameter η read by the parameter determination unit 27, the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1) obtained by the quantization unit 262, and the dispersion parameter sequence φ(0), φ(1), ..., φ(N-1) obtained by the dispersion parameter determination unit 268.
- The arithmetic coding unit 269 arithmetically encodes the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1), using each dispersion parameter of the dispersion parameter sequence φ(0), φ(1), ..., φ(N-1) as the dispersion parameter corresponding to each coefficient, and obtains and outputs the integer signal code (step S269).
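- For the arithmetic coding, the arithmetic coding unit 269 constructs an arithmetic code whose bit allocation is optimal when each coefficient of the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1) follows the generalized Gaussian distribution f_GG(X | φ(k), η), and encodes with the arithmetic code so constructed. As a result, the expected bit allocation to each coefficient is determined by the dispersion parameter sequence φ(0), φ(1), ..., φ(N-1).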
- The obtained integer signal code is output to the parameter determination unit 27.
- Arithmetic coding may be performed across a plurality of coefficients in the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1).
- In this case, as equations (A1) and (A8) show, each dispersion parameter of the dispersion parameter sequence φ(0), φ(1), ..., φ(N-1) is based on the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1), so the arithmetic coding unit 269 can be said to perform encoding in which the bit allocation effectively changes based on the estimated spectral envelope (the unsmoothed amplitude spectrum envelope).
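- To make the link between the dispersion parameter φ(k) and bit allocation concrete, the sketch below evaluates a generalized Gaussian density and the ideal code length (in bits) of an integer under it. The density is the standard generalized Gaussian parameterized so that φ is its standard deviation; whether this matches the patent's f_GG exactly is an assumption, as are the unit-width bins centred on each integer and the function names.

```python
import math

def gg_pdf(x, phi, eta):
    """Generalized Gaussian density with shape eta and standard deviation phi.

    eta == 2 gives a Gaussian, eta == 1 a Laplacian.
    """
    b = math.sqrt(math.gamma(3.0 / eta) / math.gamma(1.0 / eta))
    a = eta * b / (2.0 * math.gamma(1.0 / eta))
    return a / phi * math.exp(-abs(b * x / phi) ** eta)

def ideal_code_length(x_q, phi, eta, steps=200):
    """Approximate -log2 P(x_q), with P the density mass on [x_q-0.5, x_q+0.5].

    Midpoint-rule integration; an arithmetic coder driven by the same
    probabilities would spend roughly this many bits on x_q.
    """
    lo = x_q - 0.5
    h = 1.0 / steps
    mass = sum(gg_pdf(lo + (i + 0.5) * h, phi, eta) for i in range(steps)) * h
    return -math.log2(max(mass, 1e-300))  # guard against underflow in far tails

if __name__ == "__main__":
    for phi in (1.0, 3.0):
        print(phi, ideal_code_length(0, phi, 2.0), ideal_code_length(3, phi, 2.0))
```

- A bin with a large φ(k) makes large integers cheap and zeros comparatively expensive, while a small φ(k) does the opposite; this is the sense in which the envelope steers the bit allocation.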
- The gain encoding unit 265 receives the global gain g obtained by the gain acquisition unit 261.
- The gain encoding unit 265 encodes the global gain g to obtain and output the gain code (step S265).
- The generated integer signal code and gain code are output to the parameter determination unit 27 as the code corresponding to the normalized MDCT coefficient sequence.
- Steps S261, S262, S268, S269, and S265 of this specific example 1 correspond to the above steps A61, A62, A63, A64, and A65, respectively.
- FIG. 7 shows a configuration example of the encoding unit 26 of the specific example 2.
- As shown in FIG. 7, the encoding unit 26 of specific example 2 includes, for example, a gain acquisition unit 261, a quantization unit 262, a dispersion parameter determination unit 268, an arithmetic coding unit 269, a gain encoding unit 265, a determination unit 266, and a gain update unit 267.
- The gain acquisition unit 261 receives the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1) generated by the envelope normalization unit 25.
- From the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1), the gain acquisition unit 261 determines and outputs a global gain g such that the number of bits of the integer signal code is no greater than the allocated bit number B, which is the number of bits allocated in advance, and is as large as possible within that limit (step S261).
- For example, the gain acquisition unit 261 obtains and outputs, as the global gain g, the product of the square root of the total energy of the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1) and a constant that is negatively correlated with the allocated bit number B.
- The obtained global gain g is output to the quantization unit 262 and the dispersion parameter determination unit 268.
- The global gain g obtained by the gain acquisition unit 261 serves as the initial value of the global gain used by the quantization unit 262 and the dispersion parameter determination unit 268.
- The quantization unit 262 receives the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1) generated by the envelope normalization unit 25 and the global gain g obtained by the gain acquisition unit 261 or the gain update unit 267.
- The quantization unit 262 obtains and outputs the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1), which is the sequence of the integer parts of the results of dividing each coefficient of the normalized MDCT coefficient sequence X_N(0), X_N(1), ..., X_N(N-1) by the global gain g (step S262).
- The global gain g used the first time the quantization unit 262 is executed is the global gain g obtained by the gain acquisition unit 261, that is, the initial value of the global gain.
- The global gain g used the second and subsequent times the quantization unit 262 is executed is the global gain g obtained by the gain update unit 267, that is, the updated value of the global gain.
- The obtained quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1) is output to the arithmetic coding unit 269.
- The dispersion parameter determination unit 268 receives the parameter η read by the parameter determination unit 27, the global gain g obtained by the gain acquisition unit 261 or the gain update unit 267, the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1) generated by the unsmoothed amplitude spectrum envelope sequence generation unit 23, the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1) generated by the smoothed amplitude spectrum envelope sequence generation unit 24, and the prediction residual energy σ^2 obtained by the linear prediction analysis unit 22.
- From the global gain g, the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1), the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1), and the prediction residual energy σ^2, the dispersion parameter determination unit 268 obtains and outputs each dispersion parameter of the dispersion parameter sequence φ(0), φ(1), ..., φ(N-1) according to equations (A1) and (A8) above (step S268).
- The global gain g used the first time the dispersion parameter determination unit 268 is executed is the global gain g obtained by the gain acquisition unit 261, that is, the initial value of the global gain.
- The global gain g used the second and subsequent times the dispersion parameter determination unit 268 is executed is the global gain g obtained by the gain update unit 267, that is, the updated value of the global gain.
- The obtained dispersion parameter sequence φ(0), φ(1), ..., φ(N-1) is output to the arithmetic coding unit 269.
- The arithmetic coding unit 269 receives the parameter η read by the parameter determination unit 27, the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1) obtained by the quantization unit 262, and the dispersion parameter sequence φ(0), φ(1), ..., φ(N-1) obtained by the dispersion parameter determination unit 268.
- The arithmetic coding unit 269 arithmetically encodes the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1), using each dispersion parameter of the dispersion parameter sequence φ(0), φ(1), ..., φ(N-1) as the dispersion parameter corresponding to each coefficient, and obtains and outputs the integer signal code and the consumed bit number C, which is the number of bits of the integer signal code (step S269).
- For the arithmetic coding, the arithmetic coding unit 269 constructs an arithmetic code whose bit allocation is optimal when each coefficient of the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1) follows the generalized Gaussian distribution f_GG(X | φ(k), η), and encodes with the arithmetic code so constructed.
- As a result, the expected bit allocation to each coefficient of the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1) is determined by the dispersion parameter sequence φ(0), φ(1), ..., φ(N-1).
- The obtained integer signal code and consumed bit number C are output to the determination unit 266.
- Arithmetic coding may be performed across a plurality of coefficients in the quantized normalized coefficient sequence X_Q(0), X_Q(1), ..., X_Q(N-1).
- In this case, as equations (A1) and (A8) show, each dispersion parameter of the dispersion parameter sequence φ(0), φ(1), ..., φ(N-1) is based on the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1), so the arithmetic coding unit 269 can be said to perform encoding in which the bit allocation effectively changes based on the estimated spectral envelope (the unsmoothed amplitude spectrum envelope).
- <Determination unit 266> The integer signal code obtained by the arithmetic coding unit 269 is input to the determination unit 266.
- When the number of gain updates has reached a predetermined number, the determination unit 266 outputs the integer signal code and instructs the gain encoding unit 265 to encode the global gain g obtained by the gain update unit 267. When the number of gain updates is less than the predetermined number, it outputs the consumed bit number C measured by the arithmetic coding unit 269 to the gain update unit 267 (step S266).
- The gain update unit 267 receives the consumed bit number C measured by the arithmetic coding unit 269.
- When the consumed bit number C is greater than the allocated bit number B, the gain update unit 267 updates the global gain g to a larger value; when the consumed bit number C is smaller than the allocated bit number B, it updates the global gain g to a smaller value. In either case it outputs the updated global gain g (step S267).
- The updated global gain g obtained by the gain update unit 267 is output to the quantization unit 262 and the gain encoding unit 265.
- The gain encoding unit 265 receives the output instruction from the determination unit 266 and the global gain g obtained by the gain update unit 267.
- In accordance with the instruction, the gain encoding unit 265 encodes the global gain g to obtain and output the gain code (step S265).
- The integer signal code output from the determination unit 266 and the gain code output from the gain encoding unit 265 are output to the parameter determination unit 27 as the code corresponding to the normalized MDCT coefficient sequence.
- The last execution of step S267 corresponds to step A61 above, and steps S262, S268, S269, and S265 correspond to steps A62, A63, A64, and A65 above, respectively.
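- The iteration of specific example 2 (quantize, count bits, update the gain, repeat a fixed number of times) can be sketched as follows. Several parts are assumptions made only so the sketch runs: `estimate_bits` is a hypothetical stand-in for the consumed bit number C reported by the arithmetic coder, the multiplicative update with a step that shrinks whenever the search direction flips is one possible update rule, and the loop keeps the best result that fits within B.

```python
import math

def estimate_bits(x_q):
    """Hypothetical bit-count model standing in for the arithmetic coder."""
    return sum(1.0 + 2.0 * math.log2(1 + abs(v)) for v in x_q)

def rate_loop(x_norm, gain, allocated_bits, updates=16):
    """Search for a gain whose bit count is <= allocated_bits and as large as possible.

    Returns (gain, x_q, consumed) for the best feasible trial, or None if no
    trial fit within the budget.
    """
    best = None
    step = 2.0
    last_dir = 0
    for _ in range(updates):
        x_q = [int(v / gain) for v in x_norm]      # step S262
        consumed = estimate_bits(x_q)              # stands in for S268/S269
        if consumed <= allocated_bits and (best is None or consumed > best[2]):
            best = (gain, x_q, consumed)
        if consumed == allocated_bits:
            break
        # Step S267: too many bits -> larger gain; bits to spare -> smaller gain.
        direction = 1 if consumed > allocated_bits else -1
        if last_dir and direction != last_dir:
            step = math.sqrt(step)                 # narrow the search after a flip
        gain = gain * step if direction > 0 else gain / step
        last_dir = direction
    return best

if __name__ == "__main__":
    frame = [3.2, -1.5, 0.4, 7.9, -0.2, 2.2, -4.4, 0.9]
    print(rate_loop(frame, gain=1.0, allocated_bits=20))
```

- In the described apparatus the bit count comes from actually running the arithmetic coding unit 269 with dispersion parameters recomputed for each updated gain; the stand-in model here only preserves the property that a larger gain never increases the bit count.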
- the encoding unit 26 may perform encoding that changes the bit allocation based on the estimated spectral envelope (non-smoothed amplitude spectral envelope), for example, by performing the following processing.
- the encoding unit 26 first obtains a global gain g corresponding to the normalized MDCT coefficient sequence X N (0), X N (1),..., X N (N ⁇ 1), and normalizes the MDCT coefficient sequence X N. (0), X N (1), ..., X N (N-1) coefficients divided by the global gain g Quantized normalized coefficient series X Q ( Find 0), X Q (1), ..., X Q (N-1).
- the quantized bit corresponding to each coefficient of this quantized normalized coefficient series X Q (0), X Q (1), ..., X Q (N-1) has a range in which X Q (k) is distributed.
- the range can be determined from the envelope estimate.
- the encoding unit 26 for example, the value of the normalized amplitude spectrum envelope sequence based on linear prediction as in the following equation (A9) ⁇ H N ( k) can be used to determine the range of X Q (k).
- the encoding unit 26 determines the number of allocated bits by collecting a plurality of samples instead of assigning each sample, and the quantization unit 26 does not perform scalar quantization for each sample but also a vector for each vector including a plurality of samples. It is also possible to quantize.
- X Q (k) can be changed from -2 b (k) -1 to 2 b (k ) Can take 2 b (k) types of integers up to -1 .
- the encoding unit 26 encodes each sample with b (k) bits to obtain an integer signal code.
- the generated integer signal code is output to the decoding device.
- the encoding unit 26 encodes the global gain g to obtain and output a gain code.
- the encoding unit 26 may perform encoding other than arithmetic encoding.
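The gain-based quantization described above can be sketched as follows. This is a minimal illustration, not the method of the specification: the function names and the round-half-up rule are assumptions, since the text only states that the coefficients divided by the global gain g are quantized.

```python
import math


def quantize_with_gain(x_norm, gain):
    """Return integers X_Q(k) obtained from X_N(k) / g.

    The rounding rule (round half up) is an assumption for illustration.
    """
    if gain <= 0:
        raise ValueError("gain must be positive")
    return [int(math.floor(x / gain + 0.5)) for x in x_norm]


def dequantize_with_gain(x_q, gain):
    """Inverse step used on the decoder side: multiply each integer by g."""
    return [gain * q for q in x_q]
```

A larger g yields smaller integers and therefore fewer bits, at the cost of a coarser reconstruction.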
- <Parameter determination unit 27> Through the processing from step A1 to step A6, the codes generated for each parameter η for the frequency domain sample sequence corresponding to the time-series signal in the same predetermined time interval (in this example, the linear prediction coefficient code, the gain code, and the integer signal code) are input to the parameter determination unit 27.
- the parameter determination unit 27 selects one code from the codes obtained for each parameter η for the frequency domain sample sequence corresponding to the time-series signal in the same predetermined time interval, and determines the parameter η corresponding to the selected code (step A7).
- this determined parameter η becomes the parameter η for the frequency domain sample sequence corresponding to the time-series signal in the same predetermined time interval.
- the parameter determination unit 27 outputs the selected code and the parameter code representing the determined parameter η to the decoding device.
- the selection of the code is performed based on at least one of the code amount of the code and the coding distortion corresponding to the code. For example, the code with the smallest code amount or the code with the smallest coding distortion is selected.
- the coding distortion is an error between the frequency domain sample sequence obtained from the input signal and the frequency domain sample sequence obtained by locally decoding the generated code.
- the encoding apparatus may include an encoding distortion calculation unit for calculating encoding distortion.
- the encoding distortion calculation unit includes a decoding unit that performs processing similar to that of the decoding device described below, and locally decodes the generated code with this decoding unit. The encoding distortion calculation unit then calculates the error between the frequency domain sample sequence obtained from the input signal and the frequency domain sample sequence obtained by local decoding, and uses this error as the coding distortion.
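The selection in step A7 can be sketched as follows. This is a minimal illustration under assumptions: the `Candidate` container, its field names, and the `criterion` argument are hypothetical, and the code amount is approximated by the byte length of the code.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    eta: float          # shape parameter used to produce this code
    code: bytes         # code generated with this parameter (hypothetical container)
    distortion: float   # coding distortion measured by local decoding


def select_candidate(candidates, criterion="bits"):
    """Pick the candidate with the smallest code amount or smallest distortion."""
    if not candidates:
        raise ValueError("no candidates")
    if criterion == "bits":
        return min(candidates, key=lambda c: len(c.code))
    if criterion == "distortion":
        return min(candidates, key=lambda c: c.distortion)
    raise ValueError("criterion must be 'bits' or 'distortion'")
```

The parameter of the returned candidate is the one that would be represented by the parameter code.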
- the decoding device of the first embodiment includes, for example, a linear prediction coefficient decoding unit 31, an unsmoothed amplitude spectrum envelope sequence generation unit 32, a smoothed amplitude spectrum envelope sequence generation unit 33, a decoding unit 34, an envelope denormalization unit 35, a time domain conversion unit 36, and a parameter decoding unit 37.
- FIG. 9 An example of each process of the decoding method according to the first embodiment realized by this decoding apparatus is shown in FIG.
- the decoding apparatus receives at least the parameter code, the code corresponding to the normalized MDCT coefficient sequence, and the linear prediction coefficient code output from the encoding apparatus.
- <Parameter decoding unit 37> The parameter code output from the encoding device is input to the parameter decoding unit 37.
- the parameter decoding unit 37 obtains a decoding parameter η by decoding the parameter code.
- the obtained decoding parameter η is output to the unsmoothed amplitude spectrum envelope sequence generation unit 32, the smoothed amplitude spectrum envelope sequence generation unit 33, and the decoding unit 34.
- the parameter decoding unit 37 stores a plurality of decoding parameters η as candidates.
- the parameter decoding unit 37 obtains, as the decoding parameter η, the candidate decoding parameter η corresponding to the parameter code.
- the plurality of decoding parameters η stored in the parameter decoding unit 37 are the same as the plurality of parameters η stored in the parameter determination unit 27 of the encoding device.
- the linear prediction coefficient decoding unit 31 receives the linear prediction coefficient code output from the encoding device.
- for each frame, the linear prediction coefficient decoding unit 31 decodes the input linear prediction coefficient code by, for example, a conventional decoding technique to obtain decoded linear prediction coefficients ^β_1, ^β_2, ..., ^β_p (step B1).
- the obtained decoded linear prediction coefficients ^β_1, ^β_2, ..., ^β_p are output to the unsmoothed amplitude spectrum envelope sequence generation unit 32 and the smoothed amplitude spectrum envelope sequence generation unit 33.
- the conventional decoding technique is, for example, a technique that, when the linear prediction coefficient code is a code corresponding to quantized linear prediction coefficients, decodes the linear prediction coefficient code to obtain decoded linear prediction coefficients that are the same as the quantized linear prediction coefficients.
- linear prediction coefficients and LSP parameters can be converted to each other, so conversion between decoded linear prediction coefficients and decoded LSP parameters may be performed as needed according to the input linear prediction coefficient code and the information required for subsequent processing. Accordingly, "decoding by a conventional decoding technique" covers both the decoding of the linear prediction coefficient code and any conversion performed as necessary.
- in this way, the linear prediction coefficient decoding unit 31 decodes the input linear prediction coefficient code to generate coefficients that can be converted into linear prediction coefficients corresponding to a pseudo correlation function signal sequence, which is obtained by performing an inverse Fourier transform that regards the η-th power of the absolute value of the frequency domain sample sequence corresponding to the time-series signal as a power spectrum.
- the unsmoothed amplitude spectrum envelope sequence generation unit 32 receives the decoding parameter η obtained by the parameter decoding unit 37 and the decoded linear prediction coefficients ^β_1, ^β_2, ..., ^β_p obtained by the linear prediction coefficient decoding unit 31.
- the unsmoothed amplitude spectrum envelope sequence generation unit 32 generates, by the above equation (A2), the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1), which is the sequence of the amplitude spectrum envelope corresponding to the decoded linear prediction coefficients ^β_1, ^β_2, ..., ^β_p (step B2).
- the generated unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1) is output to the decoding unit 34.
- in this way, the unsmoothed amplitude spectrum envelope sequence generation unit 32 obtains the unsmoothed spectrum envelope sequence, which is the sequence obtained by raising to the power 1/η the amplitude spectrum envelope sequence corresponding to the coefficients, convertible into linear prediction coefficients, generated by the linear prediction coefficient decoding unit 31.
- the smoothed amplitude spectrum envelope sequence generation unit 33 receives the decoding parameter η obtained by the parameter decoding unit 37 and the decoded linear prediction coefficients ^β_1, ^β_2, ..., ^β_p obtained by the linear prediction coefficient decoding unit 31.
- the smoothed amplitude spectrum envelope sequence generation unit 33 generates, by the above equation (A3), the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1), which is the sequence obtained by flattening the amplitude peaks and valleys of the amplitude spectrum envelope sequence corresponding to the decoded linear prediction coefficients ^β_1, ^β_2, ..., ^β_p (step B3).
- the generated smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1) is output to the decoding unit 34 and the envelope denormalization unit 35.
- the decoding unit 34 receives the decoding parameter η obtained by the parameter decoding unit 37, the code corresponding to the normalized MDCT coefficient sequence output by the encoding device, the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1) generated by the unsmoothed amplitude spectrum envelope sequence generation unit 32, and the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1) generated by the smoothed amplitude spectrum envelope sequence generation unit 33.
- the decoding unit 34 includes a dispersion parameter determination unit 342.
- the decoding unit 34 performs decoding by performing, for example, the processing from step B41 to step B44 shown in FIG. 11 (step B4). That is, the decoding unit 34 decodes the gain code included in the code corresponding to the input normalized MDCT coefficient sequence for each frame to obtain the global gain g (step B41).
- the dispersion parameter determination unit 342 of the decoding unit 34 obtains each dispersion parameter of the dispersion parameter sequence φ(0), φ(1), ..., φ(N-1) from the global gain g, the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1), and the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1) (step B42).
- the decoding unit 34 arithmetically decodes the integer signal code included in the code corresponding to the normalized MDCT coefficient sequence, using the arithmetic decoding configuration corresponding to each dispersion parameter of the dispersion parameter sequence φ(0), φ(1), ..., φ(N-1), to obtain the decoded normalized coefficient sequence ^X_Q(0), ^X_Q(1), ..., ^X_Q(N-1) (step B43), and multiplies each coefficient of the decoded normalized coefficient sequence ^X_Q(0), ^X_Q(1), ..., ^X_Q(N-1) by the global gain g to generate the decoded normalized MDCT coefficient sequence ^X_N(0), ^X_N(1), ..., ^X_N(N-1) (step B44).
- the decoding unit 34 may perform decoding of the input integer signal code according to bit allocation that substantially changes based on the non-smoothed spectrum envelope sequence.
- when encoding has been performed by the process described in [Modification of Encoding Unit 26], the decoding unit 34 performs, for example, the following process.
- the decoding unit 34 decodes the gain code included in the code corresponding to the input normalized MDCT coefficient sequence for each frame to obtain the global gain g.
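- the dispersion parameter determination unit 342 of the decoding unit 34 obtains each dispersion parameter of the dispersion parameter sequence φ(0), φ(1), ..., φ(N-1) from inputs including the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1).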
- the decoding unit 34 obtains b(k) by equation (A10) based on each dispersion parameter φ(k) of the dispersion parameter sequence φ(0), φ(1), ..., φ(N-1).
- the decoding unit 34 then sequentially decodes the value of X_Q(k) with the number of bits b(k) to obtain the decoded normalized coefficient sequence ^X_Q(0), ^X_Q(1), ..., ^X_Q(N-1), and multiplies each coefficient of the decoded normalized coefficient sequence by the global gain g to generate the decoded normalized MDCT coefficient sequence ^X_N(0), ^X_N(1), ..., ^X_N(N-1).
- the decoding unit 34 may perform decoding of the input integer signal code in accordance with bit allocation that changes based on the non-smoothed spectrum envelope sequence.
- the generated decoded normalized MDCT coefficient sequence ^X_N(0), ^X_N(1), ..., ^X_N(N-1) is output to the envelope denormalization unit 35.
- the envelope denormalization unit 35 receives the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1) generated by the smoothed amplitude spectrum envelope sequence generation unit 33 and the decoded normalized MDCT coefficient sequence ^X_N(0), ^X_N(1), ..., ^X_N(N-1) generated by the decoding unit 34.
- the envelope denormalization unit 35 denormalizes the decoded normalized MDCT coefficient sequence ^X_N(0), ^X_N(1), ..., ^X_N(N-1) using the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1) to generate the decoded MDCT coefficient sequence ^X(0), ^X(1), ..., ^X(N-1) (step B5).
- the generated decoded MDCT coefficient sequence ^X(0), ^X(1), ..., ^X(N-1) is output to the time domain conversion unit 36.
- for example, for k = 0, 1, ..., N-1, the envelope denormalization unit 35 multiplies each coefficient ^X_N(k) of the decoded normalized MDCT coefficient sequence ^X_N(0), ^X_N(1), ..., ^X_N(N-1) by the corresponding envelope value ^H_γ(k) of the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), ..., ^H_γ(N-1) to generate the decoded MDCT coefficient ^X(k); that is, ^X(k) = ^X_N(k) × ^H_γ(k).
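Steps B44 and B5 can be sketched together as follows. This is a minimal illustration, assuming that denormalization is an element-wise multiplication by the smoothed envelope value (the inverse of an encoder-side division); the function name and argument names are hypothetical.

```python
def denormalize(decoded_q, gain, smoothed_env):
    """Rebuild decoded MDCT coefficients from decoded integers.

    decoded_q    : decoded normalized coefficient sequence (integers)
    gain         : decoded global gain g
    smoothed_env : smoothed amplitude spectrum envelope values, one per bin
    """
    if len(decoded_q) != len(smoothed_env):
        raise ValueError("sequence lengths must match")
    # Step B44: scale the integers by the global gain.
    decoded_norm = [gain * q for q in decoded_q]
    # Step B5 (assumed form): multiply each bin by its envelope value.
    return [x * h for x, h in zip(decoded_norm, smoothed_env)]
```

The result is the sequence that would then be passed to the time domain conversion.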
- the time domain transform unit 36 receives the decoded MDCT coefficient sequence ^X(0), ^X(1), ..., ^X(N-1) generated by the envelope denormalization unit 35.
- the time domain transform unit 36 transforms the decoded MDCT coefficient sequence ^X(0), ^X(1), ..., ^X(N-1) obtained by the envelope denormalization unit 35 into the time domain for each frame to obtain a sound signal (decoded sound signal) in units of frames (step B6).
- the decoding device obtains a time-series signal by decoding in the frequency domain.
- the encoding apparatus and method according to the first embodiment generate a code by performing encoding for each of a plurality of parameters η, select the optimal code from the codes generated for each parameter η, and output the selected code and a parameter code corresponding to the selected code.
- in contrast, here the parameter determination unit 27' first determines the parameter η, and encoding is then performed based on the determined parameter η to generate and output a code.
- the parameter η is made variable by the parameter determination unit 27' for each predetermined time interval.
- the parameter η being variable for each predetermined time interval means that the parameter η can change when the predetermined time interval changes, while the value of the parameter η does not change within the same time interval.
- the encoding device includes, for example, a frequency domain transform unit 21, a linear prediction analysis unit 22, an unsmoothed amplitude spectrum envelope sequence generation unit 23, a smoothed amplitude spectrum envelope sequence generation unit 24, an envelope normalization unit 25, an encoding unit 26, and a parameter determination unit 27'.
- An example of each process of the encoding method realized by this encoding apparatus is shown in FIG.
- a time-domain sound signal, which is a time-series signal, is input to the parameter determination unit 27'.
- examples of sound signals are voice digital signals or acoustic digital signals.
- the parameter determination unit 27' determines the parameter η by a process described later based on the input time-series signal (step A7'). The η determined by the parameter determination unit 27' is output to the linear prediction analysis unit 22, the unsmoothed amplitude spectrum envelope sequence generation unit 23, the smoothed amplitude spectrum envelope sequence generation unit 24, and the encoding unit 26.
- the parameter determination unit 27' generates a parameter code by encoding the determined η.
- the generated parameter code is transmitted to the decoding device.
- based on the parameter η determined by the parameter determination unit 27', the frequency domain transform unit 21, the linear prediction analysis unit 22, the unsmoothed amplitude spectrum envelope sequence generation unit 23, the smoothed amplitude spectrum envelope sequence generation unit 24, the envelope normalization unit 25, and the encoding unit 26 generate a code by the same processing as in the first embodiment (step A1 to step A6).
- the code is a combination of a linear prediction coefficient code, a gain code, and an integer signal code.
- the generated code is transmitted to the decoding device.
- FIG. 14 shows a configuration example of the parameter determination unit 27 '.
- the parameter determination unit 27 ′ includes, for example, a frequency domain conversion unit 41, a spectrum envelope estimation unit 42, a whitened spectrum sequence generation unit 43, and a parameter acquisition unit 44.
- the spectrum envelope estimation unit 42 includes, for example, a linear prediction analysis unit 421 and a non-smoothed amplitude spectrum envelope sequence generation unit 422.
- FIG. 2 shows an example of each process of the parameter determination method realized by the parameter determination unit 27 '.
- the time domain sound signal which is a time series signal, is input to the frequency domain transform unit 41.
- Examples of sound signals are voice digital signals or acoustic digital signals.
- the frequency domain conversion unit 41 converts the input time-domain sound signal into an N-point MDCT coefficient sequence X(0), X(1), ..., X(N-1) in the frequency domain. N is a positive integer.
- the obtained MDCT coefficient sequences X (0), X (1),..., X (N-1) are output to the spectrum envelope estimation unit 42 and the whitened spectrum sequence generation unit 43.
- the subsequent processing is performed in units of frames.
- the frequency domain conversion unit 41 obtains a frequency domain sample sequence corresponding to the sound signal, for example, an MDCT coefficient sequence (step C41).
- the spectrum envelope estimation unit 42 receives the MDCT coefficient sequence X(0), X(1), ..., X(N-1) obtained by the frequency domain conversion unit 41.
- based on the parameter η_0 determined by a predetermined method, the spectrum envelope estimation unit 42 estimates the spectrum envelope, regarding the η_0-th power of the absolute value of the frequency domain sample sequence corresponding to the time-series signal as a power spectrum (step C42).
- the estimated spectrum envelope is output to the whitened spectrum sequence generation unit 43.
- the spectrum envelope estimation unit 42 estimates the spectrum envelope by generating an unsmoothed amplitude spectrum envelope sequence, for example, through the processing of the linear prediction analysis unit 421 and the unsmoothed amplitude spectrum envelope sequence generation unit 422 described below.
- the parameter η_0 is determined by a predetermined method.
- for example, η_0 is a predetermined number greater than zero.
- for example, η_0 = 1. Alternatively, the η obtained for a frame before the frame for which the parameter η is currently to be obtained may be used as η_0.
- the frame before the frame for which the current parameter η is to be obtained (hereinafter referred to as the current frame) is, for example, a frame that precedes the current frame and is in its vicinity.
- the frame in the vicinity of the current frame is, for example, the frame immediately before the current frame.
- <Linear prediction analysis unit 421> The MDCT coefficient sequence X(0), X(1), ..., X(N-1) obtained by the frequency domain transform unit 41 is input to the linear prediction analysis unit 421.
- the linear prediction analysis unit 421 performs linear prediction analysis on ~R(0), ~R(1), ..., ~R(N-1), defined by the following equation (C1) using the MDCT coefficient sequence X(0), X(1), ..., X(N-1), to generate linear prediction coefficients β_1, β_2, ..., β_p, and encodes the generated linear prediction coefficients β_1, β_2, ..., β_p to generate a linear prediction coefficient code and quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p, which are the quantized linear prediction coefficients corresponding to the linear prediction coefficient code.
- the generated quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p are output to the unsmoothed amplitude spectrum envelope sequence generation unit 422.
- specifically, the linear prediction analysis unit 421 first performs an operation corresponding to an inverse Fourier transform that regards the η_0-th power of the absolute value of the MDCT coefficient sequence X(0), X(1), ..., X(N-1) as a power spectrum, that is, the operation of equation (C1), to obtain the pseudo correlation function signal sequence ~R(0), ~R(1), ..., ~R(N-1), which is the time-domain signal sequence corresponding to the η_0-th power of the absolute value of the MDCT coefficient sequence. Then, the linear prediction analysis unit 421 performs linear prediction analysis using the obtained pseudo correlation function signal sequence ~R(0), ~R(1), ..., ~R(N-1) to generate linear prediction coefficients β_1, β_2, ..., β_p.
- the linear prediction analysis unit 421 encodes the generated linear prediction coefficients β_1, β_2, ..., β_p to obtain a linear prediction coefficient code and the quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p corresponding to the linear prediction coefficient code.
- the linear prediction coefficients β_1, β_2, ..., β_p are the linear prediction coefficients corresponding to the time-domain signal obtained when the η_0-th power of the absolute value of the MDCT coefficient sequence X(0), X(1), ..., X(N-1) is regarded as a power spectrum.
- the generation of the linear prediction coefficient code by the linear prediction analysis unit 421 is performed by, for example, a conventional encoding technique.
- the conventional encoding technique is, for example, an encoding technique that uses a code corresponding to the linear prediction coefficients themselves as the linear prediction coefficient code, an encoding technique that converts the linear prediction coefficients into LSP parameters and uses a code corresponding to the LSP parameters as the linear prediction coefficient code, or an encoding technique that converts the linear prediction coefficients into PARCOR coefficients and uses a code corresponding to the PARCOR coefficients as the linear prediction coefficient code.
- in this way, the linear prediction analysis unit 421 performs linear prediction analysis using the pseudo correlation function signal sequence, obtained by performing an inverse Fourier transform that regards the η_0-th power of the absolute value of the frequency domain sample sequence (for example, an MDCT coefficient sequence) as a power spectrum, and generates coefficients that can be converted into linear prediction coefficients (step C421).
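The analysis in step C421 can be sketched as follows. This is a sketch under stated assumptions, not the specification's equation (C1), which is not reproduced in this text: the pseudo correlation is taken here as the real part of a plain inverse DFT of the exponentiated magnitudes, the linear prediction step uses the textbook Levinson-Durbin recursion with the convention A(z) = 1 + Σ β_n z^(-n), and all function names are hypothetical.

```python
import numpy as np


def pseudo_correlation(mdct, eta0):
    """Treat |X(k)|**eta0 as a power spectrum and return its inverse DFT (real part)."""
    power = np.abs(np.asarray(mdct, dtype=float)) ** eta0
    return np.fft.ifft(power).real


def levinson_durbin(r, order):
    """Solve the normal equations for A(z) = 1 + sum_n beta_n z^-n.

    Returns (beta, prediction_error). Assumes r[0] > 0 and a valid correlation.
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = float(r[0])
    for i in range(1, order + 1):
        # acc = r[i] + sum_{j=1}^{i-1} a[j] * r[i-j]
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a[1:], err
```

With eta0 = 2 this reduces to conventional linear prediction on the power spectrum; other values change how strongly large coefficients dominate the fit.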
- <Unsmoothed amplitude spectrum envelope sequence generation unit 422> The quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p generated by the linear prediction analysis unit 421 are input to the unsmoothed amplitude spectrum envelope sequence generation unit 422.
- the unsmoothed amplitude spectrum envelope sequence generation unit 422 generates the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1), which is the sequence of the amplitude spectrum envelope corresponding to the quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p.
- the generated unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1) is output to the whitened spectrum sequence generation unit 43.
- using the quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p, the unsmoothed amplitude spectrum envelope sequence generation unit 422 generates, as the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1), the sequence defined by equation (C2).
- in this way, the unsmoothed amplitude spectrum envelope sequence generation unit 422 estimates the spectrum envelope by obtaining, based on the coefficients convertible into linear prediction coefficients generated by the linear prediction analysis unit 421, the unsmoothed spectrum envelope sequence, which is the sequence obtained by raising the amplitude spectrum envelope sequence corresponding to the pseudo correlation function signal sequence to the power 1/η_0 (step C422).
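Step C422 can be sketched as follows. Equation (C2) is not reproduced in this text, so the form used here is an assumption: a conventional all-pole power envelope 1 / (2π |1 + Σ β_n e^(-j2πkn/N)|²) raised to the power 1/η_0. The frequency grid, the 1/(2π) factor, and the function name are all illustrative choices.

```python
import numpy as np


def unsmoothed_envelope(beta_hat, n_bins, eta0):
    """Return an envelope value for each of n_bins frequency bins.

    Assumed form: (1 / (2*pi*|A(k)|**2)) ** (1/eta0), with
    A(k) = 1 + sum_n beta_hat[n-1] * exp(-2j*pi*k*n/n_bins).
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    k = np.arange(n_bins)[:, None]                  # bin index, column vector
    n = np.arange(1, len(beta_hat) + 1)[None, :]    # coefficient index, row vector
    a = 1.0 + np.sum(beta_hat[None, :] * np.exp(-2j * np.pi * k * n / n_bins), axis=1)
    power_env = 1.0 / (2.0 * np.pi * np.abs(a) ** 2)
    return power_env ** (1.0 / eta0)
```

The final exponent 1/η_0 undoes the η_0-th power applied before the linear prediction analysis, so the result is on the amplitude scale of the coefficients.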
- the whitened spectrum sequence generation unit 43 receives the MDCT coefficient sequence X(0), X(1), ..., X(N-1) obtained by the frequency domain conversion unit 41 and the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1) generated by the unsmoothed amplitude spectrum envelope sequence generation unit 422.
- the whitened spectrum sequence generation unit 43 generates the whitened spectrum sequence X_W(0), X_W(1), ..., X_W(N-1) by dividing each coefficient of the MDCT coefficient sequence X(0), X(1), ..., X(N-1) by the corresponding value of the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1).
- the generated whitened spectrum sequence X_W(0), X_W(1), ..., X_W(N-1) is output to the parameter acquisition unit 44.
- for example, for k = 0, 1, ..., N-1, each coefficient X(k) of the MDCT coefficient sequence X(0), X(1), ..., X(N-1) is divided by the corresponding value ^H(k) of the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1) to generate each coefficient X_W(k) of the whitened spectrum sequence; that is, X_W(k) = X(k) / ^H(k).
- in this way, the whitened spectrum sequence generation unit 43 obtains the whitened spectrum sequence, which is the sequence obtained by dividing the frequency domain sample sequence (for example, an MDCT coefficient sequence) by the spectrum envelope (for example, an unsmoothed amplitude spectrum envelope sequence) (step C43).
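Step C43 amounts to an element-wise division, sketched below. The function name and the error handling are illustrative assumptions.

```python
def whiten(mdct, envelope):
    """Divide each frequency-domain sample by the matching envelope value."""
    if len(mdct) != len(envelope):
        raise ValueError("sequence lengths must match")
    if any(h <= 0 for h in envelope):
        raise ValueError("envelope values must be positive")
    return [x / h for x, h in zip(mdct, envelope)]
```

If the envelope tracks the spectrum well, the whitened values have roughly uniform scale across bins, which is what makes a single histogram of them meaningful.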
- the parameter acquisition unit 44 receives the whitened spectrum sequence X_W(0), X_W(1), ..., X_W(N-1) generated by the whitened spectrum sequence generation unit 43.
- the parameter acquisition unit 44 obtains the parameter η such that the generalized Gaussian distribution having the parameter η as a shape parameter approximates the histogram of the whitened spectrum sequence X_W(0), X_W(1), ..., X_W(N-1) (step C44).
- in other words, the parameter acquisition unit 44 determines the parameter η such that the generalized Gaussian distribution having the parameter η as a shape parameter is close to the distribution of the histogram of the whitened spectrum sequence X_W(0), X_W(1), ..., X_W(N-1).
- the generalized Gaussian distribution with the parameter η as a shape parameter is defined as follows, for example.
- Γ is the gamma function.
- φ is a parameter corresponding to the variance.
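The defining formula is not reproduced in this text. For reference, one standard parameterization of the generalized Gaussian distribution, with shape parameter η and with φ equal to the standard deviation, is given below; whether the specification uses exactly this normalization is an assumption.

```latex
f_{GG}(X \mid \phi, \eta) = \frac{A(\eta)}{\phi}
  \exp\!\left( - \left| B(\eta) \frac{X}{\phi} \right|^{\eta} \right),
\qquad
A(\eta) = \frac{\eta \, B(\eta)}{2 \, \Gamma(1/\eta)},
\qquad
B(\eta) = \sqrt{\frac{\Gamma(3/\eta)}{\Gamma(1/\eta)}}
```

With η = 1 this is the Laplace distribution and with η = 2 it is the Gaussian distribution.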
- η obtained by the parameter acquisition unit 44 is defined by the following equation (C3), for example.
- F^(-1) is the inverse function of the function F. This equation is derived by the so-called method of moments.
- when the inverse function F^(-1) is formulated, the parameter acquisition unit 44 can obtain the parameter η by calculating the output value when the value of m_1 / ((m_2)^(1/2)) is input into the formulated inverse function F^(-1).
- in order to calculate the value of η defined by equation (C3), the parameter acquisition unit 44 may obtain the parameter η by, for example, the first method or the second method described below.
- a first method for obtaining the parameter η will be described.
- in the first method, the parameter acquisition unit 44 calculates m_1 / ((m_2)^(1/2)) based on the whitened spectrum sequence and, referring to a plurality of different pairs of η and the corresponding F(η) prepared in advance, obtains the η corresponding to the F(η) closest to the calculated m_1 / ((m_2)^(1/2)).
- the plurality of different pairs of η and the corresponding F(η) prepared in advance are stored in the storage unit 441 of the parameter acquisition unit 44.
- the parameter acquisition unit 44 refers to the storage unit 441, finds the F(η) closest to the calculated m_1 / ((m_2)^(1/2)), and reads the η corresponding to the found F(η) from the storage unit 441 and outputs it.
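The first method can be sketched as a table lookup. Equation (C3) is not reproduced in this text, so the sketch assumes the usual method-of-moments ratio for a generalized Gaussian, F(η) = Γ(2/η) / sqrt(Γ(1/η) Γ(3/η)), with m_1 the mean absolute value and m_2 the mean square of the whitened sequence. The table range, the step size, and all function names are hypothetical.

```python
import math


def moment_ratio(eta):
    """F(eta) = Gamma(2/eta) / sqrt(Gamma(1/eta) * Gamma(3/eta)) (assumed form)."""
    return math.gamma(2.0 / eta) / math.sqrt(math.gamma(1.0 / eta) * math.gamma(3.0 / eta))


def build_table(eta_min=0.2, eta_max=4.0, step=0.01):
    """Precompute (eta, F(eta)) pairs, playing the role of the storage unit."""
    count = int(round((eta_max - eta_min) / step)) + 1
    etas = [eta_min + i * step for i in range(count)]
    return [(eta, moment_ratio(eta)) for eta in etas]


def lookup_eta(target, table):
    """Return the eta whose stored F(eta) is closest to the target ratio."""
    eta, _ = min(table, key=lambda pair: abs(pair[1] - target))
    return eta


def estimate_eta(whitened, table):
    """Compute m1 / sqrt(m2) from the whitened sequence and look up eta."""
    n = len(whitened)
    m1 = sum(abs(x) for x in whitened) / n
    m2 = sum(x * x for x in whitened) / n
    return lookup_eta(m1 / math.sqrt(m2), table)
```

A finer step improves resolution at the cost of a larger table; because the ratio is invariant to scaling of the input, no separate variance estimate is needed.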
- in the second method, an approximate curve function of the inverse function F^(-1) is set as, for example, ~F^(-1) represented by the following formula (C3'), and the parameter acquisition unit 44 calculates m_1 / ((m_2)^(1/2)) based on the whitened spectrum sequence and obtains η by calculating the output value when the calculated m_1 / ((m_2)^(1/2)) is input to the approximate curve function ~F^(-1).
- η obtained by the parameter acquisition unit 44 may be defined not by equation (C3) but by an equation that generalizes equation (C3) using predetermined positive integers q1 and q2 (where q1 < q2), as in equation (C3'').
- in this case, η can be obtained by the same method as when η is defined by equation (C3). That is, the parameter acquisition unit 44 calculates the value m_q1 / ((m_q2)^(q1/q2)) from the q1-th order moment m_q1 and the q2-th order moment m_q2 of the whitened spectrum sequence. Then, for example, as in the first and second methods described above, it refers to a plurality of different pairs of η and the corresponding F'(η) prepared in advance and obtains the η corresponding to the F'(η) closest to the calculated m_q1 / ((m_q2)^(q1/q2)).
- in this way, η is a value based on two moments m_q1 and m_q2 of different orders.
- for example, η may be obtained based on the value of the ratio between the value of the lower-order moment of the two, or a value based on it (hereinafter referred to as the former), and the value of the higher-order moment, or a value based on it (hereinafter referred to as the latter), on a value based on the value of this ratio, or on the value obtained by dividing the former by the latter.
- a value based on a moment is, for example, m^Q, where m is the moment and Q is a predetermined real number.
- η may also be obtained by inputting these values into the approximate curve function ~F^(-1).
- as described above, the approximate curve function ~F'^(-1) may be any monotonically increasing function whose output is a positive value in the domain to be used.
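Equation (C3'') is not reproduced in this text. As a worked reference, if the whitened samples follow a generalized Gaussian distribution with shape parameter η, the q-th absolute moment is proportional to Γ((q+1)/η) / Γ(1/η), and the scale cancels in the ratio m_q1 / ((m_q2)^(q1/q2)). The function F' below follows from that fact; treating it as the function intended by the specification is an assumption.

```latex
F'(\eta) = \frac{m_{q_1}}{\left(m_{q_2}\right)^{q_1/q_2}}
         = \frac{\Gamma\!\left((q_1+1)/\eta\right)}
                {\Gamma(1/\eta)^{\,1-q_1/q_2}\;\Gamma\!\left((q_2+1)/\eta\right)^{q_1/q_2}},
\qquad
m_q = \frac{1}{N}\sum_{k=0}^{N-1} \left| X_W(k) \right|^{q}
```

Setting q_1 = 1 and q_2 = 2 gives Γ(2/η) / sqrt(Γ(1/η) Γ(3/η)), the ratio corresponding to m_1 / ((m_2)^(1/2)).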
- the parameter determination unit 27' may obtain the parameter η by loop processing. That is, the parameter determination unit 27' may perform the processing of the spectrum envelope estimation unit 42, the whitened spectrum sequence generation unit 43, and the parameter acquisition unit 44 once more, using the parameter η obtained by the parameter acquisition unit 44 as the parameter η_0 determined by the predetermined method.
- in this case, the parameter η obtained by the parameter acquisition unit 44 is output to the spectrum envelope estimation unit 42.
- the spectrum envelope estimation unit 42 estimates the spectrum envelope by performing the same process as described above, using the η obtained by the parameter acquisition unit 44 as the parameter η_0.
- based on the newly estimated spectrum envelope, the whitened spectrum sequence generation unit 43 generates a whitened spectrum sequence by performing the same process as described above.
- the parameter acquisition unit 44 performs the same process as described above based on the newly generated whitened spectrum sequence to obtain the parameter η.
- the processing of the spectrum envelope estimation unit 42, the whitened spectrum sequence generation unit 43, and the parameter acquisition unit 44 may be further performed a predetermined number of times τ.
- the processing of the spectrum envelope estimation unit 42, the whitened spectrum sequence generation unit 43, and the parameter acquisition unit 44 may also be repeated until the absolute value of the difference between the parameter η obtained this time and the parameter η obtained last time becomes equal to or less than a predetermined threshold.
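The loop processing can be sketched as a fixed-point iteration. In this sketch, `estimate_once` is a hypothetical callable standing in for one pass of envelope estimation, whitening, and parameter acquisition for a given starting exponent; the default iteration limit and threshold are arbitrary illustrative values.

```python
def refine_eta(estimate_once, eta0=1.0, max_iters=10, threshold=1e-3):
    """Re-estimate eta, feeding each result back in as the next starting value.

    estimate_once : callable mapping the current exponent to a new eta estimate
    Stops when successive estimates differ by at most `threshold`,
    or after `max_iters` passes.
    """
    eta_prev = eta0
    for _ in range(max_iters):
        eta = estimate_once(eta_prev)
        if abs(eta - eta_prev) <= threshold:
            return eta
        eta_prev = eta
    return eta_prev
```

Setting `threshold` to a negative value makes the loop always run exactly `max_iters` passes, which corresponds to repeating the processing a fixed number of times.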
- any encoding process may be used as long as the configuration of the encoding process can be specified based on at least the parameter η, and an encoding process other than that of the encoding unit 26 may be used.
- the encoding device includes, for example, a parameter determination unit 27 ′, an acoustic feature amount extraction unit 521, a specification unit 522, and an encoding unit 523.
- the encoding device performs each process illustrated in FIG. 18 to realize the encoding method.
- the parameter determination unit 27 ′ receives a time domain sound signal in frame units, which is a time-series signal. Examples of sound signals are voice digital signals or acoustic digital signals.
- the parameter determination unit 27' determines the parameter η by a process described later based on the input time-series signal (step FE1).
- the parameter determination unit 27' performs processing for each frame having a predetermined time length. That is, the parameter η is determined for each frame.
- the parameter η determined by the parameter determination unit 27' is output to the specifying unit 522.
- FIG. 21 shows a configuration example of the parameter determination unit 27 '.
- the parameter determination unit 27 ′ includes, for example, a frequency domain conversion unit 41, a spectrum envelope estimation unit 42, a whitened spectrum sequence generation unit 43, and a parameter acquisition unit 44.
- the spectrum envelope estimation unit 42 includes, for example, a linear prediction analysis unit 421 and a non-smoothed amplitude spectrum envelope sequence generation unit 422.
- FIG. 22 shows an example of each process of the parameter determination method realized by the parameter determination unit 27 '.
- The time-domain sound signal, which is a time-series signal, is input to the frequency domain conversion unit 41.
- The frequency domain conversion unit 41 converts the input time-domain sound signal into an N-point MDCT coefficient sequence X(0), X(1), ..., X(N-1) in the frequency domain. N is a positive integer.
- The obtained MDCT coefficient sequence X(0), X(1), ..., X(N-1) is output to the spectrum envelope estimation unit 42 and the whitened spectrum sequence generation unit 43.
- the subsequent processing is performed in units of frames.
- the frequency domain conversion unit 41 obtains a frequency domain sample sequence corresponding to the time series signal, for example, an MDCT coefficient sequence (step C41).
- The MDCT coefficient sequence X(0), X(1), ..., X(N-1) obtained by the frequency domain conversion unit 41 is input to the spectrum envelope estimation unit 42.
- Based on the parameter η_0 determined by a predetermined method, the spectrum envelope estimation unit 42 estimates the spectrum envelope, regarding the η_0-th power of the absolute value of the frequency domain sample sequence corresponding to the time-series signal as a power spectrum (step C42).
- the estimated spectrum envelope is output to the whitened spectrum sequence generation unit 43.
- The spectrum envelope estimation unit 42 estimates the spectrum envelope by generating a non-smoothed amplitude spectrum envelope sequence, for example through the processing of the linear prediction analysis unit 421 and the non-smoothed amplitude spectrum envelope sequence generation unit 422 described below.
- The parameter η_0 is determined by a predetermined method.
- For example, η_0 is a predetermined number greater than zero.
- For example, η_0 = 1.
- Alternatively, the η obtained for a frame before the frame for which the parameter η is currently being obtained (hereinafter referred to as the current frame) may be used as η_0. The frame before the current frame is, for example, a frame that precedes the current frame and is in its vicinity.
- The frame in the vicinity of the current frame is, for example, the frame immediately before the current frame.
- Linear prediction analysis unit 421: The MDCT coefficient sequence X(0), X(1), ..., X(N-1) obtained by the frequency domain conversion unit 41 is input to the linear prediction analysis unit 421.
- Using the MDCT coefficient sequence X(0), X(1), ..., X(N-1), the linear prediction analysis unit 421 performs linear prediction analysis on ~R(0), ~R(1), ..., ~R(N-1) defined by equation (C1) to generate linear prediction coefficients β_1, β_2, ..., β_p, and encodes the generated coefficients to obtain a linear prediction coefficient code and the quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p corresponding to that code.
- The generated quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p are output to the non-smoothed amplitude spectrum envelope sequence generation unit 422.
- Specifically, the linear prediction analysis unit 421 first performs an operation corresponding to an inverse Fourier transform in which the η_0-th power of the absolute values of the MDCT coefficients X(0), X(1), ..., X(N-1) is regarded as a power spectrum, and thereby obtains the pseudo correlation function signal sequence ~R(0), ~R(1), ..., ~R(N-1).
- The linear prediction analysis unit 421 then performs linear prediction analysis using the obtained pseudo correlation function signal sequence ~R(0), ~R(1), ..., ~R(N-1) to generate the linear prediction coefficients β_1, β_2, ..., β_p. It then encodes the generated linear prediction coefficients β_1, β_2, ..., β_p to obtain a linear prediction coefficient code and the quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p corresponding to that code.
- The linear prediction coefficients β_1, β_2, ..., β_p are the linear prediction coefficients corresponding to the time-domain signal obtained when the η_0-th power of the absolute values of the MDCT coefficient sequence X(0), X(1), ..., X(N-1) is regarded as a power spectrum.
- the generation of the linear prediction coefficient code by the linear prediction analysis unit 421 is performed by, for example, a conventional encoding technique.
- Conventional encoding techniques include, for example, a technique that uses a code corresponding to the linear prediction coefficients themselves as the linear prediction coefficient code, a technique that converts the linear prediction coefficients into LSP parameters and uses a code corresponding to the LSP parameters as the linear prediction coefficient code, and a technique that converts the linear prediction coefficients into PARCOR coefficients and uses a code corresponding to the PARCOR coefficients as the linear prediction coefficient code.
- In this way, the linear prediction analysis unit 421 generates linear prediction coefficients by performing linear prediction analysis using the pseudo correlation function signal sequence obtained by an inverse Fourier transform in which the η_0-th power of the absolute value of the frequency domain sample sequence, for example the MDCT coefficient sequence, is regarded as a power spectrum (step C421).
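As a minimal sketch of step C421, the following Python code computes a pseudo correlation sequence from the η_0-th power of the absolute MDCT coefficients and derives linear prediction coefficients from it with the Levinson-Durbin recursion. Equation (C1) is not reproduced in this text, so the cosine-sum form used in `pseudo_autocorrelation` is an assumption, as are the function names and the sign convention A(z) = 1 + Σ β_n z^(-n). Quantization and encoding of the coefficients are omitted.

```python
import math


def pseudo_autocorrelation(X, eta0, max_lag):
    """Pseudo correlation ~R(0..max_lag) of |X(n)|**eta0 treated as a power spectrum.

    ASSUMPTION: real (cosine) part of an N-point inverse DFT; equation (C1)
    itself is not available in this text.
    """
    N = len(X)
    power = [abs(x) ** eta0 for x in X]
    return [
        sum(p * math.cos(2.0 * math.pi * k * n / N) for n, p in enumerate(power))
        for k in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the normal equations for A(z) = 1 + sum_n beta_n z^-n.

    Returns (beta_1..beta_order, residual prediction error).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input: leave the remaining coefficients at zero
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection (PARCOR-type) coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err
```

With η_0 = 2 this reduces to ordinary linear prediction on the power spectrum; other values of η_0 change only the exponent applied before the transform.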
- Non-smoothed amplitude spectrum envelope sequence generation unit 422: The quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p generated by the linear prediction analysis unit 421 are input to the non-smoothed amplitude spectrum envelope sequence generation unit 422.
- The non-smoothed amplitude spectrum envelope sequence generation unit 422 generates the non-smoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1), which is the sequence of the amplitude spectrum envelope corresponding to the quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p.
- The generated non-smoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1) is output to the whitened spectrum sequence generation unit 43.
- Using the quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p, the non-smoothed amplitude spectrum envelope sequence generation unit 422 generates, as the non-smoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1), the sequence defined by equation (C2).
- In this way, the non-smoothed amplitude spectrum envelope sequence generation unit 422 estimates the spectrum envelope by obtaining, based on the coefficients that can be converted into the linear prediction coefficients generated by the linear prediction analysis unit 421, the non-smoothed spectrum envelope sequence, which is the sequence obtained by raising the amplitude spectrum envelope sequence corresponding to the pseudo correlation function signal sequence to the 1/η_0-th power (step C422).
- The non-smoothed amplitude spectrum envelope sequence generation unit 422 may obtain the non-smoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1) using the linear prediction coefficients β_1, β_2, ..., β_p generated by the linear prediction analysis unit 421 in place of the quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p. In this case, the linear prediction analysis unit 421 need not perform the process of obtaining the quantized linear prediction coefficients ^β_1, ^β_2, ..., ^β_p.
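The relation between the linear prediction coefficients and the non-smoothed amplitude spectrum envelope can be sketched as follows. Equation (C2) is not reproduced in this text; the sketch assumes the common all-pole form, in which the power envelope gain / |1 + Σ β_n exp(-j2πkn/N)|² is raised to the 1/η_0-th power. The function name and the `gain` argument are hypothetical.

```python
import cmath
import math


def unsmoothed_envelope(beta, N, eta0, gain=1.0):
    """Return ^H(0..N-1) for coefficients beta_1..beta_p.

    ASSUMPTION: all-pole envelope raised to the power 1/eta0; the exact
    scaling used in equation (C2) is not available here.
    """
    env = []
    for k in range(N):
        # Frequency response of A(z) = 1 + sum_n beta_n z^-n at bin k.
        a = 1.0 + sum(
            b * cmath.exp(-2j * math.pi * k * (n + 1) / N)
            for n, b in enumerate(beta)
        )
        env.append((gain / abs(a) ** 2) ** (1.0 / eta0))
    return env
```

The sketch does not guard against A(z) having a zero exactly on the unit circle, which would make the division fail; a practical implementation would add a floor.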
- The MDCT coefficient sequence X(0), X(1), ..., X(N-1) obtained by the frequency domain conversion unit 41 and the non-smoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1) generated by the non-smoothed amplitude spectrum envelope sequence generation unit 422 are input to the whitened spectrum sequence generation unit 43.
- The whitened spectrum sequence generation unit 43 generates the whitened spectrum sequence X_W(0), X_W(1), ..., X_W(N-1) by dividing each coefficient of the MDCT coefficient sequence X(0), X(1), ..., X(N-1) by the corresponding value of the non-smoothed amplitude spectrum envelope sequence ^H(0), ^H(1), ..., ^H(N-1).
- The generated whitened spectrum sequence X_W(0), X_W(1), ..., X_W(N-1) is output to the parameter acquisition unit 44.
- That is, for k = 0, 1, ..., N-1, each coefficient X(k) of the MDCT coefficient sequence is divided by the corresponding value ^H(k) of the non-smoothed amplitude spectrum envelope sequence to give the whitened spectrum sequence: X_W(k) = X(k) / ^H(k).
- In this way, the whitened spectrum sequence generation unit 43 obtains a whitened spectrum sequence, which is the sequence obtained by dividing the frequency domain sample sequence, for example the MDCT coefficient sequence, by the spectrum envelope, for example the non-smoothed amplitude spectrum envelope sequence (step C43).
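Step C43 is an element-wise division, sketched below. The function name and the `floor` guard against a zero envelope value are additions for illustration and are not part of the description above.

```python
def whiten(X, H, floor=1e-12):
    """Return X_W(k) = X(k) / ^H(k) for every frequency bin k."""
    if len(X) != len(H):
        raise ValueError("X and H must have the same length")
    # `floor` is a hypothetical safeguard so a zero envelope value cannot divide by zero.
    return [x / max(h, floor) for x, h in zip(X, H)]
```

For example, `whiten([2.0, -3.0], [2.0, 1.5])` divides each coefficient by its envelope value and keeps the sign of the coefficient.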
- The whitened spectrum sequence X_W(0), X_W(1), ..., X_W(N-1) generated by the whitened spectrum sequence generation unit 43 is input to the parameter acquisition unit 44.
- The parameter acquisition unit 44 obtains the parameter η for which a generalized Gaussian distribution having η as its shape parameter approximates the histogram of the whitened spectrum sequence X_W(0), X_W(1), ..., X_W(N-1) (step C44).
- In other words, the parameter acquisition unit 44 determines the parameter η for which the generalized Gaussian distribution having η as its shape parameter is close to the distribution of the histogram of the whitened spectrum sequence X_W(0), X_W(1), ..., X_W(N-1).
- The generalized Gaussian distribution with the parameter η as its shape parameter is defined as follows, for example.
- Γ is the gamma function.
- φ is a parameter corresponding to the variance.
- The η obtained by the parameter acquisition unit 44 is defined by equation (C3), for example.
- F^{-1} is the inverse function of the function F. This equation is derived by the so-called method of moments.
- The parameter acquisition unit 44 can obtain the parameter η by inputting the value of m_1/((m_2)^{1/2}) into the formulated inverse function F^{-1} and calculating the output value.
- To calculate the value of η defined by equation (C3), the parameter acquisition unit 44 may also obtain the parameter η by, for example, the first method or the second method described below.
- The first method for obtaining the parameter η is described first.
- In the first method, the parameter acquisition unit 44 calculates m_1/((m_2)^{1/2}) based on the whitened spectrum sequence and, referring to a plurality of pairs of η and the corresponding F(η) prepared in advance, obtains the η corresponding to the F(η) closest to the calculated m_1/((m_2)^{1/2}).
- The plurality of pairs of η and the corresponding F(η) prepared in advance are stored in the storage unit 441 of the parameter acquisition unit 44.
- The parameter acquisition unit 44 refers to the storage unit 441, finds the F(η) closest to the calculated m_1/((m_2)^{1/2}), and reads the η corresponding to the found F(η) from the storage unit 441 and outputs it.
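The first method can be sketched as a table lookup. The sketch assumes that m_1 and m_2 are the first and second absolute moments of the whitened spectrum sequence and that F(η) = Γ(2/η) / (Γ(1/η) Γ(3/η))^{1/2}, which is the value of m_1/((m_2)^{1/2}) for a generalized Gaussian distribution with shape parameter η. Neither definition is reproduced in this text, and the grid of η values and all names are arbitrary choices for illustration.

```python
import math


def F(eta):
    """Ratio m_1 / sqrt(m_2) of a generalized Gaussian with shape eta (ASSUMED form of F)."""
    return math.gamma(2.0 / eta) / math.sqrt(
        math.gamma(1.0 / eta) * math.gamma(3.0 / eta)
    )


def build_table(eta_min=0.1, eta_max=4.0, step=0.01):
    """Pairs (eta, F(eta)) prepared in advance; plays the role of storage unit 441."""
    count = int(round((eta_max - eta_min) / step)) + 1
    return [(eta_min + i * step, F(eta_min + i * step)) for i in range(count)]


def estimate_eta(xw, table):
    """Return the eta whose F(eta) is closest to m_1 / sqrt(m_2) of the sequence xw."""
    n = len(xw)
    m1 = sum(abs(v) for v in xw) / n  # first absolute moment
    m2 = sum(v * v for v in xw) / n   # second moment
    if m2 <= 0.0:
        raise ValueError("whitened spectrum sequence is all zero")
    target = m1 / math.sqrt(m2)
    return min(table, key=lambda pair: abs(pair[1] - target))[0]
```

Under this assumed F, η = 1 corresponds to a Laplacian distribution and η = 2 to a Gaussian distribution, so the table spans both common spectral models.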
- In the second method, an approximate curve function of the inverse function F^{-1} is set as, for example, ~F^{-1} represented by equation (C3′). The parameter acquisition unit 44 calculates m_1/((m_2)^{1/2}) based on the whitened spectrum sequence and obtains η by calculating the output value when the calculated m_1/((m_2)^{1/2}) is input to the approximate curve function ~F^{-1}.
- The η obtained by the parameter acquisition unit 44 may be defined not by equation (C3) but by an equation that generalizes equation (C3) using predetermined positive integers q1 and q2 (where q1 < q2), as in equation (C3″).
- In this case as well, η can be obtained by the same method as when η is defined by equation (C3). That is, the parameter acquisition unit 44 calculates, based on the whitened spectrum sequence, the value m_q1/((m_q2)^{q1/q2}) from the q1-th order moment m_q1 and the q2-th order moment m_q2. Then, as in the first and second methods described above, η can be obtained, for example, by referring to a plurality of pairs of η and the corresponding F′(η) prepared in advance and taking the η corresponding to the F′(η) closest to the calculated m_q1/((m_q2)^{q1/q2}).
- In this way, η is a value based on two different moments m_q1 and m_q2 of different orders.
- For example, η may be obtained based on the value of the ratio between the value of the lower-order moment, or a value based on it (hereinafter referred to as the former), and the value of the higher-order moment, or a value based on it (hereinafter referred to as the latter), on a value based on the value of this ratio, or on the value obtained by dividing the former by the latter.
- A value based on a moment is, for example, m^Q, where m is the moment and Q is a predetermined real number.
- η may also be obtained by inputting these values into the approximate curve function ~F^{-1}.
- As described above, the approximate curve function ~F′^{-1} may be any monotonically increasing function whose output is positive in the domain used.
- The parameter determination unit 27′ may obtain the parameter η by loop processing. That is, the parameter determination unit 27′ may perform the processing of the spectrum envelope estimation unit 42, the whitened spectrum sequence generation unit 43, and the parameter acquisition unit 44 once more, using the parameter η obtained by the parameter acquisition unit 44 as the parameter η_0 determined by a predetermined method.
- In this case, the parameter η obtained by the parameter acquisition unit 44 is output to the spectrum envelope estimation unit 42.
- The spectrum envelope estimation unit 42 estimates the spectrum envelope by performing the same process as described above, using the η obtained by the parameter acquisition unit 44 as the parameter η_0.
- Based on the newly estimated spectrum envelope, the whitened spectrum sequence generation unit 43 generates a whitened spectrum sequence by performing the same process as described above.
- The parameter acquisition unit 44 obtains the parameter η by performing the same process as described above on the newly generated whitened spectrum sequence.
- The processing of the spectrum envelope estimation unit 42, the whitened spectrum sequence generation unit 43, and the parameter acquisition unit 44 may be performed a predetermined number of further times.
- Alternatively, the processing of the spectrum envelope estimation unit 42, the whitened spectrum sequence generation unit 43, and the parameter acquisition unit 44 may be repeated until the absolute value of the difference between the parameter η obtained this time and the parameter η obtained last time is equal to or less than a predetermined threshold.
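The loop processing can be sketched as a fixed-point iteration. Here `estimate_once` is a hypothetical callable standing in for one pass through the spectrum envelope estimation unit 42, the whitened spectrum sequence generation unit 43, and the parameter acquisition unit 44; the default values of `eta0`, `max_iters`, and `tol` are illustrative only.

```python
def iterate_eta(frame, estimate_once, eta0=1.0, max_iters=10, tol=1e-3):
    """Re-estimate eta, feeding each result back as eta_0.

    Stops when |eta - previous eta| <= tol or after max_iters passes.
    """
    eta_prev = eta0
    for _ in range(max_iters):
        eta = estimate_once(frame, eta_prev)  # one pass of units 42, 43 and 44
        if abs(eta - eta_prev) <= tol:
            return eta
        eta_prev = eta
    return eta_prev
```

Setting `tol` to a negative value makes the loop run exactly `max_iters` times, which corresponds to the variant that repeats the processing a predetermined number of times.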
- the acoustic feature quantity extraction unit 521 receives a time domain sound signal in a frame unit, which is a time-series signal.
- The acoustic feature quantity extraction unit 521 calculates an index representing the loudness of the time-series signal as the acoustic feature quantity (step FE2). The calculated index representing the loudness is output to the specifying unit 522.
- the acoustic feature quantity extraction unit 521 generates an acoustic feature quantity code corresponding to the acoustic feature quantity and outputs the acoustic feature quantity code to the decoding device.
- the index that represents the loudness of the time series signal may be any index that represents the loudness of the time series signal.
- the index representing the loudness of the time series signal is, for example, the energy of the time series signal.
- When the specifying unit 522 does not use the index representing the loudness, the acoustic feature quantity extraction unit 521 need not calculate the index representing the loudness.
- The parameter η determined by the parameter determination unit 27′ and the index representing the loudness of the time-series signal calculated by the acoustic feature quantity extraction unit 521 are input to the specifying unit 522. The sound signal in frame units, which is a time-series signal, is also input as necessary.
- The specifying unit 522 specifies the configuration of the encoding process based on at least the parameter η (step FE3), generates a specific code that can specify the configuration of the encoding process, and outputs the specific code to the decoding device. Information about the configuration of the encoding process specified by the specifying unit 522 is output to the encoding unit 523.
- The specifying unit 522 may specify the configuration of the encoding process based only on the parameter η, or based on the parameter η and other parameters.
- The configuration of the encoding process may be an encoding method such as TCX (Transform Coded Excitation) or ACELP (Algebraic Code Excited Linear Prediction), or may be, within a given encoding method, the frame length that is the unit of temporal processing, the number of bits allocated to a code, the order of the coefficients that can be converted into linear prediction coefficients, or the value of any parameter used in the encoding process.
- That is, the frame length that is the unit of temporal processing, the number of bits allocated to a code, the order of the coefficients that can be converted into linear prediction coefficients, and the values of arbitrary parameters used in the encoding process may be determined appropriately according to the parameter η.
- The encoding apparatus and method of the second embodiment described above with reference to FIGS. 12 and 13 determine parameter values used in the encoding process according to the parameter η. They can therefore be regarded as an example of this modification of the second embodiment, which specifies the configuration of the encoding process based on the parameter η.
- the specific code that can specify the configuration of the encoding process may be any code as long as it can specify the configuration of the encoding process.
- The specific code that can specify the configuration of the encoding process is, for example, a flag with a predetermined bit string, such as "11" when TCX with a long frame length is specified as the configuration of the encoding process, "100" when TCX with a short frame length is specified, "101" when ACELP is specified, and "0" when a low-bit encoding process that transmits only the noise level and specifications is specified.
- The specific code that can specify the configuration of the encoding process may also be, for example, a parameter code representing the parameter η.
- When the configuration of the encoding process is specified by the specific code, the configuration of the corresponding decoding process is also specified. The specific code can therefore also be said to be a code that can specify the configuration of the decoding process.
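The example bit strings "11", "100", "101", and "0" form a prefix-free code, so a decoder can read them from a bit stream without separators. The sketch below illustrates this; the configuration labels and function names are hypothetical.

```python
# Example flags from the text; the keys are hypothetical labels for each configuration.
CODE_TABLE = {
    "tcx_long": "11",
    "tcx_short": "100",
    "acelp": "101",
    "low_bit_noise": "0",
}


def encode_configs(configs):
    """Concatenate the specific codes of a sequence of configurations."""
    return "".join(CODE_TABLE[c] for c in configs)


def decode_configs(bits):
    """Split a bit string back into configurations (valid because no code is a prefix of another)."""
    inverse = {code: name for name, code in CODE_TABLE.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    if buf:
        raise ValueError("truncated or invalid bit string")
    return out
```

The shortest flag goes to the low-bit configuration, which keeps the signalling cost at 1 bit per frame in that case.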
- The specifying unit 522 compares the index representing the loudness of the time-series signal with a predetermined threshold C_e, and also compares the parameter η with a predetermined threshold C_η.
- For example, C_e = maximum amplitude value × (1/128).
- For example, C_η = 1.
- In one case, the time-series signal is highly likely to be music mainly composed of wind instruments and stringed instruments.
- In this case, the specifying unit 522 determines to perform an encoding process suitable for continuous music.
- The encoding process suitable for continuous music is, for example, a TCX encoding process with a long frame length, specifically, a TCX encoding process with 1024-point frames.
- In another case, the time-series signal is highly likely to be speech, or music with large temporal variation mainly composed of percussion instruments or the like.
- the specifying unit 522 divides the time-series signal input as necessary into four, for example, creates four subframes, and measures the energy of the time-series signal for each subframe.
- the specifying unit 522 determines to perform an encoding process suitable for music having a large time variation.
- The encoding process suitable for music with large temporal variation is, for example, a TCX encoding process with a short frame length, specifically, a TCX encoding process with 256-point frames.
- For example, C_E = 1.5.
- the specifying unit 522 determines to perform an encoding process suitable for speech.
- the encoding process suitable for speech is speech encoding processing such as ACELP and CELP (Code Excited Linear Prediction).
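The subframe energy measurement can be sketched as follows. The text states only that the frame is divided into four subframes, that the energy of each is measured, and that a threshold C_E (for example 1.5) is used. Taking the ratio of the arithmetic mean to the geometric mean of the subframe energies as the quantity compared with C_E is an assumption, modelled on the PARCOR-based measure described elsewhere in this section; all names are hypothetical.

```python
import math


def subframe_energies(frame, n_sub=4):
    """Energy (sum of squares) of each of n_sub equal-length subframes."""
    length = len(frame) // n_sub
    return [
        sum(x * x for x in frame[i * length:(i + 1) * length])
        for i in range(n_sub)
    ]


def energy_variation(frame, n_sub=4, eps=1e-12):
    """ASSUMED measure: arithmetic mean / geometric mean of the subframe energies."""
    e = [max(v, eps) for v in subframe_energies(frame, n_sub)]
    arithmetic = sum(e) / len(e)
    geometric = math.exp(sum(math.log(v) for v in e) / len(e))
    return arithmetic / geometric


C_E = 1.5  # example threshold from the text
```

The ratio equals 1 when all subframe energies are equal and grows as the energy concentrates in fewer subframes, so comparing it with C_E separates steady frames from frames with strong temporal variation.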
- the time-series signal is highly likely to be a silent interval.
- the silent section does not mean a section where there is no sound at all, but means a section where there is no target sound but background sound and ambient noise exist.
- the specifying unit 522 determines that the time series signal is a silent section.
- In another case, the time-series signal is highly likely to be a background sound having characteristics such as those of background music (hereinafter referred to as BGM), which is continuous music at low volume. In this case, the specifying unit 522 determines to perform an encoding process suitable for a background sound having characteristics such as those of BGM.
- The encoding process suitable for a background sound having characteristics such as those of BGM is, for example, a TCX encoding process with a short frame length, specifically, a TCX encoding process with 256-point frames.
- The specifying unit 522 may specify the configuration of the encoding process based not only on the parameter η but also on at least one of the degree of temporal variation of the index representing the loudness of the input time-series signal, the spectral shape, the degree of temporal variation of the spectral shape, and the periodicity of the pitch.
- In this case, the acoustic feature quantity extraction unit 521 calculates whichever of these acoustic feature quantities (the degree of temporal variation of the index representing the loudness of the input time-series signal, the spectral shape, the degree of temporal variation of the spectral shape, and the periodicity of the pitch) the specifying unit 522 uses, and outputs it to the specifying unit 522.
- the acoustic feature quantity extraction unit 521 generates an acoustic feature quantity code corresponding to the calculated acoustic feature quantity and outputs the acoustic feature quantity code to the decoding device.
- The specifying unit 522 determines whether the temporal variation of the index representing the loudness of the time-series signal is large, and whether the parameter η is large.
- Whether the parameter η is large can be determined based on, for example, a predetermined threshold C_η. That is, if η ≥ C_η, it can be determined that the parameter η is large; otherwise, that the parameter η is small.
- the specifying unit 522 determines to perform an encoding process suitable for music having a large time variation.
- the specifying unit 522 determines that the time series signal is a silent section.
- the specifying unit 522 determines to perform an encoding process suitable for continuous music.
- The specifying unit 522 determines whether the spectral shape of the time-series signal is flat, and whether the parameter η is large.
- Whether the spectral shape of the time-series signal is flat can be determined based on, for example, a predetermined threshold E_V.
- the specifying unit 522 determines that the time series signal is a silent section.
- the specifying unit 522 determines to perform an encoding process suitable for music having a large time variation.
- the specifying unit 522 determines to perform an encoding process suitable for speech.
- the specifying unit 522 determines to perform an encoding process suitable for continuous music.
- The specifying unit 522 determines whether the temporal variation of the spectral shape of the time-series signal is large, and whether the parameter η is large.
- Whether the temporal variation of the spectral shape of the time-series signal is large can be determined based on a predetermined threshold E_V′.
- E_V′ is a predetermined threshold.
- For example, the value F_V is obtained by dividing the arithmetic mean of the absolute values of the first-order PARCOR coefficients of the four subframes constituting the time-series signal by their geometric mean.
- By comparing F_V with the threshold E_V′, it can be determined whether the temporal variation of the spectral shape of the time-series signal is large or small.
- the specifying unit 522 determines to perform an encoding process suitable for speech.
- the specifying unit 522 determines to perform an encoding process suitable for music having a large time variation.
- the specifying unit 522 determines that the time series signal is a silent section.
- the specifying unit 522 determines to perform an encoding process suitable for continuous music.
- The specifying unit 522 determines whether the periodicity of the pitch of the time-series signal is large, and whether the parameter η is large.
- Whether the periodicity of the pitch of the time-series signal is large can be determined based on, for example, a predetermined threshold C_P. That is, if the periodicity of the pitch of the time-series signal is equal to or greater than the predetermined threshold C_P, it can be determined that the periodicity of the pitch is large; otherwise, that it is small.
- The periodicity of the pitch is, for example, the normalized correlation function with the sequence separated by the pitch period of τ samples.
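A normalized correlation at a lag of τ samples can be computed as in the sketch below. The exact formula used in the original is not reproduced in this text, so the normalization by the energies of both segments is an assumption, and the function name is hypothetical.

```python
import math


def pitch_periodicity(x, tau):
    """Normalized correlation between x(n) and x(n - tau); lies in [-1, 1]."""
    if not 0 < tau < len(x):
        raise ValueError("tau must satisfy 0 < tau < len(x)")
    current = x[tau:]
    delayed = x[:len(x) - tau]
    num = sum(a * b for a, b in zip(current, delayed))
    den = math.sqrt(sum(a * a for a in current) * sum(b * b for b in delayed))
    return num / den if den > 0.0 else 0.0
```

A value near 1 indicates that the signal repeats with period τ, and it is this value that would be compared with the threshold C_P.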
- the specifying unit 522 determines to perform an encoding process suitable for speech.
- the specifying unit 522 determines to perform an encoding process suitable for continuous music.
- the specifying unit 522 determines that the time series signal is a silent section.
- the specifying unit 522 determines to perform an encoding process suitable for music having a large time variation.
- the encoding unit 523 receives a sound signal in frame units, which is a time-series signal, and information on the configuration of the encoding process specified by the specifying unit 522.
- the encoding unit 523 generates a code by encoding the input time-series signal by the encoding process having the specified configuration (step FE4).
- the generated code is output to the decoding device.
- When an encoding process suitable for continuous music has been specified, for example, a TCX (Transform Coded Excitation) encoding process with a long frame length, specifically, a TCX encoding process with 1024-point frames, is performed.
- When an encoding process suitable for music with large temporal variation has been specified, for example, a TCX encoding process with a short frame length, specifically, a TCX encoding process with 256-point frames, is performed.
- When an encoding process suitable for a background sound having characteristics such as those of BGM has been specified, for example, a TCX encoding process with a short frame length, specifically, a TCX encoding process with 256-point frames, is performed.
- When an encoding process suitable for speech has been specified, a speech encoding process such as ACELP (Algebraic Code Excited Linear Prediction) or CELP (Code Excited Linear Prediction) is performed.
- When the time-series signal is determined to be a silent section, the encoding unit 523 does not encode the input time-series signal and instead performs, for example, (i) the first method or (ii) the second method described below.
- (i) In the first method, the encoding unit 523 transmits information indicating a silent section to the decoding device. This information is transmitted with a small number of bits, such as 1 bit. After transmitting the information indicating a silent section, the encoding unit 523 need not transmit it again while the specifying unit 522 continues to determine that the time-series signal being processed is a silent section.
- (ii) In the second method, the encoding unit 523 transmits to the decoding device the information indicating a silent section, together with information on the shape of the spectrum envelope of the time-series signal and on the amplitude of the time-series signal.
- the decoding device includes, for example, a specific code decoding unit 525, an acoustic feature code decoding unit 526, a specifying unit 527, and a decoding unit 528.
- Each part of the decoding device performs each process illustrated in FIG. 20 to realize a decoding method.
- the specific code output from the encoding device is input to the specific code decoding unit 525.
- the specific code decoding unit 525 decodes the specific code and acquires information about the configuration of the encoding process (step FD1). Information about the configuration of the acquired encoding process is output to the specifying unit 527.
- When the specific code is a parameter code, the specific code decoding unit 525 decodes the parameter code to obtain the parameter η.
- The obtained parameter η is output to the specifying unit 527 as information about the configuration of the encoding process.
- the acoustic feature amount code decoding unit 526 receives the acoustic feature amount code output from the encoding device.
- The acoustic feature amount code decoding unit 526 decodes the acoustic feature amount code to obtain an acoustic feature amount that is at least one of the index representing the loudness of the time-series signal, the degree of temporal variation of that index, the spectral shape, the degree of temporal variation of the spectral shape, and the periodicity of the pitch (step FD2).
- The obtained acoustic feature amount is output to the specifying unit 527.
- When no acoustic feature amount code is output from the encoding device, the acoustic feature amount code decoding unit 526 does not perform this process.
- the identifying unit 527 identifies the configuration of the decoding process based on the information on the configuration of the encoding process (step FD3). For example, the specifying unit 527 specifies the configuration of the decoding process corresponding to the configuration of the encoding process specified by the information about the configuration of the encoding process. The specifying unit 527 may specify the configuration of the decoding process based on the information about the configuration of the encoding process and the acoustic feature amount as necessary. Information about the configuration of the identified decoding process is output to the decoding unit 528.
- For example, suppose that the parameter η is input as information about the configuration of the encoding process, and that an acoustic feature amount that is at least one of the index representing the loudness of the time-series signal, the degree of temporal variation of that index, the spectral shape, the degree of temporal variation of the spectral shape, and the periodicity of the pitch is also input.
- In this case, the specifying unit 527 uses the parameter η and the acoustic feature amount, according to the determination criterion, to specify the configuration of the decoding process corresponding to the configuration of the encoding process specified by the specifying unit 522.
- For example, one of a decoding process suitable for continuous music, a decoding process suitable for music with large temporal variation, a decoding process suitable for a background sound having characteristics such as those of BGM, and a decoding process suitable for speech is specified as the decoding process.
- the specifying unit 527 determines that the time series signal is a silent section.
- the decoding unit 528 receives the code output from the encoding device and information about the configuration of the decoding process specified by the specifying unit 527.
- the decoding unit 528 obtains a sound signal in frame units, which is a time-series signal, by the decoding process having the specified configuration (step FD4).
- TCX: Transform Coded Excitation
- as the decoding process suitable for music with large temporal fluctuation, for example, a TCX decoding process with a short frame length, specifically a TCX decoding process for 256 frames, is performed.
- as the decoding process suitable for background sound such as BGM, for example, a TCX decoding process with a short frame length, specifically a TCX decoding process for 256 frames, is performed.
- ACELP: Algebraic Code Excited Linear Prediction
- CELP: Code Excited Linear Prediction
- when the decoding device receives information indicating a silent section, or when the specifying unit 527 determines that the time-series signal is a silent section, the decoding unit 528 performs, for example, the process of (i) the first method or (ii) the second method described below.
- the decoding unit 528 generates a predetermined noise.
- the decoding unit 528 deforms a predetermined noise using the information on the shape of the spectral envelope of the time-series signal and the amplitude of the time-series signal, both received together with the information indicating a silent section, and outputs the deformed noise.
- as the noise deformation method, an existing method such as the one used in EVS (Enhanced Voice Services) may be used.
- the decoding unit 528 may generate noise when receiving information indicating that it is a silent section.
- if the linear prediction analysis unit 22 and the unsmoothed amplitude spectrum envelope sequence generation unit 23 are regarded as a single spectrum envelope estimation unit 2A, this unit can be said to estimate the spectral envelope (the unsmoothed amplitude spectrum envelope sequence) by regarding the η-th power of the absolute values of the frequency-domain sample sequence corresponding to the time-series signal, for example the MDCT coefficient sequence, as a power spectrum.
- "regarded as a power spectrum" means that the η-th-power spectrum is used wherever a power spectrum would normally be used.
- the linear prediction analysis unit 22 of the spectrum envelope estimation unit 2A can be said to obtain coefficients convertible into linear prediction coefficients by performing linear prediction analysis on a pseudo-correlation function signal sequence. That sequence is obtained by an inverse Fourier transform in which the η-th power of the absolute values of the frequency-domain sample sequence, for example the MDCT coefficient sequence, is regarded as a power spectrum. Further, the unsmoothed amplitude spectrum envelope sequence generation unit 23 of the spectrum envelope estimation unit 2A can be said to estimate the spectral envelope by obtaining the unsmoothed spectral envelope sequence, which is the amplitude spectrum envelope sequence corresponding to the coefficients obtained by the linear prediction analysis unit 22, raised to the power 1/η.
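The estimation just described can be sketched in Python. This is a minimal illustration under stated assumptions, not the implementation of the patent: the function names are hypothetical, the cosine transform with a half-sample frequency offset stands in for the inverse Fourier transform, and scaling constants are chosen for convenience.

```python
import math

def pseudo_autocorrelation(x, eta, order):
    """Treat |X(k)|**eta as a power spectrum and return lags 0..order.
    The half-sample cosine grid is an assumption of this sketch."""
    n = len(x)
    power = [abs(v) ** eta for v in x]
    return [
        sum(p * math.cos(math.pi * m * (k + 0.5) / n) for k, p in enumerate(power)) / n
        for m in range(order + 1)
    ]

def levinson_durbin(r, order):
    """Levinson-Durbin recursion.
    Returns ([1, b_1, ..., b_p], residual energy) for A(z) = 1 + sum b_j z^-j."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 1e-12:          # degenerate (e.g. all-zero) input: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err            # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, max(err, 0.0)

def unsmoothed_envelope(x, eta, order):
    """All-pole envelope of the eta-th-power spectrum, raised to the power 1/eta."""
    n = len(x)
    a, err = levinson_durbin(pseudo_autocorrelation(x, eta, order), order)
    env = []
    for k in range(n):
        w = math.pi * (k + 0.5) / n
        re = sum(a[j] * math.cos(w * j) for j in range(order + 1))
        im = sum(a[j] * math.sin(w * j) for j in range(order + 1))
        env.append((err / max(re * re + im * im, 1e-12)) ** (1.0 / eta))
    return env
```

With η = 2 this reduces to a conventional power-spectrum LPC envelope; other values of η change how strongly peaks dominate the fit.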
- the encoding unit 2B can be said to perform encoding that changes the bit allocation, or in which the bit allocation substantially changes, based on the spectral envelope (unsmoothed amplitude spectrum envelope sequence) estimated by the spectrum envelope estimation unit 2A, for each coefficient of the frequency-domain sample sequence corresponding to the time-series signal, for example the MDCT coefficient sequence.
- the decoding unit 3A can be said to obtain the frequency-domain sample sequence corresponding to the time-series signal by decoding the input integer signal code according to a bit allocation that changes, or substantially changes, based on the unsmoothed spectral envelope sequence.
- the encoding unit 2B may perform an encoding process other than the arithmetic encoding described above, provided that the bit allocation changes, or substantially changes, based on the spectral envelope (unsmoothed amplitude spectrum envelope sequence).
- the decoding unit 3A performs a decoding process corresponding to the encoding process performed by the encoding unit 2B.
- the encoding unit 2B may perform Golomb-Rice encoding on the frequency domain sample sequence using the Rice parameter determined based on the spectrum envelope (unsmoothed amplitude spectrum envelope sequence).
- the decoding unit 3A may perform Golomb-Rice decoding using the Rice parameter determined based on the spectrum envelope (unsmoothed amplitude spectrum envelope sequence).
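As an illustration of the Golomb-Rice alternative, the sketch below encodes and decodes a single integer with a given Rice parameter. The zigzag mapping of signed values and the rule in `rice_parameter_from_envelope` are assumptions made for this example; the text above only states that the Rice parameter is determined from the spectral envelope.

```python
import math

def rice_encode(value: int, r: int) -> str:
    """Golomb-Rice code of a signed integer as a bit string (unary quotient + r-bit remainder)."""
    u = 2 * value if value >= 0 else -2 * value - 1   # zigzag map to a non-negative integer
    q, rem = u >> r, u & ((1 << r) - 1)
    bits = "1" * q + "0"
    if r:
        bits += format(rem, f"0{r}b")
    return bits

def rice_decode(bits: str, r: int) -> int:
    """Inverse of rice_encode for a bit string holding exactly one codeword."""
    q = 0
    while bits[q] == "1":
        q += 1
    pos = q + 1                                       # skip the terminating "0"
    rem = int(bits[pos:pos + r], 2) if r else 0
    u = (q << r) | rem
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)

def rice_parameter_from_envelope(envelope_value: float) -> int:
    """Hypothetical rule: larger envelope values get a larger Rice parameter."""
    if envelope_value < 1.0:
        return 0
    return int(math.floor(math.log2(envelope_value)))
```

Because the Rice parameter depends only on the envelope, which the decoder can rebuild from the linear prediction coefficient code and η, no extra side information is needed for it in this sketch.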
- the encoding device need not carry the encoding process through to completion when determining the parameter η.
- the parameter determination unit 27 may determine the parameter η based on an estimated code amount.
- the encoding unit 2B uses each of the plurality of parameters η to obtain the estimated code amount of the code that the same encoding process as described above would produce for the frequency-domain sample sequence corresponding to the time-series signal in the same predetermined time interval.
- the parameter determination unit 27 selects one of the plurality of parameters η based on the obtained estimated code amounts. For example, the parameter η with the smallest estimated code amount is selected.
- the encoding unit 2B obtains and outputs a code by performing the same encoding process as described above using the selected parameter η.
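The selection by estimated code amount can be sketched as follows. The function `select_eta` follows the "smallest estimated code amount" rule from the text; `toy_estimate_bits` is a hypothetical stand-in for the estimator of the encoding unit (it scores the samples under a generalized Gaussian of shape η with a maximum-likelihood scale) and is not taken from the patent.

```python
import math
from typing import Callable, Sequence

def select_eta(samples: Sequence[float],
               candidates: Sequence[float],
               estimate_bits: Callable[[Sequence[float], float], float]) -> float:
    """Return the candidate eta whose estimated code amount is smallest."""
    return min(candidates, key=lambda eta: estimate_bits(samples, eta))

def toy_estimate_bits(samples: Sequence[float], eta: float) -> float:
    """Hypothetical cost: negative log2-likelihood under a generalized Gaussian of shape eta."""
    n = len(samples)
    # Maximum-likelihood scale for the given shape (guarded against all-zero input).
    scale = max((eta * sum(abs(x) ** eta for x in samples) / n) ** (1.0 / eta), 1e-12)
    log_norm = math.log(eta / (2.0 * scale * math.gamma(1.0 / eta)))
    nll = sum((abs(x) / scale) ** eta for x in samples) - n * log_norm
    return nll / math.log(2.0)

# Usage: pick eta for one frame from a small candidate set.
frame = [0.1, -2.5, 0.0, 0.4, -0.2, 3.1, 0.05, -0.3]
best = select_eta(frame, [0.5, 1.0, 1.5, 2.0], toy_estimate_bits)
```

Only the estimate is computed per candidate; the full encoding process then runs once, with the selected η.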
- the encoding apparatus may further include a dividing unit 28, indicated by a broken line in the figure.
- based on the frequency-domain sample sequence generated by the frequency domain transform unit 21, for example the MDCT coefficient sequence, the dividing unit 28 generates a first frequency-domain sample sequence composed of the samples corresponding to the periodic components of the frequency-domain sample sequence and a second frequency-domain sample sequence composed of the remaining samples, and outputs information representing the samples corresponding to the periodic components to the decoding apparatus as auxiliary information.
- in other words, the first frequency-domain sample sequence is composed of samples corresponding to the peaks of the frequency-domain sample sequence, and the second frequency-domain sample sequence is composed of samples corresponding to its valleys.
- for example, the first frequency-domain sample sequence is a sample sequence composed of all or part of (a) one or a plurality of consecutive samples of the frequency-domain sample sequence that include the sample corresponding to the periodicity or fundamental frequency of the corresponding time-series signal, and (b) one or a plurality of consecutive samples of the frequency-domain sample sequence that include a sample corresponding to an integer multiple of that periodicity or fundamental frequency.
- the second frequency-domain sample sequence is the sample sequence composed of the samples of the frequency-domain sample sequence that are not included in the first frequency-domain sample sequence.
- the generation of the first frequency domain sample sequence and the second frequency domain sample sequence can be performed using a method described in International Publication WO2012 / 046685.
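A simplified picture of the division and its inverse is given below. It is only a sketch: the method of WO2012/046685 is not reproduced here, and the rule "take `width` samples on each side of every integer multiple of `interval`" as well as the names `split_periodic` and `combine` are assumptions for illustration.

```python
def split_periodic(samples, interval, width=1):
    """Split into (first, second, first_idx): samples near integer multiples
    of `interval` go to the first sequence, the rest to the second."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    n = len(samples)
    periodic = set()
    u = 1
    while round(u * interval) - width < n:
        center = round(u * interval)
        for k in range(center - width, center + width + 1):
            if 0 <= k < n:
                periodic.add(k)
        u += 1
    first_idx = sorted(periodic)
    first = [samples[k] for k in first_idx]
    second = [samples[k] for k in range(n) if k not in periodic]
    return first, second, first_idx

def combine(first, second, first_idx, n):
    """Rebuild the original order from the two sequences and the auxiliary indices."""
    members = set(first_idx)
    it_first, it_second = iter(first), iter(second)
    return [next(it_first) if k in members else next(it_second) for k in range(n)]
```

Here `first_idx` plays the role of the auxiliary information: it is what the decoder side needs in order to interleave the two decoded sequences again.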
- the linear prediction analysis unit 22, the unsmoothed amplitude spectrum envelope sequence generation unit 23, the smoothed amplitude spectrum envelope sequence generation unit 24, the envelope normalization unit 25, the encoding unit 26, and the parameter determination unit 27 perform the encoding process described in the first embodiment or the second embodiment on each of the first frequency-domain sample sequence and the second frequency-domain sample sequence to obtain a code.
- encoding can be performed more efficiently by encoding each of the first frequency domain sample sequence and the second frequency domain sample sequence.
- the decoding apparatus may further include a combining unit 38 indicated by a broken line in FIG.
- the decoding apparatus performs the decoding process described in the first embodiment or the second embodiment on the code corresponding to the first frequency-domain sample sequence (for example, a parameter code, a linear prediction coefficient code, an integer signal code, and a gain code) to obtain a decoded first frequency-domain sample sequence.
- likewise, based on the code corresponding to the second frequency-domain sample sequence (for example, a parameter code, a linear prediction coefficient code, an integer signal code, and a gain code), the decoding process is performed to obtain a decoded second frequency-domain sample sequence.
- the combining unit 38 uses the input auxiliary information to combine the decoded first frequency-domain sample sequence and the decoded second frequency-domain sample sequence appropriately, thereby obtaining a decoded frequency-domain sample sequence, for example the decoded MDCT coefficient sequence ^X(0), ^X(1), ..., ^X(N-1).
- the time domain transform unit obtains a time series signal by transforming the decoded frequency domain sample sequence into the time domain.
- the combination using the auxiliary information can be performed using the method described in International Publication No. WO2012 / 046685.
- the encoding device may encode only the first frequency-domain sample sequence, generating only the code corresponding to the first frequency-domain sample sequence and no code corresponding to the second frequency-domain sample sequence. In that case the decoding apparatus may obtain the decoded frequency-domain sample sequence from the first frequency-domain sample sequence obtained from the code and a second frequency-domain sample sequence whose sample values are all 0.
- the linear prediction analysis unit 22, the unsmoothed amplitude spectrum envelope sequence generation unit 23, the smoothed amplitude spectrum envelope sequence generation unit 24, the envelope normalization unit 25, the encoding unit 26, and the parameter determination unit 27 may also generate a code by performing the encoding process described in the first embodiment or the second embodiment on a rearranged sample sequence, that is, a sample sequence obtained by combining the first frequency-domain sample sequence and the second frequency-domain sample sequence. For example, when arithmetic coding is performed, a parameter code, a linear prediction coefficient code, an integer signal code, and a gain code corresponding to the rearranged sample sequence are generated.
- encoding can be performed more efficiently by performing encoding on the rearranged sample sequence.
- the decoding device performs the decoding process described in the first embodiment or the second embodiment to obtain a decoded rearranged sample sequence. Using the input auxiliary information, it then rearranges the decoded rearranged sample sequence according to the rule corresponding to the rule by which the encoding device generated the first frequency-domain sample sequence and the second frequency-domain sample sequence, and obtains a decoded frequency-domain sample sequence, for example the decoded MDCT coefficient sequence ^X(0), ^X(1), ..., ^X(N-1).
- the time domain transform unit 36 transforms the decoded frequency domain sample sequence into the time domain to obtain a time series signal. Rearrangement using auxiliary information can be performed using the method described in International Publication No. WO2012 / 046685.
- for each frame, the encoding device may select any one of the following methods: (1) generating a code by encoding the frequency-domain sample sequence; (2) generating a code by encoding each of the first frequency-domain sample sequence and the second frequency-domain sample sequence; (3) generating a code by encoding only the first frequency-domain sample sequence; and (4) generating a code by encoding the rearranged sample sequence obtained by combining the first frequency-domain sample sequence and the second frequency-domain sample sequence.
- the encoding apparatus also outputs a code indicating which of the methods (1) to (4) was selected, and the decoding apparatus performs the decoding corresponding to that method according to the code input for each frame.
- the parameter determination unit 27 of the encoding device and the parameter decoding unit 37 of the decoding device may store candidates for the parameter η corresponding to each of the methods (1) to (4).
- the linear prediction analysis unit 22 of the encoding device and the linear prediction coefficient decoding unit 31 of the decoding device may store candidates for the quantized linear prediction coefficients and for the decoded linear prediction coefficients corresponding to each of the methods (1) to (4).
- the unsmoothed amplitude spectrum envelope sequence generation unit 23 and the unsmoothed amplitude spectrum envelope sequence generation unit 422 may generate a periodic integrated envelope sequence by modifying the spectral envelope sequence (unsmoothed amplitude spectrum envelope sequence) based on the periodic component of the frequency-domain sample sequence, for example the MDCT coefficient sequence X(0), X(1), ..., X(N-1).
- similarly, the unsmoothed amplitude spectrum envelope sequence generation unit 32 may generate a periodic integrated envelope sequence by modifying the spectral envelope sequence (unsmoothed amplitude spectrum envelope sequence) based on the periodic component of the decoded frequency-domain sample sequence, for example the decoded MDCT coefficient sequence ^X(0), ^X(1), ..., ^X(N-1).
- in this case, the dispersion parameter determination unit 268 of the encoding unit 26, the decoding unit 34, and the whitened spectrum sequence generation unit 43 perform the same processing as above using the periodic integrated envelope sequence instead of the spectral envelope sequence (unsmoothed amplitude spectrum envelope sequence). Because the periodic integrated envelope sequence approximates the spectrum well near the peaks caused by the pitch period of the time-series signal, using it can increase the coding efficiency.
- the periodic integrated envelope sequence may be a sequence obtained by changing the values of at least the samples of the spectral envelope sequence at integer multiples of the period of the frequency-domain sample sequence, and the samples in the vicinity of those integer multiples, by a larger amount as the degree of periodicity of the time-series signal is larger. It may also be a sequence obtained by changing the values of a larger number of samples in the vicinity of the integer multiples of the period as the period of the frequency-domain sample sequence is larger.
- T is the interval of the periodic component of the frequency-domain sample sequence.
- L is the number of digits after the decimal point of the interval T.
- v is an integer of 1 or more.
- floor(·) is a function that returns an integer value by truncating the fractional part, and Round(·) is a function that returns an integer value by rounding to the nearest integer.
- T' = T × 2^L.
- ^H[0], ..., ^H[N-1] is the spectral envelope sequence.
- a mixing-ratio value determines the mixing ratio of the spectral envelope ^H[n] and the periodic envelope P[k].
- for the integers k in the range (U × T') / 2^L - v - 1 ≤ k ≤ (U × T') / 2^L + v - 1, the periodic envelope sequence P[1], ..., P[N] is obtained, and the periodic integrated envelope sequence ^H_M[1], ..., ^H_M[N] may be obtained using the obtained periodic envelope sequence P[1], ..., P[N].
- h and PD may be predetermined values other than the above example.
- the value that determines the mixing ratio of the spectral envelope ^H[n] and the periodic envelope P[k] may be determined in advance in the encoding device and the decoding device, or the encoding device may generate a code indicating the value it has determined and output that code to the decoding device.
- in the latter case, the decoding apparatus obtains the mixing-ratio value by decoding the input code.
- by using the obtained mixing-ratio value, the unsmoothed amplitude spectrum envelope sequence generation unit 32 of the decoding device can obtain the same periodic integrated envelope sequence as the one generated by the encoding device.
- the encoding unit 2C can be said to encode the time-series signal for each predetermined time interval by an encoding process whose configuration is specified at least on the basis of the parameter η for that time interval.
- the encoding unit 2D can likewise be said to encode the time-series signal for each predetermined time interval by an encoding process whose configuration is specified at least on the basis of the parameter η for that time interval.
- the encoding unit 2C and the encoding unit 2D perform the same processing.
- the processing described above is not only executed in time series in the order described, but may also be executed in parallel or individually as required by the processing capability of the apparatus that executes the processing.
- each method or each apparatus may be realized by a computer.
- the processing content of each method or each device is described by a program.
- various processes in each method or each device are realized on the computer.
- the program describing the processing contents can be recorded on a computer-readable recording medium.
- a computer-readable recording medium for example, any recording medium such as a magnetic recording device, an optical disk, a magneto-optical recording medium, and a semiconductor memory may be used.
- this program is distributed by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM in which the program is recorded. Further, the program may be distributed by storing the program in a storage device of the server computer and transferring the program from the server computer to another computer via a network.
- a computer that executes such a program first stores a program recorded on a portable recording medium or a program transferred from a server computer in its storage unit. When executing the process, this computer reads the program stored in its own storage unit and executes the process according to the read program.
- a computer may read a program directly from a portable recording medium and execute processing according to the program. Further, each time a program is transferred from the server computer to the computer, processing according to the received program may be executed sequentially.
- alternatively, the above-described processing may be executed by a so-called ASP (Application Service Provider) type service, in which the program is not transferred from the server computer to the computer and the processing functions are realized only through execution instructions and result acquisition.
- the program includes information provided for processing by the electronic computer and equivalent to the program (data that is not a direct command to the computer but has a property that defines the processing of the computer).
- although each device is configured by executing a predetermined program on a computer, at least part of the processing may be realized by hardware.
Abstract
Description
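As a method for encoding sound signals at low bit rates (for example, about 10 kbit/s to 20 kbit/s), adaptive coding of orthogonal transform coefficients in the frequency domain, such as DFT (discrete Fourier transform) or MDCT (modified discrete cosine transform) coefficients, is known. For example, MPEG USAC (Unified Speech and Audio Coding), a standardized technology, has a TCX (transform coded excitation) coding mode in which the MDCT coefficients are normalized frame by frame, quantized, and then variable-length coded (see, for example, Reference 1).
FIG. 1 shows a configuration example of a conventional TCX-based encoding device. Each unit in FIG. 1 is described below.
A sound signal, which is a time-domain time-series signal, is input to the frequency domain transform unit 11. The sound signal is, for example, a speech signal or an acoustic signal.
A sound signal, which is a time-domain time-series signal, is input to the linear prediction analysis unit 12.
The quantized linear prediction coefficients ^α_1, ^α_2, …, ^α_p generated by the linear prediction analysis unit 12 are input to the smoothed amplitude spectrum envelope sequence generation unit 14.
The quantized linear prediction coefficients ^α_1, ^α_2, …, ^α_p generated by the linear prediction analysis unit 12 are input to the unsmoothed amplitude spectrum envelope sequence generation unit 13.
The MDCT coefficient sequence X(0), X(1), …, X(N-1) generated by the frequency domain transform unit 11 and the smoothed amplitude spectrum envelope sequence ^W_γ(0), ^W_γ(1), …, ^W_γ(N-1) output by the smoothed amplitude spectrum envelope sequence generation unit 14 are input to the envelope normalization unit 15.
The normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1) generated by the envelope normalization unit 15, the smoothed amplitude spectrum envelope sequence ^W_γ(0), ^W_γ(1), …, ^W_γ(N-1) output by the smoothed amplitude spectrum envelope sequence generation unit 14, and the unsmoothed amplitude spectrum envelope sequence ^W(0), ^W(1), …, ^W(N-1) output by the unsmoothed amplitude spectrum envelope sequence generation unit 13 are input to the encoding unit 16.
A specific example of the encoding process performed by the encoding unit 16 is described next.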
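From the input normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1), the gain acquisition unit 161 determines and outputs a global gain g such that the number of bits of the integer signal code is no more than the allocated bit count B, which is the number of bits allocated in advance, and is as large as possible. The global gain g obtained by the gain acquisition unit 161 serves as the initial value of the global gain used by the quantization unit 162.
The quantization unit 162 obtains and outputs the quantized normalized coefficient sequence X_Q(0), X_Q(1), …, X_Q(N-1), which is the sequence of the integer parts of the results of dividing each coefficient of the input normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1) by the global gain g obtained by the gain acquisition unit 161 or the gain update unit 167.
The dispersion parameter determination unit 163 obtains and outputs the dispersion parameters φ(0), φ(1), …, φ(N-1) for the respective frequencies from the input unsmoothed amplitude spectrum envelope sequence ^W(0), ^W(1), …, ^W(N-1) and the input smoothed amplitude spectrum envelope sequence ^W_γ(0), ^W_γ(1), …, ^W_γ(N-1), according to Equation (B3).
The arithmetic encoding unit 164 arithmetic-codes the quantized normalized coefficient sequence X_Q(0), X_Q(1), …, X_Q(N-1) obtained by the quantization unit 162, using the dispersion parameters φ(0), φ(1), …, φ(N-1) obtained by the dispersion parameter determination unit 163, to obtain an integer signal code, and outputs the integer signal code together with the consumed bit count C, which is the number of bits of the integer signal code. This arithmetic code performs a bit allocation that is optimal when the quantized normalized coefficient at each frequency k (= 0, …, N-1) follows, for example, a Laplace distribution of the random variable X.
When the number of gain updates has reached a predetermined number, the determination unit 166 outputs the integer signal code and outputs to the gain encoding unit 165 an instruction signal to encode the global gain g obtained by the gain update unit 167. When the number of gain updates is less than the predetermined number, it outputs the consumed bit count C measured by the arithmetic encoding unit 164 to the gain update unit 167.
When the consumed bit count C measured by the arithmetic encoding unit 164 is greater than the allocated bit count B, the gain update unit 167 updates the global gain g to a larger value and outputs it. When the consumed bit count C is less than the allocated bit count B, it updates the global gain g to a smaller value and outputs the updated value.
In accordance with the instruction signal output by the determination unit 166, the gain encoding unit 165 encodes the global gain g obtained by the gain update unit 167 to obtain and output a gain code.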
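The interplay of the quantization unit 162, the determination unit 166, and the gain update unit 167 amounts to a rate loop, sketched below in Python. The sketch rests on assumptions: `count_bits` is a hypothetical stand-in for the consumed bit count C measured by the arithmetic encoding unit 164, and the multiplicative step that shrinks on every update is one possible update rule, not the one specified in the text.

```python
def quantize(xn, g):
    """Integer part of each normalized coefficient divided by the global gain g."""
    return [int(v / g) for v in xn]

def count_bits(xq):
    """Hypothetical cost model standing in for the consumed bit count C."""
    return sum(1 + 2 * abs(q).bit_length() for q in xq)

def rate_loop(xn, budget_bits, g_init, max_updates=16, step=2.0):
    """Update g a bounded number of times; return (g, XQ) for the best setting
    that fits the budget, or the last setting tried if none fits."""
    g = g_init
    best = None
    last = (g, quantize(xn, g))
    for _ in range(max_updates):
        xq = quantize(xn, g)
        c = count_bits(xq)
        last = (g, xq)
        if c <= budget_bits and (best is None or c > best[0]):
            best = (c, g, xq)            # feasible and uses more of the budget
        if c > budget_bits:
            g *= step                    # too many bits: coarser quantization
        elif c < budget_bits:
            g /= step                    # bits to spare: finer quantization
        else:
            break
        step = step ** 0.5               # shrink the step so the search settles
    return (best[1], best[2]) if best else last

# Usage: fit four coefficients into a 20-bit budget.
g, xq = rate_loop([10.0, -3.5, 0.2, 7.9], budget_bits=20, g_init=1.0)
```

Bounding the number of updates mirrors the role of the determination unit 166, which stops the loop after a predetermined number of gain updates.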
(Encoding)
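FIG. 4 shows a configuration example of the encoding device of the first embodiment. As shown in FIG. 4, the encoding device of the first embodiment includes, for example, a frequency domain transform unit 21, a linear prediction analysis unit 22, an unsmoothed amplitude spectrum envelope sequence generation unit 23, a smoothed amplitude spectrum envelope sequence generation unit 24, an envelope normalization unit 25, an encoding unit 26, and a parameter determination unit 27. FIG. 5 shows an example of the processes of the encoding method of the first embodiment realized by this encoding device.
In the first embodiment, any one of a plurality of parameters η can be selected by the parameter determination unit 27 for each predetermined time interval.
A sound signal, which is a time-domain time-series signal, is input to the frequency domain transform unit 21. Examples of the sound signal are a digital speech signal and a digital acoustic signal.
The MDCT coefficient sequence X(0), X(1), …, X(N-1) obtained by the frequency domain transform unit 21 is input to the linear prediction analysis unit 22.
The quantized linear prediction coefficients ^β_1, ^β_2, …, ^β_p generated by the linear prediction analysis unit 22 are input to the unsmoothed amplitude spectrum envelope sequence generation unit 23.
The quantized linear prediction coefficients ^β_1, ^β_2, …, ^β_p generated by the linear prediction analysis unit 22 are input to the smoothed amplitude spectrum envelope sequence generation unit 24.
The MDCT coefficient sequence X(0), X(1), …, X(N-1) obtained by the frequency domain transform unit 21 and the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), …, ^H_γ(N-1) generated by the smoothed amplitude spectrum envelope sequence generation unit 24 are input to the envelope normalization unit 25.
The normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1) generated by the envelope normalization unit 25, the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), …, ^H(N-1) generated by the unsmoothed amplitude spectrum envelope sequence generation unit 23, the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), …, ^H_γ(N-1) generated by the smoothed amplitude spectrum envelope sequence generation unit 24, and the energy σ^2 of the prediction residual calculated by the linear prediction analysis unit 22 are input to the encoding unit 26.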
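As specific example 1 of the encoding process performed by the encoding unit 26, an example that does not include loop processing is described.
The normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1) generated by the envelope normalization unit 25 is input to the gain acquisition unit 261.
The normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1) generated by the envelope normalization unit 25 and the global gain g obtained by the gain acquisition unit 261 are input to the quantization unit 262.
The parameter η read out by the parameter determination unit 27, the global gain g obtained by the gain acquisition unit 261, the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), …, ^H(N-1) generated by the unsmoothed amplitude spectrum envelope sequence generation unit 23, the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), …, ^H_γ(N-1) generated by the smoothed amplitude spectrum envelope sequence generation unit 24, and the energy σ^2 of the prediction residual obtained by the linear prediction analysis unit 22 are input to the dispersion parameter determination unit 268.
The parameter η read out by the parameter determination unit 27, the quantized normalized coefficient sequence X_Q(0), X_Q(1), …, X_Q(N-1) obtained by the quantization unit 262, and the dispersion parameter sequence φ(0), φ(1), …, φ(N-1) obtained by the dispersion parameter determination unit 268 are input to the arithmetic encoding unit 269.
The global gain g obtained by the gain acquisition unit 261 is input to the gain encoding unit 265.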
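As specific example 2 of the encoding process performed by the encoding unit 26, an example that includes loop processing is described.
The normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1) generated by the envelope normalization unit 25 is input to the gain acquisition unit 261.
The normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1) generated by the envelope normalization unit 25 and the global gain g obtained by the gain acquisition unit 261 or the gain update unit 267 are input to the quantization unit 262.
The parameter η read out by the parameter determination unit 27, the global gain g obtained by the gain acquisition unit 261 or the gain update unit 267, the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), …, ^H(N-1) generated by the unsmoothed amplitude spectrum envelope sequence generation unit 23, the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), …, ^H_γ(N-1) generated by the smoothed amplitude spectrum envelope sequence generation unit 24, and the energy σ^2 of the prediction residual obtained by the linear prediction analysis unit 22 are input to the dispersion parameter determination unit 268.
The parameter η read out by the parameter determination unit 27, the quantized normalized coefficient sequence X_Q(0), X_Q(1), …, X_Q(N-1) obtained by the quantization unit 262, and the dispersion parameter sequence φ(0), φ(1), …, φ(N-1) obtained by the dispersion parameter determination unit 268 are input to the arithmetic encoding unit 269.
The integer signal code obtained by the arithmetic encoding unit 269 is input to the determination unit 266.
The consumed bit count C measured by the arithmetic encoding unit 269 is input to the gain update unit 267.
The output instruction from the determination unit 266 and the global gain g obtained by the gain update unit 267 are input to the gain encoding unit 265.
The encoding unit 26 may perform encoding that changes the bit allocation based on the estimated spectral envelope (unsmoothed amplitude spectrum envelope), for example by performing the following processing.
The codes generated for each parameter η by the processing of steps A1 to A6 for the frequency-domain sample sequence corresponding to the time-series signal of the same predetermined time interval (in this example, the linear prediction coefficient code, the gain code, and the integer signal code) are input to the parameter determination unit 27.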
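FIG. 9 shows a configuration example of the decoding device corresponding to the encoding device. As shown in FIG. 9, the decoding device of the first embodiment includes, for example, a linear prediction coefficient decoding unit 31, an unsmoothed amplitude spectrum envelope sequence generation unit 32, a smoothed amplitude spectrum envelope sequence generation unit 33, a decoding unit 34, an envelope denormalization unit 35, a time domain transform unit 36, and a parameter decoding unit 37. FIG. 10 shows an example of the processes of the decoding method of the first embodiment realized by this decoding device.
The parameter code output by the encoding device is input to the parameter decoding unit 37.
The linear prediction coefficient code output by the encoding device is input to the linear prediction coefficient decoding unit 31.
The decoded parameter η obtained by the parameter decoding unit 37 and the decoded linear prediction coefficients ^β_1, ^β_2, …, ^β_p obtained by the linear prediction coefficient decoding unit 31 are input to the unsmoothed amplitude spectrum envelope sequence generation unit 32.
The decoded parameter η obtained by the parameter decoding unit 37 and the decoded linear prediction coefficients ^β_1, ^β_2, …, ^β_p obtained by the linear prediction coefficient decoding unit 31 are input to the smoothed amplitude spectrum envelope sequence generation unit 33.
The decoded parameter η obtained by the parameter decoding unit 37, the code corresponding to the normalized MDCT coefficient sequence output by the encoding device, the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), …, ^H(N-1) generated by the unsmoothed amplitude spectrum envelope sequence generation unit 32, and the smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), …, ^H_γ(N-1) generated by the smoothed amplitude spectrum envelope sequence generation unit 33 are input to the decoding unit 34.
The smoothed amplitude spectrum envelope sequence ^H_γ(0), ^H_γ(1), …, ^H_γ(N-1) generated by the smoothed amplitude spectrum envelope sequence generation unit 33 and the decoded normalized MDCT coefficient sequence ^X_N(0), ^X_N(1), …, ^X_N(N-1) generated by the decoding unit 34 are input to the envelope denormalization unit 35.
The decoded MDCT coefficient sequence ^X(0), ^X(1), …, ^X(N-1) generated by the envelope denormalization unit 35 is input to the time domain transform unit 36.
The encoding device and method of the first embodiment perform encoding for each of a plurality of parameters η to generate codes, select the optimal code from among the codes generated for the respective parameters η, and output the selected code and the parameter code corresponding to the selected code.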
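FIG. 12 shows a configuration example of the encoding device of the second embodiment. As shown in FIG. 12, the encoding device includes, for example, a frequency domain transform unit 21, a linear prediction analysis unit 22, an unsmoothed amplitude spectrum envelope sequence generation unit 23, a smoothed amplitude spectrum envelope sequence generation unit 24, an envelope normalization unit 25, an encoding unit 26, and a parameter determination unit 27'. FIG. 13 shows an example of the processes of the encoding method realized by this encoding device.
A time-domain sound signal, which is a time-series signal, is input to the parameter determination unit 27'. Examples of the sound signal are a digital speech signal and a digital acoustic signal.
The parameter η determined by the parameter determination unit 27' is output to the linear prediction analysis unit 22, the unsmoothed amplitude spectrum envelope sequence generation unit 23, the smoothed amplitude spectrum envelope sequence generation unit 24, and the encoding unit 26.
A time-domain sound signal, which is a time-series signal, is input to the frequency domain transform unit 41. Examples of the sound signal are a digital speech signal and a digital acoustic signal.
The MDCT coefficient sequence X(0), X(1), …, X(N-1) obtained by the frequency domain transform unit 41 is input to the spectrum envelope estimation unit 42.
The MDCT coefficient sequence X(0), X(1), …, X(N-1) obtained by the frequency domain transform unit 41 is input to the linear prediction analysis unit 421.
The quantized linear prediction coefficients ^β_1, ^β_2, …, ^β_p generated by the linear prediction analysis unit 421 are input to the unsmoothed amplitude spectrum envelope sequence generation unit 422.
The MDCT coefficient sequence X(0), X(1), …, X(N-1) obtained by the frequency domain transform unit 41 and the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), …, ^H(N-1) generated by the unsmoothed amplitude spectrum envelope sequence generation unit 422 are input to the whitened spectrum sequence generation unit 43.
The whitened spectrum sequence X_W(0), X_W(1), …, X_W(N-1) generated by the whitened spectrum sequence generation unit 43 is input to the parameter acquisition unit 44.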
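One way the parameter acquisition unit 44 could derive η from the whitened spectrum sequence is by moment matching against a generalized Gaussian distribution, sketched below. This is an assumption made for illustration: the excerpt does not give the formula, and the function names, the search range [0.2, 10], and the bisection solver are hypothetical choices. The sketch relies on the fact that, for a generalized Gaussian with shape η, the ratio E|x| / sqrt(E x^2) equals Γ(2/η) / sqrt(Γ(1/η)·Γ(3/η)) and increases with η.

```python
import math

def moment_ratio(eta: float) -> float:
    """E|x| / sqrt(E x^2) for a generalized Gaussian with shape parameter eta."""
    return math.exp(math.lgamma(2.0 / eta)
                    - 0.5 * (math.lgamma(1.0 / eta) + math.lgamma(3.0 / eta)))

def solve_shape(target: float, lo: float = 0.2, hi: float = 10.0) -> float:
    """Invert moment_ratio by bisection; clamps to [lo, hi] outside the range."""
    if target <= moment_ratio(lo):
        return lo
    if target >= moment_ratio(hi):
        return hi
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if moment_ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def estimate_eta(whitened) -> float:
    """Estimate the shape parameter eta from a whitened spectrum sequence."""
    n = len(whitened)
    m1 = sum(abs(x) for x in whitened) / n
    m2 = sum(x * x for x in whitened) / n
    if m2 <= 0.0:
        raise ValueError("whitened spectrum must not be all zero")
    return solve_shape(m1 / math.sqrt(m2))
```

As a sanity check on the formula, η = 1 (Laplace) gives a ratio of 1/√2 and η = 2 (Gaussian) gives √(2/π).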
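The decoding device and method of the second embodiment are the same as those of the first embodiment, so redundant description is omitted.
Note that the encoding process may be of any kind as long as its configuration can be specified at least on the basis of the parameter η, and an encoding process other than that of the encoding unit 26 may be used.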
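An example of the encoding device and method of a modification of the second embodiment is described.
A frame-by-frame time-domain sound signal, which is a time-series signal, is input to the parameter determination unit 27'. Examples of the sound signal are a digital speech signal and a digital acoustic signal.
A time-domain sound signal, which is a time-series signal, is input to the frequency domain transform unit 41.
The MDCT coefficient sequence X(0), X(1), …, X(N-1) obtained by the frequency domain transform unit 41 is input to the spectrum envelope estimation unit 42.
The MDCT coefficient sequence X(0), X(1), …, X(N-1) obtained by the frequency domain transform unit 41 is input to the linear prediction analysis unit 421.
The quantized linear prediction coefficients ^β_1, ^β_2, …, ^β_p generated by the linear prediction analysis unit 421 are input to the unsmoothed amplitude spectrum envelope sequence generation unit 422.
The MDCT coefficient sequence X(0), X(1), …, X(N-1) obtained by the frequency domain transform unit 41 and the unsmoothed amplitude spectrum envelope sequence ^H(0), ^H(1), …, ^H(N-1) generated by the unsmoothed amplitude spectrum envelope sequence generation unit 422 are input to the whitened spectrum sequence generation unit 43.
The whitened spectrum sequence X_W(0), X_W(1), …, X_W(N-1) generated by the whitened spectrum sequence generation unit 43 is input to the parameter acquisition unit 44.
A frame-by-frame time-domain sound signal, which is a time-series signal, is input to the acoustic feature amount extraction unit 521.
The parameter η determined by the parameter determination unit 27' and the index representing the loudness of the time-series signal calculated by the acoustic feature amount extraction unit 521 are input to the specifying unit 522. A frame-by-frame sound signal, which is a time-series signal, is also input as necessary.
When the spectral shape of the time-series signal is not flat and the parameter η is large, the time-series signal is likely to be speech. In this case, the specifying unit 522 decides to perform an encoding process suitable for speech.
The frame-by-frame sound signal, which is a time-series signal, and the information about the configuration of the encoding process specified by the specifying unit 522 are input to the encoding unit 523.
The encoding unit 523 transmits information indicating a silent section to the decoding device. This information is transmitted with a small number of bits, for example 1 bit. After transmitting the information indicating a silent section, the encoding unit 523 need not send it again while the specifying unit 522 continues to determine that the time-series signal being processed is a silent section.
The encoding unit 523 transmits to the decoding device the information indicating a silent section, together with information on the shape of the spectral envelope of the time-series signal and on the amplitude of the time-series signal.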
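An example of the decoding device and method is described.
The specific code output by the encoding device is input to the specific code decoding unit 525.
When the specific code is a parameter code, the specific code decoding unit 525 decodes the parameter code to obtain the parameter η and outputs the obtained parameter η to the specifying unit 527 as information about the configuration of the encoding process.
The acoustic feature amount code output by the encoding device is input to the acoustic feature amount code decoding unit 526.
The information about the configuration of the encoding process obtained by the specific code decoding unit 525 is input to the specifying unit 527. The acoustic feature amount obtained by the acoustic feature amount code decoding unit 526 is also input to the specifying unit 527 as necessary.
The code output by the encoding device and the information about the configuration of the decoding process specified by the specifying unit 527 are input to the decoding unit 528.
This corresponds to (i) the first method on the encoding side.
The decoding unit 528 deforms a predetermined noise using the information on the shape of the spectral envelope of the time-series signal and on the amplitude of the time-series signal, both received together with the information indicating a silent section, and outputs the deformed noise. As the noise deformation method, an existing method such as the one used in EVS (Enhanced Voice Services) may be used.
If the linear prediction analysis unit 22 and the unsmoothed amplitude spectrum envelope sequence generation unit 23 are regarded as a single spectrum envelope estimation unit 2A, this spectrum envelope estimation unit 2A can be said to estimate the spectral envelope (unsmoothed amplitude spectrum envelope sequence) by regarding the η-th power of the absolute values of the frequency-domain sample sequence corresponding to the time-series signal, for example the MDCT coefficient sequence, as a power spectrum. Here, "regarded as a power spectrum" means that the η-th-power spectrum is used wherever a power spectrum would normally be used.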
For the integers k in the range (U×T')/2^L - v - 1 ≤ k ≤ (U×T')/2^L + v - 1,
Claims (23)
- 所定の時間区間ごとの時系列信号を周波数領域で符号化する符号化装置であって、
パラメータηを正の数として、時系列信号に対応するパラメータηを、その時系列信号に対応する周波数領域サンプル列の絶対値のη乗をパワースペクトルと見做すことにより推定されたスペクトル包絡スペクトル包絡で上記周波数領域サンプル列を除算した系列である白色化スペクトル系列のヒストグラムを近似する一般化ガウス分布の形状パラメータとして、上記所定の時間区間ごとに複数のパラメータηの何れかが選択可能又はパラメータηが可変とされており、
上記所定の時間区間ごとのパラメータηに少なくとも基づいて特定される構成の符号化処理により、上記所定の時間区間ごとの時系列信号を符号化する符号化部と、
を含む符号化装置。 - 請求項1の符号化装置であって、
上記符号化部は、上記所定の時間区間ごとに、上記時系列信号に対応する周波数領域サンプル列の絶対値のη乗をパワースペクトルと見做したスペクトル包絡の推定により推定されたスペクトル包絡の値を基にビット割り当てを変える又は実質的にビット割り当てが変わる符号化処理により、上記時系列信号に対応する周波数領域サンプル列を符号化して符号を得て出力し、
上記出力された符号に対応するパラメータηを表すパラメータ符号を出力する、
符号化装置。 - 請求項2の符号化装置であって、
上記所定の時間区間ごとにパラメータηを決定するパラメータ決定部を更に含み、
上記符号化部は、上記決定されたパラメータηを用いて上記符号化処理を行うことにより符号を得て出力する、
符号化装置。 - 請求項2の符号化装置であって、
上記符号化部は、上記複数のパラメータηのそれぞれを用いて同一の所定の時間区間の時系列信号に対応する周波数領域サンプル列に対して上記符号化処理を行うことにより複数の符号を得て、
得られた符号の符号量及び得られた符号に対応する符号化歪の少なくとも一方に基づいて上記複数の符号の中の何れか1つの符号を選択して出力する、
符号化装置。 - 請求項2の符号化装置であって、
上記符号化部は、上記複数のパラメータηのそれぞれを用いて同一の所定の時間区間の時系列信号に対応する周波数領域サンプル列に対する上記符号化処理により得られる符号の推定符号量を得て、
上記得られた推定符号量に基づいて上記複数のパラメータηの何れか1つを選択し、
上記選択されたパラメータηを用いて上記符号化処理を行うことにより符号を得て出力する、
符号化装置。 - 請求項2から5の何れかの符号化装置であって、
上記周波数領域サンプル列を、上記周波数領域サンプル列の周期性成分に対応するサンプルから構成される第一周波数領域サンプル列と、上記周波数領域サンプル列の周期性成分に対応するサンプル以外のサンプルから構成される第二周波数領域サンプル列とに分割し、上記周期性成分に対応するサンプルを表す情報を補助情報として出力する分割部を更に含み、
上記符号化装置は、第一周波数領域サンプル列及び第二周波数領域サンプル列のそれぞれについて上記符号化処理を行う、
符号化装置。 - 請求項1の符号化装置であって、
入力された時系列信号に対応するパラメータηを決定するパラメータ決定部と、
少なくとも上記決定されたパラメータηに基づいて符号化処理の構成を特定し、上記符号化処理の構成を特定可能な特定符号を生成し出力する特定部と、を更に含み、
上記符号化部は、上記特定された構成の符号化処理により、上記入力された時系列信号を符号化する、
符号化装置。 - 請求項7の符号化装置において、
上記特定部は、上記決定されたパラメータηだけではなく、上記入力された時系列信号の音の大きさを表す指標、音の大きさを表す指標の時間的変動、スペクトル形状、スペクトル形状の時間的変動、ピッチの周期性の度合いの少なくとも1つに更に基づいて符号化処理の構成を特定する、
符号化装置。 - 請求項8の符号化装置において、
上記符号化処理の構成を特定可能な特定符号は、上記入力された時系列信号に対応するパラメータηを表すパラメータ符号である、
符号化装置。 - 所定の時間区間ごとの時系列信号を周波数領域で符号化する符号化装置であって、
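a parameter η is a positive number, and, for each predetermined time interval, any one of a plurality of parameters η is selectable or the parameter η is variable,
the encoding device comprising an encoding unit that, for each predetermined time interval, encodes the frequency-domain sample sequence corresponding to the time-series signal to obtain and output a code, by an encoding process that changes the bit allocation, or in which the bit allocation substantially changes, based on the values of a spectral envelope estimated by regarding the η-th power of the absolute values of the frequency-domain sample sequence corresponding to the time-series signal as a power spectrum, and
a parameter code representing the parameter η corresponding to the output code is output.
- A decoding device wherein a parameter η is a positive number, and a parameter code representing the parameter η is a code representing the shape parameter of a generalized Gaussian distribution that approximates the histogram of a whitened spectral sequence, which is the sequence obtained by dividing the frequency-domain sample sequence corresponding to that parameter η by a spectral envelope estimated by regarding the η-th power of the absolute values of the frequency-domain sample sequence as a power spectrum,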
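the decoding device comprising:
a parameter code decoding unit that decodes an input parameter code to obtain the parameter η;
a specifying unit that specifies the configuration of a decoding process based at least on the obtained parameter η; and
a decoding unit that decodes an input code by the decoding process of the specified configuration.
- The decoding device according to claim 11, wherein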
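the decoding device obtains, by decoding in the frequency domain, a frequency-domain sample sequence corresponding to a time-series signal,
the decoding device further comprising:
a linear prediction coefficient decoding unit that obtains coefficients convertible into linear prediction coefficients by decoding an input linear prediction coefficient code; and
an unsmoothed spectral envelope sequence generation unit that uses the obtained parameter η to obtain an unsmoothed spectral envelope sequence, which is the sequence obtained by raising the amplitude spectral envelope sequence corresponding to the coefficients convertible into linear prediction coefficients to the power 1/η, wherein
the decoding unit obtains the frequency-domain sample sequence corresponding to the time-series signal by decoding an input integer signal code in accordance with a bit allocation that changes, or substantially changes, based on the unsmoothed spectral envelope sequence.
- The decoding device according to claim 11,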
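further comprising an acoustic feature code decoding unit that decodes an input acoustic feature code to obtain at least one of: an index representing loudness, the temporal variation of that index, the spectral shape, the temporal variation of the spectral shape, and the degree of pitch periodicity, wherein
the specifying unit specifies the configuration of the decoding process based not only on the obtained parameter η but also on at least one of: the index representing loudness, the temporal variation of that index, the spectral shape, the temporal variation of the spectral shape, and the pitch periodicity.
- The decoding device according to claim 11 or 13, wherein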
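the decoding unit generates noise when it receives information indicating a silent interval.
- A decoding device that obtains, by decoding in the frequency domain, a frequency-domain sample sequence corresponding to a time-series signal, the decoding device comprising: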
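a parameter code decoding unit that decodes an input parameter code to obtain a parameter η;
a linear prediction coefficient decoding unit that obtains coefficients convertible into linear prediction coefficients by decoding an input linear prediction coefficient code;
an unsmoothed spectral envelope sequence generation unit that uses the obtained parameter η to obtain an unsmoothed spectral envelope sequence, which is the sequence obtained by raising the amplitude spectral envelope sequence corresponding to the coefficients convertible into linear prediction coefficients to the power 1/η; and
a decoding unit that obtains the frequency-domain sample sequence corresponding to the time-series signal by decoding an input integer signal code in accordance with a bit allocation that changes, or substantially changes, based on the unsmoothed spectral envelope sequence.
- An encoding method for encoding, in the frequency domain, a time-series signal for each predetermined time interval, wherein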
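a parameter η is a positive number; the parameter η corresponding to a time-series signal is the shape parameter of a generalized Gaussian distribution that approximates the histogram of a whitened spectral sequence, which is the sequence obtained by dividing the frequency-domain sample sequence corresponding to that time-series signal by a spectral envelope estimated by regarding the η-th power of the absolute values of the frequency-domain sample sequence as a power spectrum; and, for each predetermined time interval, any one of a plurality of parameters η is selectable or the parameter η is variable,
the encoding method comprising an encoding step of encoding the time-series signal for each predetermined time interval by an encoding process whose configuration is specified based at least on the parameter η for that predetermined time interval.
- The encoding method according to claim 16, wherein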
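the encoding step, for each predetermined time interval, encodes the frequency-domain sample sequence corresponding to the time-series signal to obtain and output a code, by an encoding process that changes the bit allocation, or in which the bit allocation substantially changes, based on the values of a spectral envelope estimated by regarding the η-th power of the absolute values of the frequency-domain sample sequence corresponding to the time-series signal as a power spectrum, and
a parameter code representing the parameter η corresponding to the output code is output.
- An encoding method for encoding, in the frequency domain, a time-series signal for each predetermined time interval, wherein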
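a parameter η is a positive number, and, for each predetermined time interval, any one of a plurality of parameters η is selectable or the parameter η is variable,
the encoding method comprising an encoding step of, for each predetermined time interval, encoding the frequency-domain sample sequence corresponding to the time-series signal to obtain and output a code, by an encoding process that changes the bit allocation, or in which the bit allocation substantially changes, based on the values of a spectral envelope estimated by regarding the η-th power of the absolute values of the frequency-domain sample sequence corresponding to the time-series signal as a power spectrum, and
a parameter code representing the parameter η corresponding to the output code is output.
- A decoding method wherein a parameter η is a positive number, and a parameter code representing the parameter η is a code representing the shape parameter of a generalized Gaussian distribution that approximates the histogram of a whitened spectral sequence, which is the sequence obtained by dividing the frequency-domain sample sequence corresponding to that parameter η by a spectral envelope estimated by regarding the η-th power of the absolute values of the frequency-domain sample sequence as a power spectrum,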
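the decoding method comprising:
a parameter code decoding step of decoding an input parameter code to obtain the parameter η;
a specifying step of specifying the configuration of a decoding process based at least on the obtained parameter η; and
a decoding step of decoding an input code by the decoding process of the specified configuration.
- The decoding method according to claim 19, wherein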
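the decoding method obtains, by decoding in the frequency domain, a frequency-domain sample sequence corresponding to a time-series signal,
the decoding method comprising:
a linear prediction coefficient decoding step of obtaining coefficients convertible into linear prediction coefficients by decoding an input linear prediction coefficient code;
an unsmoothed spectral envelope sequence generation step of using the obtained parameter η to obtain an unsmoothed spectral envelope sequence, which is the sequence obtained by raising the amplitude spectral envelope sequence corresponding to the coefficients convertible into linear prediction coefficients to the power 1/η; and
a decoding step of obtaining the frequency-domain sample sequence corresponding to the time-series signal by decoding an input integer signal code in accordance with a bit allocation that changes, or substantially changes, based on the unsmoothed spectral envelope sequence.
- A decoding method for obtaining, by decoding in the frequency domain, a frequency-domain sample sequence corresponding to a time-series signal, the decoding method comprising: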
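a parameter code decoding step of decoding an input parameter code to obtain a parameter η;
a linear prediction coefficient decoding step of obtaining coefficients convertible into linear prediction coefficients by decoding an input linear prediction coefficient code;
an unsmoothed spectral envelope sequence generation step of using the obtained parameter η to obtain an unsmoothed spectral envelope sequence, which is the sequence obtained by raising the amplitude spectral envelope sequence corresponding to the coefficients convertible into linear prediction coefficients to the power 1/η; and
a decoding step of obtaining the frequency-domain sample sequence corresponding to the time-series signal by decoding an input integer signal code in accordance with a bit allocation that changes, or substantially changes, based on the unsmoothed spectral envelope sequence.
- A program for causing a computer to function as each unit of the encoding device according to any one of claims 1 to 10 or the decoding device according to any one of claims 11 to 15.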
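- A computer-readable recording medium on which is recorded a program for causing a computer to function as each unit of the encoding device according to any one of claims 1 to 10 or the decoding device according to any one of claims 11 to 15.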
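Claims 1 and 11 characterize η as the shape parameter of a generalized Gaussian distribution fitted to the whitened spectral sequence. The claims do not fix an estimation procedure, so the following is only a minimal sketch using standard moment matching: for a zero-mean generalized Gaussian, (E|X|)^2 / E[X^2] = Γ(2/η)^2 / (Γ(1/η) Γ(3/η)), which can be inverted numerically. The function names, the search range [0.1, 10], and the bisection solver are assumptions made for illustration.

```python
import math


def ggd_moment_ratio(eta: float) -> float:
    """Return E[|X|]^2 / E[X^2] for a zero-mean generalized Gaussian with shape eta."""
    # Work with log-gamma so that small eta (large gamma arguments) stays stable.
    log_ratio = (2.0 * math.lgamma(2.0 / eta)
                 - math.lgamma(1.0 / eta)
                 - math.lgamma(3.0 / eta))
    return math.exp(log_ratio)


def solve_eta(target_ratio: float, lo: float = 0.1, hi: float = 10.0) -> float:
    """Find eta in [lo, hi] whose moment ratio matches target_ratio by bisection."""
    # The ratio increases with eta, so targets outside the range are clamped
    # and the search converges to the nearest end of the interval.
    target = min(max(target_ratio, ggd_moment_ratio(lo)), ggd_moment_ratio(hi))
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if ggd_moment_ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def estimate_eta(whitened: list[float]) -> float:
    """Estimate the shape parameter from a whitened spectral sequence (hypothetical helper)."""
    n = len(whitened)
    m1 = sum(abs(x) for x in whitened) / n   # first absolute moment
    m2 = sum(x * x for x in whitened) / n    # second moment
    if m2 == 0.0:
        raise ValueError("whitened sequence must contain a non-zero sample")
    return solve_eta(m1 * m1 / m2)
```

A ratio of 0.5 corresponds to η = 1 (Laplacian) and a ratio of 2/π to η = 2 (Gaussian), which gives a quick sanity check on the solver.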
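Claim 4 describes encoding the same frame once per candidate η and keeping one of the resulting codes. The sketch below covers only the code-amount criterion (coding distortion is left out), and the `encode` callback is a hypothetical stand-in for the encoding process, which the claims do not specify at this level.

```python
from typing import Callable, Sequence


def select_by_code_amount(
    samples: Sequence[float],
    etas: Sequence[float],
    encode: Callable[[Sequence[float], float], bytes],
) -> tuple[float, bytes]:
    """Encode one frame with every candidate eta and keep the shortest code."""
    best_eta, best_code = None, None
    for eta in etas:
        code = encode(samples, eta)  # hypothetical encoder for this frame
        if best_code is None or len(code) < len(best_code):
            best_eta, best_code = eta, code
    if best_code is None:
        raise ValueError("at least one candidate eta is required")
    return best_eta, best_code


def toy_encode(samples: Sequence[float], eta: float) -> bytes:
    """Toy encoder used only to exercise the selection logic."""
    return b"x" * (int(abs(eta - 1.0) * 10) + 1)
```

With the toy encoder, `select_by_code_amount([0.0], [0.5, 1.0, 2.0], toy_encode)` picks η = 1.0. The variant in claim 5 would replace `encode` with a cheaper code-amount estimator and run the real encoder only once, for the selected η.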
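Claim 6 splits the frequency-domain sample sequence into samples that belong to the periodic component and the remaining samples, and outputs auxiliary information identifying the former. As a rough illustration, the sketch below assumes that the periodic samples are the bins within `width` of a positive integer multiple of a period `period`; both parameters and the function name are hypothetical, and the returned index list plays the role of the auxiliary information.

```python
from typing import Sequence


def split_by_period(
    samples: Sequence[float], period: float, width: int = 0
) -> tuple[list[float], list[float], list[int]]:
    """Split bins into those near a positive multiple of `period` and the rest."""
    if period <= 0:
        raise ValueError("period must be positive")
    first: list[float] = []          # samples of the periodic component
    second: list[float] = []         # all remaining samples
    periodic_indices: list[int] = [] # auxiliary information
    for k, x in enumerate(samples):
        nearest = round(k / period) * period  # closest multiple of the period
        if nearest > 0 and abs(k - nearest) <= width:
            first.append(x)
            periodic_indices.append(k)
        else:
            second.append(x)
    return first, second, periodic_indices
```

For ten bins and a period of 4, the periodic indices are 4 and 8; a decoder that receives the index list can interleave the two decoded sequences back into their original order.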
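Claims 12 and 15 derive the unsmoothed spectral envelope sequence by raising the amplitude spectral envelope given by the decoded linear prediction coefficients to the power 1/η. A minimal sketch of that step follows; it assumes an all-pole model 1/|A(ω)|^2 with A(z) = 1 + Σ β_n z^(-n), bin centres at ω_k = π(k + 0.5)/N, and no gain or normalization constant. None of these conventions are stated in the claims.

```python
import cmath
import math


def unsmoothed_envelope(lpc: list[float], eta: float, n_bins: int) -> list[float]:
    """Return (1 / |A(w_k)|^2) ** (1 / eta) for k = 0 .. n_bins - 1.

    lpc holds beta_1 .. beta_p of A(z) = 1 + sum_n beta_n z^-n.
    The bin centres w_k = pi * (k + 0.5) / n_bins are an assumption of this sketch.
    """
    if eta <= 0:
        raise ValueError("eta must be positive")
    envelope = []
    for k in range(n_bins):
        w = math.pi * (k + 0.5) / n_bins
        # Evaluate the prediction polynomial A on the unit circle at frequency w.
        a = 1.0 + sum(b * cmath.exp(-1j * w * (n + 1)) for n, b in enumerate(lpc))
        envelope.append((1.0 / abs(a) ** 2) ** (1.0 / eta))
    return envelope
```

With η = 2 this reduces to the conventional amplitude envelope 1/|A(ω)|; smaller η exaggerates the peaks of the envelope, which in turn shifts the bit allocation that depends on it.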
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2016572110A JP6387117B2 (ja) | 2015-01-30 | 2016-01-27 | Encoding device, decoding device, methods therefor, program, and recording medium |
KR1020177020235A KR101996307B1 (ko) | 2015-01-30 | 2016-01-27 | Encoding device, decoding device, methods therefor, program, and recording medium |
US15/544,465 US10224049B2 (en) | 2015-01-30 | 2016-01-27 | Apparatuses and methods for encoding and decoding a time-series sound signal by obtaining a plurality of codes and encoding and decoding distortions corresponding to the codes |
CN201680007279.3A CN107210042B (zh) | 2015-01-30 | 2016-01-27 | Encoding device, encoding method, and recording medium |
EP16743429.9A EP3252758B1 (en) | 2015-01-30 | 2016-01-27 | Encoding apparatus, decoding apparatus, and methods, programs and recording media for encoding apparatus and decoding apparatus |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2015017691 | 2015-01-30 | ||
JP2015-017691 | 2015-01-30 | ||
JP2015081770 | 2015-04-13 | ||
JP2015-081770 | 2015-04-13 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016121826A1 (ja) | 2016-08-04 |
Family
ID=56543436
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2016/052365 WO2016121826A1 (ja) | Encoding device, decoding device, methods therefor, program, and recording medium | 2015-01-30 | 2016-01-27 |
Country Status (6)
Country | Link |
---|---|
US (1) | US10224049B2 (ja) |
EP (1) | EP3252758B1 (ja) |
JP (1) | JP6387117B2 (ja) |
KR (1) | KR101996307B1 (ja) |
CN (2) | CN113921021A (ja) |
WO (1) | WO2016121826A1 (ja) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102070145B1 (ko) | 2015-01-30 | 2020-01-28 | Nippon Telegraph and Telephone Corporation | Parameter determination device, method, program, and recording medium |
EP3270376B1 (en) * | 2015-04-13 | 2020-03-18 | Nippon Telegraph and Telephone Corporation | Sound signal linear predictive coding |
US11621010B2 (en) * | 2018-03-02 | 2023-04-04 | Nippon Telegraph And Telephone Corporation | Coding apparatus, coding method, program, and recording medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
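JPH08288852A (ja) * | 1995-04-11 | 1996-11-01 | Pioneer Electron Corp | Quantization device and quantization method |
JP2006304270A (ja) * | 2005-03-23 | 2006-11-02 | Fuji Xerox Co Ltd | Decoding device, inverse quantization method, and programs therefor |
WO2012046685A1 (ja) * | 2010-10-05 | 2012-04-12 | Nippon Telegraph and Telephone Corporation | Encoding method, decoding method, encoding device, decoding device, program, and recording medium |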
Family Cites Families (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5651090A (en) * | 1994-05-06 | 1997-07-22 | Nippon Telegraph And Telephone Corporation | Coding method and coder for coding input signals of plural channels using vector quantization, and decoding method and decoder therefor |
US6714907B2 (en) * | 1998-08-24 | 2004-03-30 | Mindspeed Technologies, Inc. | Codebook structure and search for speech coding |
JP2002055699A (ja) * | 2000-08-10 | 2002-02-20 | Mitsubishi Electric Corp | Speech encoding device and speech encoding method |
JP3590342B2 (ja) * | 2000-10-18 | 2004-11-17 | Nippon Telegraph and Telephone Corporation | Signal encoding method and device, and recording medium storing a signal encoding program |
DE60126149T8 (de) * | 2000-11-27 | 2008-01-31 | Nippon Telegraph And Telephone Corp. | Method, device and program for encoding and decoding an acoustic parameter, and method, device and program for encoding and decoding sounds |
US6871176B2 (en) * | 2001-07-26 | 2005-03-22 | Freescale Semiconductor, Inc. | Phase excited linear prediction encoder |
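CN100394693C (zh) * | 2005-01-21 | 2008-06-11 | Huazhong University of Science and Technology | Encoding and decoding method for variable-length codes |
JPWO2007037359A1 (ja) * | 2005-09-30 | 2009-04-16 | Panasonic Corporation | Speech encoding device and speech encoding method |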
US7813563B2 (en) * | 2005-12-09 | 2010-10-12 | Florida State University Research Foundation | Systems, methods, and computer program products for compression, digital watermarking, and other digital signal processing for audio and/or video applications |
KR100738109B1 (ko) * | 2006-04-03 | 2007-07-12 | Samsung Electronics Co., Ltd. | Method and apparatus for quantizing and dequantizing an input signal, and method and apparatus for encoding and decoding an input signal |
CN101140759B (zh) * | 2006-09-08 | 2010-05-12 | Huawei Technologies Co., Ltd. | Bandwidth extension method and system for speech or audio signals |
WO2009027606A1 (fr) * | 2007-08-24 | 2009-03-05 | France Telecom | Symbol-plane encoding/decoding with dynamic calculation of probability tables |
US8856049B2 (en) * | 2008-03-26 | 2014-10-07 | Nokia Corporation | Audio signal classification by shape parameter estimation for a plurality of audio signal samples |
GB2466674B (en) * | 2009-01-06 | 2013-11-13 | Skype | Speech coding |
KR101542370B1 (ko) * | 2011-02-16 | 2015-08-12 | Nippon Telegraph and Telephone Corporation | Encoding method, decoding method, encoding device, decoding device, program, and recording medium |
US9009036B2 (en) * | 2011-03-07 | 2015-04-14 | Xiph.org Foundation | Methods and systems for bit allocation and partitioning in gain-shape vector quantization for audio coding |
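ES2704742T3 (es) * | 2011-04-05 | 2019-03-19 | Nippon Telegraph & Telephone | Decoding of an acoustic signal |
WO2012144128A1 (ja) * | 2011-04-20 | 2012-10-26 | Panasonic Corporation | Speech/audio encoding device, speech/audio decoding device, and methods therefor |
WO2013176177A1 (ja) * | 2012-05-23 | 2013-11-28 | Nippon Telegraph and Telephone Corporation | Encoding method, decoding method, encoding device, decoding device, program, and recording medium |
CN107004422B (zh) | 2014-11-27 | 2020-08-25 | Nippon Telegraph and Telephone Corporation | Encoding device, decoding device, and methods and programs therefor |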
EP3270376B1 (en) * | 2015-04-13 | 2020-03-18 | Nippon Telegraph and Telephone Corporation | Sound signal linear predictive coding |
- 2016
- 2016-01-27 KR KR1020177020235A patent/KR101996307B1/ko active IP Right Grant
- 2016-01-27 CN CN202111170288.3A patent/CN113921021A/zh active Pending
- 2016-01-27 US US15/544,465 patent/US10224049B2/en active Active
- 2016-01-27 WO PCT/JP2016/052365 patent/WO2016121826A1/ja active Application Filing
- 2016-01-27 JP JP2016572110A patent/JP6387117B2/ja active Active
- 2016-01-27 EP EP16743429.9A patent/EP3252758B1/en active Active
- 2016-01-27 CN CN201680007279.3A patent/CN107210042B/zh active Active
Also Published As
Publication number | Publication date |
---|---|
EP3252758A4 (en) | 2018-09-05 |
JPWO2016121826A1 (ja) | 2017-11-02 |
EP3252758B1 (en) | 2020-03-18 |
CN107210042A (zh) | 2017-09-26 |
KR101996307B1 (ko) | 2019-07-04 |
US10224049B2 (en) | 2019-03-05 |
CN107210042B (zh) | 2021-10-22 |
JP6387117B2 (ja) | 2018-09-05 |
CN113921021A (zh) | 2022-01-11 |
EP3252758A1 (en) | 2017-12-06 |
KR20170098278A (ko) | 2017-08-29 |
US20180047401A1 (en) | 2018-02-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
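JP6422813B2 (ja) | Encoding device, decoding device, and methods and programs therefor | |
WO2012137617A1 (ja) | Encoding method, decoding method, encoding device, decoding device, program, and recording medium | |
JP5596800B2 (ja) | Encoding method, periodicity feature determination method, periodicity feature determination device, and program | |
JP6633787B2 (ja) | Linear predictive decoding device, method, program, and recording medium | |
JP6457552B2 (ja) | Encoding device, decoding device, and methods and programs therefor | |
JP6392450B2 (ja) | Matching device, determination device, methods therefor, program, and recording medium | |
JP6595687B2 (ja) | Encoding method, encoding device, program, and recording medium | |
CN112927703A (zh) | Method and device for quantizing linear prediction coefficients, and method and device for dequantizing | |
JP6387117B2 (ja) | Encoding device, decoding device, methods therefor, program, and recording medium | |
CN107430869B (zh) | Parameter determination device, method, and recording medium | |
CN112820304A (zh) | Decoding device, decoding method, decoding program, and recording medium | |
JP5336942B2 (ja) | Encoding method, decoding method, encoder, decoder, and program |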
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 16743429; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2016572110; Country of ref document: JP; Kind code of ref document: A |
| | WWE | WIPO information: entry into national phase | Ref document number: 15544465; Country of ref document: US |
| | ENP | Entry into the national phase | Ref document number: 20177020235; Country of ref document: KR; Kind code of ref document: A |
| | REEP | Request for entry into the European phase | Ref document number: 2016743429; Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |