TW201329960A

TW201329960A - Quantization device and quantization method

Info

Publication number: TW201329960A
Application number: TW101100945A
Authority: TW
Inventors: Toshiyuki Morii
Original assignee: Panasonic Corp
Priority date: 2012-01-10
Filing date: 2012-01-10
Publication date: 2013-07-16

Abstract

Provided are a quantization device and a quantization method that reduce coding distortion with a small amount of computation and obtain sufficient coding performance. In a multistage vector quantization unit, the first-stage vector quantization uses a pre-assigned candidate number N. From the second stage onward, the candidate number is reduced by 1 each time processing advances to the next stage, and when the candidate number is under 3, the quantization distortion is evaluated. If the quantization distortion is greater than a specified threshold, the candidate number of the next stage is set to a predefined value P. If the quantization distortion is less than the specified threshold, the candidate number of the next stage is set to a value Q smaller than P.
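
The candidate-number rule summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the parameter names, the example values of N, P, Q and the threshold are all hypothetical, the "under 3" test is applied to the count after it has been decremented, and a distortion exactly equal to the threshold is treated like the "less than" case.

```python
def next_candidate_count(current, distortion, threshold, p, q, limit=3):
    """Return the number of candidates to keep in the next stage (sketch)."""
    if q >= p:
        raise ValueError("q must be smaller than p")
    reduced = current - 1              # one fewer candidate per stage
    if reduced >= limit:
        return reduced                 # still in the plain countdown phase
    # The countdown has fallen under `limit`: let the distortion decide.
    # Assumption: distortion == threshold is handled like the "less than" case.
    return p if distortion > threshold else q


# Hypothetical settings: N = 6 in the first stage, P = 3, Q = 2, threshold 5.0.
counts = [6]
for stage_distortion in [9.0, 7.5, 6.0, 4.0, 1.0]:
    counts.append(next_candidate_count(counts[-1], stage_distortion,
                                       threshold=5.0, p=3, q=2))
```

With these made-up numbers the count shrinks stage by stage and then settles at Q once the distortion is small.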

Description

Quantization device and quantization method

Field of invention

The present invention relates to a quantization apparatus and a quantization method that perform quantization using a tree search.

Background of the invention

In mobile communication, compression coding of digital speech and image information is necessary to use the transmission band efficiently. Expectations are especially high for speech codec (coder/decoder) technology, which is widely used in mobile phones, and better sound quality is strongly demanded of conventional high-efficiency coding schemes with high compression ratios. Because such codecs are for public use, they must be standardized, and research and development is being actively pursued worldwide.

In recent years, ITU-T (International Telecommunication Union Telecommunication Standardization Sector) and MPEG (Moving Picture Experts Group) have been studying the standardization of codecs that can encode both speech and music, which calls for more efficient, higher-quality speech codecs.

Speech coding performance was greatly improved by CELP (Code Excited Linear Prediction), a basic scheme established 20 years ago that models the vocal mechanism of speech and makes skillful use of vector quantization. CELP is adopted in many international standards, including ITU-T G.729 and G.722.2, ETSI AMR and AMR-WB, and 3GPP2 VMR-WB.

The key techniques of CELP are LPC (Linear Prediction Coding) analysis, which can encode the outline of the speech spectrum at a low bit rate, and quantization of the parameters obtained by LPC analysis. Most recent standards use quantization based on line spectra. Representative examples are the LSP (Line Spectral Pair) and the ISP (Immittance Spectral Pair), an improvement on the LSP. Both interpolate well and therefore have a high affinity with vector quantization (hereinafter "VQ"). Using them for coding allows spectral information to be transmitted at a low bit rate, which has markedly improved the performance of CELP-based codecs.

Recently, to meet the demand for efficient, high-quality speech codecs, codecs that encode wideband signals (16 kbps) and super-wideband signals (32 kbps) are being standardized in ITU-T, MPEG, 3GPP, and elsewhere. When LPC coefficients are used to encode wideband or super-wideband digital signals, high-order LSPs or ISPs, of order 16 or more, must be encoded with a large number of bits. For this reason, "split VQ" is generally used, in which the encoding target (target vector) is split into several parts that are quantized separately. However, because split VQ cannot exploit the statistical correlation between vector elements, coding performance suffers.

Multistage quantization is therefore used as a method that achieves better coding performance. Instead of splitting the target vector, it quantizes in succession with several smaller vector quantizers so that the error shrinks step by step; that is, each stage quantizes the quantization error vector of the preceding stage. If only the result with the smallest error in the preceding stage is carried forward, the amount of computation is very small. However, if multistage quantization keeps only that single smallest-error result as a candidate, the overall coding distortion does not become small enough, and quantization performance deteriorates.
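
The single-candidate form of multistage quantization described above can be sketched as follows. The function names and the two toy codebooks are invented for illustration; the sketch simply quantizes the residual of each stage with the next codebook.

```python
def nearest(codebook, target):
    """Index of the code vector with the smallest squared error to target."""
    def sq_err(code):
        return sum((t - c) ** 2 for t, c in zip(target, code))
    return min(range(len(codebook)), key=lambda m: sq_err(codebook[m]))


def multistage_vq(target, codebooks):
    """Greedy (one candidate per stage) multistage VQ.

    Returns the chosen index of every stage and the final residual.
    """
    residual = list(target)
    indices = []
    for codebook in codebooks:
        m = nearest(codebook, residual)
        indices.append(m)
        # The next stage quantizes what this stage failed to represent.
        residual = [r - c for r, c in zip(residual, codebook[m])]
    return indices, residual


# Toy codebooks (hypothetical values, not from the patent).
codebooks = [
    [[0.0, 0.0], [4.0, 4.0], [8.0, 0.0]],     # coarse first stage
    [[0.0, 0.0], [1.0, -1.0], [-1.0, 1.0]],   # finer second stage
]
indices, residual = multistage_vq([5.0, 3.0], codebooks)
```

Because each stage commits to its single best code vector, a choice that looks best early on cannot be revisited later, which is the weakness the tree search addresses.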

A tree search, which retains several of the top candidates, that is, the quantization results with the smallest errors, is therefore considered. This yields high coding performance with a relatively small amount of computation. In particular, when many bits are allocated, the number of stages is increased to keep computation down, but in multistage quantization with many stages, sufficient quantization performance cannot be obtained without a tree search.

Patent Document 1 describes a method of quantizing a CELP excitation vector in multiple stages. It is also known that, when there are many stages, a tree search enables an efficient search. A known efficient multistage search method sets the number of candidates (quantization results with small errors) retained at each stage to N; this method is called "N-best search".

Patent Document 2 does not use vector quantization, but describes a search example based on N-best search.

Prior art documents

Patent literature

[Patent Document 1] Japanese Unexamined Patent Application Publication No. 2003-8446

[Patent Document 2] Japanese Unexamined Patent Application Publication No. 2000-261321

However, although multistage vector quantization using N-best search with N > 1 makes the final coding distortion smaller than narrowing the candidates to one at every stage (N = 1), it increases the amount of computation N-fold. Conversely, keeping N small leaves the coding distortion large.

Thus, conventional multistage vector quantization using N-best search makes no attempt to reduce coding distortion with less computation, and therefore cannot achieve sufficient coding performance.

An object of the present invention is to provide a quantization apparatus and a quantization method that reduce coding distortion with a small amount of computation and obtain sufficient coding performance.

The quantization apparatus of the present invention performs multistage quantization using a tree search and comprises: a search unit that matches each of one or more targets to be encoded against the code vectors stored in a codebook and obtains, in ascending order of quantization distortion, one or more candidates, as many as the candidate number determined in the preceding stage or set in advance; a calculation unit that, for each candidate, calculates a quantization error vector by subtracting the code vector from the target; and a candidate number determination unit that determines the candidate number to be used in the next stage on the basis of the candidate number determined in the preceding stage.

The quantization method of the present invention performs multistage quantization using a tree search and comprises the steps of: matching each of one or more targets to be encoded against the code vectors stored in a codebook and obtaining, in ascending order of quantization distortion, one or more candidates, as many as a pre-specified candidate number in the first stage and as many as the candidate number determined in the preceding stage in the second and later stages; calculating, for each candidate, a quantization error vector by subtracting the code vector from the target; and determining the candidate number to be used in the next stage on the basis of the candidate number determined in the preceding stage.

According to the present invention, coding distortion can be reduced with a small amount of computation, and sufficient coding performance can be obtained.

Brief description of the drawings

Fig. 1 is a block diagram showing the configuration of a CELP encoding apparatus according to a first embodiment of the present invention.

Fig. 2 is a block diagram showing the internal structure of the multistage vector quantization unit shown in Fig. 1.

Fig. 3 is a block diagram showing the internal structure of the vector quantization unit shown in Fig. 2.

Fig. 4 is a flowchart showing the candidate number determination procedure in the candidate number determination unit shown in Fig. 3.

Fig. 5 is a flowchart showing the candidate number determination procedure in the candidate number determination unit according to a second embodiment of the present invention.

Mode for carrying out the invention

Embodiments of the present invention are described in detail below with reference to the accompanying drawings.

(First embodiment)

Fig. 1 is a block diagram showing the configuration of CELP encoding apparatus 100 according to the first embodiment of the present invention. CELP encoding apparatus 100 encodes the vocal tract information of speech signal S11, which consists of vocal tract information and excitation information, by obtaining LPC parameters (linear prediction coefficients). It encodes the excitation information by obtaining code data that specifies which of the pre-stored speech models to use, that is, which excitation vector (code vector) adaptive codebook 103 and fixed codebook 104 generate.

Specifically, the units of CELP encoding apparatus 100 operate as follows.

LPC analysis unit 101 performs linear prediction analysis on speech signal S11, obtains the LPC parameters, which are spectral envelope information, and outputs them to multistage vector quantization unit 102 and perceptual weighting unit 111.

Multistage vector quantization unit 102 performs multistage vector quantization on the LPC parameters obtained by LPC analysis unit 101, outputs the resulting quantized LPC parameters to LPC synthesis filter 109, and outputs the code data of the quantized LPC parameters to the outside of CELP encoding apparatus 100.

Meanwhile, adaptive codebook 103 stores the past drive excitation used by LPC synthesis filter 109 and, from the stored drive excitation, generates an excitation vector for one subframe according to the adaptive codebook lag that corresponds to the code data indicated by distortion minimization unit 112. This excitation vector is output to multiplier 106 as the adaptive codebook vector.

Fixed codebook 104 stores a plurality of excitation vectors of predetermined shapes in advance and outputs the excitation vector corresponding to the code data indicated by distortion minimization unit 112 to multiplier 107 as the fixed codebook vector. The description here assumes that fixed codebook 104 is an algebraic codebook; for a configuration that uses algebraic codebooks based on two different numbers of pulses, weighting is applied by addition.

Algebraic excitation is the excitation adopted in many standard codecs: a small number of pulses of amplitude 1 whose only information is position and polarity (+/-). It is described, for example, in chapter 5.3.1.9 of section 5.3 "CS-ACELP" and chapter 5.4.3.7 of section 5.4 "ACELP" of the ARIB standard "RCR STD-27K".
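
As a minimal sketch of what such an excitation looks like as data, the following Python builds a sparse vector from (position, polarity) pairs. The function name and the example pulse positions are hypothetical; actual codecs additionally restrict each pulse to a track of allowed positions, which is not modeled here.

```python
def algebraic_excitation(length, pulses):
    """Build a sparse excitation from (position, sign) pairs.

    Each pulse has amplitude 1; only its position and polarity carry
    information. `sign` must be +1 or -1.
    """
    vector = [0] * length
    for position, sign in pulses:
        if sign not in (1, -1):
            raise ValueError("pulse polarity must be +1 or -1")
        vector[position] += sign
    return vector


# Three pulses in an 8-sample subframe (illustrative values only).
s = algebraic_excitation(8, [(1, +1), (4, -1), (6, +1)])
```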

Adaptive codebook 103 is used to represent strongly periodic components such as voiced speech, whereas fixed codebook 104 is used to represent weakly periodic components such as white noise.

Gain codebook 105, following instructions from distortion minimization unit 112, generates the gain for the adaptive codebook vector output from adaptive codebook 103 (adaptive codebook gain) and the gain for the fixed codebook vector output from fixed codebook 104 (fixed codebook gain), and outputs them to multipliers 106 and 107, respectively.

Multiplier 106 multiplies the adaptive codebook vector output from adaptive codebook 103 by the adaptive codebook gain output from gain codebook 105 and outputs the result to adder 108.

Multiplier 107 multiplies the fixed codebook vector output from fixed codebook 104 by the fixed codebook gain output from gain codebook 105 and outputs the result to adder 108.

Adder 108 adds the gain-scaled adaptive codebook vector from multiplier 106 and the gain-scaled fixed codebook vector from multiplier 107, and outputs the sum to LPC synthesis filter 109 as the drive excitation.

LPC synthesis filter 109 uses the quantized LPC parameters output from multistage vector quantization unit 102 as filter coefficients and generates a synthesized signal with a filter function, that is, an LPC synthesis filter, driven by the excitation vector generated by adaptive codebook 103 and fixed codebook 104. The synthesized signal is output to adder 110.
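
The signal path through multipliers 106 and 107, adder 108, and LPC synthesis filter 109 can be sketched as below. The function names are hypothetical, the filter is assumed to be the all-pole form 1/A(z) with A(z) = 1 + Σ lpc[k-1]·z^-k (sign conventions differ between texts), and the filter memory is assumed to start at zero.

```python
def drive_excitation(p, a, q, s):
    """Gain-scale and add the two codebook vectors: p*a + q*s."""
    return [p * ai + q * si for ai, si in zip(a, s)]


def lpc_synthesis(excitation, lpc, memory=None):
    """All-pole synthesis: y[n] = e[n] - sum_k lpc[k-1] * y[n-k].

    `memory` holds past outputs, most recent first (assumed zero if omitted).
    """
    order = len(lpc)
    history = list(memory) if memory is not None else [0.0] * order
    out = []
    for e in excitation:
        y = e - sum(c * h for c, h in zip(lpc, history))
        out.append(y)
        history = [y] + history[:-1]   # shift the newest output in
    return out


# Illustrative values: a first-order filter driven by a single impulse.
excitation = drive_excitation(0.5, [2.0, 0.0, 0.0], 1.0, [0.0, 0.0, 0.0])
synthesized = lpc_synthesis(excitation, [-0.5])
```

With the coefficient -0.5 the impulse decays by half at each sample, which makes the recursion easy to check by hand.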

Adder 110 calculates an error signal by subtracting the synthesized signal generated by LPC synthesis filter 109 from speech signal S11 and outputs the error signal to perceptual weighting unit 111. This error signal corresponds to the coding distortion.

Perceptual weighting unit 111 applies perceptual weighting to the coding distortion output from adder 110 and outputs the result to distortion minimization unit 112.

For each subframe, distortion minimization unit 112 finds the indices of adaptive codebook 103, fixed codebook 104, and gain codebook 105 that minimize the coding distortion output from perceptual weighting unit 111, and outputs these indices as code data to the outside of CELP encoding apparatus 100. More specifically, generating a synthesized signal from adaptive codebook 103 and fixed codebook 104 and obtaining its coding distortion form a closed loop (feedback control): within each subframe, distortion minimization unit 112 searches each codebook by varying the code data it indicates to that codebook, and outputs the code data of each codebook that finally minimizes the coding distortion.

In addition, for each subframe, the drive excitation at the point where the coding distortion is minimized is fed back to adaptive codebook 103, which uses it to update its stored drive excitation.
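
The lag-based generation and the per-subframe update of the adaptive codebook can be sketched as follows. This is an illustration under assumptions: the names are hypothetical, only integer lags are handled, and a lag shorter than the subframe is handled by repeating the newly generated samples, which is common practice but is not stated in the text.

```python
def adaptive_codebook_vector(past_excitation, lag, subframe_len):
    """Copy `subframe_len` samples starting `lag` samples back in the history."""
    if not 1 <= lag <= len(past_excitation):
        raise ValueError("lag must lie within the stored history")
    buffer = list(past_excitation)
    start = len(buffer) - lag
    for n in range(subframe_len):
        # For short lags this reads samples appended earlier in this loop,
        # so the segment repeats periodically.
        buffer.append(buffer[start + n])
    return buffer[len(past_excitation):]


def update_adaptive_codebook(past_excitation, drive_excitation):
    """Shift the chosen subframe excitation into the fixed-size history."""
    size = len(past_excitation)
    return (list(past_excitation) + list(drive_excitation))[-size:]


history = [1, 2, 3, 4]
vector = adaptive_codebook_vector(history, lag=2, subframe_len=4)
history = update_adaptive_codebook(history, [9, 8])
```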

The search method for fixed codebook 104 is described next. First, the excitation vector is searched for, and its code data derived, by finding the excitation vector that minimizes the coding distortion of formula (1) below.

(Formula 1)

$$E = \lvert x - (pHa + qHs) \rvert^{2} \tag{1}$$

where E is the coding distortion, x the encoding target, p the gain of the adaptive codebook vector, H the perceptually weighted synthesis filter, a the adaptive codebook vector, q the gain of the fixed codebook vector, and s the fixed codebook vector.

In general, the adaptive codebook vector and the fixed codebook vector are searched in open loop (each in its own loop), so the code of fixed codebook 104 is derived by searching for the fixed codebook vector that minimizes the coding distortion of formula (2) below.

(Formula 2)
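
The image for formula (2) is not reproduced in this text. Judging from formula (1) and the symbol definitions that accompany formula (2), it presumably takes the following form; this is a reconstruction, not a verbatim copy of the original.

```latex
% Remove the adaptive-codebook contribution from the target,
% then fit the fixed-codebook contribution to what remains.
y = x - pHa, \qquad
E = \lvert y - qHs \rvert^{2} \tag{2}
```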

where E is the coding distortion, x the encoding target (perceptually weighted speech signal), p the optimal gain of the adaptive codebook vector, H the perceptually weighted synthesis filter, a the adaptive codebook vector, q the gain of the fixed codebook vector, s the fixed codebook vector, and y the target vector for the fixed codebook search.

Because gains p and q are determined after the excitation code has been searched, the search here is carried out with the optimal gain. Formula (2) can then be rewritten as formula (3) below.

(Formula 3)
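
The image for formula (3) is likewise missing. Substituting the gain that minimizes the error for a given s, namely q = (y·Hs)/(Hs·Hs), into E = |y - qHs|² gives the following, shown as a reconstruction under that assumption:

```latex
E = \left\lvert\, y - \frac{y \cdot Hs}{Hs \cdot Hs}\, Hs \,\right\rvert^{2}
  = \lvert y \rvert^{2} - \frac{(y \cdot Hs)^{2}}{Hs \cdot Hs} \tag{3}
```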

Minimizing this distortion expression is known to be equivalent to maximizing function C of formula (4) below.

(Formula 4)
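
The image for formula (4) is also missing. Since |y|² does not depend on s, minimizing the distortion amounts to maximizing the ratio below, where yH denotes yᵀH and HH denotes HᵀH. The notation is a reconstruction chosen to match the terms yH and HH used in the surrounding text.

```latex
C = \frac{(yH \cdot s)^{2}}{s \cdot HHs} \tag{4}
```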

Therefore, when searching an excitation made up of a small number of pulses, such as an algebraic codebook excitation, function C can be computed with little calculation provided yH and HH are computed in advance.
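
The saving from precomputing yH and HH can be sketched as follows: once d = Hᵀy and Φ = HᵀH are available, evaluating C for a pulse excitation needs only a few table look-ups. All names are hypothetical, H is represented as a small lower-triangular matrix built from an invented impulse response, and picking each pulse's sign from the sign of d is a common simplification rather than something the text specifies.

```python
def correlate_with_filter(h_matrix, y):
    """d = H^T y (the precomputed 'yH' term)."""
    n = len(y)
    return [sum(h_matrix[r][c] * y[r] for r in range(n)) for c in range(n)]


def filter_autocorrelation(h_matrix):
    """phi = H^T H (the precomputed 'HH' term)."""
    n = len(h_matrix)
    return [[sum(h_matrix[r][i] * h_matrix[r][j] for r in range(n))
             for j in range(n)] for i in range(n)]


def criterion(pulses, d, phi):
    """C = (d . s)^2 / (s^T phi s) for s given as (position, sign) pulses."""
    num = sum(sign * d[pos] for pos, sign in pulses) ** 2
    den = sum(si * sj * phi[pi][pj] for pi, si in pulses for pj, sj in pulses)
    return num / den


def best_single_pulse(d, phi):
    """Exhaustive search over one-pulse excitations; sign follows d."""
    candidates = [[(pos, 1 if d[pos] >= 0 else -1)] for pos in range(len(d))]
    return max(candidates, key=lambda pulses: criterion(pulses, d, phi))[0]


# Toy weighted-synthesis matrix from impulse response [1, 0.5, 0.25].
H = [[1.0, 0.0, 0.0],
     [0.5, 1.0, 0.0],
     [0.25, 0.5, 1.0]]
y = [1.0, 0.0, -1.0]
d = correlate_with_filter(H, y)
phi = filter_autocorrelation(H)
best = best_single_pulse(d, phi)
```

The cost of `criterion` depends only on the number of pulses, not on the subframe length, which is why sparse algebraic excitations are cheap to search.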

Fig. 2 is a block diagram showing the internal structure of multistage vector quantization unit 102 shown in Fig. 1. In this embodiment, multistage vector quantization (multistage VQ) is used to quantize the spectral parameters (LPC parameters). Multistage VQ applies VQ in several successive stages, each stage quantizing the quantization distortion of the preceding stage. The internal structure of multistage vector quantization unit 102 is described here assuming a large number of quantization bits and a correspondingly large number of stages, from 6 to 10 or more.

Vector quantization unit 201-1 quantizes the LPC parameters obtained by LPC analysis unit 101, that is, the encoding target (target vector). Specifically, it calculates the distance (quantization distortion) to each code vector stored in the codebook and performs vector quantization by finding the index that gives the smallest distance. In a tree search, the indices of several candidates are obtained in ascending order of distance (quantization distortion). Vector quantization unit 201-1 obtains the virtual target vectors, which are the quantization distortion, together with the code candidates (in a tree search, sequences of indices, called candidate index sequences) and the candidate number. It outputs the virtual target vectors, code candidates, and candidate number to vector quantization unit 201-2 and also outputs the code candidates to code determination unit 202.

Vector quantization unit 201-2 quantizes the virtual target vectors output from vector quantization unit 201-1 (a tree search may produce several) in the same way as vector quantization unit 201-1. It outputs the virtual target vectors, code candidates (candidate index sequences), and candidate number to vector quantization unit 201-3 and also outputs the code candidates to code determination unit 202.

Vector quantization units 201-3 to 201-J each perform the same quantization as vector quantization unit 201-1, and vector quantization unit 201-J outputs the virtual target vectors, code candidates (candidate index sequences), and candidate number to code determination unit 202.

Code determination unit 202 combines the indices of the candidate index sequence with the least quantization distortion, among the candidate index sequences output from vector quantization units 201-1 to 201-J, into a single data string and sends it to the outside of CELP encoding apparatus 100 as code data. Subtracting the final distortion from the target vector that was input to multistage vector quantization unit 102 gives the decoded vector, which is the result of decoding with the code data. From this decoded vector, the quantized LPC parameters used by LPC synthesis filter 109 are obtained and sent to LPC synthesis filter 109.
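
The relationship between the code data, the final distortion, and the decoded vector can be checked with a small sketch. The function name and the two toy codebooks are hypothetical; the point is only that summing the selected code vector of every stage gives the same result as subtracting the final residual from the original target.

```python
def decode(indices, codebooks):
    """Sum the selected code vector of every stage."""
    decoded = [0.0] * len(codebooks[0][0])
    for m, codebook in zip(indices, codebooks):
        decoded = [d + c for d, c in zip(decoded, codebook[m])]
    return decoded


# Toy two-stage example (illustrative values only).
codebooks = [
    [[0.0, 0.0], [4.0, 4.0], [8.0, 0.0]],
    [[0.0, 0.0], [1.0, -1.0], [-1.0, 1.0]],
]
target = [5.0, 3.5]
indices = [1, 1]

# Residual left after subtracting each stage's code vector in turn.
residual = list(target)
for m, codebook in zip(indices, codebooks):
    residual = [r - c for r, c in zip(residual, codebook[m])]

decoded = decode(indices, codebooks)
target_minus_residual = [t - r for t, r in zip(target, residual)]
```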

Fig. 3 is a block diagram showing the internal structure of vector quantization unit 201-j (1 ≤ j ≤ J) shown in Fig. 2. The internal structure of vector quantization unit 201-j is described below with reference to Fig. 3.

Three signals are input to vector quantization unit 201-j. The first is candidate number j, the number of candidate index sequences and virtual target vectors that vector quantization unit 201-j retains as candidates and outputs to next-stage vector quantization unit 201-(j+1). The second is the target vector or virtual target vector j (hereinafter these are sometimes collectively called "virtual target vectors"), which is either the original encoding target (target vector) or, at an intermediate stage, the virtual target vector obtained as coding distortion by preceding vector quantization unit 201-(j-1). The third is candidate index sequence j, the sequence of indices of the preceding vector quantization units that gives the least distortion up to vector quantization unit 201-j. There is only one target vector, but there may be several virtual target vectors j and candidate index sequences j.

Here, let candidate number j be K and candidate number j-1 be M. In vector quantization unit 201-1 there is a single target vector, so M = 1. In final-stage vector quantization unit 201-J only one candidate index sequence needs to be found, so K = 1 suffices. Note that M is the number of input target vectors and candidate index sequences j, whereas K is the number of candidates output to next-stage vector quantization unit 201-(j+1).

Distortion calculation and codebook search unit 301 matches all M virtual target vectors against all code vectors stored in codebook 302 (generally a distance calculation based on Euclidean distance, that is, the sum of the squared element-wise differences between the vectors), searches for the K candidates with the smallest distance (quantization distortion), and obtains their code indices. At this point, the originating index sequence of each candidate is also determined. Then, referring to candidate index sequences j, it appends each candidate's code index to its originating index sequence to form K candidate index sequences j+1 and outputs them to next-stage vector quantization unit 201-(j+1). It also outputs candidate number j, the code vectors of the candidate code indices, and the target vectors being quantized to virtual target calculation unit 304, and outputs candidate number j and one coding distortion value to candidate number determination unit 303.

When vector quantization unit 201-j is first-stage vector quantization unit 201-1, candidate number j and candidate index sequence j are set in advance inside vector quantization unit 201-1, and only the target vector is input. When vector quantization unit 201-j is final-stage vector quantization unit 201-J, the candidate number is 1; it is enough to append the index with the smallest distance (quantization distortion) to the candidate index sequence corresponding to its target vector and output the result to code determination unit 202 as candidate index sequence j+1, so candidate number determination unit 303 and virtual target calculation unit 304 need not operate.

A concrete processing example for distortion calculation and codebook search unit 301 follows. Let j = 4, M = 4, K = 3, and let the vector length be L. The targets (here, virtual target vectors) are $x_i^0$, $x_i^1$, $x_i^2$, and $x_i^3$. Because j = 4, three preceding vector quantization stages are assumed, each using a codebook of size 64 (6 bits), and the candidate index sequences are the four sequences (5, 12, 31), (5, 12, 48), (31, 11, 57), and (31, 3, 18). These four candidate sequences correspond one-to-one with the four virtual target vectors above. Let the code vectors be $C_i^m$, where m is the code vector index. The quantization distortion $E_{n,m}$ is given by formula (5) below.

(Formula 5)
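
The image for formula (5) is not reproduced in this text. Given that the distance is described as the sum of squared element-wise differences between a target and a code vector, it presumably reads as follows; the index range 0 to L-1 is an assumption.

```latex
E_{n,m} = \sum_{i=0}^{L-1} \left( x_i^{n} - C_i^{m} \right)^{2} \tag{5}
```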

The code indices giving the three smallest values of the quantization distortion $E_{n,m}$ are then found. Suppose the top three turn out to be (1) code index 35 for virtual target vector 0, (2) code index 8 for virtual target vector 0, and (3) code index 52 for virtual target vector 3. Appending these code indices to the corresponding candidate index sequences gives the three sequences passed on as candidate index sequences j+1: (5, 12, 31, 35), (5, 12, 31, 8), and (31, 3, 18, 52). The three pairs of virtual target vector and code vector, ($x_i^0$, $C_i^{35}$), ($x_i^0$, $C_i^{8}$), and ($x_i^3$, $C_i^{52}$), are output to virtual target calculation unit 304. Candidate number 3 and one of the top three distances (quantization distortions) are output to candidate number determination unit 303. In this embodiment, any one of the three distances may be output, because the choice makes no great difference to performance.
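
The worked example above can be reproduced with a short sketch. The four candidate index sequences come from the text; everything else (the function name, the two-element vectors, and the 64-entry toy codebook rigged so that code indices 35, 8, and 52 win) is invented for illustration.

```python
import heapq


def search_stage(targets, sequences, codebook, k):
    """Keep the k best (target n, code m) pairs and extend their sequences."""
    scored = []
    for n, target in enumerate(targets):
        for m, code in enumerate(codebook):
            e = sum((x - c) ** 2 for x, c in zip(target, code))
            scored.append((e, n, m))
    best = heapq.nsmallest(k, scored)          # ascending distortion
    new_sequences = [sequences[n] + (m,) for _, n, m in best]
    pairs = [(n, m) for _, n, m in best]
    distortions = [e for e, _, _ in best]
    return new_sequences, pairs, distortions


# Candidate index sequences j from the text (M = 4).
sequences = [(5, 12, 31), (5, 12, 48), (31, 11, 57), (31, 3, 18)]
# Hypothetical virtual target vectors, one per sequence.
targets = [[1.0, 0.0], [50.0, 50.0], [-50.0, 50.0], [0.0, -4.0]]
# Hypothetical 64-entry codebook: far-away entries except three close ones.
codebook = [[100.0 + m, 100.0] for m in range(64)]
codebook[35] = [1.0, 0.1]
codebook[8] = [1.2, 0.0]
codebook[52] = [0.0, -4.3]

new_sequences, pairs, distortions = search_stage(targets, sequences,
                                                 codebook, k=3)
```

Here `pairs` records which target each surviving candidate came from, which is the information needed to pick the originating index sequence.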

The candidate number determining unit 303 refers to the number of candidates j and the distance (quantization distortion) output from the distortion calculation and codebook search unit 301, determines the number of candidates j+1 to be used by the vector quantization unit 201-(j+1) of the next stage, and outputs it to the vector quantization unit 201-(j+1).

The virtual target calculation unit 304 refers to the pairs of target and code vector output from the distortion calculation and codebook search unit 301 and calculates K virtual target vectors j+1 by subtracting each code vector from its target vector. In the specific example above, the three vectors (x_i⁰ - C_i³⁵), (x_i⁰ - C_i⁸), and (x_i³ - C_i⁵²) are the virtual target vectors j+1.
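The processing of one stage described above (distortion calculation over all target and code vector pairs, selection of the K best, extension of the candidate index sequences, and calculation of the virtual targets for the next stage) might be sketched as follows. This is a minimal illustration, not the implementation of units 301 and 304: the function name `search_stage`, the list-based data layout, and the use of the squared Euclidean distance are assumptions made for the example.

```python
import heapq


def search_stage(targets, paths, codebook, k):
    """One stage of tree-search VQ (illustrative sketch).

    targets  : list of M virtual target vectors (lists of floats)
    paths    : list of M candidate index sequences (tuples), one per target
    codebook : list of code vectors
    k        : number of candidates to keep for the next stage
    """
    scored = []
    for n, x in enumerate(targets):
        for m, c in enumerate(codebook):
            # Assumed distortion measure: squared Euclidean distance.
            e = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
            scored.append((e, n, m))

    # Keep the k (target, code vector) pairs with the smallest distortion.
    best = heapq.nsmallest(k, scored)

    # Append each winning code index to the sequence of the target it matched.
    new_paths = [paths[n] + (m,) for _, n, m in best]
    # Virtual targets for the next stage: target minus code vector.
    new_targets = [
        [xi - ci for xi, ci in zip(targets[n], codebook[m])] for _, n, m in best
    ]
    distances = [e for e, _, _ in best]
    return new_paths, new_targets, distances
```

Called with the four targets and four candidate index sequences of the example and k=3, a function of this shape would return three extended sequences, three virtual target vectors, and the three distances from which one is passed to the candidate number determination.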

Next, the candidate number determining unit 303 is described in detail, including the effect of its algorithm. In the N-best search used in tree-search VQ, when the number of stages is large, the amount of computation grows in proportion to the number of candidates N, to N times; conversely, reducing N degrades quantization performance. The inventor therefore ran repeated simulation experiments on multi-stage VQ using tree search, analyzed the performance of the tree search, and identified the following four tendencies.

(1) Increasing the number of candidates N of the N-best search at each stage, or keeping it constant, does not yield performance commensurate with the amount of computation. Retaining multiple candidates benefits quantization performance mainly in the earliest stages of multi-stage quantization.

(2) Sharply reducing the number of search candidates when advancing by one stage greatly degrades quantization performance.

(3) There is a large gap between N=2 and N=1. When the number of stages is large, N=2 gives roughly sufficient quantization performance.

(4) If the coding distortion has not become small after advancing several stages, the final outlier rate (the proportion of cases in which the quantization error is at or above a certain value) is more likely to worsen.

In view of these tendencies, the inventor devised a tree search that combines the following three algorithms. It proceeds by the following steps.

(Step 1) In the first stage, only a pre-specified number of candidates N is retained before proceeding to the next stage.

(Step 2) From the second stage onward, the number of candidates is decremented by one each time the next stage is entered, as N-1, N-2, and so on.

(Step 3) Once the number of candidates is at or below a predetermined value P, the quantization distortion is evaluated each time. If it is larger than a specified threshold, the number of candidates for the next stage is set to P; if it is at or below the threshold, the number of candidates for the next stage is set to a value Q smaller than P. The following description uses P=3 and Q=2 as examples. When there is computation to spare, larger values may be used, which further reduces coding distortion.

The candidate number determining unit 303 applies this algorithm. By decrementing an initially large number of candidates by one at each new stage (Step 2), it can select reliable candidates in the early stages, reach the minimum number of candidates early without degrading quantization performance, and obtain sufficient quantization performance with little computation. In addition, when the number of candidates is 3 (=P) or less, the quantization distortion is evaluated each time: if the distortion is large the number of candidates is raised to 3 (=P), and if it is sufficiently small the number is reduced to 2 (=Q) (Step 3). The search can thus be controlled to reach sufficiently small coding distortion with minimal computation.
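To make the saving concrete, the number of distortion evaluations can be counted as (targets entering a stage) x (codebook size), summed over the stages. The sketch below compares a fixed N-best search with a decreasing schedule of the kind Steps 1 to 3 produce when the distortion stays low. The values J=6 and N=5 are hypothetical; the codebook size of 64 is borrowed from the worked example above; the function name is an assumption.

```python
def distortion_evaluations(targets_per_stage, codebook_size):
    """Count target-versus-code-vector comparisons summed over all stages."""
    return sum(t * codebook_size for t in targets_per_stage)


J, N, SIZE = 6, 5, 64  # hypothetical stage count and initial candidate count

# Fixed N-best: one target enters stage 1, N targets enter every later stage.
fixed_n = [1] + [N] * (J - 1)

# Decreasing schedule: stage 1 keeps 5, then 4, 3, and Q=2 once P=3 is reached.
reduced = [1, 5, 4, 3, 2, 2]

print(distortion_evaluations(fixed_n, SIZE))  # 1664
print(distortion_evaluations(reduced, SIZE))  # 1088
```

Under these assumed numbers the decreasing schedule needs roughly two thirds of the comparisons; the actual saving depends on N, J, the codebook sizes, and how often the distortion check raises the count back to P.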

Next, the candidate number determination procedure in the candidate number determining unit 303 is described with reference to Fig. 4. In the following description, KK denotes the number of candidates j+1. The inputs to the candidate number determining unit 303 are the number of candidates j (K) and the distance (quantization distortion) obtained from the distortion calculation and codebook search unit 301. The candidate number determining unit 303 is assumed to know the total number of stages J. The initial value of K and the reference value for the distance are assumed to be set before quantization starts. Fig. 4 uses 50000 as the reference value for the distance as an example, but other values may be more appropriate; a suitable value can be chosen according to the dimension of the vector, the magnitude of its elements, and so on.

First, in step (hereinafter abbreviated "ST") 401, it is determined whether the stage number j=1, that is, whether this is the vector quantization unit 201-1. If j=1 ("Yes"), the process proceeds to ST402; if not ("No"), it proceeds to ST405.

In ST402, with the number of candidates K (here, the initial value of K) as input, it is determined whether the total number of stages is greater than 7. If it is, the process proceeds to ST403; if not, it proceeds to ST404. Values other than 7 may of course be more appropriate depending on the conditions; a suitable value can be set in advance according to the total number of stages, the initial number of candidates, and so on.

In ST403, KK=K-1 is set; in ST404, KK=K is set.

In ST405, because ST401 determined that the stage number is not j=1 (not the vector quantization unit 201-1), KK=K-1 is set. In ST406, it is determined whether the stage number j is 4 or greater and the distance (quantization distortion) exceeds the reference value. If the condition is satisfied ("Yes"), the process proceeds to ST407; if not ("No"), it proceeds to ST409. The stage number threshold is set to 4 here, but other values may be more appropriate.

In ST407, it is determined whether KK is less than 3 (=P). If it is ("Yes"), the process proceeds to ST408 and KK=3 is set; if it is not ("No"), the process proceeds to ST411.

In ST409, it is determined whether KK is less than 2 (=Q). If it is ("Yes"), the process proceeds to ST410 and KK=2 is set; if it is not ("No"), the process proceeds to ST411.

In this way, ST406 to ST410 have the following effect: once quantization has progressed to some extent, the number of candidates is kept small if the distance (quantization distortion) is sufficiently small, and is made larger if the distance is still large, so that the overall quantization distortion becomes smaller. The algorithm guarantees a minimum of 2 (=Q) candidates and uses 3 (=P) candidates to reduce the overall quantization distortion. The inventor's quantization experiments confirmed that this distance check reduces the outlier rate (the proportion of cases in which the quantization distortion is at or above a certain large value).

In ST411, it is determined whether the stage number j=J, that is, whether this is the final stage. If j=J ("Yes"), the process proceeds to ST412; if not ("No"), the candidate number determination procedure for this stage ends.

In ST412, KK=1 is set, and the candidate number determination processing for the final stage ends.
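The flow of ST401 to ST412 can be summarized in a short sketch. The constants 7, 4, 3 (=P), 2 (=Q), and 50000 come from the description of Fig. 4; the function and parameter names are hypothetical. One point is an assumption because the text does not state it: after ST403 or ST404 the sketch passes control directly to the final-stage check of ST411.

```python
def next_candidate_count(j, K, distance, J, P=3, Q=2,
                         reference=50000.0, check_from_stage=4, long_chain=7):
    """Return KK, the number of candidates handed to stage j+1 (Fig. 4 flow)."""
    if j == 1:                                    # ST401
        # ST402 to ST404: trim the initial count only for long stage chains.
        # Assumption: control then goes straight to the final-stage check.
        KK = K - 1 if J > long_chain else K
    else:
        KK = K - 1                                # ST405
        if j >= check_from_stage and distance > reference:   # ST406
            KK = max(KK, P)                       # ST407/ST408: at least P
        else:
            KK = max(KK, Q)                       # ST409/ST410: at least Q
    if j == J:                                    # ST411
        KK = 1                                    # ST412: keep only the best
    return KK


def schedule(N, J, distances):
    """Apply the rule over all J stages; distances[j-1] is the stage-j distance."""
    K, counts = N, []
    for j in range(1, J + 1):
        K = next_candidate_count(j, K, distances[j - 1], J)
        counts.append(K)
    return counts
```

With N=5 and J=6, distances that stay below the reference value give the counts 5, 4, 3, 2, 2, 1, while distances that stay above it give 5, 4, 3, 3, 3, 1: the count never falls below Q, and the distance check holds it at P while the distortion remains large.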

To demonstrate the effectiveness of the present invention, a quantization experiment applying it to ISF quantization in CELP is presented here. The encoder is CELP-based with a bit rate of about 24 kbps, and the data used are 40 samples of wideband Japanese speech. The quantity quantized is a 16-dimensional ISF (Immittance Spectral Frequency) vector. The baseline multi-stage VQ is an N-best tree search with six or more stages. The present invention uses the same N as its initial number of candidates. Table 1 below shows the results of the quantization experiment.

As Table 1 shows, the amount of computation in the maximum frame is cut by about 1.7 wMOPS (weighted Mega Operations Per Second), a substantial reduction. The S/N ratio (signal-to-noise ratio) is almost unchanged, so by this objective measure the synthesized speech is hardly degraded. Comparing ISF distortion by SD (Spectral Distance) also shows only a slight degradation of 0.01 dB, and the outlier rate at 2 dB or more worsens by only 0.2%, which is one frame in 500 and indicates almost no degradation. Moreover, the only processing the present invention adds is the determination of the number of candidates, whose computational cost is slight, so the effect on the algorithm as a whole is small.

Thus, according to the first embodiment, in multi-stage VQ using tree search, the first stage uses a pre-specified number of candidates N, and from the second stage onward the number of candidates is decremented by one each time the next stage is entered. When the number of candidates is 3 or less, the quantization distortion is evaluated each time: if it is larger than a specified threshold, the number of candidates for the next stage is set to 3 (=P), and if it is at or below the threshold, the number is set to 2 (=Q). This makes it possible to select reliable candidates in the early stages, to reach the minimum number of candidates early without degrading quantization performance, and to obtain sufficient quantization performance with little computation. The search can also be controlled to reach sufficiently small coding distortion with minimal computation.

(Second embodiment)

The configuration of the CELP encoding apparatus according to the second embodiment of the present invention is the same as that shown in Fig. 1 of the first embodiment; only the function of the candidate number determining unit 303 of the vector quantization unit 201-j differs. Figs. 1 to 3 are therefore cited as needed in the description.

Fig. 5 is a flowchart showing the candidate number determination procedure of the candidate number determining unit 303 according to the second embodiment of the present invention. The procedure is described below with reference to Fig. 5. Parts of Fig. 5 that are common to Fig. 4 are given the same reference signs as in Fig. 4, and duplicate description is omitted.

The following description assumes the same conditions as for Fig. 4 of the first embodiment. That is, KK denotes the number of candidates j+1. The inputs to the candidate number determining unit 303 are the number of candidates j (K) and the distance (quantization distortion) obtained from the distortion calculation and codebook search unit 301. The candidate number determining unit 303 is assumed to know the total number of stages J. The initial value of K and the reference value for the distance are assumed to be set before quantization starts. Fig. 5 uses 50000 as the reference value for the distance as an example, but other values may be more appropriate; a suitable value can be chosen according to the dimension of the vector, the magnitude of its elements, and so on.

In ST501, it is determined whether the stage number j is 3 or greater, or KK is 3 or less. If the condition is satisfied ("Yes"), the process proceeds to ST502; if not ("No"), it proceeds to ST411.

In ST502, it is determined whether the distance (quantization distortion) exceeds the reference value. If it does ("Yes"), the process proceeds to ST407; if it does not ("No"), it proceeds to ST409.

Thus, according to the second embodiment, the procedure checks whether the number of candidates KK is already sufficiently small before evaluating the quantization distortion. If KK is sufficiently small, candidate number control based on the quantization distortion can begin immediately, so sufficient quantization performance is obtained with as little computation as possible.
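A sketch of the Fig. 5 variant follows. It assumes that ST401 to ST405 and ST407 to ST412 are carried over unchanged from Fig. 4 and that ST501 and ST502 replace ST406, which is how the shared reference signs read but is not spelled out step by step in the text. As in any sketch of Fig. 4, the jump from ST403 or ST404 to ST411 is an assumption, and all names are hypothetical.

```python
def next_candidate_count_v2(j, K, distance, J, P=3, Q=2, reference=50000.0,
                            stage_limit=3, count_limit=3, long_chain=7):
    """Return KK for stage j+1 following the Fig. 5 flow (illustrative)."""
    if j == 1:                                          # ST401
        KK = K - 1 if J > long_chain else K             # ST402 to ST404
    else:
        KK = K - 1                                      # ST405
        if j >= stage_limit or KK <= count_limit:       # ST501
            if distance > reference:                    # ST502
                KK = max(KK, P)                         # ST407/ST408
            else:
                KK = max(KK, Q)                         # ST409/ST410
        # Otherwise KK is still large and is passed on unchanged.
    if j == J:                                          # ST411
        KK = 1                                          # ST412
    return KK
```

The practical difference from Fig. 4 under these assumptions is that the distance check can start as early as stage 2 when the count is already small: for j=2, K=3 and a distance above the reference value, this variant returns 3, whereas a rule that only checks the distance from stage 4 onward would let the count drop to 2.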

In each of the above embodiments, the candidate number determining unit 303 is placed after the distortion calculation and codebook search unit 301, as shown in Fig. 3, but it may instead be placed before the distortion calculation and codebook search unit 301. In that case, the candidate number determining unit 303 can use the distance (quantization distortion) and the number of candidates from the vector quantization unit of the preceding stage, and it goes without saying that the same effect is obtained.

The above embodiments give examples in CELP, but the present invention applies to vector quantization in general and is, needless to say, not limited to CELP. For example, it can be used for quantizing a spectrum obtained with MDCT (Modified Discrete Cosine Transform) or QMF (Quadrature Mirror Filter), and it can be applied to an algorithm that searches the low-frequency spectrum for a similar spectral shape in band extension technology. The present invention can also be applied to all coding schemes that use LPC analysis.

The above embodiments give examples of encoding ISF, but the present invention is not limited to this and can also be applied to quantizing parameters such as ISP (Immittance Spectrum Pairs), LSP (Line Spectrum Pairs), and PARCOR (PARtial autoCORrelation) coefficients. This is because another quantization method need only be used in place of the ISF quantization of the embodiments.

In the above embodiments, the present invention is applied to tree-search VQ of the spectral parameters of CELP, but it goes without saying that the present invention is also effective for quantizing other parameter vectors, because the nature of the parameters does not affect the present invention.

In the above embodiments, the Euclidean distance is used in the distortion calculation and codebook search unit 301, but other distance measures may be used, such as the weighted Euclidean distance or the city block distance (sum of absolute values). This is because the present invention concerns the algorithm of the candidate number determining unit 303, and the distance measure is independent of it.
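The three distance measures named above can be written as interchangeable functions, as in the following sketch. The function names are hypothetical, and the squared form of the Euclidean distance is an assumption; the text only names the measures.

```python
def squared_euclidean(x, c):
    """Sum of squared element differences between target x and code vector c."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def weighted_squared_euclidean(x, c, w):
    """Squared Euclidean distance with a per-element weight w[i]."""
    return sum(wi * (a - b) ** 2 for a, b, wi in zip(x, c, w))


def city_block(x, c):
    """City block distance: sum of absolute element differences."""
    return sum(abs(a - b) for a, b in zip(x, c))
```

For x = [1, 2] and c = [3, 5] these return 13, 20.0 (with weights [0.5, 2]), and 5 respectively. Because the measures produce values on different scales, the reference value compared against the distance would presumably need to be chosen per measure.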

The above embodiments describe application to an encoder, but the present invention can also be applied to tree search used in pattern matching for speech recognition, image recognition, and the like. This is because the present invention concerns the determination of the number of candidates in a tree search and does not depend on the overall purpose of the algorithm.

The encoding apparatus described in each of the above embodiments can be mounted in a communication terminal apparatus or a base station apparatus.

In the above embodiments, the reference value compared with the distance (quantization distortion) is a predetermined constant, but it goes without saying that it may take a different value for each stage (stage number), because the present invention does not restrict the reference value. Varying the reference value by stage (stage number) enables a more efficient search.

In the above embodiments, the predetermined values "3 and 2" are used to control the number of candidates, but values such as "4 and 3" or "4 and 2" may be used instead. The values may also differ from stage to stage (by stage number). They can be set according to the circumstances, for example when there is computation to spare or when higher performance is required.

In the second embodiment, the predetermined values (constants) "3 and 3" are used for the determinations on j and KK respectively, but they may be changed to "2 and 2", "2 and 3", "4 and 3", "2 and 4", "4 and 4", "5 and 4", and so on. The values may also differ from stage to stage (by stage number). They can be set according to the circumstances, for example when there is computation to spare or when higher performance is required.

The above embodiments describe, by way of example, the case in which the present invention is configured in hardware, but the present invention can also be realized in software in cooperation with hardware.

Each functional block used in the description of the above embodiments is typically implemented as an LSI, which is an integrated circuit. The blocks may each be made into a separate chip, or some or all of them may be integrated into a single chip. The term LSI is used here, but depending on the degree of integration the terms IC, system LSI, super LSI, and ultra LSI are also used.

The method of circuit integration is not limited to LSI; a dedicated circuit or a general-purpose processor may also be used. An FPGA (Field Programmable Gate Array) that can be programmed after the LSI is manufactured, or a reconfigurable processor in which the connections and settings of circuit cells inside the LSI can be reconfigured, may also be used.

Furthermore, if advances in semiconductor technology or other derived technologies give rise to integrated circuit technology that replaces LSI, the functional blocks may of course be integrated using that technology. Application of biotechnology or the like is also possible.

The disclosures of the specifications, drawings, and abstracts contained in Japanese Patent Application No. 2010-210116, filed on September 17, 2010, and Japanese Patent Application No. 2010-230537, filed on October 13, 2010, are incorporated herein by reference in their entirety.

Industrial applicability

The quantization apparatus and quantization method of the present invention are applicable to speech encoding apparatuses and the like.

101 ... LPC analysis unit

102 ... Multi-stage vector quantization unit

103 ... Adaptive codebook

104 ... Fixed codebook

105 ... Gain codebook

106, 107 ... Multiplier

108, 110 ... Adder

109 ... LPC synthesis filter

111 ... Perceptual weighting unit

112 ... Distortion minimization unit

201-1 to 201-J ... Vector quantization unit

202 ... Code determination unit

301 ... Distortion calculation and codebook search unit

302 ... Codebook

303 ... Candidate number determining unit

304 ... Virtual target calculation unit

ST401 to ST412 ... Steps

ST501 to ST502 ... Steps

Claims

A quantization apparatus that performs multi-stage quantization using tree search, the quantization apparatus comprising: a search unit that matches each of one or more targets to be encoded against code vectors stored in a codebook and obtains one or more candidates in ascending order of quantization distortion, the number of candidates being that determined in the preceding stage or a pre-specified number; a calculation unit that, for each candidate, calculates a quantization error vector by subtracting the code vector from the target; and a candidate number determining unit that determines the number of candidates to be used in the next stage based on the number of candidates determined in the preceding stage.

The quantization apparatus according to claim 1, wherein the candidate number determining unit determines that the number of candidates used in the next stage is the number of candidates determined in the preceding stage decremented by one.

The quantization apparatus according to claim 1, wherein, when the number of candidates determined in the preceding stage is equal to or less than a predetermined value P, the candidate number determining unit determines to use P as the number of candidates in the next stage if the quantization distortion is larger than a predetermined threshold, and determines to use a value Q smaller than P as the number of candidates in the next stage if the quantization distortion is equal to or less than the predetermined threshold.

The quantization apparatus according to claim 1, wherein, in the first stage, the search unit obtains a pre-specified number of candidates in ascending order of quantization distortion.

The quantization apparatus according to claim 1, wherein, when the current stage number is equal to or greater than a predetermined stage number, or the number of candidates is equal to or less than a predetermined number of candidates P, the candidate number determining unit determines to use a number of candidates R in the next stage if the quantization distortion is greater than a predetermined threshold and the number of candidates is smaller than the predetermined number of candidates R, and determines to use a number of candidates Q in the next stage if the quantization distortion is equal to or less than the predetermined threshold and the number of candidates is smaller than the predetermined number of candidates Q.

A quantization method for performing multi-stage quantization using tree search, the method comprising the steps of: matching each of one or more targets to be encoded against code vectors stored in a codebook, and obtaining, in ascending order of quantization distortion, a pre-specified number of candidates in the first stage and, from the second stage onward, the number of candidates determined in the preceding stage; calculating, for each candidate, a quantization error vector by subtracting the code vector from the target; and determining the number of candidates to be used in the next stage based on the number of candidates determined in the preceding stage.