WO2001024164A1 - Voice encoder, voice decoder, and voice encoding and decoding method


Info

Publication number
WO2001024164A1
Authority
WO
WIPO (PCT)
Prior art keywords
quantization
spectrum envelope
speech
information
fundamental frequency
Prior art date
Application number
PCT/JP2000/006542
Other languages
French (fr)
Japanese (ja)
Inventor
Tadashi Yonezaki
Original Assignee
Matsushita Electric Industrial Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co., Ltd. filed Critical Matsushita Electric Industrial Co., Ltd.
Priority to EP00961220A (published as EP1132891A1)
Priority to AU73212/00A (published as AU7321200A)
Publication of WO2001024164A1


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients

Definitions

  • Speech coding apparatus, speech decoding apparatus, and speech coding/decoding method
  • The present invention relates to a speech encoding device, a speech decoding device, and a speech coding/decoding method used in communication devices of wireless communication systems such as car phones and mobile phones.
  • FIG. 1 is a block diagram showing the configuration of a conventional speech encoding device and speech decoding device.
  • spectrum envelope analysis section 11 estimates spectral envelope information of an input speech signal.
  • the spectrum envelope quantization unit 12 quantizes the spectrum envelope information estimated by the spectrum envelope analysis unit 11.
  • The inverse filter 13 applies to the input speech signal a filter with the inverse of the frequency characteristic of the spectrum envelope information quantized by the spectrum envelope quantization unit 12, thereby removing the spectrum envelope component and yielding a signal with a flat frequency characteristic.
  • This signal is considered to mimic the sound source signal generated at the vocal cords during the voicing process.
  • Hereinafter, this signal is referred to as the “sound source signal”.
  • Excitation codebook 14 stores signals having flat frequency characteristics.
  • Excitation coding section 15 searches excitation codebook 14 for the signal closest to the sound source signal and outputs its code (hereinafter referred to as the “excitation code”).
  • The multiplexing unit 16 multiplexes a code indicating the quantized value of the spectrum envelope information output from the spectrum envelope quantization unit 12 and the excitation code output from excitation coding section 15 into a code sequence and transmits it to the communication channel.
  • demultiplexing section 21 separates the received code string into a code indicating a quantized value of spectrum envelope information and an excitation code.
  • In excitation codebook 22, the same signals as in excitation codebook 14 are stored. Sound source selection section 23 selects and extracts the signal corresponding to the received excitation code from excitation codebook 22.
  • the synthesis filter 24 filters the signal extracted by the sound source selection unit 23 so as to have the frequency characteristic of the received spectrum envelope information, and outputs a decoded voice.
  • In this way, the conventional speech encoding and decoding devices separate the spectrum envelope information, whose dynamic range and quantization characteristics differ from those of the sound source signal, and construct a quantizer suited to each characteristic.
  • As a result, high-quality speech coding and decoding is realized.
  • An object of the present invention is to provide a speech encoding device, a speech decoding device, and a speech coding/decoding method capable of realizing high-quality speech decoding even when information is transmitted at a low bit rate.
  • FIG. 1 is a block diagram illustrating a configuration of a conventional speech encoding device and a conventional speech decoding device.
  • FIG. 2 is a block diagram illustrating a configuration of the speech encoding device and the speech decoding device according to the first embodiment of the present invention.
  • FIG. 3 is a block diagram showing an internal configuration of a spectrum envelope quantization unit of the speech coding apparatus according to Embodiment 2 of the present invention.
  • FIG. 4 is a block diagram showing an internal configuration of a spectrum envelope quantization unit of the speech coding apparatus according to Embodiment 3 of the present invention.
  • FIG. 5 is a model diagram of a spectrum envelope curved surface according to Embodiment 3 of the present invention.
  • FIG. 6 is a block diagram showing the internal configuration of the spectrum envelope quantization unit of the speech encoding device according to Embodiment 4 of the present invention.
  • FIG. 7 is a block diagram showing an internal configuration of a spectrum envelope quantization unit of a speech coding apparatus according to Embodiment 5 of the present invention.
  • FIG. 8 is a block diagram showing an internal configuration of a model applicator of a speech coding apparatus according to Embodiment 6 of the present invention.
  • FIG. 9 is a block diagram illustrating an internal configuration of a parameter quantizer of a speech coding apparatus according to Embodiment 7 of the present invention.
  • FIG. 10 is a block diagram showing an internal configuration of a parameter quantizer in a speech coding apparatus according to Embodiment 8 of the present invention.
  • FIG. 11 is a block diagram showing the internal configuration of the parameter quantizer of the speech coding apparatus according to Embodiment 9 of the present invention.
  • FIG. 12 is a block diagram showing the internal configuration of the spectrum envelope configuration unit of the speech decoding device according to Embodiment 10 of the present invention.
  • FIG. 2 is a block diagram showing a configuration of the speech encoding device and the speech decoding device according to Embodiment 1 of the present invention.
  • speech analysis section 101 extracts a fundamental frequency and short-time spectrum envelope information from an input speech signal.
  • the fundamental frequency quantization unit 102 quantizes the fundamental frequency extracted by the speech analysis unit 101.
  • the matrix generation unit 103 generates a spectrum envelope surface on a time-frequency plane by arranging the short-time spectrum envelope information extracted by the speech analysis unit 101 along the time axis.
  • the spectrum envelope quantization unit 104 quantizes the spectrum envelope curved surface generated by the matrix generation unit 103.
  • The spectrum envelope information is quantized as a continuous function on the time-frequency plane because, if only the cut-out spectrum envelopes were quantized, the spectrum envelope information would be quantized in a manner dependent on the sound source information, and the separation of the two kinds of information in the quantization process, which is the gist of the present invention, would no longer be possible.
  • The multiplexing unit 105 multiplexes a code indicating the quantized value of the spectrum envelope curved surface output from the spectrum envelope quantization unit 104 and a code indicating the quantized value of the fundamental frequency output from the fundamental frequency quantization unit 102, and transmits them to the communication path.
  • Demultiplexing section 201 separates the received code string into a code indicating the quantized value of the spectrum envelope information and a code indicating the quantized value of the fundamental frequency.
  • the spectrum envelope configuration unit 202 reconstructs a quantized spectrum envelope surface from the received spectrum envelope information.
  • the speech synthesis unit 203 synthesizes and outputs a decoded speech by cutting out the spectrum envelope curved surface reconstructed by the spectrum envelope construction unit 202 based on the fundamental frequency information.
  • First, the speech analysis unit 101 of the speech coding apparatus 100 extracts the fundamental frequency and short-time spectrum envelope information from the input speech signal.
  • the extracted fundamental frequency is quantized by the fundamental frequency quantization unit 102.
  • the extracted short-time spectrum envelope information is arranged along the time axis in a matrix generation unit 103, and a spectrum envelope curved surface on a time-frequency plane is generated.
  • the spectrum envelope curved surface is quantized by a spectrum envelope quantization unit 104.
  • The quantized fundamental frequency and spectrum envelope curved surface are multiplexed by the multiplexing unit 105 and transmitted to the communication path. The code string is then received by the demultiplexing unit 201 of the speech decoding apparatus 200 and separated into the quantized value of the spectrum envelope information and the quantized value of the fundamental frequency.
  • The quantized value of the spectrum envelope information is input to the spectrum envelope construction unit 202, where the spectrum envelope curved surface is reconstructed. The speech synthesis unit 203 then cuts out the reconstructed spectrum envelope curved surface based on the fundamental frequency information, thereby synthesizing and outputting the decoded speech.
  • FIG. 3 is a block diagram showing an internal configuration of the spectrum envelope quantization unit of the speech encoding device according to Embodiment 2 of the present invention.
  • the configuration of the speech coding apparatus according to the present embodiment is the same as the configuration of the speech coding apparatus shown in FIG. 2 of Embodiment 1, and a description thereof will be omitted.
  • The two-dimensional orthogonal transformer 301 performs a two-dimensional orthogonal transform on the spectrum envelope curved surface in the time-axis and frequency-axis directions.
  • The parameter quantizer 302 quantizes the transform coefficients obtained by the two-dimensional orthogonal transform in the two-dimensional orthogonal transformer 301.
  • Since differences in the high-frequency components of the spectrum envelope curved surface are hard to perceive, the parameter quantizer 302 quantizes only the coefficient information of the low-frequency components.
  • FIG. 4 is a block diagram showing the internal configuration of the spectrum envelope quantization unit of the speech encoding device according to Embodiment 3 of the present invention.
  • The configuration of the speech coding apparatus according to the present embodiment is the same as that of the speech coding apparatus shown in FIG. 2 of Embodiment 1, so its description is omitted.
  • the model applicator 311 models a spectrum envelope curved surface and extracts model parameters.
  • This model represents the spectrum envelope curved surface in time-frequency space. For example, as shown in FIG. 5, the surface can be modeled by applying an all-pole model to the cross-sections at both ends of the time axis of the spectrum envelope curved surface and interpolating between them.
  • The parameter quantizer 302 quantizes the model parameters extracted by the model applicator 311.
  • In this way, by modeling the spectrum envelope curved surface, its quantization efficiency can be improved, and high-quality speech decoding can be achieved even when information is transmitted at a low bit rate.
  • FIG. 6 is a block diagram showing the internal configuration of the spectrum envelope quantization unit of the speech encoding device according to Embodiment 4 of the present invention.
  • the configuration of the speech coding apparatus according to the present embodiment is the same as the configuration of the speech coding apparatus shown in FIG. 2 of Embodiment 1, and a description thereof will be omitted.
  • The time-axis orthogonal transformer 321 performs an orthogonal transform on the spectrum envelope curved surface in the time-axis direction.
  • The model applicator 311 applies, to the orthogonally transformed time-axis transform coefficients, a model according to the order on the time axis, and extracts parameters.
  • The parameter quantizer 302 quantizes the model parameters extracted by the model applicator 311.
  • FIG. 7 is a block diagram showing the internal configuration of the spectrum envelope quantization unit of the speech encoding device according to Embodiment 5 of the present invention.
  • the configuration of the speech coding apparatus according to the present embodiment is the same as the configuration of the speech coding apparatus shown in FIG. 2 of Embodiment 1, and a description thereof will be omitted.
  • The time-axis orthogonal transformer 331 performs an orthogonal transform on the spectrum envelope curved surface in the time-axis direction, and classifies the resulting time-axis transform coefficients into those to be modeled and those not to be modeled. As a classification method, for example, the 0th-order coefficient on the time axis, which is the spectrum envelope obtained by averaging the spectrum envelope curved surface, is modeled with an all-pole model, while no model is applied to the other coefficients.
  • The model applicator 311 applies, to part of the orthogonally transformed time-axis transform coefficients, a model according to the order on the time axis, and extracts parameters.
  • The frequency-axis orthogonal transformer 332 performs an orthogonal transform in the frequency-axis direction on the time-axis transform coefficients to which no model is applied.
  • The parameter quantizer 302 quantizes the model parameters extracted by the model applicator 311 and the transform coefficients output from the frequency-axis orthogonal transformer 332.
  • FIG. 8 is a block diagram showing an internal configuration of a model applicator of a speech encoding device according to Embodiment 6 of the present invention.
  • The model applicator 311 according to the present embodiment is the one shown in any of Embodiments 3 to 5 above.
  • The model parameter estimator 401 applies the model to the input signal and extracts the parameters.
  • For example, in the case of speech coding, the input signal is modeled with an all-pole model in consideration of the speech production process.
  • However, when the order of the model is low, the model cannot represent the zeros contained in the signal, and the modeling introduces analysis distortion.
  • Therefore, the model error estimator 402 estimates the analysis distortion that arises when the model is applied and outputs it to the parameter quantizer.
  • FIG. 9 is a block diagram showing an internal configuration of a parameter quantizer in a speech coding apparatus according to Embodiment 7 of the present invention.
  • parameter quantizer 302 according to the present embodiment is as described in any of Embodiments 2 to 5 above.
  • the weight calculator 501 determines the quantization sensitivity for each quantization target value using the fundamental frequency information.
  • An example of how the weight calculator 501 determines the quantization sensitivity from the fundamental frequency information is described below.
  • In the speech decoding process, the spectrum envelope curved surface is cut out according to the fundamental frequency and the cut-out envelopes are concatenated on the time axis to generate the decoded speech.
  • At this time, the amplitude values at the harmonics of the fundamental frequency are more important than the other spectral amplitude values. Therefore, a weight coefficient surface is generated that weights the harmonic amplitude values at the cut-out spectrum envelope positions.
  • This weight surface is then transformed by the same method used to obtain the quantization target values, and the weight coefficients computed in the parameter space to be quantized determine the quantization sensitivity of each quantization target value.
  • The weight calculator 502 determines the quantization sensitivity for each quantization target value using the spectrum envelope information.
  • An example of how the weight calculator 502 determines the quantization sensitivity is described below. When noise of the same magnitude is added to a signal, the noise is perceptually more noticeable for a signal with a small dynamic range, so a weight coefficient surface is generated that gives larger weights where the amplitude of the spectrum envelope curved surface is smaller.
  • This weight surface is then transformed by the same method used to obtain the quantization target values, and the weight coefficients computed in the parameter space to be quantized determine the quantization sensitivity of each quantization target value. Since the quantizer adaptation performed in the weight calculator 502 is also needed in the decoding process, it is desirable to use the spectrum envelope information quantized in the previous frame so as to stay synchronized with the decoder.
  • The statistic accumulator 503 stores statistics obtained in advance for each quantization target value.
  • The quantizer generator 504 designs a quantizer from the quantization sensitivities for the quantization target values output by the weight calculators 501 and 502 and the statistics stored in the statistic accumulator 503.
  • For example, when a scalar quantizer is used, the variance of each quantization target value is stored as its statistic, and the quantization step width is determined from this variance and the quantization sensitivity.
  • For equal variances, a quantization target value with higher quantization sensitivity, that is, one more easily affected by quantization error, is given a smaller quantization step width.
  • the quantizer 505 quantizes the value to be quantized based on the design result of the quantization generator 504.
  • the quantization sensitivity for each quantization target value is determined using two pieces of information of the fundamental frequency information and the spectrum envelope information.
  • the quantization sensitivity may be determined using one of the two pieces of information to design a quantizer.
  • FIG. 10 is a block diagram showing the internal configuration of the parameter quantizer in the speech coding apparatus according to Embodiment 8 of the present invention.
  • parameter quantizer 302 according to the present embodiment is as described in any of Embodiments 2 to 5 above.
  • the error scale determiner 511 adaptively determines the quantization error scale on the spectrum envelope using the fundamental frequency information.
  • the error scale determiner 512 adaptively determines a quantization error scale on the spectrum envelope using the spectrum envelope information.
  • The error scale synthesizer 513 synthesizes the error scales obtained by the error scale determiner 511 and the error scale determiner 512 into one error scale.
  • The codebook 514 stores quantized values.
  • the spectrum envelope constructor 515 converts the quantized values stored in the codebook 514 into a spectrum envelope curved surface.
  • the spectrum envelope constructor 516 converts the quantization target value into a spectrum envelope curved surface.
  • Based on the error scale output from the error scale synthesizer 513, the error calculator 517 calculates the error between the spectrum envelope curved surface constructed by the spectrum envelope constructor 515 and the spectrum envelope curved surface constructed by the spectrum envelope constructor 516.
  • the code selector 518 selects a code corresponding to the quantized value with the smallest error from the codebook 514 and outputs the selected code.
  • In this way, by calculating the error of the spectrum envelope curved surface on the time-frequency plane using a quantization error scale adapted to the fundamental frequency and the spectrum envelope information, the objective quantization distortion and the audible distortion of the synthesized speech signal can be reduced.
  • In the present embodiment, the quantization error scale on the spectrum envelope is determined using both the fundamental frequency and the spectrum envelope information, but the quantization error scale may be determined from only one of them and the error calculated accordingly.
  • FIG. 11 is a block diagram showing an internal configuration of a parameter quantizer in the speech coding apparatus according to Embodiment 9 of the present invention.
  • parameter quantizer 302 according to the present embodiment is as described in any of Embodiments 2 to 5 above.
  • the error function determiner 521 adaptively determines a quantization error weighting function on the spectrum envelope using the fundamental frequency information.
  • the error function determiner 522 adaptively determines a quantization error weight function on the spectrum envelope using the spectrum envelope information.
  • The error function synthesizer 523 synthesizes the quantization error weight functions obtained by the error function determiner 521 and the error function determiner 522 into one error function.
  • The error function converter 524 transforms the quantization error weight function output from the error function synthesizer 523 to define an error measure on the quantization parameters.
  • the codebook 525 stores quantized values.
  • the error calculator 526 calculates an error between the quantization target value and the quantized value stored in the codebook 525 based on the error measure output from the error function converter 524.
  • the code selector 527 selects a code corresponding to the quantized value that minimizes the error from the codebook 525 and outputs the selected code.
  • In this way, the objective quantization distortion and the audible distortion of the synthesized speech signal can be reduced with a small amount of processing.
  • In the present embodiment, the quantization error weight function on the spectrum envelope is determined using both the fundamental frequency and the spectrum envelope information, but the quantization error weight function may be determined from only one of them and the error calculated accordingly.
  • FIG. 12 is a block diagram showing the internal configuration of the spectrum envelope configuration unit of the speech decoding device according to Embodiment 10 of the present invention.
  • The configuration of the speech decoding device according to the present embodiment is the same as that of the speech decoding device shown in FIG. 2 of Embodiment 1, so its description is omitted.
  • In the present embodiment, the spectrum envelope curved surface is generated on the decoding side by filling in parameters that were not received with parameter values obtained statistically in advance.
  • The parameter storage unit 601 stores, for each parameter that is not quantized, a parameter value obtained statistically in advance.
  • the spectrum envelope generator 602 generates a spectrum envelope curved surface based on the input spectrum envelope information.
  • As described above, according to the speech coding apparatus, the speech decoding apparatus, and the speech coding/decoding method of the present invention, the spectrum envelope information and the sound source information are completely separated, so that the speech coding/decoding process is not affected by the quantization accuracy of the spectrum envelope information, and highly efficient speech coding/decoding is realized through an efficient quantization method for the spectrum envelope information that is effective in an analysis-synthesis model. Therefore, high-quality speech decoding can be realized even when information is transmitted at a low bit rate.
  • The present specification is based on Japanese Patent Application No. 11-27551 19 filed on September 28, 1999, the contents of which are incorporated herein.
  • Industrial Applicability: The present invention is suitable for use in communication terminal apparatuses and base station apparatuses of wireless communication systems that transmit speech data wirelessly.

Abstract

A voice analyzer (101) in a voice encoder (100) extracts the fundamental frequency and spectral envelope information from an input voice signal. A fundamental frequency quantizer (102) quantizes the fundamental frequency. A matrix generator (103) constructs a spectral envelope surface from the spectral envelope information, and a spectral envelope quantizer (104) quantizes the surface. A multiplexer (105) multiplexes the quantized spectral envelope and the quantized fundamental frequency for transmission. In a voice decoder (200), a spectral envelope composer (202) reconstructs the quantized spectral envelope from the received spectral envelope information, and a voice synthesizer (203) cuts out the spectral envelope based on the fundamental frequency information to synthesize the decoded voice. Thus, high-quality voice decoding can be achieved even when transmitting at a low bit rate.

Description

Speech coding apparatus, speech decoding apparatus, and speech coding/decoding method
Technical Field
The present invention relates to a speech encoding device, a speech decoding device, and a speech coding/decoding method used in communication devices of wireless communication systems such as car phones and mobile phones.
Background Art
In recent years, in the field of wireless communication systems, where demand is increasing rapidly, the development of devices that can encode and decode speech at a low bit rate and with high quality is being pursued to make effective use of radio resources.
FIG. 1 is a block diagram showing the configuration of a conventional speech encoding device and speech decoding device.
In the speech encoding device 1 of FIG. 1, the spectrum envelope analysis unit 11 estimates the spectrum envelope information of the input speech signal. The spectrum envelope quantization unit 12 quantizes the spectrum envelope information estimated by the spectrum envelope analysis unit 11. The inverse filter 13 applies to the input speech signal a filter with the inverse of the frequency characteristic of the spectrum envelope information quantized by the spectrum envelope quantization unit 12, thereby removing the spectrum envelope component. A signal with a flat frequency characteristic is thus obtained. This signal is considered to mimic the sound source signal generated at the vocal cords during the voicing process. Hereinafter, this signal is referred to as the “sound source signal”.
Excitation codebook 14 stores signals having flat frequency characteristics. Excitation coding section 15 searches excitation codebook 14 for the signal closest to the sound source signal and outputs its code (hereinafter referred to as the “excitation code”).
The multiplexing unit 16 multiplexes a code indicating the quantized value of the spectrum envelope information output from the spectrum envelope quantization unit 12 and the excitation code output from the excitation coding section 15 into a code sequence and transmits it to the communication channel.
In the speech decoding device 2 of FIG. 1, the demultiplexing unit 21 separates the received code string into a code indicating the quantized value of the spectrum envelope information and the excitation code.
In excitation codebook 22, the same signals as in excitation codebook 14 are stored. The sound source selection unit 23 selects and extracts the signal corresponding to the received excitation code from excitation codebook 22.
The synthesis filter 24 filters the signal extracted by the sound source selection unit 23 so that it has the frequency characteristic of the received spectrum envelope information, and outputs the decoded speech.
In this way, the conventional speech encoding and decoding devices separate the spectrum envelope information, whose dynamic range and quantization characteristics differ from those of the sound source signal, and construct a quantizer suited to each characteristic, thereby realizing high-quality speech coding and decoding.
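As a point of reference for the conventional scheme described above, the following sketch illustrates, under simplifying assumptions, how such an encoder obtains the flat “sound source signal”: an all-pole (LPC) spectral envelope is estimated from a frame by the autocorrelation method, the inverse filter A(z) is applied to the input to produce the residual, and the synthesis filter 1/A(z) restores the frame. The frame length, model order, and the autocorrelation method itself are illustrative choices, not values taken from the patent.

```python
import numpy as np
from scipy.signal import lfilter

def levinson(r, order):
    """Levinson-Durbin recursion: from autocorrelation values r[0..order],
    return prediction coefficients a[1..order] of the all-pole model
    A(z) = 1 - sum_k a[k] z^-k, together with the residual energy."""
    a = np.zeros(order)
    err = r[0]
    for i in range(order):
        acc = r[i + 1] - np.dot(a[:i], r[i:0:-1])
        k = acc / err
        a_prev = a[:i].copy()
        a[i] = k
        a[:i] = a_prev - k * a_prev[::-1]
        err *= 1.0 - k * k
    return a, err

def lpc_from_frame(frame, order=10):
    """All-pole spectral envelope of one time-domain frame (autocorrelation method)."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    return levinson(r, order)

frame = np.random.randn(160)               # stand-in for one 20 ms speech frame at 8 kHz
a, _ = lpc_from_frame(frame, order=10)
inv = np.concatenate(([1.0], -a))          # inverse filter A(z)
residual = lfilter(inv, [1.0], frame)      # roughly flat-spectrum "sound source signal"
resynth = lfilter([1.0], inv, residual)    # synthesis filter 1/A(z) restores the frame
```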
However, in the above conventional speech encoding and decoding devices, the filtering is performed based on the result of quantizing the spectrum envelope information. Therefore, when sufficient accuracy cannot be obtained in the quantization of the spectrum envelope information because of a low bit rate, the sound source signal cannot be flattened, the quantization efficiency decreases, and the quality of the decoded speech deteriorates.
Disclosure of the Invention
An object of the present invention is to provide a speech encoding device, a speech decoding device, and a speech coding/decoding method capable of realizing high-quality speech decoding even when information is transmitted at a low bit rate.
This object is achieved by noting that a speech signal is generated by cutting out a temporally continuous spectrum envelope curved surface based on the fundamental frequency, by completely separating the spectrum envelope information from the sound source information so that the speech coding/decoding process is not affected by the quantization accuracy of the spectrum envelope information, and by realizing highly efficient speech coding/decoding through an efficient quantization method for the spectrum envelope information that is effective in an analysis-synthesis model.
Brief Description of the Drawings
FIG. 1 is a block diagram showing the configuration of a conventional speech encoding device and speech decoding device;
FIG. 2 is a block diagram showing the configuration of the speech encoding device and speech decoding device according to Embodiment 1 of the present invention;
FIG. 3 is a block diagram showing the internal configuration of the spectrum envelope quantization unit of the speech encoding device according to Embodiment 2 of the present invention;
FIG. 4 is a block diagram showing the internal configuration of the spectrum envelope quantization unit of the speech encoding device according to Embodiment 3 of the present invention;
FIG. 5 is a model diagram of a spectrum envelope curved surface according to Embodiment 3 of the present invention;
FIG. 6 is a block diagram showing the internal configuration of the spectrum envelope quantization unit of the speech encoding device according to Embodiment 4 of the present invention;
FIG. 7 is a block diagram showing the internal configuration of the spectrum envelope quantization unit of the speech encoding device according to Embodiment 5 of the present invention;
FIG. 8 is a block diagram showing the internal configuration of the model applicator of the speech encoding device according to Embodiment 6 of the present invention;
FIG. 9 is a block diagram showing the internal configuration of the parameter quantizer of the speech encoding device according to Embodiment 7 of the present invention;
FIG. 10 is a block diagram showing the internal configuration of the parameter quantizer of the speech encoding device according to Embodiment 8 of the present invention;
FIG. 11 is a block diagram showing the internal configuration of the parameter quantizer of the speech encoding device according to Embodiment 9 of the present invention; and
FIG. 12 is a block diagram showing the internal configuration of the spectrum envelope construction unit of the speech decoding device according to Embodiment 10 of the present invention.
Best Mode for Carrying Out the Invention
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
(Embodiment 1)
FIG. 2 is a block diagram showing the configuration of the speech encoding device and speech decoding device according to Embodiment 1 of the present invention.
In the speech encoding device 100 of FIG. 2, the speech analysis unit 101 extracts the fundamental frequency and short-time spectrum envelope information from the input speech signal. The fundamental frequency quantization unit 102 quantizes the fundamental frequency extracted by the speech analysis unit 101.
Speech analysis that extracts the fundamental frequency and short-time spectrum envelope information from an input speech signal based on the STRAIGHT analysis-synthesis model has already been disclosed in [Hideki Kawahara and Ikuyo Masuda, "Conversion of speech using interpolation in the time-frequency domain," IEICE Technical Report EA96-28, pp. 9-18, 1996] and elsewhere. In this model, the sound source information consists of the fundamental frequency only and is completely independent of the spectrum envelope information, so the quantization errors of the sound source information and of the spectrum envelope information do not affect the quantization of each other.
The matrix generation unit 103 generates a spectrum envelope curved surface on the time-frequency plane by arranging the short-time spectrum envelope information extracted by the speech analysis unit 101 along the time axis. The spectrum envelope quantization unit 104 quantizes the spectrum envelope curved surface generated by the matrix generation unit 103.
The spectrum envelope information is quantized as a continuous function on the time-frequency plane because, if only the cut-out spectrum envelopes were quantized, the spectrum envelope information would be quantized in a manner dependent on the sound source information, and the separation of the two kinds of information in the quantization process, which is the gist of the present invention, would no longer be possible. The multiplexing unit 105 multiplexes a code indicating the quantized value of the spectrum envelope curved surface output from the spectrum envelope quantization unit 104 and a code indicating the quantized value of the fundamental frequency output from the fundamental frequency quantization unit 102, and transmits them to the communication path.
In the speech decoding device 200 of FIG. 2, the demultiplexing unit 201 separates the received code string into a code indicating the quantized value of the spectrum envelope information and a code indicating the quantized value of the fundamental frequency.
The spectrum envelope construction unit 202 reconstructs the quantized spectrum envelope curved surface from the received spectrum envelope information. The speech synthesis unit 203 synthesizes and outputs the decoded speech by cutting out the spectrum envelope curved surface reconstructed by the spectrum envelope construction unit 202 based on the fundamental frequency information.
Next, the flow of the information processing operations of the speech encoding device and speech decoding device according to the present embodiment shown in FIG. 2 will be described.
First, the speech analysis unit 101 of the speech encoding device 100 extracts the fundamental frequency and short-time spectrum envelope information from the input speech signal. The extracted fundamental frequency is quantized by the fundamental frequency quantization unit 102.
Meanwhile, the extracted short-time spectrum envelope information is arranged along the time axis in the matrix generation unit 103 to generate a spectrum envelope curved surface on the time-frequency plane. The spectrum envelope curved surface is quantized by the spectrum envelope quantization unit 104.
The quantized fundamental frequency and spectrum envelope curved surface are multiplexed by the multiplexing unit 105 and transmitted to the communication path. The code string is then received by the demultiplexing unit 201 of the speech decoding device 200 and separated into the quantized value of the spectrum envelope information and the quantized value of the fundamental frequency.
The quantized value of the spectrum envelope information is input to the spectrum envelope construction unit 202, where the spectrum envelope curved surface is reconstructed. The speech synthesis unit 203 then cuts out the reconstructed spectrum envelope curved surface based on the fundamental frequency information, thereby synthesizing and outputting the decoded speech.
In this way, by quantizing the sound source information and the spectrum envelope information independently, a loss of quantization accuracy in one kind of information does not degrade the quantization efficiency of the other, and high-quality speech decoding can be realized even when information is transmitted at a low bit rate.
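The data layout used in this embodiment can be illustrated with a short sketch: short-time spectrum envelopes are stacked along the time axis to form the envelope surface, and on the decoding side the surface is cut out at the harmonics of the received fundamental frequency. The sampling rate, FFT size, and the linear interpolation used for the cut-out are assumptions chosen for illustration; the patent does not fix these values.

```python
import numpy as np

FS = 8000      # sampling rate in Hz (assumed)
NFFT = 256     # FFT size of each short-time analysis frame (assumed)

def build_envelope_surface(short_time_envelopes):
    """Stack per-frame spectral envelopes (each of length NFFT//2 + 1) along
    the time axis: rows correspond to time, columns to frequency."""
    return np.vstack(short_time_envelopes)

def cut_out_harmonics(surface, frame_index, f0):
    """Sample one time cross-section of the envelope surface at the harmonics
    of f0, i.e. the amplitudes the synthesis side places at each harmonic."""
    env = surface[frame_index]
    freqs = np.arange(env.size) * FS / NFFT        # bin centre frequencies
    harmonics = np.arange(f0, FS / 2, f0)          # f0, 2*f0, ... below Nyquist
    return np.interp(harmonics, freqs, env)

# usage with 40 stand-in envelopes
frames = [np.abs(np.random.randn(NFFT // 2 + 1)) + 1.0 for _ in range(40)]
surface = build_envelope_surface(frames)           # shape (40, 129)
amps = cut_out_harmonics(surface, frame_index=10, f0=120.0)
```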
(Embodiment 2)
FIG. 3 is a block diagram showing the internal configuration of the spectrum envelope quantization unit of the speech encoding device according to Embodiment 2 of the present invention.
The configuration of the speech encoding device according to the present embodiment is the same as that of the speech encoding device shown in FIG. 2 of Embodiment 1, so its description is omitted.
In the spectrum envelope quantization unit 104 of FIG. 3, the two-dimensional orthogonal transformer 301 applies a two-dimensional orthogonal transform to the spectrum envelope curved surface in the time-axis and frequency-axis directions. The parameter quantizer 302 quantizes the transform coefficients obtained by the two-dimensional orthogonal transform in the two-dimensional orthogonal transformer 301.
In general, differences in the high-frequency components of the spectrum envelope curved surface are difficult to perceive. Therefore, even if speech is synthesized on the decoding side using only the low-frequency coefficient information obtained by the orthogonal transform, the speech quality does not deteriorate significantly. The parameter quantizer 302 therefore quantizes only the coefficient information of the low-frequency components.
In this way, using an orthogonal transform makes it possible to discard information that is not perceptually important, and high-quality speech decoding can be realized even when information is transmitted at a low bit rate.
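The text does not name a particular two-dimensional orthogonal transform. The sketch below uses a separable DCT-II as one plausible choice and keeps only a low-order block of coefficients, which is the part the parameter quantizer would encode; the block sizes are illustrative.

```python
import numpy as np
from scipy.fft import dct, idct

def transform_and_truncate(surface, keep_t=4, keep_f=12):
    """Separable 2D DCT of the spectrum envelope surface (time axis, then
    frequency axis), keeping only the low-order coefficient block."""
    coeffs = dct(dct(surface, axis=0, norm="ortho"), axis=1, norm="ortho")
    return coeffs[:keep_t, :keep_f].copy()

def reconstruct(kept, shape):
    """Decoder side: zero-pad the kept block and invert the 2D DCT."""
    full = np.zeros(shape)
    full[:kept.shape[0], :kept.shape[1]] = kept
    return idct(idct(full, axis=1, norm="ortho"), axis=0, norm="ortho")

surface = np.abs(np.random.randn(40, 129)) + 1.0   # stand-in envelope surface
kept = transform_and_truncate(surface)             # 4 x 12 coefficients instead of 40 x 129
approx = reconstruct(kept, surface.shape)          # smooth approximation of the surface
```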
(Embodiment 3)
FIG. 4 is a block diagram showing the internal configuration of the spectrum envelope quantization unit of the speech encoding device according to Embodiment 3 of the present invention.
The configuration of the speech encoding device according to the present embodiment is the same as that of the speech encoding device shown in FIG. 2 of Embodiment 1, so its description is omitted.
In the spectrum envelope quantization unit 104 of FIG. 4, the model applicator 311 models the spectrum envelope curved surface and extracts model parameters.
This model represents the spectrum envelope curved surface in time-frequency space. For example, as shown in FIG. 5, the surface can be modeled by applying an all-pole model to the cross-sections at both ends of the time axis of the spectrum envelope curved surface and interpolating between them.
The parameter quantizer 302 quantizes the model parameters extracted by the model applicator 311.
In this way, by modeling the spectrum envelope curved surface, the quantization efficiency of the spectrum envelope curved surface can be improved, and high-quality speech decoding can be realized even when information is transmitted at a low bit rate.
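One way to read the FIG. 5 example is sketched below: an all-pole model is fitted to the spectrum envelope at the two ends of the time axis, and the whole surface is approximated by interpolating between the two modelled cross-sections. The fit goes through the autocorrelation of the squared envelope, and the interpolation is done linearly on the log magnitude; the model order and the interpolation domain are assumptions, and the levinson() helper from the earlier background sketch is reused.

```python
import numpy as np
from scipy.signal import freqz

def all_pole_from_envelope(envelope, order=10):
    """Fit an all-pole model to a magnitude envelope: the autocorrelation is the
    inverse FFT of the squared envelope, then Levinson-Durbin is applied
    (levinson() is the helper defined in the background sketch above)."""
    r = np.fft.irfft(envelope ** 2)
    coefs, err = levinson(r, order)
    return np.concatenate(([1.0], -coefs)), np.sqrt(max(err, 1e-12))

def model_log_envelope(a, gain, n_bins):
    """Log-magnitude response of gain / A(z), sampled on n_bins frequency bins."""
    _, h = freqz([gain], a, worN=n_bins)
    return np.log(np.abs(h) + 1e-12)

surface = np.abs(np.random.randn(40, 129)) + 1.0    # stand-in envelope surface
a0, g0 = all_pole_from_envelope(surface[0])          # model of the first time cross-section
a1, g1 = all_pole_from_envelope(surface[-1])         # model of the last time cross-section
log0 = model_log_envelope(a0, g0, surface.shape[1])
log1 = model_log_envelope(a1, g1, surface.shape[1])

# Interpolate the two modelled cross-sections linearly over time; only the two
# parameter sets (a0, g0) and (a1, g1) would then need to be quantized.
t = np.linspace(0.0, 1.0, surface.shape[0])[:, None]
modeled_surface = np.exp((1.0 - t) * log0 + t * log1)
```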
(Embodiment 4)
FIG. 6 is a block diagram showing the internal configuration of the spectrum envelope quantization unit of the speech encoding device according to Embodiment 4 of the present invention.
The configuration of the speech encoding device according to the present embodiment is the same as that of the speech encoding device shown in FIG. 2 of Embodiment 1, so its description is omitted.
In the spectrum envelope quantization unit 104 of FIG. 6, the time-axis orthogonal transformer 321 applies an orthogonal transform to the spectrum envelope curved surface in the time-axis direction. The model applicator 311 applies, to the orthogonally transformed time-axis transform coefficients, a model according to the order on the time axis, and extracts parameters. The parameter quantizer 302 quantizes the model parameters extracted by the model applicator 311.
In this way, by applying a model according to the order of the time-axis transform coefficients, the quantization efficiency of the modeling can be improved, and high-quality speech decoding can be realized even when information is transmitted at a low bit rate.
(Embodiment 5)
FIG. 7 is a block diagram showing the internal configuration of the spectrum envelope quantization unit of the speech encoding device according to Embodiment 5 of the present invention.
The configuration of the speech encoding device according to the present embodiment is the same as that of the speech encoding device shown in FIG. 2 of Embodiment 1, so its description is omitted.
In the spectrum envelope quantization unit 104 of FIG. 7, the time-axis orthogonal transformer 331 applies an orthogonal transform to the spectrum envelope curved surface in the time-axis direction and classifies the resulting time-axis transform coefficients into those to be modeled and those not to be modeled. As a classification method, for example, the 0th-order coefficient on the time axis, which is the spectrum envelope obtained by averaging the spectrum envelope curved surface, is modeled with an all-pole model, while no model is applied to the other coefficients.
The model applicator 311 applies, to part of the orthogonally transformed time-axis transform coefficients, a model according to the order on the time axis, and extracts parameters. The frequency-axis orthogonal transformer 332 applies an orthogonal transform in the frequency-axis direction to the time-axis transform coefficients to which no model is applied. The parameter quantizer 302 quantizes the model parameters extracted by the model applicator 311 and the transform coefficients output from the frequency-axis orthogonal transformer 332.
In this way, by applying the model only to the time-axis transform coefficients whose quantization efficiency is improved by modeling, the modeling distortion is reduced while the quantization efficiency gained by modeling is preserved, and high-quality speech decoding can be realized even when information is transmitted at a low bit rate.
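A sketch of this split, following the classification example given above: after a DCT along the time axis, the 0th-order row (the time-averaged envelope) is handed to an all-pole model, while the remaining rows are transformed along the frequency axis and kept as coefficients. The use of a DCT and the retained block sizes are assumptions, and the all_pole_from_envelope() helper from the Embodiment 3 sketch is reused.

```python
import numpy as np
from scipy.fft import dct

def split_and_transform(surface, keep_rows=4, keep_cols=12, order=10):
    """Time-axis DCT of the envelope surface; model only the 0th-order row."""
    time_coeffs = dct(surface, axis=0, norm="ortho")          # transform along time
    avg_envelope = np.abs(time_coeffs[0]) + 1e-6               # ~ averaged spectrum envelope
    a, gain = all_pole_from_envelope(avg_envelope, order)      # Embodiment 3 helper (all-pole model)
    rest = time_coeffs[1:keep_rows]                             # coefficients left unmodeled
    rest_coeffs = dct(rest, axis=1, norm="ortho")[:, :keep_cols]  # frequency-axis transform
    return (a, gain), rest_coeffs

surface = np.abs(np.random.randn(40, 129)) + 1.0
model_params, rest_coeffs = split_and_transform(surface)
# model_params and rest_coeffs together are what the parameter quantizer 302 encodes
```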
(Embodiment 6)
FIG. 8 is a block diagram showing the internal configuration of the model applicator of the speech encoding device according to Embodiment 6 of the present invention.
The model applicator 311 according to the present embodiment is the one shown in any of Embodiments 3 to 5 above.
The model parameter estimator 401 applies the model to the input signal and extracts the parameters.
For example, in the case of speech coding, the input signal is modeled with an all-pole model in consideration of the speech production process. However, when the order of the model is low, the model cannot represent the zeros contained in the signal, and the modeling introduces analysis distortion.
Therefore, the model error estimator 402 estimates the analysis distortion that arises when the model is applied and outputs it to the parameter quantizer.
In this way, by quantizing the modeling distortion, the modeling distortion is reduced while the quantization efficiency gained by modeling is preserved, and high-quality speech decoding can be realized even when information is transmitted at a low bit rate.
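A small sketch of this idea: once all-pole parameters have been estimated, the analysis distortion can be taken as the difference between the actual envelope and the model's magnitude response, and handed to the parameter quantizer together with the parameters. The log-domain difference used below is an assumption; the patent does not specify the error measure.

```python
import numpy as np
from scipy.signal import freqz

def model_error(envelope, a, gain):
    """Analysis distortion of an all-pole fit: difference between the true log
    envelope and the log-magnitude response of gain / A(z)."""
    _, h = freqz([gain], a, worN=envelope.size)
    model_log = np.log(np.abs(h) + 1e-12)
    return np.log(envelope + 1e-12) - model_log    # residual the all-pole model cannot represent

# usage with the Embodiment 3 helper:
#   a, gain = all_pole_from_envelope(envelope)
#   err = model_error(envelope, a, gain)   # quantized together with (a, gain)
```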
(Embodiment 7)
FIG. 9 is a block diagram showing the internal configuration of the parameter quantizer of the speech encoding device according to Embodiment 7 of the present invention.
The parameter quantizer 302 according to the present embodiment is the one shown in any of Embodiments 2 to 5 above.
The weight calculator 501 determines the quantization sensitivity of each quantization target value using the fundamental frequency information. An example of how the weight calculator 501 determines the quantization sensitivity is given below.
The speech decoding process generates the decoded speech by cutting out the spectrum envelope curved surface according to the fundamental frequency and concatenating the cut-out envelopes on the time axis. At this time, the amplitude values at the harmonics of the fundamental frequency are more important than the other spectral amplitude values. Therefore, a weight coefficient surface is generated that weights the harmonic amplitude values at the cut-out spectrum envelope positions.
Next, this weight surface is transformed by the same method used to obtain the quantization target values, and the weight coefficients computed in the parameter space to be quantized determine the quantization sensitivity of each quantization target value.
The weight calculator 502 determines the quantization sensitivity of each quantization target value using the spectrum envelope information. An example of how the weight calculator 502 determines the quantization sensitivity is given below.
When noise of the same magnitude is added to a signal, the noise is perceptually more noticeable for a signal with a small dynamic range than for one with a large dynamic range. Therefore, a weight coefficient surface is generated that gives larger weights where the amplitude of the spectrum envelope curved surface is smaller.
Next, this weight surface is transformed by the same method used to obtain the quantization target values, and the weight coefficients computed in the parameter space to be quantized determine the quantization sensitivity of each quantization target value. Since the quantizer adaptation performed in the weight calculator 502 is also needed in the decoding process, it is desirable to use the spectrum envelope information quantized in the previous frame so as to stay synchronized with the decoder.
The statistic accumulator 503 stores statistics obtained in advance for each quantization target value. The quantizer generator 504 designs a quantizer from the quantization sensitivities for the quantization target values output by the weight calculators 501 and 502 and the statistics stored in the statistic accumulator 503.
For example, when a scalar quantizer is used, the variance of each quantization target value is stored as its statistic, and the quantization step width is determined from this variance and the quantization sensitivity. For equal variances, a quantization target value with higher quantization sensitivity, that is, one more easily affected by quantization error, is given a smaller quantization step width. The quantizer 505 quantizes the quantization target values based on the design result of the quantizer generator 504.
In this way, by adapting the quantizer to the fundamental frequency and the spectrum envelope information, the objective quantization distortion and the audible distortion of the synthesized speech signal can be reduced.
In the present embodiment, the quantization sensitivity of each quantization target value was determined using both the fundamental frequency information and the spectrum envelope information, but the quantization sensitivity may be determined using only one of them to design the quantizer.
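A compact sketch of this adaptation, under stated assumptions: the fundamental-frequency weight emphasizes the harmonic positions of f0, the envelope weight emphasizes low-amplitude regions of the previous frame's quantized envelope, the combined weight surface is carried into the coefficient domain with the same 2D DCT assumed for the quantization targets in the Embodiment 2 sketch, and the scalar step width is made proportional to the standard deviation divided by the sensitivity. The Gaussian harmonic weighting, the product combination of the two weights, and the proportionality rule are illustrative choices, not the patent's prescription.

```python
import numpy as np
from scipy.fft import dct

FS, NFFT = 8000, 256    # assumed sampling rate and FFT size

def harmonic_weight_row(f0, n_bins, width_hz=40.0):
    """Weight along frequency that peaks at the harmonics of f0 (weight calculator 501)."""
    freqs = np.arange(n_bins) * FS / NFFT
    harmonics = np.arange(f0, FS / 2, f0)
    d = np.min(np.abs(freqs[:, None] - harmonics[None, :]), axis=1)
    return 1.0 + np.exp(-(d / width_hz) ** 2)

def amplitude_weight(prev_quantized_surface):
    """Weight that grows where the envelope amplitude is small (weight calculator 502)."""
    e = prev_quantized_surface
    return e.max() / (e + 1e-6)

def design_scalar_quantizer(target_surface, f0_per_frame, prev_quantized_surface,
                            variances, keep_t=4, keep_f=12):
    """Quantization sensitivities and step widths for the kept 2D DCT coefficients.
    variances is the per-coefficient statistic from the statistic accumulator 503."""
    w_f0 = np.vstack([harmonic_weight_row(f0, target_surface.shape[1])
                      for f0 in f0_per_frame])
    weight_surface = w_f0 * amplitude_weight(prev_quantized_surface)
    # carry the weights into the same parameter space as the quantization targets
    w_coeffs = dct(dct(weight_surface, axis=0, norm="ortho"), axis=1, norm="ortho")
    sensitivity = np.abs(w_coeffs[:keep_t, :keep_f]) + 1e-6
    step = np.sqrt(variances) / sensitivity     # smaller step where sensitivity is higher
    return sensitivity, step

def scalar_quantize(coeffs, step):
    """Quantizer 505: uniform scalar quantization with per-coefficient step widths."""
    return np.round(coeffs / step) * step
```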
(実施の形態 8 ) (Embodiment 8)
図 10 は、 本発明の実施の形態 8に係る音声符号化装置のパラメ一夕量子化 器の内部構成を示すブロック図である。  FIG. 10 is a block diagram showing the internal configuration of the parameter quantizer in the speech coding apparatus according to Embodiment 8 of the present invention.
なお、 本実施の形態に係るパラメ一夕量子化器 3 0 2は、 上記実施の形態 2 から 5のいずれかに示したものである。  Note that the parameter quantizer 302 according to the present embodiment is as described in any of Embodiments 2 to 5 above.
The error scale determiner 511 adaptively determines a quantization error scale on the spectrum envelope using the fundamental frequency information. The error scale determiner 512 adaptively determines a quantization error scale on the spectrum envelope using the spectrum envelope information. The error scale synthesizer 513 combines the error scales obtained by the error scale determiner 511 and the error scale determiner 512 into a single error scale.
The codebook 514 stores quantized values. The spectrum envelope constructor 515 converts the quantized values stored in the codebook 514 into a spectrum envelope surface. The spectrum envelope constructor 516 converts the quantization target values into a spectrum envelope surface.
The error calculator 517 calculates, based on the error scale output from the error scale synthesizer 513, the error between the spectrum envelope surface constructed by the spectrum envelope constructor 515 and the spectrum envelope surface constructed by the spectrum envelope constructor 516.
The code selector 518 selects from the codebook 514 the code corresponding to the quantized value that minimizes this error, and outputs it.
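The search described above might look roughly like the following sketch, assuming a weighted squared-error measure on the time-frequency plane; the function names are hypothetical and the surface-construction transform is passed in as a callable.

```python
import numpy as np

def search_codebook(target_params, codebook, to_surface, error_scale):
    # Select the codebook entry whose reconstructed envelope surface is closest
    # to the surface built from the quantization target values, under a
    # weighted squared-error measure on the time-frequency plane.
    target_surface = to_surface(target_params)
    best_code, best_err = 0, np.inf
    for code, candidate in enumerate(codebook):
        candidate_surface = to_surface(candidate)
        err = np.sum(error_scale * (target_surface - candidate_surface) ** 2)
        if err < best_err:
            best_code, best_err = code, err
    return best_code        # the index stands in for the transmitted code
```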
In this way, by calculating the error between spectrum envelope surfaces on the time-frequency plane using an error scale adapted to the fundamental frequency and the spectrum envelope information, both the objective quantization distortion and the perceptual distortion of the synthesized speech signal can be reduced.
In the present embodiment, the quantization error scale on the spectrum envelope is determined from both the fundamental frequency and the spectrum envelope information; however, the error may also be calculated with an error scale determined from only one of them.
(Embodiment 9)
FIG. 11 is a block diagram showing the internal configuration of the parameter quantizer of a speech coding apparatus according to Embodiment 9 of the present invention.
The parameter quantizer 302 according to the present embodiment is the one described in any of Embodiments 2 to 5 above.
The error function determiner 521 adaptively determines a quantization error weight function on the spectrum envelope using the fundamental frequency information. The error function determiner 522 adaptively determines a quantization error weight function on the spectrum envelope using the spectrum envelope information. The error function synthesizer 523 combines the quantization error weight functions obtained by the error function determiner 521 and the error function determiner 522 into a single error function. The error function converter 524 converts the quantization error weight function output from the error function synthesizer 523, thereby defining an error measure on the quantization parameters.
The codebook 525 stores quantized values. The error calculator 526 calculates the error between the quantization target values and the quantized values stored in the codebook 525, based on the error measure output from the error function converter 524.
The code selector 527 selects from the codebook 525 the code corresponding to the quantized value that minimizes this error, and outputs it.
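A sketch of this reduced-complexity search, under the assumption that the envelope-domain weight can be approximated by a diagonal (per-parameter) weight obtained once before the search; the names and the simplification are illustrative:

```python
import numpy as np

def parameter_error_weights(envelope_weights, transform):
    # Diagonal approximation (assumption): map the envelope-domain weight
    # surface once into per-parameter weights.
    return np.abs(transform(envelope_weights)) + 1e-9

def search_codebook_in_parameter_space(target_params, codebook, param_weights):
    # Weighted nearest-neighbour search carried out directly on the parameter
    # vectors, so no envelope surface is rebuilt inside the search loop.
    diffs = np.asarray(codebook, dtype=float) - np.asarray(target_params, dtype=float)
    errors = np.sum(param_weights * diffs ** 2, axis=1)
    return int(np.argmin(errors))
```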
In this way, by calculating the error between quantization parameters using an error measure adapted to the fundamental frequency and the spectrum envelope information, both the objective quantization distortion and the perceptual distortion of the synthesized speech signal can be reduced with a small amount of processing.
In the present embodiment, the quantization error weight function on the spectrum envelope is determined from both the fundamental frequency and the spectrum envelope information; however, the error may also be calculated with a quantization error weight function determined from only one of them.
(Embodiment 10)
FIG. 12 is a block diagram showing the internal configuration of the spectrum envelope construction unit of a speech decoding apparatus according to Embodiment 10 of the present invention.
Since the configuration of the speech decoding apparatus according to the present embodiment is the same as that of the speech decoding apparatus shown in FIG. 2 of Embodiment 1, its description is omitted.
Here, as described in Embodiment 2 above, the speech coding/decoding method using an orthogonal transform achieves information compression by not transmitting, on the coding side, high-frequency components that are not perceptually important. In the present embodiment, therefore, the decoding side generates the envelope surface by filling in the parameters that were not received with parameter values statistically obtained in advance.
In the spectrum envelope construction unit 202 of FIG. 12, the parameter accumulator 601 stores parameter values statistically obtained in advance for each parameter that is not subject to quantization. The spectrum envelope generator 602 generates a spectrum envelope surface based on the input spectrum envelope information.
In this way, by using statistically obtained values for the parameters that are not subject to quantization, a more accurate spectrum envelope surface can be restored than when arbitrary values are used.
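The completion step on the decoding side could be sketched as follows, assuming the untransmitted high-order coefficients are simply replaced by mean values gathered offline and that the received low-order coefficients come first (the array layout and names are assumptions):

```python
import numpy as np

def complete_parameters(received, stored_means):
    # Fill the untransmitted (high-order) coefficients with statistics
    # obtained in advance; the transmitted low-order coefficients come first.
    full = np.asarray(stored_means, dtype=float).copy()
    full[:len(received)] = received
    return full

def reconstruct_envelope(full_params, inverse_transform, shape):
    # Rebuild the spectrum envelope surface from the completed parameter set.
    return inverse_transform(full_params.reshape(shape))
```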
As described above, according to the speech coding apparatus, speech decoding apparatus, and speech coding/decoding method of the present invention, the spectrum envelope information and the sound source information are completely separated, so that a speech coding/decoding process unaffected by the quantization accuracy of the spectrum envelope information is realized; moreover, highly efficient speech coding/decoding is realized through a high-efficiency quantization method for the spectrum envelope information that is effective in an analysis-synthesis model. High-quality speech decoding can therefore be achieved even when information is transmitted at a low bit rate.
This specification is based on Japanese Patent Application No. 11-275119, filed on September 28, 1999, the contents of which are incorporated herein.
INDUSTRIAL APPLICABILITY
The present invention is suitable for use in a base station apparatus or a communication terminal apparatus of a wireless communication system that performs wireless communication of speech data.

Claims

CLAIMS
1. A speech coding apparatus comprising: speech analysis means for extracting a fundamental frequency and spectrum envelope information from an input speech signal; fundamental frequency quantization means for quantizing the extracted fundamental frequency; matrix generation means for generating a spectrum envelope surface from the extracted spectrum envelope information; spectrum envelope quantization means for quantizing the generated spectrum envelope surface; and multiplexing means for multiplexing and transmitting the quantized value of the spectrum envelope surface and the quantized value of the fundamental frequency.
2. The speech coding apparatus according to claim 1, wherein the spectrum envelope quantization means comprises: two-dimensional orthogonal transform means for performing a two-dimensional orthogonal transform on the spectrum envelope surface in the time-axis direction and the frequency-axis direction; and parameter quantization means for quantizing the transform coefficients output from the two-dimensional orthogonal transform means.
3. The speech coding apparatus according to claim 1, wherein the spectrum envelope quantization means comprises: model application means for modeling the spectrum envelope surface and extracting model parameters; and parameter quantization means for quantizing the extracted model parameters.
4. The speech coding apparatus according to claim 1, wherein the spectrum envelope quantization means comprises: time-axis orthogonal transform means for performing an orthogonal transform on the spectrum envelope surface in the time-axis direction; model application means for applying, to the orthogonally transformed time-axis transform coefficients, a model corresponding to the order on the time axis and extracting parameters; and parameter quantization means for quantizing the extracted model parameters.
5. The speech coding apparatus according to claim 4, further comprising frequency-axis orthogonal transform means for performing an orthogonal transform in the frequency-axis direction on the time-axis transform coefficients to which the model is not applied, wherein the parameter quantization means quantizes the extracted model parameters and the transform coefficients output from the frequency-axis orthogonal transform means.
6. The speech coding apparatus according to claim 3, wherein the model application means comprises: model parameter estimation means for applying a model to an input signal and extracting parameters; and model error estimation means for estimating the analysis distortion caused when the model is applied by the model parameter estimation means.
7. The speech coding apparatus according to claim 2, wherein the parameter quantization means comprises: weight calculation means for determining a quantization sensitivity for each quantization target value using at least one of the fundamental frequency information and the spectrum envelope information; statistic accumulation means for accumulating statistics obtained in advance for each quantization target value; quantization generation means for designing a quantizer from the quantization sensitivities for the quantization target values output from the weight calculation means and the statistics accumulated in the statistic accumulation means; and quantization means for quantizing the quantization target values based on the design result of the quantization generation means.
8. The speech coding apparatus according to claim 2, wherein the parameter quantization means comprises: error scale determination means for adaptively determining a quantization error scale on the spectrum envelope using at least one of the fundamental frequency information and the spectrum envelope information; first spectrum envelope construction means for converting the quantized values stored in a codebook into a spectrum envelope surface; second spectrum envelope construction means for converting the quantization target values into a spectrum envelope surface; error calculation means for calculating, based on the error scale, the error between the spectrum envelope surface constructed by the first spectrum envelope construction means and the spectrum envelope surface constructed by the second spectrum envelope construction means; and code selection means for selecting, from the codebook, the code corresponding to the quantized value that minimizes the error.
9. The speech coding apparatus according to claim 2, wherein the parameter quantization means comprises: error function determination means for adaptively determining a quantization error weight function on the spectrum envelope using at least one of the fundamental frequency information and the spectrum envelope information; error function conversion means for converting the quantization error weight function to define an error measure on the quantization parameters; error calculation means for calculating, based on the error measure, the error between the quantization target values and the quantized values stored in a codebook; and code selection means for selecting, from the codebook, the code corresponding to the quantized value that minimizes the error.
10. A speech decoding apparatus comprising: demultiplexing means for separating a code sequence transmitted from the speech coding apparatus according to claim 1 into a code indicating the quantized value of the spectrum envelope information and a code indicating the quantized value of the fundamental frequency; spectrum envelope construction means for reconstructing a quantized spectrum envelope surface from the received spectrum envelope information; and speech synthesis means for cutting out the reconstructed spectrum envelope surface based on the fundamental frequency information and synthesizing decoded speech.
11. The speech decoding apparatus according to claim 10, wherein the spectrum envelope construction means comprises: parameter accumulation means for accumulating parameter values statistically obtained in advance for each parameter that is not subject to quantization; and spectrum envelope generation means for generating a spectrum envelope surface based on the input spectrum envelope information.
12. A speech coding/decoding method, wherein, on the coding side, a fundamental frequency and spectrum envelope information are extracted from an input speech signal, the extracted fundamental frequency is quantized, a spectrum envelope surface is generated from the extracted spectrum envelope information and quantized, and the quantized value of the spectrum envelope surface and the quantized value of the fundamental frequency are multiplexed and transmitted; and, on the decoding side, a received code sequence is separated into a code indicating the quantized value of the spectrum envelope information and a code indicating the quantized value of the fundamental frequency, a quantized spectrum envelope surface is reconstructed from the received spectrum envelope information, and the reconstructed spectrum envelope surface is cut out based on the fundamental frequency information to synthesize decoded speech.
13. A machine-readable recording medium recording a speech coding program for causing a computer to execute: a procedure of extracting a fundamental frequency and spectrum envelope information from an input speech signal; a procedure of quantizing the extracted fundamental frequency; a procedure of generating a spectrum envelope surface from the extracted spectrum envelope information; a procedure of quantizing the generated spectrum envelope surface; and a procedure of multiplexing the quantized value of the spectrum envelope surface and the quantized value of the fundamental frequency.
14. A machine-readable recording medium recording a speech decoding program for causing a computer to execute: a procedure of separating a received code sequence into a code indicating the quantized value of spectrum envelope information and a code indicating the quantized value of a fundamental frequency; a procedure of reconstructing a quantized spectrum envelope surface from the received spectrum envelope information; and a procedure of cutting out the reconstructed spectrum envelope surface based on the fundamental frequency information to synthesize decoded speech.
PCT/JP2000/006542 1999-09-28 2000-09-25 Voice encoder, voice decoder, and voice encoding and decoding method WO2001024164A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP00961220A EP1132891A1 (en) 1999-09-28 2000-09-25 Voice encoder, voice decoder, and voice encoding and decoding method
AU73212/00A AU7321200A (en) 1999-09-28 2000-09-25 Voice encoder, voice decoder, and voice encoding and decoding method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP11/275119 1999-09-28
JP27511999A JP3360046B2 (en) 1999-09-28 1999-09-28 Audio encoding device, audio decoding device, and audio codec decoding method

Publications (1)

Publication Number Publication Date
WO2001024164A1 true WO2001024164A1 (en) 2001-04-05

Family

ID=17550984

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2000/006542 WO2001024164A1 (en) 1999-09-28 2000-09-25 Voice encoder, voice decoder, and voice encoding and decoding method

Country Status (4)

Country Link
EP (1) EP1132891A1 (en)
JP (1) JP3360046B2 (en)
AU (1) AU7321200A (en)
WO (1) WO2001024164A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10004506B2 (en) 2011-05-27 2018-06-26 Ethicon Llc Surgical system

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100738109B1 (en) * 2006-04-03 2007-07-12 삼성전자주식회사 Method and apparatus for quantizing and inverse-quantizing an input signal, method and apparatus for encoding and decoding an input signal
JP5799824B2 (en) * 2012-01-18 2015-10-28 富士通株式会社 Audio encoding apparatus, audio encoding method, and audio encoding computer program

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH01227200A (en) * 1988-03-07 1989-09-11 Fujitsu Ltd Formant extracting device

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH01227200A (en) * 1988-03-07 1989-09-11 Fujitsu Ltd Formant extracting device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HIDEKI KAWAHARA: "A high quality speech analysis, modification and synthesis method STRAIGHT: Insights on auditory scene analysis", ACOUSTICAL SOCIETY OF JAPAN, vol. 9, 17 September 1997 (1997-09-17), pages 189 - 192, XP002933376 *

Also Published As

Publication number Publication date
EP1132891A1 (en) 2001-09-12
AU7321200A (en) 2001-04-30
JP3360046B2 (en) 2002-12-24
JP2001100798A (en) 2001-04-13

Similar Documents

Publication Publication Date Title
US10224054B2 (en) Signal processing apparatus and signal processing method, encoder and encoding method, decoder and decoding method, and program
JP3881943B2 (en) Acoustic encoding apparatus and acoustic encoding method
JP4550289B2 (en) CELP code conversion
US8949119B2 (en) Signal processing apparatus and signal processing method, encoder and encoding method, decoder and decoding method, and program
JP5413839B2 (en) Encoding device and decoding device
JP5975243B2 (en) Encoding apparatus and method, and program
JP5535241B2 (en) Audio signal restoration apparatus and audio signal restoration method
KR100882771B1 (en) Perceptually Improved Enhancement of Encoded Acoustic Signals
MX2013009305A (en) Noise generation in audio codecs.
JP3881946B2 (en) Acoustic encoding apparatus and acoustic encoding method
JP2009069856A (en) Method for estimating artificial high band signal in speech codec
US20100174532A1 (en) Speech encoding
MX2013009303A (en) Audio codec using noise synthesis during inactive phases.
JP2009515212A (en) Audio compression
JP2003323199A (en) Device and method for encoding, device and method for decoding
EP1782419A1 (en) Scalable audio coding
KR20070061843A (en) Scalable encoding apparatus and scalable encoding method
JP4603485B2 (en) Speech / musical sound encoding apparatus and speech / musical sound encoding method
EP3550563B1 (en) Encoder, decoder, encoding method, decoding method, and associated programs
JPWO2010016270A1 (en) Quantization apparatus, encoding apparatus, quantization method, and encoding method
TW202215417A (en) Multi-channel signal generator, audio encoder and related methods relying on a mixing noise signal
JP4786183B2 (en) Speech decoding apparatus, speech decoding method, program, and recording medium
CN115171709B (en) Speech coding, decoding method, device, computer equipment and storage medium
JP2004302259A (en) Hierarchical encoding method and hierarchical decoding method for sound signal
JP3360046B2 (en) Audio encoding device, audio decoding device, and audio codec decoding method

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

WWE Wipo information: entry into national phase

Ref document number: 09856096

Country of ref document: US

WWE Wipo information: entry into national phase

Ref document number: 2000961220

Country of ref document: EP

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWP Wipo information: published in national office

Ref document number: 2000961220

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWW Wipo information: withdrawn in national office

Ref document number: 2000961220

Country of ref document: EP