JP3416331B2

JP3416331B2 - Audio decoding device

Info

Publication number: JP3416331B2
Application number: JP10665195A
Authority: JP
Inventors: 田幸司吉; 中直也田
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 1995-04-28
Filing date: 1995-04-28
Publication date: 2003-06-16
Anticipated expiration: 2018-06-16
Also published as: JPH08305398A

Description

Detailed Description of the Invention

【０００１】[0001]

【Field of Industrial Application】 The present invention relates to a speech decoding apparatus for mobile communication systems and similar systems in which code errors occur on the transmission path. On the decoding side, the apparatus interpolates the speech signal for any frame in which a code error has been detected.

【０００２】[0002]

【Prior Art】 In recent years, digital mobile communication systems have rapidly been put into practical use. In such systems, when a code error occurs on the transmission path and the detected error cannot be fully corrected by error correction, frame interpolation is commonly used, in which speech is decoded using past speech coding parameters.

[0003] FIG. 4 shows the configuration of a conventional speech decoding apparatus of this kind. In FIG. 4, 1 is a parameter separation/interpolation determiner that separates the received data, which is transmitted from the encoder and represents the coded information for each frame (a fixed-length section of speech), into the individual speech coding parameters and determines whether the frame is an interpolation frame; 2 is a previous-frame coding parameter holder that holds the coding parameters used for decoding in the previous frame; 3 is a switch that selects the coding parameters used in the frame according to whether it is an interpolation frame; and 4 is a speech decoder that decodes the speech of the frame using the coding parameters selected by the switch 3 and the frame interpolation information.

[0004] The operation of the speech decoding apparatus configured as above is described below. First, the parameter separation/interpolation determiner 1 receives the data transmitted from the encoder and outputs the coding parameters of the frame together with frame interpolation information indicating whether the frame is an interpolation frame. If the frame is not an interpolation frame, the switch 3 is set to side a and the speech decoder 4 performs normal speech decoding using the coded data of the frame. If the frame is an interpolation frame, the switch 3 is set to side b and speech decoding is performed using the coding parameters of the previous frame stored in the previous-frame coding parameter holder 2.
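The conventional switching behavior described above can be sketched as follows. This is only an illustration: the function names and the representation of a frame as a `(parameters, is_interpolation_frame)` pair are assumptions made for the sketch, not part of the patent.

```python
def select_coding_parameters(received, previous, is_interpolation_frame):
    """Model of switch 3: side a passes the received parameters, side b the held ones."""
    return previous if is_interpolation_frame else received


def run_conventional_decoder(frames):
    """Return the parameters used for each (parameters, is_interpolation_frame) pair."""
    previous = None  # models holder 2; empty until a first frame has been decoded
    used = []
    for received, is_interpolation_frame in frames:
        params = select_coding_parameters(received, previous, is_interpolation_frame)
        used.append(params)
        previous = params  # holder 2 keeps whatever was used for decoding
    return used
```

In this sketch a run of consecutive interpolation frames keeps reusing the same held parameters unchanged, which is the behavior the invention sets out to improve.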

【０００５】[0005]

【Problems to Be Solved by the Invention】 However, because the conventional speech decoding apparatus described above decodes an interpolation frame using the coding parameters of the previous frame unchanged, the quality of the decoded speech in the interpolation frame is degraded.

[0006] Specifically, in CELP (Code Excited Linear Prediction), the mainstream method for low-bit-rate speech coding, the adaptive codebook excitation for an interpolation frame is generated using the parameter that represents the pitch period of the previous frame (the lag). That lag may not be the optimum value for the interpolation frame, in which case an adaptive codebook excitation with proper pitch periodicity is not generated.

[0007] Furthermore, in a configuration where the excitation gain parameters are coded in dependence on the generated adaptive codebook excitation, using the excitation gain parameters of the previous frame unchanged cannot produce an excitation with the proper gain, because the adaptive codebook in the decoder no longer matches the one in the encoder.

[0008] In addition, when the encoder has a voiced coding mode and an unvoiced coding mode, the conventional apparatus decodes the interpolation frame in the same mode and with the same coding parameters as the frame preceding it. However, a frame coded in unvoiced mode can be strongly voiced and, conversely, a frame coded in voiced mode can be strongly unvoiced. In such cases the quality of the decoded speech in the interpolation frame is degraded.

[0009] The present invention solves these conventional problems. Its object is to provide a superior speech decoding apparatus capable of suppressing perceptual degradation of the decoded speech in interpolation frames.

【００１０】[0010]

【Means for Solving the Problems】 To achieve the above object, the present invention has, as a first configuration: a parameter separation/interpolation determiner that separates the received data, which is transmitted from the encoder and represents the coded information for each frame (a fixed-length section of speech), into the individual coding parameters and determines whether the frame is an interpolation frame; a previous-frame coding parameter holder that holds the coding parameters used for decoding in the previous frame; an excitation holder that holds a fixed-length section of the driving excitation signal of the speech decoded up to the previous frame; an interpolation pitch period determiner that uses the excitation signal from the excitation holder and the plurality of pitch period parameters of the previous frame from the previous-frame coding parameter holder to determine, from among those pitch period parameters, the one best suited to frame interpolation of the frame; a switch that selects the coding parameters used in the frame according to whether it is an interpolation frame; and a speech decoder that decodes the speech of the frame using the coding parameters selected by the switch and the frame interpolation information.

[0011] As a second configuration, the invention has: a parameter separation/interpolation determiner that separates the received data, which is transmitted from the encoder and represents the coded information for each frame (a fixed-length section of speech), into the individual speech coding parameters and determines whether the frame is an interpolation frame; a current-frame coding parameter determiner that determines, from the frame interpolation information, the coding parameters used for speech decoding of the frame; an adaptive codebook that stores past excitation; a fixed codebook that holds fixed excitation vectors such as noise excitation; an excitation gain controller that, during frame interpolation, controls the gains of the signals generated from the adaptive codebook and the fixed codebook using those signals; a multiplier that multiplies the signals generated from the adaptive codebook and the fixed codebook by the gains; and an adder that generates the excitation.

[0012] As a third configuration, in a speech decoding apparatus having coding modes that correspond respectively to voiced sections and unvoiced sections of speech, the invention has: a parameter separation/interpolation determiner that separates the received data, which is transmitted from the encoder and represents the coded information for each frame (a fixed-length section of speech), into the individual speech coding parameters and coding mode information and determines whether the frame is an interpolation frame; a previous-frame coding parameter holder that holds the coding parameters and coding mode used for decoding in the previous frame; an excitation holder that holds a fixed-length section of the excitation signal of the speech decoded up to the previous frame; an interpolation pitch period determiner that uses the excitation signal output from the excitation holder and, when the previous frame is in the voiced coding mode, the plurality of pitch period parameters of the previous frame to determine the coding mode for frame interpolation of the frame and, when that coding mode is the voiced coding mode, to determine the optimum pitch period parameter from among the plurality of pitch period parameters of the previous frame; a mode information switch that selects, according to the frame interpolation information, the source of the coding mode information used when decoding the frame; a coding parameter switch that selects the coding parameters used in the frame according to whether it is an interpolation frame; a mode switch that switches the coding mode according to the mode information output by the mode information switch; and a voiced mode decoder and an unvoiced mode decoder that decode the speech of the frame using the coding parameters selected by the coding parameter switch and the frame interpolation information.

【００１３】[0013]

【Operation】 With the first configuration, the pitch period parameter used during frame interpolation is selected as the optimum one from among the plurality of pitch period parameters of the previous frame, so an excitation with more appropriate pitch periodicity can be generated.

[0014] With the second configuration, the excitation is generated by controlling the gains so that the power ratio between the adaptive codebook excitation, which is generated from past excitation and is correlated with it, and the fixed codebook excitation, which consists of noise components and the like and is uncorrelated with past excitation, is either a fixed ratio or a ratio that varies with the correlation of the past excitation. This allows speech decoding in interpolation frames with less perceptual degradation.

[0015] With the third configuration, the interpolation pitch period determiner evaluates how voiced or unvoiced the previous frame is from the correlation of the past excitation. The interpolation frame is decoded in voiced mode if the voicing is strong and in unvoiced mode otherwise, which allows speech decoding with less perceptual degradation.

【００１６】[0016]

【Embodiments】

(Embodiment 1) Embodiments of the present invention are described below with reference to the drawings. FIG. 1 shows the configuration of a speech decoding apparatus according to the first embodiment of the present invention. In FIG. 1, 101 is a parameter separation/interpolation determiner that separates the received data, which is transmitted from the encoder and represents the coded information for each fixed-length section of speech (frame), into the individual speech coding parameters and determines whether the frame is an interpolation frame; 102 is a previous-frame coding parameter holder that holds the coding parameters used for decoding in the previous frame; 103 is an excitation holder that holds a fixed-length section of the excitation signal of the speech decoded up to the previous frame; 104 is an interpolation pitch period determiner that uses the excitation from the excitation holder 103 and the plurality of pitch periods (or equivalent parameters) of the previous frame from the previous-frame coding parameter holder to determine the pitch period (or equivalent parameter) best suited to frame interpolation of the frame; 105 is a switch that selects the coding parameters used in the frame according to whether it is an interpolation frame; and 106 is a speech decoder that decodes the speech of the frame using the coding parameters selected by the switch 105 and the frame interpolation information.

[0017] The operation of the speech decoding apparatus configured as above is described below. First, the parameter separation/interpolation determiner 101 receives the data transmitted from the encoder, separates it into the coding parameters of the frame, determines whether the frame is an interpolation frame, and outputs the coding parameters and the frame interpolation information. Here, an interpolation frame is a frame whose received data is missing or in which an error caused by a transmission path error has been detected, so that decoding the received data as it stands would produce obvious speech quality degradation. If the frame is not an interpolation frame, the switch 105 is set to side a and the speech decoder 106 performs normal speech decoding using the coded data of the frame.

[0018] If the frame is an interpolation frame, the interpolation pitch period determiner 104 first determines the pitch period (or equivalent parameter) best suited to frame interpolation of the frame, using the excitation output by the excitation holder 103 and the plurality of pitch periods (or equivalent parameters) of the previous frame from the previous-frame coding parameter holder 102.

[0019] An example of the determination method follows. First, let lg[is] (is = 0, ..., NS-1) be the pitch periods obtained by decoding the pitch period parameters LAG[is] of the previous frame. Here NS is the number of pitch period parameters contained in the previous frame. When the speech decoder uses the CELP method, a frame is divided into several subframes and a pitch period parameter (lag) is coded for each, so NS equals the number of subframes. Then, using the past excitation held up to the previous frame, d[n] (n = 0, ..., L-1, where L is the length of the stored excitation), the correlation coefficient r[is] is calculated for each lg[is] according to the following equation.

【００２０】[0020]

【数１】 [Equation 1]

Then the pitch period lg[is_max] for the value of is that maximizes r[is] (denoted is_max) is taken as the optimum pitch period of the interpolation frame, and all values of the pitch period parameter LAG[is] in the previous-frame coding parameter holder 102 are replaced with LAG[is_max]. If LAG[is] is coded with fractional delay resolution, r[is] for LAG[is] can be obtained by computing the correlation coefficient r at several samples before and after the delay value lg[is] and interpolating.
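The selection step can be sketched as follows. Equation 1 is not reproduced in this text, so the normalized cross-correlation used here is an assumed, commonly used form rather than the patent's exact definition; the function names are likewise hypothetical. Only integer lags are handled.

```python
import math


def normalized_correlation(d, lag):
    """Assumed form of r: correlation of d[n] with d[n - lag], normalized by both energies."""
    if lag <= 0 or lag >= len(d):
        return 0.0
    idx = range(lag, len(d))
    num = sum(d[n] * d[n - lag] for n in idx)
    e0 = sum(d[n] * d[n] for n in idx)
    e1 = sum(d[n - lag] * d[n - lag] for n in idx)
    den = math.sqrt(e0 * e1)
    return num / den if den > 0.0 else 0.0


def select_interpolation_lag(d, lags):
    """Return (is_max, r): the index of the lag with the largest r[is], and all r[is]."""
    r = [normalized_correlation(d, lg) for lg in lags]
    is_max = max(range(len(lags)), key=lambda i: r[i])
    return is_max, r
```

Given the stored excitation `d` and the decoded subframe lags `lg`, a caller would replace every held lag with `lg[is_max]` before decoding the interpolation frame.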

[0021] Next, the switch 105 is set to side b, the previous-frame coding parameters with the pitch period parameter replaced by the optimum one are output, and the speech decoder 106 uses them to decode the speech of the interpolation frame. The previous-frame coding parameter holder 102 and the excitation holder 103 are updated every frame with the coding parameters and the excitation used for decoding in the current frame, for use in processing the next frame.

[0022] As described above, according to this embodiment, the pitch period parameter used during frame interpolation is selected as the optimum one from among the plurality of pitch period parameters of the previous frame, so an excitation with more appropriate pitch periodicity can be generated.

[0023] (Embodiment 2) FIG. 2 shows the configuration of a speech decoding apparatus according to the second embodiment of the present invention. In FIG. 2, 201 is a parameter separation/interpolation determiner that separates the received data, which is transmitted from the encoder and represents the coded information for each frame (a fixed-length section of speech), into the individual coding parameters and determines whether the frame is an interpolation frame; 202 is a current-frame coding parameter determiner that determines, from the frame interpolation information, the coding parameters used for speech decoding of the frame; 203 is an adaptive codebook that stores past excitation; 204 is a fixed codebook that holds fixed excitation vectors such as noise excitation; 205 is a gain decoder that decodes the gain code among the coding parameters into gain values; 206 is an excitation gain controller that, during frame interpolation, controls the gains using the signals generated from the adaptive codebook 203 and the fixed codebook 204; 211 is a switch that selects the gain input; 207 is a multiplier that multiplies the signals generated from the adaptive codebook 203 and the fixed codebook 204 by the gains; 208 is an adder that generates the excitation; 209 is an LPC coefficient decoder that decodes the LPC coefficients; and 210 is a synthesis filter that takes the excitation as input and synthesizes the decoded speech by LPC synthesis.

[0024] The operation of the speech decoding apparatus configured as above is described below. First, the parameter separation/interpolation determiner 201 receives the data transmitted from the encoder, separates it into the coding parameters of the frame, determines whether the frame is an interpolation frame, and outputs the coding parameters and the frame interpolation information. The cases where the frame is and is not an interpolation frame are described separately below.

[0025] First, if the frame is not an interpolation frame, the current-frame coding parameter determiner 202 outputs the parameters from the parameter separation/interpolation determiner 201 unchanged. The adaptive codebook 203 generates the adaptive codebook excitation using the pitch period parameter (LAG), and the fixed codebook 204 generates the fixed codebook excitation using the fixed codebook index (IDX). With the switch 211 set to side a, the gains (β, γ) obtained by decoding the gain code (GAIN) in the gain decoder 205 are applied by the multiplier 207 to the adaptive codebook excitation and the fixed codebook excitation respectively, and the adder 208 sums the results to generate the excitation. The synthesis filter 210 then performs speech decoding using the LPC coefficients obtained by the LPC coefficient decoder 209.
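The excitation construction and synthesis filtering in this path can be sketched as follows. The function names, the filter-memory handling, and the sign convention for the LPC coefficients (synthesis through 1/A(z) with A(z) = 1 + Σ a_k z^-k) are assumptions made for the sketch; the patent does not specify them in this section.

```python
def build_excitation(a, c, beta, gamma):
    """Multiplier 207 and adder 208: e[n] = beta * a[n] + gamma * c[n]."""
    return [beta * an + gamma * cn for an, cn in zip(a, c)]


def lpc_synthesis(excitation, lpc, history=None):
    """All-pole synthesis s[n] = e[n] - sum_k lpc[k-1] * s[n-k] (assumed sign convention)."""
    p = len(lpc)
    # hist[0] holds s[n-1], hist[1] holds s[n-2], and so on
    hist = list(history) if history is not None else [0.0] * p
    out = []
    for e in excitation:
        s = e - sum(lpc[k] * hist[k] for k in range(p))
        out.append(s)
        if p > 0:
            hist = [s] + hist[:-1]
    return out
```

For example, `lpc_synthesis(build_excitation(a, c, beta, gamma), lpc)` would yield one block of decoded samples from the two codebook excitations and the decoded gains.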

[0026] Next, the case of an interpolation frame is described. First, the current-frame coding parameter determiner 202 uses the coding parameters it holds from decoding the previous frame and outputs the following: for the LPC code, the code of the previous frame; for the pitch period code, either an arbitrary pitch period code of the previous frame or the optimum pitch period code obtained by the method described in Embodiment 1; and for the fixed codebook index, either the index code of the current frame obtained from the parameter separation/interpolation determiner 201 or an index code with a randomly generated value. The adaptive codebook 203 and the fixed codebook 204 then generate the adaptive codebook excitation a[n] (n = 0, ..., N-1) and the fixed codebook excitation c[n] (n = 0, ..., N-1) respectively, and from these the excitation gain controller 206 obtains the adaptive codebook excitation gain β and the fixed codebook excitation gain γ as follows:

β = Ga · DMPa
γ = (1 − Ga) · (Pow_a / Pow_s) · DMPs

Here Ga is a constant (0 <= Ga <= 1) that sets the power ratio between the adaptive codebook excitation and the fixed codebook excitation that make up the generated excitation; it is set to 0.95, for example. DMPa and DMPs are constants (< 1) that control power attenuation during consecutive interpolation. Pow_a and Pow_s are the powers of the adaptive codebook excitation a[n] and the fixed codebook excitation c[n] respectively, given by the following equation.

【００２７】[0027]

【数２】 [Equation 2]

[0028] With the switch 211 set to side b, the multiplier 207 applies the resulting gains β and γ to the adaptive codebook excitation and the fixed codebook excitation, the two are summed to generate the excitation, and the synthesis filter 210 performs speech synthesis to obtain the decoded speech.
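The gain rule for interpolation frames, β = Ga · DMPa and γ = (1 − Ga) · (Pow_a / Pow_s) · DMPs, can be sketched as follows. Equation 2 is not reproduced in this text, so taking the power as a plain sum of squares is an assumption; because only the ratio Pow_a / Pow_s is used, a common 1/N normalization would not change the result. The default attenuation values of 0.9 and the zero-power guard are also assumptions (the source only states that the constants are below 1).

```python
def power(x):
    """Assumed power measure: sum of squared samples."""
    return sum(v * v for v in x)


def concealment_gains(a, c, ga=0.95, dmp_a=0.9, dmp_s=0.9):
    """Return (beta, gamma) for an interpolation frame.

    a, c         : adaptive and fixed codebook excitations for the frame
    ga           : power-ratio constant Ga, 0 <= ga <= 1 (0.95 is the source's example)
    dmp_a, dmp_s : attenuation constants DMPa and DMPs (< 1; the values are assumed)
    """
    pow_a = power(a)
    pow_s = power(c)
    beta = ga * dmp_a
    # Guard against an all-zero fixed codebook vector (not covered by the source).
    gamma = (1.0 - ga) * (pow_a / pow_s) * dmp_s if pow_s > 0.0 else 0.0
    return beta, gamma
```

With Ga close to 1, most of the concealed excitation comes from the adaptive codebook, that is, from the periodic continuation of past excitation.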

[0029] As described above, according to this embodiment, the excitation is generated by controlling the gains so that the power ratio between the adaptive codebook excitation, which is generated from past excitation and is correlated with it, and the fixed codebook excitation, which consists of noise components and the like and is uncorrelated with past excitation, is a fixed ratio. This allows speech decoding in interpolation frames with suppressed perceptual degradation.

[0030] Instead of fixing the power ratio Ga between the adaptive codebook excitation and the fixed codebook excitation, Ga may be made variable according to the correlation of the past excitation, so that the gains are controlled at a variable ratio. In this case, the correlation coefficient of the past excitation is computed, and Ga is made larger when the correlation is high and smaller when it is low. Generating the excitation in this way allows speech decoding in interpolation frames with still less perceptual degradation.
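One way to make Ga depend on the correlation is sketched below. The source states only that Ga should be larger for high correlation and smaller for low correlation; the linear mapping, the clamping of the correlation to [0, 1], and the bounds `ga_min` and `ga_max` are all assumptions chosen for illustration.

```python
def adaptive_ga(r, ga_min=0.5, ga_max=0.95):
    """Map a correlation coefficient r of the past excitation to a power-ratio constant Ga.

    The mapping is a hypothetical linear interpolation between ga_min and ga_max.
    """
    r = min(max(r, 0.0), 1.0)  # treat negative correlation as no correlation
    return ga_min + (ga_max - ga_min) * r
```

The returned value would take the place of the fixed constant Ga when the interpolation-frame gains β and γ are computed.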

[0031] (Embodiment 3) FIG. 3 shows the configuration of a speech decoding apparatus according to the third embodiment of the present invention. The speech decoding apparatus of this embodiment has coding modes that correspond respectively to voiced sections and unvoiced sections of speech. In FIG. 3, 301 is a parameter separation/interpolation determiner that separates the received data, which is transmitted from the encoder and represents the coded information for each frame (a fixed-length section of speech), into the individual speech coding parameters and coding mode information and determines whether the frame is an interpolation frame; 302 is a previous-frame coding parameter holder that holds the coding parameters and coding mode used for decoding in the previous frame; 303 is an excitation holder that holds a fixed-length section of the excitation signal of the speech decoded up to the previous frame; 304 is an interpolation pitch period determiner that uses the excitation output from the excitation holder 303 and, when the previous frame is in the voiced coding mode, the plurality of pitch periods (or equivalent parameters) of the previous frame to determine the coding mode for frame interpolation of the frame and, in the case of the voiced coding mode, the optimum pitch period (or equivalent parameter); 305 is a mode information switch that selects, according to the frame interpolation information, the source of the coding mode information used when decoding the frame; 306 is a coding parameter switch that selects the coding parameters used in the frame according to whether it is an interpolation frame; 307 is a mode switch that switches the coding mode according to the mode information output by the mode information switch; 308 and 309 are a voiced mode decoder and an unvoiced mode decoder that decode the speech of the frame using the coding parameters selected by the coding parameter switch 306 and the frame interpolation information; 310 is a decoded speech output switch; and 311 is an excitation output switch.

[0032] The operation of the speech decoding apparatus configured as above is described below. First, the parameter separation/interpolation determiner 301 receives the data transmitted from the encoder, separates it into the coding parameters and coding mode information of the frame, determines whether the frame is an interpolation frame, and outputs the coding parameters, the coding mode, and the frame interpolation information. The cases where the frame is and is not an interpolation frame are described separately below.

【００３３】First, when the frame is not an interpolation frame, the coding parameter switch 306 and the mode information switch 305 are both set to side a, and the coding parameters and coding mode of the frame obtained by the parameter separation / interpolation determiner 301 are used as they are to decode the speech according to the coding mode information. That is, speech is decoded by the voiced mode decoder 308 when the coding mode is the voiced mode, and by the unvoiced mode decoder 309 when it is the unvoiced mode. Here, the voiced mode decoder 308 implements the mode provided for the voiced sections of speech, in which at least pitch period information (for example, lag information in CELP coding) is used for decoding. The unvoiced mode decoder 309 implements the mode provided for the unvoiced sections of speech, in which pitch period information is not used for decoding. The decoded speech output switch 310 and the driving sound source output switch 311 are switched according to the decoding mode, so that the decoded speech and the driving sound source of the mode decoded in the current frame are output.

【００３４】Next, the case of an interpolation frame is described. For an interpolation frame, the mode information switch 305 and the coding parameter switch 306 are both set to side b. The coding mode and coding parameters of the current frame are determined by the method described below, using the previous frame coding parameter holder 302, the driving sound source holder 303, and the interpolation pitch period determiner 304. The mode switch 307 is then switched according to the coding mode, and speech is decoded by the voiced mode decoder 308 or the unvoiced mode decoder 309 using the determined coding parameters.
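The switching described for normal and interpolation frames can be sketched as follows. This is only an illustrative sketch, not the implementation of the patent: the names `ReceivedFrame`, `HeldState`, `decode_frame`, the `"lag"` parameter key, the `history` length, and the decoder and `decide_mode` callables are all hypothetical stand-ins for the blocks 301 to 311 of FIG. 3.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Params = Dict[str, float]
# A decoder takes (params, is_interpolation) and returns (speech, excitation).
Decoder = Callable[[Params, bool], Tuple[List[float], List[float]]]


@dataclass
class ReceivedFrame:
    """Output of determiner 301 (field names are hypothetical)."""
    params: Params
    voiced: bool
    is_interpolation: bool  # frame interpolation information


@dataclass
class HeldState:
    """Contents of holders 302 and 303 (field names are hypothetical)."""
    prev_params: Params = field(default_factory=dict)
    prev_voiced: bool = False
    excitation: List[float] = field(default_factory=list)


def decode_frame(
    frame: ReceivedFrame,
    state: HeldState,
    voiced_decoder: Decoder,
    unvoiced_decoder: Decoder,
    decide_mode: Callable[[HeldState], Tuple[bool, Optional[int]]],
    history: int = 320,  # assumed length of the held excitation, in samples
) -> List[float]:
    if frame.is_interpolation:
        # Switches 305 and 306 on side b: mode and parameters come from the
        # held state and the interpolation pitch period determiner (304).
        voiced, lag = decide_mode(state)
        params = dict(state.prev_params)
        if voiced and lag is not None:
            params["lag"] = float(lag)
    else:
        # Switches 305 and 306 on side a: use the received values as they are.
        voiced, params = frame.voiced, dict(frame.params)

    # Mode switch 307 selects the decoder; switches 310 and 311 then take the
    # speech and excitation from whichever decoder actually ran.
    decoder = voiced_decoder if voiced else unvoiced_decoder
    speech, excitation = decoder(params, frame.is_interpolation)

    # Update holders 302 and 303 for the next frame.
    state.prev_params = params
    state.prev_voiced = voiced
    state.excitation = (state.excitation + list(excitation))[-history:]
    return speech
```

Passing the mode decision in as a callable keeps the sketch independent of how the interpolation pitch period determiner is realized.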

【００３５】The method of determining the coding mode and coding parameters in an interpolation frame is as follows. The coding mode of the current frame is based on the coding mode of the previous frame. First, when the previous frame was coded in the voiced mode, the current frame is judged to be voiced. The plural pitch period parameters of the previous frame held in the previous frame coding parameter holder 302 (for a CELP type decoder, the lag of each subframe of the previous frame) are input to the interpolation pitch period determiner 304, which, by the method described in Embodiment 1, determines the optimum interpolation pitch period that maximizes the autocorrelation of the driving sound source; this becomes the pitch period parameter of the current frame. Alternatively, any one of the pitch period parameters of the previous frame may be used as the pitch period parameter of the current frame without searching for the optimum one. Also, even when the previous frame is in the voiced mode, the frame may be judged to be unvoiced if the normalized autocorrelation coefficient at the determined optimum pitch period is at or below a certain threshold, since the pitch periodicity is then considered low. Next, when the previous frame was coded in the unvoiced mode, the voiced / unvoiced mode is judged from the correlation of the past driving sound source. That is, using the past driving sound source held in the driving sound source holder 303, the interpolation pitch period determiner 304 finds the delay value, i.e. the pitch period, that maximizes the autocorrelation coefficient of the driving sound source. When the normalized correlation coefficient at that delay is at or above a certain threshold, that pitch period is output as the optimum value and the interpolation frame is judged to be voiced; conversely, when it is at or below the threshold, the frame is judged to be unvoiced.
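The decision procedure above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the method of Embodiment 1 itself: the function names, the correlation window, the lag search range, the threshold value, and the handling of a coefficient exactly equal to the threshold are all hypothetical choices made for illustration.

```python
import math
from typing import Iterable, Optional, Sequence, Tuple


def normalized_autocorr(excitation: Sequence[float], lag: int, window: int) -> float:
    """Normalized correlation between the last `window` samples of the held
    excitation and the same span delayed by `lag` samples."""
    n = len(excitation)
    if lag <= 0 or window <= 0 or lag + window > n:
        raise ValueError("lag and window must fit inside the held excitation")
    cur = excitation[n - window:]
    past = excitation[n - window - lag:n - lag]
    num = sum(a * b for a, b in zip(cur, past))
    den = math.sqrt(sum(a * a for a in cur) * sum(b * b for b in past))
    return num / den if den > 0.0 else 0.0


def best_lag(excitation: Sequence[float], candidates: Iterable[int],
             window: int) -> Tuple[int, float]:
    """Return (lag, coefficient) for the candidate with the highest correlation."""
    scored = ((lag, normalized_autocorr(excitation, lag, window)) for lag in candidates)
    return max(scored, key=lambda pair: pair[1])


def decide_interpolation_mode(
    prev_voiced: bool,
    prev_lags: Sequence[int],          # e.g. per-subframe lags of the previous frame
    excitation: Sequence[float],       # past driving sound source (holder 303)
    window: int = 40,                  # assumed correlation window
    lag_range: Iterable[int] = range(20, 148),  # assumed pitch search range
    threshold: float = 0.5,            # assumed threshold
    recheck_voiced: bool = False,      # optional variant for voiced frames
) -> Tuple[str, Optional[int]]:
    if prev_voiced:
        # Choose among the previous frame's own pitch period parameters.
        lag, coeff = best_lag(excitation, prev_lags, window)
        if recheck_voiced and coeff <= threshold:
            return "unvoiced", None    # periodicity considered low
        return "voiced", lag
    # Previous frame unvoiced: search the whole range on the past excitation.
    lag, coeff = best_lag(excitation, lag_range, window)
    if coeff >= threshold:             # equality treated as voiced here (assumption)
        return "voiced", lag
    return "unvoiced", None
```

For a strictly periodic excitation with a period of 40 samples and previous-frame lags of 38, 40 and 42, the sketch selects 40, because the delayed span then lines up with the most recent one.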

【００３６】As described above, according to this embodiment, the coding mode of the current frame is determined and its mode information is output, and, in the case of the voiced mode, the previous frame coding parameters, including the optimum pitch period, are output as the decoding parameters of the current frame.

【００３７】Instead of the interpolation frame coding mode decision described above, the following may be used. Regardless of whether the previous frame is in the voiced coding mode or the unvoiced coding mode, the interpolation pitch period determiner 304 finds the pitch period that maximizes the autocorrelation of the past decoded driving sound source. When the normalized correlation coefficient at that pitch period is at or above a certain threshold, that pitch period is output as the optimum value and the interpolation frame is judged to be voiced; conversely, when it is at or below the threshold, the frame is judged to be unvoiced.

【００３８】[0038]

[Effects of the Invention] As is apparent from the first embodiment, the present invention selects the optimum pitch period parameter for frame interpolation from among the plural pitch period parameters of the previous frame, and can thereby generate a driving sound source with more appropriate pitch periodicity.

【００３９】Further, as is apparent from the second embodiment, the driving sound source is generated by controlling the gains so that the power ratio between the adaptive codebook sound source, which is generated from the past driving sound source and is correlated with it, and the fixed codebook sound source, such as a noise component, which is uncorrelated with the past driving sound source, is either a fixed ratio or a ratio that varies with the correlation of the past driving sound source. This makes it possible to decode speech in interpolation frames with less perceptual degradation.
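One way to picture the gain control summarized above is the following sketch. It is an assumption-laden illustration rather than the gain controller 206 of Embodiment 2: the function names, the use of mean power, the target power argument, and the linear mapping from correlation to adaptive share (with its `low` and `high` limits) are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def mean_power(x: Sequence[float]) -> float:
    return sum(v * v for v in x) / len(x)


def adaptive_share(correlation: float, low: float = 0.2, high: float = 0.9) -> float:
    """Variable-ratio case: a higher correlation of the past excitation gives
    the adaptive codebook a larger share of the power (limits are assumed)."""
    c = min(1.0, max(0.0, correlation))
    return low + (high - low) * c


def interpolation_gains(adaptive: Sequence[float], fixed: Sequence[float],
                        target_power: float, share: float) -> Tuple[float, float]:
    """Gains that give the adaptive contribution `share` of `target_power`
    and the fixed contribution the remainder."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must lie in [0, 1]")
    pa, pf = mean_power(adaptive), mean_power(fixed)
    ga = math.sqrt(share * target_power / pa) if pa > 0.0 else 0.0
    gf = math.sqrt((1.0 - share) * target_power / pf) if pf > 0.0 else 0.0
    return ga, gf


def mix_excitation(adaptive: Sequence[float], fixed: Sequence[float],
                   ga: float, gf: float) -> List[float]:
    """Multiplier and adder stage: scale both vectors and sum them."""
    return [ga * a + gf * f for a, f in zip(adaptive, fixed)]
```

A fixed ratio corresponds to calling `interpolation_gains` with a constant `share`; the variable ratio corresponds to passing `adaptive_share(correlation)` instead.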

【００４０】Furthermore, as is apparent from the third embodiment, the interpolation pitch period determiner evaluates how voiced or unvoiced the previous frame is from the correlation of the past driving sound source, and the interpolation frame is decoded in the voiced mode when the voicing is high and in the unvoiced mode otherwise. This makes it possible to decode speech with less perceptual degradation.

[Brief description of drawings]

FIG. 1 is a block diagram of a speech decoding apparatus according to a first embodiment of the present invention.

FIG. 2 is a block diagram of a speech decoding apparatus according to a second embodiment of the present invention.

FIG. 3 is a block diagram of a speech decoding apparatus according to a third embodiment of the present invention.

FIG. 4 is a block diagram of a conventional speech decoding apparatus.

[Explanation of symbols]

101 Parameter separation / interpolation determiner
102 Previous frame coding parameter holder
103 Driving sound source holder
104 Interpolation pitch period determiner
105 Switch
106 Speech decoder
201 Parameter separation / interpolation determiner
202 Previous frame coding parameter holder
203 Adaptive codebook
204 Fixed codebook
205 Gain decoder
206 Gain controller
207 Multiplier
208 Adder
209 LPC coefficient decoder
210 Synthesis filter
211 Switch
301 Parameter separation / interpolation determiner
302 Previous frame coding parameter holder
303 Driving sound source holder
304 Interpolation pitch period determiner
305 Mode information switch
306 Coding parameter switch
307 Mode switch
308 Voiced mode decoder
309 Unvoiced mode decoder
310 Decoded speech output switch
311 Driving sound source output switch

Continued from front page (56) References: JP-A-3-51900 (JP, A); JP-A-5-73097 (JP, A); JP-A-5-19796 (JP, A)

Claims

(57) [Claims]

1. A speech decoding apparatus comprising: a parameter separation / interpolation determiner that separates received data, which is transmitted from an encoder and represents coding information for each frame (a fixed section of speech), into individual coding parameters and determines whether the frame is an interpolation frame; a previous frame coding parameter holder that holds the coding parameters used for decoding the previous frame; a driving sound source holder that holds the driving sound source signal for a certain section of the speech decoded up to the previous frame; an interpolation pitch period determiner that, using the driving sound source signal from the driving sound source holder and the plural pitch period parameters of the previous frame from the previous frame coding parameter holder, determines, from among the plural pitch period parameters of the previous frame, the optimum pitch period parameter for frame interpolation of the frame; a switch that switches the coding parameters used in the frame depending on whether it is an interpolation frame; and a speech decoder that decodes the speech of the frame using the coding parameters selected by the switch and frame interpolation information.

2. The speech decoding apparatus according to claim 1, wherein the speech decoder is a CELP type decoder, and the lag of each subframe of the previous frame is used as the plural pitch period parameters input to the interpolation pitch period determiner.

3. The speech decoding apparatus, wherein the interpolation pitch period determiner outputs, as the optimum value, the pitch period that maximizes the autocorrelation of the past decoded driving sound source from among the plural past pitch period parameters.

4. A speech decoding apparatus comprising: a parameter separation / interpolation determiner that separates received data, which is transmitted from an encoder and represents coding information for each frame (a fixed section of speech), into individual coding parameters and determines whether the frame is an interpolation frame; a current frame coding parameter determiner that determines, from frame interpolation information, the coding parameters used for decoding the speech of the frame; an adaptive codebook that stores past driving sound sources; a fixed codebook that holds fixed sound source vectors such as noise sound sources; a sound source gain controller that, during frame interpolation, controls the gains of the signals generated from the adaptive codebook and the fixed codebook; a multiplier that multiplies the signals generated from the adaptive codebook and the fixed codebook by the gains; and an adder that generates the driving sound source.

5. The speech decoding apparatus according to claim 4, wherein the sound source gain controller controls the gains so that the power ratio between the adaptive codebook sound source and the fixed codebook sound source is a fixed ratio.

6. The speech decoding apparatus according to claim 4, wherein the sound source gain controller varies the power ratio between the adaptive codebook sound source and the fixed codebook sound source according to the correlation of the past driving sound source, and controls the gains so that the power ratio of the adaptive codebook sound source is increased when the correlation is high.

7. A speech decoding apparatus having coding modes corresponding respectively to voiced sections and unvoiced sections of speech, comprising: a parameter separation / interpolation determiner that separates received data, which is transmitted from an encoder and represents coding information for each frame (a fixed section of speech), into individual speech coding parameters and coding mode information and determines whether the frame is an interpolation frame; a previous frame coding parameter holder that holds the coding parameters and coding mode used for decoding the previous frame; a driving sound source holder that holds the driving sound source signal for a certain section of the speech decoded up to the previous frame; an interpolation pitch period determiner that, using the driving sound source signal output from the driving sound source holder and, when the previous frame is in the voiced coding mode, the plural pitch period parameters of the previous frame, determines the coding mode for frame interpolation of the frame and, when that coding mode is the voiced coding mode, determines the optimum pitch period parameter from among the plural pitch period parameters of the previous frame; a mode information switch that switches the input of the coding mode information used in decoding the frame according to frame interpolation information; a coding parameter switch that switches the coding parameters used in the frame depending on whether it is an interpolation frame; a mode switch that switches the coding mode based on the mode information obtained from the output of the mode information switch; and a voiced mode decoder and an unvoiced mode decoder that decode the speech of the frame using the coding parameters selected by the coding parameter switch and the frame interpolation information.

8. The speech decoding apparatus according to claim 7, wherein the voiced mode decoder and the unvoiced mode decoder are both CELP type decoders, the voiced mode decoder has at least an adaptive codebook and uses pitch period information for decoding, the unvoiced mode decoder has no adaptive codebook, and the lag of each subframe of the previous frame is used as the plural pitch period parameters input to the interpolation pitch period determiner.

9. The speech decoding apparatus according to claim 7, wherein, when the previous frame is in the voiced coding mode, the interpolation pitch period determiner outputs, as the optimum value, the pitch period that maximizes the autocorrelation of the past decoded driving sound source from among the plural past pitch period parameters, and determines the interpolation frame to be in the voiced mode.

10. The speech decoding apparatus according to claim 7, wherein, when the previous frame is in the voiced coding mode, the interpolation pitch period determiner obtains the pitch period that maximizes the autocorrelation of the past decoded driving sound source from among the plural past pitch period parameters and, when the normalized correlation coefficient at that pitch period is at or above a certain threshold, outputs that pitch period as the optimum value and determines the interpolation frame to be in the voiced mode.

11. The speech decoding apparatus according to claim 7, wherein, when the previous frame is in the unvoiced coding mode, the interpolation pitch period determiner obtains the pitch period that maximizes the autocorrelation of the past driving sound source and, when the normalized correlation coefficient at that pitch period is at or above a certain threshold, outputs that pitch period as the optimum value and determines the interpolation frame to be in the voiced mode, and conversely, when the coefficient is at or below the threshold, determines the interpolation frame to be in the unvoiced mode.

12. The speech decoding apparatus, wherein the interpolation pitch period determiner, regardless of whether the previous frame is in the voiced coding mode or the unvoiced coding mode, obtains the pitch period that maximizes the autocorrelation of the past driving sound source and, when the normalized correlation coefficient at that pitch period is at or above a certain threshold, outputs that pitch period as the optimum value and determines the interpolation frame to be in the voiced mode, and conversely, when the coefficient is at or below the threshold, determines the interpolation frame to be in the unvoiced mode.