JP2853170B2 - Speech encoding/decoding system

Info

Publication number: JP2853170B2
Application number: JP1139524A
Authority: JP
Inventors: 一範小澤
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1989-05-31
Filing date: 1989-05-31
Publication date: 1999-02-03
Anticipated expiration: 2014-02-03
Also published as: JPH034300A

Description

DETAILED DESCRIPTION OF THE INVENTION (Field of Industrial Application) The present invention relates to a speech encoding/decoding system for efficiently encoding and decoding a speech signal at a low bit rate.

(Prior Art) Multi-pulse coding and similar methods are known for transmitting a speech signal at a low bit rate, for example about 16 kb/s or less. These methods represent the sound source (excitation) signal by a combination of a plurality of pulses (a multi-pulse) and the characteristics of the vocal tract by a digital filter, and they determine and transmit the sound source pulse information and the filter coefficients for each fixed time interval (frame). Details of this method are described, for example, in Araseki, Ozawa, Ono, and Ochiai, "Multi-pulse Excited Speech Coder Based on Maximum Crosscorrelation Search Algorithm" (GLOBECOM 83, IEEE Global Telecommunication, paper 23.3, 1983) (Reference 1). Because this method represents the vocal tract information and the sound source signal separately, and uses a combination of pulses (a multi-pulse) to represent the sound source signal, it can output a good speech signal after decoding. The basic idea of determining the pulse train that represents the sound source signal is explained with reference to FIG. 5. A speech signal divided into frames is input from input terminal 900. Spectral parameters obtained from the speech signal of the current frame are input to synthesis filter 920. Sound source calculation circuit 910 generates an initial multi-pulse, which is input to synthesis filter 920 to obtain a synthesized speech waveform as its output. Subtractor 940 subtracts the synthesized speech waveform from the input signal. The result is input to weighting circuit 950 to obtain the weighted error power in the current frame. Sound source calculation circuit 910 then determines the amplitudes and positions of a prescribed number of pulses so as to minimize this weighted error power.
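The correlation-based pulse search referred to above can be illustrated with a minimal sketch. It assumes the commonly described greedy form of the search: pick the position where the cross-correlation is largest in magnitude, set the amplitude to that value divided by the zero-lag autocorrelation, subtract the pulse's contribution, and repeat. Perceptual weighting is omitted, and the function names are hypothetical and do not come from the patent or from Reference 1.

```python
def cross_correlation(x, h):
    """phi[m] = sum_n x[n] * h[n - m], truncated at the frame end."""
    n_len = len(x)
    return [sum(x[n] * h[n - m] for n in range(m, min(n_len, m + len(h))))
            for m in range(n_len)]


def autocorrelation(h, n_len):
    """r[k] = sum_n h[n] * h[n + k]; zero for lags beyond the response length."""
    return [sum(h[n] * h[n + k] for n in range(len(h) - k)) if k < len(h) else 0.0
            for k in range(n_len)]


def multipulse_search(x, h, num_pulses):
    """Greedy search returning a list of (position, amplitude) pairs.

    Assumption of this sketch: frame-end effects on the autocorrelation
    are ignored, so pulses near the end of the frame are only approximate.
    """
    phi = cross_correlation(x, h)
    r = autocorrelation(h, len(x))
    pulses = []
    for _ in range(num_pulses):
        # Position with the largest cross-correlation magnitude.
        m = max(range(len(phi)), key=lambda i: abs(phi[i]))
        g = phi[m] / r[0]
        pulses.append((m, g))
        # Remove this pulse's contribution from the cross-correlation.
        for i in range(len(phi)):
            phi[i] -= g * r[abs(i - m)]
    return pulses
```

For a target built from two well-separated pulses filtered by `h`, the search returns those two positions and amplitudes; for real speech the result is only an approximation to the minimum-error pulse set.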

(Problems to be Solved by the Invention) With this conventional method, however, sound quality is good when the bit rate is high enough to allow a sufficient number of sound source pulses, but it degrades as the bit rate is lowered.

To address this problem, a pitch prediction multi-pulse method has been proposed that exploits the pitch-by-pitch quasi-periodicity (pitch correlation) of the multi-pulse sound source. This method is described in detail in, for example, Japanese Patent Application No. 58-139022 (Reference 2), so its description is omitted here. However, although this quasi-periodicity is thought to be strong for large-amplitude pulses, not all pulses exhibit it, and small-amplitude pulses are thought to have little pitch periodicity. Because the pitch prediction multi-pulse method of Reference 2 assumes pitch periodicity for the entire predetermined number of pulses in a frame and determines all of them by pitch prediction, pitch prediction actually worsens performance for pulses with little periodicity. This is especially noticeable in transitions between vowels and in transient portions, where sound quality deteriorates.

Furthermore, because the method of Reference 2 incorporates the pitch information into the impulse response, it requires a very long impulse response (for example, 20 msec or more), and because it determines the entire predetermined number of pulses by pitch prediction, the pulse search requires a very large amount of computation. It has therefore been difficult to implement the device compactly even with current LSI technology.

An object of the present invention is to provide a speech encoding/decoding system that reproduces better speech than conventional methods both at high bit rates and as the bit rate is lowered, and that can be implemented with a small amount of computation.

(Means for Solving the Problems) In the speech encoding/decoding system of the present invention, the transmitting side receives a discrete speech signal; extracts from the speech signal, for each frame, spectral parameters representing the spectral envelope and a pitch parameter representing the pitch period; divides the speech signal of the frame into small sections according to the pitch parameter; determines a first multi-pulse for the speech signal of one of the small sections using the pitch parameter and the spectral parameters; determines, in the other sections, coefficients for correcting the multi-pulse; and determines a second multi-pulse, using the spectral parameters, for the signal obtained by removing from the speech signal the signal derived from the multi-pulse and the coefficients. The receiving side restores the sound source signal using the first multi-pulse, the pitch parameter, the coefficients, and the second multi-pulse, and drives a synthesis filter configured with the spectral parameters to obtain a synthesized speech signal.

Also, in a speech encoding system according to the present invention, the transmitting side receives a discrete speech signal; extracts from the speech signal, for each frame, spectral parameters representing the spectral envelope and a pitch parameter representing the pitch period; divides the speech signal of the frame into small sections according to the pitch parameter; and represents the sound source signal of the speech signal by either (a) a multi-pulse sound source, obtained by determining a first multi-pulse in one of the small sections using the pitch parameter and the spectral parameters, determining in the other sections coefficients for correcting the multi-pulse, and determining a second multi-pulse, using the spectral parameters, for the signal obtained by removing from the speech signal the signal derived from the multi-pulse and the coefficients, or (b) a noise signal selected, from a codebook composed of a predetermined number of kinds of noise signals, so as to reduce the error power between the speech signal and the synthesized signal. The receiving side restores the sound source signal either from the first multi-pulse, the pitch parameter, the coefficients, and the second multi-pulse, or from the selected noise signal, and drives a synthesis filter configured with the spectral parameters by the sound source signal to obtain a synthesized speech signal.

(Operation) The speech encoding/decoding system according to the first invention is characterized in that, in voiced intervals, the sound source signal of the speech signal in a frame (for example, 20 ms) is represented by a multi-pulse obtained by pitch interpolation in the small sections into which the frame is divided (the first multi-pulse) and a multi-pulse obtained over the entire frame without pitch prediction (the second multi-pulse). The first multi-pulse is calculated as follows. To exploit the pitch-by-pitch quasi-periodicity of the multi-pulse sound source very efficiently while greatly reducing the amount of computation, the frame is first divided into small sections (subframes) according to the pitch period, and a multi-pulse is determined for only one of the subframes (the representative section). For each of the other subframes, correction coefficients are determined that correct the gain and phase of the multi-pulse found in the representative section; using these coefficients, pulses are generated in the other subframes by correcting the gain and phase of the representative-section multi-pulse, thereby restoring the pulses of the entire frame. A signal is then reproduced over the frame from these pulses and subtracted from the speech signal, after which a multi-pulse (the second multi-pulse) is determined over the frame by the same method as in Reference 1.

The basic processing of this system is described below with reference to FIG. 3, which is a block diagram showing the operation of the present invention. A speech signal is input from input terminal 100 and divided into frames of a predetermined length (for example, 20 ms). The LPC and pitch analysis unit 150 determines LPC coefficients of a predetermined order from the speech signal of the frame, as spectral parameters representing the spectral envelope, by well-known LPC analysis. Besides the linear prediction coefficients a_i used here, other suitable parameters such as LSPs, formants, or the LPC cepstrum can be used as the LPC coefficients. Analysis methods other than LPC, such as the cepstrum, PSE, or ARMA methods, can also be used. The following description assumes that linear prediction coefficients are used. Unit 150 also calculates the pitch period M from the speech of the frame as the pitch parameter; the well-known autocorrelation method can be used for this.
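As an illustration of pitch extraction by an autocorrelation method, the following minimal sketch picks the lag with the largest normalized autocorrelation inside a search range. The search range, the normalization, and the function name are assumptions of this sketch; the patent only states that a well-known autocorrelation method can be used.

```python
import math


def estimate_pitch_period(frame, min_lag, max_lag):
    """Return (lag, score) for the lag in [min_lag, max_lag] that maximizes
    the normalized autocorrelation of the frame.

    The score lies in [-1, 1]; treating it as a rough voicing measure is
    an assumption of this sketch.
    """
    best_lag, best_score = min_lag, float("-inf")
    n_len = len(frame)
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with itself delayed by `lag` samples.
        num = sum(frame[n] * frame[n - lag] for n in range(lag, n_len))
        e_now = sum(v * v for v in frame[lag:])
        e_past = sum(v * v for v in frame[:n_len - lag])
        den = math.sqrt(e_now * e_past)
        score = num / den if den > 0.0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score
```

With a 20 ms frame at an assumed 8 kHz sampling rate (160 samples), a search range of roughly 20 to 70 samples avoids selecting a multiple of the true period; practical coders usually add further safeguards against pitch doubling.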

The operation of the pitch interpolation multi-pulse calculation unit 250 and the multi-pulse calculation unit 270 is described with reference to FIG. 4. FIG. 4(a) shows the speech signal of a frame; here the frame length is 20 ms as an example. The pitch interpolation multi-pulse calculation unit 250 first divides the frame into small sections (subframes) using the pitch period M, as shown in (b). Here the subframe length is equal to the pitch period M.

Next, by the same method as in Reference 1, the autocorrelation function R_hh(m) of the impulse response h(n) of the synthesis filter formed from the linear prediction coefficients, and the cross-correlation function Φ_hx(m) between the perceptually weighted speech signal and the impulse response h(n), are determined. Then, for only one predetermined subframe (hereinafter called the representative section; here, for example, the section in FIG. 4(b)), the amplitudes g_i and positions m_i of a predetermined number K (here 4) of pulses (the first multi-pulse) are determined. Reference 1 can be consulted for how to determine the multi-pulse. FIG. 4(c) shows the multi-pulse obtained. Next, for the subframes other than the representative section, a gain correction coefficient and a phase correction coefficient are determined for generating pulses by correcting the gain and phase of the multi-pulse found in the representative section. The gain correction coefficient c_j and the phase correction coefficient d_j of the j-th subframe in the frame are determined so as to minimize the error power given by equation (1).

Here x_j(n) and s_j(n) denote, respectively, the speech signal in the j-th subframe and the synthesized speech obtained by correcting the gain and phase of the multi-pulse, the latter being given by equation (2), where h(n) is the impulse response of the synthesis filter. Substituting equation (2) into equation (1), partially differentiating with respect to c_j, and setting the result to zero yields the c_j and d_j that minimize equation (1). For details, see Japanese Patent Application No. 63-208201 (Reference 3) and the like. In this way, gain and phase correction coefficients are basically determined for all the other subframes in the frame. The pulses of the entire frame are then reproduced, as shown in FIG. 4(d), from the multi-pulse of the representative section and the gain and phase correction coefficients. The position of the representative section within the frame may be determined by searching over several subframes, or may be fixed in advance. For details of the former method, see, for example, Reference 3.
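Equations (1) and (2) are not reproduced in this text, so the following is only a sketch under stated assumptions: the error is taken as the unweighted sum over n of (x_j(n) - c_j·s(n; d_j))², where s(n; d_j) is the response of h(n) to the representative-section pulses shifted by d_j samples. Under that assumption, setting the derivative with respect to c_j to zero gives a closed-form gain for each candidate shift, and the shift is found by exhaustive search. All function names and the search range are hypothetical.

```python
def synthesize(pulses, h, length, shift):
    """Response of h to (position, amplitude) pulses shifted by `shift`,
    with positions measured from the start of the subframe."""
    s = [0.0] * length
    for m, g in pulses:
        for k, hk in enumerate(h):
            n = m + shift + k
            if 0 <= n < length:
                s[n] += g * hk
    return s


def correction_coefficients(x_j, pulses, h, max_shift):
    """Return (c_j, d_j, error) minimizing sum_n (x_j[n] - c * s_d[n])**2.

    For each candidate shift d the optimal gain is <x, s_d> / <s_d, s_d>,
    which follows from setting the derivative with respect to c to zero.
    """
    x_energy = sum(v * v for v in x_j)
    best = (0.0, 0, x_energy)
    for d in range(-max_shift, max_shift + 1):
        s = synthesize(pulses, h, len(x_j), d)
        s_energy = sum(v * v for v in s)
        if s_energy == 0.0:
            continue
        corr = sum(a * b for a, b in zip(x_j, s))
        err = x_energy - corr * corr / s_energy
        if err < best[2]:
            best = (corr / s_energy, d, err)
    return best
```

The closed-form gain keeps the search one-dimensional: only the shift is enumerated, which is consistent with the stated aim of keeping the amount of computation small.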

Next, the reproduced pulses v(n) are used to drive the synthesis filter defined by equation (3), giving the reproduced signal x′(n).

Here a_i are the linear prediction coefficients.
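Equation (3) itself is not reproduced in this text. As a sketch only, the following assumes the usual all-pole form x′(n) = v(n) + Σ a_i·x′(n − i), summed over i = 1..P; the sign convention of a_i and the function name are assumptions of this sketch and may differ from the patent's equation.

```python
def synthesis_filter(v, a, state=None):
    """All-pole synthesis: y[n] = v[n] + sum_i a[i-1] * y[n-i].

    `state` holds the previous outputs, most recent first, so the filter
    memory can be carried from one frame to the next.
    Returns (output, new_state).
    """
    p = len(a)
    hist = list(state) if state is not None else [0.0] * p
    out = []
    for vn in v:
        y = vn + sum(a[i] * hist[i] for i in range(p))
        out.append(y)
        # Keep only the P most recent outputs.
        hist = ([y] + hist)[:p]
    return out, hist
```

Returning the filter state makes it possible to carry the filter memory across frame boundaries, which is the effect the influence signal accounts for later in the embodiment.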

Subtractor 260 subtracts x′(n) from the speech signal x(n) according to the following equation to obtain e(n).

e(n) = x(n) - x′(n)   (4)

Next, for e(n), the multi-pulse calculation unit 270 uses the same method as in Reference 1: from the cross-correlation function between the perceptually weighted e(n) and the weighted impulse response of the synthesis filter, and from the autocorrelation function of the weighted impulse response, it determines a predetermined number Q of pulses (the second multi-pulse) within the frame. This is shown in FIG. 4(e), where Q is 4.

In unvoiced frames, on the other hand, the amplitudes and positions of the multi-pulse are determined over the entire frame.

In addition to the spectral parameters a_i of the synthesis filter, which represent the spectral envelope, the transmitting side sends, for voiced frames, the pitch period M, the amplitudes and positions of the K pulses of the representative section, the gain correction coefficients, the phase correction coefficients, the position of the representative section within the frame, and the amplitudes and positions of the Q pulses. For unvoiced frames, it sends the amplitudes and positions of the multi-pulse.

In the second invention, voiced frames are processed in the same way as in the first invention, but in unvoiced frames the sound source signal is represented not by a multi-pulse but by a noise signal selected from a codebook consisting of a predetermined number of kinds of noise signals. Random numbers with a Gaussian distribution, for example, can be used as the noise signals. The length (number of dimensions) of each noise signal in the time direction is normally shorter than the frame (for example, 5 to 10 ms). The number of kinds of noise signals is 2^B. A known method of selecting the best noise signal for the input speech from such a codebook is to drive the synthesis filter with each noise signal to synthesize speech, compute the error power relative to the original speech, and select the noise signal that minimizes it. For details, see, for example, Schroeder and Atal, "Code-excited linear prediction (CELP): High quality speech at very low bit rates" (Proc. ICASSP, pp. 937-940, 1985) (Reference 4).

In unvoiced frames, the index indicating the selected noise signal, the gain, the pitch gain and pitch period of the pitch reproduction filter, and the spectral parameters of the synthesis filter are transmitted to the receiving side.

(Embodiment) In FIG. 1, which shows an embodiment of the first invention, a discrete speech signal x(n) is input from input terminal 500.

The spectrum and pitch parameter calculation circuit 520 determines, by the well-known LPC analysis method, the spectral parameters a_i of the synthesis filter representing the spectral envelope of the speech signal in each frame (for example, 20 ms) into which the signal is divided. It also determines the pitch period M by the well-known autocorrelation method.

Quantizer 525 quantizes the spectral parameters and pitch period thus obtained. The quantization may be scalar quantization as described in Japanese Patent Application No. 59-272435 (Reference 5), or vector quantization. For specific vector quantization methods, see, for example, Makhoul et al., "Vector quantization in speech coding" (Proc. IEEE, pp. 1551-1558, 1985) (Reference 6).

Inverse quantizer 530 dequantizes the quantized result and outputs it.

Subtractor 535 subtracts the influence signal from the speech signal of the frame and outputs the result.

Weighting circuit 540 applies perceptual weighting to this signal using the speech signal and the dequantized spectral parameters. For the weighting method, see weighting circuit 200 of Reference 2.

Impulse response calculation circuit 550 calculates the perceptually weighted impulse response h(n) of the synthesis filter using the dequantized spectral parameters a′_i. For the specific method, see the impulse response calculation circuit of Reference 2.

Autocorrelation function calculation circuit 560 calculates the autocorrelation function R_hh(m) of the impulse response and outputs it to both sound source pulse calculation circuit 580 and pulse calculation circuit 586. For the calculation method, see autocorrelation function calculation circuit 180 of Reference 2.

Cross-correlation function calculation circuit 570 calculates the cross-correlation function Φ_xh(m) between the perceptually weighted signal and the impulse response h(n).

Sound source pulse calculation circuit 580 first divides the frame into subframes, as in FIG. 4(b), using the dequantized pitch period M′. Then, for one predetermined subframe (the representative section; for example, the subframe in FIG. 4(b)), it determines the amplitudes g_i and positions m_i of a train of K pulses (the first multi-pulse) using Φ_xh(m) and R_hh(m). For the pulse train calculation method, see the sound source pulse calculation circuit of Reference 2.

Correction coefficient calculation circuit 583 calculates and outputs the gain correction coefficient c_j and the phase correction coefficient d_j for each subframe other than the representative section, in accordance with equations (1) and (2) given in the Operation section.

Quantizer 585 quantizes the amplitudes and positions of the multi-pulse train and outputs codes; for specific methods, see References 1 and 2. It also quantizes the gain correction coefficients, the phase correction coefficients, and the position of the representative section within the frame, and outputs codes; for a specific method, see, for example, Reference 3. These outputs are then dequantized and passed to pitch interpolation circuit 605, which restores the pulses of the entire frame as shown in FIG. 4(d).
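The restoration of the pulses of the entire frame from the representative section can be sketched as follows. The sketch assumes that pulse positions are measured from the start of each subframe, that the phase correction is an integer sample shift, and that any samples left over when the frame length is not a multiple of the pitch period receive no pulses; the function name and data layout are hypothetical.

```python
def restore_frame_pulses(rep_pulses, period, frame_len, corrections, rep_index):
    """Rebuild the pulse sequence v(n) for a whole frame.

    rep_pulses:  list of (position, amplitude) within the representative section
    corrections: dict mapping subframe index j to (c_j, d_j)
    rep_index:   index of the representative section, copied with c=1, d=0
    """
    v = [0.0] * frame_len
    for j in range(frame_len // period):
        c, d = (1.0, 0) if j == rep_index else corrections[j]
        for m, g in rep_pulses:
            n = j * period + m + d
            # Pulses shifted outside the frame are dropped in this sketch.
            if 0 <= n < frame_len:
                v[n] += c * g
    return v
```

Because the decoder receives the same dequantized values, running the same routine on the receiving side yields the same pulse sequence as on the transmitting side.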

The restored pulses are passed through synthesis filter 610 to obtain the synthesized speech signal x′(n) in accordance with equation (3).

Subtractor 615 obtains the residual signal e(n) by subtracting the synthesized speech signal x′(n) from the speech signal x(n) in accordance with equation (4).

Weighting circuit 600 applies perceptual weighting to the residual signal.

Cross-correlation function calculation circuit 603 calculates the cross-correlation function between the output of weighting circuit 600 and the impulse response h(n).

Pulse calculation circuit 586 determines the amplitudes and positions of a predetermined number of pulses (the second multi-pulse) using this cross-correlation function and the autocorrelation function of the impulse response h(n).

Quantizer 620 quantizes the amplitudes and positions of this multi-pulse and outputs them, and also dequantizes them and outputs the result to synthesis filter 625.

Synthesis filter 625 synthesizes the residual signal and outputs it.

Adder 627 adds the outputs of synthesis filter 625 and synthesis filter 610 to obtain the reproduced signal of the frame, and further determines and outputs the influence signal for the next frame. For the specific method of calculating the influence signal, see Reference 2.

Multiplexer 635 combines and outputs the codes from quantizers 585 and 620 representing the amplitudes and positions of the multi-pulse trains, the correction coefficients, and the position of the representative section, together with the codes from parameter quantizer 525 representing the spectral parameters and the pitch period.

On the receiving side, demultiplexer 710 separates and outputs the codes representing the amplitudes, positions, correction coefficients, and representative-section position of the pitch interpolation multi-pulse (the first multi-pulse); the codes representing the amplitudes and positions of the multi-pulse (the second multi-pulse); and the codes representing the spectral parameters and the pitch period.

First pulse decoder 720 decodes the amplitudes and positions of the pitch interpolation multi-pulse. Second pulse decoder 725 decodes the amplitudes and positions of the second multi-pulse. Parameter decoder 750 operates in the same way as inverse quantizer 530 on the transmitting side, decoding and outputting the spectral parameters a′_i and the pitch period M′.

Pitch interpolation circuit 726 performs the same operation as pitch interpolation circuit 605 on the transmitting side.

Pulse generator 727 generates the sound source signal due to the second multi-pulse over one frame length.

Adder 740 adds the output signals of pulse generator 727 and pitch interpolation circuit 726 to obtain the driving sound source signal of the frame, which drives synthesis filter circuit 760.

Synthesis filter circuit 760 uses the driving sound source signal and the decoded spectral parameters to determine and output a synthesized speech waveform for each frame.

This concludes the description of the embodiment of the first invention.

FIG. 2 is a block diagram showing an embodiment of the second invention. Components bearing the same reference numerals as in FIG. 1 operate in the same way as in FIG. 1, so their description is omitted.

In the figure, spectrum and pitch parameter calculation circuit 522 determines the spectral parameters a_i using well-known LPC analysis, and determines the pitch period M and the pitch gain b as pitch parameters using the well-known autocorrelation method.

Quantizer 522 converts the spectral parameters a_i into PARCOR coefficients or LSP coefficients and then quantizes them; PARCOR coefficients are used here. It also quantizes the pitch period M and the pitch gain b, and decodes these quantized values to output the decoded values a′_i, M′, and b′.

Codebook 800 stores 2^B kinds of noise signals in advance, where B is the number of bits. For the method of generating the noise signals, see Reference 4. The noise signals are output one kind at a time to convolution circuit 810.

Convolution circuit 810 convolves one noise signal c(n) with the impulse response h(n) according to the following equation and outputs the result to switch 820.

f(n) = c(n) * h(n)   (5)

Here the symbol * denotes convolution.

In voiced frames, the switch 820 outputs the output of the impulse response calculation circuit 550 to the autocorrelation function calculation circuit 560; in unvoiced frames, it outputs the output of the convolution circuit 810 to the autocorrelation function calculation circuit 560. The voiced/unvoiced decision can be made, for example, by judging a frame as voiced when the decoded pitch gain b' exceeds a predetermined threshold and as unvoiced otherwise.
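The threshold test and the routing performed by the switch 820 can be sketched as follows. The threshold value 0.5 and the function names are assumptions made for illustration; the text only states that the threshold is predetermined.

```python
def is_voiced(decoded_pitch_gain, threshold=0.5):
    """Judge a frame as voiced when b' exceeds the threshold.

    Assumption: 0.5 is a placeholder; the patent does not give a value.
    """
    return decoded_pitch_gain > threshold


def switch_820(decoded_pitch_gain, impulse_response, filtered_noise, threshold=0.5):
    """Pass h(n) on for voiced frames and f(n) = c(n) * h(n) for unvoiced frames."""
    if is_voiced(decoded_pitch_gain, threshold):
        return impulse_response
    return filtered_noise


# Illustrative values only.
print(switch_820(0.8, [1.0, 0.5], [0.3, 0.1]))  # voiced: impulse response
print(switch_820(0.2, [1.0, 0.5], [0.3, 0.1]))  # unvoiced: filtered noise
```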

The switch 825 outputs the output of the autocorrelation function calculation circuit 560 to the sound source pulse calculation circuit 580 in voiced frames and to the signal selection circuit 830 in unvoiced frames.

The signal selection circuit 830 evaluates the following equation using the cross-correlation function $\Phi_{xh}$ and the autocorrelation function $R_{hh}$.

$$G = \frac{(\Phi_{xh})^2}{R_{hh}} \tag{6}$$

Equation (6) is evaluated for every noise signal, and the noise signal that maximizes it is selected. The circuit outputs the index of the selected noise signal and the gain G obtained from equation (6).
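The exhaustive search over the codebook can be sketched as below. This is an illustration under stated assumptions, not the circuit itself: the cross-correlation is taken at zero lag between a target signal x(n) and the filtered noise signal f(n), the autocorrelation is taken as the energy of f(n), and f(n) is truncated to the length of x(n). The names `select_noise_signal` and `convolve` are hypothetical.

```python
def convolve(c, h):
    """Full linear convolution of two finite sequences."""
    if not c or not h:
        return []
    f = [0.0] * (len(c) + len(h) - 1)
    for i, ci in enumerate(c):
        for j, hj in enumerate(h):
            f[i + j] += ci * hj
    return f


def select_noise_signal(x, codebook, h):
    """Return (index, G) of the codebook entry that maximizes equation (6).

    Assumptions: phi is the zero-lag cross-correlation of x(n) and f(n),
    r is the energy of f(n), and f(n) is cut to len(x) samples.
    """
    best_index, best_g = -1, -1.0
    for index, c in enumerate(codebook):
        f = convolve(c, h)[:len(x)]
        phi = sum(xn * fn for xn, fn in zip(x, f))
        r = sum(fn * fn for fn in f)
        if r == 0.0:
            continue  # skip an all-zero filtered signal
        g = phi * phi / r
        if g > best_g:
            best_index, best_g = index, g
    return best_index, best_g


# Illustrative data only: a 3-entry codebook and a short target.
codebook = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
index, g = select_noise_signal([1.0, 1.5, 0.5], codebook, [1.0, 0.5])
print(index, g)  # 2 3.5
```

The third entry wins here because its filtered version equals the target exactly.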

The encoder 840 quantizes the gain G with a predetermined number of bits and outputs the result to the multiplexer 635. It also decodes the quantized value and outputs it to the pitch reproduction filter 850.

The pitch reproduction filter 850 obtains and outputs the sound source signal v(n) according to the following equation.

$$v(n) = c(n) + b' \cdot v(n - M) \tag{7}$$

Here, c(n) is the selected noise signal.
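The recursion in equation (7) can be sketched as follows. The handling of samples before the start of the frame is an assumption: past output is taken from an optional `history` list and treated as zero when it is not available. The function and parameter names are hypothetical.

```python
def pitch_reproduction_filter(c, b_dec, m_dec, history=None):
    """Compute v(n) = c(n) + b' * v(n - M) for one block of samples.

    history holds earlier output samples, with history[-1] = v(-1).
    Assumption: missing past samples are treated as zero.
    """
    if m_dec < 1:
        raise ValueError("pitch period must be at least 1 sample")
    past = list(history) if history is not None else []
    v = []
    for n, cn in enumerate(c):
        k = n - m_dec
        if k >= 0:
            prev = v[k]                  # sample inside the current block
        elif len(past) + k >= 0:
            prev = past[len(past) + k]   # sample from the previous output
        else:
            prev = 0.0                   # no past sample available
        v.append(cn + b_dec * prev)
    return v


# Illustrative values only: a unit pulse repeats every M = 2 samples,
# scaled by b' = 0.5 each time.
print(pitch_reproduction_filter([1.0, 0.0, 0.0, 0.0, 0.0, 0.0], 0.5, 2))
# [1.0, 0.0, 0.5, 0.0, 0.25, 0.0]
```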

The synthesis filter 860 receives v(n) and obtains and outputs the synthesized speech.

The switch 865 outputs to the subtractor 535 the output of the adder 627 in voiced frames and the output of the synthesis filter 860 in unvoiced frames.

On the receiving side, the decoding circuit 875 decodes the gain and the index of the noise signal.

The parameter decoding circuit 870 decodes the pitch gain b', the pitch period M', and the spectrum parameters $a'_i$.

The pitch reproduction filter 880 performs the same operation as the pitch reproduction filter 850 on the transmitting side and decodes the sound source signal in unvoiced frames.

The switch 870 switches the sound source signal between voiced frames and unvoiced frames.

This concludes the description of the embodiment of the second invention.

The configurations described above are merely embodiments of the present invention, and various modifications are possible.

Besides the method shown in Reference 1, various well-known methods can be used to calculate the multi-pulse. See, for example, Ozawa et al., "A Study on Pulse Search Algorithms for Multi-pulse Speech Coder Realization" (IEEE JSAC, pp. 133-141, 1986) (Reference 7).

Besides the method shown in the above embodiments, the pitch period and pitch gain can also be calculated, for example, as in equation (8) below: search for the position M that minimizes the error power E between the input speech signal x(n) of the current subframe and the signal reproduced from the past sound source signal v(n) through the pitch reproduction filter and the synthesis filter, and obtain the coefficient b at that position.

Here, h(n) is the impulse response of the synthesis filter, and w(n) is the impulse response of the perceptual weighting circuit.
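Equation (8) is not reproduced in this text. One form consistent with the description, offered only as an assumption, is a perceptually weighted squared error. The symbols $x_w(n)$ and $y_M(n)$ and the summation over the current subframe are introduced here for illustration and do not appear in the source.

```latex
% Assumed form of the error power; not taken from the source.
\begin{align}
E &= \sum_{n} \Bigl[ \bigl( x(n) - b \, v(n-M) * h(n) \bigr) * w(n) \Bigr]^2 \\
% Define x_w(n) = x(n) * w(n) and y_M(n) = v(n-M) * h(n) * w(n):
  &= \sum_{n} \bigl[ x_w(n) - b \, y_M(n) \bigr]^2 \\
% For a fixed M, setting dE/db = 0 gives the coefficient b:
b &= \frac{\sum_{n} x_w(n) \, y_M(n)}{\sum_{n} y_M(n)^2} \\
% Substituting b back gives the error remaining for that M:
E_{\min}(M) &= \sum_{n} x_w(n)^2
  - \frac{\bigl( \sum_{n} x_w(n) \, y_M(n) \bigr)^2}{\sum_{n} y_M(n)^2}
\end{align}
```

Under this assumed form, searching for the M that minimizes E amounts to maximizing the squared cross-correlation divided by the energy, which is the same kind of ratio as in equation (6).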

Also, if the synthesis filter 610 on the transmitting side is configured to reproduce a weighted signal, which is then subtracted from the output of the weighting circuit 540, the weighting circuit 600 can be omitted.

The synthesis filters 610, 625, and 860 on the transmitting side can also be shared.

The subtraction of the influence signal on the transmitting side can also be omitted, at the cost of a slight loss in performance. In that configuration, the subtractor 535, the synthesis filter 625, the adder 627, the pitch reproduction filter 850, and the synthesis filter 860 become unnecessary, and the configuration is simplified.

(Effects of the Invention) According to the first invention, in a voiced frame, pulses with strong pitch-to-pitch periodicity are represented very efficiently by obtaining the pulses of one subframe section and applying pitch interpolation, while pulses whose pitch-to-pitch correlation is weaker are obtained as multi-pulses without pitch interpolation. Compared with the conventional method, which obtains all pulses using pitch prediction, this greatly improves sound quality in portions where the periodicity is somewhat weakened, such as vowel transitions and transient portions. In addition, because pitch interpolation requires multi-pulses to be obtained for only one subframe, the amount of computation is greatly reduced compared with pitch-prediction multi-pulse. Furthermore, according to the second invention, in an unvoiced frame, where the sound source signal has no periodicity and is noise-like, the sound source is represented by selecting the best noise signal, so sound quality is further improved over the conventional system.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of an embodiment of the speech encoding/decoding system according to the first invention. FIG. 2 is a block diagram showing the configuration of an embodiment of the speech encoding/decoding system according to the second invention. FIG. 3 is a block diagram showing the operation of the present invention. FIG. 4 is a block diagram showing an example of pitch interpolation multi-pulse. FIG. 5 is a block diagram showing an example of the conventional system.

In the figures: 150: LPC and pitch analysis unit; 250: sound source pulse calculation unit; 270: pulse calculation unit; 520, 522: spectrum and pitch parameter calculation circuit; 525: parameter quantizer; 530: inverse quantizer; 535, 260: subtractor; 540: weighting circuit; 550: impulse response calculation circuit; 560: autocorrelation function calculation circuit; 570, 603: cross-correlation function calculation circuit; 585, 620: quantizer; 627: adder; 586: pulse calculation circuit; 605, 726: pitch interpolation circuit; 610, 625, 760, 860: synthesis filter; 635: multiplexer; 710: demultiplexer; 720: first pulse decoder; 725: second pulse decoder; 750, 870: parameter decoder; 727: pulse generator; 800: codebook; 810: convolution circuit; 820, 825, 865: switch; 830: signal selection circuit; 850, 880: pitch reproduction filter; 875: decoding circuit.

Continuation of front page: (58) Fields searched (Int. Cl.⁶, DB name): G10L 3/00 - 9/18; H03M 7/30; H04B 14/04; JICST (JOIS)

Claims

(57) [Claims]

1. A speech encoding/decoding system characterized in that: on the transmitting side, a discrete speech signal is input; a spectrum parameter representing the spectral envelope and a pitch parameter representing the pitch period are extracted from the speech signal for each frame; the speech signal of the frame is divided into small sections according to the pitch parameter; a first multi-pulse is obtained for the speech signal of one of the small sections using the pitch parameter and the spectrum parameter; a coefficient for correcting the multi-pulse is obtained in the other sections; and a second multi-pulse is obtained using the spectrum parameter after the signal obtained from the multi-pulse and the coefficient is removed from the speech signal; and, on the receiving side, a sound source signal is restored using the first multi-pulse, the pitch parameter, the correction coefficient, and the second multi-pulse, and a synthesis filter configured using the spectrum parameter is driven to obtain a synthesized speech signal.

2. A speech encoding/decoding system characterized in that: on the transmitting side, a discrete speech signal is input; a spectrum parameter representing the spectral envelope and a pitch parameter representing the pitch period are extracted from the speech signal for each frame; the speech signal of the frame is divided into small sections according to the pitch parameter; and the sound source signal of the speech signal is represented either by a multi-pulse sound source, obtained by obtaining a first multi-pulse in one of the small sections using the pitch parameter and the spectrum parameter, obtaining a coefficient for correcting the multi-pulse in the other sections, and obtaining a second multi-pulse using the spectrum parameter for the signal that remains after the signal obtained from the multi-pulse and the coefficient is removed from the speech signal, or by a noise signal selected from a codebook composed of predetermined kinds of noise signals so as to reduce the error power between the speech signal and the synthesized signal obtained from the noise signal; and, on the receiving side, the sound source signal is restored using the first multi-pulse, the pitch parameter, the correction coefficient, and the second multi-pulse, or using the selected noise signal, and a synthesis filter configured using the spectrum parameter is driven by the sound source signal to obtain a synthesized speech signal.