JPS60186898A

JPS60186898A - Voice encoding system and apparatus

Info

Publication number: JPS60186898A
Application number: JP59042305A
Authority: JP
Inventors: 一範小澤; 卓荒関
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1984-03-06
Filing date: 1984-03-06
Publication date: 1985-09-24
Anticipated expiration: 2009-04-27
Also published as: JPH0632031B2

Abstract

(57) [Abstract] This publication contains application data filed before electronic filing was introduced, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION (Field of Industrial Application) The present invention relates to a low-bit-rate waveform coding method for speech signals, and in particular to a coding method and apparatus that reduce the transmitted information rate to 10 kbit/s or less.

(Prior Art and Its Problems) A well-known and effective way to encode a speech signal at a transmitted information rate of about 10 kbit/s or less is to search, for each short time interval, for the excitation signal sequence of the speech signal, on the condition that the error between the signal reproduced from that sequence and the input signal is minimized.

Depending on how the search is carried out, these methods are called tree coding (TREE CODING) or vector quantization (VECTOR QUANTIZATION). In addition to these methods, a method has recently been proposed in which a plurality of pulses representing the excitation signal sequence are determined one after another on the encoder side, for each short time interval, using an analysis-by-synthesis (ANALYSIS-BY-SYNTHESIS; A-b-S) technique. The present invention relates to this method.

The details of this method are described in the paper by B. S. Atal et al. entitled "A New Model of LPC Excitation for Producing Natural-Sounding Speech at Low Bit Rates", Proceedings of ICASSP, 1982, pp. 614-617 (Reference 1), so only a brief explanation is given here.

FIG. 1 is a block diagram showing the encoder-side processing of the conventional method described in Reference 1.

In the figure, 100 denotes the encoder input terminal, to which the A/D-converted speech signal sequence x(n) is input.

Reference numeral 110 denotes a buffer memory circuit, which stores one frame of the speech signal sequence (for example, 80 samples for 8 kHz sampling and a frame length of 10 msec). The output of 110 is supplied to a subtractor 120 and to a K parameter calculation circuit 180. Reference 1 uses the term reflection coefficients (REFLECTION COEFFICIENTS) instead of K parameters, but these are the same parameters. The K parameters are also called PARCOR coefficients. Using the output of 110, the K parameter calculation circuit 180 obtains, by the covariance method, 16 K parameters K_i (1 ≤ i ≤ 16) representing the speech signal spectrum of each frame, and outputs them to a synthesis filter 130. Reference numeral 140 denotes an excitation pulse generation circuit, which generates a pulse sequence with a predetermined number of pulses per frame. This pulse sequence is denoted d(n) here. FIG. 2 shows an example of an excitation pulse sequence generated by the excitation pulse generation circuit 140. In FIG. 2, the horizontal axis represents discrete time and the vertical axis represents amplitude. The figure shows the case in which eight pulses are generated within one frame. The pulse sequence d(n) generated by the excitation pulse generation circuit 140 drives the synthesis filter 130. The synthesis filter 130 receives d(n), obtains the reproduced signal x̂(n) corresponding to the speech signal x(n), and outputs it to the subtractor 120.

Here, the synthesis filter 130 receives the K parameters K_i, converts them into prediction parameters a_i (1 ≤ i ≤ 16), and calculates the reproduced signal x̂(n) using a_i.
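The conversion from K parameters to prediction parameters can be sketched with the standard step-up recursion. This is a minimal sketch, not the circuit described in Reference 1: the function name `k_to_predictor` is hypothetical, and the sign convention (synthesis filter 1 / (1 - Σ a_i z^-i), with a_m = k_m at order m) is an assumption, since conventions differ between texts.

```python
def k_to_predictor(k):
    """Convert K (reflection / PARCOR) parameters to predictor coefficients.

    Assumed convention: synthesis filter 1 / (1 - sum_i a_i z^-i).
    Step-up recursion at order m:
        a_m(m) = k_m
        a_i(m) = a_i(m-1) - k_m * a_(m-i)(m-1),  1 <= i <= m-1
    """
    a = []
    for m, km in enumerate(k, start=1):
        prev = a  # coefficients of order m-1, prev[j] holds a_(j+1)
        a = [prev[j] - km * prev[m - 2 - j] for j in range(m - 1)] + [km]
    return a


if __name__ == "__main__":
    print(k_to_predictor([0.5, 0.2]))
```

With 16 K parameters the function returns the 16 coefficients a_1 ... a_16 used by the synthesis filter.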

Using d(n) and a_i, x̂(n) can be expressed by the following equation.

$$\hat{x}(n) = d(n) + \sum_{i=1}^{P} a_i \hat{x}(n-i) \qquad (1)$$

In the above equation, P is the order of the synthesis filter; here P = 16. The subtractor 120 calculates the difference e(n) between the original signal x(n) and the reproduced signal x̂(n) and outputs it to a weighting circuit 190. The weighting circuit 190 receives e(n) and, using a weighting function w(n), calculates the weighted error e_w(n) according to the following equation.

$$e_w(n) = w(n) * e(n) \qquad (2)$$

In the above equation, the symbol * denotes convolution.
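The signal path described so far (all-pole synthesis driven by d(n), subtraction, and convolution with the weighting function) can be sketched as follows. The function names are hypothetical, the filter starts from a zero state in each call, and the convolution is truncated to the frame; a real coder would carry filter memory across frames.

```python
def synthesize(d, a):
    """All-pole synthesis: x_hat(n) = d(n) + sum_{i=1..P} a[i-1] * x_hat(n-i)."""
    x_hat = []
    for n, dn in enumerate(d):
        acc = dn
        for i, ai in enumerate(a, start=1):
            if n - i >= 0:
                acc += ai * x_hat[n - i]
        x_hat.append(acc)
    return x_hat


def weighted_error(x, x_hat, w):
    """e(n) = x(n) - x_hat(n), then e_w(n) = sum_k w(k) * e(n-k) within the frame."""
    e = [xn - yn for xn, yn in zip(x, x_hat)]
    return [
        sum(w[k] * e[n - k] for k in range(min(n + 1, len(w))))
        for n in range(len(e))
    ]


if __name__ == "__main__":
    x_hat = synthesize([1.0, 0.0, 0.0, 0.0], [0.5])
    print(x_hat)
    print(weighted_error([1.0, 1.0, 1.0, 1.0], x_hat, [1.0, -0.5]))
```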

The weighting function w(n) performs weighting on the frequency axis. Denoting its z-transform by W(z), it is expressed using the prediction parameters a_i of the synthesis filter by the following equation.

$$W(z) = \frac{1 - \sum_{i=1}^{P} a_i z^{-i}}{1 - \sum_{i=1}^{P} a_i r^{i} z^{-i}} \qquad (3)$$

In the above equation, r is a constant with 0 ≤ r ≤ 1 that determines the frequency characteristic of W(z). That is, when r = 1, W(z) = 1 and the frequency characteristic is flat.

On the other hand, when r = 0, W(z) becomes the inverse of the frequency characteristic of the synthesis filter. The characteristic of W(z) can therefore be changed through the value of r. The reason W(z) is made to depend on the frequency characteristic of the synthesis filter, as shown in equation (3), is to exploit the auditory masking effect. That is, where the spectral power of the input speech signal is large (for example, near the formants), an error relative to the spectrum of the reproduced signal is hard to hear even if it is somewhat large. FIG. 3 shows an example of the spectrum of the input speech signal in a certain frame together with the frequency characteristic of W(z). Here r = 0.8. In the figure, the horizontal axis represents frequency (up to 4 kHz) and the vertical axis represents log amplitude (up to 60 dB). The upper curve is the spectrum of the speech signal, and the lower curve is the frequency characteristic of the weighting function.

Returning to FIG. 1, the weighted error e_w(n) is fed back to an error minimization circuit 150. The error minimization circuit 150 stores one frame of e_w(n) values and uses them to calculate the weighted squared error ε according to the following equation.

$$\varepsilon = \sum_{n=1}^{N} e_w(n)^2 \qquad (4)$$

Here, N is the number of samples over which the squared error is calculated. In the method of Reference 1, this time length is 5 msec, which corresponds to N = 40 for 8 kHz sampling. Next, the error minimization circuit 150 supplies pulse position and amplitude information to the excitation pulse generation circuit 140 so as to reduce the squared error ε calculated by equation (4). The circuit 140 generates an excitation pulse sequence based on this information.

The synthesis filter 130 is driven by this excitation pulse sequence and calculates the reproduced signal x̂(n). Next, the subtractor 120 subtracts the newly obtained reproduced signal x̂(n) from the previously calculated error e(n) between the original signal and the reproduced signal, and takes the result as the new error e(n). The weighting circuit 190 receives e(n), calculates the weighted error e_w(n), and feeds it back to the error minimization circuit 150. The error minimization circuit 150 again calculates the squared error and adjusts the amplitudes and positions of the excitation pulse sequence so as to reduce it.

This series of steps, from generation of the excitation pulse sequence to its adjustment by error minimization, is repeated until the number of pulses in the excitation pulse sequence reaches a predetermined number, at which point the excitation pulse sequence is determined.
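The loop described above can be illustrated with a brute-force search that resynthesizes a candidate signal for every pulse position. This is a simplified sketch under stated assumptions, not the procedure of Reference 1: the perceptual weighting is omitted, the filter starts from a zero state, and the function names are hypothetical. It mainly shows why the amount of computation is large: every candidate position requires a pass through the synthesis filter.

```python
def synthesize(d, a):
    """All-pole synthesis with zero initial state."""
    y = []
    for n, dn in enumerate(d):
        acc = dn
        for i, ai in enumerate(a, start=1):
            if n - i >= 0:
                acc += ai * y[n - i]
        y.append(acc)
    return y


def search_pulses_by_resynthesis(target, a, num_pulses):
    """Greedy analysis-by-synthesis: add one pulse at a time, each time
    picking the position and amplitude that most reduce the squared error."""
    n = len(target)
    residual = list(target)
    pulses = []  # list of (position, amplitude)
    for _ in range(num_pulses):
        best = None
        for pos in range(n):
            unit = [0.0] * n
            unit[pos] = 1.0
            y = synthesize(unit, a)  # one filter pass per candidate position
            energy = sum(v * v for v in y)
            corr = sum(r * v for r, v in zip(residual, y))
            reduction = corr * corr / energy  # error drop at the optimal gain
            if best is None or reduction > best[0]:
                best = (reduction, pos, corr / energy, y)
        _, pos, gain, y = best
        # New error = previous error minus the newly reproduced contribution.
        residual = [r - gain * v for r, v in zip(residual, y)]
        pulses.append((pos, gain))
    return pulses, residual


if __name__ == "__main__":
    d = [0.0] * 8
    d[3] = 2.0
    target = synthesize(d, [0.5])
    print(search_pulses_by_resynthesis(target, [0.5], 1))
```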

This concludes the explanation of the conventional method.

In this method, the information to be transmitted consists of the K parameters K_i (1 ≤ i ≤ 16) of the synthesis filter and the pulse positions and amplitudes of the excitation pulse sequence, and any transmission rate can be realized by choosing the number of pulses placed in one frame. Moreover, for transmission rates in the range of 16 kbps to 10 kbps, good reproduced speech quality is obtained, and the method is considered one of the effective approaches.

However, this conventional method has the drawback that it requires a very large amount of computation. This is because, when calculating the position and amplitude of each pulse in the excitation pulse sequence, the method calculates the error and the squared error between the signal reproduced from that pulse and the original signal, feeds them back, and adjusts the pulse position and amplitude so as to reduce the squared error. It is also because this process is repeated until the number of pulses reaches the predetermined value.

Furthermore, with this conventional method, at bit rates of about 10 kbps or less, the reproduction quality degrades for input signals with a high pitch frequency, for example a female voice. This is because reproducing the pitch waveform well in such a case requires more excitation pulses than for a speaker with a low pitch frequency. For this reason it has been difficult to lower the transmission bit rate substantially, that is, to reduce the number of pulses per frame substantially, without degrading quality.

(Object of the Invention) An object of the present invention is to provide a speech coding method and apparatus that can reproduce high-quality speech with a relatively small amount of computation, even at bit rates of 10 kbps or less.

(Structure of the Invention) According to the present invention, there is obtained a speech coding method characterized in that, on the transmitting side, a discrete speech signal sequence is input; a pitch parameter representing the fine structure of the pitch and a spectral parameter representing the short-time spectral envelope are extracted and encoded; the autocorrelation function of an impulse response sequence corresponding to the short-time spectral envelope is calculated from the spectral parameter; a cross-correlation function corresponding to the speech signal sequence and the impulse response sequence is calculated; a first pulse sequence that can represent the speech signal sequence well is obtained, using the autocorrelation function and the cross-correlation function and taking the pitch parameter into account, and is encoded; and the code representing the first pulse sequence is combined with the codes representing the pitch parameter and the spectral parameter and output; and in that, on the receiving side, the combined code is input; the code representing the first pulse sequence, the code representing the pitch parameter, and the code representing the spectral parameter are separated and decoded; a second pulse sequence is obtained from the decoded first pulse sequence, taking the decoded pitch parameter into account; and the speech signal sequence is reproduced using the second pulse sequence and the decoded spectral parameter.

Further, according to the present invention, there is obtained a speech coding apparatus comprising: a parameter calculation circuit that receives a discrete speech signal sequence and extracts and encodes, from the speech signal sequence, a pitch parameter representing the fine structure of the pitch and a spectral parameter representing the short-time spectral envelope; an autocorrelation function calculation circuit that receives the output sequence of the parameter calculation circuit and calculates the autocorrelation function of an impulse response sequence corresponding to the short-time spectrum of the speech signal sequence; a cross-correlation function calculation circuit that receives the speech signal sequence and the output sequence of the parameter calculation circuit and calculates the cross-correlation function between the speech signal sequence and the impulse response sequence corresponding to the short-time spectrum; a first pulse sequence calculation circuit that receives the output sequence of the autocorrelation function calculation circuit, the output sequence of the cross-correlation function calculation circuit, and the output sequence of the parameter calculation circuit, and that obtains and encodes a first pulse sequence that can represent the speech signal sequence well, using the autocorrelation function and the cross-correlation function and taking the pitch parameter into account; and a multiplexer circuit that combines and outputs the output code of the parameter calculation circuit and the output sequence of the first pulse sequence calculation circuit.

Further, according to the present invention, there is obtained a speech decoding apparatus comprising: a demultiplexer circuit that receives the combined code sequence and separates the code representing the excitation pulse sequence, the code representing the pitch parameter, and the code representing the spectral parameter; a first pulse sequence decoding circuit that receives and decodes the separated code representing the first pulse sequence; a pitch parameter decoding circuit that receives and decodes the separated code representing the pitch parameter; a spectral parameter decoding circuit that receives and decodes the separated code representing the spectral parameter; a second pulse sequence generation circuit that receives the output sequence of the first pulse sequence decoding circuit and the output sequence of the pitch parameter decoding circuit and obtains a second pulse sequence from the decoded first pulse sequence, taking the decoded pitch parameter into account; and a synthesis filter circuit that receives the output sequence of the second pulse sequence generation circuit and the output sequence of the spectral parameter decoding circuit and reproduces and outputs the speech signal sequence.

(Embodiment) In general, the excitation sequence is strongly periodic in voiced segments. The present invention exploits this periodicity when searching for excitation pulses, and thereby provides good speech quality with a small number of excitation pulses.

The configuration of the speech coding method according to the present invention is described in detail below with reference to the drawings. FIG. 4(a) is a block diagram showing an embodiment of the encoder side of the speech coding method according to the present invention, and FIG. 4(b) is a block diagram showing an embodiment of the decoder side. In FIG. 4(a), the speech signal sequence x(n) is input from an input terminal 195, divided into blocks of a predetermined number of samples, and stored in a buffer memory circuit 340. Next, a K parameter calculation circuit 280 receives a predetermined number of samples of the speech signal stored in the buffer memory circuit 340 and uses them to calculate LPC parameters of a predetermined order P by a well-known method (for example, linear predictive analysis). Various kinds of LPC parameters are possible, but the following description uses the K parameters K_i (1 ≤ i ≤ P). The K parameters are the same as the PARCOR coefficients. The K parameters K_i are output to a K parameter encoding circuit 200. The K parameter encoding circuit 200 encodes K_i with a predetermined number of quantization bits and outputs the code l_ki to a multiplexer 450. The K parameter encoding circuit 200 also decodes l_ki to obtain the decoded K parameter values K'_i, converts them into prediction coefficient values a'_i (1 ≤ i ≤ P) by a well-known method, and outputs these to an impulse response calculation circuit 210, a weighting circuit 410, and a synthesis filter circuit 400.

Next, a pitch analysis circuit 370 calculates the pitch period P_d using one frame of the speech signal output from the buffer memory circuit 340. A known way to calculate P_d uses the autocorrelation function of the speech signal, as described, for example, in the paper by R. V. Cox entitled "Real-Time Implementation of Time Domain Harmonic Scaling of Speech for Rate Modification and Coding", IEEE Transactions on ASSP, February 1983, pp. 258-272 (Reference 2). P_d can also be calculated by other well-known methods, or from the prediction residual signal obtained after predicting the speech signal. A pitch encoding circuit 380 receives the pitch period P_d, quantizes and encodes it with a predetermined number of quantization bits, and outputs the code l_d to a gate circuit 460. The pitch encoding circuit 380 also outputs P'_d, obtained by decoding l_d, to a pulse calculation circuit 390 and a pulse generation circuit 420.
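An autocorrelation-based pitch estimate of the kind mentioned above can be sketched as follows. This is a generic illustration, not the specific procedure of Reference 2: the function name and the lag search range are assumptions, and the plain (unnormalized) autocorrelation used here favours shorter lags, which practical estimators usually compensate for.

```python
def estimate_pitch_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] that maximizes the autocorrelation of x."""
    best_lag, best_val = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        val = sum(x[n] * x[n - lag] for n in range(lag, len(x)))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


if __name__ == "__main__":
    # Synthetic 160-sample frame with one impulse every 40 samples.
    frame = [1.0 if n % 40 == 0 else 0.0 for n in range(160)]
    print(estimate_pitch_period(frame, 20, 100))
```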

Next, the impulse response calculation circuit 210 receives the prediction coefficient values a'_i (1 ≤ i ≤ P) and calculates, for a predetermined number of samples, the impulse response h_w(n) of the weighted synthesis filter whose transfer function is given by the following equation.

$$H_w(z) = \frac{W(z)}{1 - \sum_{i=1}^{P} a'_i z^{-i}} \qquad (5)$$

Here, H_w(z) is the z-domain transfer function of the weighted synthesis filter, and W(z) is the z-transform of the weighting function given by equation (3) above. The impulse response calculation circuit 210 outputs the impulse response h_w(n) to an autocorrelation function calculation circuit 360 and a cross-correlation function calculation circuit 350.
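A sketch of the impulse response calculation follows. It assumes that W(z) has the form (1 - Σ a'_i z^-i) / (1 - Σ a'_i r^i z^-i) with the same quantized coefficients a'_i; under that assumption the numerator of W(z) cancels the synthesis filter denominator in equation (5), leaving an all-pole filter with coefficients a'_i r^i. The function name is hypothetical.

```python
def weighted_impulse_response(a, r, length):
    """Impulse response of H_w(z) = 1 / (1 - sum a_i r^i z^-i).

    This follows from equation (5) if W(z) has the assumed form
    (1 - sum a_i z^-i) / (1 - sum a_i r^i z^-i).
    """
    c = [ai * r ** i for i, ai in enumerate(a, start=1)]  # bandwidth-expanded taps
    h = []
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0  # unit impulse input
        for i, ci in enumerate(c, start=1):
            if n - i >= 0:
                acc += ci * h[n - i]
        h.append(acc)
    return h


if __name__ == "__main__":
    print(weighted_impulse_response([0.5], 0.8, 4))
```

Because r < 1 pulls the poles toward the origin, h_w(n) decays faster than the unweighted synthesis filter response, so a short truncated response is usually adequate.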

Next, the autocorrelation function calculation circuit 360 receives the impulse response h_w(n) from the impulse response calculation circuit 210 and calculates its autocorrelation function R_hh(·) according to equation (6). The autocorrelation function R_hh(τ) is output to the pulse calculation circuit 390.
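Equation (6) is not legible in this text, so the sketch below uses the usual finite-length autocorrelation, R_hh(τ) = Σ_n h_w(n) h_w(n + τ), as an assumption; the summation limits in the original may differ. The function name is hypothetical.

```python
def autocorrelation(h, max_lag):
    """R_hh(tau) = sum_n h(n) * h(n + tau) for tau = 0..max_lag (assumed form)."""
    n = len(h)
    return [
        sum(h[i] * h[i + tau] for i in range(n - tau))
        for tau in range(max_lag + 1)
    ]


if __name__ == "__main__":
    print(autocorrelation([1.0, 0.5, 0.25], 2))
```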

Next, a subtractor 285 receives the speech signal x(n) stored in the buffer memory circuit 340, subtracts one frame of the output sequence of the synthesis filter circuit 400 from x(n), and outputs the result e(n) to the weighting circuit 410.

Next, the weighting circuit 410 receives the subtraction result e(n) from the subtractor 285 and the prediction coefficient values a'_i from the K parameter encoding circuit 200, applies weighting to e(n), and outputs e_w(n). In z-transform notation, e_w(n) can be written as follows.

$$E_w(z) = E(z) \cdot W(z) \qquad (7)$$

Here, E_w(z) and E(z) are the z-transforms of e_w(n) and e(n), respectively, and W(z) is the z-transform of the weighting function given by equation (3) above. The weighting circuit 410 outputs e_w(n) to the cross-correlation function calculation circuit 350.

Next, the cross-correlation function calculation circuit 350 receives e_w(n) from the weighting circuit 410 and the impulse response h_w(n) from the impulse response calculation circuit 210, and calculates the cross-correlation function ψ_hx(m) for a predetermined number of samples according to the following equation.

$$\psi_{hx}(m) = \sum_{n=1}^{N} e_w(n)\, h_w(n-m), \quad (1 \le m \le N) \qquad (8)$$

The cross-correlation function ψ_hx(·) is output to the pulse calculation circuit 390.
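The cross-correlation of equation (8) can be sketched as below, using 0-based indices and a causal, finite-length h_w(n) (terms with n < m or beyond the stored response are taken as zero). The function name is hypothetical.

```python
def cross_correlation(e_w, h_w):
    """psi(m) = sum_n e_w(n) * h_w(n - m), with h_w causal and finite."""
    n_samples = len(e_w)
    psi = []
    for m in range(n_samples):
        acc = 0.0
        for n in range(m, n_samples):
            if n - m < len(h_w):
                acc += e_w[n] * h_w[n - m]
        psi.append(acc)
    return psi


if __name__ == "__main__":
    h_w = [1.0, 0.5, 0.25]
    # Weighted error equal to one filter response starting at sample 2.
    e_w = [0.0, 0.0, 1.0, 0.5, 0.25]
    print(cross_correlation(e_w, h_w))
```

In this toy case ψ_hx peaks at m = 2, the position of the pulse that produced e_w(n), which is the property the pulse search relies on.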

Next, the pulse calculation circuit 390 uses the cross-correlation function ψ_hx(·) and the autocorrelation function R_hh(·), taking the pitch period P'_d into account, to calculate the first pulse sequence from which the excitation pulse sequence is derived. Specifically, it calculates one pulse sequence using the pitch period P'_d and another without using P'_d.

First, the pulse sequence calculation algorithm without P'_d is described. The pulse sequence that minimizes the weighted error power between the input speech signal and the synthesized speech signal is calculated sequentially, one pulse at a time, according to equation (9).

Here, g_i is the amplitude of the i-th pulse in the frame, k is the total number of pulses placed in the frame, and m_i is the position of the i-th pulse in the frame.

In equation (9), the pulse position m_i is obtained as the position in the frame at which |g_i| takes its maximum value.
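Equation (9) is not legible in this text. The sketch below therefore assumes the commonly used correlation-domain update, g_i(m) = [ψ_hx(m) - Σ_{l<i} g_l R_hh(|m_l - m|)] / R_hh(0), with m_i chosen where |g_i(m)| is largest; this is an assumption and may differ in detail from the original equation. Frame-edge effects are ignored and the function name is hypothetical.

```python
def search_pulses(psi, r_hh, num_pulses):
    """Sequential pulse search from the cross-correlation psi and autocorrelation r_hh.

    Assumed update: g(m) = (psi(m) - sum_l g_l * R(|m_l - m|)) / R(0).
    """
    def r(lag):
        lag = abs(lag)
        return r_hh[lag] if lag < len(r_hh) else 0.0

    pulses = []  # list of (position, amplitude)
    for _ in range(num_pulses):
        best_pos, best_amp = 0, 0.0
        for m in range(len(psi)):
            # Remove the contribution of the pulses already placed.
            amp = (psi[m] - sum(g * r(ml - m) for ml, g in pulses)) / r_hh[0]
            if abs(amp) > abs(best_amp):
                best_pos, best_amp = m, amp
        pulses.append((best_pos, best_amp))
    return pulses


if __name__ == "__main__":
    psi = [0.25, 0.625, 1.3125, 0.625, 0.25]
    r_hh = [1.3125, 0.625, 0.25]
    print(search_pulses(psi, r_hh, 1))
```

Unlike a search that resynthesizes the signal for every candidate, each step here needs only table lookups in ψ_hx and R_hh, which is where the reduction in computation comes from.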

Next, the calculation of the pulse sequence using P'_d is described. Voiced segments of a speech signal are strongly periodic, and the excitation pulses are arranged periodically. Therefore, if each time one excitation pulse is calculated, further pulses are extrapolated at positions separated from it by the pitch period, the number of excitation pulses is effectively increased and the performance can be improved considerably. FIG. 5 shows an example of the process of obtaining the pulse sequence when the pitch period P'_d is used.

The pulse sequence is calculated according to equation (9) above. FIG. 5(a) shows one frame of the cross-correlation function calculated by the cross-correlation function calculation circuit 350. The frame length here is 160 samples. FIG. 5(b) shows the first pulse obtained according to equation (9).

FIG. 5(c) shows the pulses extrapolated from this first pulse using the pitch period P'_d.

FIG. 5(d) shows the result of subtracting the influence of the three pulses obtained in FIG. 5(c) (the first pulse g_1 and the pulses extrapolated from it). FIG. 5(e) shows the pulse g_2 obtained next. FIG. 5(f) shows the pulses extrapolated from pulse g_2 using the pitch period P'_d. Of the pulse sequence obtained in this way, only two pulses, g_1 and g_2 in this example, need to be transmitted to the decoder side. This is because the decoder can generate the remaining pulses itself by extrapolating from g_1 and g_2 using the transmitted pitch period P'_d.

Very good performance can therefore be obtained even with a small number of pulses. This concludes the explanation of the pulse sequence calculation method.
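The extrapolation step, which the decoder can repeat from the transmitted pulses and pitch period alone, might look like the following sketch. The details are assumptions: extrapolated pulses copy the amplitude of the transmitted pulse, extrapolation runs in both directions across the whole frame, and a pitch period of 0 means no extrapolation. The function name is hypothetical.

```python
def extrapolate_pulses(pulses, pitch_period, frame_length):
    """Expand (position, amplitude) pulses at multiples of the pitch period.

    Assumptions: copies keep the same amplitude and fill the whole frame;
    pitch_period <= 0 disables extrapolation.
    """
    if pitch_period <= 0:
        return sorted(pulses)
    expanded = []
    for pos, amp in pulses:
        start = pos % pitch_period  # earliest in-frame position on this pitch grid
        for p in range(start, frame_length, pitch_period):
            expanded.append((p, amp))
    return sorted(expanded)


if __name__ == "__main__":
    # Two transmitted pulses in a 160-sample frame with a pitch period of 50.
    print(extrapolate_pulses([(70, 1.0), (95, -0.5)], 50, 160))
```

With a 160-sample frame and a pitch period of 50, each transmitted pulse yields three pulses, comparable to the three pulses shown in FIG. 5(c).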

Returning to FIG. 4(a), the pulse calculation circuit 390 calculates, for the pulse sequence obtained using the pitch period and for the pulse sequence obtained without it, the error power between the input signal and the reproduced signal according to the following equation.

$$J = R_{ee}(0) - \sum_{i=1}^{k} g_i\, \psi_{hx}(m_i) \qquad (10)$$

Here, g_i is the pulse amplitude of equation (9) and ψ_hx(·) is the cross-correlation function. R_ee(0) is the power of N samples of the output e_w(n) of the weighting circuit 410. Let J_N be the error power when the pitch period is not used and J_P the error power when it is used. J_N and J_P are output to a comparison circuit 430. The pulse sequence obtained using the pitch period and the pulse sequence obtained without it are output to a switching circuit 440.

Next, the comparison circuit 430 compares the error powers J_N and J_P. If J_P is smaller than J_N, it judges that using the pitch period gives better characteristics and outputs this information to the switching circuit 440, the comparison circuit 430, and the gate circuit 460. If J_P is larger than J_N, it outputs information indicating that the pitch period is not to be used to the switching circuit 440, the comparison circuit 430, and the gate circuit 460.
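The evaluation of equation (10) and the comparison of the two error powers can be sketched as follows. This is a minimal illustration, not the circuit described in the patent: the function names `error_power` and `choose_mode`, the list representation of ψ_hx, and the sample numbers are all assumptions. The text does not say what happens when the two error powers are equal; the sketch arbitrarily falls back to not using the pitch period in that case.

```python
def error_power(r_ee0, amplitudes, positions, psi_hx):
    """Equation (10): J = R_ee(0) - sum_i g_i * psi_hx(m_i).

    psi_hx is assumed to be a list indexed by sample position.
    """
    return r_ee0 - sum(g * psi_hx[m] for g, m in zip(amplitudes, positions))


def choose_mode(j_n, j_p):
    """Return True when the pitch-based pulse sequence should be used (J_P < J_N).

    A tie is resolved in favour of not using the pitch period (an assumption).
    """
    return j_p < j_n


# Hypothetical values, for illustration only.
psi = [0.0, 2.0, 0.5, 1.5, 0.0, 1.0]
j_n = error_power(10.0, [1.0], [1], psi)            # one pulse, no pitch period
j_p = error_power(10.0, [1.0, 0.8], [1, 3], psi)    # same pulse plus an extrapolated one
use_pitch = choose_mode(j_n, j_p)
```

In this toy case the extrapolated pulse lowers the error power, so the pitch-based sequence would be selected.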

Next, the switching circuit 440 receives the comparison information from the comparison circuit 430 and, according to this information, outputs one of the two pulse sequences to the encoding circuit 470.

Next, the gate circuit 460 receives the comparison information from the comparison circuit 430. If it is better to use the pitch period, it outputs the code l_d unchanged to the multiplexer 450.

If it is better not to use the pitch period, it outputs to the multiplexer 450 a code l_d representing a pitch period of 0.

Next, the encoding circuit 470 receives the pulse sequence from the switching circuit 440 and encodes the amplitude and position of each pulse with a predetermined number of bits. It also outputs the decoded amplitude and position values g'_i and m'_i of each pulse to the pulse generation circuit 420. Various methods of encoding the pulse sequence are conceivable. One is to encode the amplitudes and positions of the pulse train separately; another is to encode amplitude and position jointly. An example of the former is described below.

First, as a method of encoding the pulse amplitudes, the maximum amplitude of the pulse sequence within a frame can be taken as a normalization coefficient, each pulse amplitude normalized by this value, and the result quantized and encoded. Alternatively, the amplitude probability distribution can be assumed to be Gaussian and the optimal quantizer for a Gaussian distribution used. This is described in detail in J. Max, "Quantizing for Minimum Distortion," IRE Transactions on Information Theory, March 1960, pp. 7-12 (Reference 3), and is not explained here. Furthermore, the amplitude of each pulse may be transformed into other, mutually orthogonal parameters before quantization and encoding. The bit allocation may also be varied for each pulse amplitude.

Various methods are likewise conceivable for encoding the pulse positions. For example, the run-length code well known in the field of facsimile signal coding may be used; it represents the length of a run of consecutive "0" or "1" symbols with a predetermined code sequence. The normalization coefficient can be encoded using, for example, the well-known logarithmic companding.
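The separate coding of amplitudes and positions described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not fix the quantizer, so a uniform quantizer with symmetric signed levels is assumed here, and the function names `quantize_amplitudes`, `dequantize_amplitudes`, and `position_run_lengths` are hypothetical. The run lengths are returned as plain integers; mapping them to a predetermined code sequence is left out.

```python
def quantize_amplitudes(amplitudes, bits):
    """Normalize by the largest magnitude in the frame, then quantize uniformly.

    bits must be at least 2. Returns (normalization coefficient, integer codes).
    """
    peak = max(abs(g) for g in amplitudes)
    if peak == 0.0:
        return 0.0, [0] * len(amplitudes)
    levels = (1 << (bits - 1)) - 1            # symmetric signed levels (assumption)
    codes = [round(g / peak * levels) for g in amplitudes]
    return peak, codes


def dequantize_amplitudes(peak, codes, bits):
    """Recover approximate amplitudes from the normalization coefficient and codes."""
    levels = (1 << (bits - 1)) - 1
    return [peak * c / levels for c in codes]


def position_run_lengths(positions):
    """Describe pulse positions by the number of empty samples before each pulse."""
    runs, previous = [], -1
    for m in sorted(positions):
        runs.append(m - previous - 1)
        previous = m
    return runs


peak, codes = quantize_amplitudes([2.0, -1.2, 0.5], bits=4)
decoded = dequantize_amplitudes(peak, codes, bits=4)
runs = position_run_lengths([3, 0, 7])
```

Normalizing by the frame maximum keeps every normalized amplitude within [-1, 1], so a fixed quantizer can be reused for every frame while only the normalization coefficient varies.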

The coding of the pulse sequence is of course not limited to the methods described here; any suitable well-known method can be used.

Returning to FIG. 4(a), the pulse generation circuit 420 uses the decoded pulse values g'_i and m'_i to generate a pulse train having amplitude g'_i at position m'_i. If, according to the information received from the comparison circuit 430, the pitch period is to be used, it also uses the decoded pitch period P'd received from the pitch encoding circuit 380 to extrapolate pulses at positions separated by the pitch period P'd from the decoded pulses received from the encoding circuit 470. The excitation pulse sequence obtained in this way is output to the synthesis filter circuit 400. The synthesis filter circuit 400 receives the excitation pulse sequence from the pulse generation circuit 420 and the prediction coefficients a'_i from the K-parameter encoding circuit 200, which define the synthesis filter. It filters the input excitation pulses, calculates the response signal for one frame, and outputs it to the subtracter 285. The response signal x̂(n) is calculated for two frames (1 ≤ n ≤ 2N).

Here d(n) is the excitation signal. For 1 ≤ n ≤ N, the excitation pulses output from the pulse generation circuit 420 are used; for N+1 ≤ n ≤ 2N, an all-zero sequence is used.

Of the x̂(n) obtained in this way, the values for the second frame, x̂(n) (N+1 ≤ n ≤ 2N), are output to the subtracter 285.
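The two-frame response calculation can be sketched as follows. This is a minimal sketch under assumptions not stated in this part of the text: the synthesis filter is taken to be all-pole with the sign convention x(n) = d(n) + Σ a_i x(n-i), its memory is taken to be zero at the start of the first frame, positions are 0-based, and the function name `synthesis_response` is hypothetical.

```python
def synthesis_response(pulses, coeffs, frame_len):
    """Drive an all-pole filter with one frame of pulses followed by one frame of zeros.

    pulses: list of (position, amplitude) with 0 <= position < frame_len.
    coeffs: prediction coefficients a_1..a_p, assumed convention
            x(n) = d(n) + sum_i a_i * x(n - i).
    Returns only the second-frame part of the response, i.e. the portion
    that would be passed on to the subtracter.
    """
    d = [0.0] * (2 * frame_len)               # second frame stays all zero
    for m, g in pulses:
        d[m] += g
    x = []
    for n in range(2 * frame_len):
        acc = d[n]
        for i, a in enumerate(coeffs, start=1):
            if n - i >= 0:
                acc += a * x[n - i]
        x.append(acc)
    return x[frame_len:]


# Hypothetical first-order filter and a single pulse, for illustration only.
ringing = synthesis_response([(0, 1.0)], [0.5], frame_len=4)
```

With a single unit pulse and a first-order coefficient of 0.5, the response is 0.5^n, so the second-frame part is the tail of that decay.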

Next, the multiplexer 450 receives the output code of the encoding circuit 470, the output code of the K-parameter encoding circuit 200, and the output code of the gate circuit 460, combines them, and outputs the result from the transmitting-side output terminal 480 to the communication channel. This concludes the description of the encoder side of the speech coding system according to the present invention.

Next, the decoder side of the speech coding system according to the present invention is described with reference to FIG. 4(b). The demultiplexer 500 receives the code from the decoder-side input terminal 490. It separates the input code into the code sequence representing the K parameters, the code sequence representing the pitch information, and the code sequence representing the first pulse sequence. It outputs the code sequence representing the K parameters to the K-parameter decoding circuit 520 and the code sequence representing the first pulse sequence to the pulse sequence decoding circuit 530. The K-parameter decoding circuit 520 and the pitch decoding circuit 510 decode the code sequences they receive and output the results to the synthesis filter circuit 550.

The pulse sequence decoding circuit 530 receives the code sequence representing the first pulse sequence, decodes it, and outputs the result to the pulse generation circuit 540 as the amplitude and position information of the pulse sequence. The pulse generation circuit 540 receives the amplitude and position information of the first pulse sequence and generates the excitation pulse sequence corresponding to the second pulse sequence.

At this time, the decoded pitch period P'd is input from the pitch decoding circuit 510. If this value is not 0, pulses are extrapolated at positions separated by P'd from the input pulse sequence. The excitation pulse sequence obtained in this way is output to the synthesis filter circuit 550. The synthesis filter circuit 550 receives the decoded K-parameter values k'_i from the K-parameter decoding circuit 520, computes the synthesized signal x̃(n) using the output pulse train of the pulse generation circuit 540 as the excitation, and outputs it from the receiving-side output terminal 560. This concludes the description of the decoder side according to the present invention.
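The decoder-side extrapolation step can be sketched as follows. The text only states that pulses are placed at positions separated by P'd from the decoded pulses when P'd is not 0; the sketch assumes that each pulse is repeated forward with unchanged amplitude at every multiple of the pitch period until the end of the frame. The function name `extrapolate_pulses` and the 0-based positions are likewise assumptions.

```python
def extrapolate_pulses(pulses, pitch_period, frame_len):
    """Build the second pulse sequence from the decoded first pulse sequence.

    pulses: list of (position, amplitude). A pitch period of 0 means
    "do not use the pitch period", so the pulses are returned unchanged.
    """
    if pitch_period <= 0:
        return list(pulses)
    out = []
    for m, g in pulses:
        pos = m
        while pos < frame_len:                # repeat forward within the frame (assumption)
            out.append((pos, g))
            pos += pitch_period
    return sorted(out)


# Hypothetical decoded pulses, for illustration only.
second_sequence = extrapolate_pulses([(2, 1.0), (5, -0.5)], pitch_period=8, frame_len=20)
```

Because the decoder derives the extra pulses from the pitch period alone, only the first pulse sequence and the pitch code have to be transmitted.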

According to the configuration of this embodiment, the encoder calculates the error power both for the pulse sequence obtained using the pitch period and for the pulse sequence obtained without it, transmits the sequence with the smaller error power, that is, the one that reproduces the input speech more faithfully, and the decoder uses that sequence for reproduction. This prevents the degradation caused by transient portions of the input speech signal and by pitch parameter extraction errors.

As a simpler way of deciding whether to use the pitch period, the pitch gain can also be used. The pitch gain is obtained from the value of the correlation coefficient at a lag equal to the pitch period. The pitch gain obtained in this way is compared with a predetermined threshold; if it is at or below the threshold, the pitch period is not used when the excitation pulses are calculated, and the pitch period transmitted to the decoder side is set to 0. With this configuration, the error power calculation of equation (10) and the comparison circuit 430 become unnecessary, and the amount of computation can be reduced.
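The pitch-gain decision can be sketched as follows. The text says only that the pitch gain is obtained from the correlation coefficient at a lag of one pitch period; the sketch assumes the normalized correlation coefficient is used directly as the gain, and the names `pitch_gain` and `use_pitch_period` and the threshold value are hypothetical.

```python
import math


def pitch_gain(x, pitch_period):
    """Normalized correlation between x(n) and x(n - pitch_period)."""
    a = x[pitch_period:]
    b = x[:len(x) - pitch_period]
    num = sum(u * v for u, v in zip(a, b))
    den = math.sqrt(sum(u * u for u in a) * sum(v * v for v in b))
    return num / den if den > 0.0 else 0.0


def use_pitch_period(x, pitch_period, threshold):
    """Use the pitch period only when the gain exceeds the threshold.

    At or below the threshold the pitch period is not used, and a pitch
    period of 0 would be transmitted to the decoder.
    """
    return pitch_period > 0 and pitch_gain(x, pitch_period) > threshold


# Hypothetical signals, for illustration only.
periodic = [1.0, 0.0, 0.0, 0.0] * 4
irregular = [1.0, 0, 0, 0, 0, 0, 1.0, 0, 0, 1.0, 0, 0, 0, 0, 0, 1.0]
decision_periodic = use_pitch_period(periodic, 4, threshold=0.5)
decision_irregular = use_pitch_period(irregular, 4, threshold=0.5)
```

This replaces two full pulse searches and an error-power comparison with a single correlation, which is where the reduction in computation comes from.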

In the pulse calculation method described above, the pulses are calculated one at a time in sequence. In this method, the amplitudes of several previously obtained pulses may be readjusted when the next pulse is calculated. Doing so improves the characteristics when the pulses are not independent of one another.

Other good pulse sequence calculation methods, such as one that computes a more nearly optimal pulse sequence, can also be used to obtain the excitation pulses.

In the configuration of this embodiment, when using the pitch period is judged to give better characteristics, extrapolation is applied to all of the pulses obtained. Extrapolation need not be applied to every pulse, however; specific pulses for which extrapolation is most effective may be selected, and only those pulses extrapolated. Various selection methods are conceivable. For example, since pulses of large amplitude are considered to be more strongly periodic, extrapolation may be applied only to the M pulses of largest amplitude among those obtained in the frame. The value of M may be predetermined or may vary from frame to frame. Alternatively, one bit of side information may be attached to each pulse to indicate whether extrapolation is to be applied to it. Such a configuration slightly increases the amount of transmitted information but further improves the characteristics.
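Selecting the M largest pulses for extrapolation can be sketched as follows. The function name `select_for_extrapolation` and the sample pulses are assumptions; the returned flags correspond to the one-bit-per-pulse side information mentioned above.

```python
def select_for_extrapolation(pulses, m_largest):
    """Return one flag per pulse: True for the m_largest pulses by magnitude.

    pulses: list of (position, amplitude).
    """
    order = sorted(range(len(pulses)), key=lambda i: abs(pulses[i][1]), reverse=True)
    chosen = set(order[:m_largest])
    return [i in chosen for i in range(len(pulses))]


# Hypothetical pulses, for illustration only.
flags = select_for_extrapolation([(2, 0.3), (5, -1.2), (9, 0.8)], m_largest=2)
```

If M is fixed in advance, the decoder can repeat the same selection from the decoded amplitudes, so the flags need not be transmitted; sending them explicitly is the variant that costs one extra bit per pulse.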

In the configuration of this embodiment, to calculate the autocorrelation function of the impulse response sequence representing the short-time spectral structure, the impulse response calculation circuit 210 first calculates the impulse response from the decoded K-parameter values, and the autocorrelation function calculation circuit 360 then calculates the autocorrelation function from this impulse response. As is well known in the field of digital signal processing, the autocorrelation function of an impulse response corresponds to its power spectrum. The power spectrum may therefore be obtained first from the decoded K-parameter values, and the autocorrelation function then calculated through this correspondence. Similarly, to calculate the cross-correlation function between the speech signal and the impulse response representing the short-time spectral envelope, this embodiment computes the cross-correlation function ψ_hx(·) from the output e_w(n) of the weighting circuit 410 and the impulse response h_w(n) calculated by the impulse response calculation circuit 210 from the decoded K-parameter values k'_i.

As is well known, the cross-correlation function corresponds to the cross-power spectrum. The cross-power spectrum may therefore be obtained first from e_w(n) and k'_i, and the cross-correlation function calculated afterwards. The correspondence between the power spectrum and the autocorrelation function, and between the cross-power spectrum and the cross-correlation function, is explained in detail in Chapter 8 of the book "Digital Signal Processing" by A. V. Oppenheim et al. (Reference 4), and is not explained here.
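The correspondence between the autocorrelation function and the power spectrum can be checked numerically with the following sketch. It uses a plain O(N²) discrete Fourier transform from the standard library for clarity; the function names and the short impulse response are assumptions, and a real implementation would normally use an FFT.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def autocorr_direct(h, max_lag):
    """Autocorrelation of h computed directly in the time domain."""
    return [sum(h[n] * h[n + k] for n in range(len(h) - k)) for k in range(max_lag + 1)]


def autocorr_via_power_spectrum(h, max_lag):
    """Autocorrelation of h as the inverse transform of its power spectrum.

    max_lag must be smaller than len(h).
    """
    size = 2 * len(h)                         # zero-padding avoids circular wrap-around
    padded = list(h) + [0.0] * (size - len(h))
    power = [abs(v) ** 2 for v in dft(padded)]
    r = dft(power, inverse=True)
    return [r[k].real for k in range(max_lag + 1)]


# Hypothetical short impulse response, for illustration only.
h = [1.0, 0.5, 0.25]
r_time = autocorr_direct(h, 2)
r_freq = autocorr_via_power_spectrum(h, 2)
```

The same construction applies to the cross-correlation function if the power spectrum is replaced by the cross-power spectrum of the two sequences.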

A further major effect of the present invention is that there is almost no degradation of the reproduced signal near frame boundaries caused by waveform discontinuities at those boundaries.

This effect arises because, when the encoder calculates the pulse sequence of the current frame, it drives the synthesis filter with the excitation pulse sequence of the preceding frame, extends the resulting response signal sequence into the current frame, subtracts it from the input speech signal sequence, and calculates the pulse sequence of the current frame from the result.

Although this embodiment has been described for a fixed frame length, variable-length frames whose length changes over time may also be used. The number of excitation pulses placed within one frame need not be constant either.

For example, the number of pulses in each frame may be varied so as to keep the S/N constant.
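One way to vary the number of pulses per frame for a constant S/N can be sketched as follows. The text does not define the S/N measure; the sketch assumes it is 10·log10(R_ee(0)/J) with J from equation (10), and that pulses are added in search order until a target is reached. The function name `pulses_for_target_snr` and the numbers are hypothetical.

```python
import math


def pulses_for_target_snr(r_ee0, contributions, target_db):
    """Return how many pulses are needed to reach the target S/N.

    contributions[i] is g_i * psi_hx(m_i), the reduction of J in equation (10)
    contributed by the i-th pulse. r_ee0 is assumed to be positive.
    If the target is never reached, all available pulses are used.
    """
    j = r_ee0
    for count, c in enumerate(contributions, start=1):
        j -= c
        if j <= 0.0 or 10.0 * math.log10(r_ee0 / j) >= target_db:
            return count
    return len(contributions)


# Hypothetical values, for illustration only.
n_low = pulses_for_target_snr(10.0, [5.0, 2.5, 1.5, 0.5], target_db=6.0)
n_high = pulses_for_target_snr(10.0, [5.0, 2.5, 1.5, 0.5], target_db=20.0)
```

Frames that are easy to represent then use fewer pulses, and the bits saved can be spent on frames that need more.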

In this embodiment, the error power of equation (10) is used as the criterion for selecting, between the pulse sequence obtained using the pitch period and the one obtained without it, the sequence that reproduces the input signal more faithfully. Other suitable criteria may be used instead. For example, the prediction gain obtained when the pitch is used may be calculated from the pitch gain and compared with a predetermined threshold.

In the embodiment described above, the pulse sequence within one frame is encoded by the encoding circuit 470 of FIG. 4(a) after all the pulses have been obtained. The encoding may instead be included in the pulse calculation, so that each pulse is encoded as soon as it is calculated and the next pulse is calculated afterwards. With this configuration, the pulse sequence obtained minimizes the error including the coding distortion, so the quality can be improved further.

In the embodiment described above, K parameters are used as the parameters representing the spectral envelope of the short-time speech signal sequence, but other well-known parameters (for example, LSF parameters) may be used instead.

Furthermore, the weighting function W(Z) in the aforementioned equation (5) and the related equation may be omitted.

In this embodiment, to prevent the quality degradation caused by discontinuities of the reproduced waveform at frame boundaries, the response signal sequence derived from the excitation pulses of the frame preceding the current frame is calculated and subtracted from the input speech of the current frame before the pulse sequence is calculated. Alternatively, the data used for calculating the pulse sequence may be arranged as shown in FIG. 6, where N_T denotes the frame in which the pulses are transmitted and N denotes the frame over which the excitation pulses are calculated. With this configuration, it is no longer necessary to calculate the response signal sequence derived from the excitation pulses of the preceding frame.

(Effects of the Invention) As described in detail above, according to the present invention, pulses are extrapolated using the pitch period when the pulse sequence is calculated, which increases the number of pulses. Good reproduced speech can therefore be obtained even when the transmission bit rate is low, that is, when the number of excitation pulses is small. In particular, good reproduced speech can be obtained with 10 kbps or less of transmitted information even for female speech with a high pitch frequency, whose quality was degraded in the conventional method. Moreover, because the excitation pulses are obtained according to equation (10), there is no path, as in the conventional method of Reference 1, in which the synthesis filter is driven by the excitation pulses to obtain a reproduced signal and the squared error against the original signal is fed back to adjust the pulses, and there is no need to repeat such processing. The amount of computation is therefore greatly reduced. In addition, pulse extrapolation using the pitch period can be realized with only a small amount of additional computation.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing the configuration of the conventional method; FIG. 2 is a diagram showing an example of an excitation pulse sequence; FIG. 3 is a diagram showing an example of the frequency characteristics of an input speech signal sequence and of the weighting circuit shown in FIG. 1; FIG. 4 ... a diagram showing an example of the excitation pulse search process; and FIG. 6 is a diagram explaining the positional relationship between the pulse transmission frame and the excitation pulse calculation frame.

In the figures: 110, 340: buffer memory circuits; 120, 285: subtraction circuits; 130, 400, 550: synthesis filter circuits; 140, 420, 540: pulse generation circuits; 150: error minimization circuit; 180, 280: K-parameter calculation circuits; 190, 410: weighting circuits; 200: K-parameter encoding circuit; 210: impulse response calculation circuit; 350: cross-correlation calculation circuit; 360: autocorrelation calculation circuit; 370: pitch analysis circuit; 380: pitch encoding circuit; 390: pulse calculation circuit; 430: comparison circuit; 440: switching circuit; 470: encoding circuit; 450: multiplexer; 460: gate circuit; 500: demultiplexer; 510: pitch decoding circuit; 520: K-parameter decoding circuit; 530: excitation pulse decoding circuit.

[Drawings: FIG. 1, FIG. 2, FIG. 5, FIG. 6]

Procedural Amendment (Voluntary), May 29, 1985 (Showa 60)

1. Indication of the case: Patent Application No. 042305 of 1984 (Showa 59)
2. Title of the invention: Speech coding method and apparatus
3. Person making the amendment
Relationship to the case: Applicant
5-33-1 Shiba, Minato-ku, Tokyo; (423) NEC Corporation (Nippon Electric Co., Ltd.); Representative: Tadahiro Sekimoto
4. Agent: Patent attorney Uchihara
5. Subject of the amendment
(1) The claims section of the specification
(2) The detailed description of the invention section of the specification
6. Contents of the amendment
(1) The claims are amended as shown in the attached sheet.
(2) On page 16, line 16 of the specification, the word "少い" ("few") is corrected to the spelling "少ない".
(3) On page 23, line 8 of the specification, "subtracted" is amended to "subtracted from the cross-correlation function of FIG. 5(a)".
(4) On page 23, line 14 of the specification, "transmitted" is amended to "transmitted g1, g2 and".
(5) On page 24, line 7 of the specification, "g_i is" is amended to "g_i is, when the pitch period is not used,".
(6) On page 24, line 7 of the specification, "ψ_hx" is amended to "when the pitch period is used, all pulse amplitudes including the extrapolated pulses; ψ_hx".
(7) On page 31, line 19 of the specification, "by doing so" is amended to "by doing so, the number of pulses per frame is increased, and".
(8) On page 37, line 17 of the specification, "equation (11)" is amended to "equation (9)".

Agent: Patent attorney Uchihara

Attached sheet: Claims

(1) A speech coding method characterized in that: on the transmitting side, a pitch parameter representing the fine structure of the pitch and a spectral parameter representing the short-time spectral envelope are extracted from a discrete speech signal sequence and encoded; an autocorrelation function of an impulse response sequence corresponding to the short-time spectral envelope is calculated on the basis of the spectral parameter; a first pulse sequence that can satisfactorily represent the speech signal sequence is obtained using the pitch parameter and the autocorrelation function and is encoded; and the result is combined with the codes representing the pitch parameter and the spectral parameter and output; and, on the receiving side, the combined code is separated into the code representing the first pulse sequence, the code representing the pitch parameter, and the code representing the spectral parameter, and these are decoded; a second pulse sequence is obtained from the decoded first pulse sequence using the decoded pitch parameter; and the speech signal sequence is reproduced using the second pulse sequence and the decoded spectral parameter.

(2) A speech encoding apparatus comprising: a parameter calculation circuit that receives a discrete speech signal sequence and extracts and encodes, from the speech signal sequence, a pitch parameter representing the fine structure of the pitch and a spectral parameter representing the short-time spectral envelope; an autocorrelation function calculation circuit that receives the output sequence of the parameter calculation circuit and calculates the autocorrelation function of an impulse response sequence corresponding to the short-time spectrum of the speech signal sequence; a cross-correlation function calculation circuit that receives the speech signal sequence and the output sequence of the parameter calculation circuit and calculates the cross-correlation function between the speech signal sequence and the impulse response sequence corresponding to the short-time spectrum; a first pulse calculation circuit that receives the output sequence of the autocorrelation function calculation circuit, the output sequence of the cross-correlation function calculation circuit, and the output sequence of the parameter calculation circuit, and obtains and encodes a first pulse sequence that can satisfactorily represent the speech signal; and a multiplexer circuit that combines and outputs the output code of the parameter calculation circuit and the output sequence of the first pulse sequence calculation circuit.

(3) A speech decoding apparatus comprising: a demultiplexer circuit that receives a code sequence in which a code representing a first pulse sequence that can satisfactorily represent an input speech signal sequence, a code representing a pitch parameter representing the fine structure of the pitch of the speech signal sequence, and a code representing a spectral parameter representing the short-time spectral envelope of the speech signal sequence are combined, and separates the code representing the first pulse sequence, the code representing the pitch parameter, and the code representing the spectral parameter; a first pulse sequence decoding circuit that receives and decodes the separated code representing the first pulse sequence; a pitch parameter decoding circuit that receives and decodes the separated code representing the pitch parameter; a spectral parameter decoding circuit that receives and decodes the separated code representing the spectral parameter; a second pulse sequence generation circuit that receives the output sequence of the first pulse sequence decoding circuit and the output sequence of the pitch parameter decoding circuit and obtains a second pulse sequence from the decoded first pulse sequence using the decoded pitch parameter; and a synthesis filter circuit that receives the output sequence of the second pulse sequence generation circuit and the output sequence of the spectral parameter decoding circuit and reproduces and outputs the speech signal sequence.

Claims

[Claims]

(1) On the transmitting side, a pitch parameter representing the pitch fine structure and a spectral parameter representing the short-time spectral envelope are extracted from a discrete audio signal sequence and encoded; the autocorrelation function of an impulse response sequence corresponding to the short-time spectral envelope is calculated based on the spectral parameter; the cross-correlation function between the audio signal sequence and the impulse response sequence is calculated; a first pulse sequence that can satisfactorily represent the audio signal sequence is determined using the pitch parameter, the autocorrelation function, and the cross-correlation function, or, if the pitch parameter is less than a predetermined threshold, using only the autocorrelation function and the cross-correlation function; the first pulse sequence is encoded; and a code representing the first pulse sequence, a code representing the pitch parameter, and a code representing the spectral parameter are combined and output. On the receiving side, the combined code is separated into the code representing the first pulse sequence, the code representing the pitch parameter, and the code representing the spectral parameter, and each is decoded; a second pulse sequence is generated using the decoded pitch parameter based on the decoded first pulse sequence. An audio encoding method characterized in that the audio signal sequence is reproduced using the second pulse sequence and the decoded spectral parameter.
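The transmitting-side pulse search in claim (1) can be sketched roughly as follows. This is a hedged illustration, not the method as claimed. It assumes the pitch parameter consists of a period and a gain, that the threshold test is applied to the gain, and that "using the pitch parameter" means including a pitch recursion in the impulse response. The greedy search (pick the peak of the cross-correlation, then subtract the scaled autocorrelation) is the commonly used correlation-based multipulse procedure, and every name below is hypothetical.

```python
def impulse_response(lpc, length, pitch_period=0, pitch_gain=0.0):
    """Impulse response of an all-pole filter, optionally preceded by a
    pitch recursion e[n] = delta[n] + pitch_gain * e[n - pitch_period]."""
    e = [0.0] * length
    e[0] = 1.0
    if pitch_period > 0:
        for n in range(pitch_period, length):
            e[n] += pitch_gain * e[n - pitch_period]
    h = []
    for n in range(length):
        acc = e[n]
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc += a * h[n - k]
        h.append(acc)
    return h


def autocorrelation(h):
    """r[k] = sum_i h[i] * h[i + k]."""
    n = len(h)
    return [sum(h[i] * h[i + k] for i in range(n - k)) for k in range(n)]


def cross_correlation(x, h):
    """phi[m] = sum_i x[i] * h[i - m]."""
    n = len(x)
    return [sum(x[i] * h[i - m] for i in range(m, n)) for m in range(n)]


def search_pulses(x, lpc, num_pulses, pitch_period, pitch_gain, threshold):
    """Greedy search for (position, amplitude) pairs of the first pulse sequence."""
    n = len(x)
    if pitch_gain > threshold:
        # Assumed: strongly periodic frame, so the pitch structure is used.
        h = impulse_response(lpc, n, pitch_period, pitch_gain)
    else:
        # Below the threshold: only the spectral envelope is used.
        h = impulse_response(lpc, n)
    r = autocorrelation(h)
    phi = cross_correlation(x, h)
    pulses = []
    for _ in range(num_pulses):
        m = max(range(n), key=lambda i: abs(phi[i]))
        g = phi[m] / r[0]  # r[0] >= 1 because h[0] == 1
        pulses.append((m, g))
        # Remove this pulse's contribution (frame-end effects are ignored).
        for l in range(n):
            phi[l] -= g * r[abs(l - m)]
    return pulses
```

Because each iteration touches only the two correlation sequences, no resynthesis of the signal is needed per pulse, which is the practical appeal of working with correlation functions.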

(2) An audio encoding device characterized by comprising: a parameter calculation circuit that receives a discrete audio signal sequence and extracts and encodes, from the audio signal sequence, a pitch parameter representing the pitch fine structure and a spectral parameter representing the short-time spectral envelope; an autocorrelation function calculation circuit that receives the output sequence of the parameter calculation circuit and calculates the autocorrelation function of the impulse response sequence corresponding to the short-time spectral envelope of the audio signal sequence; a cross-correlation function calculation circuit that calculates the cross-correlation function between the audio signal sequence and the impulse response sequence; a first pulse sequence calculation circuit that receives the output sequence of the autocorrelation function calculation circuit, the output sequence of the cross-correlation function calculation circuit, and the output sequence of the parameter calculation circuit, and that, when the pitch parameter exceeds a predetermined threshold, determines and encodes a first pulse sequence that can represent the audio signal well from the pitch parameter, the autocorrelation function, and the cross-correlation function, and, when the pitch parameter is less than or equal to the threshold, determines and encodes the first pulse sequence from the autocorrelation function and the cross-correlation function; and a multiplexer circuit that combines and outputs the output code of the parameter calculation circuit and the output code of the first pulse sequence calculation circuit.
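The multiplexer circuit at the end of claim (2), and the matching demultiplexer on the receiving side, can be pictured as simple bit packing. The sketch below is only an illustration: the claim does not specify field order or bit widths, so the widths used here and the function names are hypothetical.

```python
def multiplex(fields):
    """Pack (value, bit_width) pairs into one code word, first field in the
    most significant bits. Returns (code_word, total_bits)."""
    word = 0
    total = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("value does not fit in its bit width")
        word = (word << width) | value
        total += width
    return word, total


def demultiplex(word, widths):
    """Split a code word back into its fields, given the same bit widths."""
    values = []
    for width in reversed(widths):
        values.append(word & ((1 << width) - 1))
        word >>= width
    values.reverse()
    return values
```

With assumed widths of 4, 7, and 2 bits for a pulse code, a pitch code, and a spectral code, `multiplex([(5, 4), (100, 7), (3, 2)])` yields a 13-bit word that `demultiplex(word, [4, 7, 2])` splits back into the three codes.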

(3) a demultiplexer circuit that receives a code sequence in which a code representing a first pulse sequence that satisfactorily represents the input audio signal sequence, a code representing a pitch parameter representing the pitch fine structure of the audio signal sequence, and a code representing a spectral parameter representing the short-time spectral envelope of the audio signal sequence are combined, and separates it into the code representing the first pulse sequence, the code representing the pitch parameter, and the code representing the spectral parameter;
a first pulse sequence decoding circuit that receives and decodes the separated code representing the first pulse sequence; a pitch parameter decoding circuit that receives and decodes the separated code representing the pitch parameter; a spectral parameter decoding circuit that receives and decodes the separated code representing the spectral parameter;
a second pulse sequence generation circuit that receives the output sequence of the first pulse sequence decoding circuit and the output sequence of the pitch parameter decoding circuit, and generates a second pulse sequence using the decoded pitch parameter based on the decoded first pulse sequence; and an output sequence of the second pulse sequence generation circuit and the spectral parameter decoding circuit;