JPS63108400A

JPS63108400A - Voice encoder

Info

Publication number: JPS63108400A
Application number: JP61254542A
Authority: JP
Inventors: 福井　昭; 中川　富士夫
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1986-10-24
Filing date: 1986-10-24
Publication date: 1988-05-13

Abstract

(57) [Abstract] This publication contains application data filed before electronic filing was introduced, so no abstract data is recorded.

Description

[Detailed description of the invention] [Industrial application field] The present invention relates to a multi-pulse driven speech coding system.

In particular, it relates to means for searching for pulses that give high coded speech quality under a limited amount of computation.

[Overview]

The present invention concerns a speech coding method that obtains a predetermined number of pulses inside the frame by searching the cross-correlation outside the frame as well as inside it. When the number of pulses outside the frame reaches a predetermined number, the search for the maximum absolute value of the cross-correlation is restricted to the portion of the cross-correlation inside the frame. This improves coded speech quality when the number of pulse search loop iterations is limited.

[Conventional technology]

One method for coding speech at a bit rate of 8 to 16 kbps is multi-pulse driven speech coding, in which a linear prediction filter is driven by 10 to 30 pulses per frame of about 20 milliseconds. In this method the pulses must be chosen so that the error between the input signal and the synthesized signal is minimized. As a way to simplify the computation, a method has been proposed in which the first pulse is placed at the position where the absolute value of the cross-correlation between the input signal and the impulse response of the linear prediction filter is maximum. For the second and subsequent pulses, the contribution of the impulse response is subtracted from the cross-correlation at the position where the pulse was placed, the result is taken as the new cross-correlation, and the search is repeated in the same way.
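The simplified search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method as implemented in any particular coder: the function names are hypothetical, the amplitude is divided by the zero-lag autocorrelation `r[0]` (the usual least-squares choice, which coincides with "the magnitude of the cross-correlation" when `r[0]` is normalized to 1), and the subtraction uses the autocorrelation of the impulse response, as the later sections of this document describe.

```python
def cross_correlation(x, h):
    """phi[i] = sum_k x[i + k] * h[k], truncated at the end of x."""
    n_x, n_h = len(x), len(h)
    return [sum(x[i + k] * h[k] for k in range(min(n_h, n_x - i)))
            for i in range(n_x)]


def autocorrelation(h):
    """r[lag] = sum_k h[k] * h[k + lag] for lag = 0 .. len(h) - 1."""
    n_h = len(h)
    return [sum(h[k] * h[k + lag] for k in range(n_h - lag))
            for lag in range(n_h)]


def greedy_pulses(x, h, n_iter):
    """Place n_iter pulses one at a time at the peak of |phi|.

    Returns (pulses, residual), where pulses is a list of
    (position, amplitude) pairs and residual is the cross-correlation
    left over after all subtractions.
    """
    phi = cross_correlation(x, h)
    r = autocorrelation(h)
    pulses = []
    for _ in range(n_iter):
        # Position of the maximum absolute cross-correlation.
        pos = max(range(len(phi)), key=lambda i: abs(phi[i]))
        # Assumption: least-squares amplitude, phi / r[0].
        amp = phi[pos] / r[0]
        pulses.append((pos, amp))
        # Remove this pulse's contribution to obtain the new cross-correlation.
        for i in range(len(phi)):
            lag = abs(i - pos)
            if lag < len(r):
                phi[i] -= amp * r[lag]
    return pulses, phi
```

For a signal that is exactly one scaled, shifted copy of the impulse response, a single iteration recovers the pulse and leaves a zero residual.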

In addition, to remove the influence of the previous frame from the cross-correlation between the input signal and the impulse response, the following procedure is common. An influence signal is obtained by setting the input of the synthesis filter to zero, and it is subtracted from the input signal. The result is passed through a perceptual weighting filter, which exploits auditory characteristics to mask noise, and the cross-correlation between this weighted signal and the impulse response is used. For the pulse search itself, it is known that searching the cross-correlation outside the frame as well as inside it, until a predetermined number of pulses has been found inside the frame, gives better coded speech quality than searching inside the frame only.
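The removal of the previous frame's influence can be illustrated as follows. This is a minimal sketch under assumptions that are not in the source: an all-pole synthesis filter with the sign convention `s[n] = e[n] + sum_k a[k] * s[n - 1 - k]`, filter memory passed in explicitly with the most recent output first, and hypothetical function names. The perceptual weighting filter is omitted.

```python
def zero_input_response(a, memory, n):
    """Output of the synthesis filter for n samples of zero input.

    a      : predictor coefficients, a[k] multiplies s[n - 1 - k]
    memory : last len(a) outputs of the previous frame, most recent first
    """
    state = list(memory)
    out = []
    for _ in range(n):
        # With zero input, the output is the prediction from past outputs.
        s = sum(ak * sk for ak, sk in zip(a, state))
        out.append(s)
        # Shift the new output into the filter memory.
        state = [s] + state[:-1]
    return out


def remove_previous_frame_influence(x, a, memory):
    """Subtract the influence signal (zero-input response) from x."""
    influence = zero_input_response(a, memory, len(x))
    return [xi - zi for xi, zi in zip(x, influence)]
```

With a one-tap predictor `a = [0.5]` and memory `[4.0]`, the influence signal decays as 2.0, 1.0, 0.5, and subtracting it leaves only the part of the frame that the current frame's pulses must explain.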

The multipulses found in this way are supplied to the multi-pulse driven speech encoding means.

[References]

Furui, Digital Speech Processing (ディジタル音声処理), Tokai University Press, pp. 1)5 to 1)6.

[Problem that the invention seeks to solve]

However, to perform multi-pulse coding in real time, the total number of pulse search loop iterations must be limited, and in that case searching for pulses outside the frame as well does not necessarily give better coded speech quality.

An object of the present invention is to provide a speech encoding device that achieves excellent coded speech quality even when the number of pulse search loop iterations is limited.

[Means for solving problems]

The present invention is a speech encoding device comprising: means for determining linear prediction coefficients by linear prediction analysis of a speech input signal; means for obtaining the impulse response of a linear prediction filter corresponding to these linear prediction coefficients; means for obtaining the cross-correlation between the speech input signal and the impulse response; and means for generating a first pulse, located at the time position where the absolute value of this cross-correlation is maximum and having the magnitude of the cross-correlation as its amplitude, and a predetermined number of pulses following the first pulse, each having an amplitude and time position determined from the cross-correlation within a predetermined range. The device is characterized by further comprising means for restricting the search for the maximum absolute value of the cross-correlation to within the predetermined range of the cross-correlation when the number of pulses outside the predetermined range of the cross-correlation reaches a predetermined number.

[Effect]

Linear prediction coefficients are determined by linear prediction analysis of the speech input signal, and the impulse response of the corresponding linear prediction filter is obtained. A first pulse, with the magnitude of the cross-correlation, is placed at the position where the absolute value of the cross-correlation between this impulse response and the speech input signal is maximum. The autocorrelation of the impulse response is aligned to the position of the pulse, normalized to the pulse magnitude, and subtracted from the cross-correlation to give a new cross-correlation, and a predetermined number of pulses is placed within the predetermined range of the cross-correlation. In doing so, the pulse search is performed outside the predetermined range as well as inside it. Once the number of pulses outside the predetermined range reaches a predetermined number, the search for the maximum absolute value of the cross-correlation is restricted to within the predetermined range.

[Example]

An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of this embodiment. The apparatus of this embodiment comprises: a synthesis filter 1 that synthesizes the influence signal; a weighting filter 2 that applies perceptual weighting to the difference between the input signal X and the influence signal; a linear prediction analysis circuit 3 that performs linear prediction analysis on the input signal X to generate the linear prediction coefficients A; an impulse response generation circuit 4 that obtains the impulse response R of the linear prediction filter whose coefficients are the linear prediction coefficients A; a cross-correlation generation circuit 5 that obtains the cross-correlation Φxr between the weighting filter output Xw and the impulse response; an autocorrelation generation circuit 6 that obtains the autocorrelation Φrr of the impulse response; a pulse search circuit 7 that performs the pulse search based on the cross-correlation Φxr and the autocorrelation Φrr; and an encoding circuit 8 that encodes the output of the pulse search circuit 7 and the linear prediction coefficients A.
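As a rough illustration of what impulse response generation circuit 4 computes, the sketch below drives an all-pole filter with a unit impulse. The function name and the sign convention `h[t] = delta[t] + sum_k a[k] * h[t - 1 - k]` are assumptions for illustration; the source does not state the filter form, and a practical coder may use the perceptually weighted filter instead.

```python
def impulse_response(a, n):
    """First n samples of the impulse response of an all-pole filter.

    a : predictor coefficients, a[k] multiplies the output delayed by k + 1
    """
    h = []
    for t in range(n):
        excitation = 1.0 if t == 0 else 0.0  # unit impulse at t = 0
        feedback = sum(a[k] * h[t - 1 - k]
                       for k in range(len(a)) if t - 1 - k >= 0)
        h.append(excitation + feedback)
    return h
```

For a one-tap predictor `a = [0.5]`, the response is the geometric sequence 1, 0.5, 0.25, 0.125. The cross-correlation and autocorrelation circuits 5 and 6 would then operate on this response.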

Next, the operation of this embodiment is described with reference to FIG. 1 and FIG. 2. To remove the influence of the previous frame from the input signal X, the influence signal is obtained by setting the input of the synthesis filter to zero while retaining the filter's internal data, and it is subtracted from the input signal X. The result is passed through the weighting filter 2, which applies perceptual weighting so that noise is masked using auditory characteristics, giving the signal Xw. The input signal X is subjected to linear prediction analysis to obtain the linear prediction coefficients A. The impulse response R of the linear prediction filter having the coefficients A is obtained, and then the cross-correlation Φxr between the weighted signal Xw and the impulse response R and the autocorrelation Φrr of the impulse response R are obtained. In the pulse search, the maximum value Φxr(im) of the cross-correlation Φxr(i), i = 1 to lx, of length lx is found, and a pulse of magnitude Φxr(im) is placed at position im. The autocorrelation Φrr of length ±lr is then positioned at im, scaled to match Φxr(im), and subtracted from the cross-correlation Φxr (for offsets i = -lr to lr).

In this way a predetermined number of pulses is obtained, for example about 15 when coding at 9.6 kbps. Because a pulse may be placed at a position that already holds one, several tens of pulse search iterations are needed to obtain pulses at 15 distinct positions. When a pulse falls on an occupied position, the new pulse magnitude is the sum of the previously obtained magnitude and the current one.
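The subtraction step described above can be written compactly. The equation itself is not legible in the source, so the following is a reconstruction under assumptions: g(i) denotes the accumulated pulse magnitude at position i (a symbol introduced here), and dividing by Φrr(0) is one reading of "scaled to match Φxr(im)"; if Φrr is already normalized so that Φrr(0) = 1, the division drops out.

```latex
% i_m : position of the maximum absolute cross-correlation
i_m = \arg\max_{1 \le i \le l_x} \left| \Phi_{xr}(i) \right|

% accumulate the pulse magnitude (pulses at the same position are summed)
g(i_m) \leftarrow g(i_m) + \Phi_{xr}(i_m)

% new cross-correlation, for offsets i = -l_r, \dots, l_r
% (the right-hand side uses the value of \Phi_{xr}(i_m) before the update)
\Phi_{xr}(i_m + i) \leftarrow \Phi_{xr}(i_m + i)
  - \Phi_{xr}(i_m)\,\frac{\Phi_{rr}(i)}{\Phi_{rr}(0)}
```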

As mentioned above, when speech is divided into frames of, for example, 20 milliseconds and 15 pulses are to be obtained per frame, coded speech quality is better if the cross-correlation length is made longer than the frame length and pulses are searched for outside the frame as well as inside it, until the predetermined number of pulses has been found inside the frame, than if only the inside of the frame is searched. However, to perform multi-pulse coding in real time, the total number of pulse search loop iterations must be limited, and in that case searching outside the frame as well does not necessarily give better coded speech quality.

FIG. 2 is a flowchart of the pulse search when the number of pulse search loop iterations is limited. The conventional method lacks the part drawn with a broken line in the figure. Suppose one frame has 160 samples and the search is extended 40 samples beyond the frame, to 200 samples. The conventional method first sets the pulse search range to 200 and repeats the pulse search until the number of pulses reaches a predetermined value (15 in FIG. 2) or the total number of loop iterations reaches a predetermined value (25 in FIG. 2). When pulses outside the frame are not searched for, the pulse search range is set to 160 from the start.

The present invention is characterized by the addition of the part drawn with a broken line in FIG. 2. That is, the pulse search length is initially 200 samples, which includes the region outside the frame, and once the number of pulses outside the frame reaches a predetermined number (5 in FIG. 2), the pulse search length is restricted to the frame length of 160.
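The search loop with the added broken-line step can be sketched as follows, using the figures quoted from FIG. 2 as defaults (frame of 160 samples, search length 200, 15 pulses, 25 loop iterations, 5 out-of-frame pulses). This is an illustrative reading of the text rather than the flowchart itself: the function name is hypothetical, pulses are counted by distinct position, the 15-pulse target is applied to in-frame pulses, and the amplitude is divided by `r[0]`. Passing `max_outside=None` gives the conventional behavior without the broken-line step.

```python
def search_pulses(phi, r, frame_len=160, search_len=200,
                  max_pulses=15, max_loops=25, max_outside=5):
    """Greedy multipulse search with a cap on out-of-frame pulses.

    phi : cross-correlation over the search range (not modified)
    r   : autocorrelation of the impulse response, r[0] at lag 0
    Returns (in_frame, outside): dicts mapping position -> amplitude.
    """
    phi = list(phi[:search_len])
    limit = len(phi)              # current search range
    in_frame, outside = {}, {}
    for _ in range(max_loops):
        if len(in_frame) >= max_pulses:
            break
        pos = max(range(limit), key=lambda i: abs(phi[i]))
        if phi[pos] == 0.0:
            break                 # nothing left to explain
        amp = phi[pos] / r[0]     # assumption: least-squares amplitude
        target = in_frame if pos < frame_len else outside
        # A pulse at an occupied position adds to the existing amplitude.
        target[pos] = target.get(pos, 0.0) + amp
        # Broken-line step: restrict the search to the frame once
        # enough pulses have been placed outside it.
        if max_outside is not None and len(outside) >= max_outside:
            limit = min(frame_len, len(phi))
        # Subtract the scaled autocorrelation around the pulse position.
        lo = max(0, pos - (len(r) - 1))
        hi = min(len(phi) - 1, pos + (len(r) - 1))
        for i in range(lo, hi + 1):
            phi[i] -= amp * r[abs(i - pos)]
    return in_frame, outside
```

In a toy case with a 4-sample frame inside a 6-sample search range and a cap of one out-of-frame pulse, the second large out-of-frame peak is never selected, so the remaining loop iterations are spent on in-frame pulses.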

[Effect of the invention]

As explained above, the present invention restricts the pulse search range to the inside of the frame once the number of pulses placed outside the frame reaches or exceeds a predetermined value, and therefore improves the quality of the coded speech when the number of pulse search loop iterations is limited.

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing the configuration of the embodiment of the present invention. FIG. 2 is a flowchart showing the operation of the embodiment of the present invention.

1: synthesis filter, 2: weighting filter, 3: linear prediction analysis circuit, 4: impulse response generation circuit, 5: cross-correlation generation circuit, 6: autocorrelation generation circuit, 7: pulse search circuit, 8: encoding circuit.

Patent applicant: NEC Corporation. Agent: patent attorney Naotaka Ide (井出直孝).

Claims

[Claims]

(1) A speech encoding device comprising: means for determining linear prediction coefficients by linear prediction analysis of a speech input signal; means for obtaining the impulse response of a linear prediction filter corresponding to the linear prediction coefficients; means for obtaining the cross-correlation between the speech input signal and the impulse response; and means for generating a first pulse, located at the time position where the absolute value of the cross-correlation is maximum and having the magnitude of the cross-correlation as its amplitude, and a predetermined number of pulses following the first pulse, each having an amplitude and time position determined from the cross-correlation within a predetermined range; the speech encoding device being characterized by comprising means for restricting the search for the maximum absolute value of the cross-correlation to within the predetermined range of the cross-correlation when the number of pulses outside the predetermined range of the cross-correlation reaches a predetermined number.