JPH01207799A - Multipulse voice encoder

Info

Publication number: JPH01207799A
Application number: JP63033351A
Authority: JP
Inventors: Yasuhiro Wake; 和気　靖浩
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1988-02-15
Filing date: 1988-02-15
Publication date: 1989-08-21

Abstract

PURPOSE: To reduce the amount of computation and improve the quality of synthesized speech by selecting the number of pulses for each subframe based on the power ratio of the frequency components and extracting that number of sound source pulses. CONSTITUTION: The outputs of band-pass filters (BPFs) 1 to N, each with a different passband, are sent to a power calculation part 7, which calculates the power of each frequency component. A pulse number assignment table 9 selects the number of pulses to assign to each subframe based on the power ratio of the frequency components. A sound source pulse search part 4 with pitch prediction searches for the driving sound source pulses sequentially, based on the output of an LPC parameter extraction part 2, the output of a pitch extraction part 3, and the number of pulses for each subframe, and generates the position and amplitude of each sound source pulse. Consequently, the amount of computation can be reduced and the quality of synthesized speech improved.

Description

[Detailed Description of the Invention]

(Industrial Application Field) The present invention relates to a speech processing device, and in particular to a multipulse speech encoding device with pitch prediction that extracts the pitch from input speech and uses this pitch information to extract and transmit the driving sound source pulses of the speech.

(Prior Art) Conventionally, because of limits on memory capacity and computation time, a multipulse speech encoding device divided one frame into several fixed-length subframes and obtained the sound source pulses for each. To improve the speech quality of a multipulse speech processing device of this configuration, a method that uses the pitch of the input speech was devised (the multipulse speech encoding device with pitch prediction). This type of device determines in advance the number of driving sound source pulses to be obtained within one frame, divides the frame into subframes of the pitch period, and extracts and transmits sound source pulses for each subframe. The following schemes have been considered for choosing the number of sound source pulses per subframe.

(1) Within the constraint that the number of pulses per frame is constant, the number of pulses per subframe is fixed.

(2) Within the constraint that the number of pulses per frame is constant, the number of pulses per subframe is variable.
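The two schemes can be illustrated with a minimal Python sketch. It assumes a frame-level pulse budget `total_pulses` that must be preserved. The function names, the `weights` argument, and the largest-remainder rounding are illustrative assumptions; the source does not say how either scheme distributes the pulses.

```python
def fixed_allocation(total_pulses, n_subframes):
    """Scheme (1): the same count for every subframe.

    Any leftover pulses go to the earliest subframes (an assumption).
    """
    base, extra = divmod(total_pulses, n_subframes)
    return [base + (1 if i < extra else 0) for i in range(n_subframes)]


def variable_allocation(total_pulses, weights):
    """Scheme (2): split the frame budget in proportion to hypothetical weights."""
    total_weight = sum(weights)
    if total_weight <= 0:
        return fixed_allocation(total_pulses, len(weights))
    shares = [total_pulses * w / total_weight for w in weights]
    counts = [int(s) for s in shares]
    # Largest-remainder rounding keeps the frame total exactly constant.
    leftover = total_pulses - sum(counts)
    order = sorted(range(len(weights)),
                   key=lambda i: shares[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts
```

For example, `fixed_allocation(16, 4)` gives four pulses to each subframe, while `variable_allocation(16, [1, 3, 2, 2])` shifts pulses toward the more heavily weighted subframes without changing the frame total.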

(Problems to be Solved by the Invention) The conventional multipulse speech encoding device with pitch prediction described above has the following problems, depending on how the number of sound source pulses per subframe is chosen.

(1) If the number of pulses per subframe (pitch period) is fixed, the amount of computation needed for the pulse search is small.

However, the input speech varies from one pitch period to the next while the number of driving sound source pulses per subframe stays fixed. The synthesized speech quality therefore degrades when the input speech contains high-frequency components in a particular subframe section, or when the phoneme changes within one frame, because the waveform cannot be reproduced faithfully with only the predetermined number of sound source pulses.

(2) If the number of pulses per subframe is variable, the degradation of synthesized speech quality described in (1) can be avoided. However, the means for determining the number of pulses for each subframe is very complex and requires an enormous amount of computation. The conventional pulse number determination method first performs a pulse search over the entire frame, determines the bias of the pulse positions within the frame, and then readjusts the number of pulses for each subframe. FIG. 2 shows an example of a conventional multipulse speech encoding device with pitch prediction.
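The second step of the conventional method, finding how the frame-wide pulses are distributed over the subframes, might look like the following Python sketch. It assumes the frame-wide search has already returned a list of pulse positions (sample indices) and that `subframe_starts` holds the sorted first sample index of each subframe, beginning with 0. Both names are hypothetical, and the frame-wide search and the final readjustment are not shown.

```python
from bisect import bisect_right


def pulses_per_subframe(pulse_positions, subframe_starts):
    """Count how many frame-wide pulses fall inside each subframe."""
    counts = [0] * len(subframe_starts)
    for pos in pulse_positions:
        # Index of the last subframe whose start is <= pos.
        counts[bisect_right(subframe_starts, pos) - 1] += 1
    return counts
```

The cost the text refers to comes from running the full pulse search once over the frame before any such count can be taken, and then searching again per subframe.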

(Means for Solving the Problems) In addition to the components of the conventional multipulse speech encoding device with pitch prediction, the multipulse speech encoding device of the present invention has: a plurality of band-pass filters for extracting the frequency components contained in the input speech; an allocation table of the number of sound source pulses for each subframe; means for determining the power ratio of the frequency components; and means for selecting, based on this power ratio, the number of pulses to be extracted from the per-subframe pulse number allocation table and extracting that number of sound source pulses.

(Embodiment) Next, the present invention is described in more detail with reference to an embodiment. FIG. 1 is a block circuit diagram showing one embodiment of the present invention. In FIG. 1, the speech signal input from input terminal 1 is fed to a linear prediction parameter (LPC) extraction unit 2, a pitch extraction unit 3, a sound source pulse search unit 4 with pitch prediction, and a band filter bank 6. The sound source pulse search unit 4 with pitch prediction consists of a subframe-section pulse search unit 41, which searches for sound source pulses in each subframe, and a pitch prediction unit 42, which predicts the pitch.
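Because the pulse search runs once per pitch-period subframe, the frame first has to be cut at pitch-period boundaries. A minimal Python sketch of that division follows. The function name is hypothetical, and truncating the last subframe at the frame end is an assumption, since the source does not say how a remainder shorter than one pitch period is handled.

```python
def split_into_subframes(frame_length, pitch_period):
    """Return (start, end) sample ranges of pitch-period subframes covering a frame."""
    if pitch_period <= 0:
        raise ValueError("pitch_period must be positive")
    bounds = []
    start = 0
    while start < frame_length:
        # The final subframe is cut short at the frame end (assumption).
        end = min(start + pitch_period, frame_length)
        bounds.append((start, end))
        start = end
    return bounds
```

With an illustrative frame of 160 samples and a pitch period of 50 samples, this yields three full subframes and one shorter one.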

The band filter bank 6 consists of band-pass filters (BPF-1, 2, ..., N), each with a different passband. The output of each filter is sent to the power calculation unit 7, which calculates the power of each frequency component. These powers are input to the power comparison unit 8, which generates the power ratio of the frequency components for each subframe. In the pulse number allocation table 9, the number of pulses allocated to each subframe is selected based on the power ratios output from the power comparison unit 8. The per-subframe allocation result is input to the sound source pulse search unit 4 with pitch prediction described above.
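The chain from filter outputs to an allocated pulse count can be sketched in Python as follows. The sketch takes the band-pass filter outputs for one subframe as given, so the filters themselves are not implemented. The mean-square power measure, the normalization of the ratios to a sum of 1, the table contents, and the choice to index the table by the share of the highest band are all assumptions made for illustration; the source gives neither the table values nor the indexing rule.

```python
def band_powers(band_outputs):
    """Mean-square power of each band-pass filter output for one subframe."""
    return [sum(x * x for x in band) / len(band) for band in band_outputs]


def power_ratios(powers):
    """Share of the total power carried by each band."""
    total = sum(powers)
    if total == 0:
        return [0.0] * len(powers)
    return [p / total for p in powers]


# Hypothetical table: (upper bound on the high-band share, pulses for the subframe).
PULSE_TABLE = [(0.25, 2), (0.50, 4), (1.00, 6)]


def lookup_pulse_count(ratios, table=PULSE_TABLE):
    """Pick a pulse count from the table using the highest band's power share."""
    high_share = ratios[-1]
    for upper, n_pulses in table:
        if high_share <= upper:
            return n_pulses
    return table[-1][1]
```

A table lookup of this kind replaces the frame-wide trial search of the conventional method with one comparison per subframe. Note that the sketch does not enforce a constant pulse total per frame; a real table would have to be designed with that constraint in mind.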

The sound source pulse search unit 4 with pitch prediction searches for the driving sound source pulses sequentially, based on the output of the LPC parameter extraction unit 2, the output of the pitch extraction unit 3, and the number of pulses for each subframe, and generates the position and amplitude of each sound source pulse. These positions and amplitudes are input to the multiplexing circuit 5. The multiplexing circuit 5 multiplexes the LPC parameters, the pitch information, and the sound source pulses, and outputs the result from output terminal 10.
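The source does not describe the search procedure itself. As a rough illustration, the Python sketch below uses a generic greedy multipulse search in place of the patent's unit 4: it places one pulse at a time where it most reduces the squared error against a target signal, given the impulse response of the synthesis filter. The function name and arguments are hypothetical, and pitch prediction and perceptual weighting are omitted.

```python
def search_pulses(target, impulse_response, n_pulses):
    """Greedy search returning n_pulses (position, amplitude) pairs for one subframe."""
    residual = list(target)
    n = len(residual)
    pulses = []
    for _ in range(n_pulses):
        best = None  # (error reduction, position, amplitude)
        for pos in range(n):
            span = min(len(impulse_response), n - pos)
            corr = sum(residual[pos + k] * impulse_response[k] for k in range(span))
            energy = sum(impulse_response[k] ** 2 for k in range(span))
            if energy == 0:
                continue
            gain = corr * corr / energy
            if best is None or gain > best[0]:
                best = (gain, pos, corr / energy)
        if best is None:
            break
        _, pos, amp = best
        pulses.append((pos, amp))
        # Remove this pulse's contribution before searching for the next one.
        for k in range(min(len(impulse_response), n - pos)):
            residual[pos + k] -= amp * impulse_response[k]
    return pulses
```

Here `n_pulses` is the per-subframe count supplied by the allocation step, so the search runs only once per subframe.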

(Effects of the Invention) As explained above, the method of the present invention for determining the number of sound source pulses per subframe requires far less computation than the conventional method, which first performs a pulse search over the entire frame and then readjusts the number of pulses for each subframe.

In addition, because the number of driving sound source pulses is variable for each subframe in the present invention, the bias of the sound source pulses within a frame can be represented and the synthesized speech quality improved, compared with the conventional method in which the number of pulses per subframe is fixed.

Thus, the present invention can provide a multipulse speech encoding device with pitch prediction that requires little computation and can still improve the quality of synthesized speech.

[Brief Description of the Drawings]

FIG. 1 is a block circuit diagram showing one embodiment of the present invention, and FIG. 2 is a block circuit diagram showing an example of a conventional multipulse encoding device with pitch prediction.

1: input terminal; 2: linear prediction parameter (LPC) extraction unit; 3: pitch extraction unit; 4: driving sound source pulse search unit with pitch prediction; 41: subframe-section sound source pulse search unit; 42: pitch prediction unit; 5: multiplexing block; 6: band filter bank; 7: power calculation unit; 8: power comparison unit; 9: sound source pulse number allocation table for each subframe; 10: output terminal; 11: frame-section sound source pulse search unit; 12: per-subframe pulse number adjustment block.

Claims

[Claims]

A multipulse speech encoding device that divides input speech into frames of a fixed time length and, for each frame, extracts and transmits a fixed number of driving sound source pulses from the input speech, characterized by comprising: means for extracting the pitch of the input speech and dividing the frame into subframes having the pitch period; band-pass filters with mutually different passbands, to which the input speech of the subframe section is input; means for determining the power ratio of the frequency components extracted by the respective band-pass filters; a sound source pulse number allocation table for determining, based on the power ratio, the allocation of the number of sound source pulses for each subframe; and means for extracting the number of sound source pulses indicated by the sound source pulse number allocation table.