JPS5941597B2 - Speech analysis and synthesis device - Google Patents

Speech analysis and synthesis device

Info

Publication number
JPS5941597B2
Authority
JP
Japan
Prior art keywords
linear prediction
residual power
normalized
power
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
JP53001283A
Other languages
Japanese (ja)
Other versions
JPS5494210A (en)
Inventor
哲 田口
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
Nippon Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Electric Co Ltd
Priority to JP53001283A (JPS5941597B2)
Priority to US06/000,942 (US4301329A)
Priority to CA319,347A (CA1123514A)
Publication of JPS5494210A
Publication of JPS5941597B2
Legal status: Expired

Description

[Detailed description of the invention]

The present invention relates to a speech analysis and synthesis device that uses linear prediction coefficients. It reduces the degradation of synthesized speech caused by quantization of the transmission parameters, and it yields acceptable synthesized speech even over transmission paths with a high transmission error rate.

Linear predictive speech analysis and synthesis devices use prediction coefficients obtained as the direct solution of a set of linear equations, partial autocorrelation (PARCOR) coefficients, which are a variant of those prediction coefficients, or linear prediction coefficients derived by transforming either of them. Examples are described in John R. Haskew, J. M. Kelly, Robert M. Kelly, Jr., and Thomas H. McKinney, "Results of a Study of the Linear Prediction Vocoder," IEEE Transactions on Communications, Vol. COM-21, No. 9, September 1973, and in Fumitada Itakura and Shuzo Saito, "Speech Information Compression Using Maximum Likelihood Spectrum Estimation," Journal of the Acoustical Society of Japan, Vol. 27, No. 9 (1971).

To convey information about the analyzed waveform from the analysis side to the synthesis side, such devices use, for example, five kinds of transmission parameters: the linear prediction coefficients, the pitch frequency, the short-time average power, the normalized prediction residual power (the prediction residual power when the short-time average power is normalized to 1), and the voiced/unvoiced decision signal. Alternatively, they use composite parameters formed from these five, such as a composite of the pitch frequency and the voiced/unvoiced decision signal or a composite of the short-time average power and the normalized prediction residual power, or they use part of the speech waveform, the prediction residual waveform, or the like.

In devices of this kind, the transmission parameters must be sent with as few bits as the required quality of the synthesized speech allows, so that the transmission path is used economically. The analysis period of the analyzed waveform must also be made as long as possible. Among the transmission parameters used in a speech analysis and synthesis device based on linear prediction coefficients, it is known experimentally that the normalized prediction residual power, or the composite parameter of short-time average power and normalized prediction residual power, generally changes far more rapidly over time than the other transmission parameters.

Conventional devices of this kind have a first drawback. Both the quantization accuracy of the linear prediction coefficients, which are the spectrum parameters, and the quantization accuracy of the normalized prediction residual power (or of the composite parameter of short-time average power and normalized prediction residual power) affect how faithfully the amplitude of the synthesized speech is reproduced. As a result, the amplitude reproduction of the synthesized speech often deteriorates even when the linear prediction coefficients are quantized accurately enough to represent the spectrum.

A second drawback is that the normalized prediction residual power, or the composite parameter of short-time average power and normalized prediction residual power, changes much faster over time than the other transmission parameters. The analysis period of the analyzed waveform must therefore be made shorter than the period needed to analyze the other transmission parameters, such as the linear prediction coefficients, which increases the required transmission capacity.

A third drawback is that when transmission errors corrupt the linear prediction coefficients, the normalized prediction residual power (or the composite parameter of short-time average power and normalized prediction residual power), or both, the amplitude reproduction of the synthesized speech deteriorates. An increase in the amplitude of the synthesized speech in particular sounds unnatural to the listener.

The first drawback is inherent in conventional devices of this kind. The best trade-off between the quality of the synthesized speech and the transmission capacity can be chosen, but the drawback cannot be fundamentally reduced. One way to ease the second drawback is to use a long analysis frame period for every transmission parameter other than the normalized prediction residual power (or the composite parameter), to use a short analysis frame period for the normalized prediction residual power (or the composite parameter), and to interpolate the long-period parameters at the analysis interval of the short-period parameter. Doing so, however, breaks the relationship between the linear prediction coefficients and the normalized prediction residual power (or the composite parameter) and degrades the quality of the synthesized sound. The third drawback, like the first, is inherent in conventional devices of this kind and cannot be fundamentally remedied.

The object of the present invention is to provide a speech analysis and synthesis device using linear prediction coefficients that reduces the degradation of synthesized speech quality caused by quantization of the transmission parameters, as determined by their transmission interval and quantization accuracy, and that also reduces the occurrence of abnormal amplitudes in the synthesized speech caused by transmission path errors.

The present invention relates to a linear predictive speech analysis and synthesis device that includes, on the synthesis side, means for obtaining the normalized prediction residual power from the transmitted linear prediction coefficients.

A feature of the present invention is that speech is synthesized using the normalized prediction residual power obtained from the linear prediction coefficients as received, that is, after they have been affected by quantization errors and transmission path errors.
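As an illustration of this feature, the following minimal Python sketch compares two ways of setting the excitation level when the partial autocorrelation coefficients are coarsely quantized. It assumes a white excitation and an all-pole synthesis filter, for which the output power equals the excitation power divided by the product of (1 - K_i^2) over the coefficients the filter actually uses. The uniform quantizer, the coefficient values, and every function name are hypothetical choices made for the example and are not taken from the patent.

```python
from math import log10, prod


def quantize_uniform(k, bits):
    """Uniformly quantize a coefficient in (-1, 1) to 2**bits levels (assumed quantizer)."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(levels - 1, max(0, int((k + 1.0) / step)))
    return -1.0 + (index + 0.5) * step


def normalized_residual_power(parcor):
    """Product of (1 - K_i^2) over the partial autocorrelation coefficients."""
    return prod((1.0 - k * k for k in parcor), start=1.0)


def output_power(target_power, u_excitation, u_filter):
    """Output power of an all-pole filter driven by white excitation.

    The excitation power is target_power * u_excitation; the filter built
    from its own coefficients amplifies power by 1 / u_filter.
    """
    return target_power * u_excitation / u_filter


if __name__ == "__main__":
    k_analysis = [0.97, -0.6, 0.3]            # hypothetical analysis-side values
    k_received = [quantize_uniform(k, 4) for k in k_analysis]

    u_sent = normalized_residual_power(k_analysis)    # transmitted separately
    u_local = normalized_residual_power(k_received)   # recomputed at the synthesizer

    conventional = output_power(1.0, u_sent, u_local)
    recomputed = output_power(1.0, u_local, u_local)
    print("level error, transmitted value (dB):", 10.0 * log10(conventional))
    print("level error, recomputed value (dB):", 10.0 * log10(recomputed))
```

Under these assumptions the recomputed value cancels the gain of the filter exactly, whereas the separately transmitted value leaves a level error that depends on how far quantization moved the coefficients.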

This makes it possible to compensate for the amplitude fluctuations in the synthesized speech that arise when the linear prediction coefficients are altered by quantization and by transmission path errors. The present invention is described in detail below with reference to the drawing.

The figure is a block diagram showing one embodiment of the present invention. In the figure, 101 denotes the analysis side, 102 the linear prediction coefficient transmission path, 103 the sound source information transmission path, and 104 the synthesis side. Speech waveform data is input through the waveform input terminal 105 to the linear prediction analyzer 106 and the sound source parameter analyzer 107. The linear prediction analyzer 106 measures the linear prediction coefficients from the speech waveform data and supplies them to the linear prediction coefficient encoder 108, which quantizes them and outputs them to the linear prediction coefficient transmission path 102. The sound source parameter analyzer 107 measures the sound source information, which consists of the pitch frequency, the voiced/unvoiced decision signal, and the short-time average power, from the speech waveform data and supplies it to the sound source information encoder 109, which quantizes it and outputs it to the sound source information transmission path 103.

The linear prediction coefficients quantized by the encoder 108 are supplied through the transmission path 102 to the linear prediction coefficient decoder 110. The decoder 110 decodes the quantized linear prediction coefficients, to which transmission errors may have been added on the path 102, and outputs the decoded coefficients to the decoded linear prediction coefficient transmission paths 111 and 112. The normalized prediction residual power calculator 113 obtains the normalized prediction residual power from the decoded linear prediction coefficients supplied via the path 111 and outputs it to the normalized prediction residual power transmission path 114. For example, if the decoded linear prediction coefficients are in the form of partial autocorrelation coefficients, then with $P$ the order and $K_i$ the $i$-th order partial autocorrelation coefficient, the normalized prediction residual power is obtained as

$$\prod_{i=1}^{P} \left(1 - K_i^{2}\right).$$
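The calculation attributed to the calculator 113 can be sketched in a few lines of Python. This is only an illustration of the product of (1 - K_i^2) over the decoded partial autocorrelation coefficients; the function name and the range check are assumptions added for the example.

```python
from math import prod


def normalized_residual_power(parcor):
    """Return the product of (1 - K_i^2) for coefficients K_1..K_P.

    The check that every |K_i| < 1 is an added safeguard (an assumption of
    this sketch): a corrupted coefficient outside that range would make the
    product zero or negative.
    """
    for k in parcor:
        if not -1.0 < k < 1.0:
            raise ValueError("partial autocorrelation coefficient out of range")
    return prod((1.0 - k * k for k in parcor), start=1.0)


if __name__ == "__main__":
    decoded = [0.9, -0.5, 0.2]   # hypothetical decoded coefficients
    print(normalized_residual_power(decoded))
```

An empty coefficient list returns 1.0, which matches the order-0 case in which no prediction is performed.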

The prediction residual power is the power of the prediction residual signal, that is, of the signal that remains after the component formed from the prediction coefficients is removed from the input signal. The normalized prediction residual power is obtained by dividing this prediction residual power by the power of the input signal. It is well known that the normalized prediction residual power need not be obtained by measuring the power of the prediction residual signal directly; it can be calculated from the linear prediction coefficients (partial autocorrelation coefficients) as described above. This is shown, for example, by an equation in Fumitada Itakura, "Speech Feature Extraction by Statistical Methods," Proceedings of the 8th Symposium sponsored by the Research Institute of Electrical Communication, Tohoku University, February 1971, pages -5-1 to -5-12.
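The equivalence stated here can be checked numerically. The sketch below uses the Levinson-Durbin recursion on the autocorrelation of a test signal, which is a standard technique chosen for this example and not a method named in the patent, and compares the directly measured ratio of residual power to input power with the product of (1 - K_i^2). The test signal, the order, and all function names are assumptions. Sign conventions for partial autocorrelation coefficients vary between texts, but only their squares matter here.

```python
import math
import random


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite frame (zero outside the frame)."""
    n = len(x)
    return [sum(x[t] * x[t + lag] for t in range(n - lag)) for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return (predictor a_1..a_P, coefficients K_1..K_P, final error power)."""
    a = [0.0] * (order + 1)
    parcor = []
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[j] * r[m - j] for j in range(1, m))
        k = acc / err
        new_a = a[:]
        new_a[m] = k
        for j in range(1, m):
            new_a[j] = a[j] - k * a[m - j]
        a = new_a
        err *= 1.0 - k * k
        parcor.append(k)
    return a[1:], parcor, err


def direct_normalized_residual_power(x, a):
    """Residual energy divided by input energy, measured sample by sample."""
    order = len(a)
    padded = list(x) + [0.0] * order   # let the predictor run off the end of the frame
    energy = 0.0
    for t in range(len(padded)):
        pred = sum(a[j] * padded[t - 1 - j] for j in range(order) if t - 1 - j >= 0)
        energy += (padded[t] - pred) ** 2
    return energy / sum(v * v for v in x)


if __name__ == "__main__":
    rng = random.Random(0)
    x = [math.sin(0.3 * t) + 0.5 * math.sin(1.1 * t) + 0.1 * rng.gauss(0.0, 1.0)
         for t in range(200)]
    r = autocorrelation(x, 4)
    a, parcor, _ = levinson_durbin(r, 4)
    print("from coefficients:", math.prod(1.0 - k * k for k in parcor))
    print("measured directly:", direct_normalized_residual_power(x, a))
```

The two printed values should agree to within floating-point rounding, because both describe the same residual energy of the zero-padded frame.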

The normalized prediction residual power is the value obtained by evaluating $U_n$ in this equation recursively up to the order $P$ of the partial autocorrelation coefficients. That is, with $K_n$ denoting the $n$-th order partial autocorrelation coefficient,

$$U_1 = U_0\left(1 - K_1^{2}\right),\quad U_2 = U_1\left(1 - K_2^{2}\right),\quad \ldots,\quad U_P = U_{P-1}\left(1 - K_P^{2}\right),$$

and the resulting $U_P$ is the normalized prediction residual power. Here $U_0$ is the normalized prediction residual power for order 0 of the partial autocorrelation coefficients, in other words when no prediction is performed, and it equals 1.0 for the following reason. When no prediction is performed, the prediction residual signal is identical to the input speech signal. The prediction residual power therefore equals the short-time average power, which is the power of the input speech signal, and the ratio of the two powers is 1.0. That $U_n$ and $U_{n-1}$ are the prediction residual powers for orders $n$ and $n-1$ of the partial autocorrelation coefficients is clear from the following.

According to equation (20) in the above paper, $U_{n-1}$ is the expected value of the square of $\varepsilon_t^{(n-1)}$, that is, the power of the signal $\varepsilon^{(n-1)}$. The signal $\varepsilon^{(n-1)}$ is conventionally called the prediction error (or the prediction residual, the prediction error signal, or the prediction residual signal). Hence $U_{n-1}$ and $U_n$ are the prediction residual powers for orders $n-1$ and $n$ of the partial autocorrelation coefficients, respectively.

It is therefore clear that $U_n$ computed with the initial value $U_0 = 1.0$ represents the normalized prediction residual power. The normalized prediction residual power is input to the excitation signal generator 115 through the normalized prediction residual power transmission path 114.

The sound source information quantized by the sound source information encoder 109 is supplied through the sound source information transmission path 103 to the sound source information decoder 116. The decoder 116 decodes the quantized sound source information and supplies it through the decoded sound source information transmission path 117 to the excitation signal generator 115. The excitation signal generator 115 generates a filter excitation signal from the normalized prediction residual power and the decoded sound source information and supplies it through the filter excitation signal transmission path 118 to the speech synthesis filter 119. The speech synthesis filter 119 synthesizes speech from the decoded linear prediction coefficients supplied via the decoded linear prediction coefficient transmission path 112 and the filter excitation signal, and outputs it to the waveform output terminal 120.

When the decoded linear prediction coefficients are in a form other than partial autocorrelation coefficients, the normalized prediction residual power calculator 113 can clearly still obtain the normalized prediction residual power, for example by converting them to partial autocorrelation coefficients through a linear operation or by an equivalent means. Transmission parameters such as the short-time average power supplied through the sound source information transmission path 103 are also affected by the transmission path. However, they change more slowly over time than the normalized prediction residual power, so smoothing them on the receiving side has relatively little effect on the quality of the synthesized sound and readily reduces the effect of transmission path errors. This does not interfere with the object of the present invention.

Because the present invention is not directly tied to the method of transmitting the sound source information, it can clearly also be applied to a voice-excited linear predictive speech analysis and synthesis device (for example, B. S. Atal, M. R. Schroeder, and V. Stover, Bell Telephone Laboratories, Murray Hill, N.J. 07974, "Voice-Excited Predictive Coding System for Low-Bit-Rate Transmission of Speech," IEEE Catalog Number 75CH0971-2 CSCB, ICC 75, June 16-18).
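The synthesis-side path through the calculator 113, the excitation signal generator 115, and the speech synthesis filter 119 can be sketched as follows. The patent does not give the gain formula or the filter structure, so this sketch makes its own assumptions: the excitation is a unit-power pulse train (voiced) or unit-power white noise (unvoiced), it is scaled by the square root of the short-time average power times the normalized prediction residual power, and the filter is a direct-form all-pole filter whose coefficients are obtained from the partial autocorrelation coefficients by the standard step-up recursion. All names are hypothetical.

```python
import math
import random


def normalized_residual_power(parcor):
    """Role of calculator 113: product of (1 - K_i^2)."""
    return math.prod((1.0 - k * k for k in parcor), start=1.0)


def parcor_to_predictor(parcor):
    """Step-up recursion: K_1..K_P -> a_1..a_P, prediction = sum(a_j * y[t - j])."""
    a = []
    for k in parcor:
        m = len(a) + 1
        a = [a[j] - k * a[m - 2 - j] for j in range(m - 1)] + [k]
    return a


def make_excitation(n, voiced, pitch_period, rng):
    """Unit-power excitation: pulse train if voiced, white noise otherwise."""
    if voiced:
        amp = math.sqrt(pitch_period)
        return [amp if t % pitch_period == 0 else 0.0 for t in range(n)]
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def synthesize(parcor, avg_power, voiced, pitch_period, n, seed=0):
    """Return n output samples for one set of decoded parameters."""
    rng = random.Random(seed)
    u = normalized_residual_power(parcor)
    gain = math.sqrt(avg_power * u)           # assumed role of generator 115
    excitation = [gain * e for e in make_excitation(n, voiced, pitch_period, rng)]
    a = parcor_to_predictor(parcor)
    out = []
    for t, e in enumerate(excitation):        # assumed form of filter 119
        fed_back = sum(a[j] * out[t - 1 - j] for j in range(len(a)) if t - 1 - j >= 0)
        out.append(e + fed_back)
    return out


if __name__ == "__main__":
    y = synthesize([0.5, -0.3], avg_power=4.0, voiced=False, pitch_period=80, n=50000)
    print("measured output power:", sum(v * v for v in y) / len(y))
```

With the noise excitation, the measured output power should come out close to the requested short-time average power, because the gain computed from the decoded coefficients offsets the power gain of the filter built from the same coefficients.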

The same holds for a speech analysis and synthesis device of the prediction-residual-waveform excitation type (for example, Chong Kwan Un and D. Thomas Magill, "The Residual-Excited Linear Prediction Vocoder with Transmission Rate Below 9.6 kbits/s," IEEE Transactions on Communications, Vol. COM-23, No. 12, December 1975). On the analysis side, the prediction residual waveform is divided by the normalized prediction residual power obtained from the analysis-side linear prediction coefficients, which compresses the amplitude range of the residual waveform, and the result is transmitted to the synthesis side. On the synthesis side, the received residual waveform is multiplied by the normalized prediction residual power obtained there from the linear prediction coefficients. This clearly prevents transmission path errors and similar disturbances in the linear prediction coefficients from degrading the amplitude reproduction of the synthesized speech.
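The divide-then-multiply scheme described for the residual-excited case can be sketched as below. Following the text, the sketch scales by the normalized prediction residual power itself; whether a practical design would scale by its square root instead is left open here. The function names and sample values are hypothetical.

```python
from math import prod


def normalized_residual_power(parcor):
    """Product of (1 - K_i^2) over the partial autocorrelation coefficients."""
    return prod((1.0 - k * k for k in parcor), start=1.0)


def compress_residual(residual, parcor_analysis):
    """Analysis side: divide the residual by the value from its own coefficients."""
    u = normalized_residual_power(parcor_analysis)
    return [e / u for e in residual]


def expand_residual(compressed, parcor_received):
    """Synthesis side: multiply by the value from the coefficients as received."""
    u = normalized_residual_power(parcor_received)
    return [c * u for c in compressed]


if __name__ == "__main__":
    residual = [0.02, -0.01, 0.03, 0.0]       # hypothetical residual samples
    k_analysis = [0.9, -0.5]
    k_received = [0.875, -0.5]                # first coefficient altered in transit
    sent = compress_residual(residual, k_analysis)
    print(expand_residual(sent, k_analysis))  # unaltered coefficients
    print(expand_residual(sent, k_received))  # rescaled to match the received filter
```

When the received coefficients differ from the analysis-side ones, the recovered residual is rescaled by the ratio of the two normalized prediction residual powers, so the excitation level follows the filter that the synthesis side actually uses.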

[Brief description of the drawing]

The figure is a block diagram for explaining an embodiment of the present invention.

101: analysis side
102: linear prediction coefficient transmission path
103: sound source information transmission path
104: synthesis side
105: waveform input terminal
106: linear prediction analyzer
107: sound source parameter analyzer
108: linear prediction coefficient encoder
109: sound source information encoder
110: linear prediction coefficient decoder
111: decoded linear prediction coefficient transmission path
112: decoded linear prediction coefficient transmission path
113: normalized prediction residual power calculator
114: normalized prediction residual power transmission path
115: excitation signal generator
116: sound source information decoder
117: decoded sound source information transmission path
118: filter excitation signal transmission path
119: speech synthesis filter
120: waveform output terminal

Claims (1)

[Claims]

1. A speech analysis and synthesis device in which the analysis side measures, at predetermined time intervals, parameters representing the frequency spectrum of the input speech signal, such as linear prediction coefficients, and speech is synthesized by setting the coefficients of a synthesis filter provided on the synthesis side according to said parameters, characterized in that the synthesis side comprises means for calculating, from said parameters, a normalized prediction residual power defined as the prediction residual power divided by the input signal power, and in that this normalized prediction residual power is used as excitation source information for said synthesis filter.
JP53001283A 1978-01-09 1978-01-09 Speech analysis and synthesis device Expired JPS5941597B2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP53001283A JPS5941597B2 (en) 1978-01-09 1978-01-09 Speech analysis and synthesis device
US06/000,942 US4301329A (en) 1978-01-09 1979-01-04 Speech analysis and synthesis apparatus
CA319,347A CA1123514A (en) 1978-01-09 1979-01-09 Speech analysis and synthesis apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP53001283A JPS5941597B2 (en) 1978-01-09 1978-01-09 Speech analysis and synthesis device

Publications (2)

Publication Number Publication Date
JPS5494210A JPS5494210A (en) 1979-07-25
JPS5941597B2 JPS5941597B2 (en) 1984-10-08

Family

ID=11497117

Family Applications (1)

Application Number Title Priority Date Filing Date
JP53001283A Expired JPS5941597B2 (en) 1978-01-09 1978-01-09 Speech analysis and synthesis device

Country Status (1)

Country Link
JP (1) JPS5941597B2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011156866A1 (en) 2010-06-18 2011-12-22 Nautilus Minerals Pacific Pty Ltd Method and apparatus for bulk seafloor mining

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5660499A (en) * 1979-10-22 1981-05-25 Casio Computer Co Ltd Audible sounddsource circuit for voice synthesizer

Also Published As

Publication number Publication date
JPS5494210A (en) 1979-07-25

Similar Documents

Publication Publication Date Title
US8630863B2 (en) Method and apparatus for encoding and decoding audio/speech signal
US8452606B2 (en) Speech encoding using multiple bit rates
US8396706B2 (en) Speech coding
US8473284B2 (en) Apparatus and method of encoding/decoding voice for selecting quantization/dequantization using characteristics of synthesized voice
JPH0962299A (en) Code exciting linear predictive coding device
EP1096476B1 (en) Speech signal decoding
JPH0850500A (en) Voice encoder and voice decoder as well as voice coding method and voice encoding method
US5027405A (en) Communication system capable of improving a speech quality by a pair of pulse producing units
JPH09152896A (en) Sound path prediction coefficient encoding/decoding circuit, sound path prediction coefficient encoding circuit, sound path prediction coefficient decoding circuit, sound encoding device and sound decoding device
EP0849724A2 (en) High quality speech coder and coding method
JP3248215B2 (en) Audio coding device
KR0155315B1 (en) Celp vocoder pitch searching method using lsp
JPS5941597B2 (en) Speech analysis and synthesis device
JPH0782360B2 (en) Speech analysis and synthesis method
US4908863A (en) Multi-pulse coding system
KR20060067016A (en) Apparatus and method for voice coding
JP2956068B2 (en) Audio encoding / decoding system
KR100205060B1 (en) Pitch detection method of celp vocoder using normal pulse excitation method
JP2581050B2 (en) Voice analysis and synthesis device
JPH05224698A (en) Method and apparatus for smoothing pitch cycle waveform
JP2853126B2 (en) Multi-pulse encoder
JP3475958B2 (en) Speech encoding / decoding apparatus including speechless encoding, decoding method, and recording medium recording program
KR0138878B1 (en) Method for reducing the pitch detection time of vocoder
JPH01257999A (en) Voice signal encoding and decoding method, voice signal encoder and voice signal decoder
JPH06208398A (en) Generation method for sound source waveform