JPH03245198A

JPH03245198A - Voice analyzing and synthesizing device

Info

Publication number: JPH03245198A
Application number: JP4314190A
Authority: JP
Inventors: Takayuki Ishikawa; 孝行石川
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1990-02-23
Filing date: 1990-02-23
Publication date: 1991-10-31

Abstract

PURPOSE: To reproduce stable, highly intelligible, high-quality speech by transmitting all the basic analysis frames of a transmission frame as a variable-length frame that represents them with a plurality of representative analysis frames. CONSTITUTION: From the m mutually different minimum distortion amounts obtained by minimum distortion amount determining means 105 to 107, a representative frame deciding means 109 selects the optimum minimum distortion amount, set by taking both the synthesized speech and the data transmission quantity into account, and decides the representative analysis frame candidates that form the corresponding interpolated LPC (Linear Prediction Coding) coefficient sequence to be the representative analysis frames. It then expresses the n analysis frames as a variable-length frame by the count and frame numbers of the representative analysis frames. Deterioration in the quality of the synthesized speech is thereby suppressed, and high-quality synthesized speech is obtained.

Description

[Detailed Description of the Invention] [Industrial Field of Application] The present invention relates to a speech analysis and synthesis device, and in particular to a speech analysis and synthesis device that converts LPC coefficients, which are feature parameters of the input speech to be supplied from the analysis side to the synthesis side together with sound source information, into variable-length frames for transmission.

[Conventional technology]

Conventionally, in this type of speech analysis and synthesis device, the analysis side, which analyzes the input speech, performs LPC analysis on the speech signal for each basic analysis frame of, for example, 10 ms, and transmits the resulting LPC coefficients to the synthesis side together with sound source information such as pitch period and power information, where the input speech is synthesized. To compress the amount of speech information to be transmitted as far as possible, the LPC coefficients obtained for each basic analysis frame are not transmitted frame by frame. Instead, for example, n basic analysis frames are grouped into one transmission frame, P representative frames are selected from each transmission frame, and within the range of the transmission frame consisting of these n basic analysis frames the spectral envelope of the input speech is expressed entirely by the LPC coefficients of the representative frames. In other words, the n basic analysis frames are expressed by P frames (n > P), so that not all n basic analysis frames are transmitted. The amount of speech information to be transmitted was compressed by a variable-length frame representation that transmits the LPC coefficients of the P representative frames and their repetition counts L_i (i = 1 to P, with n = L_1 + ... + L_P).
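The conventional representation described above can be illustrated with a short sketch. It assumes that the synthesis side simply repeats each representative frame's LPC coefficients L_i times to cover the n basic analysis frames; the function name `expand_representatives` and the sample values are hypothetical and do not come from the patent.

```python
from typing import List, Sequence


def expand_representatives(
    rep_coeffs: Sequence[Sequence[float]],
    repeat_counts: Sequence[int],
) -> List[List[float]]:
    """Rebuild n frames from P representative frames and repetition counts L_i.

    Each representative coefficient vector is held for L_i consecutive basic
    analysis frames, so the result has n = sum(L_i) frames.
    """
    if len(rep_coeffs) != len(repeat_counts):
        raise ValueError("one repetition count is needed per representative frame")
    frames: List[List[float]] = []
    for coeffs, count in zip(rep_coeffs, repeat_counts):
        # Copy the vector for every repetition so frames do not share storage.
        frames.extend([list(coeffs) for _ in range(count)])
    return frames


# Hypothetical example: P = 3 representative frames covering n = 6 frames.
reps = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.0]]
counts = [2, 1, 3]
frames = expand_representatives(reps, counts)
```

Because P is fixed in this scheme, a transmission frame in which the speech changes quickly gets no more representative frames than one in which it barely changes.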

This is based on the fact that the macroscopic structure of speech varies with time and its rate of change is itself non-uniform in time, so that speech can be analyzed more efficiently by processing it in variable-length frames that follow the changes in the speech than by analyzing it in fixed frames at equal intervals.

[Problem to be solved by the invention]

In the conventional variable-length-frame speech analysis and synthesis device described above, the number of representative frames P is fixed in order to keep the amount of transmitted information constant (for example, 1200 bps). The device therefore cannot follow speech that changes rapidly and has large temporal distortion: P representative frames alone cannot accurately represent a transmission frame of n basic analysis frames. This degrades the quality of the synthesized speech and produces unstable, unclear synthesized speech.

An object of the present invention is to provide a speech analysis and synthesis device that eliminates the above-mentioned drawback, markedly suppresses deterioration in the quality of synthesized speech, and obtains high-quality synthesized speech.

[Means to solve the problem]

The speech analysis and synthesis device of the present invention comprises: feature parameter extraction means for analyzing an input speech signal in units of transmission frames, for each of a predetermined number n of analysis frames, to extract LPC coefficients of a predetermined order and sound source information; representative frame candidate selection means for selecting, from the n analysis frames, combinations of analysis frames in m mutually different counts as representative analysis frame candidates that represent the n analysis frames; minimum distortion measuring means for interpolating, by interpolation and extrapolation, the LPC coefficients of the representative analysis frame candidates selected by the representative frame candidate selection means to obtain m interpolated LPC coefficient sequences each spanning n analysis frames, measuring as a distortion amount the difference between the m interpolated LPC coefficient sequences and the LPC coefficient sequence of the n analysis frames extracted by the feature parameter extraction means, and obtaining the m minimum distortion amounts at which the distortion amount is smallest; representative frame determining means for determining, from the m minimum distortion amounts obtained by the minimum distortion measuring means, the representative analysis frame candidates that form the interpolated LPC coefficient sequence having the optimum minimum distortion amount, set by taking the synthesized speech and the data transmission amount into account, to be the representative analysis frames, and for expressing the n analysis frames as a variable-length frame by the count and frame numbers of these representative analysis frames; analysis information sending means for sending the count and frame numbers of the representative analysis frames, together with the sound source information, from the analysis side to the synthesis side as speech analysis information; and speech synthesis means for synthesizing the input speech signal on the basis of the speech analysis information.

[Embodiment] Next, the present invention will be described with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of an embodiment of the present invention.

The embodiment shown in FIG. 1 consists of an analysis side (transmitting side) 1, a synthesis side (receiving side) 2, and a transmission path 3.

The analysis side 1 comprises an LPC analyzer 101, which receives the input speech signal, band-limits it to a predetermined band, quantizes it, and performs LPC analysis on the quantized speech signal; representative frame candidate selector (1) 102, representative frame candidate selector (2) 103 and representative frame candidate selector (3) 104, which select representative frames; minimum distortion measuring unit (1) 105, minimum distortion measuring unit (2) 106 and minimum distortion measuring unit (3) 107, which measure the amount of distortion; a pitch extractor 108; an optimum distortion determiner 109; an LPC parameter editor 110; and a multiplexer 111.

The synthesis side 2 comprises a demultiplexer 201, an LPC parameter decoder 202, a pitch generator 203, a switch 204, an LPC parameter interpolator 205, a noise generator 206, an LPC synthesis filter 207, and a variable gain amplifier 208.

Next, the operation of the embodiment shown in FIG. 1 will be described.

In the LPC analyzer 101, the input speech signal is passed through a BPF (Band Pass Filter) to remove unwanted signal components, then sampled for each basic analysis frame by an A-D converter at a predetermined sampling frequency, for example 8 kHz, and quantized with a predetermined number of bits, for example 12 bits, to give a quantized speech signal. LPC coefficients such as K parameters are then extracted from the quantized speech signal at a predetermined order by a known analysis method. The LPC coefficients are extracted for n basic analysis frames per transmission frame. The basic analysis frames, together with their LPC coefficients, are then selected successively in combinations of 4, 5 and 6 by the three representative frame candidate selectors (1) 102 to (3) 104, respectively, and supplied to minimum distortion measuring units (1) 105 to (3) 107.

In this embodiment, representative frame candidate selectors (1) 102 to (3) 104 select 4, 5 and 6 basic analysis frames and their LPC coefficients, respectively. How many different counts m to use, and what each count is, are determined in advance on the basis of the operating purpose of the device, past speech data, and the like.

Thus, representative frame candidate selector (1) 102 selects 4 representative frames, together with their LPC coefficients, from the 20 basic analysis frames of each transmission frame. Representative frame candidate selector (2) 103 selects 5 representative frames from one transmission frame, and representative frame candidate selector (3) 104 selects 6 representative frames from one transmission frame.

Minimum distortion measuring unit (1) 105 measures the amount of distortion produced by the 4 representative frame candidates that representative frame candidate selector (1) 102 selects one combination after another. Likewise, minimum distortion measuring unit (2) 106 measures the amount of distortion produced by the 5 representative frame candidates selected by representative frame candidate selector (2) 103, and minimum distortion measuring unit (3) 107 measures the amount of distortion produced by the 6 representative frame candidates selected by representative frame candidate selector (3) 104.

The amount of distortion is measured as follows.

For example, minimum distortion measuring unit (1) 105 uses the LPC coefficients of the representative frame candidates supplied four at a time by representative frame candidate selector (1) 102, obtains the LPC coefficients of the remaining 20 - 4 = 16 basic analysis frames by interpolation and extrapolation, and thereby obtains an interpolated LPC coefficient sequence.

Minimum distortion measuring unit (1) 105 also receives from the LPC analyzer 101 the LPC coefficient sequence of all 20 basic analysis frames of the transmission frame, and calculates the difference between this sequence and the interpolated LPC coefficient sequence as the amount of distortion. Minimum distortion measuring unit (1) 105 measures the amount of distortion for the interpolated LPC coefficient sequence obtained from every combination of 4 representative frame candidates, and supplies the smallest of these distortion amounts to the optimum distortion determiner 109.
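The measurement described above can be sketched as an exhaustive search. This is only a minimal illustration under stated assumptions: the patent does not specify the interpolation rule or the distortion measure used on the analysis side, so the sketch assumes piecewise-linear interpolation between neighbouring representative frames, linear extrapolation outside them, and a sum of squared coefficient differences. The names `interpolate_sequence`, `distortion` and `min_distortion` are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Frame = List[float]


def interpolate_sequence(frames: Sequence[Frame], picks: Sequence[int]) -> List[Frame]:
    """Rebuild every frame from the frames at the sorted indices `picks`.

    Assumed rule: linear interpolation between neighbouring picks and linear
    extrapolation from the first or last pair. Needs at least two picks.
    """
    out: List[Frame] = []
    for t in range(len(frames)):
        j = 0
        # Advance to the segment containing t, clamped to the last segment.
        while j < len(picks) - 2 and t > picks[j + 1]:
            j += 1
        a, b = picks[j], picks[j + 1]
        w = (t - a) / (b - a)  # w < 0 or w > 1 means extrapolation
        out.append([(1 - w) * x + w * y for x, y in zip(frames[a], frames[b])])
    return out


def distortion(original: Sequence[Frame], rebuilt: Sequence[Frame]) -> float:
    """Assumed measure: sum of squared coefficient differences over all frames."""
    return sum(
        (x - y) ** 2
        for fo, fr in zip(original, rebuilt)
        for x, y in zip(fo, fr)
    )


def min_distortion(frames: Sequence[Frame], k: int) -> Tuple[float, Tuple[int, ...]]:
    """Try every combination of k representative frames and keep the best one."""
    best = None
    for picks in combinations(range(len(frames)), k):
        d = distortion(frames, interpolate_sequence(frames, picks))
        if best is None or d < best[0]:
            best = (d, picks)
    return best


# Hypothetical one-coefficient track with a kink at frame 3.
track = [[0.0], [1.0], [2.0], [3.0], [3.0], [3.0], [3.0], [3.0]]
d_min, picks_min = min_distortion(track, 3)
```

With 20 basic analysis frames, the search covers 4845 combinations for 4 representative frames and 38760 for 6, which a direct loop handles easily. One caveat of the assumed rule is that linearly extrapolated coefficients are not guaranteed to stay in a range that yields a stable synthesis filter.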

In the same way, minimum distortion measuring unit (2) 106 and minimum distortion measuring unit (3) 107 each measure the smallest amount of distortion from the difference between the interpolated LPC coefficient sequences obtained from 5 and 6 representative frame candidates, respectively, and the LPC coefficient sequence of all the basic analysis frames, and supply it to the optimum distortion determiner 109.

The optimum distortion determiner 109 compares the minimum distortion amounts supplied by the three minimum distortion measuring units and, weighing the size of the minimum distortion against the amount of data to be transmitted, determines the number and the frame numbers of the representative frame candidates that give the optimum distortion. It takes these candidates as the representative frames and supplies the number of representative frames and the representative frame numbers to the multiplexer 111.
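The patent does not state the concrete rule by which the determiner weighs distortion against data volume. The sketch below shows one plausible rule, purely as an assumption: pick the smallest number of representative frames whose minimum distortion stays within a threshold, and fall back to the lowest-distortion candidate otherwise. The function name `choose_representatives`, the threshold `max_distortion` and the sample values are all hypothetical.

```python
from typing import Dict, Tuple

# Maps representative-frame count -> (minimum distortion, frame numbers).
Candidates = Dict[int, Tuple[float, Tuple[int, ...]]]


def choose_representatives(
    candidates: Candidates, max_distortion: float
) -> Tuple[int, Tuple[int, ...]]:
    """Pick the cheapest candidate (fewest frames) that is accurate enough.

    If no candidate meets the threshold, return the one with the least
    distortion regardless of its transmission cost.
    """
    for count in sorted(candidates):
        dist, frame_numbers = candidates[count]
        if dist <= max_distortion:
            return count, frame_numbers
    best = min(candidates, key=lambda c: candidates[c][0])
    return best, candidates[best][1]


# Hypothetical minimum distortions reported for 4, 5 and 6 representative frames.
reported: Candidates = {
    4: (0.90, (0, 6, 12, 19)),
    5: (0.30, (0, 4, 9, 14, 19)),
    6: (0.25, (0, 3, 7, 11, 15, 19)),
}
count, numbers = choose_representatives(reported, max_distortion=0.5)
```

In this example the 5-frame candidate is chosen: 4 frames distort too much, and 6 frames cost more data for little further gain.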

The representative frame numbers determined in this way are supplied to the LPC parameter editor 110, which reads the LPC coefficients of the frames with those numbers from the LPC analyzer 101 and supplies them to the multiplexer 111 as the LPC coefficients.

The power extractor 112 and the pitch extractor 108 extract, by known methods, the short-time average speech power and the pitch information of each basic analysis frame from the quantized speech signal, and the speech power and pitch information are also supplied to the multiplexer 111.

The multiplexer 111 quantizes the data it receives on the number of representative frames and their frame numbers, the LPC coefficients, the speech power and the pitch information, combines them appropriately, and transmits them as multiplexed data to the synthesis side 2 via the transmission path 3.

In this way, variable-length frame coding is performed in which the 20 basic analysis frames of each transmission frame are expressed by 4, 5 or 6 representative frames and sent to the synthesis side 2.

On the synthesis side 2, the multiplexed information received via the transmission path 3 is separated by the demultiplexer 201. The number of representative frames and the LPC coefficients are supplied to the LPC parameter decoder 202, the representative frame numbers to the LPC parameter interpolator 205, the speech power to the variable gain amplifier 208, and the pitch information to the pitch generator 203.

The LPC parameter decoder 202 determines the number of bits to use in decoding the LPC coefficients from the number of representative frames, decodes the LPC parameters correctly, and supplies them to the LPC parameter interpolator 205.

On the basis of the representative frame numbers and the decoded LPC parameters, the LPC parameter interpolator 205 obtains the LPC parameters of the frames other than the representative frames by linear interpolation, and supplies the result as filter coefficients to the LPC synthesis filter 207, which is configured as a digital filter.

The pitch generator 203, supplied with the pitch information, generates a repetitive pulse train corresponding to the pitch information and supplies it to the variable gain amplifier 208 via the switch 204.

The variable gain amplifier 208 amplifies the output of the pitch generator 203 with a variable gain corresponding to the speech power, and supplies it as sound source information to the input of the LPC synthesis filter 207. The LPC synthesis filter 207 is driven by this input to produce digital speech, which is converted to analog form by a built-in D-A converter, band-limited to a predetermined band by a BPF, and sent out as the output speech.

The noise generator 206 is supplied with the output of the pitch generator 203. When that output is zero, it judges the state to be unvoiced or silent and operates the switch 204 so that random noise is output in place of the pitch information.
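The excitation path on the synthesis side can be sketched as follows. This is a simplified illustration, not the patent's circuit: it assumes a unit pulse train for voiced frames, uniform random noise when the pitch period is zero, and a direct-form all-pole filter with coefficients `a`, whereas the embodiment may use K parameters in a different filter structure. The function names and sample values are hypothetical.

```python
import random
from typing import List, Sequence


def excitation(pitch_period: int, length: int, rng: random.Random) -> List[float]:
    """Pulse train for voiced frames, random noise when the pitch period is zero."""
    if pitch_period > 0:
        # One unit pulse every `pitch_period` samples (pitch generator path).
        return [1.0 if i % pitch_period == 0 else 0.0 for i in range(length)]
    # Zero pitch is treated as unvoiced or silent (noise generator path).
    return [rng.uniform(-1.0, 1.0) for _ in range(length)]


def synthesize(excite: Sequence[float], gain: float, a: Sequence[float]) -> List[float]:
    """All-pole filter: y[n] = gain * x[n] - sum_k a[k] * y[n - 1 - k]."""
    y: List[float] = []
    for n, x in enumerate(excite):
        acc = gain * x  # variable gain applied to the excitation
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc -= ak * y[n - 1 - k]
        y.append(acc)
    return y


rng = random.Random(0)
voiced = excitation(pitch_period=4, length=8, rng=rng)
unvoiced = excitation(pitch_period=0, length=8, rng=rng)
impulse_response = synthesize([1.0, 0.0, 0.0], gain=2.0, a=[-0.5])
```

In a full decoder the coefficients `a` would change from frame to frame, using the values produced by the LPC parameter interpolator, and the filter state would carry over between frames.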

In this way, speech analysis and synthesis can be carried out that greatly reduces the amount of transmission while markedly suppressing deterioration of the reproduced speech.

[Effects of the Invention] As described above, the present invention transmits all the basic analysis frames contained in a transmission frame as a variable-length frame that expresses them by a plurality of representative analysis frames. This markedly reduces the amount of transmission and makes it possible to reproduce stable, highly intelligible, high-quality speech even for speech that changes rapidly and has large temporal distortion.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing the configuration of an embodiment of the present invention.

1 ... analysis side, 2 ... synthesis side, 3 ... transmission path, 101 ... LPC analyzer, 102 ... representative frame candidate selector (1), 103 ... representative frame candidate selector (2), 104 ... representative frame candidate selector (3), 105 ... minimum distortion measuring unit (1), 106 ... minimum distortion measuring unit (2), 107 ... minimum distortion measuring unit (3), 108 ... pitch extractor, 109 ... optimum distortion determiner, 110 ... LPC parameter editor, 111 ... multiplexer, 201 ... demultiplexer, 202 ... LPC parameter decoder, 203 ... pitch generator, 204 ... switch, 205 ... LPC parameter interpolator, 206 ... noise generator, 207 ... LPC synthesis filter, 208 ... variable gain amplifier.

Claims

[Claims] A speech analysis and synthesis device comprising: feature parameter extraction means for analyzing an input speech signal in units of transmission frames, for each of a predetermined number n of analysis frames, to extract LPC (Linear Prediction Coding) coefficients of a predetermined order and sound source information; representative frame candidate selection means for selecting, from the n analysis frames, combinations of analysis frames in m mutually different counts as representative analysis frame candidates that represent the n analysis frames; minimum distortion measuring means for interpolating, by interpolation and extrapolation, the LPC coefficients of the representative analysis frame candidates selected by the representative frame candidate selection means to obtain m interpolated LPC coefficient sequences each spanning n analysis frames, measuring as a distortion amount the difference between the m interpolated LPC coefficient sequences and the LPC coefficient sequence of the n analysis frames extracted by the feature parameter extraction means, and obtaining the m minimum distortion amounts at which the distortion amount is smallest; representative frame determining means for determining, from the m minimum distortion amounts obtained by the minimum distortion measuring means, the representative analysis frame candidates that form the interpolated LPC coefficient sequence having the optimum minimum distortion amount, set by taking the synthesized speech and the data transmission amount into account, to be the representative analysis frames, and for expressing the n analysis frames as a variable-length frame by the count and frame numbers of these representative analysis frames; analysis information sending means for sending the count and frame numbers of the representative analysis frames, together with the sound source information, from the analysis side to the synthesis side as speech analysis information; and speech synthesis means for synthesizing the input speech signal on the basis of the speech analysis information.