JPH02247700A - Voice synthesizing device - Google Patents

Voice synthesizing device

Info

Publication number
JPH02247700A
Authority
JP
Japan
Prior art keywords
spectral
filter
speech
lsp
circuit
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP1070030A
Other languages
Japanese (ja)
Inventor
Nobuhide Yamazaki
山崎 信英
Hiroo Kitagawa
博雄 北川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1989-03-20
Filing date
1989-03-20
Publication date
1990-10-03
Application filed by Ricoh Co Ltd
Priority to JP1070030A
Publication of JPH02247700A

Abstract

PURPOSE: To stabilize synthesized speech and improve its quality by obtaining spectral parameters from a high-frequency-emphasized version of the original speech signal and passing the original speech signal through the inverse filter of that spectrum. CONSTITUTION: The original speech is high-frequency emphasized by a high-pass filter 1 and spectrally analyzed frame by frame in an LSP analysis circuit 2, which outputs the result as LSP parameters. The original speech is passed through an LSP inverse filter 3, whose characteristics are the inverse of the spectral characteristics given by the LSP parameters, and this yields a residual waveform with suppressed high frequencies. A residual synthesis circuit 6 edits this residual waveform according to pitch and amplitude data to form a synthesized residual waveform, which an LSP synthesis filter 4 outputs as synthesized speech. A residual waveform with large power is thereby obtained, and high-quality synthesized speech is obtained stably.

Description

[Detailed Description of the Invention]
[Technical Field] The present invention relates to a speech synthesis device, and more particularly to a rule-based speech synthesis device that uses a residual waveform. It concerns an analysis-synthesis system for synthesizing high-quality speech.

[Prior Art] A conventional speech synthesizer consists of a vocal tract filter, which represents the spectral envelope of speech, and a driving sound source signal generator, which represents the pitch period, amplitude, and spectral fine structure of speech. The vocal tract filter is usually a digital filter of the PARCOR or LSP type, and the driving sound source is switched between impulses and white noise (in Shiratori, "Speech Synthesis Technology," Information Processing, Vol. 24, No. 8, pp. 993-1000, the speech production model switches the sound source between periodic pulses and noise).
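The conventional arrangement described above can be sketched in a few lines. This is a minimal illustration, not the circuit of any cited reference: the function names, the pitch period of 80 samples, and the two-pole filter coefficients are all assumptions chosen for the example, and a direct-form all-pole filter stands in for a PARCOR or LSP implementation.

```python
import random

def make_excitation(n_samples, voiced, pitch_period=80, seed=0):
    """Impulse train for voiced frames, white noise for unvoiced frames."""
    if voiced:
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

def all_pole_filter(excitation, a):
    """Vocal tract model 1/A(z), with A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ..."""
    y = []
    for n, e in enumerate(excitation):
        acc = e
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc -= ak * y[n - k]
        y.append(acc)
    return y

# Hypothetical stable two-pole resonance (complex poles at radius 0.9).
coeffs = [-1.2, 0.81]
voiced_frame = all_pole_filter(make_excitation(240, voiced=True), coeffs)
unvoiced_frame = all_pole_filter(make_excitation(240, voiced=False), coeffs)
```

The only thing that changes between a voiced and an unvoiced frame here is the excitation; the filter is the same, which is the switching model the paragraph describes.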

A residual waveform obtained by inverse filtering an original human speech signal is also sometimes used as the driving sound source (in Japanese Examined Patent Publication No. Sho 58-88798, a waveform of roughly one pitch period cut out of the residual signal is used as the driving waveform). The aim is to improve sound quality by letting the sound source cover the components that the spectral parameters could not approximate.

However, with a conventional residual waveform, the power of the residual can become extremely low for certain phonemes, and this has caused the synthesis filter to oscillate.

[Object] The present invention was made in view of the circumstances described above. In particular, its object is to increase the power of the driving sound source by using a residual waveform with suppressed high frequencies, thereby reducing oscillation of the synthesizer and obtaining more stable, higher-quality synthesized speech.

[Constitution] To achieve the above object, the present invention takes one of two forms, each applied to a speech synthesis device comprising a circuit that simulates the characteristics of the vocal tract or the spectral envelope of speech and a sound source that drives the circuit.

In the first form, the device has a filter that emphasizes the high frequencies of the original speech signal, an analysis circuit that obtains parameters representing the spectral envelope of its input signal, and an inverse filter for the frequency characteristics of this spectral envelope. Spectral parameters are obtained from the high-frequency-emphasized original speech signal, and the circuit is driven by the residual waveform with suppressed high frequencies that results from passing the original speech signal through the inverse filter of this spectrum.

In the second form, the device has an adaptive inverse filter that removes the spectral tilt of the original speech signal, an analysis circuit that obtains parameters representing the spectral envelope of its input signal, and an inverse filter for the frequency characteristics of this spectral envelope. After the spectral tilt of the original speech signal is removed, the analysis circuit obtains spectral parameters from the result, and the original speech signal is applied to the inverse filter of this spectrum. The spectral tilt of the original speech signal is thereby included in the sound source signal and, conversely, removed from the spectral parameters.

The invention is described below on the basis of embodiments.

Thus, in the present invention, spectral parameters are obtained from the high-frequency-emphasized original speech signal, and a residual waveform with suppressed high frequencies is obtained by passing the original speech signal through the inverse filter of this spectrum. The waveform changes from the conventional impulse-like shape to something like a triangular wave, and its power becomes comparatively large. The gain of the synthesis filter therefore drops, which reduces oscillation of the synthesizer and yields more stable, higher-quality synthesized speech.
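The high-frequency suppression of the residual can be made concrete with a short idealized derivation. The symbols $\mu$, $P(z)$, $A(z)$, and $G$, and the first-order form of the emphasis filter, are assumptions introduced for illustration; the patent does not specify the emphasis filter.

```latex
\begin{align*}
\tilde{X}(z) &= P(z)\,X(z), \qquad P(z) = 1 - \mu z^{-1},\quad 0 < \mu < 1
  && \text{(high-frequency emphasis)} \\
\tilde{X}(z) &\approx \frac{G}{A(z)}
  && \text{(analysis fits the envelope of the emphasized speech)} \\
E(z) &= A(z)\,X(z) = \frac{A(z)\,\tilde{X}(z)}{P(z)} \approx \frac{G}{1 - \mu z^{-1}}
  && \text{(residual of the original speech)}
\end{align*}
```

Under this idealization the residual carries a first-order low-pass tilt: $|E(e^{j\omega})|$ falls from $G/(1-\mu)$ at $\omega = 0$ to $G/(1+\mu)$ at $\omega = \pi$. That is consistent with the smoother, more triangular residual waveform described in the text.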

FIG. 1 is a block diagram for explaining an embodiment of the invention described in claim 1. In the figure, 1 is a high-pass filter, 2 is an LSP analysis circuit, 3 is an LSP inverse filter, 4 is an LSP synthesis filter, 5 is a parameter and residual waveform storage unit, and 6 is a residual synthesis circuit. The original speech is the speech material from which the synthesis parameters are created. It is high-frequency emphasized by the high-pass filter 1 and then sent to the LSP analysis circuit 2.

The LSP analysis circuit 2 performs spectral analysis of the high-frequency-emphasized original speech frame by frame and outputs LSP parameters representing the spectral envelope. The original waveform, meanwhile, is passed through the LSP inverse filter 3. The LSP inverse filter 3 is a filter whose spectral characteristics are the inverse of those given by the LSP parameters. Because the spectral envelope described by these LSP parameters has its high frequencies emphasized relative to that of the original speech, the output of the LSP inverse filter 3 is a residual waveform with suppressed high frequencies. FIG. 2 shows the original speech signal (a), the residual waveform without high-frequency suppression (b), and the residual waveform with high-frequency suppression (c).
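The analysis path just described (emphasize, analyze, then inverse filter the unemphasized original) can be sketched as follows. This is an illustration under stated assumptions, not the patented circuit: plain LPC coefficients obtained by the autocorrelation method stand in for LSP parameters, and the emphasis coefficient `mu=0.95`, the order of 4, and the toy test frame are all hypothetical choices.

```python
import math

def pre_emphasize(x, mu=0.95):
    """First-order high-frequency emphasis: y[n] = x[n] - mu * x[n-1]."""
    return [x[n] - (mu * x[n - 1] if n > 0 else 0.0) for n in range(len(x))]

def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite frame (zero outside the frame)."""
    return [sum(x[n] * x[n + k] for n in range(len(x) - k)) for k in range(max_lag + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion: return A(z) = [1, a1, ..., ap] from r."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def inverse_filter(x, a):
    """Full convolution of x with A(z); the output is the residual."""
    out = [0.0] * (len(x) + len(a) - 1)
    for n, xn in enumerate(x):
        for k, ak in enumerate(a):
            out[n + k] += ak * xn
    return out

def power(x):
    return sum(v * v for v in x)

# Toy stand-in for a speech frame: a decaying two-tone signal.
frame = [math.exp(-0.01 * n) * (math.sin(0.3 * n) + 0.3 * math.sin(1.9 * n))
         for n in range(200)]
order = 4
a_plain = levinson(autocorr(frame, order), order)                 # conventional analysis
a_emph = levinson(autocorr(pre_emphasize(frame), order), order)   # analysis of emphasized speech
res_plain = inverse_filter(frame, a_plain)  # conventional residual
res_emph = inverse_filter(frame, a_emph)    # original speech through the "emphasized" inverse filter
```

With the autocorrelation method, `a_plain` minimizes the residual energy of `frame` among all filters of this order, so `power(res_emph)` can never be smaller than `power(res_plain)`. How much larger it is depends on the signal and on the emphasis filter.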

At synthesis time, the residual synthesis circuit 6 edits the high-frequency-suppressed residual waveform according to pitch and amplitude data to form a synthesized residual waveform. The LSP synthesis filter 4 then outputs the synthesized residual waveform as synthesized speech.
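A rough sketch of this synthesis step is given below. The patent does not say how the residual is edited, so the overlap-add of one stored residual segment at each pitch mark is an assumption, as are the function names, the example segment, the pitch marks, the gains, and the filter coefficients. A direct-form all-pole filter stands in for the LSP synthesis filter.

```python
def build_excitation(segment, pitch_marks, amplitudes, n_samples):
    """Overlap-add scaled copies of one residual segment at the given pitch marks."""
    out = [0.0] * n_samples
    for start, gain in zip(pitch_marks, amplitudes):
        for i, v in enumerate(segment):
            if start + i < n_samples:
                out[start + i] += gain * v
    return out

def synthesis_filter(excitation, a):
    """All-pole filter 1/A(z) with A(z) given as [1, a1, ..., ap]."""
    y = []
    for n, e in enumerate(excitation):
        acc = e
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

segment = [0.2, 0.6, 1.0, 0.6, 0.2]   # hypothetical triangular-looking residual pulse
a = [1.0, -1.2, 0.81]                 # hypothetical stable A(z)
excitation = build_excitation(segment, [0, 40, 75], [1.0, 0.8, 0.5], 120)
speech = synthesis_filter(excitation, a)
```

Changing the spacing of the pitch marks changes the pitch of the output, and changing the gains changes its amplitude contour, while the spectral envelope stays with the filter coefficients.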

Besides the LSP parameters of this embodiment, other parameters that represent spectral characteristics, such as PARCOR, LPC, and formant parameters, can also be used.

FIG. 3 is a block diagram for explaining an embodiment of the invention described in claim 2. In the figure, 10 is an adaptive filter. The adaptive filter 10 obtains the first-order correlation coefficient of the original speech with a correlator and removes the spectral tilt, so that the tilt is excluded from the spectral parameters. Passing the original speech signal through the inverse filter of this spectrum then gives a residual waveform with large power. The adaptive filter also corrects the spectral tilt for voiced and unvoiced sounds alike, making the spectrum flat. Letting the first-order correlation coefficient be $r_1$, the adaptive filter can be realized as a programmable filter with the characteristic $F(z) = 1 - r_1 z^{-1}$.
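The first-order adaptive filter can be sketched directly from its definition, reading the filter characteristic as F(z) = 1 - r1 z^-1. The function names and the geometric test frame are assumptions for illustration, and r1 is taken here as the normalized lag-1 autocorrelation of the frame, which is one common reading of "first-order correlation coefficient".

```python
def first_order_correlation(x):
    """Normalized lag-1 autocorrelation r1 = R(1) / R(0)."""
    r0 = sum(v * v for v in x)
    r1 = sum(x[n] * x[n + 1] for n in range(len(x) - 1))
    return r1 / r0 if r0 > 0.0 else 0.0

def adaptive_tilt_filter(x):
    """Apply F(z) = 1 - r1 z^-1, with r1 measured on this frame."""
    r1 = first_order_correlation(x)
    y = [x[n] - (r1 * x[n - 1] if n > 0 else 0.0) for n in range(len(x))]
    return y, r1

# A strongly low-pass toy frame: lag-1 correlation close to 0.9.
frame = [0.9 ** n for n in range(50)]
flattened, r1 = adaptive_tilt_filter(frame)
```

Because r1 is re-estimated for every frame, the same code applies a strong correction to frames with a steep tilt and almost none to frames that are already flat, without a voiced/unvoiced decision.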

The operation of the LSP analysis circuit 2 through the residual synthesis circuit 6 is exactly the same as in FIG. 1, so a detailed description is omitted.

[Effect] As is clear from the above description, according to the invention of claim 1, spectral parameters are obtained by spectral analysis of the high-frequency-emphasized original speech signal, and the original speech signal is passed through the inverse filter of this spectrum. A residual waveform with large power is therefore obtained, which reduces oscillation of the synthesizer and yields more stable, higher-quality synthesized speech. In addition, the high-frequency emphasis filter corrects the spectral tilt characteristic of voiced sounds and flattens the spectrum, which improves analysis accuracy at high frequencies.

It also suppresses the phenomenon in which low-order formants appear with a seemingly very sharp Q.

According to the invention of claim 2, the adaptive filter removes the spectral tilt from the spectral parameters, and the original speech signal is passed through the inverse filter of this spectrum. A residual waveform with large power is therefore obtained, which reduces oscillation of the synthesizer and yields more stable, higher-quality synthesized speech. In addition, the adaptive filter corrects the spectral tilt for voiced and unvoiced sounds alike and flattens the spectrum, which improves analysis accuracy at high frequencies. It also suppresses the phenomenon in which low-order formants appear with a seemingly very sharp Q.
By flattening the spectrum, high-frequency analysis accuracy is improved. It also has the effect of suppressing the phenomenon in which low-order formants appear as an apparently very sharp Q.

[Brief Description of the Drawings]

FIG. 1 is a block diagram for explaining an embodiment of the invention described in claim 1. FIG. 2 is a diagram showing the residual waveform without high-frequency suppression (b) and the residual waveform with high-frequency suppression. FIG. 3 is a block diagram for explaining an embodiment of the invention described in claim 2.

1: high-pass filter, 2: LSP analysis circuit, 3: LSP inverse filter, 4: LSP synthesis filter, 5: parameter and residual waveform storage unit, 6: residual synthesis circuit, 10: adaptive filter.

Claims (1)

[Claims]
1. A speech synthesis device comprising a circuit that simulates the characteristics of the vocal tract or the spectral envelope of speech and a sound source that drives the circuit, the device having a filter that emphasizes the high frequencies of an original speech signal, an analysis circuit that obtains parameters representing the spectral envelope of an input signal, and an inverse filter for the frequency characteristics of this spectral envelope, characterized in that spectral parameters are obtained from the high-frequency-emphasized original speech signal and the circuit is driven using a residual waveform with suppressed high frequencies obtained by passing the original speech signal through the inverse filter of this spectrum.
2. A speech synthesis device comprising a circuit that simulates the characteristics of the vocal tract or the spectral envelope of speech and a sound source that drives the circuit, the device having an adaptive inverse filter that removes the spectral tilt of an original speech signal, an analysis circuit that obtains parameters representing the spectral envelope of an input signal, and an inverse filter for the frequency characteristics of this spectral envelope, characterized in that, after the spectral tilt of the original speech signal is removed, the analysis circuit obtains spectral parameters from the result and the original speech signal is applied to the inverse filter of this spectrum, whereby the spectral tilt of the original speech signal is included in the sound source signal and, conversely, removed from the spectral parameters.
JP1070030A 1989-03-20 1989-03-20 Voice synthesizing device Pending JPH02247700A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP1070030A JPH02247700A (en) 1989-03-20 1989-03-20 Voice synthesizing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP1070030A JPH02247700A (en) 1989-03-20 1989-03-20 Voice synthesizing device

Publications (1)

Publication Number Publication Date
JPH02247700A (en) 1990-10-03

Family

ID=13419788

Family Applications (1)

Application Number Title Priority Date Filing Date
JP1070030A Pending JPH02247700A (en) 1989-03-20 1989-03-20 Voice synthesizing device

Country Status (1)

Country Link
JP (1) JPH02247700A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7546241B2 (en) 2002-06-05 2009-06-09 Canon Kabushiki Kaisha Speech synthesis method and apparatus, and dictionary generation method and apparatus
JP2019035864A (en) * 2017-08-17 2019-03-07 国立研究開発法人情報通信研究機構 Glottis flow component estimation device, program and method

Similar Documents

Publication Publication Date Title
JP4067762B2 (en) Singing synthesis device
JP4747835B2 (en) Audio reproduction effect adding method and apparatus
JPH07248794A (en) Method for processing voice signal
EP1422693A1 (en) PITCH WAVEFORM SIGNAL GENERATION APPARATUS, PITCH WAVEFORM SIGNAL GENERATION METHOD, AND PROGRAM
EP0391545A1 (en) Speech synthesizer
US5241650A (en) Digital speech decoder having a postfilter with reduced spectral distortion
JPH05307399A (en) Voice analysis system
JPH04358200A (en) Speech synthesizer
JP3197975B2 (en) Pitch control method and device
JPH02247700A (en) Voice synthesizing device
JP2841797B2 (en) Voice analysis and synthesis equipment
JPH05307395A (en) Voice synthesizer
JP3158434B2 (en) Digital audio decoder with post-filter having reduced spectral distortion
US5204934A (en) Sound synthesis device using modulated noise signal
JP4900062B2 (en) Audio signal processing apparatus, audio reproduction apparatus, and audio signal processing method
JP2000003200A (en) Voice signal processor and voice signal processing method
JPH0876799A (en) Wide band voice signal restoration method
US6418406B1 (en) Synthesis of high-pitched sounds
JP2000242287A (en) Vocalization supporting device and program recording medium
JP2000099100A (en) Voice conversion device
JPH09160595A (en) Voice synthesizing method
JPH0690638B2 (en) Speech analysis method
JP2008262140A (en) Musical pitch conversion device and musical pitch conversion method
JPS61128299A (en) Voice analysis/analytic synthesization system
JP5723568B2 (en) Speaking speed converter and program