JPH02230300A - Voice synthesizer - Google Patents

Voice synthesizer

Info

Publication number
JPH02230300A
Authority
JP
Japan
Prior art keywords
pitch
information
pitch pulse
pulse
period
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP1049958A
Other languages
Japanese (ja)
Inventor
Takayuki Ishikawa
孝行 石川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1989-03-03
Filing date
1989-03-03
Publication date
1990-09-12
Application filed by NEC Corp
Priority to JP1049958A
Publication of JPH02230300A

Abstract

PURPOSE: To prevent excessive concentration of energy by providing the voice synthesizer with a pitch pulse varying device and placing each pulse excitation point at or near the pitch pulse period. CONSTITUTION: A voice synthesizing filter 13 is driven by a modeled sound source, either an impulse train based on the pitch period or white noise, to synthesize the input voice signal. A pitch pulse varying device 18 varies the generation period of the pitch pulse train based on the pitch period, so that the pitch pulse position, which is the pulse excitation point, shifts according to a period determined by the pitch pulse varying device 18. Consequently, concentration of energy is avoided and a synthetic voice close to natural speech can be obtained.

Description

DETAILED DESCRIPTION OF THE INVENTION [Field of Industrial Application] The present invention relates to a speech synthesizer that synthesizes speech from spectral envelope information and sound source information obtained by a speech analyzer.

[Conventional Technology]

Conventionally, a speech synthesizer of this type receives spectral envelope information, which represents the macroscopic structure of an input speech signal, and sound source information, which represents its fine structure, and reproduces the input speech signal from this analysis information. To do so, it is equipped with an all-pole digital filter as a speech synthesis filter, in which the spectral envelope information transmitted as analysis information is driven by the sound source information.
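As an illustration of the all-pole synthesis filter described above, the following Python sketch computes each output sample as the current excitation sample plus a weighted sum of past outputs. The function name `all_pole_filter`, the sign convention s[n] = e[n] + sum_k alpha_k * s[n-k], and the example coefficient are assumptions made for this sketch; the patent does not specify them.

```python
def all_pole_filter(excitation, alpha):
    """Run an all-pole (purely recursive) filter over an excitation signal.

    Assumed convention: s[n] = e[n] + sum_{k=1..p} alpha[k-1] * s[n-k],
    with samples before n = 0 taken as zero.
    """
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(alpha, start=1):
            if n - k >= 0:
                acc += a * out[n - k]
        out.append(acc)
    return out


# A single impulse through a first-order filter decays geometrically.
response = all_pole_filter([1.0, 0.0, 0.0, 0.0], [0.5])
print(response)
```

A real synthesizer would update `alpha` once per analysis frame and carry the filter memory across frame boundaries; the sketch omits both for brevity.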

The spectral envelope information normally takes the form of filter coefficients: linear prediction coefficients, such as α parameters or K parameters, obtained by LPC analysis (linear prediction analysis; Linear Predictive Coefficient) of the input speech signal.
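The α parameters and K parameters mentioned above are two representations of the same predictor, and one can be converted into the other. The sketch below uses the textbook step-up recursion to turn K parameters (reflection coefficients) into α parameters. The function name `k_to_alpha` and the sign convention, in which the order-i predictor ends with alpha_i = k_i, are assumptions; other sign conventions for K parameters exist, and the patent does not say which one it uses.

```python
def k_to_alpha(k_params):
    """Convert K parameters to direct-form predictor coefficients.

    Step-up recursion (assumed convention):
        alpha_i(i) = k_i
        alpha_j(i) = alpha_j(i-1) - k_i * alpha_{i-j}(i-1),  1 <= j < i
    """
    alpha = []
    for i, k in enumerate(k_params):
        prev = alpha
        # Update the existing coefficients, then append the new one.
        alpha = [prev[j] - k * prev[i - 1 - j] for j in range(i)] + [k]
    return alpha


print(k_to_alpha([0.5, 0.2]))
```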

The sound source information, on the other hand, comprises an impulse train based on the pitch period (pitch pulse train), voiced/unvoiced/silent information, speech power, and so on. The waveform information carried by the sound source is discarded; instead, the pitch period, the voiced/unvoiced/silent information, the speech power, and so on are modeled, and this model drives the speech synthesis filter. That is, a voiced source is modeled as an impulse train at its pitch period, and unvoiced or silent segments are modeled as white noise, and these drive the speech synthesis filter.
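The two modeled sources described above can be sketched in a few lines of Python. The helper names, the unit pulse amplitude, the Gaussian noise distribution, and the fixed seed are all assumptions chosen for illustration; the patent only states that voiced speech uses an impulse train at the pitch period and that unvoiced or silent segments use white noise.

```python
import random


def impulse_train(pitch_period, n_samples, amplitude=1.0):
    """Strictly periodic pulses: one pulse every `pitch_period` samples."""
    return [amplitude if n % pitch_period == 0 else 0.0
            for n in range(n_samples)]


def white_noise(n_samples, seed=0):
    """Zero-mean Gaussian noise; the seed keeps the sketch reproducible."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def model_excitation(voiced, pitch_period, n_samples, seed=0):
    """Pick the modeled source for one frame."""
    if voiced:
        return impulse_train(pitch_period, n_samples)
    return white_noise(n_samples, seed)


print(model_excitation(True, 4, 10))
```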

[Problem to Be Solved by the Invention]

However, a conventional speech synthesizer that drives the speech synthesis filter with analysis information that carries no waveform inherently lacks phase information when compared with a waveform-transmitting speech synthesizer such as a multipulse vocoder, and ambiguity also readily creeps into its pitch period information. In particular, the speech synthesis filter is driven with the pitch pulse positions, generated according to the pitch period information transmitted from the analysis side to the synthesis side, as its pulse excitation points, so energy is excessively concentrated at those points. Excitation based on the pitch period thus concentrates energy at fixed intervals, with the drawback that the synthesized speech sounds mechanical and lacks naturalness.

[Means for Solving the Problem]

The present invention synthesizes an input speech signal by driving a speech synthesis filter with a modeled sound source, either an impulse train based on the pitch period or white noise, and is characterized by a pitch pulse variable device that varies the generation period of the pitch pulse train (impulse train) based on the pitch period.

[Operation]

Because the pitch pulse variable device shifts the pitch pulse position, which is the pulse excitation point, according to a period that the device determines, concentration of energy is avoided and synthesized speech close to natural speech can be obtained.

[Embodiment]

The present invention is described in detail below with reference to an embodiment shown in the drawing.

FIG. 1 is a block diagram showing an embodiment of the speech synthesizer of the present invention. The demultiplexer 11 receives, via the transmission line 12, analysis information for the speech signal to be synthesized. This analysis information is a multiplexed signal of the spectral envelope information and sound source information produced by a speech analyzer. It contains LPC coefficient data a as the spectral envelope information and, as the sound source information, voiced/unvoiced/silent information b, pitch period information c, and short-time speech power data d.

The speech analyzer that analyzes the speech signal consists of an LPC analyzer, a pitch extractor, a voiced/unvoiced/silent discriminator, a power meter, and so on. It stores the resulting analysis information in a memory circuit, combines and multiplexes it as appropriate with a multiplexer or the like, encodes it for transmission, sends it out on the transmission line 12, and thereby supplies it to the speech synthesizer shown in FIG. 1.

In the speech synthesizer, the demultiplexer 11 demultiplexes and decodes the multiplexed data of the received analysis information.

The decoded LPC coefficient data a is supplied to the speech synthesis filter 13, the pitch period information c to the pitch pulse generator 14, the voiced/unvoiced/silent information b to the switch 15, and the short-time speech power data d to the variable amplifier 16. The speech synthesis filter 13 is configured as an all-pole digital filter of a predetermined order, and the LPC coefficient data a is used as the coefficients of this filter.
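The routing of the four decoded fields can be pictured with a small data structure. The class names, field names, and field types below are hypothetical; the patent does not describe the bit-level frame format, so this sketch models only which block of FIG. 1 receives which piece of analysis information.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Voicing(Enum):
    VOICED = "voiced"
    UNVOICED = "unvoiced"
    SILENT = "silent"


@dataclass
class AnalysisFrame:
    lpc_coefficients: List[float]  # data a (spectral envelope)
    voicing: Voicing               # information b
    pitch_period: int              # information c, assumed to be in samples
    power: float                   # data d (short-time speech power)


def route(frame):
    """Map each decoded field to the FIG. 1 block that consumes it."""
    return {
        "speech synthesis filter 13": frame.lpc_coefficients,
        "pitch pulse generator 14": frame.pitch_period,
        "switch 15": frame.voicing,
        "variable amplifier 16": frame.power,
    }


frame = AnalysisFrame([0.5, -0.1], Voicing.VOICED, 80, 1.0)
for block, value in route(frame).items():
    print(block, "<-", value)
```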

The voiced/unvoiced/silent information b is supplied to the switch 15. When this data indicates voiced speech, the switch 15 routes the output of the pitch pulse generator 14 to the variable amplifier 16; when it indicates unvoiced speech or silence, the switch 15 routes the output of the noise generator 17 to the variable amplifier 16.

The noise generator 17 generates white noise, which is supplied to the variable amplifier 16 when the voiced/unvoiced/silent information b indicates unvoiced speech or silence.

The pitch pulse generator 14, supplied with the pitch period information c, generates a pitch pulse train at the frequency corresponding to this pitch period, shifts the pulses of that train to the positions indicated by the pitch pulse variable device 18, which is the feature of the present invention, and then supplies the result to the switch 15.

The variable amplifier 16 applies to the incoming pitch pulses or white noise a weighted amplification corresponding to the magnitude of the separately supplied short-time speech power data d, and supplies the result to the speech synthesis filter 13 as that filter's driving sound source.
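The switch and the variable amplifier can be sketched together as follows. The patent says only that the amplification is weighted according to the magnitude of the power data d; the specific rule used here, scaling the frame so that its mean-square value equals d, is an assumption, as are the function names and the string labels for the voicing decision.

```python
import math


def select_source(voicing, pulse_frame, noise_frame):
    """Switch 15: pulses for voiced frames, noise for unvoiced or silent ones."""
    return pulse_frame if voicing == "voiced" else noise_frame


def amplify_to_power(frame, target_power):
    """Variable amplifier 16 (assumed rule): match mean-square value to d."""
    mean_square = sum(x * x for x in frame) / len(frame)
    if mean_square == 0.0:
        return list(frame)  # nothing to scale
    gain = math.sqrt(target_power / mean_square)
    return [gain * x for x in frame]


pulses = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.5]
excitation = amplify_to_power(select_source("voiced", pulses, noise), 0.5)
print(excitation)
```

Under this rule a sparse pulse train receives a larger gain than dense noise for the same target power, because its energy is carried by only a few samples.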

The speech synthesis filter 13, using the supplied LPC coefficient data a as its filter coefficients and driven by the driving sound source, reproduces a quantized synthesized waveform for each analysis frame and supplies it to the D/A converter 19.

The D/A converter 19 converts the quantized synthesized waveform into an analog waveform and sends it to the LPF (Low Pass Filter) 20. The LPF 20 applies the prescribed high-frequency cutoff filtering and sends the result to the output line 21 as synthesized speech.

As described above, in a conventional speech synthesizer excessive energy is concentrated at the pulse excitation points that drive the speech synthesis filter 13 at every pitch period, that is, at fixed intervals, and the synthesized speech lacks naturalness. In this embodiment of the present invention, the pitch pulse variable device 18 deliberately moves the pulse excitation points. The excessive concentration of energy that would otherwise occur at every fixed period (pitch period) is thereby dispersed while the pitch period remains the basis of the excitation. In other words, by deliberately placing each pulse excitation point at the pitch period, slightly before it, or slightly after it, the excessive energy that conventionally occurred at fixed intervals is dispersed in time, which eliminates the auditory unnaturalness and markedly improves the naturalness of the synthesized speech.

Here, the pitch pulse variable device 18 uses an M-sequence, a typical pseudo-random number sequence. When the lower two bits of the M-sequence are (0, 0) or (1, 1), the pulse is excited exactly at the pitch period; when they are (1, 0), it is excited one sample (125 μs) earlier than the pitch period; and when they are (1, 0) [as printed; presumably (0, 1)], it is excited one sample later. The contents of the pitch pulse variable device 18 are updated every sample.
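The jitter rule above can be sketched with a linear feedback shift register (LFSR), the usual way to generate an M-sequence. Several details are assumptions, because the patent does not give them: the 7-bit register length, the feedback taps, the seed, the reading of the second "(1, 0)" case as (0, 1), and the choice to apply each offset relative to the nominal pitch grid rather than cumulatively. A sample time of 125 μs corresponds to an 8 kHz sampling rate, so a pitch period of 80 samples represents 100 Hz.

```python
def lfsr_step(state, nbits=7, taps=(7, 6)):
    """Advance a Fibonacci LFSR by one sample (assumed 7-bit, taps 7 and 6)."""
    feedback = 0
    for t in taps:
        feedback ^= (state >> (t - 1)) & 1
    return ((state << 1) | feedback) & ((1 << nbits) - 1)


def offset_from_low_bits(state):
    """Map the lower two bits of the register to a pulse offset in samples."""
    b1, b0 = (state >> 1) & 1, state & 1
    if b1 == b0:                              # (0, 0) or (1, 1): on the pitch period
        return 0
    return -1 if (b1, b0) == (1, 0) else 1    # (0, 1) assumed to mean "late"


def jittered_pulse_positions(period, n_samples, seed=0b1010101):
    """Nominal pitch grid with each pulse moved by -1, 0, or +1 sample."""
    states, state = [], seed
    for _ in range(n_samples):                # register is updated every sample
        states.append(state)
        state = lfsr_step(state)
    positions = []
    for nominal in range(0, n_samples, period):
        pos = nominal + offset_from_low_bits(states[nominal])
        positions.append(min(max(pos, 0), n_samples - 1))
    return positions


print(jittered_pulse_positions(period=80, n_samples=800))
```

Because every offset is taken relative to the nominal grid, the average pitch is unchanged and the pulses never drift by more than one sample from their nominal positions.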

[Effect of the Invention]

As explained above, according to the present invention the speech synthesizer is provided with a pitch pulse variable device and the pulse excitation points are set at the pitch pulse period or in its vicinity, so that excessive concentration of energy is prevented and the synthesized speech is kept from sounding mechanical and unnatural. The pitch pulse variable device, which varies the excitation points, thus disperses the energy in time and makes it possible to generate natural-sounding synthesized speech free of auditory unnaturalness.

BRIEF DESCRIPTION OF THE DRAWINGS FIG. 1 is a block diagram showing an embodiment of the speech synthesizer of the present invention.
11: demultiplexer; 12: transmission line; 13: speech synthesis filter; 14: pitch pulse generator; 15: switch; 16: variable amplifier; 17: noise generator; 18: pitch pulse variable device; 19: D/A converter; 20: LPF.

Claims (1)

[Claims] A speech synthesizer that receives spectral envelope information representing the macroscopic structure of a speech signal analyzed by a speech analyzer and sound source information representing the fine structure of the speech signal, drives the spectral envelope information with the sound source information, and synthesizes speech with a speech synthesis filter, the speech synthesizer comprising: a pitch pulse generator that generates a pitch pulse train of a predetermined frequency based on pitch period information in the sound source information; a pitch pulse variable device for changing the generation positions of the pitch pulse train generated by the pitch pulse generator; a noise generator for generating white noise; and a switch that selects the output of the pitch pulse generator when the voiced/unvoiced/silent information in the sound source information indicates voiced speech and selects the output of the noise generator when it indicates unvoiced speech or silence, wherein the speech synthesis filter is driven based on the output of the switch.
JP1049958A 1989-03-03 1989-03-03 Voice synthesizer Pending JPH02230300A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP1049958A JPH02230300A (en) 1989-03-03 1989-03-03 Voice synthesizer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP1049958A JPH02230300A (en) 1989-03-03 1989-03-03 Voice synthesizer

Publications (1)

Publication Number Publication Date
JPH02230300A true JPH02230300A (en) 1990-09-12

Family

ID=12845541

Family Applications (1)

Application Number Title Priority Date Filing Date
JP1049958A Pending JPH02230300A (en) 1989-03-03 1989-03-03 Voice synthesizer

Country Status (1)

Country Link
JP (1) JPH02230300A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS58186800A (en) * 1982-04-26 1983-10-31 日本電気株式会社 Voice synthesizer
JPS60260100A (en) * 1984-06-07 1985-12-23 日本電気株式会社 Voice synthesizer

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1526105A2 (en) 1998-01-20 2005-04-27 Gannett Satellite Information Network, Inc. Information distribution system for use in an elevator
JP2016505873A (en) * 2013-01-11 2016-02-25 華為技術有限公司Huawei Technologies Co.,Ltd. Audio signal encoding and decoding method and audio signal encoding and decoding apparatus
JP2017138616A (en) * 2013-01-11 2017-08-10 華為技術有限公司Huawei Technologies Co.,Ltd. Audio signal encoding and decoding method and audio signal encoding and decoding apparatus
US9805736B2 (en) 2013-01-11 2017-10-31 Huawei Technologies Co., Ltd. Audio signal encoding and decoding method, and audio signal encoding and decoding apparatus
US10373629B2 (en) 2013-01-11 2019-08-06 Huawei Technologies Co., Ltd. Audio signal encoding and decoding method, and audio signal encoding and decoding apparatus

Similar Documents

Publication Publication Date Title
US5682502A (en) Syllable-beat-point synchronized rule-based speech synthesis from coded utterance-speed-independent phoneme combination parameters
JPH06110498A (en) Speech-element coding in speech synthesis system, pitch adjusting method thereof and voiced-sound synthesis device
WO2003010752A1 (en) Speech bandwidth extension apparatus and speech bandwidth extension method
JPH02230300A (en) Voice synthesizer
JPH10207496A (en) Voice encoding device and voice decoding device
JP2583883B2 (en) Speech analyzer and speech synthesizer
JP3510168B2 (en) Audio encoding method and audio decoding method
JP3057907B2 (en) Audio coding device
JPS6346498A (en) Rhythm control system
JP3166697B2 (en) Audio encoding / decoding device and system
JP2001154683A (en) Device and method for voice synthesizing and recording medium having voice synthesizing program recorded thereon
JP3368949B2 (en) Voice analysis and synthesis device
JPS60260100A (en) Voice synthesizer
JPH0411040B2 (en)
JP2898641B2 (en) Audio coding device
JP2535809B2 (en) Linear predictive speech analysis and synthesis device
JPH0377999B2 (en)
JP2000099094A (en) Time series signal processor
JP2580123B2 (en) Speech synthesizer
JPH06250685A (en) Voice synthesis system and rule synthesis device
JPH11184499A (en) Voice encoding method and voice encoding method
JPH10232699A (en) Lpc vocoder
JPH08160993A (en) Sound analysis-synthesizer
JPH01261700A (en) Voice coding system
JPH01283600A (en) Residue driving type speech synthesis device