JPS6078492A

JPS6078492A - Synthetic voice evaluating apparatus

Info

Publication number: JPS6078492A
Application number: JP58186546A
Authority: JP
Inventors: 西澤　逹夫
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1983-10-05
Filing date: 1983-10-05
Publication date: 1985-05-04

Abstract

(57) [Summary] This publication contains application data filed before electronic filing was introduced, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to a synthesized speech evaluation device that is needed when developing a speech synthesis system.

In the development of a speech synthesis system, the most important part is the speech analysis process that produces the synthesis parameters used by the speech synthesis LSI; the sound quality of the synthesized speech is determined at this stage.

General speech analysis consists of the following processes: A/D conversion of the original speech recorded on magnetic tape or a similar medium; analysis of that original speech data according to a given synthesis method (ADPCM, formant, etc.); simulation of the speech synthesis LSI based on the analysis results; evaluation of the synthesized sound quality, with correction of the analysis results if a problem is found; and conversion of the analysis results into the parameters actually used by the synthesis LSI.

Here, with relatively simple analysis methods such as ADPCM, which code the original speech data more or less as is, the data compression ratio is low (1/2 to 1/3) and the analysis results need almost no correction. In contrast, with parameter methods that analyze the transfer characteristics of the vocal tract, such as LPC, PARCOR, and formant, the data compression ratio is high (1/10 to 1/20), but the analysis results must be corrected in order to obtain high-quality synthesized speech.

The main means of judgment commonly used in this correction work are listening to the synthesized speech and comparing the synthesized speech waveform, printed on a printer or similar device, with the waveform of the original speech.

The synthesized speech is generated by a synthesis program that simulates the speech synthesis LSI, and this program is naturally built to generate exactly the same data that the speech synthesis LSI outputs. Consequently, the various constraints of the speech synthesis LSI, such as bit-length limits and parameter quantization, greatly restrict the dynamic range. The amplitude of the synthesized speech tends to be markedly smaller than that of the original speech, and this imbalance often hindered comparison of the two.
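
To give a rough sense of how a bit-length limit alone can produce such an imbalance, the following sketch compares the full-scale range of two signed sample formats. The word lengths (16 bits for the digitized original, 10 bits for the LSI output) are hypothetical values chosen for illustration; the source does not state them.

```python
def full_scale(bits: int) -> int:
    """Largest positive value of a signed two's-complement sample of `bits` bits."""
    if bits < 2:
        raise ValueError("a signed sample needs at least 2 bits")
    return (1 << (bits - 1)) - 1


def peak_ratio(original_bits: int, lsi_bits: int) -> float:
    """How many times larger the original's full scale is than the LSI's."""
    return full_scale(original_bits) / full_scale(lsi_bits)


# Hypothetical word lengths, NOT taken from the patent text.
ORIGINAL_BITS = 16  # assumed word length of the digitized original speech
LSI_BITS = 10       # assumed output word length of the synthesis LSI

if __name__ == "__main__":
    print("original full scale:", full_scale(ORIGINAL_BITS))
    print("LSI full scale:     ", full_scale(LSI_BITS))
    print("ratio:              ", peak_ratio(ORIGINAL_BITS, LSI_BITS))
```

Under these assumed word lengths, the two waveforms drawn on the same vertical scale would differ in height by a factor of about 64, which is the kind of mismatch the text describes.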

The object of the present invention is to provide a synthesized speech evaluation device that automatically corrects the amplitude imbalance described above and thereby makes the work of correcting the analysis results more efficient.

The synthesized speech evaluation device of the present invention comprises a storage device that stores original speech data and synthesized speech data, a D/A unit that converts the original speech data and the synthesized speech data into original speech and synthesized speech, respectively, and a graphic printer that prints the waveforms of the original speech and the synthesized speech. It is characterized by having means for performing an operation that uses the ratio between the synthesized speech and the original speech as a coefficient, so that the graphic printer prints the synthesized speech at an amplitude approximately equal to that of the original speech.

Next, an embodiment of the present invention will be described with reference to the drawings.

FIG. 1 shows one embodiment of the invention. It consists of a storage device 9 that stores speech data, a graphic printer 11 that prints speech waveforms, a D/A unit 12 that outputs speech, and a controller 10 that controls them.

FIG. 2 is a diagram explaining the operation of one embodiment of the invention.

This embodiment consists of original speech data 1, synthesized speech data 2, a waveform display program 3 that reads these data and produces the printed waveform 5, and a D/A conversion program 4 that converts the speech data into analog speech 6.

FIG. 3 is a flowchart of the D/A conversion program 4 in FIG. 1. After the original speech data and the synthesized speech data are read in and their maximum amplitude values are detected (13), the amplitude adjustment coefficient is calculated (14). The amplitude adjustment coefficient is calculated with the following formula.

Amplitude adjustment coefficient = maximum amplitude of original speech / maximum amplitude of synthesized speech

The synthesized speech data is multiplied by this coefficient, and the creation of work data for D/A conversion followed by the D/A conversion itself (15, 17) then generates the amplitude-adjusted analog speech 6.
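
The peak detection, coefficient calculation, and scaling steps described above can be sketched as follows. This is a minimal illustration and not the program from the patent: the function names are hypothetical, samples are assumed to be plain numeric sequences, and the guard against a silent synthesized signal is an added assumption.

```python
from typing import List, Sequence


def max_amplitude(samples: Sequence[float]) -> float:
    """Maximum absolute sample value (corresponds to peak detection, step 13)."""
    return max((abs(s) for s in samples), default=0.0)


def amplitude_adjustment_coefficient(
    original: Sequence[float], synthesized: Sequence[float]
) -> float:
    """Coefficient = original peak / synthesized peak (corresponds to step 14)."""
    peak_syn = max_amplitude(synthesized)
    if peak_syn == 0:
        # Added assumption: the source does not discuss an all-zero signal.
        raise ValueError("synthesized speech is silent; coefficient is undefined")
    return max_amplitude(original) / peak_syn


def adjust_amplitude(
    original: Sequence[float], synthesized: Sequence[float]
) -> List[float]:
    """Scale the synthesized samples so their peak matches the original's peak."""
    k = amplitude_adjustment_coefficient(original, synthesized)
    return [k * s for s in synthesized]


if __name__ == "__main__":
    original = [0, 1000, -2000, 500]   # made-up sample values
    synthesized = [0, 40, -100, 25]    # made-up sample values
    print(amplitude_adjustment_coefficient(original, synthesized))
    print(adjust_amplitude(original, synthesized))
```

Because only a single gain is applied, the shape of the synthesized waveform is unchanged; only its scale is brought in line with the original.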

FIGS. 4(a), (b), and (c) show examples of waveforms printed by the waveform display program, in which the D/A portion of FIG. 3 is replaced with waveform printing. Compared with the original speech waveform in FIG. 4(a), the synthesized speech waveform in FIG. 4(b), for which no amplitude adjustment has been made, has a small amplitude level, so its details (especially near 0) are unclear. In contrast, the amplitude-adjusted synthesized speech waveform in FIG. 4(c) is improved in this respect and is clear down to the details.
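
Why a small waveform loses detail on a printer can be illustrated by counting how many distinct print rows it occupies. The sketch below assumes a hypothetical printer with 41 rows spanning the full scale of the original speech, and uses a made-up sine wave as test data; none of these values come from the source.

```python
import math
from typing import Sequence


def to_row(sample: float, full_scale: float, rows: int = 41) -> int:
    """Map a sample in [-full_scale, full_scale] to a row index 0..rows-1."""
    half = (rows - 1) // 2
    clipped = max(-full_scale, min(full_scale, sample))
    return half + round(clipped / full_scale * half)


def rows_used(samples: Sequence[float], full_scale: float, rows: int = 41) -> int:
    """Number of distinct print rows the waveform occupies."""
    return len({to_row(s, full_scale, rows) for s in samples})


if __name__ == "__main__":
    full_scale = 2000.0  # assumed peak of the original speech
    # Made-up synthesized waveform whose peak is 1/20 of the original's.
    synthesized = [100.0 * math.sin(2 * math.pi * n / 32) for n in range(32)]
    k = full_scale / max(abs(s) for s in synthesized)  # adjustment coefficient
    adjusted = [k * s for s in synthesized]
    print("rows used without adjustment:", rows_used(synthesized, full_scale))
    print("rows used with adjustment:   ", rows_used(adjusted, full_scale))
```

With these assumed values the unadjusted waveform is squeezed into a few rows around zero, while the adjusted one spreads across the printable range, mirroring the difference between FIG. 4(b) and FIG. 4(c).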

As explained above, according to the present invention, the amplitude of the synthesized speech is adjusted automatically even when the amplitudes of the original speech and the synthesized speech differ markedly. Abnormal points in the synthesized speech can therefore be extracted easily and grasped accurately, which improves the efficiency of speech analysis and also contributes to better synthesized sound quality.

[Brief description of the drawings]

FIG. 1 is a block diagram of one embodiment of the present invention; FIG. 2 is an explanatory diagram showing the operation of FIG. 1; FIG. 3 is a flowchart of the D/A conversion program in FIG. 1; and FIGS. 4(a), (b), and (c) are waveform diagrams showing examples of printing by the waveform display program in FIG. 1.

1: original speech data, 2: synthesized speech data, 3: waveform display program, 4: D/A conversion program, 5: printed waveform, 6: analog speech, 7: original speech data (work), 8: synthesized speech data (work), 9: storage device, 10: controller, 11: graphic printer, 12: D/A unit.

Claims

[Claims]

A synthesized speech evaluation device comprising a storage device that stores original speech data and synthesized speech data, a D/A unit that converts the original speech data and the synthesized speech data into original speech and synthesized speech, respectively, and a graphic printer that prints the waveforms of the original speech and the synthesized speech, characterized by having means for performing an operation that uses the ratio between the synthesized speech and the original speech as a coefficient, so that the graphic printer prints the synthesized speech at an amplitude approximately equal to that of the original speech.