JPS62143098A

JPS62143098A - Synthesization of monosyllable for evaluating clearness

Info

Publication number: JPS62143098A
Application number: JP60282747A
Authority: JP
Inventors: 利光蓑輪
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1985-12-18
Filing date: 1985-12-18
Publication date: 1987-06-26

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION (Field of Industrial Application) The present invention relates to a method for synthesizing monosyllables used to evaluate rule-synthesized speech.

(Prior Art) In rule-based speech synthesis, monosyllables are generally synthesized by rule in order to evaluate the intelligibility of the synthesized speech.

FIGS. 1 and 2 are syllable concatenation diagrams showing a conventional method for rule-based synthesis of monosyllables. FIG. 1 is a power diagram of a monosyllable: the interval from 1 to 2 is a silent interval, the interval from 2 to 3 is a transient interval, and the interval from 3 to 5 is a voiced interval. Note that some syllables lack this 1-to-2 unvoiced interval.

Conventionally, the speech power in rule-based synthesis is formed as follows. For the unvoiced and transient intervals from 1 to 3, the speech power obtained by spectral analysis is used. The interval from 3 to 4 is linearly interpolated, and the interval from 4 to 5 is linearly interpolated so that the speech power becomes 0 at frame 5. The spectral parameters are handled in exactly the same way: the parameters obtained at spectral analysis are used only for the interval from 1 to 3 and for frame 4. An arbitrary syllable duration is then obtained by adjusting the lengths (durations) of the intervals from 3 to 4 and from 4 to 5.
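The conventional power envelope described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `power_envelope`, its arguments, and the idea of measuring each interval as a frame count are assumptions made for the example.

```python
def power_envelope(analysed, p4, n34, n45):
    """Build a per-frame power envelope for one monosyllable.

    analysed : analysed power values for frames 1..3 (last item is frame 3)
    p4       : analysed power at frame 4
    n34, n45 : number of frames in the 3-to-4 and 4-to-5 intervals
               (hypothetical parameters; changing them changes the duration)
    """
    if not analysed or n34 < 1 or n45 < 1:
        raise ValueError("need analysed frames and at least one frame per interval")

    p3 = analysed[-1]
    envelope = list(analysed)  # frames 1..3: analysed power used as-is

    # Interval 3 -> 4: linear interpolation from p3, ending exactly at p4.
    for k in range(1, n34 + 1):
        envelope.append(p3 + (p4 - p3) * k / n34)

    # Interval 4 -> 5: linear interpolation from p4 down to 0 at frame 5.
    for k in range(1, n45 + 1):
        envelope.append(p4 * (1 - k / n45))

    return envelope


short = power_envelope([0.0, 0.2, 1.0], p4=0.8, n34=4, n45=4)
longer = power_envelope([0.0, 0.2, 1.0], p4=0.8, n34=8, n45=6)
print(len(short), len(longer))
```

Only `n34` and `n45` differ between the two calls, which mirrors the text: the analysed head of the syllable is left untouched and the duration is set by stretching the two interpolated intervals.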

FIG. 2 shows pitch patterns. Conventionally, synthesis used a flat pitch frequency, as indicated by 6, 7, and 8.

However, in such conventional rule-based synthesis of monosyllables, the syllable length was matched to that of the original speech and therefore differed from syllable to syllable. Furthermore, because the pitch pattern was flat, the synthesized sound had a musical-tone-like quality and was unnatural and hard to listen to.

Such synthesized sounds had the drawback that subjects in a speech intelligibility test could not concentrate on the task of identifying syllables.

(Problems to be Solved by the Invention) The object of the present invention is to eliminate the above drawbacks of the prior art and reduce the psychological discomfort of subjects in speech intelligibility tests, so that syllable intelligibility can be measured accurately and subjects can make correct judgments.

(Means for Solving the Problems) To achieve the above object, the present invention sets a standard syllable length to reduce the variation in syllable length among the syllables used in a speech intelligibility test, and gives the pitch pattern a convex variation, thereby eliminating the musical-tone-like quality and yielding rule-synthesized speech close to natural speech.
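One way to picture the standard syllable length is as a frame budget: the analysed head of the syllable (frames 1 to 3) is fixed, and the two interpolated intervals are sized so that the total matches the standard. The sketch below assumes the 224 ms standard length given in the embodiment; the 8 ms frame period, the `tail_ratio` split between the two intervals, and the function name are hypothetical and do not come from the patent.

```python
def fit_vowel_segments(n_head, target_ms=224.0, frame_ms=8.0, tail_ratio=0.5):
    """Return (n34, n45): frame counts for the 3-to-4 and 4-to-5 intervals.

    n_head     : number of analysed frames covering intervals 1 to 3
    target_ms  : standard syllable length (224 ms in the embodiment)
    frame_ms   : frame period (assumed value, not stated in the source)
    tail_ratio : share of the remaining frames given to interval 4 to 5
                 (assumed; the source does not say how the two are split)
    """
    total_frames = round(target_ms / frame_ms)
    remaining = total_frames - n_head
    if remaining < 2:
        # The head alone is too long for the standard length; such a
        # syllable would have to be handled as an exception.
        raise ValueError("analysed head does not fit the standard length")

    n45 = max(1, min(remaining - 1, round(remaining * tail_ratio)))
    n34 = remaining - n45
    return n34, n45


print(fit_vowel_segments(8))
```

Syllables with different head lengths then end up with the same total frame count, which is the property the text asks for. A syllable whose analysed head does not fit raises an error here, loosely corresponding to the longer syllables that the embodiment treats as exceptions.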

(Operation) Forming rule-synthesized speech in this way reduces the psychological discomfort of subjects in a speech intelligibility test, so that syllable intelligibility can be measured accurately and subjects can make correct judgments.

(Embodiment) The present invention is described in detail below by way of one embodiment, with reference again to FIGS. 1 and 2.

First, in FIG. 2, the curve indicated by 6, 9, and 10 is the pitch pattern used to implement the present invention. It is a convex pitch pattern that rises from the start of the transient interval 2 to 3 in FIG. 1 up to the vowel interpolation start frame 3, and falls from the vowel interpolation start frame 3 to the final frame of the syllable. The pitch frequencies at positions 6, 9, and 10 are 214 Hz, 222 Hz, and 209 Hz, respectively; these values were obtained by averaging pitch patterns extracted from natural speech. Similarly, the syllable length was set to 224 ms, with only longer syllables such as /shu/ handled as exceptions, and the monosyllables were formed by rule-based synthesis.
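The convex pitch pattern can be sketched as a rise from 214 Hz to a 222 Hz peak at the vowel interpolation start frame, followed by a fall to 209 Hz at the end of the syllable. The source gives only these three values; the straight-line shape between them, the frame counts, and the function name `convex_pitch` are assumptions made for illustration.

```python
def convex_pitch(n_rise, n_fall, f_start=214.0, f_peak=222.0, f_end=209.0):
    """Return per-frame pitch values (Hz) for a convex pitch pattern.

    n_rise : frames from the start of the transient interval to frame 3
    n_fall : frames from frame 3 to the final frame of the syllable
    The piecewise-linear shape is an assumption; the patent only states
    the pitch at the start (214 Hz), peak (222 Hz), and end (209 Hz).
    """
    if n_rise < 1 or n_fall < 1:
        raise ValueError("each segment needs at least one frame")

    # Rising part, including both the start point and the peak.
    rise = [f_start + (f_peak - f_start) * k / n_rise for k in range(n_rise + 1)]
    # Falling part, starting one frame after the peak and ending at f_end.
    fall = [f_peak + (f_end - f_peak) * k / n_fall for k in range(1, n_fall + 1)]
    return rise + fall


contour = convex_pitch(n_rise=4, n_fall=10)
print(contour[0], max(contour), contour[-1])
```

A flat conventional pattern would correspond to a list holding one constant value; here the peak falls on the frame where the vowel interpolation begins, matching the description of frame 3.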

The table below shows the results of a speech intelligibility test conducted using rule-synthesized speech obtained as described above, by giving the pitch pattern variation and fixing the syllable length, so that the syllables have the same length and pitch height and no musical-tone-like quality. As these results confirm, monosyllables for intelligibility evaluation according to the present invention improve syllable intelligibility.

(Effects of the Invention) As is clear from the above description, in the present invention the duration of each syllable is approximately equal, so subjects in a speech intelligibility test need not be concerned about differences in syllable duration and can concentrate on identifying syllables. In addition, because there is no musical-tone-like quality, subjects can concentrate on the identification task without discomfort. The invention therefore has the advantage that speech intelligibility can be evaluated accurately, and a marked effect is obtained in practice.

[Brief explanation of drawings]

FIGS. 1 and 2 are, respectively, a syllable concatenation diagram and a pitch pattern diagram for explaining the conventional example and the present invention.
1 to 2 (interval): silent interval; 2 to 3 (interval): transient interval; 3 to 4 and 4 to 5: linear interpolation intervals.

Claims

[Claims]

A method for synthesizing monosyllables for intelligibility evaluation, characterized in that, in the rule-based synthesis of monosyllables used for intelligibility testing of rule-synthesized speech, the pitch pattern is not made flat but is approximated to the pitch pattern of syllables in natural speech, and the syllable duration is kept approximately constant regardless of the syllable.