JP2679039B2

JP2679039B2 - Vowel cutting device

Info

Publication number: JP2679039B2
Application number: JP62053315A
Authority: JP
Inventors: 修司高田; 道代後藤; 豊上川
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1987-03-09
Filing date: 1987-03-09
Publication date: 1997-11-19
Anticipated expiration: 2012-11-19
Also published as: JPS63220200A

Description

DETAILED DESCRIPTION OF THE INVENTION

Industrial Field of Application

In vowel pronunciation practice using language practice devices, speech training devices and the like, it is desirable to analyze and evaluate not only vowels uttered in isolation but also vowels within more naturally uttered words. The present invention relates to a vowel cutout device for cutting out vowel analysis frames from an uttered word that contains only one vowel, for use in such language practice devices and speech training devices.

Prior Art

Conventionally, vowel cutout has been performed by processing based on the signal level or by processing based on the amount of spectral change. The spectral-change approach calculates LPC cepstrum coefficients or the like for each frame and then takes the distance between the coefficients of adjacent frames. It allows accurate cutout but requires a great deal of computation time. Signal-level processing is therefore widely used as a simple method of cutting out vowels.

An example of the conventional vowel cutout device is described below with reference to the drawings. FIG. 2 is a block diagram of the essential parts of a conventional vowel cutout device. In the figure, 21 is a level detection unit and 22 is a level peak detection unit.

The operation of the conventional vowel cutout device configured as above is as follows. The absolute value of the waveform amplitude is used as the parameter representing the signal level. The level detection unit 21 calculates the signal level P_n (n = 1 to N) of each frame in sequence, from the start frame 1 to the end frame N, where P_n is computed from the absolute values of X_1, ..., X_m, the speech waveform data of the n-th frame. The level peak detection unit 22 searches the signal level sequence P_1, ..., P_N for the frame with the maximum level and cuts out the required number of frames before and after it as vowel frames (see FIG. 3).

Problems to Be Solved by the Invention

In a vowel cutout device configured as above, the position within the vowel section of the frame with the maximum level varies greatly between speakers and, even for the same speaker, from one utterance to the next. The present invention has been made in view of this point, and its object is to provide a vowel cutout device that can cut out stable vowel frames in a simple manner.

Means for Solving the Problems

To achieve the above object, the present invention comprises: a level detection unit that detects, for each frame of a fixed time length, the signal level of input word speech waveform data containing only one single vowel as its vowel; a cumulative level calculation unit that calculates, for each frame, the cumulative value of the signal level from the start frame; and a cumulative level comparison unit that detects the frame at which the ratio of this cumulative value to the total cumulative value up to the end frame exceeds a threshold set in advance for each word.

Operation

With the above configuration, the present invention determines the vowel cutout frame from the ratio of the cumulative signal level. The vowel section also varies from one input word to another. Because the input word name is known in advance, the consonants before and after the vowel are also known, and this variation is handled by changing the ratio according to the combination of the vowel and its surrounding consonants.

Embodiment

A vowel cutout device according to an embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram of the essential parts of an embodiment of the vowel cutout device of the present invention. In FIG. 1, 11 is a level detection unit, 12 is a cumulative level calculation unit, and 13 is a cumulative level comparison unit.

The operation of the vowel cutout device configured as above is as follows. The absolute value of the waveform amplitude is used as the parameter representing the signal level.

Like the level detection unit 21 of the conventional example, the level detection unit 11 calculates the signal level P_n (n = 1 to N) of each frame in sequence, from the start frame 1 to the end frame N.

The cumulative level calculation unit 12 calculates, for each frame, the cumulative value Q_n (n = 1 to N) of the signal level from the start frame, that is, Q_n = P_1 + P_2 + ... + P_n.

The cumulative level comparison unit 13 compares the threshold K_j set for each word (j: word number) with the ratio Q_i / Q_N of the cumulative value Q_i up to each frame to the total cumulative value Q_N up to the end frame. It detects the frame at which the ratio exceeds the threshold and cuts out the required number of frames before and after it as vowel frames. FIG. 4 shows an example with a threshold of 50%.

The threshold K_j is set as follows. Vowel cutout by spectral-change processing is applied to word utterances, each containing only one single vowel, from many speakers. The cumulative level ratio at the vowel frames cut out in this way is then averaged for each word. This value varies less between speakers and between utterances than the frame found by the conventional level peak detection method, so stable vowel frames can be cut out.

Effects of the Invention

As described above, by providing the cumulative level calculation unit and the cumulative level comparison unit, the present invention can perform stable vowel cutout in a simple manner.
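
The embodiment's three units can be sketched in Python as follows. This is an illustrative sketch, not the patented implementation. It assumes that the signal level P_n is the sum of absolute sample values in a frame (the text states only that the absolute value of the waveform amplitude is used), that frames do not overlap, and that the first frame whose ratio exceeds the threshold is the one detected. The function names and the `before`/`after` parameters are hypothetical.

```python
from typing import List, Sequence


def frame_levels(samples: Sequence[float], frame_len: int) -> List[float]:
    """Level detection: P_n per frame, assumed here to be the sum of |X_k|."""
    levels = []
    # Non-overlapping frames; a trailing partial frame is dropped (assumption).
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        levels.append(sum(abs(x) for x in frame))
    return levels


def cumulative_levels(levels: Sequence[float]) -> List[float]:
    """Cumulative level calculation: Q_n = P_1 + ... + P_n."""
    total = 0.0
    cumulative = []
    for p in levels:
        total += p
        cumulative.append(total)
    return cumulative


def find_vowel_frame(levels: Sequence[float], threshold: float) -> int:
    """Cumulative level comparison: 0-based index of the first frame
    whose ratio Q_i / Q_N exceeds the per-word threshold K_j."""
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")
    cumulative = cumulative_levels(levels)
    if not cumulative or cumulative[-1] <= 0.0:
        raise ValueError("utterance has no signal energy")
    total = cumulative[-1]
    for i, q in enumerate(cumulative):
        if q / total > threshold:
            return i
    raise AssertionError("unreachable: the last ratio is 1.0 > threshold")


def vowel_frames(n_frames: int, center: int, before: int, after: int) -> List[int]:
    """Indices of the detected frame plus the required neighbours, clipped
    to the utterance boundaries."""
    first = max(0, center - before)
    last = min(n_frames - 1, center + after)
    return list(range(first, last + 1))
```

For example, with frame levels `[1, 1, 4, 6, 4, 2, 2]` and a threshold of 0.5, the cumulative ratios are 0.05, 0.1, 0.3, 0.6, ..., so the frame at index 3 is detected. Because the ratio is an integral over the whole utterance, a small shift in where the loudest frame falls moves the detected index much less than it moves the peak.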

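The description sets each threshold K_j by averaging, per word, the cumulative level ratio observed at vowel frames found by the slower spectral-change method. A minimal Python sketch of that calibration step is given below. It assumes the reference vowel frame index for each training utterance is already available from such a method. The data layout (a dictionary keyed by word number j) and the function names are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple

# One training utterance: (frame levels P_1..P_N, reference vowel frame index).
Utterance = Tuple[Sequence[float], int]


def cumulative_ratio_at(levels: Sequence[float], frame_index: int) -> float:
    """Ratio Q_i / Q_N at the given 0-based frame index."""
    total = sum(levels)
    if total <= 0:
        raise ValueError("utterance has no signal energy")
    return sum(levels[:frame_index + 1]) / total


def calibrate_thresholds(training: Dict[int, List[Utterance]]) -> Dict[int, float]:
    """K_j for each word number j: the mean cumulative ratio at the
    reference vowel frames over all training utterances of that word."""
    return {
        word: mean(cumulative_ratio_at(levels, idx) for levels, idx in utterances)
        for word, utterances in training.items()
    }
```

The resulting dictionary can then be looked up by word number at run time, so the expensive spectral analysis is needed only once, during calibration.
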
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of the essential parts of a vowel cutout device according to an embodiment of the present invention. FIG. 2 is a block diagram of the essential parts of a conventional vowel cutout device. FIG. 3 is a graph showing the signal level. FIG. 4 is a graph showing the cumulative value of the signal level.

11: level detection unit, 12: cumulative level calculation unit, 13: cumulative level comparison unit, 21: level detection unit, 22: level peak detection unit.

Claims

(57) [Claims] A vowel cutout device comprising: a level detection unit that detects, for each frame of a fixed time length, the signal level of input word speech waveform data containing only one single vowel as its vowel; a cumulative level calculation unit that calculates, for each frame, the cumulative value of the signal level from the start frame; and a cumulative level comparison unit that detects a frame at which the ratio of the cumulative value to the total cumulative value up to the end frame exceeds a threshold set in advance for each word.