JPS60201376A - Enunciation training machine - Google Patents

Enunciation training machine

Info

Publication number
JPS60201376A
Authority
JP
Japan
Prior art keywords
voice
section
vocalization
sample
pitch
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP59057602A
Other languages
Japanese (ja)
Inventor
藤田 孝弥
奈良 泰弘
純一 棚橋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Priority to JP59057602A priority Critical patent/JPS60201376A/en
Publication of JPS60201376A publication Critical patent/JPS60201376A/en
Pending legal-status Critical Current


Landscapes

  • Electrically Operated Instructional Devices (AREA)

Abstract

(57) [Abstract] This publication contains application data filed before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION

(a) Technical Field of the Invention
The present invention relates to a vocal training machine that performs vocal training by detecting and displaying, on the basis of the voice power and pitch information of a trainee's utterance, its differences from a sample utterance in length, strength, pitch, and the presence or absence of inserted unnecessary vowels.

(b) Technical Background
When teaching oneself the pronunciation of a foreign language or the like, the usual method is to play back a standard speaker's voice recorded on magnetic tape or a similar medium and to train by listening to the utterances and imitating them.

With this method, however, the trainee judges subjectively whether his or her own pronunciation is correct; depending on the trainee's powers of judgment, an incorrect pronunciation may be accepted as correct, so training can proceed without truly correct pronunciation ever being achieved.

For this reason, there has been a demand for a vocal training machine that can objectively judge whether an utterance is correct and thus makes efficient vocal training possible.

(c) Prior Art and Its Problems
Next, conventional vocal training machines will be described.

As mentioned in section (b), conventional self-taught vocal training consisted of playing back a teacher's voice from magnetic tape or the like and practicing by listening to it; because no objective judgment was made as to whether an utterance was correct, its effect was limited. Methods that use speech analysis techniques to compare and evaluate the sample speaker's speech spectrum against the speech uttered by the trainee were therefore proposed in the applications JP-A-57-44178 "Voice Training Device" and JP-A-58-172680 "Pronunciation Practice Device".

However, because these inventions evaluate the comparison between the sample speaker's utterance and the practitioner's utterance by computing a similarity score, they do not tell the practitioner concretely what points to attend to in order to correct the utterance, so the practice is of limited effectiveness.

(d) Object of the Invention
The object of the present invention is to provide a novel vocal training machine that eliminates the above drawbacks; in particular, to realize a vocal training machine that enables more effective vocal training by quantitatively analyzing the differences between the sample speaker's utterance and the practitioner's utterance and presenting those differences concretely and accurately.

(e) Structure of the Invention
The present invention provides: voice analysis means for extracting the voice power and pitch of the sample speaker's utterance and of the trainee's utterance; feature analysis means for quantitatively analyzing the differences between the sample speaker's utterance and the trainee's utterance as extracted by the voice analysis means; and display means for displaying the differences between the two utterances obtained by the voice analysis means so that they can be recognized visually. By presenting the points of difference between the sample speaker's utterance and the trainee's utterance concretely and accurately, such a vocal training machine achieves more effective vocal training.

(f) Embodiment of the Invention
The present invention will now be described with reference to the drawings.

FIG. 1 shows an embodiment of the vocal training machine according to the present invention.

FIG. 2 shows an embodiment of the feature analysis unit according to the present invention. FIG. 3 illustrates differences in accent: (A) shows the utterance of a word in terms of power and pitch, and (B) shows its quantitative analysis. FIG. 4 illustrates differences between the sample speaker's utterance and the trainee's utterance: (A) the case where an unnecessary vowel is inserted, and (B) the case where the vowel lengths differ. FIG. 5 illustrates differences between conversational utterances: (A) represented by power and pitch, (B) aligned by DP matching, and (C) quantitatively analyzed. FIG. 6 shows the quantification parameter table.

In the figures, 1 is a microphone, 2 is a voice storage unit, 3 is a playback unit, 4 is an input switching unit, and 5 is a speaker.

6 is an analog-to-digital converter (hereinafter, A/D converter), 7 is an analysis unit, and 8 is a feature analysis unit.

9 is a feature storage unit, 10 is a comparison/evaluation unit, 11 is a judgment condition storage unit, 12 is a display unit, 13 is an utterance time detection unit, 14 is an utterance section detection unit, and 15 is a power detection unit.

16 is a peak detection unit, 17 is a pitch detection unit, and 18 is a pitch amplitude detection unit.

Here TM denotes the utterance time; T1–Tn the utterance sections; E1–En the power amounts; P1–Pn the peak values; PT1–PTn the pitches; PTmax the maximum pitch; PTmin the minimum pitch; (1) the sample speaker's utterance (the teacher's voice); and (2), (2)' the trainee's utterances (the student's voice).

This embodiment comprises: a microphone 1 for inputting the trainee's utterance; a voice storage unit 2 in which the sample speaker's utterances are stored; a playback unit 3 that plays back the sample speaker's utterance from the voice storage unit 2; an input switching unit 4 that switches the input between the utterance from the microphone 1 and the reproduced sound from the playback unit 3; a speaker 5 that reproduces as sound the utterance selected by the input switching unit 4; an A/D converter 6 that converts the utterance (an analog signal) selected by the input switching unit 4 into a digital signal; an analysis unit 7 that analyzes the digitized utterance and extracts its power and pitch; a feature analysis unit 8 that analyzes the power and pitch waveforms extracted by the analysis unit 7 and extracts quantitative features; a feature storage unit 9 that stores the quantitative features extracted by the feature analysis unit 8; a comparison/evaluation unit 10 that compares the features stored in the feature storage unit 9 with the features newly output from the feature analysis unit 8 according to the conditions in the judgment condition storage unit 11; the judgment condition storage unit 11, which stores the fixed conditions used by the comparison/evaluation unit 10 for its judgments; and a display unit 12 that displays the output of the comparison/evaluation unit 10.

The feature analysis unit 8 shown in FIG. 2 comprises: an utterance time detection unit 13 that detects the utterance time TM from the power waveform; an utterance section detection unit 14 that likewise detects the utterance sections T1–Tn from the power waveform; a power detection unit 15 that detects the power amounts E1–En from the power waveform and the section signals T1–Tn output by the utterance section detection unit 14; a peak detection unit 16 that likewise detects the peak values P1–Pn from the power waveform and the section signals T1–Tn; a pitch detection unit 17 that detects the pitches PT1–PTn from the pitch waveform and the section signals T1–Tn; and a pitch amplitude detection unit 18 that detects the pitch amplitude (PTmax − PTmin) from the pitch waveform.
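The utterance section detection just described can be sketched as simple thresholding of the frame-power waveform. This is a minimal illustration, not the patent's circuit: the threshold value, the list-of-frames input format, and the function name are all assumptions.

```python
# Sketch of the utterance section detection unit 14: find sections T1..Tn
# where the frame power stays above a threshold. Threshold choice is an
# assumption; a real detector would add hysteresis and minimum durations.

def detect_sections(power, threshold=0.05):
    """Return (start, end) frame-index pairs where power exceeds threshold."""
    sections = []
    start = None
    for i, p in enumerate(power):
        if p >= threshold and start is None:
            start = i                      # a voiced section begins
        elif p < threshold and start is not None:
            sections.append((start, i))    # the section ends
            start = None
    if start is not None:                  # section still open at the end
        sections.append((start, len(power)))
    return sections

power = [0.0, 0.1, 0.4, 0.3, 0.02, 0.01, 0.2, 0.5, 0.1, 0.0]
print(detect_sections(power))  # → [(1, 4), (6, 9)]
```

The section lengths and count that fall out of this pass are exactly the T1–Tn quantities the downstream units consume.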

For example, when teaching oneself English pronunciation, differences between Japanese speakers and native English speakers appear, for words, as differences in which vowel is accented, as shown in FIG. 3(A); as the insertion of unnecessary vowels caused by a Japanese speaker pronouncing the word katakana-style, as shown in FIG. 4(A); or as differences in the length of a vowel, as shown in FIG. 4(B). In conversation as well, as FIG. 5(A) shows, there are large differences between the sample speaker's utterance (1) and the trainee's utterances (2), (2)' in the utterance time TM, the lengths of the utterance sections T1–Tn, the points of strong utterance (the peak values P1–Pn and the distribution of the powers E1–En within each section T1–Tn), the changes in pitch PT1–PTn, and so on.

It follows that comparing the temporal changes of the powers E1–En and pitches PT1–PTn of the sample speaker's correct utterance with those of the trainee's own utterance, analyzing the differences, and pointing them out quantitatively is highly effective in improving the results of language training.

Next, the operation of this embodiment shown in FIGS. 1 and 2 will be described.

The sample speaker's utterances are recorded in the voice storage unit 2, which consists of a tape recorder, compact disc, or the like, while the trainee's utterance is input through the microphone 1; by switching between them with the input switching unit 4, the two utterances are input separately, and the selected input can be heard through the speaker 5. The input utterance is converted by the A/D converter 6 and then analyzed by the analysis unit 7, which extracts the power (E1–En) and pitch (PT1–PTn) waveforms.
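The power/pitch extraction performed by the analysis unit 7 can be sketched frame by frame. The patent does not specify a pitch algorithm, so the autocorrelation-based estimate below, along with the frame size and sample rate, is purely an illustrative assumption.

```python
# Sketch of the analysis unit 7: frame-by-frame power and pitch extraction
# from digitized speech. Pitch here is estimated by autocorrelation over
# plausible lags (50-400 Hz) -- an assumption, not the patent's method.
import math

def analyze(samples, rate=8000, frame=256):
    feats = []  # one (power, pitch_hz) pair per frame
    for i in range(0, len(samples) - frame, frame):
        w = samples[i:i + frame]
        power = sum(x * x for x in w) / frame       # mean-square frame power
        best_lag, best_r = 0, 0.0
        for lag in range(rate // 400, rate // 50):  # search the pitch lags
            r = sum(w[j] * w[j + lag] for j in range(frame - lag))
            if r > best_r:
                best_lag, best_r = lag, r
        pitch = rate / best_lag if best_lag else 0.0
        feats.append((power, pitch))
    return feats

# A 200 Hz test tone should yield pitch ~200 Hz and power ~0.5 per frame.
tone = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(2048)]
print(analyze(tone)[0])
```

The resulting power and pitch sequences play the role of the E1–En and PT1–PTn waveforms handed to the feature analysis unit 8.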

From these power and pitch waveforms, the feature analysis unit 8 computes the utterance time TM, the peak values P1–Pn, the lengths of the utterance sections T1–Tn, the pitch width (PTmax − PTmin), and so on; the judgment criterion conditions stored in advance for each word or conversational sentence, used to decide whether the utterance is correct, are retrieved from the judgment condition storage unit 11; the comparison/evaluation unit 10 performs the comparison and evaluation; and the result is shown on the display unit 12. The judgment condition storage unit 11 stores conditions such as, for example, T1 > T3 > T2, P1 > P3, PTmax − PTmin > 40%, 0.1 < E1 < 0.2, and 0.8 < E2 < 0.9. The values calculated and quantified by the feature analysis unit 8 are displayed as parameters as shown in FIG. 6.
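The check of quantified features against stored per-word conditions can be sketched as follows. The condition set mirrors the examples in the text (T1 > T3 > T2, P1 > P3, PTmax − PTmin > 40%, 0.1 < E1 < 0.2, 0.8 < E2 < 0.9); the dictionary layout and names are assumptions for illustration.

```python
# Sketch of the comparison/evaluation unit 10: quantified features of the
# trainee's utterance are tested against the judgment conditions that the
# judgment condition storage unit 11 would hold for this word.

def evaluate(feat, conds):
    """Return a pass/fail verdict for each named condition."""
    return {name: bool(rule(feat)) for name, rule in conds.items()}

conds = {
    "section order": lambda f: f["T"][0] > f["T"][2] > f["T"][1],   # T1>T3>T2
    "peak order":    lambda f: f["P"][0] > f["P"][2],               # P1>P3
    "pitch range":   lambda f: f["PTmax"] - f["PTmin"] > 0.40,      # >40%
    "E1 band":       lambda f: 0.1 < f["E"][0] < 0.2,
    "E2 band":       lambda f: 0.8 < f["E"][1] < 0.9,
}

feat = {"T": [0.30, 0.10, 0.20], "P": [1.0, 0.6, 0.7],
        "PTmax": 1.0, "PTmin": 0.5, "E": [0.15, 0.85, 0.0]}
print(evaluate(feat, conds))  # every condition passes for this example
```

Failed conditions would then be what the display unit 12 points out to the trainee.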

FIG. 3(B) shows an example of an analysis result based on this embodiment.

即ち、ro ’CLOCK Jと言う単語の発声を見本
者の発声音と訓練者の発声音との違いを発声区間T1.
T2 、無声区間TI2に分割し、更に各区間T1.T
2毎のパワーE1,1!2のピーク値PI 、 P2或
いはパワー1!1.f!2の分布を所定計算方法に従い
計算する。又ピッチPT1.PT2についても同様にピ
ッチPTI 、 PT2のピーク値、ピッチの変化幅(
PTmax −PTmin )等を定量化し、第6図に
示すように違いを比較し、それぞれの評価を行う。
That is, the difference between the pronunciation of the word ro 'CLOCK J by the prototype and the pronunciation by the trainee is determined by the utterance interval T1.
T2, is divided into silent intervals TI2, and each interval T1. T
Peak value PI of power E1, 1!2 for every 2, P2 or power 1!1. f! 2 is calculated according to a predetermined calculation method. Also pitch PT1. Similarly for PT2, the pitch PTI, the peak value of PT2, and the pitch change width (
PTmax - PTmin), etc., and compare the differences as shown in FIG. 6 to perform respective evaluations.

Utterances in conversation, on the other hand, are analyzed quantitatively as shown in FIG. 5(C). Because the number of utterance sections T1–Tn in conversation does not match as clearly as it does for a single word, the sample speaker's utterance and the trainee's utterance are first brought into correspondence on the speech spectrum by DP matching, as shown in FIG. 5(B) (an optimization procedure, from the field of operations research, for solving multistage problems in which each stage of the process admits many outcomes), and the differences are then analyzed quantitatively just as in the word case.
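The DP matching step above can be sketched as classic dynamic time warping. For brevity the example aligns one-dimensional feature sequences; the patent aligns on the speech spectrum, so real inputs would be spectral frame vectors with a vector distance in place of `abs`.

```python
# Sketch of DP matching: align the sample's feature sequence with the
# trainee's (which may differ in length) by minimal cumulative distance.

def dp_match(a, b):
    """Return the minimal cumulative distance aligning sequences a and b."""
    INF = float("inf")
    n, m = len(a), len(b)
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: skip in a, skip in b, or match both
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

sample  = [1, 2, 3, 2, 1]
trainee = [1, 1, 2, 3, 3, 2, 1]    # same contour, stretched in time
print(dp_match(sample, trainee))   # → 0.0 despite the length mismatch
```

Once the warping path establishes which sections correspond, the section-by-section differences can be quantified exactly as for single words.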

FIG. 6 quantifies the analysis results shown in FIG. 3(B). The displayed peak values P1–Pn and pitches PT1–PTn are relative values with the maximum taken as 1, and the displayed power values are the ratio of each section's power E1–En to the total power E, where E = E1 + E2 + … + En = 1.
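The two normalizations used for the FIG. 6 table can be sketched directly; the example values for P1, P2 and E1, E2 are arbitrary, not taken from the figure.

```python
# Sketch of the FIG. 6 quantification: peaks and pitches are shown relative
# to their maximum (max = 1), and section powers as ratios of the total
# power (so that E1 + ... + En = 1).

def relative_to_max(values):
    m = max(values)
    return [v / m for v in values]

def power_ratios(energies):
    total = sum(energies)
    return [e / total for e in energies]

peaks = [0.8, 0.4]              # hypothetical P1, P2
energies = [3.0, 1.0]           # hypothetical E1, E2
print(relative_to_max(peaks))   # → [1.0, 0.5]
print(power_ratios(energies))   # → [0.75, 0.25]
```

Normalizing this way lets the sample's and trainee's tables be compared cell by cell even when their absolute loudness differs.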

The comparison/evaluation unit 10 evaluates by the following method, where quantities marked with a prime (′) are the trainee's.

[1] Evaluation of the utterance time TM
① Difference in total utterance time: (T1 + T2 + T12) − (T1′ + T2′ + T12′)
② Balance of the voiced sections: (T1 − T1′), (T2 − T2′), …

[2] Evaluation of the peak values P1–Pn of the powers E1–En
① Position of the maximum peak value
② Balance of the peak values: (P1 − P1′), (P2 − P2′), …

[3] Distribution of the powers E1–En
① Balance of the power distribution: (E1 − E1′), (E2 − E2′), …

[4] Evaluation of the pitches PT1–PTn
① Position of the maximum pitch
② Balance of the pitches: (PT1 − PT1′), (PT2 − PT2′), …
③ Width of pitch change: (PTmax − PTmin) − (PTmax′ − PTmin′)
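The difference terms of evaluations [1]–[4] can be sketched as straightforward arithmetic on the two feature sets. The dictionary layout is an assumption, and the "position of maximum" items ([2]-①, [4]-①) are omitted for brevity; only the difference formulas from the text are computed.

```python
# Sketch of evaluations [1]-[4]: differences between the sample speaker's
# features (s) and the trainee's primed features (t), per the text.

def evaluate_diff(s, t):
    return {
        "total_time_diff":  sum(s["T"]) - sum(t["T"]),                # [1]-1
        "section_balance":  [a - b for a, b in zip(s["T"], t["T"])],  # [1]-2
        "peak_balance":     [a - b for a, b in zip(s["P"], t["P"])],  # [2]-2
        "power_balance":    [a - b for a, b in zip(s["E"], t["E"])],  # [3]-1
        "pitch_balance":    [a - b for a, b in zip(s["PT"], t["PT"])],# [4]-2
        "pitch_range_diff": (s["PTmax"] - s["PTmin"])
                          - (t["PTmax"] - t["PTmin"]),                # [4]-3
    }

sample  = {"T": [0.3, 0.2, 0.1], "P": [1.0, 0.7], "E": [0.6, 0.4],
           "PT": [1.0, 0.8], "PTmax": 1.0, "PTmin": 0.5}
trainee = {"T": [0.2, 0.2, 0.1], "P": [0.9, 0.8], "E": [0.5, 0.5],
           "PT": [0.9, 0.9], "PTmax": 0.9, "PTmin": 0.6}
print(evaluate_diff(sample, trainee))
```

Differences near zero indicate the trainee has matched the sample; large entries pinpoint exactly which section, peak, or pitch span to correct.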

(g) Effects of the Invention
As described above, according to the present invention the differences between the sample speaker's utterance and the trainee's utterance can be quantified and displayed concretely, so the invention provides a vocal training machine with which vocal training is efficient and highly effective.

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 shows an embodiment of the vocal training machine according to the present invention; FIG. 2 an embodiment of the feature analysis unit according to the present invention; FIG. 3 differences in accent; FIG. 4 differences between the sample speaker's utterance and the trainee's utterance; FIG. 5 differences between conversational utterances; and FIG. 6 the quantification parameter table. In the figures, 1 is a microphone, 2 a voice storage unit, 3 a playback unit, 4 an input switching unit, 5 a speaker, 6 an A/D converter, 7 an analysis unit, 8 a feature analysis unit, 9 a feature storage unit, 10 a comparison/evaluation unit, 11 a judgment condition storage unit, 12 a display unit, 13 an utterance time detection unit, 14 an utterance section detection unit, 15 a power detection unit, 16 a peak detection unit, 17 a pitch detection unit, and 18 a pitch amplitude detection unit.

Claims (1)

CLAIMS
(1) A vocal training machine characterized by comprising: voice analysis means for extracting the voice power and pitch of a sample speaker's utterance and of a trainee's utterance; feature analysis means for quantitatively analyzing the differences between the sample speaker's utterance and the trainee's utterance as extracted by said voice analysis means; and display means for displaying the differences between the sample speaker's utterance and the trainee's utterance obtained by said voice analysis means so that they can be recognized visually.
(2) The vocal training machine according to claim 1, characterized in that said feature analysis means performs the quantitative analysis after establishing a correspondence between the utterance sections of the sample speaker's utterance and those of the trainee's utterance.
(3) The vocal training machine according to claims 1 and 2, characterized in that said feature analysis means comprises: first means for detecting voiced sections from the power waveform of the voice; second means for detecting the number and length of said voiced sections in said power waveform; and third means for calculating the peak value, power distribution, and pitch amplitude width of each voiced section detected by said first means.
JP59057602A 1984-03-26 1984-03-26 Enunciation training machine Pending JPS60201376A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP59057602A JPS60201376A (en) 1984-03-26 1984-03-26 Enunciation training machine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP59057602A JPS60201376A (en) 1984-03-26 1984-03-26 Enunciation training machine

Publications (1)

Publication Number Publication Date
JPS60201376A true JPS60201376A (en) 1985-10-11

Family

ID=13060399

Family Applications (1)

Application Number Title Priority Date Filing Date
JP59057602A Pending JPS60201376A (en) 1984-03-26 1984-03-26 Enunciation training machine

Country Status (1)

Country Link
JP (1) JPS60201376A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH01221784A (en) * 1987-02-06 1989-09-05 Teac Corp Method and device for learning language
KR100490367B1 (en) * 2001-08-03 2005-05-17 정택 The portable apparatus of word studying and method of word studying using the same
JP2007256349A (en) * 2006-03-20 2007-10-04 Oki Electric Ind Co Ltd Voice data recording system and voice data recording method
JP4762976B2 (en) * 2004-04-16 2011-08-31 モエ,リチャード,エイ Timekeeping practice method and system
JP2015011348A (en) * 2013-06-26 2015-01-19 韓國電子通信研究院Electronics and Telecommunications Research Institute Training and evaluation method for foreign language speaking ability using voice recognition and device for the same

