JPS607496A - Voice recognition equipment - Google Patents

Voice recognition equipment

Info

Publication number
JPS607496A
JPS607496A
Authority
JP
Japan
Prior art keywords
speech
similarity
phoneme
unit
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP58117712A
Other languages
Japanese (ja)
Other versions
JPH024920B2 (en)
Inventor
正宏 浜田 (Masahiro Hamada)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Holdings Corp
Original Assignee
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co Ltd filed Critical Matsushita Electric Industrial Co Ltd
Priority to JP58117712A priority Critical patent/JPS607496A/en
Publication of JPS607496A publication Critical patent/JPS607496A/en
Publication of JPH024920B2 publication Critical patent/JPH024920B2/ja
Granted legal-status Critical Current

Abstract

(57) [Abstract] This publication contains application data filed before the introduction of electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention]

Field of Industrial Application

The present invention relates to a speech recognition device.

Configuration of the Prior Art and Its Problems

A speech recognition device is a type of command input device intended to control the operation of a target machine by human voice commands. Its use is beginning to spread in fields that can exploit its advantages: (1) no operational training is required; (2) it can be operated even when the hands or feet are occupied or out of reach; and (3) there is little competition with visually mediated operations.

The recognition methods used in conventional speech recognition devices are mainly optimized for conditions with little environmental noise and a high signal-to-noise ratio. Consequently, when a device based on such a configuration is used in a noisy environment, the recognition rate often drops considerably. For example, voiceless fricative consonants, particularly /s/ and /f/, have weak signal power, and the signals themselves have the character of stationary random noise. If the noise superimposed on the input speech has properties similar to these voiceless fricatives, then during speech-segment extraction these consonants may be dropped when they occur at the beginning or end of an utterance, or, when they occur within a word, may be misjudged as silence within a speech unit or silence between speech units. If similarity against the standard pattern is evaluated in such a state of omission or misjudgment, then even if the energy or acoustic features in the segments other than the fricatives (for example, high-energy vowel segments) are identical, the similarity of the speech unit as a whole is inevitably reduced, and this often results in misrecognition.

Object of the Invention

The present invention eliminates the above conventional drawbacks. Its object is to provide a speech recognition device that can mitigate the drop in the recognition rate of a speech unit as a whole caused by noise-induced degradation of particular phonemes or particular acoustic features.

Structure of the Invention

To achieve the above object, the first invention comprises: phoneme identification means for identifying the phonemes of the input speech in arbitrary short time intervals and outputting a phoneme sequence; speech-unit standard patterns in which the standard phoneme sequence that each speech unit to be recognized is expected to have is registered in advance; a phoneme similarity matrix expressing the similarity between phonemes; a speech-unit phoneme comparison matrix which, using the phoneme sequence output from the phoneme identification means, the speech-unit standard pattern, and the phoneme similarity matrix, expresses the similarity between the phonemic content of the input speech and that of the speech-unit standard pattern for each phoneme of the short time intervals; and speech-unit similarity calculation means for evaluating, for each speech unit, the phonemic similarity expressed by this comparison matrix. Plural sets of weighting coefficients corresponding to the intensity and characteristics of the noise are prepared in advance for each phoneme; based on information obtained from noise detection means that detects the intensity and characteristics of the noise superimposed on the input speech, a particular set of weighting coefficients is selected as appropriate, and the speech-unit similarity calculation means computes the phoneme similarity for each speech unit using those weighting coefficients.
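As a concrete illustration of this similarity computation, the sketch below compares a time-aligned input phoneme sequence against a word standard pattern segment by segment, looking up a phoneme similarity matrix and applying per-phoneme weighting coefficients. The phoneme inventory, the similarity values, and the assumption that the two sequences are already aligned and of equal length are all illustrative choices, not details from the patent.

```python
# Sketch of the first invention's weighted phoneme similarity. Assumes the
# input sequence and the standard pattern are time-aligned and of equal
# length; all numeric values are hypothetical.

# Phoneme similarity matrix (4): acoustic similarity between phoneme pairs.
PHONEME_SIMILARITY = {
    ("a", "a"): 1.0, ("o", "o"): 1.0, ("n", "n"): 1.0, ("s", "s"): 1.0,
    ("a", "o"): 0.6, ("o", "a"): 0.6,   # /a/ and /o/ are acoustically close
}

def word_similarity(input_seq, pattern_seq, weights):
    """Segment-by-segment similarity between a phoneme sequence (3) and one
    word standard pattern (5), weighted by per-phoneme coefficients (9)."""
    total = weight_sum = 0.0
    for seg, pat in zip(input_seq, pattern_seq):
        sim = PHONEME_SIMILARITY.get((seg, pat), 0.0)  # element of matrix (6)
        w = weights.get(pat, 1.0)
        total += w * sim
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0
```

With noisy-condition weights such as {"s": 0.2, "a": 1.0, "n": 1.0}, a mismatch at the /s/ segment lowers the word similarity far less than a mismatch at a vowel segment, which is the behavior the invention aims for.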

The second invention comprises: acoustic analysis means for acoustically analyzing the input speech in arbitrary short time intervals and outputting an acoustic feature sequence; speech-unit standard patterns in which the standard acoustic feature sequence that each speech unit to be recognized is expected to have is registered in advance; a speech-unit acoustic comparison matrix which, using the acoustic feature sequence output from the acoustic analysis means and the speech-unit standard pattern, expresses the similarity between the input speech and the speech-unit standard pattern for each short time interval of the input speech; and speech-unit similarity calculation means for evaluating, for each speech unit, the acoustic similarity expressed by this comparison matrix. Plural sets of weighting coefficients corresponding to the intensity and characteristics of the noise are prepared in advance for the acoustic features; based on information obtained from noise detection means that detects the intensity and characteristics of the noise superimposed on the input speech, a particular set of weighting coefficients is selected as appropriate, and the speech-unit similarity calculation means computes the acoustic similarity for each speech unit using those weighting coefficients.

Description of the Embodiments

An embodiment of the first invention is described below with reference to the drawings. FIG. 1 is a block diagram of a speech recognition device; in this embodiment, the speech unit is taken to be a word. The expression "arbitrary short time interval" used in the claims is hereinafter referred to as a "segment".

In FIG. 1, (1) is the noise detection means and (2) is the phoneme identification means; the input speech is analyzed by the phoneme identification means (2) and abstracted into the phoneme sequence (3). (4) is the phoneme similarity matrix, which expresses the acoustic similarity between the various phonemes, with one matrix element per phoneme pair. (5) is the set of word standard patterns, in which the phoneme sequence of each word subject to recognition is described in advance. (6) is the word phoneme comparison matrix, covering each segment over the full length of the word, which is determined by the three elements (3), (4), and (5); one such matrix is generated for each of the word standard patterns (5) prepared for the recognition vocabulary. The word similarity calculation means (7) comprehensively evaluates, over the full word length, the segment-by-segment phoneme similarity of each of the word phoneme comparison matrices obtained by the above operation; finally, the decision means (8) selects the word standard pattern most similar to the input speech and outputs the corresponding word as the recognition result. Meanwhile, the noise detection means (1) detects the intensity and characteristics of the noise superimposed on the input speech and accordingly selects the optimal one among the plural sets of weighting coefficients (9). The selected set of weighting coefficients is input to the word similarity calculation means (7).
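The selection step performed by the noise detection means (1) might look like the following minimal sketch. The two weight sets, the phoneme inventory, and the dB threshold are invented for illustration; the patent only specifies that a set is chosen according to the detected noise intensity and characteristics.

```python
# Hypothetical weighting-coefficient sets (9), one per noise condition, and a
# noise detection step (1) that selects between them. Values are illustrative.

WEIGHT_SETS = {
    # Quiet: /s/ is detected reliably, so it keeps full weight.
    "quiet": {"s": 1.0, "a": 1.0, "n": 1.0, "o": 1.0},
    # Stationary random noise: /s/ is easily lost during segment extraction,
    # so its influence on the word similarity is reduced.
    "noisy": {"s": 0.2, "a": 1.0, "n": 1.0, "o": 1.0},
}

def select_weights(noise_level_db, threshold_db=40.0):
    """Choose a weight set from the detected noise intensity (in dB)."""
    condition = "noisy" if noise_level_db >= threshold_db else "quiet"
    return WEIGHT_SETS[condition]
```

A fuller implementation would also branch on noise characteristics (for example, stationary versus impulsive), not intensity alone.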

The role of the weighting coefficients is explained below, taking as an example the recognition of the utterance /san/ ("san") under stationary random noise. As noted above, the phoneme /s/ is a voiceless fricative consonant and, because of the added stationary random noise, can easily be dropped during speech-segment extraction. In that case the sequence /san/, registered as a word standard pattern, must be matched against the phoneme sequence /an/ obtained from the input speech. The acoustic features of /a/ and /n/, by contrast, differ greatly from stationary random noise and generally carry high energy, so extraction losses and identification errors caused by the added noise rarely occur for them. In this example, then, when matching /san/ against /an/, the identity that /an/ is present in both sequences is more reliable, from the standpoint of noise robustness, than the difference arising from the presence or absence of /s/. Conversely, under low-noise conditions /s/ is also easy to detect, so its weighting coefficient should be increased. If the recognition vocabulary is assumed to also contain the word /on/ ("on"), then because the distance between /a/ and /o/ is relatively small, /an/ and /on/ are liable to be confused. To avoid such confusion the presence or absence of /s/ becomes important, and increasing the weighting coefficient for /s/ is effective. The weighting coefficients (9) in FIG. 1 express this per-phoneme reliability from the standpoint of noise robustness, and they are applied segment by segment when similarity is computed over the full word length. Since the weighting coefficients (9) depend on phonemic identity, the phonemic identity of each segment must be known before they can be applied. In practice, the weighting coefficients may be applied according to the phonemes of whichever of the word standard pattern (5) and the input phoneme sequence (3) is judged to contain fewer errors. One method of setting the weighting coefficients is to feed known speech together with added noise into the recognition device, compare the segment-by-segment identification results with the phoneme sequence determined by inspection of the original speech, and assign smaller weighting coefficients to phonemes with larger identification errors, on the grounds that they are less robust to noise.
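The coefficient-setting method just described, measuring how often each phoneme is misidentified when known speech plus noise is fed to the recognizer, could be sketched as below. The measured error rates and the simple linear mapping from error rate to weight are illustrative assumptions; the patent does not specify a particular mapping.

```python
def weights_from_error_rates(error_rates):
    """Map per-phoneme identification error rates (in [0, 1]), measured by
    comparing segment-wise recognition output against hand-labelled phoneme
    sequences, to weighting coefficients: the larger the error under a given
    noise condition, the smaller the weight."""
    return {phoneme: 1.0 - err for phoneme, err in error_rates.items()}

# Hypothetical error rates under stationary random noise: /s/ is
# misidentified often, while the vowel and the nasal rarely are.
measured = {"s": 0.8, "a": 0.1, "n": 0.15}
noisy_weights = weights_from_error_rates(measured)
```

Repeating the measurement for each noise condition of interest yields the plural weight sets among which the noise detection means selects at recognition time.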

In this way, by adopting noise-adapted per-phoneme weighting coefficients in the word similarity calculation, the drop in word recognition rate caused by speech-segment extraction errors and phoneme identification errors under superimposed noise can be mitigated.

Next, an embodiment of the second invention is described based on FIG. 2. The main difference from the first invention is that, in abstracting the speech, the input speech is represented by acoustic features rather than being reduced to a phoneme sequence through phoneme analysis. Accordingly, all the phoneme-related parts of FIG. 1 are replaced by their acoustic counterparts, namely the acoustic analysis means (10), the acoustic feature sequence (11), and the word acoustic comparison matrix (12), and there is nothing corresponding to the phoneme similarity matrix (4); the rest of the basic configuration is the same. Thus, whereas the first invention assigns weighting coefficients per phoneme, the second invention assigns them on the basis of some acoustic measure. The second invention achieves the same effect as the first.
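For the second invention, the same weighting idea applies to frame-wise acoustic features instead of phoneme labels. The sketch below turns a Euclidean distance between feature vectors into a similarity via exp(-d) and combines it with per-frame weights; the feature representation and that distance-to-similarity mapping are illustrative assumptions, not the patent's specification.

```python
import math

def acoustic_word_similarity(input_frames, pattern_frames, frame_weights):
    """Second-invention sketch: similarity between an input acoustic feature
    sequence (11) and a word standard pattern, combined frame by frame with
    weights selected for the current noise condition. Assumes the sequences
    are time-aligned and of equal length."""
    total = weight_sum = 0.0
    for x, y, w in zip(input_frames, pattern_frames, frame_weights):
        # Euclidean distance between feature vectors, mapped to (0, 1].
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
        total += w * math.exp(-dist)   # element of comparison matrix (12)
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0
```

Down-weighting the frames whose acoustic measure is unreliable under the detected noise plays the same role that down-weighting fragile phonemes plays in the first invention.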

Effects of the Invention

As described in detail above, according to the first and second inventions, adopting noise-adapted per-phoneme weighting coefficients in the word similarity calculation mitigates the drop in word recognition rate caused by speech-segment extraction errors and phoneme identification errors under superimposed noise.

[Brief Description of the Drawings]

FIG. 1 is an overall block diagram of the speech recognition device in an embodiment of the first invention, and FIG. 2 is an overall block diagram of the speech recognition device in an embodiment of the second invention. (1): noise detection means; (2): phoneme identification means; (3): phoneme sequence; (4): phoneme similarity matrix; (5): word standard pattern; (6): word phoneme comparison matrix; (7): word similarity calculation means; (8): decision means; (9): weighting coefficients; (10): acoustic analysis means; (11): acoustic feature sequence; (12): word acoustic comparison matrix. Agent: Yoshihiro Morimoto

Claims (1)

[Claims]

1. A speech recognition device comprising: phoneme identification means for identifying the phonemes of input speech in arbitrary short time intervals and outputting a phoneme sequence; speech-unit standard patterns in which the standard phoneme sequence that each speech unit to be recognized is expected to have is registered in advance; a phoneme similarity matrix expressing the similarity between phonemes; a speech-unit phoneme comparison matrix which, using the phoneme sequence output from the phoneme identification means, the speech-unit standard pattern, and the phoneme similarity matrix, expresses the similarity between the phonemic content of the input speech and that of the speech-unit standard pattern for each phoneme of the short time intervals; and speech-unit similarity calculation means for evaluating, for each speech unit, the phonemic similarity expressed by the speech-unit phoneme comparison matrix; wherein plural sets of weighting coefficients corresponding to the intensity and characteristics of noise are prepared in advance for each phoneme, a particular set of weighting coefficients is selected as appropriate based on information obtained from noise detection means for detecting the intensity and characteristics of the noise superimposed on the input speech, and the speech-unit similarity calculation means computes the phoneme similarity for each speech unit using the weighting coefficients.

2. A speech recognition device comprising: acoustic analysis means for acoustically analyzing input speech in arbitrary short time intervals and outputting an acoustic feature sequence; speech-unit standard patterns in which the standard acoustic feature sequence that each speech unit to be recognized is expected to have is registered in advance; a speech-unit acoustic comparison matrix which, using the acoustic feature sequence output from the acoustic analysis means and the speech-unit standard pattern, expresses the similarity between the input speech and the speech-unit standard pattern for each short time interval of the input speech; and speech-unit similarity calculation means for evaluating, for each speech unit, the acoustic similarity expressed by the speech-unit acoustic comparison matrix; wherein plural sets of weighting coefficients corresponding to the intensity and characteristics of noise are prepared in advance for the acoustic feature sequence, a particular set of weighting coefficients is selected as appropriate based on information obtained from noise detection means for detecting the intensity and characteristics of the noise superimposed on the input speech, and the speech-unit similarity calculation means computes the acoustic similarity for each speech unit using the weighting coefficients.
JP58117712A 1983-06-28 1983-06-28 Voice recognition equipment Granted JPS607496A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP58117712A JPS607496A (en) 1983-06-28 1983-06-28 Voice recognition equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP58117712A JPS607496A (en) 1983-06-28 1983-06-28 Voice recognition equipment

Publications (2)

Publication Number Publication Date
JPS607496A true JPS607496A (en) 1985-01-16
JPH024920B2 JPH024920B2 (en) 1990-01-30

Family

ID=14718435

Family Applications (1)

Application Number Title Priority Date Filing Date
JP58117712A Granted JPS607496A (en) 1983-06-28 1983-06-28 Voice recognition equipment

Country Status (1)

Country Link
JP (1) JPS607496A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS61143070A (en) * 1985-11-29 1986-06-30 帝人株式会社 Heat sterilization of artificial kidney
JPS61143071A (en) * 1985-11-29 1986-06-30 帝人株式会社 Heat sterilization of artificial kidney
JPS61143072A (en) * 1985-11-29 1986-06-30 帝人株式会社 Heat sterilization of artificial kidney
JPS6343669A (en) * 1986-08-08 1988-02-24 帝人株式会社 Production of blood treatment device
JPH0426900A (en) * 1990-05-22 1992-01-30 Nec Corp Voice recognition device

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS61143070A (en) * 1985-11-29 1986-06-30 帝人株式会社 Heat sterilization of artificial kidney
JPS61143071A (en) * 1985-11-29 1986-06-30 帝人株式会社 Heat sterilization of artificial kidney
JPS61143072A (en) * 1985-11-29 1986-06-30 帝人株式会社 Heat sterilization of artificial kidney
JPH0237790B2 (en) * 1985-11-29 1990-08-27 Teijin Ltd
JPH0422585B2 (en) * 1985-11-29 1992-04-17 Teijin Ltd
JPS6343669A (en) * 1986-08-08 1988-02-24 帝人株式会社 Production of blood treatment device
JPH0379021B2 (en) * 1986-08-08 1991-12-17 Teijin Ltd
JPH0426900A (en) * 1990-05-22 1992-01-30 Nec Corp Voice recognition device

Also Published As

Publication number Publication date
JPH024920B2 (en) 1990-01-30

Similar Documents

Publication Publication Date Title
US11605371B2 (en) Method and system for parametric speech synthesis
Sambur Speaker recognition using orthogonal linear prediction
CN109686383B (en) Voice analysis method, device and storage medium
Nasib et al. A real time speech to text conversion technique for bengali language
US20030187651A1 (en) Voice synthesis system combining recorded voice with synthesized voice
Hieronymus et al. Use of acoustic sentence level and lexical stress in HSMM speech recognition.
CN107610691B (en) English vowel sounding error correction method and device
JPS607496A (en) Voice recognition equipment
JP3523382B2 (en) Voice recognition device and voice recognition method
Meftah et al. A comparative study of different speech features for arabic phonemes classification
Cheng et al. Comparative performance study of several pitch detection algorithms
JP2966002B2 (en) Voice recognition device
Othman et al. Jawi character speech-to-text engine using linear predictive and neural network for effective reading
Medress et al. A system for the recognition of spoken connected word sequences
Hong et al. Automatic Miscue Detection Using RNN Based Models with Data Augmentation.
Ezeiza et al. Combining mel frequency cepstral coefficients and fractal dimensions for automatic speech recognition
Seman et al. Hybrid methods of Brandt’s generalised likelihood ratio and short-term energy for Malay word speech segmentation
KR100236962B1 (en) Method for speaker dependent allophone modeling for each phoneme
JPH02124600A (en) Voice recognition device
Kaur et al. Automatic marking of Punjabi syllables boundaries in a sound file
Priyadarshani Speaker dependent speech recognition on a selected set of sinhala words
Coker Computer‐Simulated Analyzer for a Formant Vocoder
JPS6148897A (en) Voice recognition equipment
Al Mahmud Performance analysis of hidden markov model in Bangla speech recognition
JPS6069700A (en) Voice recognition equipment