JPS5857195A - Voice recognition system - Google Patents

Voice recognition system

Info

Publication number
JPS5857195A
Authority
JP
Japan
Prior art keywords
candidate
speech
dictionary
standard
circuit section
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP56155654A
Other languages
Japanese (ja)
Inventor
一成 畑中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Priority to JP56155654A
Publication of JPS5857195A

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to a speech recognition method, and in particular to one that uses a standard dictionary to extract a plurality of monosyllables and/or words (collectively referred to as "speech" in this specification) as candidate speech, each with an assigned priority, and that stores a plurality of speech candidate sequences, each with an identifier attached, in a candidate sequence dictionary, so that the validity of the extracted result is judged by means of the identifier.

Conventional speech recognition devices for monosyllables and/or words are provided with a standard dictionary. They extract feature quantities from the unknown input speech and match them against the standard feature quantities read from the standard dictionary to determine the category to which the unknown input speech belongs. With this conventional configuration, however, attempts to raise the recognition rate sometimes produce extreme recognition results that should never occur.
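The conventional scheme described above can be pictured as nearest-template matching. The following is only a minimal sketch under stated assumptions: the patent does not specify the feature vectors or the distance measure, so the three-element vectors, the Euclidean metric, and the names `STANDARD_DICTIONARY` and `classify` are hypothetical.

```python
import math

# Hypothetical standard dictionary: category name -> standard feature vector.
# The numbers are made-up illustration values, not data from the patent.
STANDARD_DICTIONARY = {
    "0": [0.9, 0.1, 0.3],
    "1": [0.2, 0.8, 0.5],
    "2": [0.4, 0.4, 0.9],
}

def classify(features, dictionary=STANDARD_DICTIONARY):
    """Return the single category whose standard features are closest.

    Euclidean distance is an assumption; the text only says that the
    extracted features are matched against the standard ones.
    """
    return min(dictionary, key=lambda name: math.dist(features, dictionary[name]))
```

Because such a recognizer always returns its single best match, it has no way to signal that the match is doubtful, which is the weakness the invention addresses.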

An object of the present invention is to solve this problem by enabling the device itself to check the validity of its recognition results. To that end, the speech recognition method of the present invention applies to a speech recognition device that recognizes the speech corresponding to an input speech signal and that has a feature extraction circuit section, which extracts feature quantities of the input speech signal based on the result of frequency analysis of that signal; a standard dictionary, which stores standard feature quantities corresponding to standard speech; and a matching circuit section, which matches the feature quantities obtained by the feature extraction circuit section against the standard feature quantities read from the standard dictionary. The method is characterized in that the matching circuit section is configured to extract a plurality of candidate speech items with assigned priorities and to output the extracted speech candidate sequence; in that a candidate sequence dictionary, which stores a plurality of kinds of speech candidate sequences with an identifier assigned to each sequence, and a comparison circuit section, which compares the speech candidate sequence output from the matching circuit section with the speech candidate sequences read from the candidate sequence dictionary, are provided; and in that, based on the identifier assigned to the speech candidate sequence for which the comparison circuit section finds a match, it is determined whether to adopt the highest-priority speech in the speech candidate sequence output from the matching circuit section as the recognition result. A description follows with reference to the drawing.

The figure shows the configuration of one embodiment of the present invention. In the figure, reference numeral 1 denotes the feature extraction circuit section; 2 denotes the standard dictionary, which stores, together with category names, the standard feature quantities for, for example, the spoken digits "0", "1", "2", ... "9"; 3 denotes the matching circuit section; 4 denotes the candidate sequence dictionary provided in the present invention; and 5 denotes the comparison circuit section provided in the present invention.

The candidate sequence dictionary 4 stores candidate sequences such as "0-4-5", "0-1-2", ..., each consisting of a combination of m words taken from the recognition category (for example the words "0", "1", "2", ... "9"), and each candidate sequence is assigned an identifier ID obtained, for example, through statistical processing. The identifier "F" means that, when the corresponding candidate sequence is obtained, the word with the highest priority in that sequence may be taken as the correct answer. The identifier "R" means that, when the corresponding candidate sequence is obtained, the matching result from the preceding matching circuit section 3 should be discarded and "reject" returned as the answer.
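The text says only that the identifiers come from "statistical processing" and gives no procedure. One plausible reading, shown here purely as an assumption, is to count on labelled trial utterances how often the top-priority word of each candidate sequence was actually correct, and to assign "F" when that fraction reaches a threshold and "R" otherwise. The function name `build_candidate_dictionary`, the `threshold` parameter, and its default of 0.9 are hypothetical.

```python
from collections import defaultdict

def build_candidate_dictionary(observations, threshold=0.9):
    """Assign "F" or "R" to each observed candidate sequence.

    observations: iterable of (candidate_sequence, true_word) pairs, where
    candidate_sequence is a tuple ordered by priority. This procedure is an
    assumed illustration; the patent does not describe it.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for sequence, true_word in observations:
        totals[sequence] += 1
        if sequence[0] == true_word:  # was the top-priority word right?
            correct[sequence] += 1
    return {
        sequence: ("F" if correct[sequence] / totals[sequence] >= threshold else "R")
        for sequence in totals
    }
```

Under this reading, a sequence such as (0-1-2) would end up marked "R" if, in the trial data, its top word turned out to be wrong too often.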

In the figure, as in a conventional speech recognition device, the matching circuit section 3 determines, for example, the distance between the feature quantities extracted by the feature extraction circuit section 1 and the standard feature quantities read from the standard dictionary 2. In the present invention, for example three categories are selected as candidates in ascending order of matching distance, and a candidate sequence arranged in that order, that is, in priority order, is output, for example (0-1-2) as illustrated.
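The ranked output of the matching circuit section can be sketched as a sort by distance followed by truncation to the best three. This is an illustrative sketch only: the Euclidean metric, the function name `rank_candidates`, and the dictionary layout are assumptions, while the choice of three candidates follows the example in the text.

```python
import math

def rank_candidates(features, dictionary, n=3):
    """Return the n closest category names, smallest distance first.

    dictionary maps a category name to its standard feature vector.
    The resulting tuple plays the role of a candidate sequence such
    as (0-1-2), with position 0 holding the highest priority.
    """
    ranked = sorted(dictionary.items(), key=lambda item: math.dist(features, item[1]))
    return tuple(name for name, _ in ranked[:n])
```

Returning a tuple makes the candidate sequence hashable, so it can be looked up directly in a dictionary of stored sequences.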

The candidate sequence (0-1-2) is passed to the comparison circuit section 5, while the candidate sequences are read out in order from the candidate sequence dictionary 4 together with their identifiers ID, for example (0-4-5, F), (0-1-2, R), ..., and are likewise passed to the comparison circuit section 5. The comparison circuit section 5 compares the candidate sequence (0-1-2) with (0-4-5), (0-1-2), ... in turn. When a match is found, it checks the corresponding identifier ID: if the identifier is "F", the word with the highest priority in the candidate sequence is taken as the correct answer; if the identifier is "R", the matching result from the matching circuit section 3 is rejected. In the illustrated case, where the candidate sequence (0-1-2) is obtained from the matching circuit section 3, the comparison circuit section 5 issues a reject. This is because the words "0", "1", and "2" extracted in the candidate sequence (0-1-2) are, phonetically speaking, unrelated to one another, which means that the matching result for the highest-priority word "0" has little reliability.
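The decision made by the comparison circuit section can be sketched as a lookup of the candidate sequence followed by a branch on its identifier. The two dictionary entries are the ones given as examples in the text; the function name `judge` and the status strings are hypothetical. The text does not say what happens when no stored sequence matches, so the separate "no_match" status is an assumption made so that the caller can decide.

```python
# Entries taken from the example in the text: (0-4-5, F) and (0-1-2, R).
CANDIDATE_DICTIONARY = {
    ("0", "4", "5"): "F",
    ("0", "1", "2"): "R",
}

def judge(candidate_sequence, dictionary=CANDIDATE_DICTIONARY):
    """Return (status, word) for a priority-ordered candidate sequence."""
    sequence = tuple(candidate_sequence)
    identifier = dictionary.get(sequence)
    if identifier == "F":
        # The top-priority word may be taken as the correct answer.
        return ("accept", sequence[0])
    if identifier == "R":
        # Discard the matching result and answer "reject".
        return ("reject", None)
    # Assumed behaviour: the source does not cover an unmatched sequence.
    return ("no_match", None)
```

With these entries, `judge(("0", "1", "2"))` yields a reject, mirroring the illustrated case, while `judge(("0", "4", "5"))` accepts the word "0".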

In the illustrated configuration, a further arrangement can be added so that, when a reject occurs, the utterance is repeated, or features are extracted from a viewpoint different from that of the illustrated feature extraction circuit section 1 and matching is performed again. Although the above description deals with recognizing words such as digits, the invention is of course equally applicable to recognizing monosyllables.
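The optional retry on reject might be arranged as a loop over alternative feature extractors. This is a hedged sketch, not the patent's circuit: `recognize_with_fallback`, `extractors`, and `recognize` are hypothetical names, and `recognize` is assumed to return a word on acceptance and `None` on reject.

```python
def recognize_with_fallback(signal, extractors, recognize):
    """Try each feature extractor in turn until a result is accepted.

    extractors: callables mapping a signal to features, ordered from the
    primary extractor to alternatives that view the signal differently.
    recognize: callable mapping features to a word, or None on reject.
    """
    for extract in extractors:
        word = recognize(extract(signal))
        if word is not None:
            return word
    # Every attempt was rejected; the caller may ask the speaker to repeat.
    return None
```

Returning `None` after all extractors fail leaves room for the other option the text mentions, namely asking for the utterance again.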

As described above, according to the present invention, the device can itself check the validity of the matching result produced by the matching circuit section, which makes it possible to improve the reliability of the recognition result.

[Brief description of the drawing]

The figure shows the configuration of one embodiment of the present invention. In the figure, 1 denotes the feature extraction circuit section, 2 the standard dictionary, 3 the matching circuit section, 4 the candidate sequence dictionary, and 5 the comparison circuit section. Patent applicant: Fujitsu Limited

Claims (1)

[Claims] A speech recognition method for a speech recognition device that recognizes speech corresponding to an input speech signal, the device having a feature extraction circuit section that extracts feature quantities of the input speech signal based on the result of frequency analysis of that signal, a standard dictionary that stores standard feature quantities corresponding to standard speech, and a matching circuit section that matches the feature quantities obtained by the feature extraction circuit section against the standard feature quantities read from the standard dictionary, the method being characterized in that: the matching circuit section is configured to extract a plurality of candidate speech items with assigned priorities and to output the extracted speech candidate sequence; a candidate sequence dictionary, which stores a plurality of kinds of speech candidate sequences with an identifier assigned to each sequence, and a comparison circuit section, which compares the speech candidate sequence output from the matching circuit section with the speech candidate sequences read from the candidate sequence dictionary, are provided; and, based on the identifier assigned to the speech candidate sequence for which the comparison circuit section finds a match, it is determined whether to adopt the highest-priority speech in the speech candidate sequence output from the matching circuit section as the recognition result.
JP56155654A 1981-09-30 1981-09-30 Voice recognition system Pending JPS5857195A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP56155654A JPS5857195A (en) 1981-09-30 1981-09-30 Voice recognition system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP56155654A JPS5857195A (en) 1981-09-30 1981-09-30 Voice recognition system

Publications (1)

Publication Number Publication Date
JPS5857195A true JPS5857195A (en) 1983-04-05

Family

ID=15610685

Family Applications (1)

Application Number Title Priority Date Filing Date
JP56155654A Pending JPS5857195A (en) 1981-09-30 1981-09-30 Voice recognition system

Country Status (1)

Country Link
JP (1) JPS5857195A (en)
