JPS59119397A - Voice recognition equipment - Google Patents

Voice recognition equipment

Info

Publication number
JPS59119397A
JPS59119397A, JP57231861A, JP23186182A
Authority
JP
Japan
Prior art keywords
recognition
silent period
section
input
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP57231861A
Other languages
Japanese (ja)
Inventor
繁 佐々木
晋太 木村
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Priority to JP57231861A priority Critical patent/JPS59119397A/en
Publication of JPS59119397A publication Critical patent/JPS59119397A/en
Pending legal-status Critical Current

Abstract

(57) [Abstract] Because this publication contains application data filed before the introduction of electronic filing, no abstract data is recorded.

Description

[Detailed Description of the Invention]

(A) Technical Field of the Invention

The present invention relates to a speech recognition apparatus, and in particular to one in which the apparatus as a whole is divided into an analysis section, a recognition section, and an input/output management section so that processing results are handed over from stage to stage, and in which the transfer between the analysis section and the recognition section is speeded up.

(B) Technical Background and Problems

Conventionally, a speech recognition apparatus is generally configured to (i) perform frequency analysis of the incoming speech and generate a plurality of pieces of time-series feature information, (ii) match these against the standard feature time series stored in a dictionary to determine the category of the incoming speech, and (iii) output the recognition result to an input/output device. For such an apparatus, a configuration has been considered in which a microprocessor is provided for each of the processes (i), (ii), and (iii), and the result of each process is handed over to the next in sequence.

When such a configuration is considered, the question arises of what unit should be used when handing over the result of each process. That is, the problems in handing over processing results when processes (i), (ii), and (iii) are assigned respectively to an analysis section, a recognition section, and an input/output management section are explained with reference to FIG. 1.

In FIG. 1, assume the input speech has the form shown in the figure. The illustrated waveform may be taken to represent input speech containing a short silence inside a word, as in a word such as "atta" (あった). As is well known, the analysis section performs frequency analysis of the input speech and generates time-series feature information for each frequency band; at the same time it monitors, for example, the energy of the input speech and checks the length of any silent period that falls below the illustrated threshold. If, for example, a silent period lasts 0.3 seconds or more, it is regarded as an inter-word silence; if it is shorter, it is attributed to a word-internal silence of the kind mentioned above, and the word is judged to be still continuing.
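Read concretely, the rule described above amounts to a simple threshold test. The following minimal sketch (an illustration only; the constant names, the energy floor, and the function names are assumptions, not terms from the patent) expresses the 0.3-second criterion in Python:

WORD_GAP_SEC = 0.3       # inter-word silence threshold given in the description
ENERGY_FLOOR = 0.01      # assumed energy level below which a frame counts as silent

def is_silent(frame_energy, floor=ENERGY_FLOOR):
    """A frame is treated as silent when its energy falls below the assumed floor."""
    return frame_energy < floor

def classify_gap(gap_sec, word_gap_sec=WORD_GAP_SEC):
    """A gap of 0.3 s or more is an inter-word silence; anything shorter is a pause inside a word."""
    return "inter_word" if gap_sec >= word_gap_sec else "within_word"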

Since the silent period (A) shown in FIG. 1 is shorter than 0.3 seconds, the analysis section regards the word as still continuing. It is only in the illustrated silent period (B) that 0.3 seconds of silence is first detected, so a word boundary is judged to have arrived; the time-series feature information of each frequency band accumulated up to that point is then divided into, for example, 16 segments and handed to the recognition section as 16 steps of information. In the case shown in FIG. 1, the recognition section therefore receives the 16 steps of information and starts processing only after the above 0.3 seconds have elapsed. The recognition result is then handed over to the input/output management section.
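For illustration, the conventional timing just described can be sketched as follows (a simplified, assumption-based rendering rather than anything taken from the patent: resample_to_steps, recognize, and the frame representation are all hypothetical). The essential point is that the recognizer is invoked only after the full 0.3-second gap has already elapsed:

def resample_to_steps(frames, steps=16):
    """Reduce a variable-length sequence of feature frames to a fixed number of steps."""
    if not frames:
        return []
    return [frames[int(i * len(frames) / steps)] for i in range(steps)]

def conventional_handover(frames_up_to_gap, recognize):
    # Called only once 0.3 s of silence has been observed, so the recognizer
    # sits idle during the whole inter-word gap -- the delay discussed below.
    analysis_data = resample_to_steps(frames_up_to_gap, steps=16)
    return recognize(analysis_data)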

As described above, when a scheme is adopted in which processing results are handed over in sequence among the analysis section, the recognition section, and the input/output management section, the handover generally has to take the form shown in FIG. 1. When on-line speech recognition is required, however, this 0.3-second delay becomes a problem.

(C) Object and Constitution of the Invention

The present invention aims to solve the above problem. The speech recognition apparatus of the present invention performs frequency analysis of the incoming speech to generate a plurality of pieces of time-series feature information, matches them against the standard feature time series stored in a dictionary to determine the category of the incoming speech, and outputs the recognition result to an input/output device. The apparatus comprises an analysis section that generates the plurality of pieces of time-series feature information, a recognition section that determines the category, and an input/output management section that performs the output, and carries out pipeline processing in which the result obtained by the analysis section is handed to the recognition section and the result obtained by the recognition section is handed to the input/output management section. The analysis section is provided with monitoring means for monitoring the length of silent periods in the incoming speech; at the start of a silent period it transfers to the recognition section the time-series feature information based on the processing results obtained up to that point, and then notifies the recognition section whether or not that silent period was an inter-word silence; if it was not, the time-series feature information is transferred anew at the next silent period. The recognition section starts recognition processing on the basis of the received time-series feature information and, upon the notification concerning the inter-word silence, decides whether to continue or abort that recognition processing. The invention is explained below with reference to the drawings.
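Read as a protocol, the constitution described above reduces to three kinds of messages flowing from the analysis section to the recognition section. The names in the sketch below (TRANSFER, CONTINUE, CANCEL) are illustrative assumptions; the text itself speaks only of a transfer of feature information, a notification about the inter-word silence, and a continue-or-stop decision:

from enum import Enum, auto

class AnalysisMessage(Enum):
    TRANSFER = auto()   # time-series feature data, sent at the start of a silent period
    CONTINUE = auto()   # the silence reached 0.3 s: it was an inter-word gap, keep the result
    CANCEL = auto()     # speech resumed within 0.3 s: the gap was inside a word, discard the work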

(D) Embodiment of the Invention

FIG. 2 shows the configuration of one embodiment of the present invention, and FIG. 3 is an explanatory diagram, corresponding to FIG. 1, that illustrates the processing of this embodiment.

In FIG. 2, reference numeral 1 denotes the analysis section; 2, the recognition section; 3, the input/output management section; 4, a network node that mediates the transfers between the sections; 5, an analysis circuit that extracts time-series feature information for each of a plurality of frequency bands; 6, an analysis processor that performs the processing and control within the analysis section 1; 7, silent-period monitoring means; 8, a recognition processor that performs the processing and control related to recognition; 9, a dictionary and matching circuit; 10, an input/output processor that performs processing and control such as outputting characters on the basis of the recognition result; and 11, an input/output device.
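As a rough structural sketch (class and attribute names are assumptions made for illustration; the patent only names the blocks 1-11 and their roles), the components could be grouped as follows:

from dataclasses import dataclass, field

@dataclass
class SilenceMonitor:          # (7) silent-period monitoring means
    threshold_sec: float = 0.3

@dataclass
class AnalysisSection:         # (1): analysis circuit (5) and analysis processor (6)
    monitor: SilenceMonitor = field(default_factory=SilenceMonitor)

@dataclass
class RecognitionSection:      # (2): recognition processor (8), dictionary & matching circuit (9)
    dictionary: dict = field(default_factory=dict)

@dataclass
class IOManagementSection:     # (3): input/output processor (10) driving the I/O device (11)
    device_name: str = "console"

@dataclass
class NetworkNode:             # (4): mediates all transfers between the three sections
    analysis: AnalysisSection
    recognition: RecognitionSection
    io: IOManagementSection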

The operation is explained below with reference also to FIG. 3.

While the analysis circuit 5 is extracting the plurality of pieces of time-series feature information described above, the analysis processor 6 uses the monitoring means 7 to check for the presence of a silent period. When the illustrated silent period (A) begins, the analysis processor 6 assumes at that instant that an inter-word silence has started, divides the time-series feature information obtained so far into, for example, 16 steps of information as described above (analysis data 1 in the figure), and transfers it to the recognition processor 8 of the recognition section 2. The recognition processor 8 immediately starts the matching process and continues it; if it finishes early, it transfers the result to the input/output processor 10 of the input/output management section 3. If, during this time, the silent period of the input speech ends within 0.3 seconds, as in the illustrated period (A), the analysis processor 6 issues a cancel command to the recognition processor 8 and, if necessary, to the input/output processor 10, whereupon the recognition processor 8 and the input/output processor 10 discard the processing done so far.
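One possible shape for the recognition side of this exchange is sketched below (again an assumption-based illustration: the handler names and the callable match_fn standing in for the dictionary and matching circuit 9 are hypothetical). Matching begins as soon as the data arrives, the partial result is discarded on a cancel, and it is forwarded to the input/output processor 10 on a continue:

class RecognitionProcessorSketch:
    def __init__(self, match_fn, io_processor):
        self.match_fn = match_fn            # stand-in for the dictionary & matching circuit (9)
        self.io_processor = io_processor    # stand-in for the input/output processor (10)
        self.pending_result = None

    def on_transfer(self, analysis_data):
        # Matching starts immediately, i.e. during the silent period itself.
        self.pending_result = self.match_fn(analysis_data)

    def on_cancel(self):
        # Speech resumed within 0.3 s: the gap was inside a word, so discard the work.
        self.pending_result = None

    def on_continue(self):
        # The gap really was a word boundary: commit the result downstream.
        if self.pending_result is not None:
            self.io_processor(self.pending_result)
            self.pending_result = None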

Next, when a silent period begins as in the illustrated period (B), the analysis processor 6 divides the information from the illustrated analysis start point t1 into 16 steps of information in the same way as above (analysis data 2 in the figure) and transfers it to the recognition processor 8 of the recognition section 2, which starts processing immediately. If, as in the illustrated silent period (B), the silence continues for 0.3 seconds or more, the analysis processor 6 issues a continue command to the recognition processor 8 and, if necessary, to the input/output processor 10, so that the processing is carried through to completion. Naturally, once the 0.3 seconds of the illustrated silent period have elapsed, the speech corresponding to the next word may begin to arrive.
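Putting the two cases together, the decision logic of the analysis processor 6 could take the form sketched below (an illustration under the same assumptions as the earlier sketches: it reuses resample_to_steps and a recognizer object exposing on_transfer, on_cancel, and on_continue, and uses per-frame energies as a stand-in for the real feature frames):

def analysis_loop(frame_energies, frame_sec, recognizer,
                  energy_floor=0.01, word_gap_sec=0.3):
    buffered = []          # feature frames of the current word (energies used as a stand-in)
    silent_sec = 0.0
    transferred = False    # a speculative transfer was already made for this silent period

    for energy in frame_energies:
        if energy < energy_floor:                      # silent frame
            silent_sec += frame_sec
            if not transferred and buffered:
                # Silence onset: hand the data over at once (analysis data 1 / 2 in FIG. 3).
                recognizer.on_transfer(resample_to_steps(buffered, steps=16))
                transferred = True
            if transferred and silent_sec >= word_gap_sec:
                recognizer.on_continue()               # 0.3 s reached: a genuine word boundary
                buffered, transferred = [], False
        else:                                          # speech frame
            if transferred:
                recognizer.on_cancel()                 # the gap ended within 0.3 s
                transferred = False
            silent_sec = 0.0
            buffered.append(energy)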

(E) Effects of the Invention

As described above, according to the present invention, the speech recognition apparatus is divided into an analysis section, a recognition section, and an input/output management section, information can be transferred efficiently between these sections, and the time delay in on-line processing can be eliminated.

[Brief Description of the Drawings]

FIG. 1 is an explanatory diagram illustrating the problem that becomes apparent when the processing within the apparatus is partitioned and the processing results are transferred in sequence; FIG. 2 shows the configuration of one embodiment of the present invention; and FIG. 3 is an explanatory diagram, corresponding to FIG. 1, that illustrates the processing of this embodiment. In the figures, 1 is the analysis section, 2 the recognition section, 3 the input/output management section, 4 the network node, and 6, 8, and 10 are the processors.

Patent applicant: Fujitsu Limited. Agent: patent attorney Hiroshi Mori (and one other).

Claims (1)

[Claims]

A speech recognition apparatus configured to perform frequency analysis of incoming speech to generate a plurality of pieces of time-series feature information, to match them against standard feature time series stored in a dictionary to determine the category of the incoming speech, and to output the recognition result to an input/output device, the apparatus comprising an analysis section that generates the plurality of pieces of time-series feature information, a recognition section that determines the category, and an input/output management section that performs the output, and carrying out pipeline processing in which the result obtained by the analysis section is handed to the recognition section and the result obtained by the recognition section is handed to the input/output management section; wherein the analysis section is provided with monitoring means for monitoring the length of silent periods in the incoming speech and is configured to transfer to the recognition section, at the start of a silent period, the time-series feature information based on the processing results obtained up to that point, then to notify the recognition section whether or not that silent period was an inter-word silence, and, if it was not, to transfer the time-series feature information anew at the next silent period; and wherein the recognition section is configured to start recognition processing on the basis of the received time-series feature information and to decide, upon the notification concerning the inter-word silence, whether to continue or abort that recognition processing.
JP57231861A 1982-12-25 1982-12-25 Voice recognition equipment Pending JPS59119397A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP57231861A JPS59119397A (en) 1982-12-25 1982-12-25 Voice recognition equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP57231861A JPS59119397A (en) 1982-12-25 1982-12-25 Voice recognition equipment

Publications (1)

Publication Number Publication Date
JPS59119397A true JPS59119397A (en) 1984-07-10

Family

ID=16930164

Family Applications (1)

Application Number Title Priority Date Filing Date
JP57231861A Pending JPS59119397A (en) 1982-12-25 1982-12-25 Voice recognition equipment

Country Status (1)

Country Link
JP (1) JPS59119397A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6157982A (en) * 1984-08-29 1986-03-25 ソニー株式会社 Automobile
JPS6479798A (en) * 1987-09-21 1989-03-24 Toshiba Corp Voice order recognition equipment
US5799274A (en) * 1995-10-09 1998-08-25 Ricoh Company, Ltd. Speech recognition system and method for properly recognizing a compound word composed of a plurality of words

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS52144205A (en) * 1976-05-27 1977-12-01 Nec Corp Voice recognition unit
JPS5748798A (en) * 1980-09-08 1982-03-20 Mitsubishi Electric Corp Word voice recognizing device

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS52144205A (en) * 1976-05-27 1977-12-01 Nec Corp Voice recognition unit
JPS5748798A (en) * 1980-09-08 1982-03-20 Mitsubishi Electric Corp Word voice recognizing device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6157982A (en) * 1984-08-29 1986-03-25 ソニー株式会社 Automobile
JPS6479798A (en) * 1987-09-21 1989-03-24 Toshiba Corp Voice order recognition equipment
US5799274A (en) * 1995-10-09 1998-08-25 Ricoh Company, Ltd. Speech recognition system and method for properly recognizing a compound word composed of a plurality of words

Similar Documents

Publication Publication Date Title
US4567606A (en) Data processing apparatus and method for use in speech recognition
US6633941B2 (en) Reduced networking interrupts
US8712757B2 (en) Methods and apparatus for monitoring communication through identification of priority-ranked keywords
CN108962283A (en) A kind of question terminates the determination method, apparatus and electronic equipment of mute time
JPH0785208B2 (en) An alarm timer for users of interactive systems
US7747444B2 (en) Multiple sound fragments processing and load balancing
JPH08195763A (en) Voice communications channel of network
US4423290A (en) Speech synthesizer with capability of discontinuing to provide audible output
JPS59119397A (en) Voice recognition equipment
US4641342A (en) Voice input system
JP2009021923A (en) Voice communication apparatus
US11361258B2 (en) System and method for call timing and analysis
US6678354B1 (en) System and method for determining number of voice processing engines capable of support on a data processing system
CN112802457A (en) Method, device, equipment and storage medium for voice recognition
US7788097B2 (en) Multiple sound fragments processing and load balancing
JPH10240284A (en) Method and device for voice detection
JP2743810B2 (en) Voice response device capable of testing with a load close to actual load
US20030229491A1 (en) Single sound fragment processing
JPH06250815A (en) Voice mail terminal
CN117354580A (en) Live video audio silencing method and device and electronic equipment
JPH09198077A (en) Speech recognition device
CN110933151A (en) Processing method and first electronic device
JPH0631997B2 (en) Output holding circuit of voice detector
JPS6195397A (en) Voice pattern collation system for voice recognition equipment
JPH07154576A (en) Facsimile equipment