JPH06214595A - Voice recognition method - Google Patents

Voice recognition method

Info

Publication number
JPH06214595A
Authority
JP
Japan
Prior art keywords
frame
dsp
arithmetic
processing
processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP5007396A
Other languages
Japanese (ja)
Inventor
Makoto Shosakai
誠 庄境
Kunihiko Owa
邦彦 尾和
Kazuya Takeda
一哉 武田
Shingo Kuroiwa
眞吾 黒岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
KDDI Corp
Asahi Chemical Industry Co Ltd
Original Assignee
Kokusai Denshin Denwa KK
Asahi Chemical Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kokusai Denshin Denwa KK and Asahi Chemical Industry Co Ltd
Priority to JP5007396A
Publication of JPH06214595A

Abstract

PURPOSE: To eliminate the waiting time of each arithmetic processor and shorten the frame length by executing the third arithmetic process every frame together with a new first arithmetic process. CONSTITUTION: After the first arithmetic process of a first arithmetic processor 10, a second arithmetic processor 20 repeatedly executes a second arithmetic process every frame, delayed by at least one frame time. The processor 10 repeatedly executes, every frame, a third arithmetic process delayed by at least one frame time relative to the second arithmetic process, together with a new first arithmetic process. The series of first to third arithmetic processes that the two processors carry out on any one item of data therefore spans several frame times, yet each processor executes its assigned processes every frame and so processes the data as it arrives in time series.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a voice recognition method that executes voice recognition processing at a fixed period such as a frame, and more particularly to a voice recognition method in which the processing is shared among a plurality of digital processors.

[0002]

[Prior Art] In a typical conventional voice recognition method, speech is converted into an electrical signal, features of the signal waveform are extracted, and those features are compared with predetermined standard patterns. The identification label attached to the most similar standard pattern (an identification number for a phoneme or phonological unit) is output as the voice recognition result.
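The comparison step described above can be sketched as a nearest-pattern lookup. This is only an illustration: the function name `nearest_label`, the use of Euclidean distance, and the sample vectors are assumptions made here, since the patent does not specify the distance measure or the feature format.

```python
import math

def nearest_label(features, standard_patterns):
    """Return the label of the standard pattern closest to `features`.

    standard_patterns is a list of (label, vector) pairs.
    Euclidean distance is an assumption; the patent only says "distance calculation".
    """
    best_label, best_dist = None, math.inf
    for label, pattern in standard_patterns:
        dist = math.dist(features, pattern)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Hypothetical standard patterns: (identification number, feature vector)
patterns = [(1, [0.0, 0.0]), (2, [1.0, 1.0]), (3, [4.0, 0.0])]
print(nearest_label([0.9, 1.2], patterns))  # prints 2
```

In a real recognizer this lookup runs once per frame, which is why its cost bears directly on the frame period discussed next.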

[0003] In such a voice recognition method, features are extracted from the speech signal at fixed intervals (called the frame period), and recognition is performed on the basis of the result. Extracting features from the signal waveform and computing the similarity to the standard patterns (distance calculation) involve complex computation, so the processing takes time. Voice recognition devices have therefore recently been proposed in which a plurality of digital processors share the arithmetic processing, shortening the frame period and thereby improving recognition accuracy.

[0004] FIG. 2 shows a partial circuit configuration of a conventional voice recognition device of this type. In FIG. 2, a buffer 15 is provided between a first digital (arithmetic) processor (DSP) 10 and a second DSP 20.

[0005] In this configuration, the first DSP 10 executes a first arithmetic process for voice recognition using feature parameters extracted from the speech signal, and then transfers the result P(t) to the buffer 15. The buffer 15 temporarily stores P(t) to adjust the transfer timing and hands the stored result to the second DSP 20 as P′(t).

[0006] The second DSP 20 uses this result P′(t) to perform a second arithmetic process for voice recognition. Its result Q(t) is temporarily stored in the buffer 15 and then handed to the first DSP 10 as Q′(t). The first DSP 10 uses the second result Q′(t) to perform a third arithmetic process and outputs the voice recognition result.

[0007] This whole series of processes is executed within one frame. For reference, FIG. 3 shows the execution timing. As FIG. 3 makes clear, the processing executed within one frame consists of: computation by the first DSP 10 → transfer to the buffer 15 → computation by the second DSP 20 → transfer to the buffer 15 → computation by the first DSP 10. One frame length is therefore the sum of these processing times.
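The sum described in this paragraph can be written out explicitly. The symbols are introduced here only for illustration and do not appear in the patent: T_1, T_2 and T_3 stand for the times of the first, second and third arithmetic processes, and T_buf for the time of one transfer through the buffer 15, assumed equal in both directions.

```latex
T_{\text{frame}}^{\text{conv}} = T_1 + T_{\text{buf}} + T_2 + T_{\text{buf}} + T_3
```

Because every term lies on a single serial path, speeding up one processor shortens only its own term.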

[0008]

[Problem to Be Solved by the Invention] While one of the first and second DSPs 10, 20 is executing its arithmetic process, the other DSP sits idle (see FIG. 3). In addition, because DSP processing speed is limited, the frame length has a lower bound even at the DSP's maximum processing speed. The conventional method therefore had the problem that the frame length is difficult to shorten.

[0009] In view of the above, an object of the present invention is to provide a voice recognition method that, when voice recognition processing is shared among a plurality of DSPs, can reduce the DSP waiting time and shorten the frame length.

[0010]

[Means for Solving the Problem] To achieve this object, the present invention provides a voice recognition method in which a first arithmetic processor executes a first arithmetic process, a second arithmetic processor executes a second arithmetic process using the result of the first arithmetic process, and the first arithmetic processor executes a third arithmetic process using the result of the second arithmetic process, the processing for voice recognition being repeated every frame. The method is characterized in that the second arithmetic processor repeatedly executes the second arithmetic process every frame, delayed by at least one frame time after the first arithmetic process of the first arithmetic processor, and in that the first arithmetic processor repeatedly executes, every frame, the third arithmetic process, delayed by at least one frame time relative to the second arithmetic process, together with a new first arithmetic process.

[0011]

[Operation] In the present invention, the series of processes from the first to the third arithmetic process, executed by the two arithmetic processors on any one item of data, takes a plurality of frame times. Each arithmetic processor, however, executes its assigned processes every frame and so applies them to the data input in time series.

[0012]

[Embodiment] An embodiment of the present invention is described in detail below with reference to the drawings.

[0013] FIG. 1 shows the processing sequence of the embodiment of the present invention.

[0014] The circuit configuration of this embodiment is the same as the conventional example of FIG. 2, and the processing performed by each constituent circuit is also the same as in the conventional example. In this embodiment, the processes that were conventionally executed within a single frame are reordered so that each is delayed by one frame from the one before. This eliminates the DSP waiting time while still producing a voice recognition result every frame. The processing sequence of this embodiment is described below.

[0015] In FIG. 1, when the first DSP 10 receives the feature parameters used for voice recognition at time t, it executes the first arithmetic process and hands the result P(t) to the buffer 15. At the next time t+1, the result for time t is handed to the second DSP 20 as P′(t). At time t+2 the second DSP 20 performs the second arithmetic process using the first DSP 10's result for time t. Its result Q(t) is transferred to the buffer 15 and, at time t+3, handed from the buffer 15 to the first DSP 10. At time t+4 the first DSP 10 executes the third arithmetic process on the second result Q′(t) and outputs the voice recognition result. At time t+4 the first DSP 10 also performs the first arithmetic process as before, but the feature parameters used by that first arithmetic process are the data input at time t+4. This is the processing sequence by which the feature parameters of a given time t are converted into a voice recognition result at time t+4.
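The timing in this paragraph can be tabulated with a short simulation. It is a sketch under assumptions: the function name `pipeline_schedule`, the string labels, and the `x[t]` notation for the input at frame t are all hypothetical. Only the frame offsets (t, t+1, t+2, t+3, t+4) come from the description.

```python
def pipeline_schedule(num_frames):
    """List, for each frame time, the work done by DSP 10, buffer 15 and DSP 20.

    Offsets follow the description: first process at t, buffer -> DSP 20 at t+1,
    second process at t+2, buffer -> DSP 10 at t+3, third process at t+4.
    """
    schedule = []
    for now in range(num_frames):
        # DSP 10 always runs the first process on the newest input.
        dsp1 = [f"op1(x[{now}])"]
        if now >= 4:
            # It also finishes the item that entered four frames earlier.
            dsp1.append(f"op3(Q'[{now - 4}])")
        buffer = []
        if now >= 1:
            buffer.append(f"P[{now - 1}] -> DSP2")
        if now >= 3:
            buffer.append(f"Q[{now - 3}] -> DSP1")
        dsp2 = [f"op2(P'[{now - 2}])"] if now >= 2 else []
        schedule.append({"t": now, "dsp1": dsp1, "buffer": buffer, "dsp2": dsp2})
    return schedule

for row in pipeline_schedule(6):
    print(row)
```

From frame 4 onward every unit has work in every frame, and one recognition result leaves DSP 10 per frame, each four frames after its input arrived.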

[0016] The first DSP 10 repeats the first and third arithmetic processes at each time. The first DSP 10 can therefore output a voice recognition result at fixed intervals.

[0017] In this embodiment, data transfer between the components is synchronized frame by frame. Compared with the conventional example, in which data transfer takes place within a single frame, neither the first DSP 10 nor the second DSP 20 has to wait while data is exchanged with the other. The frame length can therefore be shortened by the waiting time saved.

[0018] More specifically, the conventional frame length is the time for the first and third arithmetic processes of the first DSP 10, plus the data transfer time of the buffer 15, plus the time for the second arithmetic process of the second DSP 20. In this embodiment, the frame length is the largest of three values: the time for the first and third arithmetic processes, the data transfer time of the buffer 15, and the time for the second arithmetic process of the second DSP 20.
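The difference between the two frame lengths is the difference between a sum and a maximum, as the following sketch shows. The function names and the millisecond timings are hypothetical values chosen for illustration; the patent gives no figures.

```python
def conventional_frame_length(t_dsp1, t_buffer, t_dsp2):
    """Serial execution: all stages run one after another inside one frame."""
    return t_dsp1 + t_buffer + t_dsp2

def pipelined_frame_length(t_dsp1, t_buffer, t_dsp2):
    """Pipelined execution: the slowest stage sets the frame length."""
    return max(t_dsp1, t_buffer, t_dsp2)

# Hypothetical timings in milliseconds:
# t_dsp1 = first + third processes on DSP 10, t_buffer = buffer 15 transfer,
# t_dsp2 = second process on DSP 20.
t_dsp1, t_buffer, t_dsp2 = 4.0, 1.0, 5.0
print(conventional_frame_length(t_dsp1, t_buffer, t_dsp2))  # prints 10.0
print(pipelined_frame_length(t_dsp1, t_buffer, t_dsp2))     # prints 5.0
```

The shorter frame comes at the cost of latency: as described for FIG. 1, the result for the input at time t appears at time t+4.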

[0019] In addition to this embodiment, the following variations can be implemented.

[0020] (1) This embodiment shows an example in which a data transfer buffer is provided between the first DSP 10 and the second DSP 20, but the buffer is not essential; data can also be exchanged directly between the first DSP 10 and the second DSP 20. In that case, the two processors exchange communication control signals for the data transfer. Also, in this embodiment the buffer 15 serves both the input and the output of the first DSP 10's data, but two buffers, one for input and one for output, may be used instead.

[0021] (2) This embodiment takes as its example the case where two DSPs execute the voice recognition processing, but the present invention can also be applied where two or more DSPs are connected in series and a DSP other than the most downstream one performs further arithmetic processing using the result of the most downstream DSP.

[0022] (3) In this embodiment, three processes for voice recognition are executed by two DSPs, but the two DSPs may execute three or more processes. In that case, the frame timings of the processes that the two DSPs execute on the related data being exchanged are made to differ.

[0023]

[Effect of the Invention] As described above, the present invention eliminates the waiting time of each arithmetic processor and allows the frame length to be shortened, which can contribute to improved voice recognition accuracy.

[Brief Description of the Drawings]

FIG. 1 is an explanatory diagram showing the processing sequence of the embodiment of the present invention.

FIG. 2 is a block diagram showing a partial configuration of a voice recognition device.

FIG. 3 is an explanatory diagram showing the processing sequence of the conventional example.

[Explanation of Reference Numerals]

10 First digital processor (first DSP)
15 Buffer
20 Second digital processor (second DSP)

Continued from front page
(72) Inventor Kazuya Takeda, c/o Kokusai Denshin Denwa KK, 2-3-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo
(72) Inventor Shingo Kuroiwa, c/o Kokusai Denshin Denwa KK, 2-3-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo

Claims (1)

[Claims]

1. A voice recognition method in which a first arithmetic processor executes a first arithmetic process, a second arithmetic processor executes a second arithmetic process using the result of the first arithmetic process, and the first arithmetic processor executes a third arithmetic process using the result of the second arithmetic process, the processing for voice recognition being repeated every frame, characterized in that:
the second arithmetic processor repeatedly executes the second arithmetic process every frame, delayed by at least one frame time after the first arithmetic process of the first arithmetic processor; and
the first arithmetic processor repeatedly executes, every frame, the third arithmetic process, delayed by at least one frame time relative to the second arithmetic process, together with a new first arithmetic process.
JP5007396A 1993-01-20 1993-01-20 Voice recognition method Pending JPH06214595A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP5007396A JPH06214595A (en) 1993-01-20 1993-01-20 Voice recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP5007396A JPH06214595A (en) 1993-01-20 1993-01-20 Voice recognition method

Publications (1)

Publication Number Publication Date
JPH06214595A true JPH06214595A (en) 1994-08-05

Family

ID=11664747

Family Applications (1)

Application Number Title Priority Date Filing Date
JP5007396A Pending JPH06214595A (en) 1993-01-20 1993-01-20 Voice recognition method

Country Status (1)

Country Link
JP (1) JPH06214595A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006075648A1 (en) * 2005-01-17 2006-07-20 Nec Corporation Speech recognition system, speech recognition method, and speech recognition program

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS62124600A (en) * 1985-11-26 1987-06-05 株式会社東芝 Voice recognition equipment
JPH0262879A (en) * 1988-08-26 1990-03-02 Agency Of Ind Science & Technol Photochromic compound and production thereof
JPH0345840A (en) * 1989-07-12 1991-02-27 Matsushita Electric Ind Co Ltd Outdoor device in air conditioning equipment

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006075648A1 (en) * 2005-01-17 2006-07-20 Nec Corporation Speech recognition system, speech recognition method, and speech recognition program
JPWO2006075648A1 (en) * 2005-01-17 2008-08-07 日本電気株式会社 Speech recognition system, speech recognition method and speech recognition program
US7930180B2 (en) 2005-01-17 2011-04-19 Nec Corporation Speech recognition system, method and program that generates a recognition result in parallel with a distance value
JP5103907B2 (en) * 2005-01-17 2012-12-19 日本電気株式会社 Speech recognition system, speech recognition method, and speech recognition program

Similar Documents

Publication Publication Date Title
JPH06214595A (en) Voice recognition method
US8453003B2 (en) Communication method
JP3436184B2 (en) Multi-channel input speech recognition device
JPS6059441A (en) Data control circuit
JPH05110441A (en) Prediction output d/a converter
CN114007176A (en) Audio signal processing method, apparatus and storage medium for reducing signal delay
JPH0267665A (en) Interface circuit
JPH0693240B2 (en) Program synchronization circuit
JPS60136830A (en) Operation processor
JPS62175831A (en) Control system for pipeline with tag
JPS6277651A (en) Branch processing system for data flow type computer
JPS6059461A (en) Program memory device
JP2747154B2 (en) I / O processor
JPS60164861A (en) Data transfer processing method
JPH07141288A (en) Dma transfer system
JPH06152546A (en) Microprocessor
JPS6135496A (en) Voice recognition equipment
JPH05327817A (en) Data transfer method and device therefor
JPH06195197A (en) Voice encoding arithmetic operation unit
JPS613282A (en) Logical arithmetic unit between binary images
JPS59138147A (en) Data transmitter
JPH05225130A (en) Data processor
JPS6363944B2 (en)
JPH02268367A (en) Average value calculating circuit
JPH02178805A (en) Remote i/o control method

Legal Events

Date Code Title Description
A02 Decision of refusal

Free format text: JAPANESE INTERMEDIATE CODE: A02

Effective date: 19980220