JP2000022830A

JP2000022830A - Telephone sound recognition/response device, telephone sound recognition system and recording medium recording telephone sound recognition system control program

Info

Publication number: JP2000022830A
Application number: JP10190897A
Authority: JP
Inventors: Shintaro Murakami; 伸太郎村上; Toshiyuki Sasamoto; 要志笹本
Original assignee: Meidensha Corp; Meidensha Electric Manufacturing Co Ltd
Current assignee: Meidensha Corp; Meidensha Electric Manufacturing Co Ltd
Priority date: 1998-07-07
Filing date: 1998-07-07
Publication date: 2000-01-21

Abstract

PROBLEM TO BE SOLVED: To make the device compact and to keep the overall system from growing large even when the number of lines to be recognized increases. SOLUTION: A voice recognition/response device has a CPU board 21, a telephone answering board 22, an A/D conversion board 23, an HDD device 25, and a voice synthesizer 26. A voice signal from a telephone line is input to the A/D conversion board 23 via the telephone answering board 22. The A/D conversion board 23 converts the input voice signal into a digital signal, records it, and transmits the recorded voice signal to the CPU board 21 over an internal bus 24. The CPU board 21 performs recognition processing on the transmitted voice signal and, based on that processing, retrieves response information stored in the HDD device 25. The response information is sent to the voice synthesizer 26 and converted into a synthesized voice signal, which is transmitted to the telephone line via the telephone answering board 22.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a telephone voice recognition/response device, a telephone voice recognition system, and a recording medium storing a telephone voice recognition system control program.

[0002]

2. Description of the Related Art

One known speech recognition device is the discrete-word speech recognition system shown in FIG. 10. In this system, voice data from a voice input device 11, such as a telephone or a microphone, is input to a voice input unit 12. The voice data is then supplied to a feature extraction unit 13, where it undergoes frequency analysis. The sequence of spectra obtained from the frequency analysis is input to a phoneme recognition unit 14, which consists of a neural network whose output units are duplicated.

[0003] The neural network of the phoneme recognition unit 14 consists of an input layer, a hidden layer, and an output layer. At each time step, for example, five frames of spectra are input to the input layer. The values of the output-layer units indicate which phoneme the central frame corresponds to. Because the output units are duplicated, two units are associated with each phoneme category. The two units with the largest output values are selected, and the phonemes they correspond to are taken as the first-ranked and second-ranked phoneme candidates.
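The candidate selection in paragraph [0003] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `top_two_phonemes`, the unit-to-phoneme table, and the output values are all assumptions made for the example. The source does not say what happens when both top units belong to the same phoneme, so the sketch simply returns whatever the two units map to.

```python
from typing import List, Tuple


def top_two_phonemes(outputs: List[float], unit_to_phoneme: List[str]) -> Tuple[str, str]:
    """Return the phonemes of the two output units with the largest values."""
    if len(outputs) != len(unit_to_phoneme):
        raise ValueError("one phoneme label is required per output unit")
    if len(outputs) < 2:
        raise ValueError("at least two output units are required")
    # Rank unit indices by output value, largest first.
    ranked = sorted(range(len(outputs)), key=lambda i: outputs[i], reverse=True)
    return unit_to_phoneme[ranked[0]], unit_to_phoneme[ranked[1]]


# Hypothetical layout: two units per phoneme category (duplicated outputs).
UNIT_TO_PHONEME = ["a", "a", "i", "i", "u", "u"]

# Hypothetical output-layer values for one time step.
first, second = top_two_phonemes([0.1, 0.7, 0.6, 0.2, 0.05, 0.3], UNIT_TO_PHONEME)
print(first, second)
```

Applying this at every time step yields the phoneme candidate sequence that the matching stage consumes.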

[0004] The recognized phoneme candidate sequence is compared with templates 15 in a dictionary that holds the phoneme patterns of the vocabulary to be recognized. The similarity between each phoneme in a template and the first-ranked and second-ranked candidates in the recognized sequence serves as a local score. A matching unit 16 matches these local scores by a method such as DTW and accumulates them into an overall similarity score. The word whose similarity score is the smallest among all the vocabulary to be recognized is output from the matching unit 16 as the recognition result.
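The matching in paragraph [0004] can be sketched with a basic dynamic time warping (DTW) recurrence. The source names DTW only as one possible method and gives no cost values, so the local costs here (0 for a match with the first candidate, 0.5 for the second, 1 otherwise), the step pattern, and the names `local_cost`, `dtw_score`, and `recognize` are assumptions. Because the word with the smallest score wins, the score is treated as a distance.

```python
from typing import Dict, List, Tuple

Candidates = Tuple[str, str]  # (first-ranked, second-ranked) phoneme for one frame


def local_cost(template_phoneme: str, cand: Candidates) -> float:
    """Assumed local score: lower means more similar."""
    if template_phoneme == cand[0]:
        return 0.0
    if template_phoneme == cand[1]:
        return 0.5
    return 1.0


def dtw_score(template: List[str], observed: List[Candidates]) -> float:
    """Accumulate local costs along the cheapest DTW alignment path."""
    inf = float("inf")
    n, m = len(template), len(observed)
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
            acc[i][j] = local_cost(template[i - 1], observed[j - 1]) + best_prev
    return acc[n][m]


def recognize(dictionary: Dict[str, List[str]], observed: List[Candidates]) -> str:
    """Return the vocabulary word whose template has the smallest score."""
    return min(dictionary, key=lambda word: dtw_score(dictionary[word], observed))


# Hypothetical two-word dictionary and candidate sequence.
DICTIONARY = {"hai": ["h", "a", "i"], "iie": ["i", "i", "e"]}
OBSERVED = [("h", "k"), ("a", "o"), ("a", "e"), ("i", "e")]
print(recognize(DICTIONARY, OBSERVED))
```

The second "a" frame is absorbed by the warping path, so an utterance that holds a phoneme longer than the template still matches at zero cost.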

[0005]

PROBLEMS TO BE SOLVED BY THE INVENTION

(1) There are systems in which a speech recognition device such as the one described above is applied as a telephone voice recognition device. When several telephone voice recognition devices in such a system operate at the same time, the results of each device are generally processed individually. With individual processing, the devices cannot refer to each other's output, so each can only perform processing independently of the others. When the devices need to be processed in an integrated manner, it is therefore desirable to build a client-server system in which a single server can process the results of all the speech recognition devices together.

[0006] (2) In a server-client system whose clients are telephone voice recognition devices, usability depends on how the server can control each client (for example, shutdown processing). To avoid placing an unnecessary burden on the user, it is desirable to build a system in which the clients can be controlled easily.

[0007] (3) Pattern matching, including speech recognition processing, is computationally expensive, and at the processing speed of current computers an entire machine often has to be dedicated to recognition. Consequently, adding even one telephone line for recognition requires adding another computer, and the system grows in scale. To build a system that can perform recognition on several telephone lines simultaneously, it is therefore desirable to produce a compact speech recognition device.

[0008] The present invention has been made in view of the above circumstances. Its object is to provide a telephone voice recognition/response device, a telephone voice recognition system, and a recording medium storing a telephone voice recognition system control program that allow the device to be made compact, prevent the overall system from becoming bloated even when the number of lines to be recognized increases, allow the processing results of the recognition clients to be handled together, and make control easy.

[0009]

SUMMARY OF THE INVENTION

To achieve the above object, the first invention comprises a CPU board on which various memories are mounted; an A/D conversion board that converts an analog signal into a digital signal; a telephone answering board on which a network terminating device and a digital switch are mounted and which is provided with telephone line connection terminals; a conductor that electrically connects these boards; and a voice synthesizer that generates a synthesized voice signal from the signal output by the CPU board and supplies it to the telephone answering board. An analog signal sent from a telephone line connected to a telephone line connection terminal of the telephone answering board is transmitted from the telephone answering board to the A/D conversion board and converted there into a digital signal. The digital signal is transmitted to the CPU board, where recognition processing is executed, and the recognition result is converted into synthesized voice by the voice synthesizer and then transmitted from the telephone answering board to the telephone line.

[0010] In the second invention, a plurality of the telephone voice recognition/response devices are connected to a network as clients. Over the network, each client transmits recognition results to a server, or receives commands from the server and automatically performs the corresponding processing.

[0011] The third invention is a recording medium on which a program for controlling a telephone voice recognition system is recorded. The program functions as means for using a status notification function provided by a server; means for not only notifying the server of status but also conveying commands from the server to a client via a return value; and means for controlling the client from the server.

[0012]

EMBODIMENTS OF THE INVENTION

Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a configuration diagram of a voice recognition/response device according to a first embodiment of the invention. In FIG. 1, reference numeral 21 denotes a CPU board, which carries, in addition to a CPU, a DRAM, a DPRAM, a network card, a video card, and the like. Reference numeral 22 denotes a telephone answering board, which is configured as shown in FIG. 2 and described later. Reference numeral 23 denotes an A/D conversion board, which receives the voice signal from the telephone line via the telephone answering board 22 and has a function of converting the input voice signal into a digital signal and recording it. The boards 21 to 23 are connected to an internal bus 24, and the voice signal recorded by the A/D conversion board 23 is sent to the CPU board 21 over the internal bus 24.

[0013] The CPU board 21 performs recognition processing on the voice signal it receives. Based on that processing, it either retrieves response information stored in the HDD device 25, or transmits the result of the recognition processing over a network to a server (not shown) and receives a response command (recognition result) from that server. The CPU board 21 then sends the response information from the HDD device 25, or the response command from the server, to the voice synthesizer 26, where it is converted into a synthesized voice signal. The converted voice signal is transmitted to the telephone line via the telephone answering board 22.

[0014] Next, the telephone answering board 22 is described in detail with reference to FIG. 2. In FIG. 2, reference numeral 220 denotes a digital time-division space switch, implemented as an LSI, that switches internal channels. Connected to the switch 220 are a DSP 221, a network control unit (NCU) 222 described in detail later, an auxiliary input terminal 223 that receives the voice signal from the voice synthesizer 26, and an auxiliary output terminal 224 that supplies the voice signal to the A/D conversion board 23. Reference numerals 225 and 226 denote telephone line connection terminals, which are connected to the NCU 222.

[0015] Reference numeral 227 denotes a connection terminal for an attached telephone set. The DSP 221 is connected to the internal bus 24 via a DPRAM 228. A CPU 211, a DPRAM 212, and a DRAM 213 are mounted on the CPU board 21. Reference numeral 214 denotes a bus for connection to other equipment. The NCU 222, also called a network terminating device, is the lowest-layer control device in the telephone network. It exchanges information with a switch installed at a telephone office or the like according to a predetermined procedure and organically connects the terminal device to the telephone network.

[0016] As described above, the voice recognition/response device houses three boards (the CPU board 21, the telephone answering board 22, and the A/D conversion board 23) together with the HDD device 25 in one enclosure, to which the voice synthesizer 26 is added, so the device as a whole is relatively compact. Because each device operates independently, adding lines for recognition requires no complicated installation work, and because each device is compact, lines can be added without greatly increasing the scale of the overall system.

[0017] Next, a second embodiment of the invention, in which a client-server system is built from devices of the first embodiment, is described with reference to FIG. 3. In FIG. 3, the portions enclosed by broken lines are clients A and B, each of which is the voice recognition/response device shown in FIG. 1. Clients A and B are connected to a network 31, to which a server 32 with a manager function for monitoring clients A and B is also connected. The server 32 starts a manager program (its functions are described later) and monitors clients A and B. Clients A and B each have a timer function that reports the current status to the server 32 at regular intervals; it runs in its own thread, in parallel with the recognition processing.

[0018] The timer function incorporates a status notification function, described below. The status notification function is defined in the manager program of the server 32; the client calls it and passes it arguments. This lets the manager program check each client's current status and speech recognition results as they arrive. Using the status notification function shown in FIG. 4, the server 32 can not only obtain the client status and similar information but also hand a specified return value to the timer function of each of clients A and B. Clients A and B refer to the return value received by the timer function, recognize the instruction from the server 32, and perform the necessary processing. With this mechanism the server 32 can, for example, shut down client B or terminate the recognition/response program of client B, so the server 32 can control the entire system.
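The return-value mechanism of paragraph [0018] can be sketched as below. This is an in-process illustration only: in the patent the client calls the function across the network, and the source does not give the function's signature or the command codes. The names `Manager`, `Command`, `notify_status`, and `set_command`, and the rule that a command is delivered once and then cleared, are assumptions.

```python
from enum import IntEnum
from typing import Dict, List, Optional, Tuple


class Command(IntEnum):
    """Hypothetical return values understood by the client."""
    NONE = 0
    QUIT_PROGRAM = 1   # terminate the recognition/response program
    SHUTDOWN = 2       # shut the client down


class Manager:
    """Server-side holder of the status notification function."""

    def __init__(self) -> None:
        self.log: Dict[str, List[Tuple[str, Optional[str]]]] = {}
        self._pending: Dict[str, Command] = {}

    def set_command(self, client_id: str, command: Command) -> None:
        """Queue a command to be returned on the client's next notification."""
        self._pending[client_id] = command

    def notify_status(self, client_id: str, state: str,
                      result: Optional[str] = None) -> Command:
        """Record the client's status and answer with any pending command."""
        self.log.setdefault(client_id, []).append((state, result))
        return self._pending.pop(client_id, Command.NONE)


manager = Manager()
print(manager.notify_status("B", "waiting for call"))      # no command queued yet
manager.set_command("B", Command.SHUTDOWN)
print(manager.notify_status("B", "recognizing", "hai"))     # command rides on the return value
```

The design point is that the client never listens for incoming requests: every instruction from the server arrives as the answer to a call the client made itself.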

[0019] The processing of the recognition/response program in the second embodiment configured as described above is explained with reference to FIG. 5. The individual processes in FIG. 5, namely those of the timer unit, the push tone / line disconnection detection unit, and the recognition processing unit, are explained with reference to FIGS. 6 to 8.

[0020] First, in the flowchart of FIG. 5, which shows the processing flow of the recognition/response program, step S1 initializes the application and accesses the manager program of the server 32, and step S2 connects to the manager program. Step S3 then initializes the voice synthesizer 26, and step S4 starts the incoming call detection unit. Once incoming call detection has started, step S5 initializes the telephone line and step S6 starts the timer unit. With the timer unit started, the program waits for an incoming call (step S7), and step S8 determines whether a call has arrived. If the result is "NO", processing returns to step S7; if "YES", step S9 starts the push tone / line disconnection detection unit and, in parallel, runs the recording/recognition processing unit.

[0021] FIG. 6 is a flowchart showing the processing flow of the timer unit. Once the timer has been started in step S11, step S12 notifies the manager program of the server 32 of the client status at regular intervals. Step S13 then refers to the return value of the status notification function from the server 32 and performs the corresponding processing. This is repeated until the timer processing ends in step S14.
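The timer loop of FIG. 6 might look like the following sketch, meant to run in its own thread beside recognition. The callback names (`notify`, `get_state`, `handle`), the integer command codes, and the use of a `threading.Event` to end the loop are assumptions; the source only states that the client reports periodically and acts on the return value.

```python
import threading
from typing import Callable, List


def run_timer(notify: Callable[[str], int],
              get_state: Callable[[], str],
              handle: Callable[[int], None],
              stop: threading.Event,
              interval_s: float = 1.0) -> None:
    """Periodically report status and act on the server's return value."""
    while not stop.is_set():            # S14: loop until timer processing ends
        command = notify(get_state())   # S12: report the client status
        handle(command)                 # S13: act on the return value
        stop.wait(interval_s)           # sleep, but wake at once if stopped


# Hypothetical stand-ins for the server call and the command handler.
reported: List[str] = []
replies = iter([0, 0, 1])               # assumed codes: 0 = nothing, 1 = quit
stop_event = threading.Event()


def fake_notify(state: str) -> int:
    reported.append(state)
    return next(replies)


def fake_handle(command: int) -> None:
    if command == 1:
        stop_event.set()


timer = threading.Thread(
    target=run_timer,
    args=(fake_notify, lambda: "waiting for call", fake_handle, stop_event, 0.01),
)
timer.start()
timer.join(timeout=2)
print(len(reported))
```

Waiting on the event rather than calling `time.sleep` lets a quit command take effect without waiting out the rest of the interval.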

[0022] FIG. 7 is a flowchart showing the processing flow of the push tone / line disconnection detection unit. Step S21 starts line disconnection detection, and step S22 checks the line. If the check finds that the line is not connected, step S23 ends the processing of the recognition processing unit, step S24 ends the processing of the timer unit, and the program waits for an incoming call again (step S25). If the check finds that the line is connected, step S26 starts push tone detection. If no push tone is detected within a fixed time in step S27 (NO), processing returns to step S21. If a push tone is detected in step S27 (YES), step S28 executes the processing corresponding to the received tone.
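The loop of FIG. 7 can be sketched as a small polling function. The event format (a pair of line state and optional tone per pass), the name `monitor_line`, and the choice to keep polling after a tone has been handled are assumptions; the source does not say where control goes after step S28.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# One poll result: (line_connected, tone). tone is None when no push tone
# arrived within the timeout. This representation is hypothetical.
Poll = Tuple[bool, Optional[str]]


def monitor_line(polls: Iterable[Poll], on_tone: Callable[[str], None]) -> str:
    """Watch the line until it drops, dispatching push tones as they arrive."""
    for connected, tone in polls:       # S21/S22: check the line on every pass
        if not connected:
            # S23-S25: the caller would stop recognition and the timer here,
            # then go back to waiting for the next incoming call.
            return "disconnected"
        if tone is not None:            # S27 YES
            on_tone(tone)               # S28: act on the received tone
        # S27 NO: fall through and check the line again (back to S21)
    return "polls exhausted"


tones: List[str] = []
outcome = monitor_line(
    [(True, None), (True, "1"), (True, None), (False, None), (True, "9")],
    tones.append,
)
print(outcome, tones)
```

The tone "9" in the last poll is never seen, because the loop stops as soon as the line is found to be disconnected.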

[0023] FIG. 8 is a flowchart showing the processing flow of the recognition processing unit. Step S31 initializes the recognition processing, step S32 starts voice input (recording), and step S33 performs recording and recognition. When voice input has ended in step S34 (YES), recording and recognition are terminated in step S35, and step S36 outputs the recognition result to the voice synthesizer 26 and other destinations.

[0024] In the processing of the recognition/response program, voice is recorded into a group of recording buffers, and a sequential processing scheme is used in which each buffer is passed to the recognition processing unit as soon as it has been filled. Because recording and recognition proceed in parallel, there is almost no waiting time between the end of the utterance and the output of the recognition result, and fast recognition is achieved.
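The buffer-by-buffer scheme of paragraph [0024] resembles a producer-consumer pipeline, sketched here with a queue and a worker thread. The function name `record_and_recognize`, the `None` end marker, and the stand-in recognizer are assumptions; real recognition would keep state across buffers rather than treat each one in isolation.

```python
import queue
import threading
from typing import Callable, Iterable, List


def record_and_recognize(chunks: Iterable[bytes],
                         recognize_chunk: Callable[[bytes], str]) -> List[str]:
    """Hand each filled buffer to the recognizer while recording continues."""
    filled: queue.Queue = queue.Queue()
    results: List[str] = []

    def recognizer() -> None:
        while True:
            buf = filled.get()
            if buf is None:              # end-of-recording marker
                break
            results.append(recognize_chunk(buf))

    worker = threading.Thread(target=recognizer)
    worker.start()
    for buf in chunks:                   # recording side: one buffer at a time
        filled.put(buf)                  # pass it on as soon as it is full
    filled.put(None)                     # recording has ended
    worker.join()                        # only the last buffer is still pending
    return results


# Hypothetical buffers and a stand-in recognizer that reports buffer length.
print(record_and_recognize([b"ab", b"cde", b"f"], lambda buf: str(len(buf))))
```

When the utterance ends, all but the final buffer have already been processed, which is why the wait for the result is short.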

[0025] The manager program on the server 32 successively checks and records the current status, recognition results, and so on of every client on which the recognition/response program has been started. It provides the status notification function to the recognition/response program of each running client. The status notification function not only obtains each client's status at a given moment (line connected, speech recognition in progress, and so on) and its speech recognition results, but also, by supplying a return value, conveys commands from the server 32 to clients A and B. The return value can be set by selecting an item from a menu provided in the manager program's window.

[0026] The processing flow of the manager program is explained with reference to the flowchart in FIG. 9. The program is started, and when step S41 receives a connection request from the recognition/response program of any client ("YES"), a recording unit corresponding to that client is created (S42). As more clients connect, corresponding recording units are created one after another. Once a recording unit has been created, step S43 determines whether the user has issued a client processing command; if "YES", step S44 sets the return value for the status notification function. Then, in step S45, the status of each client is received at regular intervals via the status notification function. If a menu item has been selected, the corresponding return value is passed to the status notification function called by the relevant client, and the status is logged in the recording unit (S46). After this, once it has been confirmed that all connected recognition/response programs have terminated (S47), the manager program ends.

[0027]

EFFECTS OF THE INVENTION

As described above, according to the present invention the voice recognition/response device is compact, so the overall system can be kept from becoming bloated even when the number of lines to be recognized increases. In addition, because a client-server configuration is built from the voice recognition/response devices, the processing results of the recognition clients can be handled together and control becomes easy, among other benefits.

[Brief description of the drawings]

FIG. 1 is a configuration diagram of a voice recognition/response device according to a first embodiment of the present invention.

FIG. 2 is a detailed configuration diagram of the telephone answering board.

FIG. 3 is a configuration diagram of a client-server system according to a second embodiment of the present invention.

FIG. 4 is an explanatory diagram of how the status notification function is used.

FIG. 5 is a flowchart showing the processing of the recognition/response program.

FIG. 6 is a flowchart showing the processing of the timer unit.

FIG. 7 is a flowchart showing the processing of the push tone / line disconnection detection unit.

FIG. 8 is a flowchart showing the processing of the recording/recognition processing unit.

FIG. 9 is a flowchart showing the processing of the manager program.

FIG. 10 is a schematic configuration diagram of a conventional speech recognition device.

[Explanation of symbols]

21 ... CPU board
22 ... Telephone answering board
23 ... A/D conversion board
24 ... Internal bus
25 ... HDD device
26 ... Voice synthesizer
31 ... Network
32 ... Server

Continued from the front page. F-terms (reference): 5D015 KK01 5D045 AB24 AB30 5K015 AA00 AA06 AA07 GA00 GA07 5K024 AA76 BB01 BB02 CC01 DD01 EE01 EE09 FF06

Claims

1. A telephone voice recognition/response device comprising: a CPU board on which various memories are mounted; an A/D conversion board that converts an analog signal into a digital signal; a telephone answering board on which a network terminating device and a digital switch are mounted and which is provided with telephone line connection terminals; a conductor that electrically connects these boards; and a voice synthesizer that generates a synthesized voice signal from a signal output from the CPU board and supplies it to the telephone answering board, wherein an analog signal sent from a telephone line connected to a telephone line connection terminal of the telephone answering board is transmitted from the telephone answering board to the A/D conversion board and converted into a digital signal by the A/D conversion board, the digital signal is transmitted to the CPU board, where recognition processing is executed, and the recognition result is converted into synthesized voice by the voice synthesizer and then transmitted from the telephone answering board to the telephone line.

2. A telephone voice recognition system in which a plurality of telephone voice recognition/response devices according to claim 1 are connected to a network as clients, and in which each client, via the network, transmits recognition results to a server or receives a command from the server and automatically performs the corresponding processing.

3. A recording medium storing a telephone voice recognition system control program, the recording medium having recorded thereon a program for controlling a telephone voice recognition system, the program functioning as: means for using a status notification function provided by a server; means for not only notifying the server of status but also conveying commands from the server to a client via a return value; and means for controlling the client from the server.