JPH07110021B2

JPH07110021B2 - Interactive voice response device

Info

Publication number: JPH07110021B2
Application number: JP1205287A
Authority: JP
Inventors: 和洋五味; 宏之西; 順治小島
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1986-04-11
Filing date: 1987-01-21
Publication date: 1995-11-22
Anticipated expiration: 2010-11-22
Also published as: JPS6345950A

Description

[Detailed description of the invention]

[Industrial field of application]

The present invention relates to an interactive voice response device that responds to a caller in dialogue form.

[Conventional technology]

An interactive voice response device selects a predetermined voice message according to the input received from the line and sends that message out on the line. In such a device, the means of learning the intention of the user (the caller), that is, the means of input from the line, is either recognition of the user's speech or the exchange of non-voice information such as PB (push-button) signals.

FIG. 2 is a flowchart showing an example of processing that uses speech recognition. The figure uses separate block symbols for sending speech to the user and for recording the user's speech. As shown in the figure, to confirm the user's intention the device sends a response message that instructs the user to say, for example, "hai" ("yes"), and then recognizes the user's speech. If the recognition result is "hai", processing moves to the next step; otherwise the same response message is sent again.

FIG. 3 is a flowchart showing an example of processing that uses PB signals. As shown in the figure, this example sends a response message instructing the user to perform a predetermined PB operation that means "hai" ("yes"), for example "*0", and detects the PB signal. The rest of the processing is the same as in the speech recognition example.

[Problems to be solved by the invention]

With speech recognition, however, a user who means "hai" ("yes") may instead say a word outside the recognition vocabulary, such as "ee" or "dozo". The device then treats this as an input other than "hai" and sends the same response message again. In addition, as shown in FIG. 2, the dialogue must include redundant messages that dictate what to say, such as "If that is acceptable, please say 'hai'". Because this differs greatly from conversation between people, the device is hard to use.

Processing with PB signals has similar drawbacks. As shown in FIG. 3, the user must follow a fixed procedure, such as pressing "*0" for "hai" ("yes"), so operating errors cause response messages to be repeated, just as with speech recognition. Moreover, forcing such special PB operations on users places a very heavy burden on them.

Solving these problems requires an input means that recognizes the input from everyday words, with no restriction on vocabulary or number of words. With current speech recognition technology, however, the hardware becomes very large and expensive, which makes it difficult to apply to telephones and similar equipment.

In view of these problems, an object of the present invention is to provide a voice input response device that can respond to input speech uttered in everyday words and that has a simple structure.

[Means for solving problems]

The configuration of the present invention is described with reference to FIG. 1. The present invention is an interactive voice response device that receives input speech from a line, sends a response message based on it, and responds in dialogue form. The device comprises: input voice detecting means 1 for detecting input speech from the line; measuring means 2 for measuring the duration T of the input speech (the voice time length); voice time length threshold information storage means 3 for storing voice time length threshold information; response message information storage means 4 for storing a plurality of pieces of response message information; comparison means 5 for comparing the voice time length T with a voice time length threshold; and response message sending means 6 for selecting a response message based on the comparison result and outputting it to the line.

[Action]

The operation of the present invention is likewise described with reference to FIG. 1. Input speech from the other party on the line is detected by the input voice detecting means 1, and the measuring means 2 measures the voice time length T. FIG. 4 is an explanatory diagram showing the correlation between the content of the input speech and the voice time length; the horizontal axis is the voice time length T and the vertical axis is the frequency of occurrence (%). As the figure shows, the content of the input speech uttered after a response message is, in general, correlated with the voice time length T. The comparison means 5 therefore compares the voice time length T with the voice time length threshold, which allows an appropriate response message corresponding to the input speech to be selected. The response message sending means 6 sends the selected response message out on the line.

[Example]

An embodiment of the present invention is described with reference to FIGS. 5 and 6.

FIG. 5 is a block diagram showing an automatic response recording device according to one embodiment of the present invention. Reference numeral 7 denotes an incoming call detection unit connected to the office lines L₁ and L₂; 8, a control unit consisting of a 4-bit or 8-bit microcomputer; 9, a hook switch; 10, a loop control unit connected in parallel with the hook switch 9; 11, a call circuit unit connected to the office lines L₁ and L₂ via the loop control unit 10; 12, a response message sending unit connected to the transmitting terminals T₁ and T₂ of the call circuit unit 11; 13, a response message information storage unit that is connected to the response message sending unit 12 and stores a plurality of pieces of response message information; 14, a partner voice detection unit connected to the receiving terminals R₁ and R₂ of the call circuit unit 11; 15, a partner message recording unit likewise connected to the receiving terminals of the call circuit unit 11; 16, a clock for measuring the duration of the other party's speech or of silence; and 17, a time length threshold information storage unit that stores time length threshold information, namely the voice time length thresholds T_L and T_H (T_L < T_H), a silence determination threshold, and an utterance-end threshold.

The operation of the above structure is likewise described with reference to FIG. 5. When a call arrives, the incoming call detection unit 7 detects it and outputs an incoming call signal to the control unit 8. On receiving this signal, the control unit 8 waits a predetermined time and then operates the loop control unit 10 to close the loop, which completes the automatic call-answering operation.

FIG. 6 is a flowchart showing response control in the device of FIG. 5; the following operation is described with reference to FIGS. 5 and 6. After the loop is closed, the response message sending unit 12 is operated and sends the first message stored in advance in the response message information storage unit 13 (for example, "Yes, this is Suzuki") to the call circuit unit 11.

After the first message has been sent, the partner message recording unit 15 is operated to record the speech of the user (the caller) obtained from the receiving terminals R₁ and R₂ of the call circuit unit 11.

At the same time, the partner voice detection unit 14 monitors for the presence of the user's speech. The clock 16 measures the length of each silent interval. If speech is detected before this length exceeds the silence determination threshold, the silent interval is judged to be a pause between one speech segment and the next, and monitoring continues in order to detect the start of the next silent interval. Once the length of a silent interval exceeds the silence determination threshold, the user's message is judged to have ended, and selection of the next response message begins, based on the duration of that message (the voice time length).
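The pause-versus-end decision described above can be sketched as follows. This is only an illustration: the frame-based representation, the function name `find_message_end`, and the parameters `frame_ms` and `silence_threshold_ms` are assumptions made for the example and do not appear in the patent, which describes a hardware clock rather than sampled frames.

```python
from typing import Optional, Sequence


def find_message_end(
    voiced: Sequence[bool],
    frame_ms: int,
    silence_threshold_ms: int,
) -> Optional[int]:
    """Return the index of the frame at which the message is judged ended.

    `voiced[i]` is True when speech was detected in frame i (hypothetical
    stand-in for the voice detection unit). Returns None if no silent
    interval ever exceeds the threshold.
    """
    silent_ms = 0
    for i, is_voice in enumerate(voiced):
        if is_voice:
            # Speech resumed before the threshold was exceeded, so the
            # silence was only a pause: restart the silence timer.
            silent_ms = 0
        else:
            silent_ms += frame_ms
            if silent_ms > silence_threshold_ms:
                # Silence outlasted the threshold: the message has ended.
                return i
    return None
```

With 100 ms frames and a 200 ms threshold, a single silent frame between two voiced frames is treated as a pause, while three consecutive silent frames end the message.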

Suppose the first response message is "Yes, this is Suzuki". The user's message that follows can then be one of the following.

(1) The user asks back with "Eh?", "Hello?" or the like, because the response message could not be heard or its content seemed unnatural. In this case the voice time length T is extremely short.

(2) The user remains silent, for the same reasons as in (1).

(3) The message contains only the name of the person being called, as in "Is Mr. Ichiro Suzuki there?". In this case the voice time length T is relatively short.

(4) The message contains both the user's name and the name of the person being called, as in "This is Tanaka from Tokyo. Is Mr. Ichiro Suzuki there?". In this case the voice time length T is relatively long.

In this embodiment, therefore, the clock 16 measures the voice time length T, which is compared with the voice time length thresholds T_L and T_H. The message is thereby classified into one of three ranges: T ≤ T_L, corresponding to a query or silence; T_L < T < T_H, corresponding to a message that contains only the called person's name; and T_H ≤ T, corresponding to a message that contains both names.

When T ≤ T_L, the device judges that the user asked back or remained silent, and sends the first message again.

When T_L < T < T_H, the device judges that the message contained only the called person's name. It selects and sends a response message asking for the user's name, such as "Excuse me, but who is calling, please?", and records the user's speech. It then sends a response message such as "Please state your business" and records the user's speech again.

When T_H ≤ T, the device judges that the message contained both the user's name and the called person's name. Unlike the case T_L < T < T_H, it does not ask for the user's name; it sends the response message asking for the user's business and thereafter operates in the same way.
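The three-way selection above can be sketched in a few lines. The names `MessageClass`, `classify` and `next_prompts` are hypothetical, the prompt strings paraphrase the examples in the description, and the threshold values used in the usage note are illustrative assumptions; the patent does not give numeric values for T_L or T_H.

```python
from enum import Enum
from typing import List

# Prompt texts paraphrasing the examples in the description.
FIRST_MESSAGE = "Yes, this is Suzuki."
ASK_NAME = "Excuse me, but who is calling, please?"
ASK_BUSINESS = "Please state your business."


class MessageClass(Enum):
    QUERY_OR_SILENCE = "query_or_silence"  # T <= T_L
    CALLED_NAME_ONLY = "called_name_only"  # T_L < T < T_H
    BOTH_NAMES = "both_names"              # T_H <= T


def classify(t: float, t_low: float, t_high: float) -> MessageClass:
    """Classify a voice time length T against thresholds T_L < T_H."""
    if not t_low < t_high:
        raise ValueError("t_low must be smaller than t_high")
    if t <= t_low:
        return MessageClass.QUERY_OR_SILENCE
    if t < t_high:
        return MessageClass.CALLED_NAME_ONLY
    return MessageClass.BOTH_NAMES


def next_prompts(t: float, t_low: float, t_high: float) -> List[str]:
    """Return the response messages to send next, in order."""
    kind = classify(t, t_low, t_high)
    if kind is MessageClass.QUERY_OR_SILENCE:
        return [FIRST_MESSAGE]           # repeat the greeting
    if kind is MessageClass.CALLED_NAME_ONLY:
        return [ASK_NAME, ASK_BUSINESS]  # ask who is calling, then the business
    return [ASK_BUSINESS]                # both names given: skip the name question
```

For instance, with assumed thresholds of 0.5 s and 2.0 s, `next_prompts(1.2, 0.5, 2.0)` yields the name question followed by the business question.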

In this embodiment, therefore, the partner voice detection unit 14 monitors for the user's speech and the clock 16 measures the voice time length T. From these the device infers the content of the user's message and sends an appropriate response message in reply. This has the advantage of providing an automatic response recording device that is very easy for the user to use.

Next, another embodiment of the present invention is described.

An outline of this embodiment is first described with reference to FIG. 7, a flowchart showing its basic operation. The preceding embodiment judged the other party's message from its voice time length T: a message with T ≤ T_L contains neither the user's name nor the called person's name, a message with T_L < T < T_H contains the called person's name but not the user's name, and a message with T_H ≤ T contains both. As FIG. 4 shows, however, some messages with T_L < T < T_H contain the user's name but not the called person's name. Such messages have about the same voice time length T as those that contain only the called person's name, so the two are hard to tell apart from T alone. In this embodiment, as shown in FIG. 7, when T_L < T < T_H the device sends a back-channel response (aizuchi) such as "hai" ("yes"). If the user then speaks, the device judges that the earlier message contained only the user's name and that the later message contains only the called person's name. If the user does not speak, the device judges that the user's message contained only the called person's name.
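The disambiguation step for the middle range T_L < T < T_H can be sketched as below. The function name `resolve_mid_length`, the action labels, and the boolean `reply_detected` (standing in for whether speech followed the back-channel response) are assumptions made for illustration, not terms from the patent.

```python
from typing import List, Tuple


def resolve_mid_length(reply_detected: bool) -> Tuple[str, List[str]]:
    """Interpret a mid-length first message after a back-channel response.

    Call this after the device has sent a back-channel response such as
    "hai" and waited a fixed time for further speech.
    Returns (interpretation, follow-up actions in order).
    """
    if reply_detected:
        # The caller kept talking: the first message was the caller's own
        # name and the second names the person being called.
        return ("caller name, then called name", ["ask_business"])
    # No further speech: the first message named only the person being
    # called, so the caller's name is still unknown.
    return ("called name only", ["ask_caller_name", "ask_business"])
```

Either branch ends with the question about the caller's business; only the no-reply branch first asks who is calling.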

This embodiment is described in detail with reference to FIG. 8 and FIGS. 9(a), 9(b) and 9(c). FIG. 8 is a block diagram showing an automatic response recording device according to this other embodiment. The incoming call detection unit 7, control unit 8, loop control unit 10, call circuit unit 11, response message sending unit 12, response message information storage unit 13, partner voice detection unit 14, partner message recording unit 15, clock 16 and time length threshold information storage unit 17 are the same as in FIG. 5. Reference numeral 18 denotes a back-channel (aizuchi) sending unit connected to the transmitting terminals T₁ and T₂ of the call circuit unit 11. If the back-channel message information is instead stored in the response message information storage unit 13, the back-channel sending unit 18 can be omitted.

The operation of the above structure is described with reference to FIG. 8. When a call arrives, the device performs the same automatic call-answering operation as in the embodiment of FIG. 1, operates the partner message recording unit 15, and starts recording the message speech obtained from the receiving terminals R₁ and R₂ of the call circuit unit 11.

The following operation is described with reference to FIG. 8 and FIGS. 9(a), 9(b) and 9(c), which are flowcharts showing response control in the device of FIG. 8. As shown in FIG. 9(a), after the automatic call-answering operation is complete, the response message sending unit 12 is operated and sends the first response message stored in advance in the response message information storage unit 13 (for example, "Yes, this is ○○ Trading").

Next, the partner voice detection unit 14 is operated to monitor for the user's speech and its length. If the user says nothing before the silence determination threshold time stored in the time length threshold information storage unit 17 has elapsed, the device judges that the user is silent for a reason such as having failed to hear the first response message, and sends the first response message again. If the user still says nothing, the device repeats this operation a set number of times, then judges that the user does not intend to leave a message and moves to the termination process.
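The retry behaviour described above amounts to a bounded loop. In the sketch below, `send_prompt` and `wait_for_voice` are hypothetical callbacks standing in for the response message sending unit and the voice detection unit, and `max_attempts` stands in for the set number of repetitions, whose value the patent does not state.

```python
from typing import Callable


def prompt_until_voice(
    send_prompt: Callable[[], None],
    wait_for_voice: Callable[[], bool],
    max_attempts: int,
) -> bool:
    """Send a prompt up to `max_attempts` times until the caller speaks.

    `wait_for_voice` should return True if speech was detected before the
    silence determination threshold elapsed. Returns True when speech was
    detected, or False when the caller stayed silent every time and the
    device should move to its termination process.
    """
    for _ in range(max_attempts):
        send_prompt()
        if wait_for_voice():
            return True
    return False
```

The same helper fits the later step in which the message asking for the caller's business is repeated a fixed number of times before the device gives up.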

If, on the other hand, the partner voice detection unit 14 detects a message from the user, the voice time length T is measured as follows. As shown in FIG. 9(b), the clock 16 is first started to begin measuring the voice time length T of the user's message. When the partner voice detection unit 14 judges that the message has gone silent, the clock 16 is temporarily stopped, the voice time length T counted so far is saved, and measurement of the silent time begins. If the silence continues for at least the utterance-end threshold time stored in the time length threshold information storage unit 17, the user's message is judged to have ended at that point. If the partner voice detection unit 14 detects speech from the user before the silence has lasted that long, the saved voice time length T is loaded back into the clock 16 and accumulation of T resumes.
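The save-and-reload handling of the clock means that T accumulates only voiced time, with pauses excluded. A minimal sketch under assumed conditions: speech activity arrives as fixed-length frames, and the names `measure_voice_duration`, `frame_ms` and `end_threshold_ms` are hypothetical.

```python
from typing import Sequence


def measure_voice_duration(
    voiced: Sequence[bool],
    frame_ms: int,
    end_threshold_ms: int,
) -> int:
    """Accumulate voiced time in ms until silence reaches the end threshold.

    `voiced[i]` is True when speech was detected in frame i. Pauses
    shorter than `end_threshold_ms` are skipped and do not count toward
    the total; a silence of at least `end_threshold_ms` ends the message.
    """
    total_ms = 0   # voice time length T accumulated so far
    silent_ms = 0  # length of the current silent interval
    for is_voice in voiced:
        if is_voice:
            total_ms += frame_ms  # resume accumulating T
            silent_ms = 0
        else:
            silent_ms += frame_ms  # T is held while silence is timed
            if silent_ms >= end_threshold_ms:
                break              # message judged to have ended
    return total_ms
```

Keeping two counters mirrors the description, in which the single clock 16 alternates between timing speech and timing silence and T is saved in between.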

Returning to FIG. 9(a): once the end of the user's message has been confirmed, the control unit 8 reads the voice time length T from the clock 16 and compares it with the voice time length thresholds T_L and T_H stored in the time length threshold information storage unit 17. Subsequent processing depends on the result of the comparison, as follows.

(1) When T_H ≤ T

The user's first message can be judged to state both the user's name and the name of the person being called. The device therefore selects from the response message information storage unit 13 a response message asking for the user's business (for example, "I am out at the moment, so if you have a message, please go ahead") and sends it to the office line.

(2) When T_L < T < T_H

The user's first message can be presumed to state either only the user's name or only the name of the person being called. The back-channel sending unit 18 therefore sends a back-channel response (for example, "hai") to the office line, after which the partner voice detection unit 14 starts monitoring the user's speech. Subsequent processing divides into the following two cases, according to whether the user speaks.

(2)-1 When the user's speech is detected before a fixed time has elapsed.

The user's first message can be judged to have stated only the user's name, and the user's second message to state the name of the person being called. After confirming the end of the user's message as described earlier, the device selects from the response message information storage unit 13 a message asking for the user's business and sends it to the office line.

(2)-2 When the user's speech is not detected even after a fixed time has elapsed.

The user's first message can be judged to have stated only the name of the person being called. The device therefore selects from the response message information storage unit 13 a message asking for the user's name (for example, "Excuse me, but who is calling, please?") and outputs it to the office line. Then, after confirming the end of the user's message, it selects from the response message information storage unit 13 a message asking for the user's business and sends it to the office line.

(3) When T ≤ T_L

The user's first message can be judged to state neither the user's name nor the name of the person being called. The back-channel sending unit 18 therefore selects a back-channel response and outputs it to the office line, and the processing for the user's first message is then repeated.

As a result of the above processing, the device in every case ends up sending the message that asks for the user's business. If the user does not speak in response to this message, the device repeats it a fixed number of times, as it does for the first response message, and then moves to the termination process. If the user's speech is detected, then after confirming the end of the user's message (FIG. 9(c)) the device selects from the response message information storage unit 13 a message marking the end of the dialogue (for example, "Thank you very much") and sends it to the office line. It then operates the loop control unit 10 to open the loop, stops the partner message recording unit 15, and ends all processing.

In this way, the partner voice detection unit 14 and the clock 16 extract duration information from the user's message, and the meaning of the message is estimated from the result. In addition, the meaning of messages that cannot be determined from duration information alone can be estimated and understood accurately. Once the meaning of the user's message is understood, an appropriate response message can be output in reply, which has the advantage of realizing an interactive response with a good man-machine interface.

[Effect of the invention]

As described above, according to the present invention the response message to be sent next is selected on the basis of the duration of the input speech, so that a response message corresponding to the content of the input speech can be sent. This makes it possible to realize an interactive voice response device that accepts input spoken in everyday words.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of the present invention. FIG. 2 is a flowchart showing an example of processing that uses speech recognition. FIG. 3 is a flowchart showing an example of processing that uses PB signals. FIG. 4 is an explanatory diagram showing the correlation between the content of input speech and its duration. FIG. 5 is a block diagram showing an automatic response recording device according to one embodiment. FIG. 6 is a flowchart showing response control in the device of FIG. 5. FIG. 7 is a flowchart showing the basic operation of another embodiment. FIG. 8 is a block diagram showing an automatic response recording device according to the other embodiment. FIGS. 9(a), 9(b) and 9(c) are flowcharts showing response control in the device of FIG. 8.

1: input voice detecting means; 2: measuring means; 3: voice time length threshold information storage means; 4: response message information storage means; 5: comparison means; 6: response message sending means; 7: incoming call detection unit; 8: control unit; 10: loop control unit; 11: call circuit unit; 12: response message sending unit; 13: response message information storage unit; 14: partner voice detection unit; 15: partner message recording unit; 16: clock; 17: time length threshold information storage unit; 18: back-channel (aizuchi) sending unit.

Claims

[Claims]

1. An interactive voice response device that receives input speech from a line, sends a response message based on the input speech, and responds in dialogue form, the device comprising: input voice detecting means (1) for detecting input speech from the line; measuring means (2) for measuring the duration of the input speech; voice time length threshold information storage means (3) for storing voice time length threshold information; response message information storage means (4) for storing a plurality of pieces of response message information; comparison means (5) for comparing the duration of the input speech with a voice time length threshold; and response message sending means (6) for selecting a response message based on the comparison result signal and sending it to the line.