JP2010204260A

JP2010204260A - Interactive device

Info

Publication number: JP2010204260A
Application number: JP2009047872A
Authority: JP
Inventors: Shoji Onofuji; 祥司尾野藤
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2009-03-02
Filing date: 2009-03-02
Publication date: 2010-09-16

Abstract

PROBLEM TO BE SOLVED: To detect the distance between an interactive device and an operator by using the voice input means and voice output means that the device already has for dialogue processing.

SOLUTION: A reception terminal 20 includes a microphone 207 for inputting voice and a speaker 208 for outputting voice. Based on noise information acquired through the microphone 207 from noise generated in the surroundings, pseudo noise for distance detection is output via the speaker 208. Reflected sound information, including the corresponding amplitude or frequency, is acquired from the sound of the pseudo noise reflected by an object and input via the microphone 207. Predetermined calculation processing is performed on the acquired reflected sound information, and the distance to a visitor M is detected on the presumption that the object is the visitor M. Based on the detection result, reception processing through dialogue with the visitor M is started.

COPYRIGHT: (C)2010,JPO&INPIT

Description

The present invention relates to an interactive device that an operator can operate through spoken dialogue.

Interactive devices that an operator can operate in an interactive manner, such as a reception device that handles reception work for visitors to a building, are already known. In such an interactive device, it is preferable that the distance from the device to the operator can be detected accurately and without contact, for example so that the presence or absence of the operator within a predetermined distance range can be used as a trigger for starting or ending processing, or so that the speech recognition accuracy for the operator's utterances can be improved.

For such non-contact distance detection, the prior art described in Patent Document 1, for example, is known. In this prior art, an ultrasonic pulse is generated and output toward an object, and the wave reflected by the object (the echo pulse) is detected. The transmission time of the ultrasonic pulse is then calculated, and the distance to the object is detected based on that transmission time.

Patent Document 1: JP 2005-351897 A

However, applying an ultrasonic distance detection method like that of the above prior art to an interactive device has the problem that a sensor or microphone dedicated to distance detection must be newly provided.

An object of the present invention is to provide an interactive device that can detect the distance to an operator without newly providing a dedicated sensor or microphone.

In order to achieve the above object, the first invention is an interactive device that an operator can operate in an interactive manner, comprising: voice input means for inputting voice; voice output means for outputting voice; sound acquisition means for acquiring sound information, including a corresponding amplitude or frequency, from sound input via the voice input means; pseudo sound output means for outputting a pseudo sound for distance detection via the voice output means, based on the sound information acquired by the sound acquisition means, within a predetermined time after the voice input means receives the sound; reflected sound acquisition means for acquiring reflected sound information, including a corresponding amplitude or frequency, from the sound of the pseudo sound reflected by an object and input via the voice input means; distance detection means for performing predetermined calculation processing based on the reflected sound information acquired by the reflected sound acquisition means and detecting the distance to the operator on the presumption that the object is the operator; and dialogue processing control means for starting dialogue processing with the operator based on the detection result of the distance detection means.

In the interactive device of the first invention, the distance to the operator is detected using sound. That is, sound generated around the device (so-called noise) is input via the voice input means, and the corresponding sound information is acquired by the sound acquisition means. Based on this sound information, the pseudo sound output means outputs a pseudo sound for distance detection via the voice output means. The output pseudo sound propagates toward an object, its reflected sound is input via the voice input means, and the corresponding reflected sound information is acquired by the reflected sound acquisition means. The time from when the pseudo sound is emitted until its reflected sound returns is proportional to the distance from the device to the object. When an operator is present, the sound reflected by the operator as the object is input via the voice input means, and the above time is proportional to the distance from the device to the operator. The distance detection means therefore detects the distance to the operator based on the reflected sound information, on the presumption that the object is the operator. After this distance detection is finished, the dialogue processing control means starts dialogue processing with the operator based on the detection result, so that the dialogue processing can be performed reliably.

As described above, in the first invention the distance to the operator can be detected using sound that is input and output via the voice input means and the voice output means. That is, by making use of the voice input means (a microphone or the like) and voice output means (a speaker or the like) that are already provided for dialogue processing, distance detection can be performed without newly providing a separate distance detection sensor, dedicated microphone, or the like.

In addition, because a pseudo sound based on sound generated around the device (so-called noise) is used for distance detection, the distance can be detected without the operator noticing that sound is being used for detection.

The second invention is characterized in that, in the first invention, the device has pseudo sound generation means that performs predetermined processing on the sound information acquired by the sound acquisition means and generates the corresponding pseudo sound, and the pseudo sound output means outputs the pseudo sound generated by the pseudo sound generation means.

As a result, besides performing distance detection with the sound generated around the device (so-called noise) used as it is, it becomes possible to use only the part of the noise within a predetermined range (a level range or a time range), or to use noise to which various kinds of processing have been applied. This expands the variety of sounds usable for distance detection and so improves applicability to various uses.

The third invention is characterized in that, in the second invention, the pseudo sound generation means generates the pseudo sound based on the part of the sound information that exceeds a predetermined threshold level.

When a pseudo sound is generated from noise, if the level of the original noise is too low, the level of the output pseudo sound is also low and its reflected sound becomes difficult to detect. In the third invention, therefore, the pseudo sound is generated only from noise that exceeds a predetermined threshold level, which avoids the problem caused by an insufficient level and allows reliable distance detection.

The fourth invention is characterized in that, in the third invention, the pseudo sound generation means generates the pseudo sound based on the sound information within a predetermined time range.

When a pseudo sound is generated from sound information, if the level of the original noise is too low, the level of the output pseudo sound is also low and its reflected sound becomes difficult to detect. In the fourth invention, therefore, the pseudo sound is generated only from noise within a predetermined time range. For noise that rises sharply at first and then decays rapidly, such as the sound of a door closing or of an object being put down, this makes it possible to cut out in time only the initial high-level portion before the decay and to generate the pseudo sound from that portion. This avoids the problem caused by an insufficient level, as described above, and allows reliable distance detection.
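The time-range limitation described above can be illustrated with a minimal sketch. The function name `crop_loud_onset`, its parameters, and the choice to start the window at the first sample whose absolute amplitude exceeds a level are all assumptions made for illustration; the text does not specify how the time range is chosen.

```python
def crop_loud_onset(samples, sample_rate_hz, level, window_s):
    """Return the slice of `samples` that starts at the first sample whose
    absolute amplitude exceeds `level` and lasts `window_s` seconds.

    Hypothetical helper: returns an empty list if no sample exceeds `level`.
    """
    window_len = int(round(window_s * sample_rate_hz))
    for i, x in enumerate(samples):
        if abs(x) > level:
            # Keep only the initial high-level portion of the decaying noise.
            return samples[i:i + window_len]
    return []


# Usage: a sharp onset followed by rapid decay (e.g. a door closing).
noise = [0.0, 0.0, 0.9, 0.6, 0.3, 0.1, 0.05]
segment = crop_loud_onset(noise, sample_rate_hz=1000, level=0.5, window_s=0.003)
```

Here only the three samples at the start of the onset would be kept as the basis for the pseudo sound, and the quiet tail is discarded.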

The fifth invention is characterized in that, in any one of the first to fourth inventions, the device has output control means that performs control so that output of the pseudo sound is executed again once a predetermined period has elapsed after the dialogue processing based on the control of the dialogue processing control means has ended.

Once the distance to the operator has been determined, the dialogue processing has been performed, and some time has passed since that dialogue processing ended, it is highly likely that the operator who was in the dialogue has already moved elsewhere and that no one (or a different operator) is near the device. In the fifth invention, the output control means therefore performs control so that output of the pseudo sound is executed again once a predetermined period has elapsed after the end of the dialogue processing. This ensures that distance detection for the next operator is carried out reliably.

According to the present invention, the distance to the operator can be detected without newly providing a dedicated sensor or microphone.

A system configuration diagram showing the overall configuration of a visitor reception system according to an embodiment of the present invention.
A functional block diagram showing the functional configuration of the entire visitor reception system.
A diagram showing an example of a display screen on the display unit.
A functional block diagram showing the functional configuration of the reception terminal.
A functional block diagram showing the functional configuration of the DB server.
An explanatory diagram outlining the procedure up to output of the pseudo noise from the speaker.
An explanatory diagram outlining the method of detecting the distance to a visitor.
An explanatory diagram outlining the procedure up to resumption of noise information acquisition.
A flowchart showing the control procedure executed by the control circuit unit of the reception terminal.
An explanatory diagram for describing a modification in which pseudo noise information is generated only from noise information within a predetermined time range.

An embodiment of the present invention is described below with reference to the drawings. This embodiment shows a case in which the interactive device of the present invention is applied to a visitor reception system that handles reception work for visitors to, for example, an office building, a company, or another building.

(A) Basic Configuration of System
FIG. 1 is a system configuration diagram showing the overall configuration of the visitor reception system of this embodiment.

In FIG. 1, the visitor reception system 1 is installed, for example, near the entrance of a company and has a reception terminal 20 (interactive device) that an operator M (in this example, a visitor to the company) can operate in an interactive manner. The reception terminal 20 is provided with a microphone 207 (voice input means) for inputting voice and a speaker 208 (voice output means) for outputting voice.

The reception terminal 20 performs dialogue processing with the visitor M (in this example, reception processing through dialogue with the visitor M) and detects the distance to the visitor M using sound (noise, pseudo noise, reflected sound, and so on, described later). In this embodiment, the distance from the reception terminal 20 to the visitor M is detected by outputting a sound for distance detection (the pseudo noise described later) from the speaker 208 and measuring the time required for that pseudo noise to be reflected by the visitor M and for the reflected sound to be input to the microphone 207. Because this required time is proportional to the distance to the visitor M, the distance can be detected from it. That is, if the distance to the visitor M is L and the required time is t, the relationship
L = c × t / 2 ... (Formula 1)
holds (details are described later with reference to FIG. 7). Here, c is the speed of sound (about 340 [m/s], although it varies with the density and pressure of the air acting as the medium).

Solving (Formula 1) above gives the distance to the visitor M. When the detected distance becomes equal to or less than a predetermined value (corresponding to the distance at which reception processing is possible, for example 1 [m]), reception processing is started (details are described later).
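As a minimal sketch of how (Formula 1) and the start condition fit together, the following uses the approximate speed of sound and the 1 m example value given in the text. The function names and the default arguments are assumptions made for illustration, not part of the described device.

```python
SPEED_OF_SOUND_M_PER_S = 340.0  # approximate value quoted in the text


def distance_from_round_trip(t_seconds, c=SPEED_OF_SOUND_M_PER_S):
    """Apply Formula 1: L = c * t / 2, where t is the round-trip time."""
    if t_seconds < 0:
        raise ValueError("round-trip time must be non-negative")
    return c * t_seconds / 2.0


def should_start_reception(distance_m, limit_m=1.0):
    """True when the detected distance is at or below the predetermined value."""
    return distance_m <= limit_m


# Usage: a 5 ms round trip corresponds to roughly 0.85 m, which is within 1 m.
distance = distance_from_round_trip(0.005)
start = should_start_reception(distance)
```

A round trip of 10 ms would give roughly 1.7 m, so under these example values reception processing would not yet start.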

As shown in FIG. 1, the reception terminal 20 has a display unit 210, the microphone 207, and the speaker 208. The display unit 210 is, for example, a liquid crystal display. In this example it is supported via an arm 211 on a horizontally installed base 212, and its surface faces obliquely upward so as to be perpendicular to the line of sight of the visitor M. The microphone 207 is arranged in a substantially arc shape on the base 212 with its tip directed toward the visitor M.

The display unit 210 may be a touch panel so that the visitor M can operate the displayed screen by touching it directly.

FIG. 2 is a functional block diagram showing the functional configuration of the entire visitor reception system 1.

In FIG. 2, the visitor reception system 1 has the reception terminal 20, a DB server 10 implemented by a well-known personal computer, a plurality of (in this example, two) IP telephones 60 provided for respective employees of the company, and an IP-PBX (Internet Protocol Private Branch Exchange) 50, which is a well-known switching device that performs line switching for the IP telephones 60. All of these are connected via a router 40.

The reception terminal 20 has a terminal body 20A and, connected to the terminal body 20A, the display unit 210, the microphone 207, and the speaker 208.

The microphone 207 converts input sound into audio information and outputs it to the terminal body 20A. In this embodiment, the input sound includes, for example, speech uttered by the visitor M and noise generated around the reception terminal 20 (for example, air-conditioning sounds, the sound of a door closing, the sound of an object being put down, and footsteps).

The speaker 208 converts the audio signal input from the terminal body 20A into a notification sound for the visitor M (guidance voice) or into pseudo noise for distance detection (the pseudo sound, described in detail later) and outputs it.

FIG. 3 is a diagram showing an example of a display screen on the display unit 210. On this screen, when the reception processing described later is started, a virtual person IM who performs the reception work, generated by the drawing program described later, is displayed together with an office-style background G. A sentence B corresponding to the speech uttered from the speaker 208 (abbreviated as "***" in the figure) is also displayed.

FIG. 4 is a functional block diagram showing the functional configuration of the reception terminal 20.

In FIG. 4, the terminal body 20A of the reception terminal 20 has a control circuit unit 200, an input/output (I/O) interface 204, a hard disk drive (HDD) 205, and a timer 209 serving as time measuring means.

The control circuit unit 200 includes a CPU 201, a ROM 202 that stores the programs needed for basic operation of the reception terminal 20 and their setting values, and a RAM 203 that temporarily stores various data. The CPU 201 controls the overall operation of the reception terminal 20 according to the programs stored in the ROM 202 and the HDD 205.

The CPU 201, the HDD 205, the timer 209, the display unit 210, the microphone 207, the speaker 208, and a network (NW) card 206 are connected to the I/O interface 204.

The HDD 205 has a plurality of storage areas, including a language model storage area 252, a dictionary storage area 253, and a program storage area 256.

The language model storage area 252 stores, as language models, patterns of acceptable sentences used to recognize utterances by the visitor M. These patterns are created in advance for the various situations expected in the dialogue between the reception terminal 20 and the visitor M.

The dictionary storage area 253 stores a word dictionary used for speech recognition together with the language models, a visitor dictionary used as appropriate, together with the language models and the word dictionary, for speech recognition that identifies the visitor M, and the like.

The program storage area 256 stores, for example, a plurality of programs for controlling the various operations of the reception terminal 20. The stored programs include, for example, a system program that controls the basic operation of the reception terminal 20, a communication program that controls communication with the DB server 10, a drawing program that generates images to be displayed on the display unit 210, a speech recognition program that executes the speech recognition described above, a DB collation program for accessing and collating against the database of the DB server 10, a speech synthesis program, a dialogue control program, a telephone connection program for connecting to the IP telephones 60 and the IP-PBX 50, and a distance detection program that controls the distance detection described above.

Although not shown, the HDD 205 also stores a well-known acoustic model generally used in speech recognition processing, setting values used in various kinds of processing, and the like. Although it is not described in detail, the acoustic model is a statistical model of the acoustic features of speech. For each vowel and consonant, for example, it is expressed as acoustic features (for example, frequency characteristics) and the corresponding phoneme.

The NW card 206 is an expansion card that is connected to the router 40 and enables data to be sent to and received from the DB server 10 and other devices.

FIG. 5 is a functional block diagram showing the functional configuration of the DB server 10.

As shown in FIG. 5, the DB server 10 has a CPU 101, a ROM 102 and a RAM 103 each connected to the CPU 101, an input/output (I/O) interface 104 connected to the CPU 101, and a mouse controller 106, a key controller 107, a video controller 108, a communication device 109, and a hard disk drive (HDD) 150, each connected to the I/O interface 104.

The ROM 102 stores various programs for operating the DB server 10, including the BIOS. The RAM 103 temporarily stores various data. The CPU 101 controls the DB server 10 as a whole according to the programs stored in the ROM 102 and in the HDD 150 described later.

A mouse 116, a keyboard 117, and a display 118 are connected to the mouse controller 106, the key controller 107, and the video controller 108, respectively. The communication device 109 is connected to the router 40 and enables data to be sent to and received from external devices such as the reception terminal 20.

The HDD 150 has a plurality of storage areas, including a visitor reservation database (DB) storage area 151 that stores visitor information, an employee database (DB) storage area 155 that stores employee information, and a program storage area 156.

The program storage area 156 stores various programs, such as a system program and a communication program, that cause the DB server 10 to execute various kinds of processing. These programs are, for example, stored on a CD-ROM, installed via a CD-ROM drive (not shown), and stored in the program storage area 156. Alternatively, programs downloaded from outside the system via an appropriate network may be stored.

(B) Flow until Start of Reception Processing
The main features of this embodiment, configured as described above, are that pseudo noise for distance detection is output via the speaker 208 based on noise information corresponding to noise input via the microphone 207, that the distance to the visitor M is detected based on reflected sound information corresponding to the sound of that pseudo noise reflected by the visitor M and input via the microphone 207, and that reception processing is started when the detected distance becomes equal to or less than a predetermined value. The details are described in order below with reference to FIGS. 6 and 7.

FIG. 6 is an explanatory diagram outlining the procedure up to output of the pseudo noise from the speaker 208.

FIG. 6(a) schematically shows the procedure for generating pseudo noise from noise input to the microphone 207. As shown in FIG. 6(a), when noise occurs around the reception terminal 20 (in this example, the sound of a door 30 installed at a predetermined place in the company closing), the noise propagates and is input to the microphone 207, and noise information (sound information) including the corresponding amplitude or frequency is acquired. At this point, it is checked whether the acquired noise information exceeds a predetermined threshold level (for example, by converting it to power with a short-time Fourier transform). If the acquired noise information exceeds the threshold level (its power is high), pseudo noise (pseudo sound) for distance detection is generated based on that noise information.

If the acquired noise information does not exceed the threshold level (its power is low), it would be difficult to detect the reflected sound of the pseudo noise described later, so the noise information is not used to generate pseudo noise and noise information is acquired again. In this way, generation of the pseudo noise is limited to noise information that exceeds the threshold level (has high power). In other words, the part of the noise information that exceeds the threshold level is cut out and used to generate the pseudo noise.
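The threshold check can be sketched as follows. Note that this sketch substitutes a simpler measure for the one mentioned in the text: it computes mean-square power per frame in the time domain rather than via a short-time Fourier transform. The function names, the non-overlapping framing, and the example threshold are assumptions made for illustration.

```python
def frame_powers(samples, frame_len):
    """Mean-square power of each complete, non-overlapping frame."""
    powers = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        powers.append(sum(x * x for x in frame) / frame_len)
    return powers


def exceeds_threshold(samples, frame_len, threshold):
    """True if at least one frame's power exceeds the threshold level."""
    return any(p > threshold for p in frame_powers(samples, frame_len))


# Usage: quiet background noise is rejected, a loud burst is accepted.
quiet = [0.01] * 8
loud = [0.0] * 4 + [0.5] * 4
use_quiet = exceeds_threshold(quiet, frame_len=4, threshold=0.1)
use_loud = exceeds_threshold(loud, frame_len=4, threshold=0.1)
```

When the check fails, the flow described above simply goes back to acquiring noise information; only noise that passes the check is used as the basis for the pseudo noise.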

FIG. 6(b) schematically shows the state in which the pseudo noise has been output from the speaker 208. As shown in FIG. 6(b), the pseudo noise for distance detection generated as described above is output from the speaker 208. This pseudo noise is a sound resembling the noise generated (at the door 30) in FIG. 6(a), or it may be a processed version of that sound. At almost the same time as the pseudo noise is output, the timer 209 (see FIG. 4) is started. This begins measurement of the time required from when the pseudo noise is output from the speaker 208 until the pseudo noise is reflected by the visitor M and the reflected sound (that is, the sound of the pseudo noise reflected by the visitor M, hereinafter simply "reflected sound") is input to the microphone 207 (hereinafter simply "required time").

FIG. 7 is an explanatory diagram outlining the method of detecting the distance to the visitor M.

前述のようにして疑似雑音がスピーカ２０８より出力されると、この疑似雑音は、所定の距離範囲（伝搬可能な距離範囲。パワーによって異なる）に伝搬される。このとき、当該範囲内に来訪者Ｍが存在すると、上記疑似雑音は、図７に示すように、来訪者Ｍにより反射し、その反射音が伝搬してマイク２０７に入力され、対応する反射音情報が取得される。このようにしてマイク２０７に反射音が入力されると、タイマ２０９によって行われていた上記所要時間の測定が終了する。すなわち、このときのタイマ２０９の測定値が上記所要時間となる。 When the pseudo noise is output from the speaker 208 as described above, the pseudo noise is propagated to a predetermined distance range (distance range in which propagation is possible, which varies depending on power). At this time, if there is a visitor M within the range, the pseudo noise is reflected by the visitor M as shown in FIG. 7, and the reflected sound is propagated and input to the microphone 207, and the corresponding reflected sound. Information is acquired. When the reflected sound is input to the microphone 207 in this way, the measurement of the required time performed by the timer 209 ends. That is, the measured value of the timer 209 at this time is the required time.

Here, the pseudo noise and its reflected sound are both sound waves, so they propagate between the reception terminal 20 and the visitor M at the speed of sound. The required time is the round-trip propagation time of the sound wave between the reception terminal 20 and the visitor M (more precisely, the sum of the propagation time of the pseudo noise from the speaker 208 to the visitor M and the propagation time of the reflected sound from the visitor M to the microphone 207). The product of the speed of sound and half the required time (half corresponding to the one-way propagation time) is therefore the distance from the reception terminal 20 to the visitor M. Accordingly, the distance from the reception terminal 20 to the visitor M can be detected (calculated) by solving (Formula 1) above (see FIG. 1).

For example, if the speed of sound is 346.5 [m/s] and the measured value of the timer 209 (= the required time) is 2.0 [msec], the distance L to the visitor M is:
L = 346.5 × 2.0 × 10⁻³ / 2 = 346.5 × 10⁻³ [m] ≈ 35 [cm]
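The calculation above can be sketched in a few lines of Python. This is only an illustration under the stated example values; the function name `distance_from_round_trip` and the constant name are assumptions introduced here and do not appear in the original disclosure.

```python
# Speed of sound used in the worked example of the description [m/s].
SPEED_OF_SOUND_M_PER_S = 346.5


def distance_from_round_trip(required_time_s, speed_of_sound=SPEED_OF_SOUND_M_PER_S):
    """Return the one-way distance in metres for a measured round-trip time.

    The required time covers speaker -> object -> microphone, so only half
    of it corresponds to the one-way propagation time.
    """
    if required_time_s < 0:
        raise ValueError("required time must be non-negative")
    return speed_of_sound * required_time_s / 2.0


# Timer value of 2.0 msec, as in the example above.
distance_m = distance_from_round_trip(2.0e-3)
print(f"distance: {distance_m:.4f} m")
```

Passing a different `speed_of_sound` would let the same sketch account for another ambient temperature, although the description itself uses a single fixed value.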

以上のようにして検出された距離Ｌが、所定値（受付処理可能な距離に相当。例えば１［ｍ］）以下となったら、受付処理が開始される。 When the distance L detected as described above is equal to or less than a predetermined value (corresponding to a distance that can be accepted, for example, 1 [m]), the acceptance process is started.

In the distance detection described above, acquisition of the reflected sound information in this example does not start until a predetermined minimum sound wave reception time has elapsed from the start of the timer measurement. The minimum sound wave reception time is the time required for the pseudo noise output from the speaker 208 to enter the microphone 207 directly, without being reflected by the visitor M (that is, the so-called wraparound of the pseudo noise from the speaker 208 to the microphone 207). For example, if the distance between the speaker 208 and the microphone 207 is 30 [cm], the minimum sound wave reception time is 1.73 [msec]. No reflected sound is input to the microphone 207 until the time measured by the timer 209 passes the minimum sound wave reception time. Therefore, by waiting until the minimum sound wave reception time has elapsed before starting to acquire reflected sound information, unwanted sound entering the microphone 207 (the wraparound pseudo noise) can be excluded from the check of whether a reflected sound has been input (performed in step S80 of FIG. 9, described later).

In addition, when a predetermined maximum sound wave reception time has elapsed from the start of the timer measurement, acquisition of the reflected sound information ends and acquisition of noise information starts again. The maximum sound wave reception time is the time required for the pseudo noise output from the speaker 208 to be reflected by a visitor M located at the maximum distance at which the reception terminal 20 can perform reception processing, and for that reflected sound to be input to the microphone 207. For example, if the maximum distance is 100 [cm], the maximum sound wave reception time is 5.77 [msec]. A reflected sound input to the microphone 207 after the maximum sound wave reception time has elapsed was reflected by an object (not necessarily the visitor M) located farther away than the maximum distance. By ending the acquisition of reflected sound information once the time measured by the timer 209 passes the maximum sound wave reception time, unwanted reflected sound information, that is, information obtained from sound reflected by an object beyond the maximum distance, can be excluded from distance detection (performed in step S100 of FIG. 9, described later).
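The reception window defined by the minimum and maximum sound wave reception times can be illustrated as follows. This is a minimal sketch: the names `round_trip_time_ms` and `echo_in_window` are hypothetical, the minimum time is simply taken as the 1.73 [msec] value stated in the text, and treating both window boundaries as inclusive is an assumption.

```python
SPEED_OF_SOUND_M_PER_S = 346.5  # value used in the description's example


def round_trip_time_ms(distance_m, speed_of_sound=SPEED_OF_SOUND_M_PER_S):
    """Time in msec for sound to travel to an object and back."""
    return 2.0 * distance_m / speed_of_sound * 1000.0


def echo_in_window(elapsed_ms, min_ms, max_ms):
    """True if a sound arriving at elapsed_ms falls inside the reception window."""
    return min_ms <= elapsed_ms <= max_ms


# Minimum time as stated in the text for a 30 cm speaker-microphone spacing.
MIN_RECEPTION_MS = 1.73
# Maximum time derived from the 100 cm maximum reception distance.
MAX_RECEPTION_MS = round_trip_time_ms(1.0)

print(round(MAX_RECEPTION_MS, 2))                                 # 5.77
print(echo_in_window(2.0, MIN_RECEPTION_MS, MAX_RECEPTION_MS))    # True
```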

(C) Flow from the start of the reception process until acquisition of noise information resumes
When the reception process is started as described above, a predetermined voice (guidance voice; for example, "Welcome. May I ask who you are?") is output from the speaker 208, and at the same time a predetermined display screen (for example, the one shown in FIG. 3 described above) is displayed on the display unit 210. When the visitor M speaks to the reception terminal 20 in response to the voice and the display, the corresponding voice is input through the microphone 207. In this way, the visitor M performs the reception operation interactively (while referring to the display screen of the display unit 210).

また、このようにして受付処理が開始された場合、（受付処理を行っている間は）、上記図６（ａ）に示した雑音情報の取得が再開されない（あるいは、図６（ａ）のように雑音情報は取得されるが、図６（ｂ）のような疑似雑音の出力は行われない）ようになっている。すなわち、受付処理中においては、先に来訪者Ｍまでの距離検出のために使用していたマイク２０７及びスピーカ２０８が、受付処理（来訪者Ｍとの対話）に使用されることになる。 In addition, when the reception process is started in this way (when the reception process is being performed), the acquisition of the noise information illustrated in FIG. 6A is not resumed (or as illustrated in FIG. 6A). Thus, although the noise information is acquired, the pseudo noise is not output as shown in FIG. 6B). That is, during the reception process, the microphone 207 and the speaker 208 previously used for detecting the distance to the visitor M are used for the reception process (dialogue with the visitor M).

FIGS. 8A to 8C show the state after the above reception process has ended. When the reception process ends as shown in FIG. 8A, the visitor M leaves the vicinity of the reception terminal 20 and moves elsewhere, so that no one is near the reception terminal 20 (FIG. 8B). In other words, a short while after the reception process ends, the microphone 207 and the speaker 208 are no longer used for the reception process (dialogue with the visitor M). Then, once a predetermined period (for example, 10 seconds) has elapsed after the end of the reception process, acquisition of the noise information resumes as shown in FIG. 8C (or, from the state in which noise information is acquired but pseudo noise is not output, output of pseudo noise resumes). The device thus returns to the state of FIG. 6A.

(D) Control procedure
FIG. 9 is a flowchart showing the control procedure executed by the control circuit unit 200 of the reception terminal 20 to realize the operation described above. The processing shown in this flow is executed by the CPU 201 in accordance with the program group for visitor reception processing (the aforementioned system program, drawing program, voice recognition program, dialogue control program, distance detection program, and so on) stored in the program storage area 256 of the HDD 205.

図９において、例えば受付端末２０の電源ＯＮによって、このフローが開始される（「ＳＴＡＲＴ」位置）。まずステップＳ１０で、所定の初期化処理を実行する。 In FIG. 9, for example, this flow is started when the reception terminal 20 is turned on (“START” position). First, in step S10, a predetermined initialization process is executed.

そして、ステップＳ２０において、マイク２０７及びＩ／Ｏインタフェイス２０４を介して入力した音（雑音）により、対応する振幅あるいは周波数を含む上記雑音情報を取得する（音取得手段としての機能）。 In step S20, the noise information including the corresponding amplitude or frequency is acquired from the sound (noise) input via the microphone 207 and the I / O interface 204 (function as sound acquisition means).

その後、ステップＳ３０で、上記ステップＳ２０で取得した雑音情報のレベルが、所定のしきい値レベルを超えたか否かを判定する。雑音情報がしきい値レベルを超えていない場合には、判定が満たされず上記ステップＳ２０に戻り、同様の手順を繰り返す。雑音情報がしきい値レベルを超えていた場合には、判定が満たされてステップＳ４０に移る。 Thereafter, in step S30, it is determined whether or not the level of the noise information acquired in step S20 has exceeded a predetermined threshold level. If the noise information does not exceed the threshold level, the determination is not satisfied and the routine returns to step S20 and the same procedure is repeated. If the noise information exceeds the threshold level, the determination is satisfied and the routine goes to Step S40.
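The level check of step S30 might be sketched as below. The description does not specify how the "level" of the noise information is computed, so the use of an RMS value here is an assumption, as are the function names and the threshold value in the usage lines.

```python
import math


def rms_level(samples):
    """Root-mean-square level of a block of audio samples (assumed level metric)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def exceeds_threshold(samples, threshold):
    """Step S30 sketch: True if the noise is strong enough to build pseudo noise from."""
    return rms_level(samples) > threshold


# Hypothetical sample blocks with amplitudes normalised to [-1, 1].
loud_noise = [0.5, -0.5, 0.5, -0.5]
quiet_noise = [0.01, -0.01, 0.01, -0.01]
print(exceeds_threshold(loud_noise, 0.1))   # True  -> proceed to step S40
print(exceeds_threshold(quiet_noise, 0.1))  # False -> return to step S20
```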

ステップＳ４０では、所定のしきい値レベルを超えた雑音情報に所定の処理を行い、対応する疑似雑音を生成する。 In step S40, predetermined processing is performed on noise information exceeding a predetermined threshold level, and corresponding pseudo noise is generated.

そして、ステップＳ５０に移り、Ｉ／Ｏインタフェイス２０４及びスピーカ２０８を介し、上記生成した疑似雑音を出力させる（疑似音出力手段としての機能）。このステップＳ５０の後、ステップＳ５５に移り、生成した擬似雑音の出力を停止する。 Then, the process proceeds to step S50, and the generated pseudo noise is output via the I / O interface 204 and the speaker 208 (function as pseudo sound output means). After step S50, the process proceeds to step S55, and the output of the generated pseudo noise is stopped.

その後、ステップＳ６０で、Ｉ／Ｏインタフェイス２０４を介してタイマ２０９に制御信号を出力し、タイマ２０９を起動させる。これにより、上記ステップＳ５０で出力した疑似雑音が対象物（来訪者Ｍが存在している場合には来訪者Ｍ）に反射し、後述のステップＳ８０で反射音がマイク２０７に入力されるまでの所要時間の測定（計時測定）が開始される。 Thereafter, in step S60, a control signal is output to the timer 209 via the I / O interface 204 to start the timer 209. As a result, the pseudo-noise output in step S50 is reflected on the object (visitor M when visitor M is present), and the reflected sound is input to microphone 207 in step S80 described later. Measurement of the required time (time measurement) is started.

そして、ステップＳ７０に移り、タイマ２０９の測定時間に基づき、測定時間が前述の最小音波受音時間を経過したか否かを判定する。最小音波受音時間を経過するまでは判定が満たされずループ待機し、最小音波受音時間を経過したら判定が満たされて、ステップＳ８０に移る。 Then, the process proceeds to step S70, and based on the measurement time of the timer 209, it is determined whether or not the measurement time has passed the aforementioned minimum sound wave reception time. Until the minimum sound wave receiving time elapses, the determination is not satisfied and the loop stands by. When the minimum sound wave receiving time elapses, the determination is satisfied, and the process proceeds to step S80.

In step S80, it is determined whether a reflected sound from the object has been input via the microphone 207 and the I/O interface 204. It suffices to make this determination by a known technique, such as comparing the power spectrum of the pseudo noise with that of the sound input via the microphone 207 and the I/O interface 204. If no reflected sound has been input, the determination is not satisfied and the process proceeds to step S85.
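One possible form of the power spectrum comparison mentioned for step S80 is sketched below. The description only refers to "a known technique", so the cosine similarity measure, the 0.8 decision threshold, and all function names are assumptions made for illustration. A naive DFT is used to keep the sketch free of external libraries.

```python
import cmath
import math


def power_spectrum(samples):
    """Power of each non-negative frequency bin, computed with a naive DFT."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                  for i, x in enumerate(samples))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_similarity(emitted, received):
    """Cosine similarity of two power spectra; 1.0 means identical shape."""
    if len(emitted) != len(received):
        raise ValueError("both blocks must have the same number of samples")
    pa, pb = power_spectrum(emitted), power_spectrum(received)
    dot = sum(x * y for x, y in zip(pa, pb))
    norm = math.sqrt(sum(x * x for x in pa)) * math.sqrt(sum(y * y for y in pb))
    return dot / norm if norm > 0 else 0.0


def is_reflected_sound(emitted, received, min_similarity=0.8):
    """Step S80 sketch: treat the input as the echo if the spectra match closely."""
    return spectral_similarity(emitted, received) >= min_similarity
```

Because the power spectrum discards phase, a delayed and attenuated copy of the pseudo noise still scores close to 1.0, whereas a sound with a different frequency content scores close to 0.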

ステップＳ８５では、上記ステップＳ６０で既に計時開始しているタイマ２０９の測定時間に基づき、計時開始してから前述の最大音波受音時間を経過したか否かを判定する。最大音波受音時間を経過していない場合には、判定が満たされず上記ステップＳ８０に戻り、同様の手順を繰り返す。最大音波受音時間を経過した場合には、判定が満たされて、上記ステップＳ２０に戻り、同様の手順を繰り返す。 In step S85, based on the measurement time of the timer 209 already started in step S60, it is determined whether or not the above-described maximum sound wave reception time has elapsed since the start of time measurement. If the maximum sound wave receiving time has not elapsed, the determination is not satisfied and the routine returns to step S80 to repeat the same procedure. If the maximum sound wave receiving time has elapsed, the determination is satisfied, the process returns to step S20, and the same procedure is repeated.

一方、上記ステップＳ８０において、反射音を入力していた場合には、ステップＳ８０の判定が満たされてステップＳ９０に移る。 On the other hand, if the reflected sound is input in step S80, the determination in step S80 is satisfied and the process proceeds to step S90.

ステップＳ９０では、上記ステップＳ８０でマイク２０７及びＩ／Ｏインタフェイス２０４を介して入力された反射音により、対応する振幅あるいは周波数を含む反射音情報を取得する。 In step S90, the reflected sound information including the corresponding amplitude or frequency is acquired from the reflected sound input via the microphone 207 and the I / O interface 204 in step S80.

In step S100, a predetermined calculation (in this example, the method using (Formula 1) described above with reference to FIGS. 1 and 7) is performed based on the reflected sound information acquired in step S90 and the time measured so far by the timer 209, which started timing in step S60, to detect the distance to the object (the distance to the visitor M, if a visitor M is present) (function as distance detection means).

Thereafter, in step S110, it is determined, based on the distance detection result of step S100, whether the distance to the object is equal to or less than a predetermined value (for example, 1 [m]). If the distance to the object is greater than the predetermined value, the determination is not satisfied; it is presumed that no visitor M is present (or that a visitor M is present but too far away for reception processing), the process returns to step S20, and the same procedure is repeated. If the distance to the object is equal to or less than the predetermined value, the determination is satisfied; it is presumed that the visitor M is within a distance at which reception is possible, and the process proceeds to step S120.

ステップＳ１２０では、ＨＤＤ２０５のプログラム記憶エリア２５６に記憶された所定のアプリケーションプログラムを読み出し、当該アプリケーションを起動することで、受付処理を開始する。 In step S120, a predetermined application program stored in the program storage area 256 of the HDD 205 is read, and the reception process is started by starting the application.

The process then proceeds to step S130, where it is determined whether the reception process started in step S120 has ended. Until the reception process ends, the determination is not satisfied and the process waits in a loop; once it ends, the determination is satisfied and the process proceeds to step S140 (at this point the timer 209 starts timing for step S140, described below).

In step S140, it is determined (for example, based on the timing by the timer 209) whether a predetermined period (for example, 10 seconds) has elapsed since the reception process ended. Until the predetermined period elapses, the determination is not satisfied and the process waits in a loop; once it elapses, the determination is satisfied, the process returns to step S20, and the same procedure is repeated. As a result, while the reception terminal 20 is powered on, or until a predetermined end operation is performed, the above flow is executed repeatedly at a predetermined time interval (for example, every 2 seconds).

In the above, steps S30 and S40 function as the pseudo sound generation means recited in the claims, steps S80 and S90 function as the reflected sound acquisition means, and step S120 functions as the dialogue processing control means.

After the reception process is started in step S120, during the period in which the determination in step S130 is not satisfied and the process waits in a loop (in other words, while the reception process is in progress), the process does not proceed to step S140 and the flow of FIG. 9 does not end. That is, the noise information is not acquired again while the reception process is in progress.

また、受付処理が終了した後、ステップＳ１４０の判定が満たされると、言い換えれば、所定期間（例えば１０秒）が経過すると図９のフローは終了する。すなわち、フローが再び「ＳＴＡＲＴ」位置から開始され、ステップＳ１０→ステップＳ２０と移り、上記雑音情報の取得を再び実行する。この結果、ステップＳ１４０は、受付処理が終了した後、所定期間経過したら、疑似雑音の出力を再び実行するように制御する出力制御手段として機能している。 In addition, when the determination in step S140 is satisfied after the reception process ends, in other words, the flow in FIG. 9 ends when a predetermined period (for example, 10 seconds) elapses. That is, the flow is started again from the “START” position, and the process proceeds from step S10 to step S20, and the acquisition of the noise information is executed again. As a result, step S140 functions as an output control unit that performs control so that pseudo noise is output again after a predetermined period of time has elapsed after the reception process is completed.
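The decisions of steps S30 through S110 in FIG. 9 can be condensed into a single function for illustration. This is a simplified sketch, not the disclosed program: it models one pass with at most one incoming echo, the noise threshold value and all names are hypothetical, and the reception times and distance limit are the example values given in the description.

```python
SPEED_OF_SOUND_M_PER_S = 346.5   # example value from the description
NOISE_THRESHOLD = 0.1            # hypothetical level threshold for step S30
MIN_RECEPTION_MS = 1.73          # minimum sound wave reception time (example)
MAX_RECEPTION_MS = 5.77          # maximum sound wave reception time (example)
MAX_RECEPTION_DISTANCE_M = 1.0   # predetermined value checked in step S110


def detection_cycle(noise_level, echo_delay_ms):
    """Simplified single pass through steps S30 to S110.

    echo_delay_ms is the timer value when an echo is recognised, or None if
    no echo arrives. Returns the next action of the flow as a string.
    """
    # S30: noise too weak, so go back and acquire noise information again.
    if noise_level <= NOISE_THRESHOLD:
        return "reacquire_noise"
    # S40 to S60: pseudo noise generation, output and timer start are not modelled.
    # S70 to S85: only echoes inside the reception window are accepted.
    if echo_delay_ms is None:
        return "reacquire_noise"
    if not (MIN_RECEPTION_MS <= echo_delay_ms <= MAX_RECEPTION_MS):
        return "reacquire_noise"
    # S100: distance from half the round-trip time.
    distance_m = SPEED_OF_SOUND_M_PER_S * (echo_delay_ms / 1000.0) / 2.0
    # S110: start reception only if the object is close enough.
    if distance_m <= MAX_RECEPTION_DISTANCE_M:
        return "start_reception"
    return "reacquire_noise"


print(detection_cycle(0.5, 2.0))   # start_reception
print(detection_cycle(0.05, 2.0))  # reacquire_noise
```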

以上説明したように、本実施形態の受付端末２０においては、マイク２０７及びスピーカ２０８を介して入出力する音を用いて、来訪者Ｍとの距離を検出する。すなわち、受付端末２０の周囲で発生した雑音（例えば、ドアが閉まる音等）がマイク２０７を介し入力されると、対応する雑音情報を取得し（ステップＳ２０参照）、この取得した雑音情報に基づき、距離検出用の疑似雑音をスピーカ２０８を介し出力する（ステップＳ５０参照）。そして、出力された疑似雑音が伝搬し来訪者Ｍで反射すると、その反射音がマイク２０７を介し入力され、対応する反射音情報を取得する（ステップＳ９０参照）。そして、当該取得した反射音情報に基づき、来訪者Ｍまでの距離を検出する（ステップＳ１００参照）。そして、当該検出した距離が所定値以下であれば、受付処理を開始する（ステップＳ１２０参照）ことで、来訪者Ｍに対して確実な受付処理を行うことができる。 As described above, in the reception terminal 20 according to the present embodiment, the distance to the visitor M is detected using the sound input / output via the microphone 207 and the speaker 208. That is, when noise (for example, a door closing sound) generated around the reception terminal 20 is input via the microphone 207, corresponding noise information is acquired (see step S20), and based on the acquired noise information. Then, pseudo noise for distance detection is output through the speaker 208 (see step S50). Then, when the output pseudo noise propagates and is reflected by the visitor M, the reflected sound is input via the microphone 207, and the corresponding reflected sound information is acquired (see step S90). And based on the acquired reflected sound information, the distance to the visitor M is detected (refer step S100). If the detected distance is equal to or smaller than the predetermined value, the reception process is started (see step S120), so that a reliable reception process can be performed for the visitor M.

この結果、本実施形態の受付端末２０によれば、マイク２０７及びスピーカ２０８を介して入出力する音を用いて、来訪者Ｍまでの距離を検出することができる。すなわち、受付処理のためにもともと備わっているマイク２０７及びスピーカ２０８を活用することで、それ以外の別途の距離検出用のセンサや専用マイク等を新たに設けることなく、距離検出を行うことができる。 As a result, according to the reception terminal 20 of the present embodiment, the distance to the visitor M can be detected using the sound input / output via the microphone 207 and the speaker 208. That is, by using the microphone 207 and the speaker 208 that are originally provided for the reception process, distance detection can be performed without newly providing a separate distance detection sensor, a dedicated microphone, or the like. .

In addition, using pseudo noise based on the noise information (a sound resembling noise generated in the surroundings) for distance detection has the further effect that the distance can be detected without the visitor M realizing that sound is being used for detection.

In this embodiment in particular, predetermined processing is applied to the noise information to generate the corresponding pseudo noise (see step S40), and the pseudo noise is output via the speaker 208 (see step S50). Thus, besides performing distance detection with the noise information as is, it is possible to use only a predetermined range (level range or time range) of the noise information (see modification (1) below) or to use noise information to which various kinds of processing have been applied (see modification (2) below). As a result, the variety of sounds usable for distance detection is expanded, which improves applicability to various uses.

ここで、雑音情報に基づき疑似雑音を生成するとき、元となる雑音のレベルがあまりに小さいと、出力する疑似雑音のレベルも小さく、その反射音を検出することが困難となる。そこで、本実施形態では特に、上記雑音情報のうち所定のしきい値レベルを超えたもの（例えばパワーが大きいもの）に基づき、疑似雑音を生成する（ステップＳ３０参照）。これにより、上記のように、出力する疑似雑音のレベル不足による不都合を回避し、確実な距離検出を行うことができる。 Here, when the pseudo noise is generated based on the noise information, if the level of the original noise is too small, the level of the pseudo noise to be output is also small, and it is difficult to detect the reflected sound. Therefore, in the present embodiment, pseudo noise is generated based on the noise information that exceeds a predetermined threshold level (for example, power is high) (see step S30). Thereby, as described above, inconvenience due to insufficient level of pseudo noise to be output can be avoided, and reliable distance detection can be performed.

When the reception process has been started after the distance to the visitor M has been determined by distance detection, the visitor M can be expected to be operating the device steadily through dialogue. Accordingly, in this embodiment in particular, the pseudo noise is not output again once the reception process has started (see step S130). This avoids needlessly repeating the output of pseudo noise during such steady operation.

さらに、来訪者Ｍとの距離を確定して受付処理が行われ、その受付処理が終了してしばらく経過した場合には、対話していた来訪者Ｍは既に別の場所に移動し、受付端末２０の近傍に誰もいない（あるいは別の来訪者Ｍがいる）状態になっている可能性が高い。これに対応し、本実施形態では特に、受付処理が終了した後、所定期間（例えば、１０秒）が経過したら、雑音情報の取得を再び実行するようにする（ステップＳ１４０参照）。したがって、受付処理終了後、所定期間が経過したら、（フローを終了し、再度フローを開始して）疑似雑音の出力を再び行うようにすることで、次の来訪者Ｍに対する距離検出を確実に実行することができる。 Furthermore, when the reception process is performed after the distance to the visitor M is determined and the reception process is completed, the visitor M who has been in conversation has already moved to another place, and the reception terminal There is a high possibility that there is no one in the vicinity of 20 (or there is another visitor M). Corresponding to this, particularly in the present embodiment, when a predetermined period (for example, 10 seconds) elapses after the reception process ends, the acquisition of noise information is executed again (see step S140). Therefore, when a predetermined period elapses after the reception process is completed, the distance detection for the next visitor M is ensured by outputting the pseudo noise again (ending the flow and starting the flow again). Can be executed.

なお、本発明は、上記実施形態に限られるものではなく、その趣旨及び技術的思想を逸脱しない範囲内で種々の変形が可能である。以下、そのような変形例を順を追って説明する。 The present invention is not limited to the above-described embodiment, and various modifications can be made without departing from the spirit and technical idea of the present invention. Hereinafter, such modifications will be described in order.

(1) Generating pseudo noise only from noise information in a predetermined time range
In the above embodiment, pseudo noise was generated only from noise information exceeding a predetermined threshold level (high power) in order to avoid problems caused by an insufficient level of the output pseudo noise, but the invention is not limited to this. That is, pseudo noise may be generated only from noise information in a predetermined time range (for example, the first 1 [msec]).

本変形例の受付端末２０の制御回路部２００により実行する制御手順は、前述の図９とほぼ同様のもので足りる。但し、ステップＳ３０では、上記ステップＳ２０で取得した雑音情報のうち、所定のしきい値レベルを超えたものから、図１０に示すような所定の時間Ｔの範囲（例えば最初の１［ｍｓｅｃ］）で雑音情報を時間的に抽出する（切り取る）。 The control procedure executed by the control circuit unit 200 of the receiving terminal 20 of the present modification may be almost the same as that shown in FIG. However, in step S30, a range of predetermined time T as shown in FIG. 10 (for example, first 1 [msec]) from noise information acquired in step S20 above a predetermined threshold level. To extract (cut out) noise information in terms of time.

そして、ステップＳ４０では、上記のようにしてステップＳ３０で抽出した所定の時間範囲の雑音情報に基づき、対応する疑似雑音を生成する。 In step S40, the corresponding pseudo noise is generated based on the noise information in the predetermined time range extracted in step S30 as described above.

According to this modification, the acquired noise information is not used as is when generating the pseudo noise; instead, pseudo noise is generated only from noise information in a predetermined time range (for example, the first 1 [msec]). For noise that rises sharply at first and then decays rapidly, such as the sound of a door closing or of an object being put down, this makes it possible to cut out in time only the initial high-level portion before the decay and to generate the pseudo noise from that portion. As in the above embodiment, problems caused by an insufficient level of the output pseudo noise are thereby avoided, and reliable distance detection can be performed.
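The time-range extraction of this modification might look like the following sketch. The function name, the sample-based representation, and the choice to start the window at the first sample whose magnitude exceeds the threshold are assumptions; the description only states that a range of time T (for example, the first 1 [msec]) is cut out of the noise information that exceeded the threshold level.

```python
def extract_initial_window(samples, sample_rate_hz, threshold, window_ms=1.0):
    """Return the first window_ms of samples starting at the threshold crossing.

    Returns an empty list if no sample exceeds the threshold.
    """
    window_len = max(1, int(round(sample_rate_hz * window_ms / 1000.0)))
    for start, value in enumerate(samples):
        if abs(value) > threshold:
            return samples[start:start + window_len]
    return []


# Hypothetical impulsive noise: quiet lead-in, sharp rise, rapid decay.
noise = [0.0, 0.01, 0.9, 0.7, 0.5, 0.3, 0.1]
# At 3000 samples per second, 1 msec corresponds to 3 samples.
print(extract_initial_window(noise, 3000, 0.1))  # [0.9, 0.7, 0.5]
```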

(2) Generating pseudo noise after modulation processing
In the above, pseudo noise was generated only from a predetermined range of noise information (noise information exceeding a threshold level, or noise information in a time range), but the invention is not limited to this. That is, pseudo noise may be generated by applying modulation processing (for example, amplitude modulation or frequency modulation) to the noise information. In the control procedure executed by the control circuit unit 200 of the reception terminal 20 in this modification, step S30 of FIG. 9 is omitted. In step S40 of FIG. 9, modulation processing (for example, amplitude modulation) is applied to the noise information acquired in step S20 to generate the corresponding pseudo noise (function as pseudo sound generation means). The subsequent steps S50 to S140 are the same as in FIG. 9 described above.

本変形例では、取得した雑音情報に変調処理を行うことにより、マイク２０７に入力した雑音のレベルに対して、適宜の大きさ（例えば５倍）に増幅した疑似雑音をスピーカ２０８を介して出力できるので、確実な距離検出を行うことができる。なお、この場合、元となる雑音情報は、所定のしきい値レベル（上記実施形態）や所定の時間範囲（上記（１）の変形例）等、所定の範囲内のものに限定されないという効果もある。あるいは、上記変調により、検出に都合がよいような周波数に変えて疑似雑音を生成することも可能であり、これによっても確実な距離検出を行うことができる。 In this modification, by performing a modulation process on the acquired noise information, pseudo noise amplified to an appropriate level (for example, five times) with respect to the noise level input to the microphone 207 is output via the speaker 208. Therefore, reliable distance detection can be performed. In this case, the original noise information is not limited to a predetermined range such as a predetermined threshold level (the above embodiment) or a predetermined time range (modified example of (1) above). There is also. Alternatively, it is possible to generate pseudo noise by changing to a frequency that is convenient for detection by the above modulation, and reliable distance detection can also be performed.
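As a rough illustration of the amplification described in this modification, the sketch below scales the acquired noise samples by a fixed gain such as the fivefold example in the text. The function name and the clipping to a full-scale value are assumptions added here to keep the output within a playable range; the description does not mention clipping.

```python
def amplify(samples, gain=5.0, full_scale=1.0):
    """Scale samples by gain, clipping to [-full_scale, full_scale]."""
    return [max(-full_scale, min(full_scale, s * gain)) for s in samples]


# Hypothetical weak noise samples, normalised to [-1, 1].
weak_noise = [0.1, -0.05, 0.3]
pseudo_noise = amplify(weak_noise)
print(pseudo_noise)
```

In this usage the third sample would exceed full scale after amplification and is therefore clipped, which is one reason a practical implementation might choose the gain adaptively.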

(3) Others
In the above, the voice input means was configured as a single microphone 207, but it is not limited to this and may be configured as a plurality of (for example, two) microphones (a so-called array-type microphone device). With such a configuration, noise generated around the reception terminal 20 can be input by each of the microphones, so that noise information can be acquired well (with high sensitivity). In addition, the direction in which the noise occurred can be identified by controlling the directivity of each microphone. As a result, effects such as the following are obtained: the microphone sensitivity in the noise direction can be raised, and the pseudo noise output by the speaker can be given a form corresponding to the noise direction, making it even less noticeable to the visitor M.

In the above, as the predetermined calculation, the required time from the output of the pseudo noise until the input of its reflected sound was measured, and the distance to the visitor M was detected from the relationship that this required time is proportional to the distance to the visitor M (see (Formula 1) above). However, the calculation is not limited to this; the distance to the visitor M may instead be detected from the phase difference between the output pseudo noise and the input reflected sound. The same effects as above are obtained in this case as well.
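The description does not detail the phase-difference calculation. One way it could work, assuming the pseudo noise contains a single dominant frequency component, is sketched below: the phase lag is converted to a delay and then halved for the one-way distance. The function name and this single-frequency model are assumptions, and the result is unambiguous only for distances shorter than half a wavelength.

```python
import math


def distance_from_phase(phase_diff_rad, frequency_hz, speed_of_sound=346.5):
    """One-way distance from the phase lag of the echo at a single frequency.

    The phase is wrapped to [0, 2*pi), so distances of half a wavelength or
    more cannot be distinguished from shorter ones.
    """
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    wrapped = phase_diff_rad % (2.0 * math.pi)
    delay_s = wrapped / (2.0 * math.pi * frequency_hz)
    return speed_of_sound * delay_s / 2.0


# At 346.5 Hz the wavelength is 1 m, so a lag of pi is half a period of delay.
print(distance_from_phase(math.pi, 346.5))
```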

Note that the arrows shown in FIG. 4, FIG. 5, and the other drawings indicate one example of signal flow and do not limit the direction in which signals flow.

Likewise, the flowchart shown in FIG. 9 does not limit the present invention to the procedure shown in that flow; steps may be added or deleted, or their order changed, without departing from the gist and technical idea of the invention.

In addition to what has already been described, the methods of the above embodiment and of the modifications may be used in appropriate combination.

Although not illustrated individually, the present invention may be implemented with various other modifications within a range that does not depart from its gist.

20 Reception terminal (interactive device)
201 CPU
207 Microphone (voice input means)
208 Speaker (voice output means)
M Visitor (operator)

Claims

An interactive device that an operator can operate in an interactive manner, comprising:
voice input means for inputting voice;
voice output means for outputting voice;
sound acquisition means for acquiring sound information, including a corresponding amplitude or frequency, from sound input via the voice input means;
pseudo sound output means for outputting, via the voice output means, a pseudo sound for distance detection based on the sound information acquired by the sound acquisition means, within a predetermined time after the voice input means inputs the sound;
reflected sound acquisition means for acquiring reflected sound information, including a corresponding amplitude or frequency, from the reflection of the pseudo sound off an object, input via the voice input means;
distance detection means for performing predetermined calculation processing based on the reflected sound information acquired by the reflected sound acquisition means and detecting the distance to the operator by presuming that the object is the operator; and
dialogue processing control means for starting dialogue processing with the operator based on a detection result of the distance detection means.

The interactive device according to claim 1, further comprising
pseudo sound generating means for performing predetermined processing on the sound information acquired by the sound acquisition means and generating a corresponding pseudo sound,
wherein the pseudo sound output means outputs the pseudo sound generated by the pseudo sound generating means.

The interactive device according to claim 2,
wherein the pseudo sound generating means generates the pseudo sound based on sound information that exceeds a predetermined threshold level.

The interactive device according to claim 3,
wherein the pseudo sound generating means generates the pseudo sound based on the sound information in a predetermined time range.

The interactive device according to any one of items 4 to 5, further comprising output control means for performing control so that the pseudo sound is output again once a predetermined period has elapsed after the dialogue processing performed under the control of the dialogue processing control means has ended.