JPH08190470A

JPH08190470A - Information providing terminal

Info

Publication number: JPH08190470A
Application number: JP7000298A
Authority: JP
Inventors: Shinichi Tanaka; 信一田中
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1995-01-05
Filing date: 1995-01-05
Publication date: 1996-07-23

Abstract

PURPOSE: To dynamically determine a speech which can be recognized according to the function and operation method of an information providing terminal which change with time, etc. CONSTITUTION: This information providing terminal is equipped with a time detection part 11 which detects the time, a state determination part 12 which periodically reads in the time detected by the time detection part 11 and determines the state of the information providing terminal corresponding to the time zone including the time, and a speech recognition part 13 which recognizes a speech that a user speaks according to the state reported from the state determination part 12 at the time. Further, this terminal is equipped with an information processing part 14 which switches a function provided to the user according to the state reported from the state determination part 12 when the speech that the user speaks is recognized by the speech recognition part 13 and performs information providing operation corresponding to the speech recognition result of the speech recognition part 13.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an information providing terminal that provides appropriate information to a user by recognizing speech uttered by the user.

[0002]

[Prior Art] An information providing terminal of this type, that is, one that can be operated by voice, is a useful device because the user can give it commands and the like using the speech he or she uses every day.

[0003] Such an information providing terminal has speech recognition means for recognizing the user's speech. Among the speech recognition means developed to date, the kind most widely used at present recognizes spoken words. The specific method of speech recognition used by such means is described in detail in, for example, Seiichi Nakagawa, "Speech Recognition by Probabilistic Models" (「確率モデルによる音声認識」), published by the Institute of Electronics, Information and Communication Engineers, pp. 7-89.

[0004] The recognition process in the speech recognition means is briefly as follows. First, the section of the input acoustic signal in which a speech signal is present (hereinafter referred to as the speech section) is estimated from the input acoustic signal, using the energy of the signal and the like. Next, the acoustic signal contained in the speech section is matched against the word models registered in advance in the dictionary of the recognition means, and the word that gives the best matching result is output as the recognition result.
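The two steps described here can be sketched roughly as follows. This is a minimal illustration, not the method of the cited reference: the frame length, the energy threshold, the function names, and the use of a plain score table in place of real word-model matching are all assumptions made for the example.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def estimate_speech_section(samples, frame_len=50, threshold=0.1):
    """Return (start, end) sample indices spanning all frames whose
    energy exceeds the threshold, or None if no frame does."""
    energies = frame_energies(samples, frame_len)
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len


def best_word(scores):
    """Pick the dictionary word with the best (highest) matching score.
    `scores` stands in for the result of matching against word models."""
    return max(scores, key=scores.get)


# Silence, then a loud segment, then silence again.
signal = [0.0] * 100 + [1.0] * 200 + [0.0] * 100
section = estimate_speech_section(signal)
word = best_word({"security office": 0.2, "night-duty room": 0.9})
```

With the toy signal above, `section` covers samples 100 to 300 and `word` is the entry with the higher score. A real recognizer would compute the scores from the speech section itself.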

[0005] The information providing terminal presents appropriate information to the user according to the speech recognition result output from the speech recognition means. Information providing terminals of the kind described above have been used conventionally. In such a terminal, the speech (vocabulary) that the speech recognition means can recognize is always fixed, and the terminal always acts on any speech it recognizes, that is, on any word in the registered vocabulary.

[0006]

[Problems to Be Solved by the Invention] As described above, a conventional information providing terminal always accepts the same vocabulary. However, when speech recognition means is applied to the operation of an information providing terminal used by the general public, the user population changes with the time of use, the day of use, and so on, and the kinds of inquiries made to the terminal can be expected to change accordingly. In addition, some of the services (functions) that the terminal provides to users can be offered at some times of use but not at others.

[0007] Taking these points into account, an information providing terminal that can be operated by voice needs to change its internal state according to the time and the like, and to change the functions it provides to the user according to that internal state. Furthermore, because the operation method may also change when the internal state changes, the terminal also needs to change the set of utterances that the speech recognition means used to operate it can recognize.

[0008] Conventional information providing terminals, however, give no consideration to any of this. In the speech recognition means used to operate such a terminal, the recognizable words are fixed to a preset vocabulary, and the means cannot follow changes in the state of the terminal caused by the time and the like. Such speech recognition means therefore cannot be built into an information providing terminal whose (provided) functions change with the time and the like as described above, and it has been difficult to realize this kind of terminal, that is, one that can be operated by voice and whose functions change with the time and the like.

[0009] The present invention has been made in view of these circumstances. Its object is to provide a voice-operable information providing terminal that can dynamically determine the speech to be recognized according to the functions and operation method of the terminal, which change with the time and the like.

[0010] Another object of the present invention is to provide a voice-operable information providing terminal that does not malfunction even when the user operates it incorrectly. When the speech uttered by the user is speech that is accepted only at a time other than the time and the like at which it was uttered, the terminal does one of the following: it takes no action; it presents to the user information indicating that the speech is invalid; it presents to the user the content of the speech the user uttered; or it presents to the user at least one of a list of the speech that can be accepted at that time and a list of the functions that can be provided at that time.

[0011]

[Means for Solving the Problems and Operation] A configuration according to a first aspect of the present invention is an information providing terminal that can be operated by voice, characterized by comprising: detection means for detecting at least one of the date, the day of the week, the time, and whether or not the day is a holiday (hereinafter referred to as "the time and the like"); state determination means for determining the state of the information providing terminal according to the detection result of the detection means; speech recognition means for recognizing input speech according to the state indicated by the state determination means at that time; and information processing means for switching the functions provided to the user according to the state indicated by the state determination means when the input speech is recognized by the speech recognition means, and for performing an information providing operation corresponding to the speech recognition result of the speech recognition means.

[0012] In the configuration according to the first aspect, the detection means detects, for example, the current time and the like at all times and notifies the state determination means. The state determination means reads the time and the like notified by the detection means, for example periodically, and determines the state of the information providing terminal that is predetermined for that time and the like. The latest state of the terminal determined by the state determination means is notified to the speech recognition means and the information processing means.
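As a rough sketch of the state determination step, the mapping from a detected time to a predetermined state could be held in a small table. The time zone boundaries below and the choice of state numbers 0 and 1 are assumptions for illustration only; this paragraph does not specify them.

```python
from datetime import time

# Hypothetical table: (start, end, state). Only an assumed daytime zone
# is listed; every other time falls back to DEFAULT_STATE.
STATE_TABLE = [
    (time(8, 30), time(17, 30), 1),
]
DEFAULT_STATE = 0


def determine_state(now):
    """Return the state predetermined for the time zone containing `now`."""
    for start, end, state in STATE_TABLE:
        if start <= now < end:
            return state
    return DEFAULT_STATE
```

A caller would invoke `determine_state` periodically with the latest detected time and pass the result to the recognition and processing stages. Extending the table key to cover the date, the day of the week, or holidays would follow the same pattern.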

[0013] When the user inputs the necessary speech to the terminal in order to operate it (that is, to have it provide a function), the speech recognition means recognizes the input speech. At this time, the speech recognition means determines the recognizable (acceptable) speech according to the state notified by the state determination means when the speech to be recognized is input, that is, the state of the terminal at the time of speech input, and recognizes the input speech within that range. Therefore, speech input by the user is subject to recognition if it is accepted in the terminal state corresponding to the time of input (more precisely, to the time and the like detected by the detection means at that moment), and is excluded from recognition if it is not accepted in that state.

[0014] When the speech recognition means succeeds in recognizing the input speech, the recognition result is sent to the information processing means. The information processing means has a plurality of states that can be determined by the state determination means and, for each state, holds correspondence information between the speech recognition results it can accept (as operation commands) and the functions (services) to be provided to the user in response. When a recognition result is sent from the speech recognition means, the information processing means selects, from among the functions determined by the terminal state notified by the state determination means at that time, the function corresponding to the recognition result, and provides it to the user.
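The per-state correspondence information could be modeled as a nested lookup table, as in the sketch below. The words are the example vocabulary given later for FIG. 3; the function names and the state numbering are hypothetical placeholders, not taken from the source.

```python
# Hypothetical correspondence table: state -> {recognition result -> function}.
FUNCTIONS_BY_STATE = {
    0: {
        "security office": "show_security_office_guide",
        "night-duty room": "show_night_duty_room_guide",
    },
    1: {
        "general affairs department": "show_general_affairs_guide",
        "sales department": "show_sales_department_guide",
        "security office": "show_security_office_guide",
    },
}


def select_function(state, recognition_result):
    """Return the function for this result in this state, or None if the
    result is not an accepted command in the current state."""
    return FUNCTIONS_BY_STATE[state].get(recognition_result)
```

Keeping the table keyed by state first means that a state change switches both the accepted commands and the functions they trigger in one step.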

[0015] Thus, in the configuration according to the first aspect, the functions (services) provided by the information providing terminal and the way it is operated can be changed according to the time and the like, and the corresponding operations can be performed by voice.

[0016] A configuration according to a second aspect of the present invention is characterized in that, in place of the speech recognition means and the information processing means of the configuration according to the first aspect, the following are applied: speech recognition means that recognizes input speech and determines whether or not the input speech is acceptable according to the state indicated by the state determination means at that time; and information processing means that switches the functions provided to the user according to the state indicated by the state determination means when the input speech is recognized by the speech recognition means and performs an information providing operation corresponding to the speech recognition result of the speech recognition means, and that refrains from providing information to the user when the speech recognition means determines that the input speech is not acceptable.

[0017] In the configuration according to the second aspect, the speech recognition means has an all-speech recognition function and a recognition result checking function. The all-speech recognition function treats all speech that is acceptable in any state the information processing means can take as a recognition target, regardless of the state notified by the state determination means at the time of speech input (the current state of the information providing terminal). The recognition result checking function determines whether or not the speech recognized by the all-speech recognition function is unacceptable in the current state and, if it is, withholds output of the recognition result to the information processing means. Therefore, when the input speech cannot be accepted in the state at that time, the recognition result of the speech recognition means is not output to the information processing means, and the information processing means performs no information providing operation.
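A minimal sketch of this two-stage arrangement follows: recognition runs against the union of all per-state vocabularies, and a separate check withholds results that the current state does not accept. The vocabulary contents follow the FIG. 3 example given later in the text; the string lookup standing in for the recognizer, the state numbering, and all names are assumptions.

```python
VOCABULARY_BY_STATE = {
    0: {"security office", "night-duty room"},
    1: {"general affairs department", "sales department", "security office"},
}
# Everything acceptable in any state is a recognition target.
ALL_WORDS = set().union(*VOCABULARY_BY_STATE.values())


def recognize_all(utterance):
    """Stand-in for the recognizer: an exact lookup in the full vocabulary."""
    return utterance if utterance in ALL_WORDS else None


def checked_result(utterance, state):
    """Return the recognized word, or None when the result is withheld
    because it is not acceptable in the current state."""
    word = recognize_all(utterance)
    if word is None or word not in VOCABULARY_BY_STATE[state]:
        return None
    return word
```

Because nothing is passed downstream for an unacceptable word, the processing stage simply never runs, which is how the second aspect avoids acting on a mistimed request.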

[0018] Thus, in the configuration according to the second aspect, when the user mistakenly utters speech that is not accepted at that time and the like, the speech recognition means detects this and does not output the recognition result (as an operation command) to the information processing means. The information providing operation of the information processing means is thereby suppressed, so the terminal does not malfunction when the user mistakenly utters speech that is not accepted at that time.

[0019] A configuration according to a third aspect of the present invention is characterized in that, in place of the speech recognition means and the information processing means of the configuration according to the first aspect, the following are applied: speech recognition means that recognizes input speech and determines whether or not the input speech is acceptable according to the state indicated by the state determination means at that time; and information processing means that switches the functions provided to the user according to the state indicated by the state determination means when the input speech is recognized by the speech recognition means and performs an information providing operation corresponding to the speech recognition result of the speech recognition means, and that presents to the user information indicating that the speech is invalid when the speech recognition means determines that the input speech is not acceptable.

[0020] In the configuration according to the third aspect, the speech recognition means has an all-speech recognition function and a recognition result checking function. The all-speech recognition function treats all speech that is acceptable in any state the information processing means can take as a recognition target, regardless of the state notified by the state determination means at the time of speech input (the current state of the information providing terminal). The recognition result checking function determines whether or not the speech recognized by the all-speech recognition function is unacceptable in the current state. If it determines that the speech is acceptable, it outputs the recognition result to the information processing means; if it determines that the speech is unacceptable, it issues, instead of outputting the recognition result to the information processing means, an invalidity notification indicating that the speech is invalid.

[0021] The information processing means monitors whether a recognition result or an invalidity notification is input from the speech recognition means. When the speech recognition means determines that the input speech cannot be accepted and an invalidity notification is input, the information processing means presents to the user information indicating that the speech is invalid (for example, a message prompting the user to speak again because the input speech was invalid).
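One way to picture this exchange is with two distinct output types from the recognition stage, which the processing stage tells apart. The class names, the message wording, the state numbering, and the vocabulary table (taken from the FIG. 3 example given later) are illustrative assumptions only.

```python
from dataclasses import dataclass

VOCABULARY_BY_STATE = {
    0: {"security office", "night-duty room"},
    1: {"general affairs department", "sales department", "security office"},
}


@dataclass
class Recognized:
    """A recognition result that is acceptable in the current state."""
    word: str


class InvalidNotice:
    """Sent instead of a result when the word is not acceptable now."""


def check(word, state):
    """Recognition result check: pass the result on, or replace it
    with an invalidity notification."""
    if word in VOCABULARY_BY_STATE[state]:
        return Recognized(word)
    return InvalidNotice()


def process(output):
    """Processing stage: act on a result, or prompt the user to retry."""
    if isinstance(output, InvalidNotice):
        return "That request cannot be accepted now. Please speak again."
    return f"Showing guidance for the {output.word}."
```

For example, `process(check("sales department", 0))` yields the retry prompt, while the same word in state 1 yields the guidance message.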

[0022] Thus, in the configuration according to the third aspect, when the user mistakenly utters speech that is not accepted at that time and the like, the speech recognition means detects this and sends an invalidity notification to the information processing means, which then provides the user with information indicating that the speech is invalid. This makes it clear to the user that the information providing terminal is being used incorrectly.

[0023] A configuration according to a fourth aspect of the present invention is characterized in that, in place of the speech recognition means and the information processing means of the configuration according to the first aspect, the following are applied: speech recognition means that recognizes input speech and determines whether or not the input speech is acceptable according to the state indicated by the state determination means at that time; and information processing means that switches the functions provided to the user according to the state indicated by the state determination means when the input speech is recognized by the speech recognition means and performs an information providing operation corresponding to the speech recognition result of the speech recognition means, and that presents to the user the information on the input speech indicated by the speech recognition result when the speech recognition means determines that the input speech is not acceptable.

[0024] In the configuration according to the fourth aspect, the speech recognition means has an all-speech recognition function and a recognition result checking function. The all-speech recognition function treats all speech that is acceptable in any state the information processing means can take as a recognition target, regardless of the state notified by the state determination means at the time of speech input (the current state of the information providing terminal). The recognition result checking function determines whether or not the speech recognized by the all-speech recognition function is unacceptable in the current state. If it determines that the speech is acceptable, it outputs the recognition result to the information processing means; if it determines that the speech is unacceptable, it outputs the recognition result to the information processing means together with an invalidity notification.

[0025] The information processing means monitors for a recognition result from the speech recognition means and, when a recognition result is input, checks whether it is accompanied by an invalidity notification. If it is, the information processing means presents the recognition result from the speech recognition means to the user.

[0026] Thus, in the configuration according to the fourth aspect, when the user mistakenly utters speech that is not accepted at that time and the like, the speech recognition means detects this and outputs the recognition result to the information processing means with an invalidity notification attached, and the information processing means presents that recognition result to the user. The user can thereby be guided toward correct usage. This guidance becomes more reliable when information indicating that the recognition result (the speech it represents) is invalid is presented together with the recognition result.

[0027] A configuration according to a fifth aspect of the present invention is characterized in that, in place of the speech recognition means and the information processing means of the configuration according to the first aspect, the following are applied: speech recognition means that recognizes input speech and determines whether or not the input speech is acceptable according to the state indicated by the state determination means at that time; and information processing means that switches the functions provided to the user according to the state indicated by the state determination means when the input speech is recognized by the speech recognition means and performs an information providing operation corresponding to the speech recognition result of the speech recognition means, and that presents to the user at least one of a list of the speech acceptable at that time and a list of the functions that can be provided at that time when the speech recognition means determines that the input speech is not acceptable.

[0028] In the configuration according to the fifth aspect, the speech recognition means has the same functions as the speech recognition means in the configuration according to the third aspect: when it determines that the recognized input speech cannot be accepted in the current state, it issues an invalidity notification instead of outputting the recognition result to the information processing means.

[0029] The information processing means monitors whether a recognition result or an invalidity notification is input from the speech recognition means. When an invalidity notification is input from the speech recognition means, the information processing means presents to the user at least one of a list of the speech acceptable at that time and a list of the functions that can be provided at that time.
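The response to an invalidity notification in this aspect amounts to looking up what the current state accepts and showing it. The sketch below assumes the FIG. 3 example vocabulary given later in the text; the state numbering, the function name, and the message format are hypothetical.

```python
VOCABULARY_BY_STATE = {
    0: {"security office", "night-duty room"},
    1: {"general affairs department", "sales department", "security office"},
}


def on_invalid_notice(state):
    """Build the guidance shown when an invalidity notification arrives:
    the words that can be accepted in the current state."""
    words = sorted(VOCABULARY_BY_STATE[state])
    return "You can say: " + ", ".join(words)
```

A list of the functions available in the current state could be produced the same way from a per-state function table.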

[0030] Thus, in the configuration according to the fifth aspect, when the user mistakenly utters speech that is not accepted at that time and the like, the speech recognition means detects this and sends an invalidity notification to the information processing means, which then presents to the user at least one of a list of the speech acceptable at that time and a list of the functions (services) that can be provided at that time. The user can therefore tell which (acceptable) speech to utter when speaking again.

[0031]

[Embodiments] Embodiments of the present invention will be described below with reference to the drawings.

[First Embodiment] First, a first embodiment of the information providing terminal of the present invention will be described.

[0032] FIG. 1 is a block diagram showing the schematic configuration of an information providing terminal according to the first embodiment of the present invention. The functions and operation method of the terminal shown in FIG. 1 are assumed to change with the time.

[0033] The information providing terminal shown in FIG. 1 comprises a time detection unit 11, a state determination unit 12, a speech recognition unit 13, and an information processing unit 14. The time detection unit 11 detects the time and notifies the state determination unit 12 of it.

[0034] The state determination unit 12 determines the state of the information providing terminal according to the time notified by the time detection unit 11. The speech recognition unit 13 recognizes speech input by the user according to the state of the terminal determined by the state determination unit 12 (at that time), and outputs the recognition result to the information processing unit 14.

[0035] The information processing unit 14 provides the user with the function (service) selected by the recognition result obtained from the speech recognition unit 13. The functions that the information processing unit 14 can execute and the recognition results it can accept change with the state determined by the state determination unit 12. In other words, the information processing unit 14 provides the user with the function determined by the recognition result obtained from the speech recognition unit 13 and the state determined by the state determination unit 12.

[0036] Next, the operation of the information providing terminal shown in FIG. 1 is outlined. The time detection unit 11 constantly detects the time and notifies the state determination unit 12 of it. The state determination unit 12 reads the time notified by the time detection unit 11 periodically (for example, at intervals equal to or shorter than the smallest unit of the time notified by the time detection unit 11) and, according to that time, determines the state that governs the operation of the terminal. The state determined by the state determination unit 12 is constantly notified to the speech recognition unit 13 and the information processing unit 14.

[0037] When a user wants the information providing terminal configured as in FIG. 1 to provide some function (service or information) and speaks a command or the like requesting that function, the speech recognition unit 13 recognizes the speech according to the state notified by the state determination unit 12 at that time. The recognition result of the speech recognition unit 13 is output to the information processing unit 14.

[0038] In the information processing unit 14, the functions, the operation method, and so on change according to the state notified by the state determination unit 12. When the information processing unit 14 receives from the speech recognition unit 13 the recognition result for the speech input by the user, it performs the operation determined by that recognition result according to the state at that time.

[0039] Next, each part of the configuration of FIG. 1 is described in detail. Here, the speech recognition method applied in the speech recognition unit 13 is assumed to be a word recognition method based on the SMQ-HMM method (statistical matrix quantization, hidden Markov model method), and the information providing terminal shown in FIG. 1 is assumed to recognize the speech uttered by the user and operate accordingly. The terminal is also assumed to present information to the user on a screen.

[0040] First, FIG. 2 shows the details of the voice recognition unit 13. The voice recognition unit 13 shown in FIG. 2 comprises a word dictionary switching unit 131, word dictionaries 132-0 and 132-1, and a voice collating unit 133.

[0041] The word dictionary switching unit 131 determines and selects, from among a plurality of word dictionaries, the word dictionary to be used by the voice collating unit 133, according to the state determined by the state determination unit 12 in FIG. 1. In this embodiment, the state determination unit 12 determines one of two states, and state numbers #0 and #1 are used as the information indicating the state. In this case, the word dictionary switching unit 131 selects word dictionary 132-0 when state number #0 is notified by the state determination unit 12, and selects word dictionary 132-1 when state number #1 is notified. Each of the word dictionaries 132-0 and 132-1 holds, registered in advance, word models for recognizing the words of a vocabulary suited to the time (time period), indicated by the state number, at which the user of the information providing terminal inputs a voice.

[0042] FIG. 3 shows an example of the word dictionaries 132-0 and 132-1. Word dictionary 132-0, which is used for state number #0 (selected by the word dictionary switching unit 131 and used by the voice collating unit 133), holds word models for recognizing the two words "security room" and "duty room". Word dictionary 132-1, which is used for state number #1 (selected by the word dictionary switching unit 131 and used by the voice collating unit 133), holds word models for recognizing the three words "general affairs department", "sales department", and "security room".
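The two vocabularies of FIG. 3 and the switching rule of the word dictionary switching unit 131 can be sketched as a simple lookup. This is only an illustration under stated assumptions: the names `WORD_DICTIONARIES` and `select_word_dictionary` are hypothetical, the English words stand in for the Japanese vocabulary, and plain strings replace the HMM word models that the actual dictionaries hold.

```python
# Hypothetical stand-ins for word dictionaries 132-0 and 132-1 (FIG. 3).
# The real dictionaries hold HMM word models; strings are used here only
# to show which vocabulary is active in each state.
WORD_DICTIONARIES = {
    0: ["security room", "duty room"],
    1: ["general affairs department", "sales department", "security room"],
}


def select_word_dictionary(state_number):
    """Return the vocabulary for the notified state number (#0 or #1)."""
    return WORD_DICTIONARIES[state_number]
```

With this layout, "security room" is recognizable in both states, while "general affairs department" is recognizable only in state #1.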

[0043] The voice collating unit 133 matches the input voice signal against the recognition word models registered in advance in the word dictionary selected by the word dictionary switching unit 131, and takes the word with the best match as the voice recognition result.

[0044] Various methods are known for matching a voice signal against recognition word models; here, a matching method using discrete HMMs is described as an example. In this method, the voice signal is first converted into a time series of discrete symbols called voice segments, and this series is matched against the recognition word models registered in advance in the word dictionary selected by the word dictionary switching unit 131.

[0045] FIG. 4 shows the details of the voice collating unit 133. The voice collating unit 133 shown in FIG. 4 comprises a feature extraction unit 1331, a discretization unit 1332, and an HMM matching unit 1333.

[0046] In the voice collating unit 133 configured as in FIG. 4, the voice uttered by the user is quantized by an A/D converter (not shown), for example at a sampling frequency of 12 kHz with 12 bits. The feature extraction unit 1331 successively extracts 24 msec of data from this quantized data at an 8 msec period and computes a 16th-order LPC (Linear Predictive Coding) (mel) cepstrum from each extracted segment.

[0047] As a result, a 16-dimensional vector (feature parameter) is obtained every 8 msec. The discretization unit 1332 matrix-quantizes these feature parameters along the time axis and quantizes (symbolizes) them into voice segments drawn from a set of about several hundred.
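The framing arithmetic implied by the figures above (12 kHz sampling, 24 msec segments, 8 msec period) can be checked with a short sketch. The function name `split_into_frames` and the choice to drop a trailing partial frame are assumptions made for illustration; the cepstrum computation and the matrix quantization themselves are not shown.

```python
SAMPLE_RATE_HZ = 12_000  # sampling frequency given in the text
FRAME_MS = 24            # length of each extracted data segment
SHIFT_MS = 8             # extraction period

FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000    # 288 samples per segment
FRAME_SHIFT = SAMPLE_RATE_HZ * SHIFT_MS // 1000  # 96 samples between segment starts


def split_into_frames(samples):
    """Cut a sample sequence into overlapping frames.

    A trailing partial frame is dropped (an assumption; the text does not
    say how the end of the utterance is handled).
    """
    frames = []
    start = 0
    while start + FRAME_LEN <= len(samples):
        frames.append(samples[start:start + FRAME_LEN])
        start += FRAME_SHIFT
    return frames
```

Each frame would then yield one 16-dimensional feature vector, so consecutive frames overlap by 16 msec and one vector is produced per 8 msec of speech.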

[0048] Through the above processing, the voice input by the user is converted into a time series of voice segments. This voice segment time series is hereinafter called the input sequence. The HMM matching unit 1333 applies an HMM-based matching method: it matches the input sequence against the word models in the word dictionary selected by the word dictionary switching unit 131 (here, word dictionary 132-0 or 132-1) and outputs the word corresponding to the best-matching word model as the voice recognition result.

[0049] A word model in a word dictionary is the discrete HMM corresponding to that word. A discrete HMM has N states S_1, S_2, ..., S_N, and the initial state is distributed probabilistically over these N states. For speech, a model is used that makes a state transition with a certain probability (transition probability) at each fixed period. At each transition, a label is output with a certain probability (output probability). In this example the input sequence is a voice segment sequence, so voice segments are used as the labels. Given such a model, the probability that the discrete HMM for each word generates the input sequence can be computed, and the word whose model gives the highest probability is taken as the voice recognition result.

[0050] An HMM is defined by the following six parameters.
N: number of states (states S_1, S_2, ..., S_N)
K: number of labels (labels R = 1, 2, ..., K)
p_ij: transition probability (probability of a transition from S_i to S_j)
q_ij(k): probability of outputting label k on the transition from S_i to S_j
m_i: initial state probability (probability that S_i is the initial state)
F: set of states that can be final states
FIG. 5 shows a typical structure of the HMMs generally used in voice recognition, for the case where the number of states N is 10.

[0051] In HMM-based matching, prior to recognition, training data for the recognition words recorded from many speakers is used to estimate the model parameters that maximize the output probability of the training data, and the resulting models are registered in the word dictionary corresponding to each recognition vocabulary. The forward-backward algorithm is a known algorithm for this estimation.

[0052] In HMM-based matching, the probability that the model of a recognition word w outputs the label sequence O = o_1, o_2, ..., o_T is also computed. The Viterbi algorithm is a known algorithm for obtaining this probability.
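As a rough illustration of the matching step, the sketch below scores a label sequence with the Viterbi recursion on an HMM whose labels are emitted on transitions, using the parameter names N, m, p, q, and F from the definition above. It is a minimal sketch, not the patent's implementation: the dictionary layout, the function names, and the toy two-state models are all hypothetical, labels are numbered from 0 rather than 1, and the Viterbi score is the probability of the single most likely state path rather than the sum over all paths.

```python
def viterbi_score(model, labels):
    """Probability of the best state path that outputs `labels` and ends in F."""
    n = model["N"]
    delta = list(model["m"])  # delta[i]: best path probability currently in S_i
    for k in labels:
        # Extend every path by one transition S_i -> S_j that outputs label k.
        delta = [
            max(delta[i] * model["p"][i][j] * model["q"][i][j][k] for i in range(n))
            for j in range(n)
        ]
    return max(delta[j] for j in model["F"])


def recognize(word_models, labels):
    """Return the word whose model gives the highest Viterbi score."""
    return max(word_models, key=lambda w: viterbi_score(word_models[w], labels))


# Toy two-state, two-label models (made-up numbers, for illustration only).
MODEL_A = {
    "N": 2, "m": [1.0, 0.0], "F": {1},
    "p": [[0.5, 0.5], [0.0, 1.0]],
    "q": [[[0.9, 0.1], [0.5, 0.5]],
          [[0.5, 0.5], [0.1, 0.9]]],
}
MODEL_B = {
    "N": 2, "m": [1.0, 0.0], "F": {1},
    "p": [[0.5, 0.5], [0.0, 1.0]],
    "q": [[[0.1, 0.9], [0.5, 0.5]],
          [[0.5, 0.5], [0.9, 0.1]]],
}
```

For the label sequence `[0, 1]`, `recognize({"word_a": MODEL_A, "word_b": MODEL_B}, [0, 1])` picks `"word_a"`, because that model's best path is more probable. A practical implementation would work with log probabilities to avoid underflow on long sequences.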

[0053] Next, FIG. 6 shows the details of the state determination unit 12 in FIG. 1. The state determination unit 12 shown in FIG. 6 comprises a state number storage unit 121, a state selection unit 122, and a state setting unit 123.

[0054] The state number storage unit 121 holds, registered in advance, the correspondence between time ranges (time periods) and the state numbers that designate the state of the information providing terminal at times within each range. Here, as described above, there are two states, and state numbers #0 and #1 correspond to the respective recognition vocabularies. FIG. 7 shows an example of the registration in the state number storage unit 121. In the example of FIG. 7, state number #0 is set for the time period from 0:00 to 8:00, state number #1 for the time period from 8:00 to 17:00, and state number #0 for the time period from 17:00 to 24:00.

[0055] The state selection unit 122 periodically reads the time notified by the time detection unit 11, refers to the state number storage unit 121, and determines the state number (indicating the state) that corresponds to the notified time. The state number determined by the state selection unit 122 is set in the state setting unit 123 and notified to the voice recognition unit 13 and the information processing unit 14.

[0056] For example, when the time "17:32" notified by the time detection unit 11 is read, the state selection unit 122 uses that time and the contents of the state number storage unit 121 shown in FIG. 7 to look up the state number of the time range to which "17:32" belongs, and sets state number #0 in the state setting unit 123. State number #0 set in the state setting unit 123 continues to be output to the voice recognition unit 13 and the information processing unit 14 until it is updated to another state number. In the example of FIG. 7, state number #0 is set in the state setting unit 123 during the time period from 0:00 to 8:00, state number #1 during the time period from 8:00 to 17:00, and state number #0 during the time period from 17:00 to 24:00.
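The lookup performed by the state selection unit 122 against the table of FIG. 7 can be sketched as follows. The names `STATE_TABLE` and `select_state` are hypothetical, and treating each range as including its start time and excluding its end time is an assumption, since the text does not say which state applies exactly at 8:00 or 17:00.

```python
# Time ranges from FIG. 7 as (start, end, state number), in minutes since 0:00.
# Ranges are treated as half-open [start, end), which is an assumption.
STATE_TABLE = [
    (0 * 60, 8 * 60, 0),    # 0:00 to 8:00   -> state #0
    (8 * 60, 17 * 60, 1),   # 8:00 to 17:00  -> state #1
    (17 * 60, 24 * 60, 0),  # 17:00 to 24:00 -> state #0
]


def select_state(hour, minute):
    """Return the state number for the time range containing hour:minute."""
    now = hour * 60 + minute
    for start, end, state_number in STATE_TABLE:
        if start <= now < end:
            return state_number
    raise ValueError(f"time {hour}:{minute:02d} is outside the table")
```

For the time used in the example above, `select_state(17, 32)` yields state number 0.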

[0057] Next, FIG. 8 shows the details of the information processing unit 14 in FIG. 1. The information processing unit 14 shown in FIG. 8 comprises an information dictionary switching unit 141, information dictionaries 142-0 and 142-1, and an output generation unit 143.

[0058] The information dictionary switching unit 141 determines and selects, from among a plurality of information dictionaries, the information dictionary to be used by the output generation unit 143, according to the state (the state number indicating it) determined by the state determination unit 12 in FIG. 1. In this embodiment, the information dictionary switching unit 141 selects information dictionary 142-0 when state number #0 is notified by the state determination unit 12, and selects information dictionary 142-1 when state number #1 is notified. Each of the information dictionaries 142-0 and 142-1 holds, registered in advance, pairs consisting of a voice recognition result that the information providing terminal accepts at times (within the range) for which the corresponding state is selected (determined) and the information (here, guidance information) to be presented to the user for that voice recognition result.

[0059] FIG. 9 shows an example of the information dictionaries 142-0 and 142-1. Information dictionary 142-0, which is used for state number #0 (selected by the information dictionary switching unit 141 and used by the output generation unit 143), holds the response sentences to be displayed on the screen when the voice recognition result is "security room" or "duty room". Information dictionary 142-1, which is used for state number #1 (selected by the information dictionary switching unit 141 and used by the output generation unit 143), holds the response sentences (for screen display) corresponding to the voice recognition results "general affairs department", "sales department", and "security room". The information registered in the information dictionaries 142-0 and 142-1 (to be presented to the user) thus consists of sentences for screen display, but it may equally be other information such as images or audio.

[0060] The output generation unit 143 is given the voice recognition result that the voice recognition unit 13 in FIG. 1 produces for the input voice. On receiving this voice recognition result, the output generation unit 143 searches the information dictionary selected by the information dictionary switching unit 141, selects the information (guidance information) corresponding to the voice recognition result, and presents it to the user. In this embodiment, the information is displayed on the screen.

[0061] Next, the overall operation of this embodiment is described with a specific example. First, suppose that a user of the information providing terminal configured as in FIG. 1 says "general affairs department" toward the terminal's voice input unit (not shown) at, for example, 10:15.

[0062] The voice is then input to the voice recognition unit 13. The input voice is analyzed by the feature extraction unit 1331 and then converted into a voice segment sequence by the discretization unit 1332.

[0063] Meanwhile, the state determination unit 12 is constantly notified of the time detected by the time detection unit 11. The state selection unit 122 in the state determination unit 12 periodically (for example, every 30 seconds) reads the time notified by the time detection unit 11 and refers to the state number storage unit 121. Suppose the state selection unit 122 reads the notified time at "10:15", the time at which the user's voice was input. From that time and the contents of the state number storage unit 121 shown in FIG. 7, the state selection unit 122 selects state number #1, the state number of the time range to which "10:15" belongs, and sets it in the state setting unit 123. State number #1, now held in the state determination unit 12, is notified to the voice recognition unit 13 and the information processing unit 14. As is clear from FIG. 7, the time range (time period) during which state number #1 is set in the state setting unit 123 is from 8:00 to 17:00.

[0064] The word dictionary switching unit 131 in the voice recognition unit 13 selects either word dictionary 132-0 or word dictionary 132-1 according to the state number notified by the state determination unit 12 (by the state setting unit 123 within it), thereby switching the recognition vocabulary targeted by the voice collating unit 133. When the state number is #1, word dictionary 132-1 is selected.

[0065] Meanwhile, when the user's voice input (here, the utterance "general affairs department") begins at 10:15 as described above, the HMM matching unit 1333 provided in the voice collating unit 133 of the voice recognition unit 13 uses word dictionary 132-1 (with the contents shown in FIG. 3), which is the dictionary selected by the word dictionary switching unit 131 at the start of the voice input. It matches the voice segment sequence corresponding to the input voice (converted by the feature extraction unit 1331 and the discretization unit 1332 as described above) against the word models for "general affairs department", "sales department", and "security room", and outputs "general affairs department" as the voice recognition result. This result is sent to the information processing unit 14 as the voice recognition result of the voice recognition unit 13.

[0066] At this time, in the information processing unit 14, the information dictionary switching unit 141 has selected information dictionary 142-1 according to state number #1 from the state determination unit 12 (from the state setting unit 123 within it).

[0067] The output generation unit 143 in the information processing unit 14 therefore searches information dictionary 142-1 (selected by the information dictionary switching unit 141) for the message (response sentence) corresponding to the voice recognition result "general affairs department" from the voice recognition unit 13 (from the HMM matching unit 1333 provided in the voice collating unit 133 within it). As is clear from FIG. 9, the output generation unit 143 thereby obtains from information dictionary 142-1 the message "The general affairs department is extension 0001" corresponding to the voice recognition result "general affairs department", and displays the message on the screen, presenting the user with the response to the input voice "general affairs department".
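The state-dependent lookup performed by the information dictionary switching unit 141 and the output generation unit 143 can be sketched as below. The names `INFO_DICTIONARIES` and `generate_output` are hypothetical, and only the two FIG. 9 messages that the text quotes are filled in; the remaining entries of FIG. 9 are not reproduced in the text and are left out rather than guessed.

```python
# Hypothetical stand-ins for information dictionaries 142-0 and 142-1 (FIG. 9).
# Only the messages quoted in the text are included; other entries are omitted.
INFO_DICTIONARIES = {
    0: {"security room": "The security room is extension 0004."},
    1: {"general affairs department": "The general affairs department is extension 0001."},
}


def generate_output(state_number, recognition_result):
    """Return the response sentence for a recognition result in the given state.

    Returns None when the selected dictionary has no entry for the result.
    """
    selected = INFO_DICTIONARIES[state_number]  # role of switching unit 141
    return selected.get(recognition_result)     # role of output generation unit 143
```

In the 10:15 example, the state number is #1, so `generate_output(1, "general affairs department")` returns the extension 0001 message.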

[0068] Similarly, when the user says "security room" at, for example, 7:30, the state number notified at that time by the state determination unit 12 (by the state setting unit 123 within it) is #0. The word dictionary switching unit 131 in the voice recognition unit 13 therefore selects word dictionary 132-0, and the information dictionary switching unit 141 in the information processing unit 14 selects information dictionary 142-0.

[0069] Accordingly, the HMM matching unit 1333 provided in the voice collating unit 133 of the voice recognition unit 13 uses word dictionary 132-0 to match the voice segment sequence corresponding to the user's input voice "security room" against the word models for "security room" and "duty room", and outputs "security room" as the voice recognition result. This result is sent to the information processing unit 14 as the voice recognition result of the voice recognition unit 13.

[0070] At this time, in the information processing unit 14, the information dictionary switching unit 141 has selected information dictionary 142-0 according to state number #0 from the state determination unit 12 (from the state setting unit 123 within it).

[0071] The output generation unit 143 in the information processing unit 14 therefore uses information dictionary 142-0, which is selected by the information dictionary switching unit 141 (at the time the user says "security room"), to obtain the message "The security room is extension 0004" (see FIG. 9) corresponding to the voice recognition result "security room" from the voice recognition unit 13, and presents it to the user.

[Second Embodiment]
Next, a second embodiment of the information providing terminal of the present invention is described. The overall configuration of the information providing terminal in the second embodiment is broadly the same as in the first embodiment, and the internal configuration of each component is also the same as in the first embodiment except for the voice recognition unit.

[0072] The voice recognition unit of this embodiment (whose configuration differs from that of the first embodiment) is therefore described in detail with reference to the block diagram of FIG. 10. In FIG. 10, reference numeral 23 denotes the voice recognition unit corresponding to the voice recognition unit 13 of the first embodiment. The overall schematic configuration of an information providing terminal equipped with the voice recognition unit 23 shown in FIG. 10 is that of FIG. 1 with the voice recognition unit 13 replaced by the voice recognition unit 23. In the following description, FIG. 1 is therefore also referred to for convenience, on the understanding that the voice recognition unit 13 in it is replaced by the voice recognition unit 23.

[0073] The voice recognition unit 23 comprises an all-word dictionary 231, an all-voice recognition unit 232, a recognition vocabulary dictionary 233, and a recognition result inspection unit 234. The all-word dictionary 231 holds, registered in advance, the word models for recognizing the words that need to be recognized in every state the information providing terminal can take (here, the two states #0 and #1), that is, all the word models registered in the two word dictionaries 132-0 and 132-1 of the first embodiment. FIG. 11 shows an example of the all-word dictionary 231. Depending on its state, the information providing terminal of this embodiment accepts the four words "security room", "duty room", "general affairs department", and "sales department".

[0074] The all-voice recognition unit 232 matches the input voice signal against each word model registered in advance in the all-word dictionary 231 and outputs the word with the best match to the recognition result inspection unit 234 as a provisional voice recognition result. The configuration of the all-voice recognition unit 232 is the same as that of the voice collating unit 133 of the first embodiment shown in FIG. 4.

[0075] The recognition vocabulary dictionary 233 holds, registered in advance, information indicating in which states of the information providing terminal each word registered in the all-word dictionary 231 is valid. FIG. 12 shows an example of the recognition vocabulary dictionary 233. In the example of FIG. 12, the word "security room" is valid whether the state of the information providing terminal is #0 or #1, whereas the word "general affairs department" is valid only when the state is #1.

[0076] The recognition result inspection unit 234 uses the state number output by the state determination unit 12 and the recognition vocabulary dictionary 233 to determine whether the provisional voice recognition result output by the all-voice recognition unit 232 is valid in the state indicated by that state number. Only when it determines that the provisional voice recognition result is valid does the recognition result inspection unit 234 output that result to the information processing unit 14 as the (true) voice recognition result.

[0077] Specifically, the recognition result inspection unit 234 operates as follows. First, it searches the recognition vocabulary dictionary 233 with the provisional voice recognition result from the all-voice recognition unit 232 and obtains the states in which the word that is the provisional voice recognition result is valid.

[0078] Next, the recognition result inspection unit 234 checks whether the current state number output by the state determination unit 12 (the state it indicates) is among the states obtained from the recognition vocabulary dictionary 233 for the provisional voice recognition result. If it is, the provisional voice recognition result is judged valid; if not, it is judged invalid.

[0079] When the result is judged valid, the recognition result inspection unit 234 outputs the provisional voice recognition result to the information processing unit 14 as the (true) voice recognition result; when it is judged invalid, nothing is output to the information processing unit 14.
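The validity check made by the recognition result inspection unit 234 amounts to a set-membership test, sketched below. The names `RECOGNITION_VOCABULARY` and `inspect_result` are hypothetical. The entries for "security room" and "general affairs department" follow the description of FIG. 12; the entries for "duty room" and "sales department" are inferred from the first embodiment's word dictionaries and are an assumption.

```python
# Hypothetical stand-in for recognition vocabulary dictionary 233 (FIG. 12):
# each word maps to the set of state numbers in which it is valid.
RECOGNITION_VOCABULARY = {
    "security room": {0, 1},
    "general affairs department": {1},
    "duty room": {0},          # inferred from word dictionary 132-0 (assumption)
    "sales department": {1},   # inferred from word dictionary 132-1 (assumption)
}


def inspect_result(provisional_result, current_state):
    """Return the result if it is valid in the current state, else None.

    None models the case where nothing is output to the information
    processing unit.
    """
    valid_states = RECOGNITION_VOCABULARY.get(provisional_result, set())
    if current_state in valid_states:
        return provisional_result
    return None
```

For instance, "general affairs department" passes the check in state #1 but is suppressed in state #0, while "security room" passes in both states.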

[0080] With the voice recognition unit 23 configured as described above, if the user of the information providing terminal mistakenly utters a command that the terminal does not offer at that time, nothing is input to the information processing unit 14, so the device does not operate. The terminal therefore does not act on an erroneous voice recognition result, and the user can tell from the lack of response that the uttered word is one that is not accepted in the current time period.

[Third Embodiment]
Next, a third embodiment of the information providing terminal of the present invention is described.

[0081] FIG. 13 is a block diagram showing the schematic configuration of an information providing terminal according to the third embodiment of the present invention. The same parts as those in FIG. 1 are given the same reference numerals and their description is omitted. The information providing terminal shown in FIG. 13 comprises a time detection unit 11, a state determination unit 12, a voice recognition unit 33, and an information processing unit 34; it is the configuration of FIG. 1 with the voice recognition unit 13 replaced by the voice recognition unit 33 and the information processing unit 14 replaced by the information processing unit 34.

[0082] FIG. 14 shows the details of the voice recognition unit 33 in FIG. 13. The same parts as those in FIG. 10 are given the same reference numerals and their description is omitted. The voice recognition unit 33 shown in FIG. 14 comprises an all-word dictionary 231, an all-voice recognition unit 232, a recognition vocabulary dictionary 233, and a recognition result inspection unit 334. The voice recognition unit 33 differs from the voice recognition unit 23 shown in FIG. 10 in that the recognition result inspection unit 334 is used in place of the recognition result inspection unit 234. The operation of the recognition result inspection unit 334 is therefore described with reference to the flowchart of FIG. 15.

[0083] The recognition result inspection unit 334 uses the state number output by the state determination unit 12 and the recognition vocabulary dictionary 233 to determine whether the provisional voice recognition result output by the all-voice recognition unit 232 is valid in the state indicated by that state number (step S1). If the provisional voice recognition result is judged valid, the recognition result inspection unit 334 outputs it as the (true) voice recognition result (step S2); if it is judged invalid, no voice recognition result is output. In this respect it is the same as the recognition result inspection unit 234 in FIG. 10 (except that the voice recognition result is output to the information processing unit 34).

【００８４】この認識結果検査部３３４が図１０中の認
識結果検査部２３４と異なる点は、仮音声認識結果を無
効と判定した場合に、情報提供端末で受理できない単語
が利用者により誤って発声されたことを示す不正単語信
号３３５を（情報処理部３４に）出力する（ステップＳ
３）ことである。この不正単語信号３３５は、有効判定
時には出力されない。The recognition result inspection unit 334 differs from the recognition result inspection unit 234 in FIG. 10 in that, when it determines that the provisional voice recognition result is invalid, it outputs to the information processing unit 34 an illegal word signal 335 indicating that the user has mistakenly uttered a word that the information providing terminal cannot accept (step S3). The illegal word signal 335 is not output when the result is determined to be valid.
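The check in steps S1 to S3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation disclosed in the patent: the dictionary layout (a state number mapped to the set of words accepted in that state), the sample words, and the names `RECOGNITION_VOCABULARY`, `InspectionOutput`, and `inspect_result` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the recognition vocabulary dictionary 233:
# each state number maps to the words accepted in that state.
RECOGNITION_VOCABULARY = {
    0: {"weather", "news"},
    1: {"weather", "menu"},
}


@dataclass(frozen=True)
class InspectionOutput:
    recognition_result: Optional[str]  # (true) voice recognition result, or None
    illegal_word_signal: bool          # models the illegal word signal 335


def inspect_result(provisional_result: str, state_number: int) -> InspectionOutput:
    """Third-embodiment behaviour of the recognition result inspection unit."""
    accepted = RECOGNITION_VOCABULARY.get(state_number, set())
    if provisional_result in accepted:                       # step S1
        return InspectionOutput(provisional_result, False)   # step S2
    return InspectionOutput(None, True)                      # step S3: signal only
```

Returning the result and the signal as one value keeps the two outputs mutually exclusive, which matches the third embodiment, where the signal is never accompanied by a recognition result.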

【００８５】なお、認識結果検査部３３４での仮音声認
識結果に対する有効／無効の判定方法は、前記第２の実
施例で述べた図１０中の認識結果検査部２３４で適用し
た方法と同一で構わない。The recognition result inspection unit 334 may determine whether the provisional voice recognition result is valid or invalid by the same method as that applied in the recognition result inspection unit 234 in FIG. 10, described in the second embodiment.

【００８６】次に、図１３中の情報処理部３４の詳細を
図１６に示す。なお、図８と同一部分には同一符号を付
して説明を省略する。この図１６に示す情報処理部３４
は、情報辞書切替部１４１、情報辞書１４２-0，１４２
-1及び出力生成部３４３から構成される。この情報処理
部３４が図８に示した情報処理部１４と異なる点は、出
力生成部１４３に代えて出力生成部３４３が用いられて
いることである。そこで、出力生成部３４３の動作につ
いて、図１７のフローチャートを参照して説明する。Next, FIG. 16 shows the details of the information processing unit 34 in FIG. 13. Parts identical to those in FIG. 8 are given the same reference numerals, and their description is omitted. The information processing unit 34 shown in FIG. 16 comprises an information dictionary switching unit 141, information dictionaries 142-0 and 142-1, and an output generation unit 343. It differs from the information processing unit 14 shown in FIG. 8 in that the output generation unit 343 is used in place of the output generation unit 143. The operation of the output generation unit 343 is therefore described with reference to the flowchart of FIG. 17.

【００８７】（情報処理部３４内の）出力生成部３４３
は、図１４に示した構成の音声認識部３３（中の認識結
果検査部３３４）から出力される音声認識結果及び不正
単語信号３３５を監視し、いずれか一方が入力されると
動作を開始する。The output generation unit 343 (in the information processing unit 34) monitors the voice recognition result and the illegal word signal 335 output from the voice recognition unit 33 configured as shown in FIG. 14 (specifically, from its recognition result inspection unit 334), and starts operating when either one is input.

【００８８】まず、出力生成部３４３は、音声認識結果
及び不正単語信号３３５のいずれが入力されたかを判定
し（ステップＳ１１）、音声認識結果が入力された場合
には、その入力された音声認識結果に対応する情報を、
情報辞書切替部１４１によって選択されている情報辞書
から取得し、これを情報提供端末の利用者に提示する
（ステップＳ１２）。First, the output generation unit 343 determines which of the voice recognition result and the illegal word signal 335 has been input (step S11). If the voice recognition result has been input, it acquires the information corresponding to that result from the information dictionary selected by the information dictionary switching unit 141 and presents it to the user of the information providing terminal (step S12).

【００８９】これに対し、不正単語信号３３５が入力さ
れた場合には、出力生成部３４３は、情報提供端末の利
用者が発声した音声が無効であることを示す情報（メッ
セージ）を、その利用者に提示する（ステップＳ１
３）。この無効を示す情報として、例えば「無効な発声
がありました。もう一度言い直してください」というメ
ッセージを用い、当該メッセージを出力することで、利
用者に（その時刻では受理されない）間違った発声があ
ったことを伝えることができる。If, on the other hand, the illegal word signal 335 has been input, the output generation unit 343 presents to the user information (a message) indicating that the uttered voice is invalid (step S13). For example, outputting the message "There was an invalid utterance. Please say it again." tells the user that they made an incorrect utterance (one that is not accepted at that time).
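The branch in steps S11 to S13 can be sketched as below. This is a simplified model under assumptions: the information dictionaries are represented as plain word-to-response mappings, their contents are invented for illustration, and the function name `generate_output` is hypothetical.

```python
from typing import Mapping, Optional

INVALID_MESSAGE = "There was an invalid utterance. Please say it again."

# Hypothetical contents for the information dictionaries 142-0 and 142-1.
INFORMATION_DICTIONARIES = {
    0: {"weather": "It will be sunny this afternoon."},
    1: {"weather": "It will be clear tonight."},
}


def generate_output(selected: Mapping[str, str],
                    recognition_result: Optional[str],
                    illegal_word_signal: bool) -> str:
    """Third-embodiment behaviour of the output generation unit 343."""
    if illegal_word_signal:              # step S11: which input arrived?
        return INVALID_MESSAGE           # step S13
    if recognition_result is None:
        raise ValueError("neither a recognition result nor signal 335 was input")
    return selected[recognition_result]  # step S12
```

Here `selected` plays the role of whichever information dictionary the information dictionary switching unit 141 has chosen for the current state.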

【００９０】以上のように、音声認識部３３及び情報処
理部３４を構成すると、ある時刻に情報提供端末の利用
者により、当該端末でその時刻に提供していないコマン
ドが誤って発声された場合には、音声認識部３３から情
報処理部３４には不正単語信号３３５のみが入力され、
情報処理部３４から情報提供端末の利用者に対して、発
声した音声が無効であることを示す情報を提示すること
ができる。［第４の実施例］次に、本発明の情報提供端末の第４の
実施例について説明する。なお、本実施例の基本構成は
前記第３の実施例と同様であり、一部の構成要素の動作
機能が異なるだけのため、以下の説明では、便宜的に前
記第３の実施例で参照した図１３、図１４及び図１６を
流用する。With the voice recognition unit 33 and the information processing unit 34 configured as described above, if the user mistakenly utters a command that the terminal does not provide at that time, only the illegal word signal 335 is input from the voice recognition unit 33 to the information processing unit 34, and the information processing unit 34 can present to the user information indicating that the uttered voice is invalid. [Fourth Embodiment] Next, a fourth embodiment of the information providing terminal of the present invention is described. The basic configuration of this embodiment is the same as that of the third embodiment, and only the operation of some components differs. For convenience, the following description therefore reuses FIGS. 13, 14, and 16, which were referred to in the third embodiment.

【００９１】まず、前記第３の実施例では、ある時刻に
情報提供端末の利用者により、当該端末でその時刻に提
供していないコマンドが誤って発声された場合には、音
声認識部３３から情報処理部３４には不正単語信号３３
５のみを入力し、音声認識結果は入力しないものとして
説明した。これに対し、本実施例（第４の実施例）は、
不正単語信号３３５と共に音声認識結果も入力し、この
音声認識結果を情報処理部３４にて利用するようにした
ものである。In the third embodiment, when the user mistakenly utters a command that the terminal does not provide at that time, only the illegal word signal 335 is input from the voice recognition unit 33 to the information processing unit 34, and no voice recognition result is input. In this (fourth) embodiment, by contrast, the voice recognition result is input together with the illegal word signal 335, and the information processing unit 34 makes use of that voice recognition result.

【００９２】以下、本実施例の動作を、前記第３の実施
例と異なる音声認識部３３及び情報処理部３４を中心に
説明する。まず、本実施例における音声認識部３３が前
記第３の実施例と異なる部分は、認識結果検査部３３４
の動作機能だけであり、他の全単語辞書２３１、全音声
認識部２３２及び認識語彙辞書２３３については何ら変
わらない。そこで、ここでは、音声認識部３３内の認識
結果検査部３３４の動作について、図１８のフローチャ
ートを参照して説明する。The operation of this embodiment is described below, focusing on the voice recognition unit 33 and the information processing unit 34, which differ from those of the third embodiment. In the voice recognition unit 33, only the operation of the recognition result inspection unit 334 differs from the third embodiment; the all-word dictionary 231, the all-voice recognition unit 232, and the recognition vocabulary dictionary 233 are unchanged. The operation of the recognition result inspection unit 334 in the voice recognition unit 33 is therefore described with reference to the flowchart of FIG. 18.

【００９３】認識結果検査部３３４は、前記第３の実施
例と同様に、全音声認識部２３２から出力された仮音声
認識結果が、状態決定部１２から出力された状態番号の
示す状態で有効であるか否かを、当該状態番号と認識語
彙辞書２３３とを用いて判定する（ステップＳ２１）。As in the third embodiment, the recognition result inspection unit 334 uses the state number output from the state determination unit 12 and the recognition vocabulary dictionary 233 to determine whether the provisional voice recognition result output from the all-voice recognition unit 232 is valid in the state indicated by that state number (step S21).

【００９４】もし、仮音声認識結果を有効と判定した場
合には、認識結果検査部３３４は、この仮音声認識結果
を（真の）音声認識結果として情報処理部３４に出力し
（ステップＳ２２）、不正単語信号３３５は出力しな
い。If it determines that the provisional voice recognition result is valid, the recognition result inspection unit 334 outputs it to the information processing unit 34 as the (true) voice recognition result (step S22) and does not output the illegal word signal 335.

【００９５】これに対し、仮音声認識結果を無効と判定
した場合に、認識結果検査部３３４は、情報提供端末で
受理できない単語が利用者により誤って発声されたこと
を示す不正単語信号３３５を情報処理部３４に出力する
と共に、この無効と判定した仮音声認識結果を音声認識
結果として当該情報処理部３４に出力する（ステップＳ
２３）。このように本実施例における認識結果検査部３
３４は、仮音声認識結果の無効判定時に、不正単語信号
３３５だけでなく、仮音声認識結果も情報処理部３４に
出力する点で、前記第３の実施例と異なる。If, on the other hand, it determines that the provisional voice recognition result is invalid, the recognition result inspection unit 334 outputs to the information processing unit 34 the illegal word signal 335, indicating that the user has mistakenly uttered a word that the information providing terminal cannot accept, and also outputs the provisional voice recognition result judged invalid to the information processing unit 34 as the voice recognition result (step S23). The recognition result inspection unit 334 of this embodiment thus differs from that of the third embodiment in that, on an invalid determination, it outputs not only the illegal word signal 335 but also the provisional voice recognition result to the information processing unit 34.

【００９６】次に、本実施例における情報処理部３４が
前記第３の実施例と異なる部分は、出力生成部３４３の
動作機能だけであり、他の情報辞書切替部１４１及び情
報辞書１４２-0，１４２-1については何ら変わらない。
そこで、ここでは、情報処理部３４内の出力生成部３４
３の動作について、図１９のフローチャートを参照して
説明する。Next, in the information processing unit 34 of this embodiment, only the operation of the output generation unit 343 differs from the third embodiment; the information dictionary switching unit 141 and the information dictionaries 142-0 and 142-1 are unchanged. The operation of the output generation unit 343 in the information processing unit 34 is therefore described with reference to the flowchart of FIG. 19.

【００９７】情報処理部３４内の出力生成部３４３は、
図１４に示した構成の音声認識部３３（中の認識結果検
査部３３４）から出力される音声認識結果を監視し、当
該音声認識結果が入力されると動作を開始する。The output generation unit 343 in the information processing unit 34 monitors the voice recognition result output from the voice recognition unit 33 configured as shown in FIG. 14 (specifically, from its recognition result inspection unit 334), and starts operating when the voice recognition result is input.

【００９８】出力生成部３４３は、音声認識結果が入力
されると、当該音声認識結果と共に不正単語信号３３５
が入力されているか否かを調べ（ステップＳ３１）、不
正単語信号３３５が入力されていない場合には、入力さ
れた音声認識結果に対応する情報を、情報辞書切替部１
４１によって選択されている情報辞書から取得し、これ
を情報提供端末の利用者に提示する（ステップＳ３
２）。When the voice recognition result is input, the output generation unit 343 checks whether the illegal word signal 335 has been input together with it (step S31). If the illegal word signal 335 has not been input, it acquires the information corresponding to the input voice recognition result from the information dictionary selected by the information dictionary switching unit 141 and presents it to the user of the information providing terminal (step S32).

【００９９】これに対し、音声認識結果と共に不正単語
信号３３５が入力されている場合には、出力生成部３４
３は、その入力された音声認識結果と共に、当該音声認
識結果の示す音声（即ち情報提供端末の利用者が発声し
て認識された音声）がその時刻（時間帯）では無効であ
ることを示す情報（メッセージ）を利用者に提示する
（ステップＳ３３）。If, on the other hand, the illegal word signal 335 has been input together with the voice recognition result, the output generation unit 343 presents to the user the input voice recognition result together with information (a message) indicating that the voice indicated by that result (that is, the voice uttered by the user and recognized) is invalid at that time (time period) (step S33).
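The fourth-embodiment flow (steps S21 to S23 in the inspection unit, then S31 to S33 in the output generation unit) can be sketched as follows. This is an illustrative model only: the function names, the vocabulary and dictionary contents, and the wording of the invalidity message are assumptions, since the text does not fix them.

```python
from typing import Mapping, Set, Tuple


def inspect_result_v4(provisional_result: str, state_number: int,
                      vocabulary: Mapping[int, Set[str]]) -> Tuple[str, bool]:
    """Steps S21 to S23: the result is always forwarded; the flag marks it invalid."""
    valid = provisional_result in vocabulary.get(state_number, set())  # step S21
    return provisional_result, not valid  # S22 (flag False) or S23 (flag True)


def generate_output_v4(selected: Mapping[str, str], recognition_result: str,
                       illegal_word_signal: bool) -> str:
    """Steps S31 to S33 of the output generation unit 343."""
    if not illegal_word_signal:              # step S31
        return selected[recognition_result]  # step S32
    # Step S33: echo the recognized word together with the invalidity notice.
    return f'"{recognition_result}" cannot be used at this time.'


vocabulary = {0: {"weather", "news"}, 1: {"weather"}}        # hypothetical
night_dictionary = {"weather": "It will be clear tonight."}  # hypothetical

result, signal = inspect_result_v4("news", 1, vocabulary)
message = generate_output_v4(night_dictionary, result, signal)
```

Because the recognized word travels with the signal, the user can see what the terminal heard, which helps distinguish a misrecognition from a command that is simply unavailable at that time.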

【０１００】以上のように、音声認識部３３及び情報処
理部３４を構成すると、ある時刻に情報提供端末の利用
者により、当該端末でその時刻に提供していないコマン
ドが誤って発声された場合には、音声認識部３３から情
報処理部３４には音声認識結果及び不正単語信号３３５
の両方が入力され、情報処理部３４から情報提供端末の
利用者に対して、入力された音声の音声認識結果と、そ
れが無効であることを示す情報を提示することができ
る。［第５の実施例］次に、本発明の情報提供端末の第５の
実施例について説明する。なお、本実施例の基本構成も
前記第３の実施例と同様であり、一部の構成要素の動作
機能が異なるだけのため、以下の説明では、便宜的に前
記第３の実施例で参照した図１３、図１４及び図１６を
流用する。With the voice recognition unit 33 and the information processing unit 34 configured as described above, if the user mistakenly utters a command that the terminal does not provide at that time, both the voice recognition result and the illegal word signal 335 are input from the voice recognition unit 33 to the information processing unit 34, and the information processing unit 34 can present to the user the recognition result of the input voice together with information indicating that it is invalid. [Fifth Embodiment] Next, a fifth embodiment of the information providing terminal of the present invention is described. The basic configuration of this embodiment is also the same as that of the third embodiment, and only the operation of some components differs. For convenience, the following description therefore reuses FIGS. 13, 14, and 16, which were referred to in the third embodiment.

【０１０１】まず、前記第３の実施例では、音声認識部
３３内の認識結果検査部３３４から情報処理部３４内の
出力生成部３４３に不正単語信号３３５が入力された場
合には、利用者が発声した音声が無効であることを示す
情報を提示するものとして説明した。これに対し、本実
施例（第５の実施例）は、その時刻（時間帯）において
利用可能な単語のリストを利用者に提示するようにした
ものである。In the third embodiment, when the illegal word signal 335 is input from the recognition result inspection unit 334 in the voice recognition unit 33 to the output generation unit 343 in the information processing unit 34, information indicating that the voice uttered by the user is invalid is presented. In this (fifth) embodiment, by contrast, a list of the words usable at that time (time period) is presented to the user.

【０１０２】以下、本実施例の動作を、前記第３の実施
例と異なる情報処理部３４内の出力生成部３４３の機能
を中心に、図２０のフローチャートを参照して説明す
る。まず、音声認識部３３内の認識結果検査部３３４か
ら情報処理部３４内の出力生成部３４３には、前記第３
の実施例と同様に、全音声認識部２３２で認識された仮
音声認識結果が有効であるならば当該仮音声認識結果が
音声認識結果として入力され、仮音声認識結果が無効で
あるならば不正単語信号３３５だけが入力される。The operation of this embodiment is described below with reference to the flowchart of FIG. 20, focusing on the function of the output generation unit 343 in the information processing unit 34, which differs from that of the third embodiment. As in the third embodiment, the recognition result inspection unit 334 in the voice recognition unit 33 inputs to the output generation unit 343 in the information processing unit 34 the provisional voice recognition result recognized by the all-voice recognition unit 232 as the voice recognition result if that provisional result is valid, and only the illegal word signal 335 if it is invalid.

【０１０３】出力生成部３４３は、音声認識部３３内の
認識結果検査部３３４から出力される音声認識結果及び
不正単語信号３３５を監視し、いずれか一方が入力され
ると動作を開始する。The output generation unit 343 monitors the voice recognition result and the illegal word signal 335 output from the recognition result inspection unit 334 in the voice recognition unit 33, and starts operation when either one is input.

【０１０４】まず、出力生成部３４３は、音声認識結果
及び不正単語信号３３５のいずれが入力されたかを判定
し（ステップＳ４１）、音声認識結果が入力された場合
には、その入力された音声認識結果に対応する情報を、
情報辞書切替部１４１によって選択されている情報辞書
から取得し、これを情報提供端末の利用者に提示する
（ステップＳ４２）。ここまでは、前記第３の実施例と
同様である。First, the output generation unit 343 determines which of the voice recognition result and the illegal word signal 335 has been input (step S41). If the voice recognition result has been input, it acquires the information corresponding to that result from the information dictionary selected by the information dictionary switching unit 141 and presents it to the user of the information providing terminal (step S42). Up to this point, the operation is the same as in the third embodiment.

【０１０５】これに対し、不正単語信号３３５が入力さ
れた場合には、出力生成部３４３は、その時点において
情報辞書切替部１４１によって選択されている情報辞書
（情報辞書１４２-0または情報辞書１４２-1のいずれか
一方）を走査し、現時点で利用可能な単語リストの出力
であることを示す情報、例えば「現在利用できる単語は
以下の通りです」のメッセージと共に、その情報辞書に
含まれている単語のリストを利用者に提示する（ステッ
プＳ４３）。If, on the other hand, the illegal word signal 335 has been input, the output generation unit 343 scans the information dictionary selected at that time by the information dictionary switching unit 141 (either the information dictionary 142-0 or the information dictionary 142-1) and presents to the user a list of the words contained in that dictionary, together with information indicating that the output is a list of currently usable words, for example the message "The words currently available are as follows" (step S43).

【０１０６】以上のように、情報処理部３４を構成する
と、ある時刻に情報提供端末の利用者により、当該端末
でその時刻に提供していないコマンドが誤って発声され
た場合には、音声認識部３３から情報処理部３４には不
正単語信号３３５のみが入力され、情報処理部３４から
情報提供端末の利用者に対して、その時刻（時間帯）に
有効な（受理可能な）単語のリストを提示することがで
きる。また、情報辞書切替部１４１によって選択されて
いる情報辞書中には、受理可能な音声（を示す単語）に
対応して、その音声が受理された場合に利用者に提供す
る機能（ここでは応答文）が登録されていることから、
単語リストだけでなく当該機能の一覧を利用者に提示す
ることも可能であり、機能一覧だけを提示することも可
能である。With the information processing unit 34 configured as described above, if the user mistakenly utters a command that the terminal does not provide at that time, only the illegal word signal 335 is input from the voice recognition unit 33 to the information processing unit 34, and the information processing unit 34 can present to the user a list of the words that are valid (acceptable) at that time (time period). In addition, the information dictionary selected by the information dictionary switching unit 141 registers, for each acceptable voice (the word representing it), the function provided to the user when that voice is accepted (here, a response sentence). It is therefore also possible to present a list of those functions along with the word list, or to present the function list alone.
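Step S43 and the variants just mentioned (word list, function list, or both) can be sketched as below. This is a hedged illustration: the information dictionary is modelled as a word-to-response mapping, and the function name `list_available`, its parameters, and the functions-only header text are assumptions made for this example.

```python
from typing import List, Mapping

WORD_HEADER = "The words currently available are as follows:"
FUNCTION_HEADER = "The functions currently available are as follows:"  # assumed text


def list_available(selected: Mapping[str, str], show_words: bool = True,
                   show_functions: bool = False) -> List[str]:
    """Step S43: scan the selected information dictionary and build the listing."""
    if not (show_words or show_functions):
        raise ValueError("at least one of words or functions must be shown")
    lines = [WORD_HEADER if show_words else FUNCTION_HEADER]
    for word, response in selected.items():
        if show_words and show_functions:
            lines.append(f"{word}: {response}")
        elif show_words:
            lines.append(word)
        else:
            lines.append(response)
    return lines
```

Because the listing is built from whichever dictionary is currently selected, it follows the state automatically and needs no separate per-state word table.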

【０１０７】なお、前記実施例では、状態決定部１２は
時刻検知部１１により通知されている時刻を定期的に読
み込んで状態決定を行うものしたが、例えば音声認識部
１３（２３，３３）から状態決定部１２に利用者による
音声入力開始を通知することで、音声入力開始時点だ
け、時刻検知部１１により通知されている時刻を読み込
んで状態決定を行うようにしても構わない。同様に、音
声認識部１３（２３，３３）から時刻検知部１１に音声
入力開始を通知することで、音声入力開始時点だけ、そ
の際の時刻を時刻検知部１１から状態決定部１２に通知
して、その際の状態を決定させるようにしても構わな
い。In the embodiments above, the state determination unit 12 periodically reads the time reported by the time detection unit 11 and determines the state. Alternatively, for example, the voice recognition unit 13 (23, 33) may notify the state determination unit 12 that the user has started voice input, so that the state determination unit 12 reads the time reported by the time detection unit 11 and determines the state only when voice input starts. Similarly, the voice recognition unit 13 (23, 33) may notify the time detection unit 11 of the start of voice input, so that the time detection unit 11 reports the time to the state determination unit 12 only when voice input starts, and the state at that moment is determined.
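The first alternative, in which the state is determined only when voice input starts, can be sketched as follows. This is a minimal model under assumptions: the time-period table, the default state, and the names `state_for`, `StateDeterminationUnit`, and `on_voice_input_start` are hypothetical, and the time detection unit 11 is represented by a callable that returns the current time.

```python
from datetime import time
from typing import Callable, Optional

# Hypothetical time-period table: (start inclusive, end exclusive, state number).
STATE_TABLE = [(time(9, 0), time(17, 0), 0)]
DEFAULT_STATE = 1  # assumed state for every other time of day


def state_for(now: time) -> int:
    """Map a time of day to a state number using the table above."""
    for start, end, state_number in STATE_TABLE:
        if start <= now < end:
            return state_number
    return DEFAULT_STATE


class StateDeterminationUnit:
    """Determines the state only when notified that voice input has started."""

    def __init__(self, read_time: Callable[[], time]) -> None:
        self._read_time = read_time  # stands in for the time detection unit 11
        self.state_number: Optional[int] = None

    def on_voice_input_start(self) -> int:
        # Called by the voice recognition unit; no periodic polling is needed.
        self.state_number = state_for(self._read_time())
        return self.state_number
```

Passing the clock in as a callable makes the unit easy to exercise with a fixed time, for example `StateDeterminationUnit(lambda: time(10, 30))`.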

【０１０８】また、前記実施例では、情報提供端末の提
供できるサービス（機能）を時刻（時間帯）によって切
り替え、利用者が発声した音声（単語）の内容により受
理可能な時刻（時間帯）が異なる場合について説明した
が、これに限るものではなく、本発明は、例えば日付ま
たは曜日によって切り替えるものであっても同様に適用
可能である。この他、日付と関連して休日であるか否か
（例えば会社が休日であるか否か）によって切り替える
ものであっても、日付、曜日、時刻、休日であるか否か
の２つ以上を組み合わせたもので切り替えるものであっ
ても、同様に適用可能である。また、前記実施例では、
利用者から操作可能な音声が単語の場合について説明し
たが、単語に限るものではなく、複合語や文であっても
同様に適用できる。The embodiments above describe the case in which the services (functions) that the information providing terminal can provide are switched according to the time (time period), so that the time (time period) at which an utterance is acceptable depends on the content of the voice (word) uttered by the user. The present invention is not limited to this and is equally applicable when switching is based on, for example, the date or the day of the week. It is likewise applicable when switching is based on whether the date is a holiday (for example, whether the company is closed that day), or on a combination of two or more of the date, the day of the week, the time, and whether it is a holiday. Furthermore, although the embodiments describe the case in which the voice the user can operate with is a word, it is not limited to words, and the same applies to compound words and sentences.
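A state decision that combines the day of the week, a holiday calendar, and the time of day might look like the sketch below. The two-state scheme, the 09:00 to 17:00 business hours, the sample holiday, and the names `COMPANY_HOLIDAYS` and `determine_state` are all assumptions chosen for illustration; the text only says that such combinations are possible.

```python
from datetime import date, datetime
from typing import AbstractSet

# Hypothetical company holidays; a real terminal would load its own calendar.
COMPANY_HOLIDAYS = frozenset({date(1995, 1, 16)})


def determine_state(now: datetime,
                    holidays: AbstractSet[date] = COMPANY_HOLIDAYS) -> int:
    """Return state 0 during weekday business hours, otherwise state 1."""
    is_weekend = now.weekday() >= 5         # Monday is 0, so 5 and 6 are Sat/Sun
    is_holiday = now.date() in holidays
    in_business_hours = 9 <= now.hour < 17  # assumed 09:00 to 17:00
    if is_weekend or is_holiday or not in_business_hours:
        return 1
    return 0
```

The returned state number would then select the word and information dictionaries in the same way as the time-only case.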

【０１０９】[0109]

【発明の効果】以上詳述したように本発明によれば、情
報提供端末が提供できるサービスを時刻等によって自動
的に切り替え、しかもそれに追従して音声認識可能な音
声を切り替えることができる。As described in detail above, according to the present invention, the services that the information providing terminal can provide are switched automatically according to the time or the like, and the voices that can be recognized are switched to follow that change.

【０１１０】また、本発明によれば、利用者が情報提供
端末を利用した時刻等には受理されないような無効なサ
ービスを音声で誤って指示した場合でも、情報提供端末
が誤動作するのを防止することができる。Further, according to the present invention, even if the user mistakenly requests by voice an invalid service that is not accepted at the time (or the like) at which the terminal is used, the information providing terminal can be prevented from malfunctioning.

【０１１１】また、本発明によれば、利用者が情報提供
端末を利用した時刻等には受理されないような無効なサ
ービスを音声で誤って指示した場合には、その無効な音
声に対応した適切な指示を利用者に提示することができ
る。したがって、本発明によれば、情報提供端末の利用
者にとって、常に適切なサービスを提供することがで
き、実用上多大なる効果がある。Further, according to the present invention, if the user mistakenly requests by voice an invalid service that is not accepted at the time (or the like) at which the terminal is used, an appropriate instruction corresponding to that invalid voice can be presented to the user. The present invention can therefore always provide appropriate service to the user of the information providing terminal, which is of great practical benefit.

[Brief description of drawings]

【図１】本発明の第１の実施例に係る情報提供端末の概
略構成を示すブロック図。FIG. 1 is a block diagram showing a schematic configuration of an information providing terminal according to a first embodiment of the present invention.

【図２】図１中の音声認識部１３の詳細構成を示すブロ
ック図。FIG. 2 is a block diagram showing the detailed configuration of the voice recognition unit 13 in FIG. 1.

【図３】図１中の単語辞書１３２-0，１３２-1の一例を
示す図。FIG. 3 is a diagram showing an example of the word dictionaries 132-0 and 132-1 in FIG. 1.

【図４】図２中の音声照合部１３３の詳細構成を示すブ
ロック図。FIG. 4 is a block diagram showing the detailed configuration of the voice collating unit 133 in FIG. 2.

【図５】音声認識で用いられるＨＭＭの代表的な構造
を、状態数Ｎが１０の場合について示す図。FIG. 5 is a diagram showing a typical structure of an HMM used in speech recognition when the number of states N is 10.

【図６】図１中の状態決定部１２の詳細構成を示すブロ
ック図。FIG. 6 is a block diagram showing the detailed configuration of the state determination unit 12 in FIG. 1.

【図７】図６中の状態番号記憶部１２１における登録例
を示す図。FIG. 7 is a diagram showing an example of the entries registered in the state number storage unit 121 in FIG. 6.

【図８】図１中の情報処理部１４の詳細構成を示すブロ
ック図。FIG. 8 is a block diagram showing the detailed configuration of the information processing unit 14 in FIG. 1.

【図９】図８中の情報辞書１４２-0，１４２-1の一例を
示す図。FIG. 9 is a diagram showing an example of the information dictionaries 142-0 and 142-1 in FIG. 8.

【図１０】本発明の第２の実施例に係る情報提供端末に
適用される音声認識部２３の詳細構成を示すブロック
図。FIG. 10 is a block diagram showing a detailed configuration of a voice recognition unit 23 applied to an information providing terminal according to a second embodiment of the present invention.

【図１１】図１０中の全単語辞書２３１の一例を示す
図。FIG. 11 is a diagram showing an example of the all-word dictionary 231 in FIG. 10.

【図１２】図１０中の認識語彙辞書２３３の一例を示す
図。FIG. 12 is a diagram showing an example of the recognition vocabulary dictionary 233 in FIG. 10.

【図１３】本発明の第３の実施例に係る情報提供端末の
概略構成を示すブロック図。FIG. 13 is a block diagram showing a schematic configuration of an information providing terminal according to a third embodiment of the present invention.

【図１４】図１３中の音声認識部３３の詳細構成を示す
ブロック図。FIG. 14 is a block diagram showing the detailed configuration of the voice recognition unit 33 in FIG. 13.

【図１５】上記第３の実施例における認識結果検査部３
３４の動作を説明するためのフローチャート。FIG. 15 is a flowchart for explaining the operation of the recognition result inspection unit 334 in the third embodiment.

【図１６】図１３中の情報処理部３４の詳細構成を示す
ブロック図。FIG. 16 is a block diagram showing the detailed configuration of the information processing unit 34 in FIG. 13.

【図１７】上記第３の実施例における出力生成部３４３
の動作を説明するためのフローチャート。FIG. 17 is a flowchart for explaining the operation of the output generation unit 343 in the third embodiment.

【図１８】本発明の第４の実施例における認識結果検査
部３３４の動作を説明するためのフローチャート。FIG. 18 is a flowchart for explaining the operation of the recognition result inspection unit 334 according to the fourth embodiment of the present invention.

【図１９】本発明の第４の実施例における出力生成部３
４３の動作を説明するためのフローチャート。FIG. 19 is a flowchart for explaining the operation of the output generation unit 343 in the fourth embodiment of the present invention.

【図２０】本発明の第５の実施例における出力生成部３
４３の動作を説明するためのフローチャート。FIG. 20 is a flowchart for explaining the operation of the output generation unit 343 in the fifth embodiment of the present invention.

[Explanation of symbols]

１１…時刻検知部（検知手段）、１２…状態決定部、１
３，２３，３３…音声認識部、１４，３４…情報処理
部、１２１…状態番号記憶部、１２２…状態選択部、１
２３…状態設定部、１３１…単語辞書切替部、１３２-
0，１３２-1…単語辞書、１３３…音声照合部、１４１
…情報辞書切替部、１４２-0，１４２-1…情報辞書、１
４３，３４３…出力生成部、２３１…全単語辞書、２３
２…全音声認識部、２３３…認識語彙辞書、２３４，３
３４…認識結果検査部、３３５…不正単語信号、１３３
１…特徴抽出部、１３３２…離散化部、１３３３…ＨＭ
Ｍ照合部。11 ... Time detection unit (detecting means), 12 ... State determination unit, 13, 23, 33 ... Voice recognition unit, 14, 34 ... Information processing unit, 121 ... State number storage unit, 122 ... State selection unit, 123 ... State setting unit, 131 ... Word dictionary switching unit, 132-0, 132-1 ... Word dictionary, 133 ... Voice collating unit, 141 ... Information dictionary switching unit, 142-0, 142-1 ... Information dictionary, 143, 343 ... Output generation unit, 231 ... All-word dictionary, 232 ... All-voice recognition unit, 233 ... Recognition vocabulary dictionary, 234, 334 ... Recognition result inspection unit, 335 ... Illegal word signal, 1331 ... Feature extraction unit, 1332 ... Discretization unit, 1333 ... HMM collating unit.

Continuation of the front page: (51) Int. Cl.⁶ (columns: identification code, internal reference number, FI, technical display location): G10L 3/00 571 H

Claims

1. An information providing terminal operable by voice, comprising: detecting means for detecting at least one of the date, the day of the week, the time, and whether or not it is a holiday; state determining means for determining the state of the information providing terminal according to the detection result of the detecting means; voice recognition means for recognizing an input voice according to the state indicated by the state determining means at that time; and information processing means for, when the input voice is recognized by the voice recognition means, switching the function provided to the user according to the state indicated by the state determining means and performing an information providing operation corresponding to the voice recognition result of the voice recognition means.

2. An information providing terminal operable by voice, comprising: detecting means for detecting at least one of the date, the day of the week, the time, and whether or not it is a holiday; state determining means for determining the state of the information providing terminal according to the detection result of the detecting means; voice recognition means for recognizing an input voice and determining, according to the state indicated by the state determining means at that time, whether or not the input voice is acceptable; and information processing means for, when the input voice is recognized by the voice recognition means, switching the function provided to the user according to the state indicated by the state determining means and performing an information providing operation corresponding to the voice recognition result of the voice recognition means, and for refraining from providing information to the user when the voice recognition means determines that the input voice is not acceptable.

3. An information providing terminal operable by voice, comprising: detecting means for detecting at least one of the date, the day of the week, the time, and whether or not it is a holiday; state determining means for determining the state of the information providing terminal according to the detection result of the detecting means; voice recognition means for recognizing an input voice and determining, according to the state indicated by the state determining means at that time, whether or not the input voice is acceptable; and information processing means for, when the input voice is recognized by the voice recognition means, switching the function provided to the user according to the state indicated by the state determining means and performing an information providing operation corresponding to the voice recognition result of the voice recognition means, and for presenting to the user information indicating that the input voice is invalid when the voice recognition means determines that the input voice is not acceptable.

4. An information providing terminal operable by voice, comprising: detecting means for detecting at least one of the date, the day of the week, the time, and whether or not it is a holiday; state determining means for determining the state of the information providing terminal according to the detection result of the detecting means; voice recognition means for recognizing an input voice and determining, according to the state indicated by the state determining means at that time, whether or not the input voice is acceptable; and information processing means for, when the input voice is recognized by the voice recognition means, switching the function provided to the user according to the state indicated by the state determining means and performing an information providing operation corresponding to the voice recognition result of the voice recognition means, and for presenting to the user information on the input voice indicated by the voice recognition result of the voice recognition means when the voice recognition means determines that the input voice is not acceptable.

5. An information providing terminal operable by voice, comprising: detecting means for detecting at least one of the date, the day of the week, the time, and whether or not it is a holiday; state determining means for determining the state of the information providing terminal according to the detection result of the detecting means; voice recognition means for recognizing an input voice and determining, according to the state indicated by the state determining means at that time, whether or not the input voice is acceptable; and information processing means for, when the input voice is recognized by the voice recognition means, switching the function provided to the user according to the state indicated by the state determining means and performing an information providing operation corresponding to the voice recognition result of the voice recognition means, and for presenting to the user at least one of a list of the voices acceptable at that time and a list of the functions that can be provided at that time when the voice recognition means determines that the input voice is not acceptable.