JPH04338817A

JPH04338817A - Electronic equipment controller

Info

Publication number: JPH04338817A
Application number: JP3139439A
Authority: JP
Inventors: Hidemi Tomizuka; 富塚　英省; Asako Tamura; 朝子田村; Yasuhiro Chigusa; 千種　康裕; Shiro Omori; 士郎大森
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1991-05-16
Filing date: 1991-05-16
Publication date: 1992-11-26

Abstract

PURPOSE: To obtain a controller which can accurately analyze the purpose of an instruction given by a user and then control electronic equipment in place of a user who is not accustomed to operating it. CONSTITUTION: The voice of a user 50 using electronic equipment such as a VTR 71 is recognized by a speech recognizing part 62. A conversation comprehending part 63 infers or comprehends the instructions of the user 50 based on the recognition result of the part 62. Based on this comprehension result, a control part 66 controls the electronic equipment. At the same time, a message preparation part 67 produces a message to be given to the user 50 based on the inference result of the part 63.

Description

[Detailed description of the invention]

[0001]

[Field of Industrial Application] The present invention relates to a control device for electronic equipment that controls various electronic devices in response to instructions from a user.

[0002]

[Prior Art] In recent years, with the spread of microprocessors and the like, electronic devices have become increasingly sophisticated, and their operation has tended to become correspondingly more complex. Under these circumstances, the number of users who cannot make full use of such complex electronic devices is increasing.

[0003] The following are possible reasons why electronic devices are difficult to use for so-called "technology-averse" users who are not good at operating them. First, the operation of electronic devices is complicated and hard to learn. Recent electronic devices have more functions and, accordingly, more operations. To realize so many operations, a device must either have as many buttons and switches as there are operations, or realize the operations through combinations of a smaller number of buttons; in either case, operation becomes complicated. It is very difficult for users who are weak with electronic devices to remember all of these complicated operations. Second, it is difficult for users to map their own intentions (what they have in mind) onto the various operations of the device. For example, some users cannot operate an electronic device even after reading its instruction manual. In such cases, there is a gap between the way the user personally expresses an intention and the way the manual describes the functions, so the user may be unable to connect the two.

[0004]

[Problems to Be Solved by the Invention] To solve these problems, various attempts have been made in electronic devices such as document creation devices (so-called word processors), for example explaining the operating method on the screen of the device or sending messages to the user by voice (Valerie L. Frankel & Osman Balci, "An on-line assistance system for the simulation model development environment", Int. J. Man-Machine Studies, 31, pp. 699-716, 1989; Noriyuki Ito et al., "Effects of voice messages in interactive computer systems for general users", Proceedings of the 3rd Human Interface Symposium, pp. 319-324, 1987). These can be regarded as attempts to build part of the instruction manual into the electronic device. Such an approach is effective for users who cannot use a device simply because they cannot remember the operating procedures.

[0005] However, such attempts are not necessarily effective for users who cannot make full use of an electronic device even after reading its instruction manual. Moreover, the information conveyed to the user by instruction manuals and so-called help screens is fixed and one-way; there is little flexibility to, for example, change the wording to suit the user's level of understanding or respond to the user's detailed questions.

[0006] For relatively simple tasks that require little judgment from the user, some electronic devices can automatically perform, through so-called fuzzy control and the like, the fine adjustments under various conditions that users previously made themselves (Kaoru Hirota, "Current status and prospects of fuzzy information processing applications", Journal of the Information Processing Society of Japan, vol. 30, no. 8, 1989, pp. 913-921). This reduces the user's operating effort and helps prevent erroneous operation. However, it is difficult to apply to complex tasks that require a large amount of input from the user, such as setting a recording reservation on a video tape recorder (VTR), described later.

[0007] Furthermore, there have been attempts to promote understanding of unfamiliar things by applying so-called metaphors to the interface between the user and the electronic device, that is, the man-machine interface (Brenda Laurel, "Interface Agent: Metaphors with Character", The Art of Human-Computer Interface Design, pp. 355-366, 1990; Kent L. Norman & John P. Chin, "The menu metaphor: food for thought", Behaviour and Information Technology, vol. 8, no. 2, pp. 125-134, 1989; Stuart K. Card & D. Austin Henderson, Jr., "Catalogues: A metaphor for computer application delivery", Human-Computer Interaction-Interact '87, pp. 959-964, 1987), and attempts to change the interface functions to suit the user's level (Susumu Yoshimura et al., "About user-friendly assistance", Proceedings of the 2nd Human Interface Symposium, pp. 39-42, 1986; Kyoji Kawagoe et al., "Intelligent dialogue system with an adaptive dialogue interface and its applications", Proceedings of the 2nd Human Interface Symposium, pp. 25-32, 1986; Hideyuki Takagi, "Subjective evaluation of a voice input word processor", Proceedings of the 2nd Human Interface Symposium, pp. 351-360, 1986). These are attempts to present the functions of electronic devices in a form that is easier for users to understand, but they do not actively draw out a correct understanding on the user's part.

[0008] The present invention has been proposed in view of the above circumstances, and its object is to provide a control device for electronic equipment that can control the operation of electronic equipment on behalf of a user who is unaccustomed to operating it.

[0009]

[Means for Solving the Problems] The control device for electronic equipment according to the present invention has been proposed to achieve the above object. It comprises: a speech recognition unit that recognizes speech from a user of the electronic equipment based on a preset vocabulary information table; a dialogue understanding unit that understands or infers the recognition result from the speech recognition unit in relation to at least control information for controlling the electronic equipment, and outputs the understood control information or inference result information; a message generation unit that receives the inference result information from the dialogue understanding unit and generates at least message information for asking the user a question or requesting confirmation; and a control unit that receives the control information from the dialogue understanding unit and controls the electronic equipment.

[0010] Here, the control device prepares a control information table for controlling the electronic equipment and a user command information table (the vocabulary information table). The speech recognition unit analyzes the user's spoken command for controlling the electronic equipment based on the vocabulary information table. The dialogue understanding unit compares the recognition result from the speech recognition unit with the control information table. If, as a result of this comparison, the user's command does not match any control information, the control information judged to be most suitable is selected, and the message generation unit generates an inquiry signal asking the user for permission to output the selected control information.
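
The comparison described in [0010] can be sketched as follows. This is only a minimal illustration under stated assumptions: the table contents, the names `CONTROL_TABLE`, `Decision`, and `understand`, and the keyword-overlap scoring are all hypothetical and are not taken from the specification, which does not say how the most suitable control information is judged.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical control information table: command -> related vocabulary.
CONTROL_TABLE = {
    "record": {"tape", "save", "keep"},
    "play": {"watch", "show", "see"},
    "stop": {"halt", "end", "quit"},
}


@dataclass
class Decision:
    command: Optional[str]   # selected control information, if any
    confirmed: bool          # True when the user's words matched directly
    inquiry: Optional[str]   # inquiry message when permission is needed


def understand(recognized_words):
    """Compare recognized words with the control information table."""
    words = {w.lower() for w in recognized_words}
    # Direct match: the user named a command that exists in the table.
    for command in CONTROL_TABLE:
        if command in words:
            return Decision(command, True, None)
    # No direct match: pick the command with the largest keyword overlap
    # (an assumed stand-in for "judged to be most suitable").
    best, best_score = None, 0
    for command, related in CONTROL_TABLE.items():
        score = len(words & related)
        if score > best_score:
            best, best_score = command, score
    if best is None:
        return Decision(None, False, "What would you like me to do?")
    # Ask the user for permission before the command is output.
    return Decision(best, False, f"Do you want me to {best}?")
```

In this sketch a command is issued without confirmation only when it matches the table directly; every inferred command goes back to the user as an inquiry first.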

[0011] The speech recognition unit uses, for example, a so-called NAT-method speech recognition device to recognize words within a pre-registered vocabulary domain. The dialogue understanding unit analyzes the context of the dialogue while using general common sense and knowledge of the user's habits to fill in, by inference, the information that is missing for understanding the user's intention. The message generation unit then generates messages for asking the user questions or requesting confirmation according to the dialogue situation. The message signal from the message generation unit is conveyed to the user as audio or video. As video, for example, an animated character is visualized on a television screen. As audio, the dialogue is carried out in natural language synthesized using, for example, rule-based speech synthesis.

[0012]

[Operation] According to the present invention, speech from the user is recognized, and the electronic equipment is controlled according to the control information understood from the recognition result. At this time, a message corresponding to the inference result information based on the recognition result is conveyed to the user, the user answers this message, and further inference or understanding is performed on the answer. The intention behind the user's command can thus be understood accurately, and the electronic equipment can therefore be controlled accurately.

[0013]

[Embodiments] Embodiments to which the present invention is applied are described below with reference to the drawings. As shown in FIG. 1, the control device for electronic equipment of this embodiment is a control device 60 that controls electronic equipment such as a video tape recorder (VTR) 71, a tape recorder 72, and a television receiver (TV) 73. It comprises: a speech recognition unit 62 that recognizes speech from a user (user 50) of the electronic equipment based on a preset vocabulary information table; a dialogue understanding unit 63 that understands or infers the recognition result from the speech recognition unit 62 in relation to at least control information for controlling the electronic equipment, and outputs the understood control information or inference result information; a message generation unit 67 that receives the inference result information from the dialogue understanding unit 63 and generates at least message information for asking the user a question or requesting confirmation; and an electronic equipment control unit 66 that receives the control information from the dialogue understanding unit 63 and controls the electronic equipment.

[0014] That is, descriptions of the operation of electronic equipment such as the VTR 71 can be expressed within a limited vocabulary domain. This embodiment therefore builds, on that vocabulary domain, a robust semantic analysis system that uses a semantic-analysis-driven method and can understand even ungrammatical input sentences (natural language entered by voice by the user 50) (Seiichi Misawa et al., "Interactive intelligent interface and its application to VTR recording reservation", Proceedings of the 2nd Human Interface Symposium, pp. 207-264, 1991; Seiichi Misawa et al., "Interactive intelligent interface driven by semantic analysis", Proceedings of the 40th National Convention of the Information Processing Society of Japan (1), pp. 448-449, 1990).
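
One way to picture semantic analysis over a limited vocabulary domain is keyword spotting into a slot frame, which is indifferent to word order and to filler words. The sketch below is an assumption made for illustration; the `VOCABULARY` entries, the slot names, and the function `analyze` are hypothetical, and the cited system may work quite differently.

```python
# Hypothetical limited vocabulary: word -> (slot, value).
VOCABULARY = {
    "record": ("action", "record"),
    "tape": ("action", "record"),
    "play": ("action", "play"),
    "today": ("day", "today"),
    "tomorrow": ("day", "tomorrow"),
    "news": ("program", "news"),
    "movie": ("program", "movie"),
}


def analyze(words):
    """Build a slot frame from the words, ignoring grammar and word order."""
    frame = {}
    for word in words:
        entry = VOCABULARY.get(word.lower())
        if entry is None:
            continue  # out-of-vocabulary words and fillers are skipped
        slot, value = entry
        frame.setdefault(slot, value)  # the first mention of a slot wins
    return frame
```

Because only in-vocabulary words contribute, an utterance such as "uh tomorrow news record please" yields the same frame as "record news tomorrow".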

[0015] In the configuration of FIG. 1, the speech recognition unit 62 uses, for example, a so-called NAT-method speech recognition device (Shinichi Tamura et al., "NAT-method speaker-independent word recognition", Proceedings of the Acoustical Society of Japan, pp. 21-22, 1985) to recognize, in the voice input of the user 50 supplied from the microphone 61, words within the pre-registered vocabulary domain (the command information table corresponding to commands from the user).

[0016] The dialogue understanding unit 63 understands the context of the dialogue while using general common sense and knowledge of the habits of the user 50 (a table for analyzing commands from the user) to fill in, by inference, the information that is missing for understanding the intention of the user 50.

[0017] The message generation unit 67 generates, according to the information (inference result information) passed from the dialogue understanding unit 63, a message signal suited to the situation (for example, a message signal for asking the user 50 a question or requesting confirmation according to the dialogue situation), and sends it to the speech synthesis unit 65 and the character display unit 68 to be actually presented. The message signal from the message generation unit 67 is conveyed to the user as audio or video. As video, for example, the character display unit 68 combines graphic data of an animated character AC (described later) based on the message signal, gives the animated character AC movements and facial expressions suited to the dialogue situation, and visualizes it on the television screen (monitor 69). As audio, natural-language speech synthesized by the speech synthesis unit 65 using, for example, a so-called rule-based speech synthesis method is emitted from the speaker 64 to carry on the dialogue with the user.
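
The routing of one message signal to both the speech synthesis unit and the character display unit might be sketched as follows. The scene names, templates, and expression labels are hypothetical placeholders, and `speak` and `draw` stand in for the speech synthesis unit 65 and character display unit 68; none of these details come from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    text: str        # sentence to be synthesized as speech
    expression: str  # expression or gesture for the animated character


# Hypothetical templates keyed by dialogue scene.
TEMPLATES = {
    "ask": ("Which {slot} do you mean?", "puzzled"),
    "confirm": ("Shall I {action}?", "attentive"),
    "done": ("All right, I will {action}.", "smiling"),
}


def generate_message(scene, **details):
    """Create the message signal for the given dialogue scene."""
    template, expression = TEMPLATES[scene]
    return Message(template.format(**details), expression)


def present(message, speak, draw):
    """Send the same message signal to the audio and video outputs."""
    speak(message.text)        # stands in for speech synthesis unit 65
    draw(message.expression)   # stands in for character display unit 68
```

Keeping text and expression in one message object reflects the description that a single message signal drives both the spoken output and the character's movements.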

[0018] Furthermore, the electronic equipment control unit 66 operates the actual electronic equipment (for example, the VTR 71) according to the intention of the command of the user 50 understood through the dialogue with the user 50, that is, based on the understood control information from the dialogue understanding unit.

[0019] In general, two things are considered necessary to build an environment in which a user and an electronic device can communicate easily: first, that the device has rich communication capabilities, and second, that it has the will to understand the user's intentions. Communication here is not limited to dialogue; it refers to the exchange of intentions by various means, including facial expressions and gestures.

[0020] Whatever the means, if a device can be made to appear to exchange various information with the user with the will to grasp the user's intention, communication between human and machine can be brought closer to communication between humans. An intelligent entity that behaves in this way is here called, for example, a machine will. In an interface environment where this machine will exists, the user can more easily convey intentions to the machine without resorting to complicated operations. This amounts to making the machine appear human. However, people do not normally perceive machines as human and, if anything, tend to dislike perceiving them that way.

[0021] Therefore, in this embodiment of the invention, as a means of realizing such a conscious presence on the device side, an agent (the control device 60) modeled on the metaphor of a person who stands between the user and the electronic device and mediates their communication is introduced into the human interface. That is, an agent that, like a human, converses with the user with the will to grasp the user's intention and operates the electronic device on the user's behalf is placed between the user and the electronic device.

[0022] In this embodiment, the metaphor chosen is an expert in device operation who operates the electronic device on the user's behalf, without error and exactly as the user intends. Because such an agent has, in addition to knowledge of and skill in operating the device, the will to realize the user's intention, the user can easily convey that intention and have the operation performed on their behalf. Realizing such an interface environment makes devices easier to use even for users who have found electronic devices difficult to operate. The outline of this interface is the configuration of this embodiment shown in FIG. 1 described above, and FIG. 2 shows the configuration of FIG. 1 in further simplified form.

[0023] In FIG. 2, the agent of the control device 60, modeled on an expert in operating the electronic device 70, has the following three functions.

[0024] As its first function, the control device 60 (agent) has knowledge about the operation of the electronic device 70 and knowledge about the user, and uses them to correctly understand the intention of the user 50. As its second function, when the electronic device 70 cannot be operated because the input from the user 50 is insufficient, it asks the user 50 for the necessary information. As its third function, it accurately operates the electronic device 70 on the user's behalf.

[0025] In the first function, by having a certain amount of general knowledge and common sense together with knowledge about the operation of the electronic device 70, the agent infers and fills in information that the user 50 did not state explicitly in ambiguous expressions or expressions with information partly omitted. By also having knowledge of the user's preferences and habits, it can discern the intention of the user 50 even from little information.
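
Filling in unstated information from general knowledge and user habits can be illustrated with a simple layered-defaults scheme. The sketch assumes, purely for illustration, that explicit input overrides user habits and that user habits override general knowledge; the table contents and the name `interpolate` are hypothetical.

```python
# Hypothetical knowledge sources, from weakest to strongest.
GENERAL_KNOWLEDGE = {"day": "today", "speed": "standard"}
USER_HABITS = {"channel": 1, "speed": "long-play"}


def interpolate(frame, habits=USER_HABITS, general=GENERAL_KNOWLEDGE):
    """Complete a partial frame; return it with the list of inferred slots."""
    completed = dict(general)   # weakest: general common sense
    completed.update(habits)    # stronger: what this user usually does
    completed.update(frame)     # strongest: what the user actually said
    inferred = sorted(slot for slot in completed if slot not in frame)
    return completed, inferred
```

Returning the list of inferred slots alongside the completed frame lets a later step decide which guesses are worth confirming with the user.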

[0026] In the second function, if there is information that could not be fully understood by the first function, the agent asks the user 50 about it. When the user 50 appears to be at a loss, for example not knowing how to enter input or what to enter, the agent also proactively prompts the user to obtain the information.

[0027] In the third function, based on the information obtained by the first and second functions, the agent accurately performs the operation of the electronic device 70 intended by the user 50 on the user's behalf.
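
The second and third functions together form a loop of asking for whatever is still missing and then operating the device. The sketch below assumes a hypothetical set of required slots for a recording reservation and takes `ask` and `execute` as callbacks; these names and the question wording are illustrative only.

```python
# Hypothetical slots a recording reservation needs before it can run.
REQUIRED_SLOTS = ("channel", "start time", "end time")


def missing_slots(frame):
    """List the required slots that the frame does not yet contain."""
    return [slot for slot in REQUIRED_SLOTS if slot not in frame]


def run_agent(frame, ask, execute):
    """Ask the user for each missing slot, then operate the device."""
    frame = dict(frame)  # do not modify the caller's frame
    for slot in missing_slots(frame):
        frame[slot] = ask(f"What is the {slot}?")  # second function
    execute(frame)                                 # third function
    return frame
```

For example, passing a frame that already holds the channel and start time causes exactly one question, about the end time, before the operation is executed.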

[0028] In addition to the abilities described above, a skilled human operator has various traits that help make communication smooth, and these are also realized in this embodiment. For example, the agent is given a concrete, human-like form (the animated character AC), as described later. By presenting this animated character AC in front of the user 50, the user is made conscious of communicating with the character, which encourages dialogue.

[0029] That is, because of the premise that all humans have emotions and that at least their fundamental part is common to everyone, people can predict the other party's reactions to some extent when communicating. Without this, one would not know how to approach the other party and could not communicate smoothly. In this embodiment, communication is made smoother by having the agent express emotions in the same way as a human. As a means of expressing emotion, the animated character AC is given facial expressions and gestures, as described later. Emotions can normally be expressed by voice alone or by text conversation alone, but in this embodiment they are expressed visually by the animated character AC, which is more effective.

[0030] Each human also has a consistent personality, and this consistency can be said to be a prerequisite for accumulating information about the other party. The more information one has about the other party, the more efficiently one can communicate. In this embodiment, therefore, the agent's emotional expression is also made consistent so as to express a certain personality, for example short-tempered or patient.
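
A consistent personality could, for instance, be modeled as a fixed tolerance that determines how the character reacts to repeated failures to understand. This is a speculative sketch: the tolerance values, the expression labels, and the function `expression_for` are assumptions and do not appear in the specification.

```python
# Hypothetical personalities: failures tolerated before showing irritation.
PERSONALITIES = {"patient": 3, "short-tempered": 1}


def expression_for(personality, consecutive_failures):
    """Choose the character's expression consistently with its personality."""
    tolerance = PERSONALITIES[personality]
    if consecutive_failures == 0:
        return "smiling"
    if consecutive_failures < tolerance:
        return "puzzled"
    return "irritated"
```

Because the tolerance is fixed per personality, the same situation always produces the same reaction, which is what lets the user learn what to expect from the character.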

[0031] As described above, in this embodiment, giving the agent of the control device 60 the above functions allows it to behave like an expert in the electronic device who has the will to help the user 50.

[0032] In the control device 60 of this embodiment, the basic flow of the interface, namely conveying an intention in words and having the operation the user 50 wants performed on the electronic device 70, can be understood by any user with simple instruction. Because everyday phrasing with few restrictions on vocabulary can be used, it is easy to use and very convenient even for people who cannot operate the device themselves or who are using this interface for the first time. In addition, because the visual character on the screen (the animated character AC) makes movements that express human-like emotions, the user forms a favorable impression of the character and feels an affinity for the device, which also reduces psychological resistance to communicating with a machine.

[0033] That is, in the device of this embodiment, an agent that operates the device on the user's behalf is introduced into the human interface so that electronic devices with complicated operation are easy to use even for users who are weak with machines. The agent is modeled on an expert in device operation and, in addition to knowledge of device operation, has the will to understand the user's intention and carry it out, which facilitates communication between the user and the device. Users who could not get used to operating electronic devices therefore find the devices easier to use.

[0034] A home VTR, as an example of the electronic device 70, can be said to be a multifunctional device that is difficult to operate. The recording reservation operation in particular is complicated, and many users cannot manage it. A recording reservation cannot be processed unless the user specifies a large amount of information to identify the program to be recorded, and conventional button-based methods make the operation complicated. As control devices effective for controlling electronic equipment such as home VTRs, the present applicant has previously proposed the control devices for electronic equipment described in the specifications and drawings of Japanese Patent Application Nos. Hei 3-13758, Hei 3-13759, Hei 3-13760, and Hei 3-13761.

[0035] A control device as one specific example of this embodiment is described below, taking the interface of the home VTR as an example. Of course, the control device of the present invention is not limited to the above embodiment or the following specific example, and can be applied to various other electronic devices such as television receivers, tape recorders, and document creation devices.

[0036] In this specific example, the control device 60, which serves as the interface between the VTR and the user, incorporates an agent that has knowledge of VTR operations, such as the recording reservation described above, together with the ability to actually carry them out. The agent is visualized on the television screen as an animation character AC. The user 50 converses with this animation character (agent) AC in spoken natural language.

[0037] That is, the specific device of this embodiment shown in FIG. 3 is provided with a handset 10, shaped like an ordinary telephone handset, as voice input means for entering by voice the commands for designating and controlling the operation mode and the various kinds of information. The transmitter section 11 of the handset 10 contains an acoustic-to-electric transducer that converts input speech into an electrical signal and outputs it. Near the transmitter section 11 is a switch (press-to-talk switch) 12 for switching the voice input state, for example to mark breaks in the input. The output signal from the transmitter section 11 is sent to a speech recognition circuit 13, where signal processing recognizes the commands and information. The output signal from the switch 12 is sent to a switch state detection section 14, which detects its on/off state. The speech recognition circuit 13 and the switch state detection section 14 are connected to a control circuit 15. Also connected to the control circuit 15 are an animation character generation circuit 16, which outputs a video signal of the animation character that acts as the speaker of messages; a speech synthesis circuit 19, which synthesizes the audio signal of a message in response to a message signal input, for example by so-called rule-based speech synthesis; and a VTR controller 18 for designating and controlling the operation mode of the VTR 40. The video signal from the character generation circuit 16 is sent to a superimposer 17, superimposed on the video signal from the VTR 40, and sent to a cathode ray tube (CRT) display device 30 serving as image display means. The audio signal from the speech synthesis circuit 19 is sent to a speaker 20 serving as audio output means and output as sound. The speaker 20 and the CRT display device 30 may be combined, for example, as a television receiver.

[0038] The control circuit 15, which comprises a CPU such as a microprocessor, outputs three signals in response to at least the output signal from the speech recognition circuit 13: an operation mode designation control signal that designates and controls the operation mode of the VTR 40, a motion control signal that controls the motion of the animation character AC of the animation character generation circuit 16, and a message signal that specifies the message speech to be synthesized by the speech synthesis circuit 19. In response to a command for operation mode designation recognized by the speech recognition circuit 13, the control circuit 15 selects, from among a plurality of control commands, the one that is appropriate for the current operating state of the VTR 40 and outputs it. The details of this selection process and of the operation modes of the VTR 40 are described later. The control circuit 15 also outputs a message signal specifying the most suitable response message according to the content of the voice input (and also according to the current state), and sends this message signal to the animation character generation circuit 16 and the speech synthesis circuit 19.

[0039] In FIG. 3, the transmitter section 11 of the handset 10 converts speech uttered by a person (for example, the operator of the VTR 40, that is, the user) into an electrical audio signal. The speech uttered by the operator includes spoken command words for directly controlling the operation modes of the VTR 40, such as "play", "stop", "record", "pause", "slow", "fast forward", and "rewind". It also includes spoken words for the various kinds of information, for example, for a recording reservation, the day of the week ("Sunday" to "Saturday", "every day", "every week", and so on), the channel ("channel 1" to "channel 12", and so on), and the start and end times ("a.m.", "p.m.", "0 o'clock" to "12 o'clock", and so on). In addition to these words, this specific example also accepts more conversational words such as "hey", "that's enough", and "no".

[0040] The press-to-talk switch 12 on the handset 10, which designates the voice input state, is turned on and off by the operator to mark breaks in the words the operator speaks. That is, the press-to-talk switch 12 is provided to delimit the units that the speech recognition circuit 13 processes when the input speech signal is a sentence made up of a sequence of discrete words. The output of the switch 12 is sent to the switch state detection section 14 provided alongside the speech recognition circuit 13. The switch state detection section 14 generates a state indication signal that indicates the current on/off state according to the output signal of the switch 12. For example, the state indication signal is state 0 while the switch 12 is not operated (off) and state 1 while it is operated (on). To have the speech recognition circuit 13 perform recognition, the operator therefore turns the switch 12 on and turns it off when the voice input is finished; the speech recognition circuit 13 then performs recognition on the processing unit delimited by the corresponding state indication signal. Consequently, when continuous words are entered by voice, the off state (state 0) marks the end of the input, and the speech recognition circuit 13 does not have to analyze whether the input has ended. In other words, the speech recognition circuit 13 knows the exact start and end timing, and software can easily determine the range to be recognized, so no needless recognition processing is performed on noise outside that range. Furthermore, because the handset 10 itself is not switched, no noise is introduced when the voice input is switched (when the audio is cut off).
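
The delimiting role of the state indication signal described in paragraph [0040] can be sketched in software as follows. This is only an illustrative sketch: the function name, the frame representation (one number per frame), and the choice to drop an utterance whose switch is never released are assumptions, not details given in the specification.

```python
from typing import Iterable, List, Tuple


def segment_by_press_to_talk(samples: Iterable[Tuple[int, float]]) -> List[List[float]]:
    """Group audio frames into recognition units using the switch state.

    Each item is (state, frame): state 1 means the press-to-talk switch is on,
    state 0 means it is off. Frames seen while the switch is off are ignored,
    so noise outside the pressed range never reaches the recognizer.
    """
    units: List[List[float]] = []
    current: List[float] = []
    prev_state = 0
    for state, frame in samples:
        if state == 1:
            current.append(frame)
        elif prev_state == 1:
            # 1 -> 0 transition: the operator released the switch,
            # which marks the end of one recognition unit.
            units.append(current)
            current = []
        prev_state = state
    # Assumption: a unit still open when the stream ends is not emitted.
    return units
```

Because the end of each unit is signaled by the switch, the sketch needs no end-of-speech detection of its own, which is the point the paragraph makes.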

[0041] If the audio output signal from the speech synthesis circuit 19 is sent to the receiver section of the handset 10, so that voice input and output both go through the handset 10 shaped like a telephone handset, the reluctance and awkwardness of talking to a piece of electronic equipment can be reduced, and malfunctions in noisy environments can also be prevented. In addition, if the response speech is sent only to the receiver section of the handset 10 and not to the speaker 20 of a television receiver, stereo set, or the like, the response speech is kept from other people watching together, so recording reservations, VTR operations, and the like can be carried out without disturbing them.

[0042] FIG. 4 is a flowchart of the main operation of the device of this specific example. In FIG. 4, in an initial state such as just after the power is turned on, the operator (user) calls the animation character in step S1 by voice input or some other input operation. The animation character AC is then displayed on the screen SC of the CRT display device 30, for example as shown in FIG. 5, and the device waits for voice input from the operator. That is, as shown in FIG. 5, the animation character AC and a speech bubble SP, which shows the character's utterance as text, are superimposed on the screen SC with the video signal from the VTR 40 as the background image BG. The message in the speech bubble SP (for example "What would you like to do?", indicating the standby state) is simultaneously synthesized with the same content by the speech synthesis circuit 19 and spoken from the speaker 20. As a more specific example of the call in step S1, when the operator speaks the name of the animation character AC (say, Ivy) or a call such as "hey", or turns on the power of the VTR 40, the animation character AC first appears with the message "Yes, this is Ivy" and the display then moves to the state of FIG. 5, which gives the dialogue a more natural feel.

[0043] In the next step S2, the operator (user) enters by voice a command for designating the operation mode or various information. If the command is for operating the tape transport of the VTR 40, the process goes to step S3 and enters the phase in which the operation mode of the VTR 40 is controlled directly. If it is a command for a recording reservation, the process goes to step S4 and enters the recording reservation phase of the VTR 40. If it is a command for confirming reservations, the process goes to step S5 and enters the phase for confirming the reserved contents. In the confirmation phase, step S5 is followed by step S6, where reservations are changed or canceled. In the recording reservation and reservation confirmation phases, a deeper level of conversation takes place with the animation character AC, as described later. When the processing of step S3, S4, or S5 (S6) ends, the process returns to the voice input standby state of step S2.
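
The routing of FIG. 4 described in paragraph [0043] can be summarized as a small lookup table. The sketch below is only an illustration of the step order; the enum, its member names, and the function name are hypothetical and do not appear in the specification.

```python
from enum import Enum
from typing import List


class CommandKind(Enum):
    TRANSPORT = "transport"  # tape transport operation (play, stop, ...)
    RESERVE = "reserve"      # recording reservation
    CONFIRM = "confirm"      # reservation confirmation


# Steps visited after the voice input of step S2, per command kind.
ROUTES = {
    CommandKind.TRANSPORT: ["S3"],
    CommandKind.RESERVE: ["S4"],
    CommandKind.CONFIRM: ["S5", "S6"],  # S6: change or cancel after confirming
}


def steps_for(kind: CommandKind) -> List[str]:
    """Return the step sequence from one voice input back to standby in S2."""
    return ["S2"] + ROUTES[kind] + ["S2"]
```

For example, `steps_for(CommandKind.CONFIRM)` yields the path S2, S5, S6, S2, matching the confirmation phase in which step S6 follows step S5.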

[0044] In the example of FIG. 5, the animation character AC is displayed, for example, at the lower left of the screen SC of the CRT display device 30. If the animation character AC were displayed in the center of the screen SC, it would visually obstruct the picture on the screen (background image BG), whereas at the lower left it is unlikely to get in the way. Moreover, at the bottom of the screen SC the character's feet do not appear to float in the air, which gives a sense of stability, and the lower left position feels least obtrusive to human perception. The speech bubble SP for displaying the message as text on the screen SC is placed at the bottom of the screen SC if the text is written horizontally from left to right, and at the right side of the screen SC if the text is written vertically from top to bottom. This keeps it from obstructing the picture (background image BG) on the screen SC.

[0045] The motion of the animation character AC is controlled by an animation motion control signal sent from the control circuit 15 to the animation character generation circuit 16. On receiving the motion control signal from the control circuit 15, the animation character generation circuit 16 outputs a moving-picture video signal in which the animation character AC, the speaker that converses with the operator (for example as shown in FIG. 5), moves its mouth in time with the audio output of the message, and supplies it to the superimposer 17. This animation signal also includes the display signal of the speech bubble SP that shows the message as text. The superimposer 17 converts it into a video signal in which the image of the animation character AC is laid over the playback picture from the VTR 40 (or a television broadcast picture), that is, superimposes it, and sends the result to the CRT display device 30. The animation character AC shown on the screen SC of the CRT display device 30 is an anthropomorphic character with facial expressions, designed to be friendly. This lets the operator (user) feel that he or she is conversing with the animation character AC (or with the electronic equipment).

[0046] The speech recognition circuit 13 in FIG. 3 processes the supplied audio signal to recognize the commands and information. Various configurations are conceivable; the one shown in FIG. 6 is described here as an example. In FIG. 6, the output signal of the transmitter section 11 of the handset 10 (the input speech signal) is supplied to an input terminal 21 and sent through an analog interface 22 to an arithmetic processor 23. The analog interface 22 adjusts the input speech level to a predetermined value according to control data supplied from a system controller 24, converts the signal into a serial digital speech signal, and sends it to the arithmetic processor 23. The arithmetic processor 23 forms a speech pattern from the digital speech signal, for example by frequency analysis, and corrects the temporal distortion of the speech pattern caused by variations in speaking rate and the like (time-axis normalization). It then compares the time-normalized speech pattern with a plurality of standard patterns stored in advance in a standard pattern registration memory 25, that is, it performs pattern matching. Pattern matching here means calculating the distance between the detected speech pattern and each standard pattern and finding the standard pattern at the shortest distance; the result is sent through an interface 26 to the control circuit 15, which comprises the CPU and so on. The standard pattern registration memory 25 holds a plurality of speech patterns registered in advance, for command words such as "play" and "stop" and for information words such as "Sunday" and "channel 1". The arithmetic processor 23 recognizes a word by determining which of these registered patterns the speech pattern of the input speech signal matches (which it is closest to). Patterns may be registered in the standard pattern registration memory 25 by the manufacturer, who stores a number of standard speech patterns in advance, or by the user: before first using the product, the operator speaks the words one after another, and the arithmetic processor 23 forms the speech patterns, for example by frequency analysis, and stores them.
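
The matching procedure of paragraph [0046] (time-axis normalization followed by a search for the standard pattern at the shortest distance) might be sketched as below. The specification does not say how the normalization or the distance is computed, so linear resampling to a fixed length and Euclidean distance are assumptions chosen for brevity, as are the function names and the one-number-per-frame feature representation.

```python
import math
from typing import Dict, List, Sequence


def time_normalize(pattern: Sequence[float], length: int) -> List[float]:
    """Linearly resample a feature sequence to a fixed number of points,
    so that fast and slow utterances of a word become comparable."""
    if len(pattern) == 1 or length == 1:
        return [pattern[0]] * length
    out = []
    for i in range(length):
        pos = i * (len(pattern) - 1) / (length - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(pattern) - 1)
        frac = pos - lo
        out.append(pattern[lo] * (1 - frac) + pattern[hi] * frac)
    return out


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length patterns (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(pattern: Sequence[float],
              standard_patterns: Dict[str, Sequence[float]],
              length: int = 8) -> str:
    """Return the word whose standard pattern is nearest to the input."""
    normalized = time_normalize(pattern, length)
    return min(
        standard_patterns,
        key=lambda word: distance(
            normalized, time_normalize(standard_patterns[word], length)
        ),
    )
```

In this sketch the dictionary `standard_patterns` plays the role of the standard pattern registration memory 25; registering a user's own voice would amount to storing that user's patterns under the same word keys.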

[0047] The output signal of the speech recognition circuit 13 is sent to the control circuit 15, where a predetermined software program performs natural language input processing and inference and dialogue processing. As shown in FIG. 7, in response to the user's voice input 31, a natural language input processing production system 32 forms semantic frames 33, for example as many as there are programs to be reserved, and these semantic frames 33 are sent to an inference and dialogue production system 34, which sets and controls a recording reservation scheduler 35. The natural language input processing production system 32 can be subdivided into a sentence normalization production system PS1, a segmentation production system PS2, a word extraction production system PS3, and a semantic understanding production system PS4. The inference and dialogue production system 34 can be subdivided into an interpolation inference production system PS5, a habit learning production system PS6, and a dialogue processing production system PS7. Each semantic frame 33 has slots for a plurality of items; for recording reservation processing, for example, a day-of-week item, a channel item, a start time item, and a recording duration or end time item. The information entered by voice during reservation processing is written into the corresponding slots.
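
The semantic frame 33 of paragraph [0047] is essentially a record with one slot per reservation item. A minimal sketch is given below; the class name, the field names, and the value types are hypothetical, and the specification does not define a concrete data layout.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ReservationFrame:
    """One semantic frame for one program to be reserved.

    A slot that has not yet been filled by voice input holds None.
    """
    day: Optional[str] = None              # day-of-week item
    channel: Optional[int] = None          # channel item
    start: Optional[str] = None            # start time item
    end_or_duration: Optional[str] = None  # end time or recording duration item

    def fill(self, **items) -> None:
        """Write recognized items into their slots, in any order."""
        known = {f.name for f in fields(self)}
        for name, value in items.items():
            if name not in known:
                raise KeyError(f"no such slot: {name}")
            setattr(self, name, value)
```

Calling `fill(channel=6, day="Wednesday")` leaves the two time slots empty, which is the situation the later missing-information check has to detect.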

[0048] FIG. 8 is a flowchart that explains the processing of each step of the flowchart of FIG. 4 in more detail, taking recording reservation and operation of the tape transport of the VTR 40 as examples. In FIG. 8, step S11 is the operation of calling the animation character AC in step S1 of FIG. 4. As explained for step S1 of FIG. 4, the call in step S11 is made by the operator's spoken call command ("Ivy", "hey", and so on), by turning on the power of the VTR 40, and so on. When the call is entered in step S11, the process goes to step S12, and the animation character AC is shown on the screen SC of the CRT display device 30 as in FIG. 5 (the animation character AC is activated).

[0049] Next, in step S13, it is determined whether the press-to-talk switch 12 is on or off. If it is off (No), the process goes to step S14; if it is on (Yes), the process goes to step S15. When the operator's command is entered by voice in step S15, step S16 first determines whether the command is a recording reservation command. If it is (Yes), recording reservation processing is performed in step S17, and the process returns to step S15. If it is not (No), the process goes to step S18, which determines whether the command is a display command for showing the recording reservations already made. If Yes, the reservations are displayed on the screen SC of the CRT display device 30 in step S19 and the process returns to step S15; if No, the process goes to step S20. Step S20 determines whether the operator's command is a command to change a recording reservation. If Yes, the reservation is changed in step S21 and the process returns to step S15; if No, the process goes to step S22. Step S22 determines whether the command is a command to cancel a recording reservation. If Yes, the reservation is canceled in step S23 and the process returns to step S15; if No, the process goes to step S24. Step S24 determines whether the command is an operation command for the animation character AC. If Yes, the animation character AC is operated in step S25 and the process returns to step S15. Examples of such operations, described later, include removing the animation character AC from the screen and silencing it (turning off the voice). If No in step S24, the process goes to step S26. Step S26 determines whether the command is an operation command for the VTR 40. If Yes, the VTR 40 is operated and the process returns to step S15; if No, the processing ends.
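
The chain of determinations in steps S16 to S26 of FIG. 8 is an ordered series of tests in which the first match wins. The sketch below illustrates that order only; the command kind strings and the function name are hypothetical, and because the specification gives no step number for the VTR operation processing that follows step S26, it is labeled descriptively.

```python
from typing import List, Optional, Tuple

# (determination step, processing performed on Yes, command kind accepted)
CHAIN = [
    ("S16", "S17", "reserve"),        # recording reservation
    ("S18", "S19", "show"),           # display existing reservations
    ("S20", "S21", "change"),         # change a reservation
    ("S22", "S23", "cancel"),         # cancel a reservation
    ("S24", "S25", "character"),      # operate the animation character AC
    ("S26", "VTR operation", "vtr"),  # operate the VTR 40
]


def dispatch(kind: str) -> Tuple[List[str], Optional[str]]:
    """Walk the determinations in order.

    Returns the determination steps visited and the processing selected,
    or None as the processing when every test answers No (processing ends).
    """
    visited: List[str] = []
    for test_step, processing, accepted in CHAIN:
        visited.append(test_step)
        if kind == accepted:
            return visited, processing
    return visited, None
```

After any selected processing the flow returns to step S15 for the next voice command; only an input that fails all six tests ends the processing.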

[0050] When the determination in step S13 is No and the process goes to step S14, if no voice command is entered by the operator (user) within a certain time, the animation character AC is shown, for example, looking bored. Specifically, the animation character AC first yawns, then leans against the edge of the screen or scratches its head, and, if there is still no voice command, lies down (goes to sleep on its side). This motion processing is hereafter called idle animation processing.
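
The escalation of the idle animation in paragraph [0050] can be sketched as a function of the time elapsed without voice input. The specification only says "a certain time", so the 10-second interval below, like the stage names and the function name, is an assumption made purely for illustration.

```python
def idle_animation(seconds_idle: float, interval: float = 10.0) -> str:
    """Pick the idle animation stage for the time elapsed without input.

    The stage advances once per interval and stays at the last stage
    (lying down) for as long as no voice command arrives.
    """
    stages = ["wait", "yawn", "lean_or_scratch", "lie_down"]
    elapsed_intervals = int(max(seconds_idle, 0.0) // interval)
    return stages[min(elapsed_intervals, len(stages) - 1)]
```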

[0051] FIG. 9 is a flowchart that explains the reservation processing in step S17 of FIG. 8 in detail. In FIG. 9, step S50 displays a reservation input request indicating that the device is waiting for recording reservation information. For example, as shown in FIG. 10, the animation character AC appears on the screen SC of the CRT display device 30 and the text message in the speech bubble SP reads "Please make your reservation." At the same time, the synthesized voice says "Please make your reservation" and the animation character AC is animated, for example with its mouth moving. In the example of FIG. 10, the word "Reservation", indicating the current phase, is shown in a text display window PH, for example at the upper right of the screen SC.

[0052] With the display (and audio output) of FIG. 10 in place, in the next step S51 of FIG. 9 the various items of element information for a recording reservation on the VTR 40, such as the day of the week, the channel, and the start and end times, are entered by voice. In step S51, several items of element information can be entered at once and in any order. For this purpose, step S52 performs the input processing of the natural language input processing production system 32 shown in FIG. 7: after sentence normalization, segmentation, word extraction, and semantic understanding, the items of element information are sorted and written into the corresponding slots of the semantic frame 33, such as the day-of-week item, the channel item, the start time item, and the recording duration or end time item. After the element information input processing of step S52, step S53 determines whether any element information is missing.

[0053] Element information is missing when the items of the semantic frame described above are not all filled in. For example, if the semantic frame for a recording reservation has four items, namely a day-of-week item, a channel item, a start time item, and an end time (or recording duration) item, a recording reservation cannot be made correctly if even one of the four is missing. Therefore, if step S53 determines that element information is missing (Yes), a question about the missing element information is asked in step S66 and the process returns to step S51. The processing for filling in missing information is described later with reference to FIG. 13.
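
The loop through steps S51, S52, S53, and S66 described in paragraphs [0052] and [0053] can be sketched as follows. The slot names, the dictionary representation of the frame, and the rule of asking about the first missing item are assumptions; the specification defers the actual complementing procedure to FIG. 13.

```python
from typing import Dict, Iterable, List, Optional, Tuple

REQUIRED = ("day", "channel", "start", "end_or_duration")


def missing_items(frame: Dict[str, object]) -> List[str]:
    """Step S53: list the required slots that are still empty."""
    return [name for name in REQUIRED if frame.get(name) is None]


def collect_reservation(
    utterances: Iterable[Dict[str, object]],
) -> Tuple[Optional[Dict[str, object]], List[str]]:
    """Fill one frame from successive voice inputs.

    Each utterance is the set of items already extracted from one voice
    input (steps S51 and S52). Returns the completed frame, or None if the
    inputs run out first, together with the items asked about in step S66.
    """
    frame: Dict[str, object] = {}
    asked: List[str] = []
    for items in utterances:
        frame.update(items)             # S52: write items into their slots
        missing = missing_items(frame)  # S53: anything still missing?
        if not missing:
            return frame, asked         # proceed to confirmation (S54)
        asked.append(missing[0])        # S66: ask about a missing item
    return None, asked
```

With the inputs "channel 6, Wednesday", then "3 a.m.", then "4 a.m.", the sketch asks first for the start time and then for the end time before the frame is complete.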

【００５４】図９のステップＳ５３で不足が無い（Ｎｏ
）と判断された場合には、ステップＳ５４に進んで入力
された各要素情報の確認がなされる。このステップＳ５
４の要素情報の確認の際には、例えばＣＲＴ表示装置３
０の画面ＳＣ上に図１１のような表示がなされる。この
図１１の画面ＳＣ上での表示としては、上記図１０の場
合と同様に、アニメーションキャラクタＡＣ及び吹き出
し部ＳＰ内の文字メッセージによる「これでいいですか
？」との表示がなされ、合成音声により同じく「これで
いいですか」との発音がなされると共にアニメーション
キャラクタＡＣの口が動く表示がなされる。また画面Ｓ
Ｃの文字表示窓部ＰＨに現在のフェーズを示す「予約」
との文字が表示される。さらに、この図１１の確認の表
示においては、画面ＳＣの中央部分に録画予約内容表示
窓部ＰＲが設けられ、この表示窓部ＰＲ内に、例えば録
画予約の曜日データ「１１月７日水曜日」、開始及び終
了時刻データ「午前３時−午前４時」、及びチャンネル
データ「６チャンネル」のように、上記意味フレーム内
の各項目のデータを表示している。[0054] If it is determined in step S53 of FIG. 9 that nothing is missing (No), the process advances to step S54, where the input element information is confirmed. For the confirmation in step S54, a display such as that of FIG. 11 is presented on the screen SC of the CRT display device 30, for example. As in the case of FIG. 10, the animation character AC and a text message in the speech balloon SP display "Is this okay?", the synthesized voice likewise pronounces "Is this okay?", and the mouth of the animation character AC is shown moving. The word "Reservation", indicating the current phase, is displayed in the character display window PH of the screen SC. Furthermore, in the confirmation display of FIG. 11, a recording reservation content display window PR is provided in the center of the screen SC, and the data of each item in the meaning frame is displayed in this window PR, for example the day data "Wednesday, November 7", the start and end time data "3:00 a.m. to 4:00 a.m.", and the channel data "channel 6".

【００５５】このような図１１の表示が行われている状
態で、図９の次のステップＳ５５では上記ステップＳ５
４での予約確認のための音声入力がなされる。例えば、
「イエス」，「戻れ」，「ノー」，「変更」のような命
令あるいは上記要素情報等の音声入力がなされる。この
ときにも上述の図７による自然言語入力処理が行われる
ことは勿論である。ステップＳ５６では、上記ステップ
Ｓ５５における音声入力が「イエス」か否かの判断がな
され、Ｙｅｓ（「イエス」の入力音声）と判断された場
合は、ステップＳ６７に進み録画予約の重複がないかの
チェックされる。またステップＳ５６でＮｏ（「イエス
」以外）と判断された場合は、ステップＳ５７に進む。当該ステップＳ５７では上記ステップＳ５５の音声入力
が「戻れ」か否かの判断がなされ、この判断がＹｅｓの
場合は更にステップＳ６８で要素情報入力の不足情報の
質問をしたかどうかの判断がなされる。このステップＳ
６８でＹｅｓの場合はステップＳ６６に戻り、Ｎｏの場
合はステップＳ５０に戻る。一方当該ステップＳ５７の
判断がＮｏとなった場合は、ステップＳ５８に進む。当
該ステップＳ５８では上記ステップＳ５５での音声入力
が要素情報であるか否かの判断がなされる。上記音声入
力が要素情報である場合（Ｙｅｓ）は、ステップＳ５２
に戻り、要素情報でない別の音声入力である場合（Ｎｏ
）は、ステップＳ５９に進む。ステップＳ５９ではステ
ップＳ５５の音声入力が「変更」であるか否かの判断が
なされる。Ｎｏの場合はステップＳ６０に進む。このステップＳ６０では、録画予約の各要素情報の変更
か或いは取消かの選択を行う。このためステップＳ６１
で変更／取消のいずれかの入力を行い、ステップＳ６２
に進む。ステップＳ６２では、再び音声入力が「変更」
であるか否かの判断を行い、Ｎｏの場合はステップＳ６
９で録画予約を中止する。また、ステップＳ６２でＹｅ
ｓの場合は、ステップＳ６３に進み、変更内容の質問を
行い、ステップＳ６５で再び要素情報の入力を行った後
、ステップＳ５２に戻る。[0055] With the display of FIG. 11 presented, in the next step S55 of FIG. 9 a voice input is made to confirm the reservation shown in step S54. For example, a command such as "yes", "go back", "no", or "change", or element information as described above, is input by voice. Naturally, the natural language input processing of FIG. 7 is performed at this time as well. In step S56, it is determined whether the voice input of step S55 is "yes". If it is (Yes), the process advances to step S67, where a check is made for overlapping recording reservations. If the determination in step S56 is No (anything other than "yes"), the process advances to step S57. In step S57, it is determined whether the voice input of step S55 is "go back". If Yes, it is further determined in step S68 whether a question about missing element information has been asked; if Yes in step S68 the process returns to step S66, and if No it returns to step S50. If the determination in step S57 is No, the process advances to step S58, where it is determined whether the voice input of step S55 is element information. If it is (Yes), the process returns to step S52; if it is some other voice input (No), the process advances to step S59. In step S59, it is determined whether the voice input of step S55 is "change". If No, the process advances to step S60. In step S60, a choice is presented between changing the element information of the recording reservation and canceling the reservation. Either "change" or "cancel" is then input in step S61, and the process advances to step S62. In step S62, it is again determined whether the voice input is "change"; if No, the recording reservation is aborted in step S69. If Yes in step S62, the process advances to step S63, where a question about the content of the change is asked, and after element information is input again in step S65, the process returns to step S52.
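The chain of decisions in steps S56 to S58 amounts to a dispatch on the recognized utterance. The sketch below is illustrative only: the function name `dispatch_confirmation` and its parameters are hypothetical, recognition and element-information detection are assumed to have happened already, and inputs not handled by S56 to S58 are simply handed on to step S59.

```python
def dispatch_confirmation(utterance: str,
                          is_element_info: bool = False,
                          asked_missing_question: bool = False) -> str:
    """Return the label of the next step of FIG. 9 for a confirmation-phase input."""
    if utterance == "yes":                      # S56: Yes
        return "S67"                            # overlap check
    if utterance == "go back":                  # S57: Yes, then S68
        return "S66" if asked_missing_question else "S50"
    if is_element_info:                         # S58: Yes
        return "S52"                            # process the new element information
    return "S59"                                # remaining inputs: change / cancel handling
```

The point of the design is that the confirmation prompt accepts more than "yes" or "no": element information spoken at this point goes straight back to the input processing of step S52.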

【００５６】図１２は、上記図９のステップＳ５１の要
素情報入力及びステップＳ５２の要素情報入力処理の詳
細を示すフローチャートである。この図１２では、自然
言語の「戻れ」に複数の意味があり、現在の状態に応じ
て異なる制御が必要となる点に着目した動作の流れを示
している。すなわちこの図１２において、要素情報の入
力があると、ステップＳ７１で該入力要素情報の意味解
析が行われた後、ステップＳ７２に進む。ステップＳ７
２では、入力された音声が「戻れ」であるか否かの判断
がなされ、Ｙｅｓの場合はステップＳ７３に進み、当該
ステップＳ７３で直前の音声入力が「戻れ」であったか
否かの判断が行われる。このステップＳ７３でＹｅｓな
らばステップＳ７５に進み予約中止の処理がなされた後
処理を終了する。また、ステップＳ７３でＮｏと判断さ
れると、ステップＳ７４に進み、当該ステップＳ７４で
更にその直前の質問があったか否かの判断がなされる。ステップＳ７４でＮｏの場合はステップＳ７５の予約中
止に進み、Ｙｅｓの場合はステップＳ７６に進む。ステ
ップＳ７６では、上記直前の質問をして、上記図９のス
テップＳ５１の要素情報入力ステップに戻る。更に、上
記ステップＳ７２でＮｏと判断された場合は、ステップ
Ｓ７７に進む。当該ステップＳ７７ではエラーが有るか
否かの判断がなされ、Ｙｅｓの場合ステップＳ７９へ進
み、エラー項目の質問がなされた後、上記ステップＳ５
１の要素情報入力に戻る。ステップＳ７７においてＮｏ
と判断されると、ステップＳ７８に進み、上記意味フレ
ームへの書き込みが行われた後、処理を終了する。[0056] FIG. 12 is a flowchart showing details of the element information input of step S51 and the element information input processing of step S52 in FIG. 9. FIG. 12 shows the flow of operation with attention to the fact that the natural language phrase "go back" has several meanings and requires different control depending on the current state. In FIG. 12, when element information is input, the input is semantically analyzed in step S71, and the process advances to step S72. In step S72, it is determined whether the input voice is "go back". If Yes, the process advances to step S73, where it is determined whether the immediately preceding voice input was also "go back". If Yes in step S73, the process advances to step S75, the reservation is aborted, and the process ends. If No in step S73, the process advances to step S74, where it is determined whether a question was asked immediately before. If No in step S74, the process advances to the reservation abort of step S75; if Yes, it advances to step S76. In step S76, that preceding question is asked again, and the process returns to the element information input of step S51 in FIG. 9. If the determination in step S72 is No, the process advances to step S77, where it is determined whether there is an error. If Yes, the process advances to step S79, a question about the erroneous item is asked, and the process returns to the element information input of step S51. If No in step S77, the process advances to step S78, the information is written into the meaning frame, and the process ends.
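The state-dependent reading of "go back" in steps S73 to S76 can be sketched as below. This is a hedged illustration: the function name `handle_go_back`, its parameters, and the returned strings are hypothetical, and the dialogue history is reduced to two pieces of state.

```python
from typing import Optional


def handle_go_back(previous_input_was_go_back: bool,
                   last_question: Optional[str]) -> str:
    """Decide what "go back" means given the dialogue state (FIG. 12, S73 to S76)."""
    if previous_input_was_go_back:
        return "abort reservation"          # S73 Yes -> S75
    if last_question is None:
        return "abort reservation"          # S74 No -> S75
    return "re-ask: " + last_question       # S74 Yes -> S76, then back to S51
```

The same word therefore either repeats the previous question or abandons the reservation, depending only on what preceded it.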

【００５７】このような「戻れ」の他にも、「取り消し
」とか「いい」のように複数の意味がある単語が音声入
力された場合には、状況に応じて正しくその意味を理解
して処理を行えるように、すなわち言葉の多義性を扱え
るように、自然言語処理プログラムが組まれている。これは、上記図１２に示したように、現在の状態を判断
対象に組み入れることによって実現でき、これによって
人間の言語生活で普通に行われているような自然な感覚
で操作が行える。なお、上記図４のステップＳ３におけ
る上記ＶＴＲ４０の走行系を直接的に操作するフェーズ
や、ステップＳ５（及びＳ６）の予約確認のフェーズ等
においても、上記「戻れ」や「取り消し」等の音声入力
には種々の意味があり、状況に応じた適切な対応がなさ
れるわけであるが、これについては後述する。[0057] Besides "go back", the natural language processing program is built so that when a word with several meanings, such as "cancel" or "fine", is input by voice, its meaning is understood correctly according to the situation and processed accordingly; in other words, so that the ambiguity of words can be handled. As shown in FIG. 12, this is achieved by including the current state among the things that are evaluated, which allows operation with the same natural feel as everyday human conversation. In the phase of directly operating the tape transport of the VTR 40 in step S3 of FIG. 4, and in the reservation confirmation phase of step S5 (and S6), voice inputs such as "go back" and "cancel" likewise have various meanings and are handled appropriately according to the situation; this will be described later.

【００５８】次に図１３は図９での不足情報処理の詳細
を示すフローチャートである。図９では図示を簡略化す
るために各種不足情報の有無や質問をまとめてそれぞれ
１ステップで示しているが、図１３ではこれらを上記意
味フレームの４項目に応じて展開すると共に、録画予約
の習慣を学習して得られる要素情報（以下ＨＬＳデータ
という）の有無に応じたステップも付加している。すな
わちこの図１３の不足情報の補完処理において、最初の
ステップＳ３３では録画の曜日情報の項目の有り／無し
の判断がなされ、有りのときはステップＳ３６へ、無し
のときは次のステップＳ３４へ進む。ステップＳ３４で
は録画予約の曜日を何曜日にするかの質問が上記アニメ
ーションキャラクタＡＣの動画表示と音声とでなされ、
ステップＳ３５で曜日の情報の入力がなされる。ステッ
プＳ３６では録画予約のチャンネルの情報の項目が有り
／無しの判断がなされ、有りの場合はステップＳ３９へ
、無しの場合はステップＳ３７へ進む。ステップＳ３７
ではチャンネルをどのチャンネルにするかの質問がなさ
れ、ステップＳ３８でチャンネルの情報の入力が行われ
る。[0058] Next, FIG. 13 is a flowchart showing details of the missing information processing of FIG. 9. To simplify the illustration, FIG. 9 shows the checks for the various kinds of missing information, and the corresponding questions, each as a single step. FIG. 13 expands these according to the four items of the meaning frame and adds steps that depend on whether element information obtained by learning the user's recording reservation habits (hereinafter, HLS data) is available. In the missing information complementing processing of FIG. 13, it is first determined in step S33 whether the day-of-week information item for the recording is present; if present, the process advances to step S36, and if absent, to step S34. In step S34, a question asking which day of the week the recording reservation is for is presented by the animated display of the animation character AC and by voice, and the day information is input in step S35. In step S36, it is determined whether the channel information item for the recording reservation is present; if present, the process advances to step S39, and if absent, to step S37. In step S37, a question asking which channel to use is asked, and the channel information is input in step S38.

【００５９】ここで、本具体例は、オペレータがＶＴＲ
４０で録画予約する場合の予約の習慣を学習するシステ
ム（ＨＬＳ）を有している。すなわちこのＨＬＳシステ
ムにおいては、例えば曜日とチャンネル（或いは曜日の
み）の情報入力があったときに、ステップＳ３９にてこ
の曜日とチャンネルの情報が上記習慣的に行われている
録画予約のデータ（ＨＬＳデータ）と同じであるか否か
の判断を行い、ＨＬＳデータである場合には、ステップ
Ｓ４６で当該習慣的に予約される番組の少なくとも開始
時刻及び終了時刻を表示（及び音声出力）し、この習慣
的な録画予約を行うかどうかの確認を行う。このような
表示及び確認がされているとき、ステップＳ４７で「イ
エス」又は「ノー」を音声入力し、次のステップＳ４８
で上記習慣的な録画予約を行うか否かの判断を行い、Ｙ
ｅｓ（「イエス」の音声入力）の場合はステップＳ４９
へ、Ｎｏの場合はステップＳ４１へ進む。[0059] This specific example includes a system (HLS) that learns the operator's reservation habits when making recording reservations on the VTR 40. In this HLS system, when for example the day and channel (or only the day) have been input, it is determined in step S39 whether this day and channel information matches the data of a habitually made recording reservation (HLS data). If it does, at least the start time and end time of the habitually reserved program are displayed (and output by voice) in step S46, and the operator is asked to confirm whether this habitual recording reservation should be made. With this display and confirmation presented, "yes" or "no" is input by voice in step S47, and in the next step S48 it is determined whether to make the habitual recording reservation. If Yes (voice input of "yes"), the process advances to step S49; if No, it advances to step S41.

【００６０】また、ステップＳ３９で曜日とチャンネル
の情報が上記習慣的なＨＬＳデータで無いと判断された
ならば、ステップＳ４０へ進む。ステップＳ４０では録
画予約の開始時刻の情報の有り／無しの判断がなされ、
有りの場合はステップＳ４３に進み、無しの場合はステ
ップＳ４１に進む。このステップＳ４１では上記録画予
約の開始時刻を何時にするかの質問がなされ、ステップ
Ｓ４２では当該開始時刻の情報入力がなされる。この開
始時刻情報入力がなされると、ステップＳ４３では録画
予約の終了時刻か又は録画時間の情報の有り／無しの判
断がなされる。有りの場合はステップＳ４９へ進み、無
しの場合はステップＳ４４へ進む。該ステップＳ４４で
は録画予約の終了時刻を何時にするかの質問がなされ、
ステップＳ４５では終了時刻の時間情報か又は録画時間
の情報の入力がなされる。ステップＳ４９では、例えば
図１１に示すような表示（及び「これでいいですか？」
との音声出力）がなされ、上記各ステップで入力された
録画予約のための各情報が正しいかどうかの確認がなさ
れる。[0060] If it is determined in step S39 that the day and channel information does not match the habitual HLS data, the process advances to step S40. In step S40, it is determined whether start time information for the recording reservation is present; if present, the process advances to step S43, and if absent, to step S41. In step S41, a question asking what time the recording should start is asked, and the start time information is input in step S42. Once the start time has been input, it is determined in step S43 whether the end time or the recording time of the reservation is present. If present, the process advances to step S49; if absent, it advances to step S44. In step S44, a question asking what time the recording should end is asked, and either the end time or the recording time is input in step S45. In step S49, a display such as that of FIG. 11 is presented (with the voice output "Is this okay?"), and the operator confirms whether the information for the recording reservation entered in the preceding steps is correct.
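The complementing flow of FIG. 13 (steps S33 to S48) can be sketched as a function that asks only for what is missing and offers a learned habit when one matches. This is a simplified sketch under stated assumptions: the name `complete_frame`, the `ask` callback, the question wording, and the `habits` dictionary keyed by (day, channel) are all hypothetical, and the end time and recording time are merged into one `end` slot.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical HLS data: (day, channel) -> (start, end) of the habitual reservation.
Habits = Dict[Tuple[str, str], Tuple[str, str]]


def complete_frame(frame: Dict[str, Optional[str]],
                   ask: Callable[[str], str],
                   habits: Habits) -> Dict[str, Optional[str]]:
    """Fill the missing slots of a reservation frame, following FIG. 13."""
    frame = dict(frame)                                   # do not mutate the caller's frame
    if not frame.get("day"):                              # S33 -> S34, S35
        frame["day"] = ask("Which day of the week?")
    if not frame.get("channel"):                          # S36 -> S37, S38
        frame["channel"] = ask("Which channel?")
    habit = habits.get((frame["day"], frame["channel"]))  # S39
    if habit is not None:
        start, end = habit
        if ask(f"Record {start} to {end} as usual?") == "yes":   # S46 to S48
            frame["start"], frame["end"] = start, end
            return frame                                  # on to confirmation (S49)
        frame["start"] = ask("What time should it start?")      # S48 No -> S41, S42
    elif not frame.get("start"):                          # S40 -> S41, S42
        frame["start"] = ask("What time should it start?")
    if not frame.get("end"):                              # S43 -> S44, S45
        frame["end"] = ask("What time should it end?")
    return frame
```

In use, `ask` would present the question through the animation character and return the recognized answer; here it is just a callable so the flow can be exercised with scripted answers.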

【００６１】次に図１４は、図９のステップＳ６７の重
複チェック処理の詳細を説明するためのフローチャート
である。この図１４において、ステップＳ８１では上述
した予約内容の確認の表示（上記図９のステップＳ５４
に相当）がなされ、ステップＳ８２では重複確認処理を
開始するため音声入力により「イエス」を入力（図９の
ステップＳ５５及びＳ５６に相当）している。従って、
ステップＳ８３以降が上記図９のステップＳ６７の重複
チェック処理に相当することになる。図１４のステップ
Ｓ８３では、既に予約された録画予約情報の内に時間が
重複する予約情報があるか否かの判断がなされ、Ｎｏの
場合はステップＳ９２で録画予約を完了させて処理を終
了し、Ｙｅｓの場合はステップＳ８４に進む。当該ステ
ップＳ８４では重複している録画予約の表示と、この重
複している録画予約内容の変更／取消の選択を行う。ス
テップＳ８５では音声入力により、「変更」／「取消」
の何れかの入力を行う。ステップＳ８６では、上記音声
入力が「変更」であるか否かの判断がなされ、Ｎｏの場
合はステップＳ９３で録画予約を中止して処理を終了し
、Ｙｅｓの場合はステップＳ８７で変更内容の質問をす
る。このステップＳ８７の次に、ステップＳ８８で要素
情報の音声入力を行う。このステップＳ８８で要素情報
が入力されたならば、ステップＳ８９で意味解析処理が
なされる。この時ステップＳ９０では、音声入力の意味
解析のエラーの有り／無しの判断を行い、有りの場合は
ステップＳ９４でエラー項目を質問してステップＳ８８
に戻る。また、ステップＳ９０で無しと判断された場合
は、ステップＳ９１に進み、上記意味フレームの対応情
報項目へのスロット書き込みを行った後、上記ステップ
Ｓ８１に戻る。[0061] Next, FIG. 14 is a flowchart for explaining details of the overlap check processing of step S67 in FIG. 9. In FIG. 14, the confirmation display of the reservation contents described above is presented in step S81 (corresponding to step S54 of FIG. 9), and in step S82 "yes" is input by voice to start the overlap check (corresponding to steps S55 and S56 of FIG. 9). Step S83 and the subsequent steps therefore correspond to the overlap check processing of step S67 in FIG. 9. In step S83 of FIG. 14, it is determined whether any of the already registered recording reservations overlaps in time. If No, the recording reservation is completed in step S92 and the process ends; if Yes, the process advances to step S84. In step S84, the overlapping recording reservation is displayed, and a choice is presented between changing and canceling the overlapping reservation. In step S85, either "change" or "cancel" is input by voice. In step S86, it is determined whether the voice input is "change". If No, the recording reservation is aborted in step S93 and the process ends; if Yes, a question about the content of the change is asked in step S87. Following step S87, element information is input by voice in step S88, and semantic analysis is performed on it in step S89. In step S90, it is determined whether the semantic analysis of the voice input produced an error. If there is an error, a question about the erroneous item is asked in step S94 and the process returns to step S88. If there is no error in step S90, the process advances to step S91, the corresponding information item slot of the meaning frame is written, and the process returns to step S81.
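The time-overlap test of step S83 can be illustrated with a standard interval comparison. This is a sketch, not the patent's method: the names `overlaps` and `conflicting` are hypothetical, reservations are modeled as (start, end) `datetime` pairs, the channel is ignored on the assumption that one VTR records one program at a time, and back-to-back reservations are treated as not overlapping.

```python
from datetime import datetime
from typing import List, Tuple

Reservation = Tuple[datetime, datetime]   # (start, end)


def overlaps(a: Reservation, b: Reservation) -> bool:
    """True if the two half-open intervals [start, end) share any time."""
    return a[0] < b[1] and b[0] < a[1]


def conflicting(new: Reservation, existing: List[Reservation]) -> List[Reservation]:
    """Return the already registered reservations that overlap the new one (S83)."""
    return [r for r in existing if overlaps(new, r)]
```

An empty result corresponds to the No branch of step S83 (the reservation is completed in step S92); a non-empty result corresponds to the Yes branch, where the conflicting reservations are shown in step S84.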

【００６２】ここで、上記ステップＳ９０、Ｓ９４での
エラー処理や、上記図１２のステップＳ７７、Ｓ７９で
のエラー処理等は、原則として音声入力がなされる毎に
行われるものであり、大別すると、意味解析ルーチンで
アンマッチが起きた（ルールが無い）場合の文法エラー
と、要素情報の指定に誤りがある（あるいは指定に誤り
は無いがＶＴＲの仕様上間違っている）場合の入力エラ
ーとが挙げられる。上記文法エラーの場合には、例えば
「わかりませんでした。もう一度言って下さい。」との
メッセージを動画表示及び音声出力し、入力待ち状態と
する。上記入力エラーの場合には、例えば、「わかりま
せんでした。何曜日の予約ですか？」とか、「開始と終
了の時刻が同じです。時間を入れ直して下さい。」等の
ように入力文中の誤り項目（エラー項目）を指摘して再
入力を要求するようなメッセージを動画表示及び音声出
力し、入力待ち状態とする。[0062] The error processing of steps S90 and S94, and that of steps S77 and S79 in FIG. 12, is in principle performed every time a voice input is made. The errors fall broadly into two kinds: grammar errors, where no match is found in the semantic analysis routine (no rule applies), and input errors, where the specification of element information is wrong (or is not wrong in itself but is invalid under the specifications of the VTR). For a grammar error, a message such as "I didn't understand. Please say it again." is presented by animated display and voice output, and the system waits for input. For an input error, a message that points out the erroneous item in the input sentence and requests re-entry, such as "I didn't understand. Which day is the reservation for?" or "The start and end times are the same. Please re-enter the times.", is presented by animated display and voice output, and the system waits for input.
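The two error classes described above can be sketched as a small validation function. This is an illustration under assumptions: the name `check_input` and the dictionary layout are hypothetical, a failed parse is represented by `None`, and only the one input error quoted in the text (identical start and end times) is checked.

```python
from typing import Dict, Optional

GRAMMAR_ERROR = "I didn't understand. Please say it again."
SAME_TIME_ERROR = "The start and end times are the same. Please re-enter the times."


def check_input(parsed: Optional[Dict[str, str]]) -> Optional[str]:
    """Return an error message to present, or None if the input is acceptable."""
    if parsed is None:                       # no rule matched: grammar error
        return GRAMMAR_ERROR
    start = parsed.get("start")
    if start is not None and start == parsed.get("end"):
        return SAME_TIME_ERROR               # element information is invalid: input error
    return None
```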

【００６３】次に図１５及び図１６は、上記図８のステ
ップＳ１９での表示処理ルーチンの詳細を説明するため
のフローチャートである。すなわち、上記図８のステッ
プＳ１９における表示処理が開始されると、先ず図１５
のステップＳ１００で表示命令，変更命令，取消命令等
の音声入力がなされる。ステップＳ１０１では録画予約
の情報が有るか無いかの判断がなされ、無しの場合は、
ステップＳ１０２で表示すべき予約情報がないことを示
すエラーメッセージを動画表示及び音声出力して処理を
終了する。またステップＳ１０１で有りと判断された場
合は、ステップＳ１０３で録画予約内容の表示を行った
後、ステップＳ１０４で未表示予約情報が有るか無いか
の判断を行う。ステップＳ１０４で無しと判断されると
ステップＳ１１３に進む。またこのステップＳ１０４で
有りと判断されると、ステップＳ１０５でＣＲＴ表示装
置３０の画面を切り替えるかどうか質問がなされる。ス
テップＳ１０６では、音声による「イエス」，「ノー」
，「戻れ」や変更の命令，取消の命令の入力がなされる
。このステップＳ１０６の入力がなされた後、ステップ
Ｓ１０８で当該入力音声が「イエス」であるか否かの判
断がなされる。このステップＳ１０８でＹｅｓと判断さ
れた場合は、ステップＳ１１５へ進み、更に次の画面の
情報があるか否かの判断がなされる。このステップＳ１
１５でＹｅｓと判断されると、ステップＳ１１６に進ん
で次の画面を表示してステップＳ１０５に戻り、ステッ
プＳ１１５でＮｏと判断されると、ステップＳ１１７に
進んでエラーメッセージを表示してステップＳ１０５に
戻る。また、上記ステップＳ１０８での判断がＮｏとな
る場合は、ステップＳ１０９に進む。当該ステップＳ１
０９では上記音声入力が「戻れ」であるか否かの判断が
なされ、Ｙｅｓの場合はステップＳ１１８に進む。当該ステップＳ１１８では前画面が有るか否かの判断を
行い、Ｙｅｓの場合はステップＳ１１９で前画面を表示
してステップＳ１０５に戻り、Ｎｏの場合は処理を終了
する。上記ステップＳ１０９がＮｏの場合は、ステップ
Ｓ１１０に進む。ステップＳ１１０は上記音声入力が変
更の命令であるか否かの判断がなされ、Ｙｅｓの場合は
ステップＳ１２０で変更処理に進み、Ｎｏの場合はステ
ップＳ１１１に進む。ステップＳ１１１では上記音声入
力が取消の命令であるか否かの判断を行い、Ｙｅｓの場
合はステップＳ１２１で取消処理に進み、Ｎｏの場合は
ステップＳ１１２に進む。当該ステップＳ１１２では、
これら変更処理，取消処理が表示の命令から始まったか
否かの判断がなされ、Ｎｏの場合はステップＳ１１０に
戻り、Ｙｅｓの場合はステップＳ１１３に進む。当該ス
テップＳ１１３では変更か取消をするかどうかの質問を
行い、ステップＳ１１４では音声による「イエス」，「
ノー」，「戻れ」或いは変更命令，取消命令の入力を行
う。[0063] Next, FIGS. 15 and 16 are flowcharts for explaining details of the display processing routine of step S19 in FIG. 8. When the display processing of step S19 in FIG. 8 starts, a display command, change command, cancellation command, or the like is first input by voice in step S100 of FIG. 15. In step S101, it is determined whether any recording reservation information exists. If none exists, an error message indicating that there is no reservation information to display is presented by animated display and voice output in step S102, and the process ends. If it is determined in step S101 that reservation information exists, the recording reservation contents are displayed in step S103, and it is then determined in step S104 whether any reservation information remains undisplayed. If none remains, the process advances to step S113. If some remains, a question is asked in step S105 as to whether to switch the screen of the CRT display device 30. In step S106, "yes", "no", "go back", a change command, or a cancellation command is input by voice. After this input, it is determined in step S108 whether the input voice is "yes". If Yes, the process advances to step S115, where it is further determined whether there is information for a next screen. If Yes in step S115, the next screen is displayed in step S116 and the process returns to step S105; if No, an error message is displayed in step S117 and the process returns to step S105. If the determination in step S108 is No, the process advances to step S109, where it is determined whether the voice input is "go back". If Yes, the process advances to step S118, where it is determined whether a previous screen exists; if Yes, the previous screen is displayed in step S119 and the process returns to step S105, and if No, the process ends. If the determination in step S109 is No, the process advances to step S110. In step S110, it is determined whether the voice input is a change command; if Yes, the process advances to the change processing of step S120, and if No, to step S111. In step S111, it is determined whether the voice input is a cancellation command; if Yes, the process advances to the cancellation processing of step S121, and if No, to step S112. In step S112, it is determined whether this change or cancellation processing started from a display command; if No, the process returns to step S110, and if Yes, it advances to step S113. In step S113, a question is asked as to whether to change or cancel, and in step S114 "yes", "no", "go back", a change command, or a cancellation command is input by voice.

【００６４】更に、この図１５のステップＳ１１４の処
理が終了すると、図１６に示すステップＳ１２５に進む
。当該ステップＳ１２５では上記ステップＳ１１４での
音声入力が変更命令であるか否かの判断がなされ、Ｙｅ
ｓの場合はステップＳ１３１で変更処理を行った後処理
を終了し、Ｎｏの場合はステップＳ１２６に進む。当該
ステップＳ１２６ではステップＳ１１４での音声入力が
取消命令であるか否かの判断がなされ、Ｙｅｓの場合は
ステップＳ１３２で取消処理を行った後処理を終了し、
Ｎｏの場合はステップＳ１２７に進む。当該ステップＳ
１２７ではステップＳ１１４での音声入力が「イエス」
であるか否かの判断がなされ、当該判断がＮｏの場合は
処理を終了し、判断がＹｅｓの場合はステップＳ１２８
に進む。当該ステップＳ１２８では変更／取消の選択の
質問がなされ、ステップＳ１２９で音声による変更命令
，取消命令，「戻れ」の入力がなされる。その後、ステ
ップＳ１３０に進み、ここで上記音声入力が「戻れ」で
あるか否かの判断がなされ、この判断がＹｅｓの場合は
処理を終了し、Ｎｏの場合は上記ステップＳ１２５に戻
る。[0064] When the processing of step S114 in FIG. 15 ends, the process advances to step S125 shown in FIG. 16. In step S125, it is determined whether the voice input of step S114 is a change command. If Yes, the change processing is performed in step S131 and the process ends; if No, the process advances to step S126. In step S126, it is determined whether the voice input of step S114 is a cancellation command. If Yes, the cancellation processing is performed in step S132 and the process ends; if No, the process advances to step S127. In step S127, it is determined whether the voice input of step S114 is "yes". If No, the process ends; if Yes, the process advances to step S128. In step S128, a question asking the operator to choose between change and cancellation is asked, and in step S129 a change command, a cancellation command, or "go back" is input by voice. The process then advances to step S130, where it is determined whether the voice input is "go back". If Yes, the process ends; if No, the process returns to step S125.

【００６５】図１７は上記図８のフローチャートのステ
ップＳ２１における変更処理ルーチンの詳細を説明する
ためのフローチャートである。すなわちこの図１７にお
いて、ステップＳ１４０では録画予約の選択処理が行わ
れ、具体的には例えば録画予約内容の一覧表が表示され
て、どの予約を変更するかを尋ねる旨のメッセージが動
画表示及び音声出力される。次のステップＳ１４１では
、上記表示された一覧表中に例えば４つの内容表示欄が
有る場合、これらの内の何番目の欄の録画予約情報を変
更するかを、例えば「１番」〜「４番」のように音声で
入力する。次のステップＳ１４２では上記音声入力で指
定した番号の欄に実際に録画予約された情報があるか否
かの判断を行う。このステップＳ１４２でＮｏの場合は
ステップＳ１５０でエラーメッセージを出してステップ
Ｓ１４０に戻り、Ｙｅｓの場合はステップＳ１４３に進
む。[0065] FIG. 17 is a flowchart for explaining details of the change processing routine of step S21 in the flowchart of FIG. 8. In FIG. 17, recording reservation selection processing is performed in step S140; specifically, a list of the recording reservation contents is displayed, for example, and a message asking which reservation is to be changed is presented by animated display and voice output. In the next step S141, if the displayed list has, for example, four content display fields, the operator inputs by voice which field's recording reservation information is to be changed, for example "No. 1" through "No. 4". In the next step S142, it is determined whether the field with the number specified by the voice input actually contains recording reservation information. If No in step S142, an error message is issued in step S150 and the process returns to step S140; if Yes, the process advances to step S143.

【００６６】このステップＳ１４３では、指定された録
画予約内容についての変更内容の質問が行われ、ステッ
プＳ１４４で変更内容の要素情報の音声入力が行われる
。ステップＳ１４５では当該要素情報により変更された
録画予約内容の確認処理がなされ、ステップＳ１４６で
音声による「イエス」，「ノー」あるいは要素情報の入
力が行われる。ステップＳ１４７では、上記ステップＳ
１４６の音声入力が要素情報であるか否かの判断がなさ
れ、Ｙｅｓの場合はステップＳ１４５に戻り、Ｎｏの場
合はステップＳ１４８に進む。このステップＳ１４８で
は上記音声入力が「イエス」か否かの判断がなされ、Ｙ
ｅｓ（「イエス」の音声入力）の場合はステップＳ１５
１で上記重複チェック処理を行った後、処理を終了する
。また、このステップＳ１４８での判断がＮｏの場合、
ステップＳ１４９で変更内容の質問が再びなされた後、
ステップＳ１４４に戻る。[0066] In step S143, a question about the content of the change to the specified recording reservation is asked, and in step S144 the element information for the change is input by voice. In step S145, the recording reservation contents as changed by this element information are presented for confirmation, and in step S146 "yes", "no", or element information is input by voice. In step S147, it is determined whether the voice input of step S146 is element information; if Yes, the process returns to step S145, and if No, it advances to step S148. In step S148, it is determined whether the voice input is "yes". If Yes (voice input of "yes"), the overlap check processing described above is performed in step S151 and the process ends. If the determination in step S148 is No, the question about the content of the change is asked again in step S149, and the process returns to step S144.

【００６７】図１８は上記図８のフローチャートのステ
ップＳ２３における取消処理ルーチンの詳細を説明する
ためのフローチャートである。すなわちこの図１８にお
いて、ステップＳ１６０では取消を行う録画予約情報の
選択を行い、ステップＳ１６１で例えば１番〜４番の４
つある録画予約情報の内のどの録画予約情報を取り消す
かを音声で入力する。ステップＳ１６２ではこの１番〜
４番の録画予約情報があるか否かの判断を行う。このス
テップＳ１６２でＮｏの場合はステップＳ１６８でエラ
ーメッセージを出してステップＳ１６０に戻り、Ｙｅｓ
の場合はステップＳ１６３に進む。この図１８のステッ
プＳ１６０からステップＳ１６２までの処理は、上記図
１７のステップＳ１４０からステップＳ１４２までの処
理と略々同様である。[0067] FIG. 18 is a flowchart for explaining details of the cancellation processing routine of step S23 in the flowchart of FIG. 8. In FIG. 18, the recording reservation information to be canceled is selected in step S160, and in step S161 the operator inputs by voice which of, for example, the four items of recording reservation information No. 1 through No. 4 is to be canceled. In step S162, it is determined whether the specified item among No. 1 through No. 4 actually contains recording reservation information. If No in step S162, an error message is issued in step S168 and the process returns to step S160; if Yes, the process advances to step S163. The processing of steps S160 through S162 in FIG. 18 is substantially the same as that of steps S140 through S142 in FIG. 17.

【００６８】次に、ステップＳ１６３では取消の確認処
理がなされ、ステップＳ１６４で音声による「イエス」
，「ノー」の入力が行われる。ステップＳ１６５では、
上記ステップＳ１６４の音声入力が「イエス」であるか
否かの判断がなされ、この判断がＹｅｓ（音声入力が「
イエス」）の場合はステップＳ１６６に進み取消完了と
された後処理を終了する。また、ステップＳ１６５の判
断でＮｏとされた場合はステップＳ１６７へ進み、取消
処理を中止して処理を終了する。[0068] Next, cancellation confirmation processing is performed in step S163, and "yes" or "no" is input by voice in step S164. In step S165, it is determined whether the voice input of step S164 is "yes". If Yes (the voice input is "yes"), the process advances to step S166, the cancellation is completed, and the process ends. If the determination in step S165 is No, the process advances to step S167, the cancellation processing is abandoned, and the process ends.

【００６９】以上説明したような録画予約あるいは予約
確認（及び変更／取消）のフェーズにおいて、上記図９
のステップＳ５４〜Ｓ５９や、上記図１５のステップＳ
１０５〜Ｓ１１１や、上記図１５のステップＳ１１３〜
上記図１５のステップＳ１２７や、上記図１７のステッ
プＳ１４５〜Ｓ１４８等から明らかなように、要求され
た答え以外の入力をも受け付け、理解するようにしてい
る。すなわち図９のステップＳ５４においては、上記図
１１のような表示（及び音声出力）がなされて、一般的
には「イエス」又は「ノー」の答えが要求されているわ
けであるが、直接に要素情報を音声入力することで、ス
テップＳ５８でＹｅｓと判断されて要素情報入力処理Ｓ
５２に移ることができる。また「変更」と音声入力する
ことで、変更処理に移行することができる。また、図１
５のステップＳ１０５においては、画面を切り替えるか
否かの質問メッセージが動画表示及び音声出力されてお
り、通常は「イエス」又は「ノー」で答えるものである
が、変更命令や取消命令を直接入力して、変更処理ステ
ップＳ１２０や取消処理ステップＳ１２１に移行するこ
とができる。[0069] In the recording reservation and reservation confirmation (and change/cancellation) phases described above, inputs other than the requested answer are also accepted and understood, as is clear from steps S54 to S59 of FIG. 9, steps S105 to S111 of FIG. 15, steps S113 of FIG. 15 through S127 of FIG. 16, steps S145 to S148 of FIG. 17, and so on. In step S54 of FIG. 9, the display (and voice output) of FIG. 11 is presented, and a "yes" or "no" answer is what is normally requested; however, by inputting element information directly by voice, the determination in step S58 becomes Yes and the process can move to the element information input processing of step S52. Likewise, inputting "change" by voice moves the process to the change processing. In step S105 of FIG. 15, a question message asking whether to switch the screen is presented by animated display and voice output, and the normal answer is "yes" or "no"; however, a change command or cancellation command can be input directly to move to the change processing of step S120 or the cancellation processing of step S121.

【００７０】これは、通常の人間同士の会話においては
、文脈から意味を類推し、相手の前言と形式的には継ら
ない飛躍した返答が行われることがあることを考慮して
、要求された答え以外の入力をも受け付けるようにし、
入力内容に応じた最適の処理を行わせるものである。こ
れによって、例えば「イエス」又は「ノー」の答えを省
略して次の入力ステップに直接入ることができ、操作手
順の簡略化が図れる。また、所定の処理に移るための手
順が複数存在することになる。例えば録画予約内容を変
更したい場合には、確認時の「これでいいですか？」と
の問いに対して「ノー」と答え、次に「どうしますか？
」との問いに対して「変更」と命令するのが各問いに対
応する正統的な手順であるが、上記確認時に、「変更」
との命令を入力したり、直接変更内容を入力したり、「
戻れ」との命令で直前の情報入力状態に戻ったりするこ
とができ、ユーザ（オペレータ）の様々な入力に柔軟に
対処することができる。しかも、変更項目の要素情報を
例えば「６チャンネル」等のように直接入力して変更す
ることもでき、操作手順の簡略化のみならず、自明なこ
とはわざわざ言及しないという自然言語での対話の環境
あるいは雰囲気を実現することができる。[0070] This takes into account that in ordinary conversation between people, meaning is inferred from context, and a reply may leap ahead without formally following from what the other person has just said. Inputs other than the requested answer are therefore accepted, and the processing best suited to the input is performed. As a result, a "yes" or "no" answer can be omitted, for example, and the next input step entered directly, which simplifies the operating procedure. It also means that there are several ways to reach a given process. For example, to change the contents of a recording reservation, the orthodox procedure that answers each question in turn is to reply "no" to the confirmation question "Is this okay?" and then give the command "change" in reply to the question "What would you like to do?". At the confirmation stage, however, the operator can instead give the command "change", input the changed contents directly, or return to the immediately preceding information input state with the command "go back", so that the system responds flexibly to the various inputs of the user (operator). Moreover, the element information of the item to be changed can be input directly, for example "channel 6", which not only simplifies the operating procedure but also creates the environment or atmosphere of a natural language dialogue in which the obvious is not spelled out.

【００７１】図１９は上記図８のフローチャートのステ
ップＳ２６におけるＶＴＲ４０の操作処理ルーチンの詳
細を説明するためのフローチャートである。すなわちこ
の図１９において、ステップＳ１７１では、ＶＴＲ４０
の現在の状態（例えば動作モードがどのモードになって
いるか）のチェックが行われる。ステップＳ１７２では
、動作モード指定制御信号に応じたコマンドに対応する
操作を後述する動作モードのためのマトリクス（命令−
動作対照表）より検索する。このマトリクスの一部を以
下の表１に示す。[0071] FIG. 19 is a flowchart for explaining details of the operation processing routine for the VTR 40 of step S26 in the flowchart of FIG. 8. In FIG. 19, the current state of the VTR 40 (for example, which operation mode it is in) is checked in step S171. In step S172, the operation corresponding to the command given by the operation mode designation control signal is looked up in a matrix for the operation modes (a command-operation correspondence table), described later. Part of this matrix is shown in Table 1 below.

【表１】次のステップＳ１７３では、このマトリクス内に対応す
る操作があるか否かの判断がなされ、無しの場合はステ
ップＳ１７６でエラーメッセージを出して処理を終了す
る。また、ステップＳ１７３で無しの場合は、ステップ
Ｓ１７４に進み、当該ステップＳ１７４でこのコマンド
に対応する操作を実行させる。その後、ステップＳ１７
５で実行する旨或いは実行した旨のメッセージを出して
処理を終了する。[Table 1] In the next step S173, it is determined whether the matrix contains a corresponding operation. If it does not, an error message is output in step S176 and the process ends. If it does, the process advances to step S174, where the operation corresponding to the command is executed. Then, in step S175, a message stating that the operation will be executed or has been executed is output, and the process ends.

【００７２】ここで上記表１は、例えば、現在の動作モ
ードの状態から、上記音声入力による命令があった場合
、その現在の動作モードからどの動作モードに移るかの
例を示している。すなわち、表１では、例えば、音声命
令で「電源」と入力した場合において、現在の動作モー
ドが例えばパワーオフの状態であったならば、パワーオ
ンの動作モードに制御され、以下同様にしてテープなし
，停止，再生等の場合はパワーオフに制御される。また
、音声命令で例えば「再生」と入力した場合、現在の動
作モードがパワーオフ，停止等となっていれば動作モー
ドが再生に制御される。更に、例えば「早く」と音声入
力を行った場合において、現在の動作モードが例えば停
止，早送り，キューの状態であったならば早送りに制御
され、再生，ポーズであったならばキューに制御され、
巻戻しであったならばレビューに、レビューであったな
らば巻戻しに制御される。また更に、例えば「戻れ」と
音声入力した場合において、現在の動作モードが例えば
停止，早送り，ポーズであったならばレビューに制御さ
れ、再生，キューであったならばリバース再生に制御さ
れ、巻戻しであったならば早送りに、レビューであった
ならば再生に制御される。このように、多義的な「早く
」とか「戻れ」とかの音声入力に対しても、そのときの
動作モードに応じた正しい意味の制御動作が選択される
。[0072] Table 1 shows examples of which operating mode the device moves to from the current operating mode when one of the above voice commands is received. For example, when the voice command "power" is input and the current operating mode is power-off, the device is switched to the power-on operating mode; likewise, when the current mode is no-tape, stop, playback, or the like, it is switched to power-off. When the voice command "play" is input and the current operating mode is power-off, stop, or the like, the operating mode is switched to playback. When "quickly" is input by voice, the device is switched to fast-forward if the current operating mode is stop, fast-forward, or cue; to cue if it is playback or pause; to review if it is rewind; and to rewind if it is review. When "go back" is input by voice, the device is switched to review if the current operating mode is stop, fast-forward, or pause; to reverse playback if it is playback or cue; to fast-forward if it is rewind; and to playback if it is review. In this way, even for an ambiguous voice input such as "quickly" or "go back", the control action with the correct meaning for the operating mode at that time is selected.
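
The lookup in steps S171 to S176, together with the transitions described for Table 1, can be sketched as a nested dictionary keyed by command and current mode. This is only an illustration under stated assumptions: the mode and command identifiers, the `TRANSITIONS` name, the `operate_vtr` function, and the message wording are hypothetical, and only the transitions spelled out in the text are included (the full Table 1 is not reproduced here).

```python
from typing import Dict, Tuple

# Command -> {current mode -> next mode}. Only transitions stated in the
# text are listed; all identifiers are illustrative, not from the patent.
TRANSITIONS: Dict[str, Dict[str, str]] = {
    "power": {
        "power_off": "power_on",
        "no_tape": "power_off",
        "stop": "power_off",
        "play": "power_off",
    },
    "play": {"power_off": "play", "stop": "play"},
    "quickly": {
        "stop": "fast_forward",
        "fast_forward": "fast_forward",
        "cue": "fast_forward",
        "play": "cue",
        "pause": "cue",
        "rewind": "review",
        "review": "rewind",
    },
    "go_back": {
        "stop": "review",
        "fast_forward": "review",
        "pause": "review",
        "play": "reverse_play",
        "cue": "reverse_play",
        "rewind": "fast_forward",
        "review": "play",
    },
}


def operate_vtr(current_mode: str, command: str) -> Tuple[str, str]:
    """Return (new_mode, message) for a voice command in the given mode."""
    # S171 is represented by the caller passing in the current mode.
    # S172: look up the operation for this command in the matrix.
    target = TRANSITIONS.get(command, {}).get(current_mode)
    # S173 / S176: no matching entry, so report an error and keep the mode.
    if target is None:
        return current_mode, f"error: cannot apply '{command}' in mode '{current_mode}'"
    # S174 / S175: execute the operation and report it.
    return target, f"executing '{command}': {current_mode} -> {target}"
```

With this table, `operate_vtr("play", "quickly")` moves to cue while `operate_vtr("stop", "quickly")` moves to fast-forward, which is the context-dependent resolution of an ambiguous command that the paragraph describes.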

【００７３】次に図２０は、上記図８のフローチャート
のステップＳ１４における暇アニメ処理ルーチンの詳細
を説明するためのフローチャートである。この図２０に
おいて、ステップＳ１８１ではトップレベルであるか否
かの判断がなされる。このトップレベルとは、ＶＴＲ４
０の基本操作を（直接的に）行ったり、上記予約／確認
のフェーズに入ったりする会話のレベルのことである。このステップＳ１８１でＹｅｓ（トップレベルにある）
と判断された場合、ステップＳ１８６でトップレベル用
の暇アニメの映像の選択が行われた後、ステップＳ１８
３に進む。また、上記ステップＳ１８１でＮｏと判断さ
れた場合は、ステップＳ１８２で各フェーズ毎の処理モ
ード用の暇アニメの映像の選択が行われる。これら、ト
ップレベル用或いは処理モード用の暇アニメの映像は予
め種々の設定が可能で、具体的には、例えば前述したよ
うに、アニメーションキャラクタＡＣがあくびをしたり
、画面の端に寄りかかったり、頭を掻く動作をさせたり
し、さらには、アニメーションキャラクタＡＣが横たわ
る（横に寝る）ような暇を持て余す動作をさせればよい
。ステップＳ１８３では、例えば乱数発生により、これ
ら暇アニメの映像のパターンのうち何れかを選択する処
理を行った後、ステップＳ１８４に進む。当該ステップ
Ｓ１８４では、音声入力が無い場合の待ち時間が所定時
間経過したか否かの判断を行い、Ｎｏの場合はこの判断
を繰り返し、Ｙｅｓの場合はステップＳ１８５に進む。当該ステップＳ１８５では上記選ばれたパターンの暇ア
ニメの映像をＣＲＴ表示装置３０の画面ＳＣ上に動画表
示する。すなわちアニメーションキャラクタＡＣが暇を
持て余していることを示す動作をさせる。Next, FIG. 20 is a flowchart explaining in detail the idle animation processing routine in step S14 of the flowchart of FIG. 8. In FIG. 20, step S181 determines whether the conversation is at the top level. The top level is the level of conversation at which basic operations of the VTR 40 are performed directly, or from which the reservation/confirmation phase described above is entered. If the result of step S181 is Yes (at the top level), idle animation images for the top level are selected in step S186, and the process proceeds to step S183. If the result of step S181 is No, idle animation images for the processing mode of the current phase are selected in step S182. Various idle animation images for the top level or for each processing mode can be prepared in advance. Specifically, as mentioned above, the animation character AC may yawn, lean against the edge of the screen, scratch its head, or lie down, that is, perform actions showing that it has time on its hands. In step S183, one of these idle animation patterns is selected, for example by random number generation, and the process proceeds to step S184. Step S184 determines whether a predetermined waiting time without voice input has elapsed; if No, the determination is repeated, and if Yes, the process advances to step S185. In step S185, the idle animation of the selected pattern is displayed as a moving image on the screen SC of the CRT display device 30. That is, the animation character AC is made to perform an action showing that it has nothing to do.
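
The selection logic of steps S181 to S185 can be sketched as follows. This is a rough illustration, not the patent's implementation: the pattern names are taken from the actions listed in the text, but the per-phase pattern sets, the phase names, the function names, and the way elapsed time is passed in are all assumptions.

```python
import random
from typing import Dict, List, Optional

# Pattern names follow the actions listed in the text.
TOP_LEVEL_PATTERNS: List[str] = [
    "yawn", "lean_on_screen_edge", "scratch_head", "lie_down",
]

# Hypothetical per-phase pattern sets; the text only says they can be preset.
PHASE_PATTERNS: Dict[str, List[str]] = {
    "reservation": ["yawn", "scratch_head"],
    "confirmation": ["lean_on_screen_edge"],
}


def choose_idle_animation(top_level: bool, phase: str, rng: random.Random) -> str:
    """S181/S186/S182: pick the candidate set, then S183: choose at random."""
    if top_level:
        candidates = TOP_LEVEL_PATTERNS
    else:
        # Fall back to the top-level set for phases with no preset patterns.
        candidates = PHASE_PATTERNS.get(phase, TOP_LEVEL_PATTERNS)
    return rng.choice(candidates)


def idle_animation_step(top_level: bool, phase: str,
                        seconds_without_input: float, timeout_s: float,
                        rng: random.Random) -> Optional[str]:
    """Return the pattern to display (S185), or None while still waiting (S184)."""
    pattern = choose_idle_animation(top_level, phase, rng)
    if seconds_without_input < timeout_s:
        return None
    return pattern
```

Passing the elapsed time in as an argument, instead of sleeping inside the function, keeps the sketch testable; a real controller would presumably poll a timer in its main loop.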

【００７４】また、アニメーションキャラクタＡＣの応
答をより人間臭くし、親近感を与えるために、機器の動
作状態指定等に無関係の音声入力に対しても、何らかの
応答（遊戯応答）が行われるようにすることが好ましい
。具体的な例としては、「おい」との音声入力に対して
、「はい、なんでしょう？」と答えさせたり、「よしよ
し」との音声入力に対して、「照れるなあ」との返答と
共に頭をかく等のアニメーション表示を行わせたりする
等である。[0074] Furthermore, to make the responses of the animation character AC more human and to give a sense of familiarity, it is preferable that some kind of response (a playful response) is also given to voice inputs unrelated to specifying the operating state of the equipment. For example, the character may answer "Yes, what is it?" to the voice input "Hey", or answer the voice input "Yoshi yoshi" (a phrase of praise) with "You're making me blush" together with an animation such as scratching its head.

【００７５】ところで、上記図３の制御回路１５とアニ
メーションキャラクタ発生回路１６や音声合成回路１９
との間でのインターフェースは、所定構造のメッセージ
パケットＭＰを用いて行われている。このメッセージパ
ケットＭＰの一具体例としては、メッセージの種類を示
すメッセージ・タイプと、各メッセージ内容毎に対応付
けられたメッセージ番号と、ステータスとの３つの要素
を有して成るものが考えられる。上記メッセージ・タイ
プは、例えば入力を伴わないメッセージ（上記図５、図
１０参照）や、入力を伴うメッセージ（上記図１１参照
）や、スケジューラ・パケットの表示や、リスト（予約
内容の一覧表）の表示や、重複リストの表示等の区別を
指示する。上記メッセージ番号は、例えば８ビットのと
き２５６種類のメッセージのいずれかを指定できる。すなわち、制御回路１５は入力音声の認識データや現在
の状態（ステータス）等に応じて適切なメッセージのタ
イプや種類（内容）を選択し、そのメッセージ・タイプ
やメッセージ番号を上記メッセージパケットＭＰに入れ
て、アニメーションキャラクタ発生回路１６や音声合成
回路１９に送る。アニメーションキャラクタ発生回路１
６や音声合成回路１９では、送られたメッセージパケッ
トＭＰ内のメッセージ番号に応じたメッセージを動画表
示（アニメーションキャラクタの口の動きの表示とメッ
セージの文字表示）したり音声合成して出力したりする
わけである。Incidentally, the interface between the control circuit 15 of FIG. 3 and the animation character generation circuit 16 and voice synthesis circuit 19 uses a message packet MP of a predetermined structure. One specific example of the message packet MP has three elements: a message type indicating the kind of message, a message number assigned to each message content, and a status. The message type distinguishes, for example, a message without input (see FIGS. 5 and 10), a message with input (see FIG. 11), display of a scheduler packet, display of a list (a table of reservation contents), and display of a duplicate list. The message number, if it is 8 bits long for example, can designate any one of 256 messages. That is, the control circuit 15 selects the appropriate message type and message (content) according to the recognition data of the input voice, the current state (status), and so on, places the message type and message number in the message packet MP, and sends it to the animation character generation circuit 16 and the voice synthesis circuit 19. The animation character generation circuit 16 and the voice synthesis circuit 19 then display the message corresponding to the message number in the received message packet MP as a moving image (showing the mouth movements of the animation character and the text of the message) or output it as synthesized speech.
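
One possible encoding of such a three-element message packet is sketched below. The text fixes only the three fields and gives 8 bits as an example width for the message number; the one-byte widths for the type and status fields, the numeric values of the message types, and the names `MessageType` and `MessagePacket` are assumptions made for illustration.

```python
import struct
from dataclasses import dataclass
from enum import IntEnum


class MessageType(IntEnum):
    """Message kinds named in the text; the numeric values are arbitrary."""
    NO_INPUT = 0          # message that expects no input
    WITH_INPUT = 1        # message that expects input
    SCHEDULER_PACKET = 2  # display of a scheduler packet
    LIST = 3              # table of reservation contents
    DUPLICATE_LIST = 4    # list of overlapping reservations


@dataclass(frozen=True)
class MessagePacket:
    msg_type: MessageType
    number: int   # index into the prepared messages, 0..255 when 8 bits wide
    status: int   # current state; a one-byte width is assumed here

    def encode(self) -> bytes:
        """Pack the three fields into three unsigned bytes."""
        if not 0 <= self.number <= 255:
            raise ValueError("message number must fit in 8 bits")
        return struct.pack("BBB", int(self.msg_type), self.number, self.status)

    @classmethod
    def decode(cls, raw: bytes) -> "MessagePacket":
        """Inverse of encode(), as a receiving circuit might apply it."""
        type_value, number, status = struct.unpack("BBB", raw)
        return cls(MessageType(type_value), number, status)
```

The receiver only needs the message number to look up the prepared text, which is why the packet can stay this small; the alternative of sending composed text character by character, also mentioned in the description, would need a different framing.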

【００７６】なお、このようなメッセージ番号を介して
のメッセージの表示及び音声出力を行う場合には、予め
何種類かのメッセージを用意しておくことが必要とされ
、これらの複数のメッセージの内から１つのメッセージ
が選択されて表示及び音声出力されるわけであるが、そ
の代わりに、制御回路１５にてメッセージ自体を作文さ
せるようにし、この作られたメッセージを例えば１文字
分ずつ送って動画表示や音声合成を行わせるようにして
もよい。[0076] When messages are displayed and output as speech by way of such message numbers, several kinds of messages must be prepared in advance, and one of them is selected for display and speech output. Alternatively, the control circuit 15 may compose the message itself and send the composed message, for example one character at a time, for moving-image display and voice synthesis.

【００７７】次に、上記自然言語入力処理についてさら
に説明する。例えば録画予約時において、時間情報等の
不足項目の推論を常識の範囲で行うようにしている。例
えば、開始時刻入力の際に分だけ指定されたときには、
現在時刻との関連で時間を推定したり、終了時刻入力の
際に時間や分が入力されたときには開始時刻からの録画
時間としたりする。また、複数の要素情報が入力される
ときに、「から」と「まで」との関連性を考慮し、例え
ば「３０分９時まで」のような音声入力に対して８時３
０分から９時０分までと推定したり、現在時刻が８時１
０分であれば、「３０分から１時間」との音声入力に対
して８時３０分から９時３０分までと推定したりする。Next, the natural language input processing described above is explained further. For example, when a recording reservation is made, missing items such as time information are inferred within the bounds of common sense. For example, when only the minutes are specified at start-time input, the hour is estimated in relation to the current time; when hours or minutes are given at end-time input, they are treated as the recording duration counted from the start time. In addition, when several items of element information are input, the relationship between "from" and "until" is taken into account. For example, the voice input "30 minutes, until 9 o'clock" is interpreted as 8:30 to 9:00, and, if the current time is 8:10, the voice input "from 30 minutes, for 1 hour" is interpreted as 8:30 to 9:30.
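
The two worked examples in this paragraph can be reproduced with a small sketch. The rules coded here are assumptions chosen to match those examples (take the next occurrence of the given minute after the current time; take the last occurrence of the given minute before the stated end time); the description does not spell out its inference rules at this level, and the function names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Tuple


def infer_start(now: datetime, minute: int) -> datetime:
    """Only minutes were given: assume the next time with that minute value."""
    start = now.replace(minute=minute, second=0, microsecond=0)
    if start < now:
        start += timedelta(hours=1)
    return start


def range_from_duration(now: datetime, start_minute: int,
                        duration: timedelta) -> Tuple[datetime, datetime]:
    """'From 30 minutes, for 1 hour': the end is the start plus the duration."""
    start = infer_start(now, start_minute)
    return start, start + duration


def range_until(day: datetime, start_minute: int,
                end_hour: int) -> Tuple[datetime, datetime]:
    """'30 minutes, until 9 o'clock': the start must fall before the end."""
    end = day.replace(hour=end_hour, minute=0, second=0, microsecond=0)
    start = end.replace(minute=start_minute)
    if start >= end:
        start -= timedelta(hours=1)
    return start, end
```

For a current time of 8:10, `range_from_duration(now, 30, timedelta(hours=1))` gives 8:30 to 9:30, and `range_until(now, 30, 9)` gives 8:30 to 9:00, matching the two cases in the text.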

【００７８】なお、本発明は上記具体例のみに限定され
るものではなく、例えば、音声入力手段としては送受話
器の他に、いわゆるハンドマイクを用いたり、遠隔操作
装置（いわゆるリモコン）に小型マイクを設けたりする
ようにしてもよい。スイッチ（プレストークスイッチ）
はこれらのハンドマイクやリモコンに設ければよい。ま
た、制御される電子機器はＶＴＲに限定されず、ディス
クの記録及び／又は再生装置や、デジタル又はアナログ
のオーディオテープレコーダ等の種々の電子機器に対す
る制御装置に適用可能である。Note that the present invention is not limited to the specific examples above. For example, in addition to a telephone-style handset, a so-called hand microphone may be used as the voice input means, or a small microphone may be provided in a remote control device (a so-called remote control). The switch (press-to-talk switch) may be provided on the hand microphone or remote control. Further, the electronic equipment to be controlled is not limited to a VTR; the invention is applicable to control devices for various kinds of electronic equipment, such as disc recording and/or reproducing devices and digital or analog audio tape recorders.

【００７９】[0079]

【発明の効果】上述のように、本発明の電子機器の制御
装置においては、電子機器を使用する使用者からの命令
を認識し、この認識結果を用いて使用者からの命令を推
論或いは理解してこの結果に基づいて電子機器を制御す
ると共に推論結果に応じて使用者に対するメッセージを
生成することにより、使用者からの命令の意図を正確に
解析でき電子機器の操作に不慣れな使用者に代わって正
確に電子機器の制御が可能となっている。[Effects of the Invention] As described above, the control device for electronic equipment of the present invention recognizes commands from the user of the electronic equipment, infers or understands the user's commands from the recognition result, controls the electronic equipment on the basis of that result, and generates a message to the user according to the inference result. The intent of the user's command can therefore be analyzed accurately, and the electronic equipment can be controlled accurately on behalf of a user who is unaccustomed to operating it.

【００８０】また、本発明の制御装置においては、操作
が複雑な電子機器をいわゆる機械に弱い使用者にとって
も使い易くするために、使用者に代わって機器の操作を
するエージェントをヒューマンインターフェースに導入
し、更に、このエージェントには機器操作の熟練者をメ
タファとすると共に、機器の操作に関する知識に加えて
使用者の意図を理解しそれを実行しようとする意思を持
つようにしているため、使用者と機器とのコミュニケー
ションを容易にすることが可能となっている。したがっ
て、電子機器の操作に馴染めなかった使用者も、機器を
使い易くなっている。In addition, to make electronic equipment that is complicated to operate easy to use even for users who are not good with machines, the control device of the present invention introduces into the human interface an agent that operates the equipment on the user's behalf. The agent is modeled on the metaphor of a person skilled in operating the equipment and, in addition to knowledge of how to operate the equipment, is given the will to understand the user's intention and carry it out. This makes communication between the user and the equipment easier. As a result, even users who could not get used to operating electronic equipment find the equipment easy to use.

【００８１】更に、本発明の電子機器の制御装置におい
て、インターフェースの基本的な流れすなわち言葉で意
図を伝えて使用者が望む操作を電子機器に対して実行す
るという流れは、簡単な教示によって全ての使用者に理
解できるものとなっていて、使用する言葉に制約が少な
い日常的な言い方ができるので使い易くなっており、自
分で機器の操作ができない人或いは初めてこのインター
フェースを使用する人にとっても非常に便利なものとな
っている。また、使用者は、制御装置が人間的な感情を
表現することで、好意的な印象を持ち、使用者の装置に
対する親近感が得られ、機械とコミュニケーションする
ことへの心理的な拒否反応を軽減する効果がある。Furthermore, in the control device for electronic equipment of the present invention, the basic flow of the interface, namely conveying an intention in words and having the desired operation carried out on the electronic equipment, can be understood by any user after simple instruction. Because there are few restrictions on wording and everyday expressions can be used, the device is easy to use, and it is very convenient even for people who cannot operate the equipment themselves or who are using this interface for the first time. In addition, because the control device expresses human-like emotion, the user forms a favorable impression and feels a sense of familiarity with the device, which reduces psychological resistance to communicating with a machine.

[Brief explanation of drawings]

【図１】実施例の制御装置の概略構成を示すブロック図
である。FIG. 1 is a block diagram showing a schematic configuration of a control device according to an embodiment.

【図２】実施例装置を簡略化して示す図である。FIG. 2 is a simplified diagram of the device of the embodiment.

【図３】本具体例の制御装置の概略構成を示すブロック
図である。FIG. 3 is a block diagram showing a schematic configuration of a control device of this specific example.

【図４】本具体例装置における主要な動作のフローチャ
ートである。FIG. 4 is a flowchart of main operations in the device of this specific example.

【図５】初期状態でのアニメーションキャラクタ及び吹
き出し部が表示されたＣＲＴ画面を示す図である。FIG. 5 is a diagram showing a CRT screen on which animation characters and speech balloons are displayed in an initial state.

【図６】音声認識回路の一構成例を示すブロック図であ
る。FIG. 6 is a block diagram showing an example of the configuration of a speech recognition circuit.

【図７】制御回路での機能を説明するためのブロック図
である。FIG. 7 is a block diagram for explaining functions in a control circuit.

【図８】図４のフローチャートにおける各ステップでの
処理の詳細を示すフローチャートである。FIG. 8 is a flowchart showing details of the processing at each step in the flowchart of FIG. 4.

【図９】図８のフローチャートにおける予約処理の詳細
を示すフローチャートである。FIG. 9 is a flowchart showing details of the reservation processing in the flowchart of FIG. 8.

【図１０】予約入力要求表示がなされたＣＲＴ画面を示
す図である。FIG. 10 is a diagram showing a CRT screen on which a reservation input request is displayed.

【図１１】予約情報確認のための表示がなされたＣＲＴ
画面を示す図である。FIG. 11 is a diagram showing a CRT screen on which a display for confirming reservation information is shown.

【図１２】図９のフローチャートにおける要素情報入力
及び要素情報入力処理の詳細を示すフローチャートであ
る。FIG. 12 is a flowchart showing details of element information input and element information input processing in the flowchart of FIG. 9.

【図１３】図９のフローチャートでの不足情報処理の詳
細を示すフローチャートである。FIG. 13 is a flowchart showing details of the missing information processing in the flowchart of FIG. 9.

【図１４】図９のフローチャートにおける重複チェック
処理の詳細を示すフローチャートである。FIG. 14 is a flowchart showing details of the duplication check processing in the flowchart of FIG. 9.

【図１５】図８のフローチャートにおける表示処理の詳
細を示すフローチャートである。FIG. 15 is a flowchart showing details of the display processing in the flowchart of FIG. 8.

【図１６】図１５のフローチャートの続きのフローチャ
ートである。FIG. 16 is a flowchart continuing the flowchart of FIG. 15.

【図１７】図８のフローチャートにおける変更処理の詳
細を示すフローチャートである。FIG. 17 is a flowchart showing details of the change processing in the flowchart of FIG. 8.

【図１８】図８のフローチャートにおける取消処理の詳
細を示すフローチャートである。FIG. 18 is a flowchart showing details of the cancellation processing in the flowchart of FIG. 8.

【図１９】図８のフローチャートにおけるＶＴＲ操作処
理の詳細を示すフローチャートである。FIG. 19 is a flowchart showing details of the VTR operation processing in the flowchart of FIG. 8.

【図２０】図８のフローチャートにおける暇アニメ処理
の詳細を示すフローチャートである。FIG. 20 is a flowchart showing details of the idle animation processing in the flowchart of FIG. 8.

[Explanation of symbols]

50 ... User (ユーザ)
60 ... Control device (制御装置)
61 ... Microphone (マイク)
62 ... Speech recognition section (音声認識部)
63 ... Dialogue understanding section (対話理解部)
64 ... Speaker (スピーカ)
65 ... Speech synthesis section (音声合成部)
66 ... Electronic equipment control section (電子機器制御部)
67 ... Message generation section (メッセージ生成部)
68 ... Character display section (キャラクタ表示部)
69 ... Monitor (モニタ)
71 ... VTR (ＶＴＲ)
73 ... Television receiver (テレビジョン受像機)

Claims

[Claims]

Claim 1: A control device for electronic equipment, comprising: a speech recognition unit that recognizes speech from a user of the electronic equipment on the basis of a preset vocabulary information table; a dialogue understanding unit that understands or infers the recognition result from the speech recognition unit, at least in association with control information for controlling the electronic equipment, and outputs the understood control information or inference result information; a message generation unit that generates at least message information for requesting confirmation; and a control unit that receives the control information from the dialogue understanding unit and controls the electronic equipment.