JP2011081541A - Input device and control method thereof - Google Patents

Input device and control method thereof

Info

Publication number
JP2011081541A
Authority
JP
Japan
Prior art keywords
operation command
success rate
recognition
user
input device
Prior art date
Legal status
Granted
Application number
JP2009232406A
Other languages
Japanese (ja)
Other versions
JP2011081541A5 (en)
JP5473520B2 (en)
Inventor
Kazuhiro Matsubayashi
Current Assignee
Canon Inc
Original Assignee
Canon Inc
Priority date
Filing date
Publication date
Application filed by Canon Inc
Priority to JP2009232406A
Publication of JP2011081541A
Publication of JP2011081541A5
Application granted; publication of JP5473520B2
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Position Input By Displaying (AREA)
  • Details Of Television Systems (AREA)
  • User Interface Of Digital Computer (AREA)
  • Image Analysis (AREA)

Abstract

PROBLEM TO BE SOLVED: To provide an input device that notifies the user of the likelihood that an operation command will be successfully recognized, and a control method for the input device.

SOLUTION: An input device connected to or incorporated in an electronic device is configured to recognize at least one of a sound emitted by a user and a movement of the user and to convert it into an operation command for the electronic device. The input device includes: an environment acquisition means for acquiring information on the external environment of the input device that affects the recognition success rate of operation commands; a storage means for storing, for each external environment, information representing the recognition success rate of operation commands; a success rate acquisition means for acquiring the recognition success rate of operation commands in the current external environment based on the information acquired by the environment acquisition means and the information stored in the storage means; and a display means for displaying, on a display unit, the recognition success rate acquired by the success rate acquisition means.

COPYRIGHT: (C)2011,JPO&INPIT

Description

The present invention relates to an input device that recognizes at least one of a sound emitted by a user and a movement of the user and converts it into an operation command for an electronic device, and to a control method for such an input device.

In recent years, techniques have been proposed that recognize a user's voice or gestures (for example, hand shapes and movements) and convert them into operation commands for an electronic device (see, for example, Patent Document 1 and Non-Patent Document 1). With such techniques, an electronic device can be operated without a remote controller, keyboard, or touch panel.
However, when the user's voice is recognized from audio input through a microphone and converted into an operation command, the input audio may contain ambient noise (voices of other people, noise from cars and trains, audio output from a television receiver, and so on) in addition to the user's voice. When the input audio contains such noise, recognition (voice recognition) is more likely to fail.
Likewise, when a gesture is recognized from video input from a digital camera and converted into an operation command, ambient brightness affects recognition (gesture recognition). For example, in a dark place it is difficult to recognize the gesture itself. Furthermore, if the sensitivity of the digital camera is raised so that gestures can be recognized in a dark place, noise in the captured video increases, which again makes recognition more likely to fail.

Patent Documents 2 and 3, for example, describe conventional techniques that address such problems.
Specifically, Patent Document 2 discloses a technique that determines, from audio containing ambient noise, whether voice recognition is possible and displays the determination result as a character string.
Patent Document 3 discloses a technique that combines a plurality of feature quantities (feature quantities of voice and of lip movement) with weights corresponding to their respective reliabilities (high or low) and performs recognition processing using the combined feature quantities.

[Patent Document 1] JP-A-63-209296
[Patent Document 2] JP-A-11-352995
[Patent Document 3] JP-A-2006-30447

[Non-Patent Document 1] Kota Irie, Naohiro Wakamura, Kazunobu Umeda, "Construction of an Intelligent Room Using Gesture Recognition: Operating Home Appliances with Hand Gestures," 21st Annual Conference of the Robotics Society of Japan (September 20-22, 2003), 2J15

When a user is watching television and the remote control is not at hand, it is faster to input an operation command by voice if voice recognition is likely to succeed, but faster to go get the remote control if voice recognition is unlikely to succeed. In other words, the user chooses an operation means by weighing the effort of inputting the operation command by voice against the effort of fetching the remote control. The same applies to gesture recognition: the user chooses an operation means by weighing the effort of inputting the operation command by gesture against the effort of fetching the remote control.
However, the techniques disclosed in Patent Documents 2 and 3 do not notify the user of the likelihood that voice recognition or gesture recognition will succeed (the likelihood that an operation command will be successfully recognized), so the user cannot make the comparison described above.

Accordingly, an object of the present invention is to provide an input device, and a control method for the input device, that can inform the user of the likelihood that an operation command will be successfully recognized.

An input device of the present invention is an input device connected to or incorporated in an electronic device, which recognizes at least one of a sound emitted by a user and a movement of the user and converts it into an operation command for the electronic device. The input device comprises: environment acquisition means for acquiring information on the external environment of the input device that affects the recognition success rate of operation commands; storage means for storing, for each external environment, information representing the recognition success rate of operation commands; success rate acquisition means for acquiring the recognition success rate of operation commands in the current external environment based on the information acquired by the environment acquisition means and the information stored in the storage means; and display means for displaying, on a display unit, the recognition success rate of operation commands acquired by the success rate acquisition means.

A control method of the present invention is a method of controlling an input device that is connected to or incorporated in an electronic device and that recognizes at least one of a sound emitted by a user and a movement of the user and converts it into an operation command for the electronic device. The control method comprises: an environment acquisition step of acquiring information on the external environment of the input device that affects the recognition success rate of operation commands; a success rate acquisition step of acquiring the recognition success rate of operation commands in the current external environment based on the information acquired in the environment acquisition step and information stored in storage means that stores, for each external environment, information representing the recognition success rate of operation commands; and a display step of displaying, on a display unit, the recognition success rate of operation commands acquired in the success rate acquisition step.
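The patent claims the device functionally, as a set of cooperating means. As a rough illustration only — the names `EnvironmentAcquirer`, `SuccessRateStore`, and `SuccessRateDisplay` are hypothetical, not from the specification — the claimed structure could be sketched in Python as follows:

```python
from typing import Protocol

class EnvironmentAcquirer(Protocol):
    """Environment acquisition means: information on the external
    environment (e.g. ambient volume, brightness) affecting recognition."""
    def current_environment(self) -> str: ...

class SuccessRateStore(Protocol):
    """Storage means: per-environment information representing the
    recognition success rate of operation commands."""
    def rate_for(self, environment: str) -> float: ...

class SuccessRateDisplay(Protocol):
    """Display means: shows the acquired success rate on a display unit."""
    def show(self, rate: float) -> None: ...

def acquire_and_display(env: EnvironmentAcquirer,
                        store: SuccessRateStore,
                        display: SuccessRateDisplay) -> float:
    """Success rate acquisition + display steps: look up the rate for
    the current external environment and show it to the user."""
    rate = store.rate_for(env.current_environment())
    display.show(rate)
    return rate
```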

本発明によれば、ユーザに対し操作コマンドの認識に成功する可能性を知らせることのできる入力装置及び入力装置の制御方法を提供することができる。   ADVANTAGE OF THE INVENTION According to this invention, the input device and the control method of an input device which can notify a user of the possibility of succeeding in recognition of an operation command can be provided.

FIG. 1 shows an example of the external appearance of a television receiver according to the embodiments.
FIG. 2 shows an example of the functional configuration of the input device according to Embodiment 1.
FIG. 3 shows an example of the processing flow of the input device according to Embodiment 1.
FIG. 4 shows an example of the information stored in the recognition history storage unit of Embodiment 1.
FIG. 5 shows an example of how the recognition success rate is displayed.
FIG. 6 shows an example of how the recognition success rate is displayed.
FIG. 7 shows an example of the functional configuration of the input device according to Embodiment 2.
FIG. 8 shows an example of the information stored in the recognition history storage unit of Embodiment 2.
FIG. 9 shows an example of how the recognition success rate is displayed.

<Embodiment 1>
An input device and a control method therefor according to Embodiment 1 of the present invention will now be described. The input device according to the present invention is connected to or incorporated in an electronic device; in this embodiment, a television receiver 1 incorporating the input device is described (FIG. 1). With the television receiver 1, the user can view, for example, television broadcast content and content obtained from a video recorder, the Internet, and so on. The main display 2 displays the video of the content, and the speaker 3 outputs the audio of the content.

The user can also operate the television receiver 1 with sounds the user emits and with the user's movements (details are given later).
The human presence sensor 6 detects that a person has appeared or left, and is implemented, for example, with an infrared sensor. This allows the power supplied to each device of the television receiver 1 to be controlled as needed, reducing power consumption. For example, power consumption can be reduced by cutting power to each device when a person (user) leaves, that is, when no user is near the television receiver 1.
The sub display 7 displays information about the television receiver 1 as needed. This information could be shown on the main display 2, but displaying it on the sub display 7 keeps it from interfering with content viewing. The sub display 7 can also show information in the power standby mode (the state in which the main display 2 is not energized).

FIG. 2 is a block diagram showing the functional configuration of the input device according to this embodiment. The input device recognizes at least one of a sound emitted by a user and a movement of the user and converts it into an operation command for the electronic device. In this embodiment, both the sound emitted by the user and the movement of the user are recognized and converted into operation commands: the user's voice is recognized as the sound the user emits, and gestures (for example, the shape and movement of the user's hand) are recognized as the user's movement.

The audio input unit 11 outputs the audio input from the microphone 4 to the voice recognition unit 12 as a digital signal (digital audio signal).
The video input unit 13 outputs the video input from the camera 5 (imaging device) to the gesture recognition unit 14 as a digital signal (digital video signal).

The voice recognition unit 12 recognizes the user's voice from the audio input from the microphone 4 and converts it into an operation command (first recognition process; voice recognition). Specifically, the voice recognition unit 12 recognizes the user's voice and converts it into an operation command by pattern matching feature data extracted from the digital audio signal against feature data of predetermined operation commands. For example, the utterance "power off" is converted into an operation command for turning off the power of the electronic device, and the utterance "power on" into an operation command for turning on the power. The utterances "volume up" and "volume down" are converted into operation commands for raising and lowering the volume of the electronic device, respectively, and the utterances "channel up" and "channel down" into operation commands for raising and lowering the channel being viewed, respectively.

The gesture recognition unit 14 recognizes gestures from the video input from the camera 5 and converts them into operation commands (second recognition process; gesture recognition). Specifically, the gesture recognition unit 14 recognizes a gesture and converts it into an operation command by pattern matching feature data extracted from the digital video signal against feature data of predetermined operation commands. For example, a closed-fist ("rock") hand shape is converted into an operation command for turning off the power of the electronic device, and a two-finger ("scissors") hand shape into an operation command for turning on the power. Pointing upward and pointing downward are converted into operation commands for raising and lowering the volume, respectively, and pointing right and pointing left into operation commands for raising and lowering the channel being viewed, respectively.
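The two recognizers thus map a small, fixed vocabulary of utterances and hand shapes onto the same command set. A minimal sketch of that mapping follows; the feature extraction and pattern matching themselves are not reproduced here, and the string keys stand in for whatever pattern each recognizer matched:

```python
from enum import Enum, auto
from typing import Optional

class Command(Enum):
    POWER_OFF = auto()
    POWER_ON = auto()
    VOLUME_UP = auto()
    VOLUME_DOWN = auto()
    CHANNEL_UP = auto()
    CHANNEL_DOWN = auto()

# Voice recognition: matched utterance -> operation command.
VOICE_COMMANDS = {
    "power off": Command.POWER_OFF,
    "power on": Command.POWER_ON,
    "volume up": Command.VOLUME_UP,
    "volume down": Command.VOLUME_DOWN,
    "channel up": Command.CHANNEL_UP,
    "channel down": Command.CHANNEL_DOWN,
}

# Gesture recognition: matched hand shape -> operation command.
GESTURE_COMMANDS = {
    "rock": Command.POWER_OFF,
    "scissors": Command.POWER_ON,
    "point up": Command.VOLUME_UP,
    "point down": Command.VOLUME_DOWN,
    "point right": Command.CHANNEL_UP,
    "point left": Command.CHANNEL_DOWN,
}

def to_command(table: dict, matched: str) -> Optional[Command]:
    """Convert a pattern-matching result to a command; None means the
    input did not match any command pattern."""
    return table.get(matched)
```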

In a configuration that recognizes the user's voice (a sound the user emits), a word spoken in everyday conversation that is the same as (or similar to) a word corresponding to an operation command may be erroneously converted into that operation command. The same holds for recognizing the user's movement: if an everyday movement is the same as (or similar to) a movement corresponding to an operation command, it may be erroneously converted into that command. To prevent this, the input device need only have a function (control means) for switching its own state between a command-accepting state and a command-rejecting state. In this embodiment, when the user performs a start operation, the operation command execution unit 15 described later switches the state of the input device to the command-accepting state, and the user's voice and gestures are converted into operation commands only while the input device is in that state. The start operation is, for example, an utterance such as "TV operation" or a gesture such as pointing at the television.
The start operation and the input of operation commands may be performed by the same operation means or by different means. For example, both may be performed by voice recognition, both by gesture recognition, or one by voice recognition and the other by gesture recognition. The start operation and the input of operation commands may also be performed with a remote controller.
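Put differently, the device behaves as a small two-state machine. A minimal sketch, in which the "start" token and the pass-through interface are illustrative assumptions and the timeout release described later for the command-accepting state is omitted:

```python
from typing import Optional

class CommandGate:
    """Control means: accept operation commands only after the start
    operation, so everyday speech or movement is not executed by mistake."""

    def __init__(self) -> None:
        self.accepting = False  # command-rejecting state initially

    def on_recognized(self, result: str) -> Optional[str]:
        """Return a command to act on, or None if nothing should happen."""
        if not self.accepting:
            if result == "start":      # e.g. the "TV operation" utterance
                self.accepting = True  # switch to command-accepting state
            return None                # anything else is ignored
        return result                  # accepting: pass the command through
```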

The operation command execution unit 15 controls the television receiver 1 by executing recognized operation commands.
The recognition result display unit 16 displays, on the sub display 7, a character string or icon representing the recognized operation command.
From the result of executing the operation command (the state of the television receiver 1 after execution) and the information displayed on the sub display 7 (the character string or icon representing the recognized command), the user can judge whether the desired operation command was executed correctly. If an operation command other than the desired one was executed, the user can return the television receiver 1 to its state before the command was executed by performing a cancel operation (for example, an utterance such as "cancel" or a gesture such as waving a hand sideways).

The operation mode switching unit 17 switches the operation mode of the input device (in this embodiment, the operation mode of the television receiver 1) among a plurality of operation modes with different power consumption (a normal operation mode, a power saving operation mode, and a power standby mode). The operation mode may be switched explicitly by a user operation, or automatically according to the detection result of the human presence sensor 6, elapsed time (for example, the time elapsed since a given operation was performed or since a given mode was selected), the time of day, and so on.
In the power saving operation mode, some sensors (the microphone 4 and camera 5) and circuits run at a lower voltage or a lower clock. The sensors are therefore less sensitive than in the normal operation mode (the level of the signals they generate is lower, and fewer samples are taken), so the recognition success rate of operation commands (the likelihood that voice or gesture recognition succeeds) is lower than in the normal operation mode.
In the power standby mode (a state in which no video or audio is output and the device waits for a power-on operation), power consumption is held down to the point where only the power-on operation is accepted, so the recognition success rate of operation commands is lower still than in the power saving mode.
In other words, the recognition success rate of operation commands differs from one operation mode to another. The number of operation modes may be fewer or greater than three.

The external environment acquisition unit 18 acquires information on the external environment of the input device that affects the recognition success rate of operation commands. An external environment that affects the success rate of voice recognition is, for example, the volume of the audio input from the microphone 4; one that affects the success rate of gesture recognition is, for example, the brightness of the video captured by the camera 5.
The recognition history storage unit 19 stores, for each combination of external environment and operation mode, information representing the recognition success rate of operation commands. In this embodiment, the history of successes and failures of operation command recognition (the number of recognition successes and the number of recognition failures) is stored as that information. Specifically, as shown in FIG. 4, the number of successes and the number of failures are stored for each combination of operation means (voice recognition, gesture recognition), operation mode, and external environment. The number of recognition successes is the number of times the user's voice or gesture was recognized correctly, and the number of recognition failures is the number of times it was not.
In this embodiment, the operation command execution unit 15 also has a function (history recording means) of recording the history of successes and failures of operation command recognition in the recognition history storage unit 19, as information representing the recognition success rate, for each combination of external environment and operation mode. Specifically, the operation command execution unit 15 updates the success and failure counts.
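A minimal sketch of the FIG. 4 table as a data structure, assuming a dictionary keyed by (operation means, operation mode, external environment); the key strings and counts are illustrative:

```python
from dataclasses import dataclass

@dataclass
class Tally:
    successes: int = 0  # times a voice/gesture was recognized correctly
    failures: int = 0   # times it was not

# Key: (operation means, operation mode, external environment level).
RecognitionKey = tuple[str, str, str]

history: dict[RecognitionKey, Tally] = {
    ("voice",   "normal",       "good"):   Tally(successes=120, failures=5),
    ("voice",   "normal",       "poor"):   Tally(successes=30,  failures=25),
    ("gesture", "power_saving", "medium"): Tally(successes=40,  failures=20),
}
```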

The recognition success rate acquisition unit 20 acquires the recognition success rate of operation commands for the current combination of external environment and operation mode, based on the information acquired by the external environment acquisition unit 18 (information on the external environment), the operation mode information, and the information stored in the recognition history storage unit 19. In this embodiment, the recognition success rate acquisition unit 20 calculates (acquires) the success rate from the history of successes and failures recorded in the recognition history storage unit 19. Specifically, the value of (number of successes) / (number of successes + number of failures) for the current combination of external environment and operation mode is calculated as the recognition success rate of operation commands.
The more voices and gestures are recognized, the larger the denominator of this formula becomes, and the success rate converges to a certain value. While the denominator is still small, however, the success rate is not stable, so it is advisable to store values based on the usage histories of multiple test users in the recognition history storage unit 19 in advance, as factory-default initial values.
The recognition success rate level display unit 21 displays the recognition success rate acquired by the recognition success rate acquisition unit 20 on a display unit (the sub display 7).
In this embodiment, the recognition success rate is acquired and displayed individually for each of the first recognition process (voice recognition) and the second recognition process (gesture recognition).
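Continuing the sketch above, the success rate calculation reduces to a ratio over the stored tallies; the factory-seeding mentioned here is an assumption consistent with, but not detailed by, the specification:

```python
def recognition_success_rate(history: dict[RecognitionKey, Tally],
                             means: str, mode: str, env: str) -> float:
    """successes / (successes + failures) for the current combination.

    The table is assumed to be pre-seeded at the factory with counts
    derived from test users, so the ratio is defined and reasonably
    stable even before the owner has used the device much."""
    tally = history[(means, mode, env)]
    total = tally.successes + tally.failures
    return tally.successes / total if total else 0.0
```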

The processing flow of the input device according to this embodiment is described below with reference to the flowchart of FIG. 3. The following processing is performed independently for each operation means (voice recognition, gesture recognition).
First, the recognition success rate acquisition unit 20 acquires information on the current external environment and the current operation mode (step S101). In this embodiment, the current operation mode is obtained from the operation mode switching unit 17, and information on the current external environment from the external environment acquisition unit 18. Specifically, an identifier representing the normal operation mode, the power saving operation mode, or the power standby mode is acquired as the operation mode information, and values corresponding to the volume of the audio input from the microphone 4 and the brightness of the video input from the camera 5 are acquired as the external environment information. Since volume and brightness change from moment to moment, it is preferable to use integrated or averaged values of volume and brightness over some period (several seconds to several minutes) as the external environment information. The external environment information is then classified into several levels according to its value (for example, three levels, good, medium, and poor, as shown in FIG. 4); two levels, or four or more levels, may also be used.
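A minimal sketch of this quantization step, assuming a simple moving average and illustrative thresholds (the specification fixes neither the averaging window nor the level boundaries):

```python
from collections import deque

class EnvironmentLevel:
    """Average a fluctuating measurement (e.g. video brightness) over a
    window and classify it into the discrete levels of FIG. 4. For
    ambient microphone volume, where quieter is better, the comparison
    would be inverted."""

    def __init__(self, window: int, good: float, medium: float) -> None:
        self.samples: deque = deque(maxlen=window)
        self.good, self.medium = good, medium  # illustrative thresholds

    def add(self, value: float) -> None:
        self.samples.append(value)

    def level(self) -> str:
        if not self.samples:
            return "poor"
        avg = sum(self.samples) / len(self.samples)
        if avg >= self.good:
            return "good"
        return "medium" if avg >= self.medium else "poor"
```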

Next, the recognition success rate acquisition unit 20 calculates the recognition success rate of operation commands in the current external environment and operation mode (step S102).
The recognition success rate level display unit 21 then displays the success rate calculated in step S102 on the sub display 7 (step S103). The success rate is displayed, for example, with a level meter (level display), as shown in FIG. 1.
Steps S101 to S103 are repeated until it is determined in step S104 that the user has spoken or made a gesture. When the operation mode or the external environment changes, the success rate calculated in step S102 changes, and so does the level meter displayed in step S103.
For example, the audio input unit 11 determines that an utterance has occurred when audio in the frequency band of human speech is input at a volume above a predetermined value for at least a predetermined time. The video input unit 13 extracts a person from the input video and detects a hand on the extracted person; when the hand is further detected to have moved, it determines that a gesture has been made.
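A minimal sketch of the utterance trigger described for step S104, assuming a frame-based energy check; the threshold and frame count are illustrative, since the specification gives only the qualitative criterion:

```python
def utterance_detected(frame_energies: list,
                       volume_threshold: float = 0.1,
                       min_frames: int = 20) -> bool:
    """True if speech-band energy stays above the threshold for at
    least `min_frames` consecutive frames.

    `frame_energies` is assumed to hold per-frame energy already
    band-passed to the human speech range; the threshold and frame
    count are illustrative, not values from the specification."""
    run = 0
    for energy in frame_energies:
        run = run + 1 if energy >= volume_threshold else 0
        if run >= min_frames:
            return True
    return False
```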

If it is determined in step S104 that the user has spoken or made a gesture, the flow proceeds to step S105, where recognition processing is performed: the voice recognition unit 12 performs voice recognition if an utterance was determined to have occurred, and the gesture recognition unit 14 performs gesture recognition if a gesture was determined to have occurred.
Next, the voice recognition unit 12 or the gesture recognition unit 14 determines whether the input device is in the command-accepting state (step S106). If not (step S106: NO), the flow proceeds to step S107; if so (step S106: YES), it proceeds to step S109.

In step S107, the voice recognition unit 12 or the gesture recognition unit 14 determines whether the recognition result of step S105 indicates the start operation.
If it does (step S107: YES), the flow proceeds to step S108, where the operation command execution unit 15 switches the input device to the command-accepting state. In the command-accepting state, a character string or icon indicating that state is displayed on the sub display 7.
If it does not (step S107: NO), the utterance or gesture is regarded as part of everyday conversation or movement, nothing is done, and the flow returns to step S101.
In this embodiment, the command-accepting state is released when, while in that state, no utterance or movement by the user, and no operation command input, start operation, or cancel operation, occurs for a predetermined time or longer.

In steps S109 and S110, the voice recognition unit 12 or the gesture recognition unit 14 determines whether the recognition result of step S105 indicates an operation command or the cancel operation.
If the result indicates an operation command (step S110: YES), the operation command execution unit 15 executes that command (step S111), and the recognition result display unit 16 displays a character string or icon representing the command on the sub display 7.
The operation command execution unit 15 then increments by one the success count corresponding to the combination of the operation means used to input the command, the current external environment, and the current operation mode (step S112), and the flow returns to step S101.

If the recognition result is determined to indicate neither an operation command nor the cancel operation (step S110: NO), the flow proceeds to step S113, where the operation command execution unit 15 increments by one the failure count corresponding to the combination of the operation means used, the current external environment, and the current operation mode, and then returns to step S101. This is done because, when the input device is in the command-accepting state, the user's movement or utterance is very likely intended as a command operation or cancel operation, and failing to recognize it as such most likely means the recognition failed.

If the recognition result is determined to indicate the cancel operation (step S109: YES), the operation command execution unit 15 determines whether an operation command was executed immediately before (step S114).
If no command was executed immediately before (step S114: NO), the flow returns to step S101. In this case the result was quite possibly misrecognized as a cancel operation, so the failure count may be incremented by one.
If a command was executed immediately before (step S114: YES), the operation command execution unit 15 cancels the execution of that command (step S115).

Then, the success count corresponding to the combination of the operation means used to input the canceled command and the external environment and operation mode at the time the command was executed is decremented by one (step S116); that is, the increment made when the canceled command was executed is undone.
Next, the failure count corresponding to the same combination is incremented by one (step S117), and the flow returns to step S101. This is done because a canceled operation command is very likely a misrecognized one.
The operation means used to input the canceled command and the operation means used for the cancel operation may be the same or different. For example, the cancel operation for a command entered by voice recognition may be performed by any operation means, such as voice recognition, gesture recognition, or the remote control. When the cancel operation is performed by an operation means that requires no recognition processing (for example, the remote control), the processing of FIG. 3 is not executed, but the processing of steps S115 to S117 is executed in the same way.
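The bookkeeping of steps S112, S113, S116, and S117 can be summarized in a few lines. A minimal sketch reusing the `history`/`Tally` structure introduced above (the function names are illustrative):

```python
def record_success(history: dict, key: RecognitionKey) -> None:
    history[key].successes += 1  # step S112

def record_failure(history: dict, key: RecognitionKey) -> None:
    history[key].failures += 1   # step S113

def record_cancellation(history: dict, key: RecognitionKey) -> None:
    """A canceled command was most likely misrecognized: undo the
    success recorded when it was executed (S116) and count a failure
    (S117). `key` is the combination of operation means, external
    environment, and operation mode in effect at execution time."""
    history[key].successes -= 1
    history[key].failures += 1
```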

By repeating the above processing, the recognition success rate of operation commands for the current combination of external environment and operation mode is displayed on a level meter for each operation means. The user can thus grasp the success rate of each operation means (voice recognition, gesture recognition) and compare the effort required to operate the electronic device by each means.

Although this embodiment calculates the recognition success rate, the success rate may instead be a fixed value stored in advance in the recognition history storage unit 19, which achieves the above effect with a simpler configuration.
Also, while this embodiment considers the operation mode in addition to the external environment, the external environment alone may be considered; the above effect is still obtained.

Although the recognition success rate is displayed with a level meter in this embodiment, it may be displayed in any manner: for example, as a number as in FIG. 5(A), as an icon (for example, ◎, ○, △, ×) as in FIG. 5(B), or as the color of a lamp or the like (for example, blue, green, yellow, orange, red, or unlit). Specifically, as shown in FIG. 6(A), the success rate of voice recognition (the first recognition process) may be shown by the color of lamp 8 and that of gesture recognition (the second recognition process) by the color of lamp 9.
The success rate of voice recognition may also be displayed at a position close to the microphone 4, and that of gesture recognition at a position close to the camera 5; specifically, as shown in FIG. 6(B), lamp 8 may be placed near the microphone 4 and lamp 9 near the camera 5. This makes the correspondence between operation means and success rate easy for the user to grasp intuitively, without displaying it in text.
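A minimal sketch of mapping the numeric rate onto two of the display variants described; the breakpoints are illustrative assumptions, not values from the specification:

```python
def to_level_meter(rate: float, bars: int = 5) -> str:
    """Render the success rate as a simple text level meter."""
    lit = round(rate * bars)
    return "#" * lit + "-" * (bars - lit)

def to_icon(rate: float) -> str:
    """Render the success rate as one of the FIG. 5(B)-style icons."""
    if rate >= 0.9:
        return "◎"
    if rate >= 0.7:
        return "○"
    if rate >= 0.4:
        return "△"
    return "×"

print(to_level_meter(0.82), to_icon(0.82))  # prints: ####- ○
```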

Although the recognition success rate is always displayed in this embodiment, the recognition success rate level display unit 21 may display it only while the input device is in the command-accepting state, letting the user concentrate on viewing content except when inputting operation commands.
The human presence sensor 6 may also detect people appearing or leaving, and the sub display 7 and lamps 8 and 9 may be switched on and off accordingly.

Although this embodiment describes the television receiver 1 incorporating the input device, the input device may be connected to or incorporated in any electronic device, such as a personal computer, hard disk recorder, air conditioner, or refrigerator.
Although this embodiment recognizes the user's voice and gestures, a configuration that recognizes only one of them, or that recognizes other sounds and movements the user emits or makes, is also possible; for example, the sound of the user clapping, or the movement of organs such as the eyes and mouth, may be detected.
Since the operation command canceled in step S115 of FIG. 3 may have been entered by remote control, it may also be determined whether the canceled command was entered by voice or gesture recognition; if it was not (for example, it was entered by remote control), the success and failure counts need not be changed.

<Embodiment 2>
Next, an input device and a control method therefor according to Embodiment 2 of the present invention will be described. FIG. 7 is a block diagram showing the functional configuration of the input device according to this embodiment. In addition to the configuration of FIG. 2, the input device further includes a user identification unit 22 and a user position determination unit 23.

The user identification unit 22 identifies the user, for example by recognizing the face of a person photographed by the camera 5 or by analyzing the voiceprint of audio captured by the microphone 4. One user or multiple users may be identified.
The user position determination unit 23 determines the user's position; specifically, it measures the distance and angle from the television receiver 1 to the user using an infrared sensor or a camera. The camera 5 or the human presence sensor 6 may double as the sensor used for this measurement, or a separate sensor may be provided.

The processing flow of the input device according to this embodiment is described below with reference to the flowchart of FIG. 3. Since the basic flow is the same as in Embodiment 1, only the differences are described here.
In step S101, the recognition success rate acquisition unit 20 acquires information on the current external environment, the operation mode, the user's identification, and the user's position. As in Embodiment 1, the current operation mode is obtained from the operation mode switching unit 17 and information on the current external environment from the external environment acquisition unit 18; in addition, the current user's identification information is obtained from the user identification unit 22 and the current user's position information from the user position determination unit 23.

In step S102, the recognition success rate acquisition unit 20 calculates the recognition success rate for the current external environment, operation mode, user, and user position. As shown in FIG. 8, the recognition history storage unit 19 of this embodiment stores success and failure counts for each combination of operation means, operation mode, external environment, user, and user position. FIG. 8 shows an example with two users, "A" and "B", but counts may be stored for only one user or for three or more. FIG. 8 also shows the user's position classified into two levels, "near" and "far", but it may be classified into three or more levels, or by combinations of distance and angle.
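Relative to the Embodiment 1 sketch, the FIG. 8 table simply widens the key. A minimal illustration, reusing `Tally` from above with illustrative entries:

```python
# Embodiment 2 key: the Embodiment 1 key widened by user and position.
RecognitionKey2 = tuple[str, str, str, str, str]

history2: dict[RecognitionKey2, Tally] = {
    # (means,   mode,     environment, user, position)
    ("voice",   "normal", "good",      "A",  "near"): Tally(80, 4),
    ("voice",   "normal", "good",      "B",  "far"):  Tally(35, 12),
    ("gesture", "normal", "medium",    "A",  "far"):  Tally(22, 15),
}
```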

In step S103, the recognition success rate level display unit 21 displays the success rate calculated in step S102 on the sub display 7. In this embodiment, it displays the success rate for the identified user. When multiple users are identified, the success rates of the individual users are displayed simultaneously, for example as shown in FIG. 9(A); when they cannot all be shown at once, the displayed user and that user's success rate may be switched at predetermined intervals, as shown in FIG. 9(B).

The user who input the operation command subject to the count-up or count-down in steps S112, S113, S116, and S117 is identified, for example, as follows. When the operation means used to input the command is voice recognition, the user is identified by analyzing the voiceprint or the lip movement of the photographed face; when it is gesture recognition, the user is identified by recognizing the face of the person who made the gesture.

By repeating the above processing, the recognition success rate of operation commands for the current combination of external environment, operation mode, user, and user position is displayed on a level meter for each operation means. The user can thus grasp the success rate of each operation means (voice recognition, gesture recognition) and compare the effort required to operate the electronic device by each means.
Although this embodiment considers both the user and the user's position in addition to what Embodiment 1 considers, only one of the two may be added; configurations considering the external environment and the user, or the external environment and the user's position, are also possible. The parameters considered are not limited to these: any parameter that affects the recognition success rate of operation commands may be considered.

18 External environment acquisition unit
19 Recognition history storage unit
20 Recognition success rate acquisition unit
21 Recognition success rate level display unit

Claims (9)

電子機器に接続又は内蔵される入力装置であって、ユーザの発する音及びユーザの動きのうち少なくともいずれかを認識して前記電子機器に対する操作コマンドへ変換する入力装置において、
操作コマンドの認識成功率に影響を与える、前記入力装置の外部環境に関する情報を取得する環境取得手段と、
外部環境ごとに、操作コマンドの認識成功率を表す情報を記憶している記憶手段と、
前記環境取得手段により取得された情報と前記記憶手段に記憶された情報に基づいて、現在の外部環境における操作コマンドの認識成功率を取得する成功率取得手段と、
前記成功率取得手段により取得された操作コマンドの認識成功率を表示部に表示する表示手段と、
を有することを特徴とする入力装置。
An input device that is connected to or incorporated in an electronic device and recognizes at least one of a user-generated sound and a user's movement and converts it into an operation command for the electronic device.
Environment acquisition means for acquiring information about the external environment of the input device, which affects the recognition success rate of the operation command;
Storage means storing information representing the recognition success rate of the operation command for each external environment;
Based on the information acquired by the environment acquisition unit and the information stored in the storage unit, a success rate acquisition unit that acquires a recognition success rate of the operation command in the current external environment;
Display means for displaying a recognition success rate of the operation command acquired by the success rate acquisition means on a display unit;
An input device comprising:
操作コマンドの認識の成功及び失敗の履歴を、前記操作コマンドの認識成功率を表す情報として、外部環境ごとに前記記憶手段に記録する履歴記録手段をさらに有し、
前記成功率取得手段は、前記記憶手段に記録された認識の成功及び失敗の履歴から認識成功率を算出する
ことを特徴とする請求項1に記載の入力装置。
A history recording means for recording the history of the success and failure of the recognition of the operation command as information representing the recognition success rate of the operation command in the storage means for each external environment,
The input device according to claim 1, wherein the success rate acquisition unit calculates a recognition success rate from a history of recognition successes and failures recorded in the storage unit.
マイクロホンから入力される音声からユーザの発する音を認識して操作コマンドへ変換する第1の認識処理を行う第1の認識手段と、
撮像装置から入力される映像からユーザの動きを認識して操作コマンドへ変換する第2の認識処理を行う第2の認識手段と、をさらに有し、
前記第1の認識処理と第2の認識処理のそれぞれについて、個別に、現在の外部環境における認識成功率が表示される
ことを特徴とする請求項1または2に記載の入力装置。
First recognition means for performing first recognition processing for recognizing a sound emitted by a user from sound input from a microphone and converting the sound into an operation command;
A second recognizing unit for performing a second recognizing process for recognizing a user's movement from an image input from the imaging apparatus and converting the image into an operation command;
The input device according to claim 1, wherein a recognition success rate in the current external environment is displayed individually for each of the first recognition process and the second recognition process.
4. The input device according to claim 3, wherein the recognition success rate of the first recognition process is displayed at a position close to the microphone, and the recognition success rate of the second recognition process is displayed at a position close to the imaging device.

5. The input device according to any one of claims 1 to 4, further comprising control means for switching the state of the input device between an operation-command-acceptable state and an operation-command-unacceptable state, wherein the display means displays the recognition success rate on the display unit only when the input device is in the operation-command-acceptable state.

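(Illustrative sketch only, not part of the claims.) The gating of claim 5 amounts to checking the device state before updating the display; the state names and class below are hypothetical:

    ACCEPTING = "accepting"
    NOT_ACCEPTING = "not_accepting"

    class InputDeviceState:
        def __init__(self):
            self.state = NOT_ACCEPTING

        def set_state(self, state):
            # Control means: switch between the two acceptance states.
            self.state = state

        def maybe_display_rate(self, rate):
            # Display means: show the rate only while commands can be accepted.
            if self.state == ACCEPTING:
                print(format(rate, ".0%"))

    dev = InputDeviceState()
    dev.maybe_display_rate(0.7)   # nothing shown: commands not accepted
    dev.set_state(ACCEPTING)
    dev.maybe_display_rate(0.7)   # 70%
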
6. The input device according to any one of claims 1 to 5, wherein the input device has a plurality of operation modes that differ in power consumption, the recognition success rate of operation commands differing for each operation mode; the storage means further stores, for each operation mode, information representing the recognition success rate of operation commands; and the success rate acquisition means acquires the recognition success rate of operation commands in the current external environment further in consideration of the current operation mode.

7. The input device according to any one of claims 1 to 6, further comprising user identification means for identifying a user, wherein the storage means further stores, for each user, information representing the recognition success rate of operation commands, and the success rate acquisition means acquires the recognition success rate of operation commands in the current external environment further in consideration of the current user identified by the user identification means.

8. The input device according to any one of claims 1 to 7, further comprising position determination means for determining the position of the user, wherein the storage means further stores, for each user position, information representing the recognition success rate of operation commands, and the success rate acquisition means acquires the recognition success rate of operation commands in the current external environment further in consideration of the current user position determined by the position determination means.

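(Illustrative sketch only, not part of the claims.) Claims 6 to 8 each add one more key, operation mode, user, and user position, to the stored statistics; a hedged Python sketch of the combined lookup, with all names hypothetical:

    from collections import defaultdict

    outcomes = defaultdict(list)  # (environment, mode, user, position) -> [bool]

    def record(environment, mode, user, position, success):
        outcomes[(environment, mode, user, position)].append(bool(success))

    def success_rate(environment, mode, user, position):
        log = outcomes[(environment, mode, user, position)]
        return sum(log) / len(log) if log else None

    record("bright", "low_power", "user_a", "sofa", True)
    record("bright", "low_power", "user_a", "sofa", True)
    record("bright", "low_power", "user_a", "sofa", False)
    print(success_rate("bright", "low_power", "user_a", "sofa"))  # ~0.67
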
9. A control method for an input device that is connected to or incorporated in an electronic device and that recognizes at least one of a sound emitted by a user and a movement of the user and converts it into an operation command for the electronic device, the method comprising:
an environment acquisition step of acquiring information about the external environment of the input device that affects the recognition success rate of operation commands;
a success rate acquisition step of acquiring the recognition success rate of operation commands in the current external environment based on the information stored in storage means that stores, for each external environment, information representing the recognition success rate of operation commands, and the information acquired in the environment acquisition step; and
a display step of displaying, on a display unit, the recognition success rate of operation commands acquired in the success rate acquisition step.

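(Illustrative sketch only, not part of the claims.) The three steps of the method of claim 9 map onto a small orchestration function; the helper names below are hypothetical:

    def control_cycle(sense_environment, stored_rates, display):
        # Environment acquisition step: read the current external environment.
        environment = sense_environment()
        # Success rate acquisition step: combine it with the stored information.
        rate = stored_rates.get(environment)
        # Display step: present the acquired rate on the display unit.
        display(environment, rate)

    control_cycle(
        sense_environment=lambda: "noisy",
        stored_rates={"noisy": 0.35, "quiet": 0.9},
        display=lambda env, rate: print(env, "n/a" if rate is None else format(rate, ".0%")),
    )
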
JP2009232406A 2009-10-06 2009-10-06 Input device and control method thereof Expired - Fee Related JP5473520B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2009232406A JP5473520B2 (en) 2009-10-06 2009-10-06 Input device and control method thereof

Publications (3)

Publication Number Publication Date
JP2011081541A true JP2011081541A (en) 2011-04-21
JP2011081541A5 JP2011081541A5 (en) 2012-11-08
JP5473520B2 JP5473520B2 (en) 2014-04-16

Family

ID=44075547

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2009232406A Expired - Fee Related JP5473520B2 (en) 2009-10-06 2009-10-06 Input device and control method thereof

Country Status (1)

Country Link
JP (1) JP5473520B2 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10240291A (en) * 1996-12-26 1998-09-11 Seiko Epson Corp Voice input possible state informing method and device in voice recognition device
JP2000338991A (en) * 1999-05-25 2000-12-08 Nec Saitama Ltd Voice operation telephone device with recognition rate reliability display function and voice recognizing method thereof
WO2008069519A1 (en) * 2006-12-04 2008-06-12 Electronics And Telecommunications Research Institute Gesture/speech integrated recognition system and method
JP2010511958A (en) * 2006-12-04 2010-04-15 韓國電子通信研究院 Gesture / voice integrated recognition system and method
JP2009218910A (en) * 2008-03-11 2009-09-24 Mega Chips Corp Remote control enabled apparatus

Cited By (239)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9646614B2 (en) 2000-03-16 2017-05-09 Apple Inc. Fast, language-independent method for user authentication by voice
US10318871B2 (en) 2005-09-08 2019-06-11 Apple Inc. Method and apparatus for building an intelligent automated assistant
US11928604B2 (en) 2005-09-08 2024-03-12 Apple Inc. Method and apparatus for building an intelligent automated assistant
US11671920B2 (en) 2007-04-03 2023-06-06 Apple Inc. Method and system for operating a multifunction portable electronic device using voice-activation
US11023513B2 (en) 2007-12-20 2021-06-01 Apple Inc. Method and apparatus for searching using an active ontology
US10381016B2 (en) 2008-01-03 2019-08-13 Apple Inc. Methods and apparatus for altering audio output signals
US9626955B2 (en) 2008-04-05 2017-04-18 Apple Inc. Intelligent text-to-speech conversion
US9865248B2 (en) 2008-04-05 2018-01-09 Apple Inc. Intelligent text-to-speech conversion
US10108612B2 (en) 2008-07-31 2018-10-23 Apple Inc. Mobile device having human language translation capability with positional feedback
US10643611B2 (en) 2008-10-02 2020-05-05 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US11348582B2 (en) 2008-10-02 2022-05-31 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US10795541B2 (en) 2009-06-05 2020-10-06 Apple Inc. Intelligent organization of tasks items
US11080012B2 (en) 2009-06-05 2021-08-03 Apple Inc. Interface for a virtual digital assistant
US10283110B2 (en) 2009-07-02 2019-05-07 Apple Inc. Methods and apparatuses for automatic speech recognition
US10706841B2 (en) 2010-01-18 2020-07-07 Apple Inc. Task flow identification based on user intent
US10741185B2 (en) 2010-01-18 2020-08-11 Apple Inc. Intelligent automated assistant
US11423886B2 (en) 2010-01-18 2022-08-23 Apple Inc. Task flow identification based on user intent
US10049675B2 (en) 2010-02-25 2018-08-14 Apple Inc. User profiling for voice input processing
US9633660B2 (en) 2010-02-25 2017-04-25 Apple Inc. User profiling for voice input processing
US10692504B2 (en) 2010-02-25 2020-06-23 Apple Inc. User profiling for voice input processing
US10417405B2 (en) 2011-03-21 2019-09-17 Apple Inc. Device access using voice authentication
US10102359B2 (en) 2011-03-21 2018-10-16 Apple Inc. Device access using voice authentication
US11120372B2 (en) 2011-06-03 2021-09-14 Apple Inc. Performing actions associated with task items that represent tasks to perform
US11350253B2 (en) 2011-06-03 2022-05-31 Apple Inc. Active transport based notifications
US9733895B2 (en) 2011-08-05 2017-08-15 Samsung Electronics Co., Ltd. Method for controlling electronic apparatus based on voice recognition and motion recognition, and electronic apparatus applying the same
US9002714B2 (en) 2011-08-05 2015-04-07 Samsung Electronics Co., Ltd. Method for controlling electronic apparatus based on voice recognition and motion recognition, and electronic apparatus applying the same
JP2013037689A (en) * 2011-08-05 2013-02-21 Samsung Electronics Co Ltd Electronic equipment and control method thereof
JP2013037688A (en) * 2011-08-05 2013-02-21 Samsung Electronics Co Ltd Electronic equipment and control method thereof
JP2013037454A (en) * 2011-08-05 2013-02-21 Ikutoku Gakuen Posture determination method, program, device, and system
JP2013041580A (en) * 2011-08-05 2013-02-28 Samsung Electronics Co Ltd Electronic apparatus and method of controlling the same
KR101228643B1 (en) * 2011-08-24 2013-01-31 한국과학기술원 Apparatus and method for motion detection, and apparatus for audio and image gerneration
JP2013080015A (en) * 2011-09-30 2013-05-02 Toshiba Corp Speech recognition device and speech recognition method
WO2013069936A1 (en) * 2011-11-07 2013-05-16 Samsung Electronics Co., Ltd. Electronic apparatus and method for controlling thereof
JP2014532933A (en) * 2011-11-07 2014-12-08 サムスン エレクトロニクス カンパニー リミテッド Electronic device and control method thereof
WO2013122310A1 (en) * 2012-02-17 2013-08-22 Lg Electronics Inc. Method and apparatus for smart voice recognition
US9229681B2 (en) 2012-02-17 2016-01-05 Lg Electronics Inc. Method and apparatus for smart voice recognition
US8793138B2 (en) 2012-02-17 2014-07-29 Lg Electronics Inc. Method and apparatus for smart voice recognition
US8793136B2 (en) 2012-02-17 2014-07-29 Lg Electronics Inc. Method and apparatus for smart voice recognition
CN104169837A (en) * 2012-02-17 2014-11-26 Lg电子株式会社 Method and apparatus for smart voice recognition
US11069336B2 (en) 2012-03-02 2021-07-20 Apple Inc. Systems and methods for name pronunciation
US9953088B2 (en) 2012-05-14 2018-04-24 Apple Inc. Crowd sourcing information to fulfill user requests
US11269678B2 (en) 2012-05-15 2022-03-08 Apple Inc. Systems and methods for integrating third party services with a digital assistant
US11321116B2 (en) 2012-05-15 2022-05-03 Apple Inc. Systems and methods for integrating third party services with a digital assistant
US10079014B2 (en) 2012-06-08 2018-09-18 Apple Inc. Name recognition system
US9971774B2 (en) 2012-09-19 2018-05-15 Apple Inc. Voice-based media searching
WO2014065254A1 (en) * 2012-10-25 2014-05-01 京セラ株式会社 Portable terminal device and input operation acceptance method
JP2014085954A (en) * 2012-10-25 2014-05-12 Kyocera Corp Portable terminal device, program and input operation accepting method
US9760165B2 (en) 2012-10-25 2017-09-12 Kyocera Corporation Mobile terminal device and input operation receiving method for switching input methods
US10714117B2 (en) 2013-02-07 2020-07-14 Apple Inc. Voice trigger for a digital assistant
US11636869B2 (en) 2013-02-07 2023-04-25 Apple Inc. Voice trigger for a digital assistant
US10978090B2 (en) 2013-02-07 2021-04-13 Apple Inc. Voice trigger for a digital assistant
US11388291B2 (en) 2013-03-14 2022-07-12 Apple Inc. System and method for processing voicemail
US11798547B2 (en) 2013-03-15 2023-10-24 Apple Inc. Voice activated device for use with a voice-based digital assistant
US9966060B2 (en) 2013-06-07 2018-05-08 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
US10657961B2 (en) 2013-06-08 2020-05-19 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US9966068B2 (en) 2013-06-08 2018-05-08 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US11727219B2 (en) 2013-06-09 2023-08-15 Apple Inc. System and method for inferring user intent from speech inputs
US11048473B2 (en) 2013-06-09 2021-06-29 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US10185542B2 (en) 2013-06-09 2019-01-22 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US10769385B2 (en) 2013-06-09 2020-09-08 Apple Inc. System and method for inferring user intent from speech inputs
US12010262B2 (en) 2013-08-06 2024-06-11 Apple Inc. Auto-activating smart responses based on activities from remote devices
US11314370B2 (en) 2013-12-06 2022-04-26 Apple Inc. Method for extracting salient dialog usage from live data
WO2015097568A1 (en) * 2013-12-24 2015-07-02 Sony Corporation Alternative camera function control
WO2015118578A1 (en) * 2014-02-10 2015-08-13 三菱電機株式会社 Multimodal input device, and method for controlling timeout in terminal device and multimodal input device
US9953654B2 (en) 2014-05-20 2018-04-24 Samsung Electronics Co., Ltd. Voice command recognition apparatus and method
US10878809B2 (en) 2014-05-30 2020-12-29 Apple Inc. Multi-command single utterance input method
US11133008B2 (en) 2014-05-30 2021-09-28 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US11699448B2 (en) 2014-05-30 2023-07-11 Apple Inc. Intelligent assistant for home automation
US11257504B2 (en) 2014-05-30 2022-02-22 Apple Inc. Intelligent assistant for home automation
US10657966B2 (en) 2014-05-30 2020-05-19 Apple Inc. Better resolution when referencing to concepts
US10169329B2 (en) 2014-05-30 2019-01-01 Apple Inc. Exemplar-based natural language processing
US10417344B2 (en) 2014-05-30 2019-09-17 Apple Inc. Exemplar-based natural language processing
US11810562B2 (en) 2014-05-30 2023-11-07 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US10083690B2 (en) 2014-05-30 2018-09-25 Apple Inc. Better resolution when referencing to concepts
US10497365B2 (en) 2014-05-30 2019-12-03 Apple Inc. Multi-command single utterance input method
US11670289B2 (en) 2014-05-30 2023-06-06 Apple Inc. Multi-command single utterance input method
US10714095B2 (en) 2014-05-30 2020-07-14 Apple Inc. Intelligent assistant for home automation
US10699717B2 (en) 2014-05-30 2020-06-30 Apple Inc. Intelligent assistant for home automation
US9668024B2 (en) 2014-06-30 2017-05-30 Apple Inc. Intelligent automated assistant for TV user interactions
US11516537B2 (en) 2014-06-30 2022-11-29 Apple Inc. Intelligent automated assistant for TV user interactions
US10904611B2 (en) 2014-06-30 2021-01-26 Apple Inc. Intelligent automated assistant for TV user interactions
US10431204B2 (en) 2014-09-11 2019-10-01 Apple Inc. Method and apparatus for discovering trending terms in speech requests
WO2016039992A1 (en) * 2014-09-12 2016-03-17 Apple Inc. Dynamic thresholds for always listening speech trigger
JP2017537361A (en) * 2014-09-12 2017-12-14 アップル インコーポレイテッド Dynamic threshold for always listening for speech trigger
US10789041B2 (en) 2014-09-12 2020-09-29 Apple Inc. Dynamic thresholds for always listening speech trigger
US9986419B2 (en) 2014-09-30 2018-05-29 Apple Inc. Social reminders
US10390213B2 (en) 2014-09-30 2019-08-20 Apple Inc. Social reminders
US10438595B2 (en) 2014-09-30 2019-10-08 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US10453443B2 (en) 2014-09-30 2019-10-22 Apple Inc. Providing an indication of the suitability of speech recognition
US11231904B2 (en) 2015-03-06 2022-01-25 Apple Inc. Reducing response latency of intelligent automated assistants
US10930282B2 (en) 2015-03-08 2021-02-23 Apple Inc. Competing devices responding to voice triggers
US10529332B2 (en) 2015-03-08 2020-01-07 Apple Inc. Virtual assistant activation
US10567477B2 (en) 2015-03-08 2020-02-18 Apple Inc. Virtual assistant continuity
US11842734B2 (en) 2015-03-08 2023-12-12 Apple Inc. Virtual assistant activation
US10311871B2 (en) 2015-03-08 2019-06-04 Apple Inc. Competing devices responding to voice triggers
US11087759B2 (en) 2015-03-08 2021-08-10 Apple Inc. Virtual assistant activation
US11468282B2 (en) 2015-05-15 2022-10-11 Apple Inc. Virtual assistant in a communication session
US11070949B2 (en) 2015-05-27 2021-07-20 Apple Inc. Systems and methods for proactively identifying and surfacing relevant content on an electronic device with a touch-sensitive display
US11127397B2 (en) 2015-05-27 2021-09-21 Apple Inc. Device voice control
US10681212B2 (en) 2015-06-05 2020-06-09 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US10356243B2 (en) 2015-06-05 2019-07-16 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US11025565B2 (en) 2015-06-07 2021-06-01 Apple Inc. Personalized prediction of responses for instant messaging
US11947873B2 (en) 2015-06-29 2024-04-02 Apple Inc. Virtual assistant for media playback
US11010127B2 (en) 2015-06-29 2021-05-18 Apple Inc. Virtual assistant for media playback
JP2015194766A (en) * 2015-06-29 2015-11-05 株式会社東芝 speech recognition device and speech recognition method
US11126400B2 (en) 2015-09-08 2021-09-21 Apple Inc. Zero latency digital assistant
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
US11500672B2 (en) 2015-09-08 2022-11-15 Apple Inc. Distributed personal assistant
US11853536B2 (en) 2015-09-08 2023-12-26 Apple Inc. Intelligent automated assistant in a media environment
US11550542B2 (en) 2015-09-08 2023-01-10 Apple Inc. Zero latency digital assistant
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US11809483B2 (en) 2015-09-08 2023-11-07 Apple Inc. Intelligent automated assistant for media search and playback
US11010550B2 (en) 2015-09-29 2021-05-18 Apple Inc. Unified language modeling framework for word prediction, auto-completion and auto-correction
US10366158B2 (en) 2015-09-29 2019-07-30 Apple Inc. Efficient word encoding for recurrent neural network language models
US11587559B2 (en) 2015-09-30 2023-02-21 Apple Inc. Intelligent device identification
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
US11526368B2 (en) 2015-11-06 2022-12-13 Apple Inc. Intelligent automated assistant in a messaging environment
US11886805B2 (en) 2015-11-09 2024-01-30 Apple Inc. Unconventional virtual assistant interactions
US10354652B2 (en) 2015-12-02 2019-07-16 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US10942703B2 (en) 2015-12-23 2021-03-09 Apple Inc. Proactive assistance based on dialog communication between devices
JP2017120609A (en) * 2015-12-24 2017-07-06 カシオ計算機株式会社 Emotion estimation device, emotion estimation method and program
US10446143B2 (en) 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US11227589B2 (en) 2016-06-06 2022-01-18 Apple Inc. Intelligent list reading
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US10049663B2 (en) 2016-06-08 2018-08-14 Apple, Inc. Intelligent automated assistant for media exploration
US11069347B2 (en) 2016-06-08 2021-07-20 Apple Inc. Intelligent automated assistant for media exploration
US10354011B2 (en) 2016-06-09 2019-07-16 Apple Inc. Intelligent automated assistant in a home environment
US11037565B2 (en) 2016-06-10 2021-06-15 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10733993B2 (en) 2016-06-10 2020-08-04 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US10509862B2 (en) 2016-06-10 2019-12-17 Apple Inc. Dynamic phrase expansion of language input
US11657820B2 (en) 2016-06-10 2023-05-23 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10490187B2 (en) 2016-06-10 2019-11-26 Apple Inc. Digital assistant providing automated status report
US11809783B2 (en) 2016-06-11 2023-11-07 Apple Inc. Intelligent device arbitration and control
US10089072B2 (en) 2016-06-11 2018-10-02 Apple Inc. Intelligent device arbitration and control
US10942702B2 (en) 2016-06-11 2021-03-09 Apple Inc. Intelligent device arbitration and control
US11152002B2 (en) 2016-06-11 2021-10-19 Apple Inc. Application integration with a digital assistant
US10580409B2 (en) 2016-06-11 2020-03-03 Apple Inc. Application integration with a digital assistant
US10521466B2 (en) 2016-06-11 2019-12-31 Apple Inc. Data driven natural language event detection and classification
US10297253B2 (en) 2016-06-11 2019-05-21 Apple Inc. Application integration with a digital assistant
US10269345B2 (en) 2016-06-11 2019-04-23 Apple Inc. Intelligent task discovery
US11749275B2 (en) 2016-06-11 2023-09-05 Apple Inc. Application integration with a digital assistant
US10474753B2 (en) 2016-09-07 2019-11-12 Apple Inc. Language identification using recurrent neural networks
US10553215B2 (en) 2016-09-23 2020-02-04 Apple Inc. Intelligent automated assistant
US10043516B2 (en) 2016-09-23 2018-08-07 Apple Inc. Intelligent automated assistant
US11281993B2 (en) 2016-12-05 2022-03-22 Apple Inc. Model and ensemble compression for metric learning
US10593346B2 (en) 2016-12-22 2020-03-17 Apple Inc. Rank-reduced token representation for automatic speech recognition
US11656884B2 (en) 2017-01-09 2023-05-23 Apple Inc. Application integration with a digital assistant
US11204787B2 (en) 2017-01-09 2021-12-21 Apple Inc. Application integration with a digital assistant
US10741181B2 (en) 2017-05-09 2020-08-11 Apple Inc. User interface for correcting recognition errors
US10332518B2 (en) 2017-05-09 2019-06-25 Apple Inc. User interface for correcting recognition errors
US10417266B2 (en) 2017-05-09 2019-09-17 Apple Inc. Context-aware ranking of intelligent response suggestions
US10755703B2 (en) 2017-05-11 2020-08-25 Apple Inc. Offline personal assistant
US10726832B2 (en) 2017-05-11 2020-07-28 Apple Inc. Maintaining privacy of personal information
US10847142B2 (en) 2017-05-11 2020-11-24 Apple Inc. Maintaining privacy of personal information
US11599331B2 (en) 2017-05-11 2023-03-07 Apple Inc. Maintaining privacy of personal information
US10395654B2 (en) 2017-05-11 2019-08-27 Apple Inc. Text normalization based on a data-driven learning network
US11580990B2 (en) 2017-05-12 2023-02-14 Apple Inc. User-specific acoustic models
US11301477B2 (en) 2017-05-12 2022-04-12 Apple Inc. Feedback analysis of a digital assistant
US11405466B2 (en) 2017-05-12 2022-08-02 Apple Inc. Synchronization and task delegation of a digital assistant
US10789945B2 (en) 2017-05-12 2020-09-29 Apple Inc. Low-latency intelligent automated assistant
US11380310B2 (en) 2017-05-12 2022-07-05 Apple Inc. Low-latency intelligent automated assistant
US10791176B2 (en) 2017-05-12 2020-09-29 Apple Inc. Synchronization and task delegation of a digital assistant
US10410637B2 (en) 2017-05-12 2019-09-10 Apple Inc. User-specific acoustic models
US10482874B2 (en) 2017-05-15 2019-11-19 Apple Inc. Hierarchical belief states for digital assistants
US10810274B2 (en) 2017-05-15 2020-10-20 Apple Inc. Optimizing dialogue policy decisions for digital assistants using implicit feedback
US10303715B2 (en) 2017-05-16 2019-05-28 Apple Inc. Intelligent automated assistant for media exploration
US11217255B2 (en) 2017-05-16 2022-01-04 Apple Inc. Far-field extension for digital assistant services
US11675829B2 (en) 2017-05-16 2023-06-13 Apple Inc. Intelligent automated assistant for media exploration
US10311144B2 (en) 2017-05-16 2019-06-04 Apple Inc. Emoji word sense disambiguation
US10909171B2 (en) 2017-05-16 2021-02-02 Apple Inc. Intelligent automated assistant for media exploration
US10748546B2 (en) 2017-05-16 2020-08-18 Apple Inc. Digital assistant services based on device capabilities
US11532306B2 (en) 2017-05-16 2022-12-20 Apple Inc. Detecting a trigger of a digital assistant
US10403278B2 (en) 2017-05-16 2019-09-03 Apple Inc. Methods and systems for phonetic matching in digital assistant services
US10657328B2 (en) 2017-06-02 2020-05-19 Apple Inc. Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling
US10445429B2 (en) 2017-09-21 2019-10-15 Apple Inc. Natural language understanding using vocabularies with compressed serialized tries
US10755051B2 (en) 2017-09-29 2020-08-25 Apple Inc. Rule-based natural language processing
US10636424B2 (en) 2017-11-30 2020-04-28 Apple Inc. Multi-turn canned dialog
US10733982B2 (en) 2018-01-08 2020-08-04 Apple Inc. Multi-directional dialog
US10733375B2 (en) 2018-01-31 2020-08-04 Apple Inc. Knowledge-based framework for improving natural language understanding
US10789959B2 (en) 2018-03-02 2020-09-29 Apple Inc. Training speaker recognition models for digital assistants
US10592604B2 (en) 2018-03-12 2020-03-17 Apple Inc. Inverse text normalization for automatic speech recognition
US10818288B2 (en) 2018-03-26 2020-10-27 Apple Inc. Natural assistant interaction
US11710482B2 (en) 2018-03-26 2023-07-25 Apple Inc. Natural assistant interaction
US10909331B2 (en) 2018-03-30 2021-02-02 Apple Inc. Implicit identification of translation payload with neural machine translation
US10928918B2 (en) 2018-05-07 2021-02-23 Apple Inc. Raise to speak
US11145294B2 (en) 2018-05-07 2021-10-12 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US11900923B2 (en) 2018-05-07 2024-02-13 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US11169616B2 (en) 2018-05-07 2021-11-09 Apple Inc. Raise to speak
US11854539B2 (en) 2018-05-07 2023-12-26 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US11487364B2 (en) 2018-05-07 2022-11-01 Apple Inc. Raise to speak
US10984780B2 (en) 2018-05-21 2021-04-20 Apple Inc. Global semantic word embeddings using bi-directional recurrent neural networks
US10720160B2 (en) 2018-06-01 2020-07-21 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US11431642B2 (en) 2018-06-01 2022-08-30 Apple Inc. Variable latency device coordination
US11386266B2 (en) 2018-06-01 2022-07-12 Apple Inc. Text correction
US11009970B2 (en) 2018-06-01 2021-05-18 Apple Inc. Attention aware virtual assistant dismissal
US11360577B2 (en) 2018-06-01 2022-06-14 Apple Inc. Attention aware virtual assistant dismissal
US10684703B2 (en) 2018-06-01 2020-06-16 Apple Inc. Attention aware virtual assistant dismissal
US10984798B2 (en) 2018-06-01 2021-04-20 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US10892996B2 (en) 2018-06-01 2021-01-12 Apple Inc. Variable latency device coordination
US11495218B2 (en) 2018-06-01 2022-11-08 Apple Inc. Virtual assistant operation in multi-device environments
US10403283B1 (en) 2018-06-01 2019-09-03 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US10496705B1 (en) 2018-06-03 2019-12-03 Apple Inc. Accelerated task performance
US10504518B1 (en) 2018-06-03 2019-12-10 Apple Inc. Accelerated task performance
US10944859B2 (en) 2018-06-03 2021-03-09 Apple Inc. Accelerated task performance
US11010561B2 (en) 2018-09-27 2021-05-18 Apple Inc. Sentiment prediction from textual data
US11170166B2 (en) 2018-09-28 2021-11-09 Apple Inc. Neural typographical error modeling via generative adversarial networks
US10839159B2 (en) 2018-09-28 2020-11-17 Apple Inc. Named entity normalization in a spoken dialog system
US11462215B2 (en) 2018-09-28 2022-10-04 Apple Inc. Multi-modal inputs for voice commands
US11475898B2 (en) 2018-10-26 2022-10-18 Apple Inc. Low-latency multi-speaker speech recognition
US11638059B2 (en) 2019-01-04 2023-04-25 Apple Inc. Content playback on multiple devices
US11348573B2 (en) 2019-03-18 2022-05-31 Apple Inc. Multimodality in digital assistant systems
US11705130B2 (en) 2019-05-06 2023-07-18 Apple Inc. Spoken notifications
US11475884B2 (en) 2019-05-06 2022-10-18 Apple Inc. Reducing digital assistant latency when a language is incorrectly determined
US11423908B2 (en) 2019-05-06 2022-08-23 Apple Inc. Interpreting spoken requests
US11217251B2 (en) 2019-05-06 2022-01-04 Apple Inc. Spoken notifications
US11307752B2 (en) 2019-05-06 2022-04-19 Apple Inc. User configurable task triggers
US11140099B2 (en) 2019-05-21 2021-10-05 Apple Inc. Providing message response suggestions
US11888791B2 (en) 2019-05-21 2024-01-30 Apple Inc. Providing message response suggestions
US11496600B2 (en) 2019-05-31 2022-11-08 Apple Inc. Remote execution of machine-learned models
US11360739B2 (en) 2019-05-31 2022-06-14 Apple Inc. User activity shortcut suggestions
US11657813B2 (en) 2019-05-31 2023-05-23 Apple Inc. Voice identification in digital assistant systems
US11237797B2 (en) 2019-05-31 2022-02-01 Apple Inc. User activity shortcut suggestions
US11289073B2 (en) 2019-05-31 2022-03-29 Apple Inc. Device text to speech
US11360641B2 (en) 2019-06-01 2022-06-14 Apple Inc. Increasing the relevance of new available information
JP7242873B2 (en) 2019-09-05 2023-03-20 三菱電機株式会社 Speech recognition assistance device and speech recognition assistance method
JPWO2021044569A1 (en) * 2019-09-05 2021-12-09 三菱電機株式会社 Voice recognition assist device and voice recognition assist method
WO2021044569A1 (en) * 2019-09-05 2021-03-11 三菱電機株式会社 Speech recognition support device and speech recognition support method
US11488406B2 (en) 2019-09-25 2022-11-01 Apple Inc. Text detection using global geometry estimators
CN112786036A (en) * 2019-11-04 2021-05-11 海信视像科技股份有限公司 Display apparatus and content display method
CN112786036B (en) * 2019-11-04 2023-08-08 海信视像科技股份有限公司 Display device and content display method
US11924254B2 (en) 2020-05-11 2024-03-05 Apple Inc. Digital assistant hardware abstraction
US11765209B2 (en) 2020-05-11 2023-09-19 Apple Inc. Digital assistant hardware abstraction

Also Published As

Publication number Publication date
JP5473520B2 (en) 2014-04-16

Similar Documents

Publication Publication Date Title
JP5473520B2 (en) Input device and control method thereof
CN106463114B (en) Information processing apparatus, control method, and program storage unit
JP7425349B2 (en) equipment control system
JP6525496B2 (en) Display device, remote control device for controlling display device, control method for display device, control method for server, and control method for remote control device
CN108604447B (en) Information processing unit, information processing method and program
US6353764B1 (en) Control method
KR102339657B1 (en) Electronic device and control method thereof
US9824688B2 (en) Method for controlling speech-recognition text-generation system and method for controlling mobile terminal
JP2005284492A (en) Operating device using voice
WO2017168936A1 (en) Information processing device, information processing method, and program
US20150279369A1 (en) Display apparatus and user interaction method thereof
JP2009229899A (en) Device and method for voice recognition
JP2013080015A (en) Speech recognition device and speech recognition method
WO2017141530A1 (en) Information processing device, information processing method and program
US20140214430A1 (en) Remote control system and device
JP2018036902A (en) Equipment operation system, equipment operation method, and equipment operation program
JP2009087074A (en) Equipment control system
JP2004303251A (en) Control method
US11657821B2 (en) Information processing apparatus, information processing system, and information processing method to execute voice response corresponding to a situation of a user
US20190035420A1 (en) Information processing device, information processing method, and program
KR20210155505A (en) Movable electronic apparatus and the method thereof
JP2004289850A (en) Control method, equipment control apparatus, and program recording medium
KR20220072621A (en) Electronic apparatus and the method thereof
JP2004282770A (en) Control method
JP2014048748A (en) Control device, and control method and control program of control device

Legal Events

Date Code Title Description
A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20120926

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20120926

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20130529

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20130604

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20130805

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20140107

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20140204

R151 Written notification of patent or utility model registration

Ref document number: 5473520

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R151

LAPS Cancellation because of no payment of annual fees