JP6934092B2

JP6934092B2 - Initialize conversations with automation agents via selectable graphical elements

Info

Publication number: JP6934092B2
Application number: JP2020117027A
Authority: JP
Inventors: ヴィクラム・アガワル; ディナ・エルハダッド
Original assignee: Google LLC
Current assignee: Google LLC
Priority date: 2017-05-08
Filing date: 2020-07-07
Publication date: 2021-09-08
Anticipated expiration: 2038-05-07
Also published as: CN110622136A; KR20200006558A; US10237209B2; JP6733063B2; US20230336505A1; US20190166073A1; CN110622136B; US20180324115A1; CN112905284A; EP3494499B1; KR102391387B1; EP3494499A1; JP2020520507A; JP2020173846A; US11689480B2; KR102164428B1; EP4167084A1; WO2018208694A1; KR20200117070A

Description

An automated assistant (also known as a "personal assistant module", "mobile assistant", or "chatbot") may be interacted with by a user via a variety of computing devices, such as smartphones, tablet computers, wearable devices, automobile systems, and standalone personal assistant devices. An automated assistant receives input from the user (e.g., typed and/or spoken natural language input) and responds with responsive content (e.g., visual and/or audible natural language output).

An automated assistant may provide a broad range of functionality through interactions with various local and/or third-party agents. For a user to utilize the automated assistant to perform a particular function, the user often must first explicitly invoke the automated assistant (e.g., through a particular spoken phrase and/or a particular hardware input), and then provide a particular invocation phrase associated with the particular function. The invocation phrase invokes, via the automated assistant, an agent that can perform the particular function. However, the user may be unaware of the various functions of the automated assistant and/or of the invocation phrases that invoke those functions via the automated assistant. Furthermore, even when one of the user's devices does not have an automated assistant loaded for answering a query (e.g., obtaining flight details from a tablet), the user may be unaware that the answer can be obtained from another device that does have an automated assistant. As a result, in some situations the user may resort to other, less resource-efficient applications to perform the particular function. Moreover, even when the user does invoke the automated assistant, in some situations the user may still need to engage in extensive, resource-intensive interactions (e.g., dialog turns) with the automated assistant to learn how to perform the particular function via the automated assistant. For example, a large number of dialog turns may be required merely for the user to discover that the automated assistant can enable performance of the particular function.

Techniques are described herein for invoking an automated assistant to communicate with an agent module associated with an application. Some implementations enable a selectable element to be presented to a user in a non-automated-assistant graphical user interface of an application that is separate from an automated assistant application. In response to user selection of the selectable element, the automated assistant may invoke an agent module that corresponds to the selectable element and that is associated with content presented via the non-automated-assistant interface. In some of those implementations, the selectable element is selectable with a single tap, a single click, or another "single selection" manner. In these and other manners, the user can select the selectable element to transition from a non-conversational interface to a conversational automated assistant interface, in which an agent (relevant to the content in the non-conversational interface) is invoked. In some implementations, invoking the agent in the automated assistant interface in this manner may reduce the amount of input the user must provide to perform a function of the agent module. Specifically, instead of first invoking the automated assistant and then providing a particular invocation phrase associated with a particular agent module or a function of that agent module, the agent module associated with the graphical user interface may be invoked directly, via the automated assistant, based on the graphical user interface. The number of inputs required of the user is reduced at least by not requiring a further invocation phrase for the particular agent module. As a result, the automated assistant need not output a general query or process a further invocation phrase for the particular agent module. This reduction in input conserves computational resources and benefits a variety of users, such as users with dexterity issues. Furthermore, discovery of various automated assistant functionalities may be promoted, thereby encouraging use of the potentially more resource-efficient automated assistant interface for further interactions.

In some implementations, a method implemented by one or more processors is set forth as including steps such as receiving a selection of a selectable element at a graphical user interface rendered by a non-automated-assistant application of a computing device. The selectable element may indicate that an agent associated with the graphical user interface can be invoked via an automated assistant application that is separate from the non-automated-assistant application. The steps may further include, in response to the selection of the selectable element, invoking the agent via the automated assistant application. The agent may be one of a plurality of available agents that can be invoked via the automated assistant application. The steps may further include receiving responsive content from the agent in response to invoking the agent, and providing, by the automated assistant application via an automated assistant interface, output that is based on the responsive content received from the agent.
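The steps of this method can be sketched roughly as follows. This is only an illustration under stated assumptions, not the claimed implementation: the `Selection` type, the `AGENTS` registry, the agent identifiers, and the prompt strings are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Selection:
    agent_id: str  # hypothetical identifier carried by the selectable element


# Hypothetical registry of available agents: id -> callable returning responsive content.
AGENTS: Dict[str, Callable[[], str]] = {
    "pizza-company": lambda: "What would you like to order?",
    "hotel-booking": lambda: "Where would you like to stay?",
}


def handle_selection(selection: Selection) -> str:
    """Invoke the agent tied to the selected element and return assistant output."""
    agent = AGENTS.get(selection.agent_id)
    if agent is None:
        raise KeyError(f"unknown agent: {selection.agent_id}")
    content = agent()           # responsive content received from the agent
    return f"Agent: {content}"  # output based on that content, shown by the assistant
```

The single call to `handle_selection` stands in for the "single selection" interaction: no separate assistant invocation or invocation phrase is modeled.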

In other implementations, a method implemented by one or more processors is set forth as including steps such as causing a selectable element to be displayed at a computing device that is operating a non-automated-assistant application. The selectable element may be configured to cause an automated assistant to initialize an agent module associated with the non-automated-assistant application. The steps may further include receiving a selection of the selectable element and, in response to receiving the selection, determining whether the automated assistant is accessible to the computing device. The steps may also include, when the automated assistant is determined to be inaccessible to the computing device, executing a link corresponding to the selectable element to cause a default web page to open and present speakable command phrases for communicating with the agent module.
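The accessibility check and fallback described here might look like the sketch below. The `Outcome` type, the target labels, and the "Try saying" wording are assumptions made for illustration; the text does not specify how the default web page is rendered.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Outcome:
    target: str   # "assistant" or "default_web_page" (labels assumed for this sketch)
    content: str  # link handed to the assistant, or text shown on the web page


def on_element_selected(link: str, assistant_accessible: bool,
                        command_phrases: List[str]) -> Outcome:
    """Route a selection depending on whether an automated assistant is reachable."""
    if assistant_accessible:
        # The assistant receives the link and initializes the agent module.
        return Outcome("assistant", link)
    # Fallback: open a default web page that presents speakable command phrases.
    lines = [f'Try saying: "{phrase}"' for phrase in command_phrases]
    return Outcome("default_web_page", "\n".join(lines))
```

Keeping the fallback in the same handler means the selectable element still does something useful on a device with no assistant: the user at least learns the phrases to use elsewhere.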

In yet other implementations, a non-transitory computer-readable medium is set forth as storing instructions that, when executed by one or more processors, cause the one or more processors to perform steps that include determining that a user is viewing an application interface of a non-automated-assistant application. The application interface may include a first selectable element for initializing communications with an agent module via an automated assistant application. The agent module may be configured to perform an action associated with the non-automated-assistant application. The steps may further include receiving a selection of the first selectable element. The first selectable element may include a link that identifies the agent module and a parameter for performing the action. The steps may also include causing a conversational interface to be presented to the user. The conversational interface may be configured by the automated assistant to act as an intermediary between the user and the agent module. Furthermore, the steps may include providing a second selectable element at the conversational interface. The second selectable element may be based on the parameter identified in the link, in order to further the action.

In addition, some implementations include one or more processors of one or more computing devices, where the one or more processors are operable to execute instructions stored in associated memory, and where the instructions are configured to cause performance of any of the aforementioned methods. Some implementations also include one or more non-transitory computer-readable storage media storing computer instructions executable by one or more processors to perform any of the aforementioned methods.

FIGS. 1A, 1B, and 1C illustrate diagrams of a user activating a conversational user interface at a mobile device.

FIG. 2 illustrates a conversational user interface that can be used to interact with an agent module associated with a website.

FIG. 3 illustrates a system for providing a conversational interface at a client device in order to familiarize a user with speakable commands that are available for controlling various applications and/or websites.

FIGS. 4A, 4B, and 4C illustrate a conversational user interface being presented at a user interface of a computing device.

FIG. 5 illustrates a method for causing an agent module to perform a function via an automated assistant.

FIG. 6 illustrates a method for restricting operations performed in response to a selection of a selectable element according to whether a native application exists for performing the operations.

FIG. 7 illustrates a method for interacting with an agent module via an automated assistant according to whether the agent module is accessible.

FIG. 8 is a block diagram of an example computer system or computing device.

The described implementations relate to systems, methods, and apparatuses for using an automated assistant to interact with an agent module associated with an application. As one example, assume a user has accessed an "order pizza" web page of Pizza Company via a web browser application of a client device. A selectable element may be presented via the web browser application that, when selected, causes an automated assistant application of the client device to invoke an agent of "Pizza Company" and present the user with automated assistant interface output generated by that agent. In other words, in response to selection of the element, the automated assistant application may invoke the agent to enable the user to engage in a dialog with the agent via the automated assistant interface. In some implementations, the selectable element may be included as content of the web page (e.g., embedded in the web page by Pizza Company). Further, in some implementations, in response to selection of the selectable element, the agent may be invoked with an intent and/or with values for intent parameters (e.g., "slot values") that are based on the user's interactions via the web browser application. For example, if the user has interacted with the "order pizza" web page to select a "large 1 topping" pizza, the agent may be invoked with an "order pizza" intent and with the slot values "large" and "1 topping".
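The pizza example can be made concrete with a small sketch of how page state might be carried into an agent invocation. The `AgentInvocation` type and the slot names `size`, `toppings`, and `crust` are hypothetical; the text names only the intent and the two slot values.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class AgentInvocation:
    agent: str
    intent: str
    slots: Dict[str, str] = field(default_factory=dict)


def invocation_from_page_state(agent: str, intent: str,
                               page_state: Dict[str, str]) -> AgentInvocation:
    """Carry over only the values the user has already chosen on the page."""
    slots = {name: value for name, value in page_state.items() if value}
    return AgentInvocation(agent=agent, intent=intent, slots=slots)


# The user picked a "large 1 topping" pizza but has not chosen a crust yet.
invocation = invocation_from_page_state(
    "Pizza Company",
    "order pizza",
    {"size": "large", "toppings": "1 topping", "crust": ""},
)
```

Slots left empty on the page are omitted, so the agent can ask for them in the dialog instead of re-asking for values the user already supplied.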

FIGS. 1A-1C illustrate diagrams of a conversational user interface 114 being activated from an application 106 of a mobile device. Specifically, FIG. 1A illustrates a diagram 100 of the application 106 displayed at a user interface 108 of the mobile device. The application 106 may be, for example, a hotel booking application that allows a user to book a hotel via the user interface. The mobile device may include an automated assistant, or communicate with an automated assistant at a separate device, for assisting with various functions of the mobile device. For example, the automated assistant may be responsive to spoken commands and convert the spoken commands into text that can be used by an agent module associated with an application, such as the application 106. The applications of the mobile device 104 may be associated with agent modules specifically designed to assist the user in performing functions associated with the application 106. In some implementations, the automated assistant may initialize a spoken or textual conversation with the user and act as an intermediary between the user and the agent module associated with the application 106. However, the user may be unaware of the speakable commands available for the automated assistant or for the agent module associated with the application 106, leaving the user with less efficient means of interacting with the agent module. To introduce the user to the speakable commands available for communicating with the agent module via the automated assistant, the automated assistant may provide a conversational user interface 114 that is initialized via selection of a first selectable element 112.

The first selectable element 112 may include a phrase that indicates to the user that the user can use their automated assistant to communicate with the agent module associated with the application 106. For example, the first selectable element 112 may include the phrase "Use automated assistant", putting the user 102 on notice that functions related to the application 106, or to the agent module, can be performed through the user's automated assistant. Initially, if the user 102 is not aware of any speakable commands, the user 102 may either select the first selectable element 112 or speak the phrase "Use automated assistant". In response to the user 102 selecting the first selectable element 112 or speaking the phrase "Use automated assistant", the automated assistant may initialize and invoke the agent module corresponding to the application 106. The first selectable element 112 may be associated with a link or command that specifically identifies the agent module and/or a command or intent to be performed by the agent module. In some implementations, the link may be a universal resource locator (URL), such as "http://assistant.url/hotel-agent-module/hotel-booking", or any command that identifies the agent module. The link may also include any information the user provided to the application before selecting the first selectable element 112. For example, as illustrated in FIG. 1A, the user may have already selected a date for the hotel booking ("3/14") and a number of guests ("1"). The link may therefore identify the agent module and include the date and the number of guests. In this way, the automated assistant is put on notice of the progress of the hotel booking and may invoke the agent with the specified date and number of guests. For example, the agent may be invoked with a "hotel booking" intent, with a value of "3/14" for a "date" slot parameter and a value of "1" for a "number of guests" slot parameter. An example of such a link is "http://assistant.url/agent-module/hotel-booking-date_0314_guests_1". In some implementations, the link or command may include opaque parameters (e.g., ".../date_889293") that conceal the details of how the agent receives input from the automated assistant, in order to eliminate damage caused by malicious URL creators.
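As an illustration of the link format in the example above, the sketch below builds and parses a link of the form "http://assistant.url/agent-module/hotel-booking-date_0314_guests_1". Only that one example string comes from the text; the general pattern inferred from it, the function names, and the returned dictionary layout are assumptions.

```python
import re
from urllib.parse import urlparse

# Assumed layout, inferred from the single example link: <intent>-date_<MMDD>_guests_<N>
LINK_PATTERN = re.compile(
    r"^(?P<intent>[a-z-]+?)-date_(?P<date>\d{4})_guests_(?P<guests>\d+)$"
)


def build_link(intent: str, date: str, guests: int) -> str:
    """Encode an intent plus the date ("M/D") and guest count into a link."""
    month, day = (int(part) for part in date.split("/"))
    return (f"http://assistant.url/agent-module/"
            f"{intent}-date_{month:02d}{day:02d}_guests_{guests}")


def parse_link(link: str) -> dict:
    """Recover the agent identifier, intent, and slot values from a link."""
    path = urlparse(link).path.strip("/")  # "agent-module/hotel-booking-date_..."
    agent, _, tail = path.partition("/")
    match = LINK_PATTERN.match(tail)
    if match is None:
        raise ValueError(f"unrecognized link: {link}")
    mmdd = match["date"]
    return {
        "agent": agent,
        "intent": match["intent"],
        "slots": {
            "date": f"{int(mmdd[:2])}/{int(mmdd[2:])}",
            "guests": int(match["guests"]),
        },
    }
```

A readable layout like this is easy to forge, which is the motivation the text gives for opaque parameters: an opaque token would be looked up on the assistant side rather than parsed from the link itself.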

In some implementations, in response to the user selecting the first selectable element 112, the automated assistant may use the link, and the agent module identifier in the link, to determine whether the agent module is accessible to the automated assistant. If the agent module is available to the automated assistant, the automated assistant may invoke the agent module and may optionally present the user with commands available for further interaction with the agent module. For example, in FIG. 1B the agent module associated with the application 106 has been invoked in the conversational interface 114, and output ("Agent: What is ...") is presented in the conversational interface based on responsive content that the agent module generated in response to the invocation. For instance, the agent module may have been invoked with a "hotel booking" intent and, in response, provided the output illustrated in FIG. 1B.

In some implementations, the automated assistant may access an index of historical commands provided to the application 106. The automated assistant may use the index of historical commands to make suggestions for the user to interact with the agent module. Once the corresponding agent module has been identified, or the historical commands have been identified, the automated assistant or another application on the mobile device 104 may present a list of commands to the user in the conversational user interface 114, as provided in diagram 116 of FIG. 1B. For example, a suggestion element 120 may be provided with the phrase "Location of my previous booking". When the suggestion element 120 is selected, the automated assistant may direct the agent module to book a hotel at the same location where the user previously booked a hotel. The automated assistant may determine the location of the previous hotel booking by searching user data corresponding to interactions between the user and the application 106. Alternatively, a suggestion element 122 may be provided with the phrase "Location in my calendar". When the suggestion element 122 is selected, the automated assistant may direct the agent module to book a hotel at a location according to an event stored in the user's calendar on the date indicated in the link ("3/14"). The automated assistant may also provide a text input element 124 that, when selected, allows the user to type input into the conversational user interface 114, and a voice input element 132 that, when selected, allows the user to speak input to the automated assistant. In this way, the user may optionally choose between selecting a suggested input for the agent module and providing textual or spoken input to the agent module via the automated assistant. The input from the user may then be provided to the agent module by the automated assistant, enabling the agent module to generate further responsive content that continues the dialog.
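A minimal sketch of how the two suggestion elements might be derived is given below. The data shapes (a list of past bookings, a calendar keyed by date string) and the function name are assumptions for illustration; the text does not describe how the history index or calendar is stored.

```python
from typing import Dict, List, Tuple


def build_suggestions(history: List[Dict[str, str]],
                      calendar: Dict[str, Dict[str, str]],
                      date: str) -> List[Tuple[str, str]]:
    """Return (label, location) pairs for the suggestion elements."""
    suggestions: List[Tuple[str, str]] = []
    if history:
        # Most recent booking found in the user's interaction history.
        suggestions.append(("Location of my previous booking",
                            history[-1]["location"]))
    event = calendar.get(date)
    if event is not None:
        # Calendar event on the date carried in the link (e.g. "3/14").
        suggestions.append(("Location in my calendar", event["location"]))
    return suggestions


suggestions = build_suggestions(
    history=[{"location": "Louisville"}, {"location": "Chicago"}],
    calendar={"3/14": {"title": "Conference", "location": "Denver"}},
    date="3/14",
)
```

A suggestion is offered only when its backing data exists, so a user with no booking history would see just the calendar option, or none at all.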

The list of commands may include commands for furthering, through interactions between the automated assistant and the agent module, an operation that was started at the application. The list of commands may be speakable commands understood by the agent module, or textual commands that are spoken by the user and converted to text using the automated assistant or another application on the mobile device or a remote device. For example, the link associated with the first selectable element 112 may identify a modality for how the automated assistant is to receive input and/or provide output. The modality may be text, voice, or any other medium for receiving input and providing output. The modality may be identified in the link that is provided to the automated assistant. For example, the link may identify the agent module, an intent or action, and/or a modality (e.g., "http://assistant.url/agent-module/hotelbooking-text_modality").
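Reading the modality out of a link such as "http://assistant.url/agent-module/hotelbooking-text_modality" could be sketched as follows. The suffix convention is inferred from that single example, and falling back to text when no modality is present is an assumption of this sketch, not something the text states.

```python
import re
from typing import Tuple

# Assumed convention, inferred from the example link: "...-<modality>_modality"
MODALITY_SUFFIX = re.compile(r"-(?P<modality>[a-z]+)_modality$")


def split_modality(link: str, default: str = "text") -> Tuple[str, str]:
    """Return (link without the modality suffix, modality)."""
    match = MODALITY_SUFFIX.search(link)
    if match is None:
        return link, default  # assumed fallback when the link names no modality
    return link[:match.start()], match["modality"]
```

The assistant could then use the returned modality to decide whether to open the conversational interface with the text input element or the voice input element active.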

In response to the user selecting any of the suggestion elements 120, 122, or 124 in diagram 116 of FIG. 1B, the conversational user interface 114 may be updated, as illustrated in diagram 118 of FIG. 1C. The updated conversational user interface 130 of FIG. 1C may include further responsive content from the agent module. For example, once the automated assistant has communicated the location for the hotel booking to the agent module, the agent module may send the automated assistant further responsive content corresponding to payment for the hotel booking. The automated assistant may present the user with optional response elements for responding to the agent module. The automated assistant may use the responsive content from the agent module to search user data accessible to the automated assistant and generate selectable responses based on the search. For example, because the agent module has provided the automated assistant with a query asking about payment, the automated assistant may search the user data for payment information. If the automated assistant determines that the user has a payment card stored on the mobile device, the automated assistant may present a response element 126 that includes the phrase "Pay with my stored card". In some implementations, the automated assistant may predict that the user will want to perform some other function once the interaction with the agent module is complete. In such cases, the automated assistant may present the user with a dual agent response element 128 that responds to one agent module (e.g., the hotel booking agent module) and invokes another agent module for completing the other function (e.g., booking a flight). For example, the automated assistant may provide a dual agent response element 128 that includes the phrase "Pay with stored card...", which, when spoken or selected, directs the agent module to charge the user for the hotel booking using the stored card. At the same time, the automated assistant may include the phrase "...and help me book a flight" in the dual agent response element 128. In response to the user selecting the dual agent response element 128, the automated assistant may also invoke an agent module corresponding to a flight booking application or website. In this way, the number of inputs the user must provide to perform certain operations may be reduced. This can be beneficial for users with dexterity issues or other ailments that may prevent them from providing input to the mobile device efficiently.

FIG. 2 shows a conversational user interface 214 that can be used to interact with an agent module associated with a website 206. While browsing the website 206 on a mobile device 204, a user 202 may be unaware that some functions of the website 206 can be controlled through speakable commands that the automation assistant and/or the agent module can interpret. To familiarize the user 202 with the speakable commands, the user 202 may be directed to initialize an automation assistant, which can provide the conversational user interface 214 according to the implementations described herein. For example, the user 202 may be browsing a food ordering website 206, as shown in diagram 200 of FIG. 2. While viewing the website 206, the user 202 may notice a first selectable element 212 presented in the user interface 208 of the mobile device 204. The first selectable element 212 may include a phrase indicating to the user 202 that the automation assistant can be used to interact with the agent module associated with the website 206 in order to perform some functions associated with the website 206. For example, the first selectable element 212 may include the phrase "Use automation assistant," as shown in FIG. 2. The first selectable element 212 may be associated with a link, such as a URL, that can be provided to the automation assistant in order to open a conversational user interface on the mobile device 204. For example, the link may identify an agent module suitable for receiving speakable commands from the user 202 for performing a function associated with the website 206 (e.g., ordering food). Such a link may, for example, have the structure http://assistant.url/food%ordering%agent%module, where "%" denotes a spacing character and the type of agent module (e.g., "food ordering") is identified after the host name. In some implementations, if the mobile device 204 includes a third-party application corresponding to the website 206 (e.g., a food ordering application), the automation assistant may forward the link to the third-party application so that the conversation between the user 202 and the food ordering application can continue. Otherwise, the automation assistant may receive the link, and the conversational user interface 214 may be provided at the mobile device 204 or at a separate computing device 224.
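The link structure described above can be illustrated with a minimal Python sketch. The function name `agent_module_from_link` is hypothetical and not part of the disclosure; the sketch only assumes, as the text states, that the agent module type follows the host name and that "%" acts as a spacing character rather than as standard URL percent-encoding.

```python
from urllib.parse import urlsplit


def agent_module_from_link(link: str) -> str:
    """Return the agent module type named after the host in an assistant link.

    Hypothetical helper: the example link uses "%" as a plain spacing
    character, so the path is split on "%" by hand instead of being
    percent-decoded.
    """
    path = urlsplit(link).path.lstrip("/")
    words = [word for word in path.split("%") if word]
    return " ".join(words)


link = "http://assistant.url/food%ordering%agent%module"
print(agent_module_from_link(link))  # prints "food ordering agent module"
```

Splitting by hand is a deliberate choice here: a bare "%" that is not followed by two hexadecimal digits is not valid percent-encoding, so a standard decoder would not treat it as a separator.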

As yet another example, the website 206 may be a food delivery ordering website, and the user may interact with the website 206 (e.g., via drop-down menus, radio buttons, or free-form text) to select items and/or ingredients for a food order (e.g., toppings for a pizza) and, optionally, to finalize and pay for the delivery order. If the user has partially completed the food order through interaction with the website 206, values for one or more of the selected ingredients may be sent to the automation assistant, allowing the automation assistant to invoke the agent module associated with the website 206 with those values (e.g., the values are included as slot values sent to the agent module when it is invoked). In some implementations, the server hosting the website 206 may generate the values to be passed to the automation assistant. For example, the link for a selectable graphical element may be dynamically generated by the server in response to the user's interaction with the website 206, so that the link includes an indication of the values (e.g., "http://assistant.url/agent-module/order-pizz_toppings=pepperoni-mushroom-peppers"). For example, the link associated with the selectable graphical element may be dynamically updated by the server in response to the user's interaction with the website 206. As another example, the server may send a command in response to selection of the selectable graphical element, where the command includes the values and optionally also indicates the agent module. For example, the command provided by the server to the automation assistant may be tailored by the server to include the toppings selected for the pizza through the user's interaction with the website 206 (e.g., the command "ACTION = com.assistant.toppings_pepperoni-bacon-onion.StartConversation"). In some implementations, the automation assistant itself may process the content of the interface of the website 206 to determine the values directly. For example, one or more screenshots of the interface may be processed to determine the text of field titles and/or the selected values of fields, and those titles and/or values may be used to determine appropriate values to pass to the agent module with the invocation request. In some implementations where values derived from the user's interaction with a non-automation-assistant interface are used in invoking the associated agent module, duplicate re-entry of those values through the automation assistant interface may be reduced (e.g., eliminated). This can conserve various resources: the agent module can be invoked with the values, which removes the need for dialog turns through the automation assistant interface to define them.
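As a rough illustration of how a server might embed slot values in a dynamically generated link, and how a recipient might read them back, consider the sketch below. The helper names `build_slot_link` and `parse_slot_link` are hypothetical, and the `<action>_<slot>=<value>-<value>` layout is inferred only from the single example link in the text, not from any defined format.

```python
from typing import List, Tuple


def build_slot_link(base: str, action: str, slot: str, values: List[str]) -> str:
    """Compose a link shaped like <base>/<action>_<slot>=<v1>-<v2>-... (assumed layout)."""
    return f"{base}/{action}_{slot}=" + "-".join(values)


def parse_slot_link(link: str) -> Tuple[str, str, List[str]]:
    """Recover (action, slot, values) from a link built by build_slot_link."""
    tail = link.rsplit("/", 1)[1]
    head, _, joined = tail.partition("=")
    action, _, slot = head.partition("_")
    return action, slot, joined.split("-")


# Values the user picked on the website before selecting the graphical element.
link = build_slot_link(
    "http://assistant.url/agent-module",
    "order-pizz",
    "toppings",
    ["pepperoni", "mushroom", "peppers"],
)
print(link)
print(parse_slot_link(link))
```

Because the values travel inside the link, the agent module can start with the toppings already filled in, which is the saving in dialog turns that the paragraph describes.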

In some implementations, the automation assistant may not be available on the mobile device 204 that is displaying the website 206, but the mobile device 204 may be connected over a network to a separate computing device 224 that includes the automation assistant. In this implementation, when the user 202 selects the first selectable element 212, the mobile device 204 may provide the computing device 224 with the link (or other content) for invoking the automation assistant at the computing device 224. The automation assistant may use the link to identify the agent module and to identify data related to the status of the current operation being performed at the website 206.

The conversational user interface 214 may include multiple different selectable elements containing phrases that correspond to speakable commands for interacting with the agent module via the automation assistant. The phrases may be based on the results of the automation assistant processing the content of the website, on preconfigured commands provided to the automation assistant by the agent module or the website, and/or on the user's historical interactions with the website as recorded by the automation assistant. For example, a selectable element in the conversational user interface 214 may include a phrase such as "Order the food delivery," and the phrase may be based on the status of the order (or other command) detailed in the link provided to the automation assistant. The phrase may be spoken by the user 202 and converted to text by the automation assistant. The text may then be provided to the agent module associated with the website 206. The agent module may receive the text and complete the food delivery according to the text.

In some implementations, the link may include parameters for guiding the automation assistant during its interaction with the agent module. For example, the user 202 may at least partially fill out the food ordering website 206 before selecting the first selectable element 212. The portion of the website 206 filled out by the user 202 may include order data such as a delivery location, a quantity of food, and/or a drink order. This data may be embedded in the link corresponding to the first selectable element 212. For example, the link corresponding to the first selectable element 212 may be "http://assistant.url/agent-module/breakfast-order/drink-coffee-location-market-street". The automation assistant may parse the link to identify parameters for a subsequent action or intent to be performed by the agent module. For example, the intent "breakfast-order" identified in the link may involve multiple parameters that must be identified before the intent can be completed. The automation assistant may use the parameters "coffee" and "market street" to inform the user of the current status of the order and to request additional information needed to complete the order (e.g., "Pay for food").
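The parsing step described above might look like the following Python sketch. It is illustrative only: the `INTENT_SLOTS` table, the slot name `payment`, and the function `parse_intent_link` are assumptions introduced here. Because hyphens separate both slot names and the words inside a value ("market-street"), the sketch assumes the parser already knows the slot names for each intent.

```python
from typing import Dict, List, Tuple
from urllib.parse import urlsplit

# Hypothetical schema: slot names each intent needs before it can complete.
INTENT_SLOTS: Dict[str, Tuple[str, ...]] = {
    "breakfast-order": ("drink", "location", "payment"),
}


def parse_intent_link(link: str) -> Tuple[str, Dict[str, str], List[str]]:
    """Return (intent, filled slots, missing slots) for an assistant link."""
    segments = [s for s in urlsplit(link).path.split("/") if s]
    intent, encoded = segments[-2], segments[-1]
    slot_names = INTENT_SLOTS[intent]

    collected: Dict[str, List[str]] = {}
    current = None
    for token in encoded.split("-"):
        if token in slot_names:
            # A known slot name starts a new parameter.
            current = token
            collected[current] = []
        elif current is not None:
            # Any other token is one word of the current slot's value.
            collected[current].append(token)

    filled = {name: " ".join(words) for name, words in collected.items()}
    missing = [name for name in slot_names if name not in filled]
    return intent, filled, missing


link = "http://assistant.url/agent-module/breakfast-order/drink-coffee-location-market-street"
intent, filled, missing = parse_intent_link(link)
print(intent, filled, missing)
```

In this sketch the missing slots are what the assistant would still need to ask about, which corresponds to the prompt for additional information (such as payment) in the example.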

In some implementations, each of the selectable elements provided in the conversational user interface 214 may be preconfigured according to the content of the website 206. In other words, the automation assistant and/or a parser engine of the mobile device 204 may process the content of the website 206 to generate selectable elements and/or speakable commands for interacting with the website 206. In other implementations, the agent module of the website 206 may be associated with preconfigured commands and parameters that are stored on, or accessible to, the mobile device 204 or the computing device 224. These preconfigured commands and parameters may be processed by the automation assistant and/or the parser engine of the mobile device 204 to generate speakable commands and/or selectable elements for interacting with the agent module via the automation assistant. In this way, the user 202 can learn that the website 206 or the agent module can be operated by voice, which simplifies use of the website 206 because commands no longer have to be typed out in full. This can benefit users who are fatigued or who have dexterity problems.

In some implementations, the website 206 may be associated with multiple different agent modules, and the automation assistant may identify the most appropriate agent module to initialize according to the current and/or previous activity of the user 202. For example, the user 202 may be viewing the food ordering website 206, and the food ordering website 206 may be associated with multiple different agent modules, each specialized in ordering a particular type of food. For example, a first agent module may specialize in ordering breakfast food, and a second agent module may specialize in ordering dinner food. The automation assistant may determine that the user 202 is more likely to be interested in ordering breakfast food and provide a selectable element that includes the speakable command "Talk to an agent about ordering breakfast food." The automation assistant may determine that the user 202 is more likely to order breakfast food based on the time at which the user 202 is viewing the website 206, a past history of ordering breakfast food from the website, media accessible to the mobile device 204 such as messages and/or calendar entries, and/or any other data suitable for predicting the actions of the user 202. The selectable element provided by the website may correspond to a link that specifically identifies the multiple different agent modules. For example, the selectable element may correspond to a command that lists the agent modules from which the automation assistant is to select and initialize one (e.g., "ACTION=com.assistant.BreakfastAgentModule.StartConversation, com.assistant.LunchAgentModule.StartConversation, OR com.assistant.DinnerAgentModule.StartConversation"). The command may be provided at the website 206 for receipt by the automation assistant. Alternatively, the selectable element may correspond to a link that identifies the agent modules (e.g., "http://assistant.url/agentmodules/breakfast-agent_lunch-agent_dinner-agent"). The link or command may then be received by the automation assistant, so that the automation assistant can select the most appropriate agent module to initialize based on an analysis of the user data.
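One way to picture the selection among enumerated agent modules is the Python sketch below. The scoring scheme, the `MEAL_HOURS` table, the weight of 10, and the function names are all assumptions made for illustration; the description itself does not specify how the signals (time of day, order history, messages, calendar entries) are combined.

```python
import re
from typing import Dict, List

# Hypothetical mapping from agent module name to the hours it is most relevant.
MEAL_HOURS = {
    "BreakfastAgentModule": range(5, 11),
    "LunchAgentModule": range(11, 16),
    "DinnerAgentModule": range(16, 23),
}


def candidate_modules(command: str) -> List[str]:
    """Extract the agent module names enumerated in an ACTION command."""
    return re.findall(r"com\.assistant\.(\w+)\.StartConversation", command)


def pick_module(command: str, hour: int, past_orders: Dict[str, int]) -> str:
    """Pick the candidate with the highest score (assumed heuristic)."""

    def score(module: str) -> int:
        points = past_orders.get(module, 0)      # order-history signal
        if hour in MEAL_HOURS.get(module, ()):   # time-of-day signal
            points += 10                         # arbitrary illustrative weight
        return points

    return max(candidate_modules(command), key=score)


command = (
    "ACTION=com.assistant.BreakfastAgentModule.StartConversation, "
    "com.assistant.LunchAgentModule.StartConversation, "
    "OR com.assistant.DinnerAgentModule.StartConversation"
)
print(pick_module(command, hour=8, past_orders={}))
```

The command string is taken from the example in the text; only the extraction pattern and the scoring are invented here.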

In some implementations, when the automation assistant is initialized at the computing device 224 from the mobile device 204, the automation assistant may analyze user data to determine whether another agent should be initialized from the computing device 224. For example, the automation assistant may be aware of an agent module associated with a movie website that the user 202 frequently accesses on the computing device 224. The user 202 may select the first selectable element 212 to initialize the automation assistant for interacting with the agent module associated with the food ordering website 206. At the same time, the automation assistant may also offer the user 202 the option of communicating with the agent module associated with the movie website, as shown in FIG. 2. For example, the automation assistant may simultaneously provide a selectable element 218 and a selectable element 222, each of which may be associated with a different action for a different agent module or website. In this way, the user can communicate by voice with two different agent modules by speaking two commands in sequence (e.g., "Order the food delivery and start the movie I watched last") to a single automation assistant that carries out the two separate actions.

FIG. 3 shows a system 300 for providing a conversational interface 316 at a client device 302 in order to familiarize a user with the speakable commands available for controlling various applications 304 and/or websites. An application 304 running on the client device 302 may be associated with one or more agent modules 310 that assist the user in performing functions associated with the application 304. The agent modules 310 may be stored at the client device 302 or at a remote device such as a server device. In some implementations, the server device may store one or more automation assistants accessible to the client device 302. The automation assistant 320 may receive voice data recorded by a microphone of the client device 302 and interpret the voice data for the purposes of controlling certain features of the client device 302 and interacting with the agent modules 310.

In some implementations, the client device 302 and/or the server device may include a selectable element engine 326 that can generate data for providing selectable elements at the client device 302. The selectable element engine 326 may generate selectable elements in order to help the user initialize communication with an agent module 310, via the automation assistant 320, for performing functions associated with the application 304 and/or a web browser 314. For example, the selectable element engine 326 may be notified when the user is operating the application 304 or viewing a website in the web browser 314. In response, the selectable element engine 326 may generate a selectable element that, when selected at an interface of the client device 302, initializes the conversational interface 316 at the client device 302. The selectable element may include a phrase generated by the selectable element engine 326 or provided by the automation assistant 320. The selectable element engine 326 and/or the automation assistant 320 may be aware of the agent module 310 associated with the application 304 and provide a phrase for the selectable element indicating that the automation assistant can be used to interact with the agent module 310. For example, the phrase of the selectable element may be "Use automation assistant to perform application function," which the user may speak to open the conversational interface 316. Alternatively, the user may select the selectable element to open the conversational interface 316.

The conversational interface 316 may include multiple different selectable elements containing phrases that may be based on user data 312. For example, the agent module 310 may correspond to a gaming application 304, and the agent module 310 may accept commands typed by the user. Although the agent module 310 may have been designed by its manufacturer to accept typed preconfigured commands 312, the automation assistant 320 may be used to convert the user's spoken words into commands that the agent module 310 can understand. For example, when the user first opens the gaming application 304, the selectable element engine 326 may be notified that the gaming application 304 has been opened and provide a selectable element with the phrase "Use automation assistant." The user may then select the selectable element, or speak the phrase, in order to initialize the automation assistant 320 for communicating with the agent module 310 corresponding to the gaming application 304. When the application 304 is, for example, a chess game application, the conversational interface 316 may be provided with multiple different selectable elements having phrases that correspond to chess moves. The phrases may be based on user data 312 such as previous commands entered by the user, data communicated from the agent module 310, and/or parsed application content provided by a text parsing engine 324. To select the move corresponding to a selectable element, the user may select the selectable element or speak the phrase shown in it. For example, a selectable element of the conversational interface 316 may include the phrase "Move pawn." The selectable element may correspond to a link that identifies the action to be performed (e.g., moving a pawn in the chess application) and may cause the conversational interface 316 to be updated with additional phrases available for completing the action (e.g., "Move to A5"). The link may then be provided to the agent module 310 associated with the application 304.

In some implementations, the user may speak a phrase, and the audio data captured by the client device 302 may be provided to the automation assistant 320 for processing. For example, the automation assistant 320 may include a speech-to-text engine 322 that can receive the audio data captured by the client device 302 and convert it to text. The text may correspond to the words spoken by the user while the audio data was being captured by the microphone of the client device 302. The automation assistant 320 may also include a text parsing engine 324 that can parse the text and identify specific words or phrases corresponding to input parameters for the agent module 310 and/or for a website presented in the web browser 314. The automation assistant 320 may then generate a link or command from the parsed text and send the link or command to the agent module or website for processing. For example, when the user sees the phrase "Move pawn" in the conversational interface 316 for the gaming application 304, the user may speak the phrase "Move pawn" at the client device 302. The client device 302 may then capture the audio data of the speech and share it with the automation assistant 320. The automation assistant 320 may then generate a link that includes the parsed text corresponding to the speech and send the link to the agent module 310 or the gaming application 304. For example, the link may be a URL such as "http://assistant.url/chess-agent-module/move-pawn", which may be processed by the agent module 310 and converted into a command that the application 304 uses to advance the chess game. The link may also be used by the selectable element engine 326 to generate new selectable elements that replace at least some of the previous selectable elements in the conversational interface 316.
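The final step of this pipeline, turning parsed text into a link, can be sketched in Python as follows. The sketch assumes the speech-to-text stage has already produced a transcript; the function `text_to_link`, the `STOP_WORDS` set, and the slug rule are hypothetical and chosen only so that the output matches the example URL in the text.

```python
import re

# Hypothetical filler words dropped when forming the link path.
STOP_WORDS = {"a", "an", "the"}


def text_to_link(transcript: str, agent_module: str,
                 host: str = "http://assistant.url") -> str:
    """Turn a transcript into <host>/<agent_module>/<hyphenated-words>."""
    words = re.findall(r"[a-z0-9]+", transcript.lower())
    slug = "-".join(word for word in words if word not in STOP_WORDS)
    return f"{host}/{agent_module}/{slug}"


# Transcript as it might come back from the speech-to-text stage.
print(text_to_link("Move a pawn", "chess-agent-module"))
# prints "http://assistant.url/chess-agent-module/move-pawn"
```

A real text parsing engine would map words to the agent module's input parameters rather than simply hyphenating them; the slug is used here only to keep the example small.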

In some implementations, if the user is viewing a website in the web browser 314 and the website is not associated with an agent module 310, the user may still be presented with the conversational interface 316 for interacting with the website. For example, the user may be viewing a hotel website that is not associated with an agent module 310. The selectable element engine 326 may nevertheless cause the client device 302 to display a selectable element that includes the phrase "Use automation assistant." In this way, the user can be notified that the hotel website is available for receiving speakable commands even though no agent module 310 is available. In response to the user selecting the selectable element or speaking the phrase "Use automation assistant," the web browser 314 may open the conversational interface 316 to present the user with additional selectable elements. The additional selectable elements may be generated by the selectable element engine 326 based on the content of the website and/or the content of the user data 312, which may include data from the applications 304 such as messages, calendar data, browser history, order history, driving directions, and/or any other data based on user activity. A selectable element may correspond to a web link that identifies at least part of the content of a speakable command and that can be processed by the server hosting the website. The web link may then be provided to the website or the web browser 314 to advance the hotel booking process. If the user chooses to speak the phrase "Book a hotel in Illinois" presented in a clickable element, audio data corresponding to the spoken phrase "Book a hotel in Illinois" may be provided to the automation assistant 320. The automation assistant 320 may then convert the speech to text at the speech-to-text engine 322 and parse the text at the text parsing engine 324. The parsed text may then be converted into a web link at the selectable element engine 326, and the web link may be provided to the website or the web browser 314 to advance the booking process. The selectable element engine 326 may then repopulate the conversational interface 316 with selectable elements for advancing the booking process according to queries received from the agent module 310. For example, the selectable elements may correspond to commands such as "Select reservation date," "Select room size," and/or any other command related to booking a hotel.

FIGS. 4A to 4C show a conversational user interface being presented at a user interface of a computing device. Specifically, diagram 400 of FIG. 4A shows a first user interface 406 for a website presented in a web browser of the computing device. The website may be, for example, a radio website that advertises various radio stations the user can listen to. The website may include a first selectable element 408 presented in the first user interface 406. The first selectable element 408 may indicate to the user that an automation assistant can be used to interact with an agent module associated with the website. For example, the first selectable element 408 may include the phrase "Use automation assistant," which the user may speak to cause the automation assistant to open a conversational user interface 410 at the computing device. The user may also select the first selectable element 408 by touching the first user interface 406 or by providing some other selection command to the computing device. In response to the first user interface 406 being selected or the phrase being spoken by the user, the automation assistant may receive a link identifying that an agent module is associated with the website. The automation assistant may also perform a query to identify whether an application associated with the website exists on the computing device or is otherwise accessible to the computing device. The automation assistant may then provide the results in the conversational user interface 410.

The conversational user interface 410 of diagram 402 of FIG. 4B can include a first agent module portion 412 and a second agent module portion 416. The first agent module portion 412 can correspond to the portion of the conversational user interface 410 that relates to the agent module associated with the website. The second agent module portion 416 can correspond to the portion of the conversational user interface 410 that relates to a native application associated with the website. For example, because the radio website is an audio listening website, the automation assistant can identify audio-related applications on the computing device and present the user with speakable commands for interacting with the agents associated with those applications. As shown in diagram 400, one audio-related application can be a podcast application. In some implementations, however, the related application can be a third-party application that is distinct from the party that controls the automation assistant application.

The automation assistant can simultaneously present selectable elements related to both the agent module associated with the website (e.g., the radio website) and the agent module associated with the related application (e.g., the podcast application). The selectable elements can include speakable command phrases that are based on user data corresponding to interactions between the user and both the website and the application. For example, a first set of selectable elements 414 can correspond to actions that the user previously performed using the website. The speakable command phrase "start music" can correspond to a start button that the user previously selected in order to start the radio website playing music. The previous selection of the start button, as recorded by the automation assistant, can be converted into a selectable element that can be selected at the conversational user interface 410 and/or spoken to the computing device. Additionally, a second set of selectable elements 418 can correspond to actions performed by the user at a related application, such as the podcast application. For example, a previous action of the user starting a podcast can be recorded by the automation assistant and used as a basis for providing a selectable element that includes the speakable command phrase "start podcast."
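As a rough illustration of how recorded actions might be turned into selectable elements, consider the sketch below. The `SelectableElement` class, the `(source, action)` history format, and the "start ..." phrase template are assumptions made for this example; the specification does not define a data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectableElement:
    phrase: str  # speakable command phrase shown to the user
    source: str  # hypothetical label: "website" or "application"


def elements_from_history(history):
    """Convert recorded (source, action) pairs into de-duplicated elements.

    `history` is assumed to be ordered oldest first; the first occurrence
    of each (source, phrase) pair determines its position in the result.
    """
    seen = set()
    elements = []
    for source, action in history:
        phrase = f"start {action}"  # hypothetical phrase template
        if (source, phrase) not in seen:
            seen.add((source, phrase))
            elements.append(SelectableElement(phrase, source))
    return elements


history = [("website", "music"), ("application", "podcast"), ("website", "music")]
elements = elements_from_history(history)
```

Grouping the results by `source` would give the two sets of elements (414 for the website, 418 for the related application) that the description presents side by side.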

In some implementations, the conversational user interface 410 can notify the user of the ability to talk to agent modules associated with related applications. For example, the conversational user interface 410 can provide a selectable element that includes the speakable command phrase "talk to podcast agent." In response to the user selecting the selectable element that reads "talk to podcast agent," the automation assistant can update the conversational user interface 410 to be a second user interface 426 that includes suggestions related to the podcast application. For example, a first updated interface portion 418 can include multiple different selectable elements 420 corresponding to previous actions or predicted actions recorded by the automation assistant or by the agent module associated with the podcast application. Furthermore, a second updated interface portion 422 can be provided by the automation assistant based on user data relating to historical actions that the user previously performed while operating the podcast application. For example, a historical action related to the podcast can be the ordering of a food delivery. The user may have a routine of getting home, turning on a podcast, and ordering food, so the automation assistant can recognize this routine and provide a conversational interface for completing it more efficiently. Each selectable element 424 provided at the second updated interface portion 422 can be selected by the user, or spoken aloud to the computing device, in order to perform the action identified in the speakable command phrase of that selectable element 424. Moreover, the automation assistant can identify one or more agent modules corresponding to the related application and provide selectable elements 424 that, when selected, can initialize a conversation with an agent module via the automation assistant. For example, the related application (e.g., a food delivery application) can be associated with an Asian food agent and an Ethiopian food agent. Each of the different agent modules associated with the food delivery application can specialize in assisting with a category of actions of the food delivery application, and the automation assistant can notify the user that they are able to interface with the agent modules via the automation assistant.

FIG. 5 illustrates a method 500 for causing an agent module to perform a function via an automation assistant. The method 500 can be performed by a client device, a server device, a module or application operating at a device, and/or any other apparatus suitable for interacting with an application. The method 500 can include a block 502 of receiving, at a computing device, a selection of a link that identifies an agent module associated with an application. The application can be any application on the computing device that includes content that can be manipulated through user input. For example, the application can be a home monitoring application that allows a user to control various devices in the user's home. The link can be a selectable element that includes a phrase indicating that an automation assistant can be used to interact with the agent module associated with the application. For example, the link can include the phrase "use the automation assistant," and the link can be activated by selecting the link or by speaking that phrase. At block 504, a determination is made whether an automation assistant is accessible to the computing device. The automation assistant can be a separate application or module that acts as an interface between the agent module and the user, in order to guide the user in performing a function associated with the application. If an automation assistant is available to the computing device, then, at block 508, a conversational interface can be opened using the automation assistant. For example, the conversational interface can be opened with the agent module invoked; that is, the conversational interface can be opened and can present output based on content received from the agent module in response to the invocation of the agent module. If an automation assistant is not available to the computing device, then, at block 506, the conversational interface can be opened at a default web page of a browser. At block 510, user interface input is received at the conversational interface. The user interface input can be typed and/or spoken input provided by the user and/or a selection of a selectable element presented at the conversational interface. For example, the user can provide typed or spoken input to engage in a dialog with the agent module (via the conversational interface) and/or can select a presented selectable element that includes a suggested textual phrase or other suggested content for furthering the dialog with the agent module. At block 512, further content is provided to the agent module based on the user interface input received at block 510. The agent module can then generate further responsive content based on the further content provided to it. Such further responsive content (or a conversion thereof) can be provided for presentation to the user in furtherance of the conversation with the user and, optionally, the method can return to block 510 to receive further user interface input in response to the further responsive content. The further responsive content generated by the agent module depends on the user interface input received at block 510.
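The control flow of method 500 can be summarized in a short sketch. Everything named here is an assumption for illustration: the agent module is modeled as a plain callable, `echo_agent` is a toy stand-in, and the surface labels `"assistant"` and `"default_web_page"` are invented; only the block numbers in the comments come from the description.

```python
def open_conversational_interface(assistant_available: bool) -> str:
    """Block 504: choose where the conversational interface is opened."""
    # Block 508 when an assistant is available, otherwise block 506.
    return "assistant" if assistant_available else "default_web_page"


def run_dialog(agent, user_inputs, assistant_available=True):
    """Drive the loop of blocks 510 and 512 over a list of user inputs.

    `agent` is a hypothetical callable: it receives None on invocation
    and a string for each subsequent user interface input.
    """
    surface = open_conversational_interface(assistant_available)
    # Initial output based on content received when the agent is invoked.
    transcript = [(surface, agent(None))]
    for user_input in user_inputs:  # block 510: receive user interface input
        # Block 512: forward content to the agent and present its response.
        transcript.append((surface, agent(user_input)))
    return transcript


def echo_agent(content):
    """Toy agent module used only to exercise the loop."""
    return "How can I help?" if content is None else f"You said: {content}"


transcript = run_dialog(echo_agent, ["turn off the lights"], assistant_available=False)
```

Because each response is computed from the input received at block 510, the sketch also reflects the statement that the further responsive content depends on that input.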

FIG. 6 illustrates a method 600 for restricting operations performed in response to a selection of a selectable element according to whether a client application exists for performing the operations. The method 600 can be performed by a client device, a server device, a module or application operating at a device, and/or any other apparatus suitable for interacting with an application. The method 600 can include a block 602 of opening a conversational user interface that includes a selectable element and a speakable command phrase. The conversational user interface can include multiple selectable elements that correspond to actions that can be performed by an agent module associated with an application or a website. At block 604, a selection of the selectable element can be received. The selectable element can be selected via a touch interface of the computing device, a peripheral device connected to the computing device, and/or any other mechanism for selecting a selectable element at an interface of the computing device. At block 606, a determination is made whether the command phrase corresponds to a restricted action. A restricted action can be an operation that may use private information about the user. For example, the non-automation assistant application or website can be a food ordering website, and the selectable element can correspond to an operation of ordering food using payment information associated with the user. Alternatively, the non-automation assistant application or website can be a social media website, and the selectable element can correspond to an operation of publicly posting an image of the user. If, at block 606, the command phrase does not correspond to a restricted action, then, at block 608, the command corresponding to the selectable element can be performed. For example, a command that does not correspond to a restricted action can be performing a function in a gaming application or accessing a news article provided by a news website. Such a command can be performed by speaking the command phrase or by selecting the selectable element. If the command phrase does correspond to a restricted action, then, at block 610, a determination is made whether a client application exists for performing the command. If the client application exists, then, at block 614, the client application is caused to receive the command. For example, if the command phrase corresponds to an order for food from a website, the client application can be a food ordering application associated with the website. In this way, restricted actions can be left to the client application. The command can be provided to the client application in the form of a link, such as a URL, that identifies the agent module for performing the command. For example, when the client application is a food ordering application, the link can be "http://assistant.url/foodwebsite/food%ordering%agent". In this way, the client application can be put on notice that the user was previously viewing the food website and is interested in ordering food, as indicated by the identifier "food%ordering%agent" (where % can be understood as a space). Thereafter, the food ordering agent can be initialized, and the conversational user interface can be updated to include different selectable elements corresponding to command phrases for continuing the food order. Otherwise, if no client application exists for performing the initial speakable command phrase, then, at block 612, an indication that the command is restricted can be provided at the computing device. The user can then either continue the order manually through the website (rather than through the conversational user interface) or discontinue the order.
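The decision logic of method 600 might look like the following sketch. The `RESTRICTED` set, the function names, and the representation of installed client applications as a set of site names are all hypothetical. The host `assistant.url` and the `foodwebsite` path segment are taken from the example link in the description; note that the sketch uses standard percent-encoding (`%20` for a space), whereas the description writes a bare `%` as a placeholder for a space.

```python
from urllib.parse import quote

# Hypothetical set of command phrases treated as restricted actions.
RESTRICTED = {"order food", "post image"}


def agent_link(site: str, agent_name: str) -> str:
    """Build a link that identifies the agent module for the command."""
    return f"http://assistant.url/{site}/{quote(agent_name)}"


def handle_selection(command, installed_clients,
                     site="foodwebsite", agent_name="food ordering agent"):
    """Return a (decision, payload) pair following blocks 606 to 614."""
    if command not in RESTRICTED:  # block 606: not a restricted action
        return ("execute", command)  # block 608
    if site in installed_clients:  # block 610: client application exists
        # Block 614: hand the command to the client application as a link.
        return ("forward_to_client", agent_link(site, agent_name))
    return ("restricted", command)  # block 612: tell the user it is restricted


unrestricted = handle_selection("read news", set())
forwarded = handle_selection("order food", {"foodwebsite"})
blocked = handle_selection("order food", set())
```

Keeping the restricted branch separate means that private information such as payment details is only ever handled by the client application, which matches the stated purpose of leaving restricted actions to it.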

FIG. 7 illustrates a method 700 for interacting with an agent module via an automation assistant according to whether the agent module is accessible. The method 700 can be performed by a client device, a server device, a module or application operating at a device, and/or any other apparatus suitable for interacting with an application. The method 700 can include a block 702 of providing, at a conversational user interface, a selectable element for performing a function associated with a non-automation assistant application. The selectable element can be associated with a link or command that can identify an agent module, an action or intent to be performed by the agent module, and/or parameters for use by the agent module when performing the action or intent. Alternatively, the selectable element can correspond to a command call for initializing the automation assistant. At block 704, a selection of the selectable element can be received by the automation assistant. The automation assistant can then identify the agent module corresponding to the selectable element. At block 706, a determination is made whether the agent module is accessible to the automation assistant. An agent module can be accessible to the automation assistant when the agent module is loaded onto the same device as the automation assistant, or when the automation assistant is able to communicate with a network device that includes the agent module. If the automation assistant is able to access the agent module, then, at block 710, the agent module can be initialized in order to further the interaction with the non-automation assistant application. In other words, the automation assistant can act as an interface through which the user communicates more efficiently with the agent module. However, if the agent module is not accessible to the automation assistant, then, at block 712, the link corresponding to the selectable element can be opened at a default web page in order to convey speakable command phrases for performing, via the automation assistant, functions associated with the non-automation assistant application. In this way, the user is still provided with assistance for furthering their interaction with the non-automation assistant application even when no agent module is accessible to the automation assistant.
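The accessibility check and fallback of method 700 can be sketched as below. The dictionary layout of a selectable element (`agent`, `intent`, `params`, `link`), the example values, and the two sets of agent identifiers are assumptions for illustration; the description only says that an element can be associated with an agent module, an action or intent, parameters, and a link.

```python
def handle_element(element, local_agents, reachable_agents):
    """Decide between initializing the agent (710) and the web fallback (712)."""
    agent_id = element["agent"]
    # Block 706: the agent is accessible if it is loaded on the same device
    # or hosted on a network device the assistant can communicate with.
    accessible = agent_id in local_agents or agent_id in reachable_agents
    if accessible:
        return {
            "block": 710,
            "initialize": agent_id,
            "intent": element.get("intent"),
            "params": element.get("params", {}),
        }
    # Block 712: open the element's link at a default web page instead.
    return {"block": 712, "open_url": element["link"]}


# Hypothetical selectable element and example values.
element = {
    "agent": "podcast-agent",
    "intent": "start_podcast",
    "params": {"episode": "latest"},
    "link": "https://podcast.example/assistant",
}
with_agent = handle_element(element, local_agents=set(), reachable_agents={"podcast-agent"})
without_agent = handle_element(element, local_agents=set(), reachable_agents=set())
```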

FIG. 8 is a block diagram 800 of an example computer system 810. The computer system 810 typically includes at least one processor 814 that communicates with a number of peripheral devices via a bus subsystem 812. These peripheral devices can include a storage subsystem 824, including, for example, a memory subsystem 825 and a file storage subsystem 826, user interface output devices 820, user interface input devices 822, and a network interface subsystem 816. The input and output devices allow user interaction with the computer system 810. The network interface subsystem 816 provides an interface to outside networks and is coupled to corresponding interface devices in other computer systems.

The user interface input devices 822 can include a keyboard, pointing devices such as a mouse, trackball, touchpad, or graphics tablet, a scanner, a touchscreen incorporated into the display, audio input devices such as voice recognition systems and microphones, and/or other types of input devices. In general, use of the term "input device" is intended to include all possible types of devices and ways to input information into the computer system 810 or onto a communication network.

The user interface output devices 820 can include a display subsystem, a printer, a fax machine, or non-visual displays such as audio output devices. The display subsystem can include a cathode ray tube (CRT), a flat-panel device such as a liquid crystal display (LCD), a projection device, or some other mechanism for creating a visible image. The display subsystem can also provide a non-visual display, such as via audio output devices. In general, use of the term "output device" is intended to include all possible types of devices and ways to output information from the computer system 810 to the user or to another machine or computer system.

The storage subsystem 824 stores programming and data constructs that provide the functionality of some or all of the modules described herein. For example, the storage subsystem 824 can include the logic to perform selected aspects of methods 500, 600, and 700, and/or to implement one or more of the server devices, client devices, databases, engines, and/or modules described herein.

These software modules are generally executed by the processor 814 alone or in combination with other processors. The memory 825 used in the storage subsystem 824 can include a number of memories, including a main random access memory (RAM) 830 for storage of instructions and data during program execution and a read-only memory (ROM) 832 in which fixed instructions are stored. The file storage subsystem 826 can provide persistent storage for program and data files, and can include a hard disk drive, a floppy disk drive along with associated removable media, a CD-ROM drive, an optical drive, or removable media cartridges. The modules implementing the functionality of certain implementations can be stored by the file storage subsystem 826 in the storage subsystem 824, or in other machines accessible by the processor 814.

The bus subsystem 812 provides a mechanism for letting the various components and subsystems of the computer system 810 communicate with each other as intended. Although the bus subsystem 812 is shown schematically as a single bus, alternative implementations of the bus subsystem can use multiple buses.

The computer system 810 can be of varying types, including a workstation, server, computing cluster, blade server, server farm, or any other data processing system or computing device. Due to the ever-changing nature of computers and networks, the description of the computer system 810 depicted in FIG. 8 is intended only as a specific example for purposes of illustrating some implementations. Many other configurations of the computer system 810 are possible, having more or fewer components than the computer system depicted in FIG. 8.

In situations in which the systems described herein collect personal information about users, or make use of personal information, the users can be provided with an opportunity to control whether programs or features collect user information (e.g., information about a user's social network, social actions or activities, profession, a user's preferences, or a user's current geographic location), or to control whether and/or how to receive content from a server that may be more relevant to the user. In addition, certain data can be treated in one or more ways before it is stored or used, so that personally identifiable information is removed. For example, a user's identity can be treated so that no personally identifiable information can be determined for the user, or a user's geographic location can be generalized where geographic location information is obtained (such as to a city, ZIP code, or state level), so that a particular geographic location of the user cannot be determined. Thus, the user can have control over how information about the user is collected and/or used.
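One way to picture the location generalization mentioned above is the sketch below. The record fields (`user_id`, `lat`, `lon`, `city`, `zip`, `state`), the example values, and the function name are hypothetical; the description only states that a location can be generalized to a city, ZIP code, or state level and that personally identifiable information can be removed before storage or use.

```python
# Hypothetical coarse levels at which a location may be retained.
ALLOWED_LEVELS = ("city", "zip", "state")


def generalize_location(record: dict, level: str = "city") -> dict:
    """Return a copy of `record` reduced to a single coarse location field.

    Precise coordinates and the user identifier are dropped, so the
    stored value no longer pinpoints a particular user or address.
    """
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"unsupported level: {level}")
    return {level: record[level]}


raw = {
    "user_id": "u-123",
    "lat": 37.4220,
    "lon": -122.0841,
    "city": "Mountain View",
    "zip": "94043",
    "state": "CA",
}
stored = generalize_location(raw, level="state")
```

Building a new dictionary from an allow-list of fields, rather than deleting known sensitive keys, means that any field added to the record later is excluded by default.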

While several implementations have been described and illustrated herein, a variety of other means and/or structures for performing the function and/or obtaining the results and/or one or more of the advantages described herein can be utilized, and each of such variations and/or modifications is deemed to be within the scope of the implementations described herein. More generally, all parameters, dimensions, materials, and configurations described herein are meant to be exemplary, and the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the teachings are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific implementations described herein. It is therefore to be understood that the foregoing implementations are presented by way of example only and that, within the scope of the appended claims and equivalents thereto, implementations can be practiced otherwise than as specifically described and claimed. Implementations of the present disclosure are directed to each individual feature, system, article, material, kit, and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials, kits, and/or methods, if such features, systems, articles, materials, kits, and/or methods are not mutually inconsistent, is included within the scope of the present disclosure.

204 Node
100 Diagram
102 User
104 Mobile device
106 Application
108 User interface
112 Selectable element
114 Conversational interface
116 Diagram
118 Diagram
120 Suggestion element
122 Suggestion element
124 Suggestion element
126 Response element
128 Response element
130 Updated conversational user interface
132 Voice input element
200 Diagram
202 User
204 Mobile device
206 Website
208 User interface
212 Selectable element
214 Conversational user interface
218 Selectable element
222 Selectable element
224 Computing device
300 System
302 Client device
304 Application
310 Agent module
312 User data
314 Web browser
316 Conversational interface
320 Automation assistant
322 Speech-text engine
324 Text parsing engine
326 Selectable element engine
400 Diagram
406 First user interface
408 First selectable element
410 Conversational user interface
412 First agent module portion
414 Selectable element
416 Second agent module portion
418 Selectable element
420 Selectable element
422 Second updated interface portion
424 Selectable element
426 Second user interface
810 Computer system
812 Bus subsystem
814 Processor
816 Network interface subsystem
820 User interface output device
822 User interface input device
824 Storage subsystem
826 File storage subsystem
830 Main random access memory (RAM)
832 Read-only memory (ROM)

Claims

A method implemented by one or more processors, the method comprising:
making an application interface of a non-automated assistant application accessible to a user, wherein the application interface includes a first selectable element for initializing communication with an agent module via an automated assistant application, and wherein the agent module is configured to perform an action associated with the non-automated assistant application;
receiving a selection of the first selectable element, wherein the first selectable element corresponds to a link that identifies the agent module and a parameter to be used when performing the action;
in response to receiving the selection of the first selectable element, causing a conversational interface to be presented to the user via the automated assistant application, wherein the conversational interface is configured to act as an intermediary between the user and the agent module; and
providing a second selectable element in the conversational interface via the automated assistant application, wherein the second selectable element is based on the parameter identified in the link and furthers performance of the action.

The method of claim 1, further comprising:
receiving a selection of the second selectable element in the conversational interface, wherein the second selectable element characterizes a value for satisfying the parameter; and
in response to receiving the selection of the second selectable element, providing the value to the agent module for assignment to the parameter identified in the link.

The method of claim 2, wherein the conversational interface includes one or more other selectable elements, each characterizing one or more other suggested values that are each assignable to the parameter identified in the link.

The method of claim 1, wherein the second selectable element identifies an utterable command phrase associated with a previous action performed via the non-automated assistant application.

The method of claim 4, further comprising:
receiving audio data corresponding to the utterable command phrase; and
in response to receiving the audio data, converting the audio data into text data for transmission to the agent module.
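
By way of non-limiting illustration only, the link-driven flow recited in the preceding method claims might be sketched as follows. The sketch assumes a hypothetical link format, assistant://<agent>/<action>?param=<name>, which is not specified in this disclosure; the names AgentLink, parse_agent_link, suggested_elements, and assign_value are likewise illustrative assumptions rather than part of any actual implementation.

```python
from dataclasses import dataclass
from typing import Dict, List
from urllib.parse import parse_qs, urlparse


@dataclass
class AgentLink:
    agent: str      # agent module identified by the link
    action: str     # action the agent module is to perform
    parameter: str  # parameter that still needs a value


def parse_agent_link(link: str) -> AgentLink:
    """Parse a hypothetical link: assistant://<agent>/<action>?param=<name>."""
    parts = urlparse(link)
    query = parse_qs(parts.query)
    if parts.scheme != "assistant" or not parts.netloc or "param" not in query:
        raise ValueError(f"not an agent link: {link!r}")
    return AgentLink(parts.netloc, parts.path.strip("/"), query["param"][0])


def suggested_elements(link: AgentLink, catalog: Dict[str, List[str]]) -> List[str]:
    """Labels for the second selectable elements, chosen from the link's parameter."""
    return list(catalog.get(link.parameter, []))


def assign_value(link: AgentLink, value: str) -> Dict[str, object]:
    """Build the message handed to the agent module once an element is selected."""
    return {
        "agent": link.agent,
        "action": link.action,
        "parameters": {link.parameter: value},
    }


# Selecting the first element yields the link; the conversational interface
# then offers suggested values, and the chosen one is sent to the agent module.
link = parse_agent_link("assistant://food-agent/order?param=cuisine")
choices = suggested_elements(link, {"cuisine": ["pizza", "sushi"]})
message = assign_value(link, choices[0])
```

In this sketch the suggested values come from a static catalog keyed by parameter name; an actual system could derive them from user data or from the agent module itself.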

A method implemented by one or more processors, the method comprising:
receiving a selection of a selectable element in a graphical user interface rendered by a non-automated assistant application of a computing device, wherein the selectable element indicates that an agent module associated with the graphical user interface can be invoked via an automated assistant application that is separate from the non-automated assistant application; and
in response to the selection of the selectable element:
invoking the agent module via the automated assistant application when the automated assistant application is accessible, at least temporarily, via the computing device, wherein the agent module is one of a plurality of available agent modules that can be invoked via the automated assistant application;
receiving response content from the agent module in response to invoking the agent module; and
providing, by the automated assistant application via an automated assistant interface, output that is based on the response content received from the agent module.

The method of claim 6, further comprising:
in response to receiving the selection of the selectable element, receiving a modality parameter that indicates a modality of dialog with the automated assistant application, wherein the output is provided via the modality of dialog indicated by the modality parameter.

The method of claim 7, wherein the modality is either voice or text.

The method of claim 6, wherein the selectable element identifies a link specific to the agent module, and
wherein the agent module is invoked based on content of the link.

The method of claim 9, wherein, when the automated assistant application is inaccessible, at least temporarily, via the computing device, the link causes additional content relating to the agent module to be displayed in the graphical user interface without invoking the agent module.

The method of claim 6, further comprising:
in response to receiving the selection of the selectable element, receiving agent content that includes an intent parameter for the agent module, wherein invoking the agent module includes invoking the agent module with the intent parameter indicated by the agent content.

The method of claim 11, wherein the intent parameter is dynamically generated through interaction with the non-automated assistant application.

The method of claim 6, wherein the agent module is a third-party agent module provided by a third party that is distinct from the party that provides the automated assistant application.

The method of claim 6, wherein the automated assistant application runs on a separate computing device from the non-automated assistant application, and
wherein the output is provided by the separate computing device.

The method of claim 6, wherein the output based on the response content received from the agent module is a prompt, the method further comprising:
receiving natural language input following the prompt;
providing the agent module with additional content based on the natural language input;
receiving further response content from the agent module in response to providing the additional content to the agent module; and
providing, by the automated assistant application via the automated assistant interface, further output that is based on the further response content received from the agent module.
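
By way of non-limiting illustration only, the exchange recited above (invocation, a prompt as the first output, natural language input, and further output) might be sketched as follows. PizzaAgent and AssistantSession are hypothetical names, the agent's replies are canned, and the bracketed modality tag merely stands in for rendering the output as voice or text; none of these details are taken from this disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


class PizzaAgent:
    """Hypothetical third-party agent module with canned replies."""

    def invoke(self, intent: Optional[str]) -> str:
        # Response content that happens to be a prompt.
        return "What size pizza would you like?"

    def follow_up(self, additional_content: str) -> str:
        # Further response content based on the user's answer.
        return f"Ordering a {additional_content} pizza."


@dataclass
class AssistantSession:
    agent: PizzaAgent
    modality: str = "text"  # modality parameter: "voice" or "text"
    transcript: List[str] = field(default_factory=list)

    def _output(self, content: str) -> str:
        # Tag the output with the modality in which it would be rendered.
        rendered = f"[{self.modality}] {content}"
        self.transcript.append(rendered)
        return rendered

    def start(self, intent: Optional[str] = None) -> str:
        """Invoke the agent module, optionally with an intent parameter."""
        return self._output(self.agent.invoke(intent))

    def user_says(self, natural_language_input: str) -> str:
        """Forward content derived from the user's input and relay the reply."""
        additional_content = natural_language_input.strip().lower()
        return self._output(self.agent.follow_up(additional_content))


session = AssistantSession(PizzaAgent(), modality="voice")
first_output = session.start(intent="order_pizza")
second_output = session.user_says(" Large ")
```

The session object plays the intermediary role: the agent module never interacts with the user directly, and every output passes through the assistant in the indicated modality.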

A method implemented by one or more processors, the method comprising:
displaying a selectable element on a computing device that is operating a non-automated assistant application, wherein the selectable element is configured to cause an automated assistant to initialize an agent module associated with the non-automated assistant application;
receiving a selection of the selectable element; and
in response to receiving the selection of the selectable element:
determining whether the automated assistant is accessible to the computing device; and
when the automated assistant is determined to be accessible to the computing device, initializing the automated assistant to communicate with the agent module.

The method of claim 16, wherein initializing the automated assistant to communicate with the agent module comprises:
displaying another selectable element at the computing device, wherein the other selectable element corresponds to a link that identifies the agent module.

The method of claim 17, wherein the other selectable element identifies an utterable command phrase associated with a previous action performed via the non-automated assistant application, and
wherein the automated assistant is configured to respond to the utterable command phrase when it is spoken by a user and received at the computing device.

The method of claim 18, further comprising, when the automated assistant is determined to be accessible to the computing device:
sending an invocation request to the agent module;
receiving response content from the agent module in response to the invocation request; and
providing, via an interface associated with the automated assistant, output that is based on the response content.

The method of claim 16, wherein determining whether the automated assistant is accessible to the computing device comprises [...].
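
By way of non-limiting illustration only, the accessibility determination and subsequent initialization recited above might be sketched as follows. The sketch assumes that an inaccessible automated assistant is represented by None and that the device then falls back to opening ordinary content at a hypothetical URL; on_element_selected and AgentRegistry are illustrative names that do not appear in this disclosure.

```python
from typing import Callable, Dict, Optional

# Hypothetical registry: agent name -> callable that answers a request.
AgentRegistry = Dict[str, Callable[[], str]]


def on_element_selected(agent_name: str,
                        assistant_agents: Optional[AgentRegistry],
                        fallback_url: str) -> str:
    """Describe what the computing device does after the element is selected.

    `assistant_agents` is None when no automated assistant is accessible
    to the computing device (an assumption of this sketch).
    """
    if assistant_agents is None or agent_name not in assistant_agents:
        # Assumed fallback: show ordinary content instead of starting a dialog.
        return f"open {fallback_url}"
    # The assistant is accessible: send the request and relay the response content.
    response_content = assistant_agents[agent_name]()
    return f"assistant says: {response_content}"


agents: AgentRegistry = {"flight-agent": lambda: "Your flight departs at 9:05."}
with_assistant = on_element_selected("flight-agent", agents, "https://example.com/flights")
without_assistant = on_element_selected("flight-agent", None, "https://example.com/flights")
```

Treating an unknown agent name the same as an inaccessible assistant is a design choice of this sketch; an implementation could instead surface an error or offer a different agent module.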