JP2020144618A

JP2020144618A - Agent device, control method of agent device, and program

Info

Publication number: JP2020144618A
Application number: JP2019040964A
Authority: JP
Inventors: Mototsugu Kubota; Shinya Yasuhara; Yusuke Oi; Masahiro Kurehashi
Original assignee: Honda Motor Co Ltd
Current assignee: Honda Motor Co Ltd
Priority date: 2019-03-06
Filing date: 2019-03-06
Publication date: 2020-09-10
Anticipated expiration: 2039-03-06
Also published as: CN111667823B; CN111667823A; JP7175221B2

Abstract

To provide an agent device, a control method of the agent device, and a program that allow an occupant to easily use a new function.

SOLUTION: An agent device (100) comprises: a plurality of agent function units (150-1 to 150-3) for providing a service including a voice response to speech of an occupant of a vehicle; and a selection unit (122) for selecting an agent function unit corresponding to the speech of the occupant from among the plurality of agent function units. When a new function is added to one agent function unit of the plurality of agent function units and the newly added function is provided for the occupant, the selection unit causes a function of the agent function unit having the newly added function to be provided for the occupant preferentially over the other agent function units already having the same function as the newly added function.

SELECTED DRAWING: Figure 2

Description

The present invention relates to an agent device, a control method for the agent device, and a program.

Conventionally, technology has been disclosed relating to an agent that, while conversing with a vehicle occupant, provides driving-support information, vehicle control, and other applications in response to the occupant's requests (see, for example, Patent Document 1).

Patent Document 1: Japanese Unexamined Patent Publication No. 2006-335231

In recent years, mounting a plurality of agents on a vehicle has been moving toward practical use. In addition, the functions that an agent can execute may be updated from time to time. However, even when a new function is added to one agent, if another agent has long been able to execute that same function, it has sometimes been difficult to get the occupant to execute the new function through the agent to which it was added.

The present invention has been made in view of these circumstances, and one of its objects is to provide an agent device, a control method for the agent device, and a program that make new functions easier for occupants to use.

The agent device, the control method for the agent device, and the program according to the present invention adopt the following configurations.

(1): An agent device according to one aspect of the present invention includes a plurality of agent function units that provide a service including a voice response in response to an utterance of a vehicle occupant, and a selection unit that selects, from among the plurality of agent function units, the agent function unit corresponding to the occupant's utterance. When a new function has been added to one of the plurality of agent function units and that newly added function is to be provided to the occupant, the selection unit causes the function to be provided to the occupant by the agent function unit to which the new function was added, in preference to other agent function units that already have the same function.

(2): An agent device according to another aspect of the present invention includes a plurality of agent function units that provide a service including a voice response in response to an utterance of a vehicle occupant, and a selection unit that selects, from among the plurality of agent function units, the agent function unit corresponding to the occupant's utterance. The plurality of agent function units include a vehicle agent function unit that has a function of instructing vehicle equipment to operate. When a new function has been added to the vehicle agent function unit and that newly added function is to be provided to the occupant, the selection unit causes the function to be provided to the occupant by the vehicle agent function unit to which the new function was added, in preference to other agent function units that already have the same function.

(3): In aspect (1) or (2) above, even when the occupant's utterance designates a specific agent function unit among the plurality of agent function units, if the newly added function is to be provided to the occupant, the selection unit causes the function to be provided to the occupant by the agent function unit to which the new function was added, in preference to other agent function units that already have the same function.

(4): In any of aspects (1) to (3) above, when a new function has been added to at least one of the plurality of agent function units, the agent function unit provides the occupant with information about the newly added function in response to an inquiry that does not specify the details of the new function.

(5): In any of aspects (1) to (4) above, when a new function has been added to at least one of the plurality of agent function units, the agent function unit provides the occupant with information about the newly added function while giving a response that is unrelated to the new function.

(6): A control method for an agent device according to another aspect of the present invention is a method in which a computer: activates one of a plurality of agent function units; provides, as a function of the activated agent function unit, a service including a voice response in response to an utterance of a vehicle occupant; selects, from among the plurality of agent function units, the agent function unit corresponding to the occupant's utterance; and, when a new function has been added to one of the plurality of agent function units and that newly added function is to be provided to the occupant, causes the function to be provided to the occupant by the agent function unit to which the new function was added, in preference to other agent function units that already have the same function.

(7): A program according to another aspect of the present invention causes a computer to: activate one of a plurality of agent function units; provide, as a function of the activated agent function unit, a service including a voice response in response to an utterance of a vehicle occupant; select, from among the plurality of agent function units, the agent function unit corresponding to the occupant's utterance; and, when a new function has been added to one of the plurality of agent function units and that newly added function is to be provided to the occupant, cause the function to be provided to the occupant by the agent function unit to which the new function was added, in preference to other agent function units that already have the same function.

According to aspects (1) to (7), new functions can be made easier for the user to use.
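The preference rule in aspects (1) and (3) can be sketched as follows. This is a minimal illustration, not the claimed implementation: the names `AgentFunctionUnit` and `select_agent`, and the `new_since` cutoff used to decide whether a function counts as "newly added", are assumptions introduced here, since the aspects do not say how newness is determined.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class AgentFunctionUnit:
    """Hypothetical stand-in for one agent function unit (150-1, 150-2, ...)."""
    name: str
    # Function name -> date on which the function became executable.
    functions: Dict[str, date] = field(default_factory=dict)


def select_agent(
    agents: List[AgentFunctionUnit],
    requested_function: str,
    new_since: date,  # assumed cutoff: functions added on or after this date are "new"
    named_agent: Optional[str] = None,
) -> Optional[AgentFunctionUnit]:
    """Pick the agent that should serve `requested_function`."""
    capable = [a for a in agents if requested_function in a.functions]
    if not capable:
        return None
    # Aspects (1) and (3): an agent that only recently gained the function is
    # preferred, even when the occupant addressed a different agent by name.
    newly_added = [a for a in capable if a.functions[requested_function] >= new_since]
    if newly_added:
        return max(newly_added, key=lambda a: a.functions[requested_function])
    # Otherwise honour the agent the occupant named, if it has the function.
    for agent in capable:
        if agent.name == named_agent:
            return agent
    return capable[0]
```

For example, if a hypothetical "agent1" gained map search last week while "agent2" has offered it for a year, a map search request addressed to "agent2" would still be routed to "agent1" under this sketch.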

A configuration diagram of the agent system 1 including the agent device 100.
A diagram showing the configuration of the agent device 100 according to the first embodiment and the equipment mounted on the vehicle M.
A diagram showing an arrangement example of the display / operation device 20.
A diagram showing an arrangement example of the speaker unit 30.
A diagram showing an example of the contents of the function list information 162.
A diagram for explaining the principle by which the position at which a sound image is localized is determined.
A diagram showing the configuration of the agent server 200 and part of the configuration of the agent device 100.
A diagram showing an example of a dialogue between an agent and an occupant when the map search function is provided.
A diagram showing an example of an agent's answer to an utterance CV3 that includes a wake-up word.
A flowchart showing the sequence of operations of the agent device 100.
A flowchart showing the sequence of operations of the agent device 100 when priorities are assigned to the agent function units 150.
A diagram showing an example of a dialogue between an agent and an occupant when information about a newly added function is provided.
A flowchart showing the sequence of processing in which the agent device 100 introduces functions that have not yet been executed.

Hereinafter, embodiments of the agent device, the control method for the agent device, and the program of the present invention will be described with reference to the drawings. An agent device is a device that realizes part or all of an agent system. In the following, an agent device that is mounted on a vehicle (hereinafter, vehicle M) and has a plurality of types of agent functions is described as an example. An agent function is, for example, a function that, while conversing with an occupant of the vehicle M, provides various kinds of information based on a request (command) contained in the occupant's utterance or mediates a network service. The plurality of types of agents may differ from one another in the functions they perform, their processing procedures, their control, and their output modes and contents. Some agent functions may also include a function for controlling equipment in the vehicle (for example, equipment related to driving control or vehicle body control).

An agent function is realized by integrated use of, for example, a voice recognition function that recognizes the occupant's voice (a function that converts speech into text), a natural language processing function (a function that understands the structure and meaning of text), a dialogue management function, and a network search function that searches other devices via a network or searches a predetermined database held by the device itself. Some or all of these functions may be realized by AI (Artificial Intelligence) technology. Part of the configuration for performing these functions (in particular, the voice recognition function and the natural language processing and interpretation function) may be installed in an agent server (external device) that can communicate with the in-vehicle communication device of the vehicle M or with a general-purpose communication device brought into the vehicle M. The following description assumes that part of the configuration is installed in the agent server and that the agent device and the agent server cooperate to realize the agent system. The service provider (service entity) that the agent device and the agent server cooperate to make appear virtually is referred to as an agent.

<Overall configuration>
FIG. 1 is a configuration diagram of an agent system 1 including an agent device 100. The agent system 1 includes, for example, the agent device 100 and a plurality of agent servers 200-1, 200-2, 200-3, and so on. The number following the hyphen at the end of each reference numeral is an identifier for distinguishing agents. When the agent servers need not be distinguished, they may be referred to simply as the agent server 200. Although three agent servers 200 are shown in FIG. 1, the number of agent servers 200 may be two, or four or more. Each agent server 200 is operated by a different provider of an agent system. Accordingly, the agents in the present invention are agents realized by mutually different providers. Examples of providers include automobile manufacturers, network service providers, e-commerce businesses, and sellers and manufacturers of mobile terminals; any entity (a corporation, an organization, an individual, or the like) can be a provider of an agent system.

The agent device 100 communicates with the agent server 200 via a network NW. The network NW includes, for example, some or all of the Internet, a cellular network, a Wi-Fi network, a WAN (Wide Area Network), a LAN (Local Area Network), a public line, a telephone line, a wireless base station, and the like. Various web servers 300 are connected to the network NW, and the agent server 200 or the agent device 100 can acquire web pages from the various web servers 300 via the network NW.

The agent device 100 converses with the occupant of the vehicle M, transmits the occupant's voice to the agent server 200, and presents the answer obtained from the agent server 200 to the occupant in the form of voice output or image display.

<First Embodiment>
[Vehicle]
FIG. 2 is a diagram showing the configuration of the agent device 100 according to the first embodiment and the equipment mounted on the vehicle M. The vehicle M is equipped with, for example, one or more microphones 10, a display / operation device 20, a speaker unit 30, a navigation device 40, vehicle equipment 50, an in-vehicle communication device 60, an occupant recognition device 80, and the agent device 100. A general-purpose communication device 70 such as a smartphone may also be brought into the vehicle interior and used as a communication device. These devices are connected to one another by a multiplex communication line such as a CAN (Controller Area Network) communication line, a serial communication line, a wireless communication network, or the like. The configuration shown in FIG. 2 is merely an example; part of the configuration may be omitted, and other components may be added.

The microphone 10 is a sound collecting unit that collects sound emitted in the vehicle interior. The display / operation device 20 is a device (or group of devices) that displays images and can accept input operations. The display / operation device 20 includes, for example, a display device configured as a touch panel. The display / operation device 20 may further include a HUD (Head Up Display) or a mechanical input device. The speaker unit 30 includes, for example, a plurality of speakers (sound output units) arranged at different positions in the vehicle interior. The display / operation device 20 may be shared by the agent device 100 and the navigation device 40. Details of these are described later.

The navigation device 40 includes a navigation HMI (Human Machine Interface), a positioning device such as a GPS (Global Positioning System) receiver, a storage device that stores map information, and a control device (navigation controller) that performs route search and the like. Some or all of the microphone 10, the display / operation device 20, and the speaker unit 30 may be used as the navigation HMI. The navigation device 40 searches for a route (navigation route) from the position of the vehicle M identified by the positioning device to the destination entered by the occupant, and outputs guidance information using the navigation HMI so that the vehicle M can travel along the route. The route search function may reside in a navigation server accessible via the network NW. In that case, the navigation device 40 acquires the route from the navigation server and outputs the guidance information. The agent device 100 may be built on the navigation controller, in which case the navigation controller and the agent device 100 are integrated in hardware.

The vehicle equipment 50 includes, for example: a driving force output device such as an engine or a traction motor; an engine starter motor; a door lock device; a door opening / closing device; windows, window opening / closing devices, and window opening / closing control devices; seats and seat position control devices; a rear-view mirror and its angular position control device; lighting devices inside and outside the vehicle and their control devices; wipers and defoggers and their respective control devices; turn signal lamps and their control devices; an air conditioner; and a vehicle information device that manages information about the vehicle, such as mileage information, vehicle position information, tire pressure information, and remaining fuel information.

The in-vehicle communication device 60 is, for example, a wireless communication device that can access the network NW using a cellular network or a Wi-Fi network.

The occupant recognition device 80 includes, for example, a seating sensor, a vehicle interior camera, and an image recognition device. The seating sensor includes a pressure sensor provided in the lower part of a seat, a tension sensor attached to a seat belt, and the like. The vehicle interior camera is a CCD (Charge Coupled Device) camera or a CMOS (Complementary Metal Oxide Semiconductor) camera installed in the vehicle interior. The image recognition device analyzes images from the vehicle interior camera and recognizes, for each seat, whether an occupant is present, the orientation of the occupant's face, and the like. In the present embodiment, the occupant recognition device 80 is an example of a seating position recognition unit.

FIG. 3 is a diagram showing an arrangement example of the display / operation device 20. The display / operation device 20 includes, for example, a first display 22, a second display 24, and an operation switch ASSY 26. The display / operation device 20 may further include a HUD 28.

The vehicle M has, for example, a driver's seat DS provided with a steering wheel SW, and a passenger seat AS located beside the driver's seat DS in the vehicle width direction (Y direction in the figure). The first display 22 is a horizontally elongated display device that extends across the instrument panel from around the midpoint between the driver's seat DS and the passenger seat AS to a position facing the left end of the passenger seat AS. The second display 24 is installed around the midpoint between the driver's seat DS and the passenger seat AS in the vehicle width direction, below the first display 22. For example, the first display 22 and the second display 24 are both configured as touch panels and include an LCD (Liquid Crystal Display), an organic EL (Electroluminescence) display, a plasma display, or the like as the display unit. The operation switch ASSY 26 is an assembly of dial switches, button switches, and the like. The display / operation device 20 outputs the content of operations performed by the occupant to the agent device 100. The content displayed on the first display 22 or the second display 24 may be determined by the agent device 100.

FIG. 4 is a diagram showing an arrangement example of the speaker unit 30. The speaker unit 30 includes, for example, speakers 30A to 30H. The speaker 30A is installed on the window pillar (the so-called A-pillar) on the driver's seat DS side. The speaker 30B is installed in the lower part of the door near the driver's seat DS. The speaker 30C is installed on the window pillar on the passenger seat AS side. The speaker 30D is installed in the lower part of the door near the passenger seat AS. The speaker 30E is installed in the lower part of the door near the right rear seat BS1. The speaker 30F is installed in the lower part of the door near the left rear seat BS2. The speaker 30G is installed near the second display 24. The speaker 30H is installed on the ceiling (roof) of the vehicle interior.

In this arrangement, for example, when sound is output exclusively from the speakers 30A and 30B, the sound image is localized near the driver's seat DS. When sound is output exclusively from the speakers 30C and 30D, the sound image is localized near the passenger seat AS. When sound is output exclusively from the speaker 30E, the sound image is localized near the right rear seat BS1. When sound is output exclusively from the speaker 30F, the sound image is localized near the left rear seat BS2. When sound is output exclusively from the speaker 30G, the sound image is localized near the front of the vehicle interior, and when sound is output exclusively from the speaker 30H, the sound image is localized near the top of the vehicle interior. Beyond these cases, the speaker unit 30 can localize the sound image at an arbitrary position in the vehicle interior by using a mixer and an amplifier to adjust how the output sound is distributed among the speakers.
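One simple way to picture "adjusting the distribution of sound among the speakers" is inverse-distance gain weighting with constant total power. The sketch below is only an illustration under stated assumptions: the 2D coordinates in `SPEAKERS`, the function `speaker_gains`, and the weighting rule itself are hypothetical and are not taken from the source, which gives mounting locations but no coordinates or mixing law.

```python
import math
from typing import Dict, Tuple

# Hypothetical cabin coordinates in metres (x: forward, y: across the cabin,
# negative y on the driver's side). Only six of the eight speakers are modelled.
SPEAKERS: Dict[str, Tuple[float, float]] = {
    "30A": (1.0, -0.6),   # A-pillar, driver's seat side
    "30B": (0.6, -0.7),   # lower door near the driver's seat
    "30C": (1.0, 0.6),    # A-pillar, passenger seat side
    "30D": (0.6, 0.7),    # lower door near the passenger seat
    "30E": (-0.4, -0.7),  # lower door near the right rear seat
    "30F": (-0.4, 0.7),   # lower door near the left rear seat
}


def speaker_gains(
    target: Tuple[float, float],
    speakers: Dict[str, Tuple[float, float]] = SPEAKERS,
    eps: float = 0.05,
) -> Dict[str, float]:
    """Return per-speaker gains that favour speakers close to `target`.

    Gains are proportional to 1 / (distance + eps) and scaled so that the
    sum of squared gains is 1 (constant total power).
    """
    weights = {
        name: 1.0 / (math.dist(target, pos) + eps) for name, pos in speakers.items()
    }
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {name: w / norm for name, w in weights.items()}
```

With a target placed between the hypothetical positions of 30A and 30B, those two speakers receive the largest gains, which matches the text's statement that driving mainly 30A and 30B localizes the sound image near the driver's seat.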

[Agent device]
Returning to FIG. 2, the agent device 100 includes a management unit 110, agent function units 150-1, 150-2, and 150-3, a pairing application execution unit 152, and a storage unit 160. The management unit 110 includes, for example, an acoustic processing unit 112, a per-agent WU (Wake Up) determination unit 114, a display control unit 116, a voice control unit 118, a function identification unit 120, and a selection unit 122. When the agent function units need not be distinguished, they are referred to simply as the agent function unit 150. Three agent function units 150 are shown merely as an example corresponding to the number of agent servers 200 in FIG. 1; the number of agent function units 150 may be two, or four or more. The software arrangement shown in FIG. 2 is simplified for explanation and can in practice be modified freely; for example, the management unit 110 may be interposed between the agent function unit 150 and the in-vehicle communication device 60.

Each component of the agent device 100 is realized, for example, by a hardware processor such as a CPU (Central Processing Unit) executing a program (software). Some or all of these components may be realized by hardware (circuitry) such as an LSI (Large Scale Integration), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array), or a GPU (Graphics Processing Unit), or by software and hardware working together. The program may be stored in advance in a storage device (a storage device with a non-transitory storage medium) such as an HDD (Hard Disk Drive) or flash memory, or it may be stored on a removable storage medium (a non-transitory storage medium) such as a DVD or CD-ROM and installed when the storage medium is loaded into a drive device. The storage unit 160 is realized by the storage device described above. The storage unit 160 stores, for example, function list information 162.

FIG. 5 is a diagram showing an example of the contents of the function list information 162. The function list information 162 is information in which, for each agent, the functions the agent can execute, the date on which each function became executable (the "executable date" in the figure), and the execution history of each function are associated with one another. The execution history records, for example, whether the occupant has "executed" or "not executed" the function; a function the occupant has used even once is marked "executed". The contents of the function list information 162 are updated by the agent server 200, for example, each time a function is updated (for example, each time a new function is added) or at predetermined time intervals.

図５において、エージェント１には、地図検索機能と、音声再生機能と、しりとり機能とを示す情報が互いに対応付けられており、いずれの機能についても実行履歴が「未実行」を示す情報である。また、エージェント２には、地図検索機能と、音楽再生機能とを示す情報が対応付けられており、地図検索機能が「実行済み」を示す情報であり、音楽再生機能が「未実行」を示す情報である。また、エージェント３には、地図検索機能と、音楽再生機能とを示す情報が対応付けられており、いずれの機能についても実行履歴が「実行済み」を示す情報である。エージェント１〜３の詳細については、後述する。 In FIG. 5, agent 1 is associated with information indicating a map search function, a voice playback function, and a shiritori (word-chain game) function, and the execution history of each of these functions indicates "not executed". Agent 2 is associated with information indicating a map search function and a music playback function; the execution history of the map search function indicates "executed", and that of the music playback function indicates "not executed". Agent 3 is associated with information indicating a map search function and a music playback function, and the execution history of both functions indicates "executed". Details of agents 1 to 3 will be described later.
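The table described for FIG. 5 can be pictured as a small per-agent record set. The following is only a minimal sketch under stated assumptions: the class name `FunctionEntry`, the keys `agent1` to `agent3`, and the helper functions are hypothetical and do not come from the specification, and the executable dates are left empty because the text does not state them.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class FunctionEntry:
    name: str
    executed: bool = False                  # execution history: False means "not executed"
    executable_date: Optional[date] = None  # left unset: the dates of FIG. 5 are not given in the text


# Contents follow the description of FIG. 5; the agent keys are made-up identifiers.
FUNCTION_LIST = {
    "agent1": [FunctionEntry("map search"),
               FunctionEntry("voice playback"),
               FunctionEntry("shiritori")],
    "agent2": [FunctionEntry("map search", executed=True),
               FunctionEntry("music playback")],
    "agent3": [FunctionEntry("map search", executed=True),
               FunctionEntry("music playback", executed=True)],
}


def agents_offering(function_name, function_list=FUNCTION_LIST):
    """Return the agents whose function column contains function_name."""
    return [agent for agent, entries in function_list.items()
            if any(entry.name == function_name for entry in entries)]


def mark_executed(agent, function_name, function_list=FUNCTION_LIST):
    """Record that the occupant has used the function at least once."""
    for entry in function_list[agent]:
        if entry.name == function_name:
            entry.executed = True
```

A lookup such as `agents_offering("map search")` then corresponds to searching the function list with the function name as the search key.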

管理部１１０は、ＯＳ（Operating System）やミドルウェアなどのプログラムが実行されることで機能する。 The management unit 110 functions by executing a program such as an OS (Operating System) or middleware.

管理部１１０の音響処理部１１２は、エージェントごとに予め設定されているウエイクアップワードやエージェントが実行可能な機能を認識するのに適した状態になるように、入力された音に対して音響処理を行う。 The sound processing unit 112 of the management unit 110 performs sound processing on the input sound so as to be in a state suitable for recognizing the wakeup word preset for each agent and the function that the agent can execute. I do.

エージェントごとＷＵ判定部１１４は、エージェント機能部１５０−１、１５０−２、１５０−３のそれぞれに対応して存在し、エージェントごとに予め定められているウエイクアップワードを認識する。エージェントごとＷＵ判定部１１４は、音響処理が行われた音声（音声ストリーム）から音声の意味を認識する。まず、エージェントごとＷＵ判定部１１４は、音声ストリームにおける音声波形の振幅と零交差に基づいて音声区間を検出する。エージェントごとＷＵ判定部１１４は、混合ガウス分布モデル（ＧＭＭ；Gaussian mixture model) に基づくフレーム単位の音声識別および非音声識別に基づく区間検出を行ってもよい。 The WU determination unit 114 for each agent is provided for each of the agent function units 150-1, 150-2, and 150-3, and recognizes the wakeup word predetermined for each agent. The WU determination unit 114 for each agent recognizes the meaning of the voice from the voice (voice stream) that has undergone acoustic processing. First, the WU determination unit 114 for each agent detects a voice section based on the amplitude and zero crossings of the voice waveform in the voice stream. The WU determination unit 114 for each agent may instead perform section detection based on frame-by-frame speech/non-speech classification using a Gaussian mixture model (GMM).
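Voice section detection from amplitude and zero crossings can be sketched as follows. This is an illustrative sketch, not the detector of the embodiment: the frame length, both thresholds, and the function names are assumptions chosen for the example.

```python
import math


def is_voiced(frame, amp_threshold=0.1, min_crossings=2):
    """Treat a frame as speech if it is loud enough and changes sign often enough."""
    peak = max(abs(s) for s in frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return peak >= amp_threshold and crossings >= min_crossings


def detect_voice_sections(samples, frame_len=160, **thresholds):
    """Return (start, end) sample index pairs covering runs of voiced frames."""
    sections = []
    start = None
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        voiced = is_voiced(samples[i:i + frame_len], **thresholds)
        if voiced and start is None:
            start = i                      # a voice section begins
        elif not voiced and start is not None:
            sections.append((start, i))    # the section ended at this frame
            start = None
    if start is not None:                  # the stream ended inside a section
        sections.append((start, i + frame_len))
    return sections


# Synthetic input: silence, a 200 Hz tone sampled at 8 kHz, then silence again.
silence = [0.0] * 320
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(480)]
print(detect_voice_sections(silence + tone + silence))  # [(320, 800)]
```

The two conditions play different roles: the amplitude test rejects quiet frames, while the zero-crossing count rejects slow drifts that are loud but do not oscillate.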

次に、エージェントごとＷＵ判定部１１４は、検出した音声区間における音声をテキスト化し、文字情報とする。そして、エージェントごとＷＵ判定部１１４は、テキスト化した文字情報がウエイクアップワードに該当するか否かを判定する。ウエイクアップワードであると判定した場合、エージェントごとＷＵ判定部１１４は、対応するエージェント機能部１５０を示す情報を選択部に通知する。なお、エージェントごとＷＵ判定部１１４に相当する機能がエージェントサーバ２００に搭載されてもよい。この場合、管理部１１０は、音響処理部１１２によって音響処理が行われた音声ストリームをエージェントサーバ２００に送信し、エージェントサーバ２００がウエイクアップワードであると判定した場合、エージェントサーバ２００からの指示に従ってエージェント機能部１５０が起動する。なお、各エージェント機能部１５０は、常時起動しており且つウエイクアップワードの判定を自ら行うものであってよい。この場合、管理部１１０がエージェントごとＷＵ判定部１１４を備える必要はない。 Next, the WU determination unit 114 for each agent converts the voice in the detected voice section into text to obtain character information. The WU determination unit 114 for each agent then determines whether or not the character information corresponds to a wakeup word. When it determines that the character information is a wakeup word, the WU determination unit 114 for each agent notifies the selection unit of information indicating the corresponding agent function unit 150. A function corresponding to the WU determination unit 114 for each agent may instead be installed in the agent server 200. In this case, the management unit 110 transmits the voice stream that has undergone acoustic processing by the sound processing unit 112 to the agent server 200, and when the agent server 200 determines that the voice is a wakeup word, the agent function unit 150 is activated in accordance with an instruction from the agent server 200. Each agent function unit 150 may also be always active and determine the wakeup word by itself. In this case, the management unit 110 does not need to include the WU determination unit 114 for each agent.

機能特定部１２０は、乗員が提供を要求するエージェントの機能を特定する。まず、機能特定部１２０は、音声ストリームにおける音声波形の振幅と零交差に基づいて音声区間を検出する。機能特定部１２０は、混合ガウス分布モデルに基づくフレーム単位の音声識別および非音声識別に基づく区間検出を行ってもよい。次に、機能特定部１２０は、検出した音声区間における音声をテキスト化し、文字情報とする。そして、機能特定部１２０は、テキスト化した文字情報が、機能一覧情報１６２の機能欄に含まれる機能の名称に該当するか否かを判定する。機能特定部１２０は、文字情報が機能の名称に該当すると判定した場合、当該機能を、乗員が提供を要求するエージェントの機能として特定する。 The function specifying unit 120 identifies the agent function that the occupant is requesting. First, the function specifying unit 120 detects a voice section based on the amplitude and zero crossings of the voice waveform in the voice stream. The function specifying unit 120 may instead perform section detection based on frame-by-frame speech/non-speech classification using a Gaussian mixture model. Next, the function specifying unit 120 converts the voice in the detected voice section into text to obtain character information. The function specifying unit 120 then determines whether or not the character information corresponds to the name of a function included in the function column of the function list information 162. When the function specifying unit 120 determines that the character information corresponds to the name of a function, it identifies that function as the agent function that the occupant is requesting.
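The final matching step, checking the recognized text against the function names in the function column, might look like the sketch below. The specification does not say how "corresponds to" is decided, so case-insensitive substring matching is an assumption made here, as is the function name `identify_function`.

```python
def identify_function(text, function_names):
    """Return the function name contained in the recognized text, or None.

    Longer names are tried first so that a longer name wins over a shorter
    name that happens to be contained in it.
    """
    normalized = text.casefold()
    for name in sorted(function_names, key=len, reverse=True):
        if name.casefold() in normalized:
            return name
    return None
```

For example, `identify_function("Activate the map search function?", ["map search", "music playback"])` yields `"map search"`, and an utterance that names no listed function yields `None`.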

なお、機能特定部１２０は、機能が特定される度、機能の名称、機能のリリース日、及び実行履歴等を各エージェント機能部１５０に問合せしてもよい。この場合、記憶部１６０には、機能一覧情報１６２が記憶されていなくてもよい。 The function specifying unit 120 may inquire of each agent function unit 150 each time the function is specified, the name of the function, the release date of the function, the execution history, and the like. In this case, the function list information 162 may not be stored in the storage unit 160.

選択部１２２は、エージェントごとＷＵ判定部１１４によってウエイクアップワードが認識されたエージェント機能部１５０、又は機能特定部１２０によって特定された機能を実現する（つまり、乗員の発話に対応する）、エージェント機能部１５０を選択する。選択部１２２がエージェント機能部１５０を選択する処理の詳細については、後述する。選択部１２２は、選択したエージェント機能部１５０に音声ストリームを送信する。選択部１２２は、選択したエージェント機能部１５０を起動させる。 The selection unit 122 selects the agent function unit 150 whose wakeup word has been recognized by the WU determination unit 114 for each agent, or the agent function unit 150 that realizes the function specified by the function specifying unit 120 (that is, the one corresponding to the occupant's utterance). The details of the process by which the selection unit 122 selects the agent function unit 150 will be described later. The selection unit 122 transmits the voice stream to the selected agent function unit 150. The selection unit 122 activates the selected agent function unit 150.

エージェント機能部１５０は、対応するエージェントサーバ２００と協働してエージェントを出現させ、車両の乗員の発話に応じて、音声による応答を含むサービスを提供する。エージェント機能部１５０には、車両機器５０を制御する権限が付与されたものが含まれてよい。また、エージェント機能部１５０には、ペアリングアプリ実行部１５２を介して汎用通信装置７０と連携し、エージェントサーバ２００と通信するものがあってよい。例えば、エージェント機能部１５０−１には、車両機器５０を制御する権限が付与されている。エージェント機能部１５０−１は、車載通信装置６０を介してエージェントサーバ２００−１と通信する。エージェント機能部１５０−２は、車載通信装置６０を介してエージェントサーバ２００−２と通信する。エージェント機能部１５０−３は、ペアリングアプリ実行部１５２を介して汎用通信装置７０と連携し、エージェントサーバ２００−３と通信する。ペアリングアプリ実行部１５２は、例えば、Ｂｌｕｅｔｏｏｔｈ（登録商標）によって汎用通信装置７０とペアリングを行い、エージェント機能部１５０−３と汎用通信装置７０とを接続させる。なお、エージェント機能部１５０−３は、ＵＳＢ（Universal Serial Bus）などを利用した有線通信によって汎用通信装置７０に接続されるようにしてもよい。以下、エージェント機能部１５０−１とエージェントサーバ２００−１が協働して出現させるエージェントをエージェント１、エージェント機能部１５０−２とエージェントサーバ２００−２が協働して出現させるエージェントをエージェント２、エージェント機能部１５０−３とエージェントサーバ２００−３が協働して出現させるエージェントをエージェント３と称する場合がある。 The agent function unit 150 causes an agent to appear in cooperation with the corresponding agent server 200, and provides a service including a voice response in response to an utterance of a vehicle occupant. The agent function unit 150 may include one to which the authority to control the vehicle device 50 is granted. Further, the agent function unit 150 may be one that cooperates with the general-purpose communication device 70 via the pairing application execution unit 152 and communicates with the agent server 200. For example, the agent function unit 150-1 is given the authority to control the vehicle device 50. The agent function unit 150-1 communicates with the agent server 200-1 via the vehicle-mounted communication device 60. The agent function unit 150-2 communicates with the agent server 200-2 via the vehicle-mounted communication device 60. The agent function unit 150-3 cooperates with the general-purpose communication device 70 via the pairing application execution unit 152, and communicates with the agent server 200-3. 
The pairing application execution unit 152 pairs with the general-purpose communication device 70 by, for example, Bluetooth (registered trademark), and connects the agent function unit 150-3 to the general-purpose communication device 70. The agent function unit 150-3 may instead be connected to the general-purpose communication device 70 by wired communication using USB (Universal Serial Bus) or the like. Hereinafter, the agent that the agent function unit 150-1 and the agent server 200-1 cooperate to make appear may be referred to as agent 1, the agent that the agent function unit 150-2 and the agent server 200-2 cooperate to make appear as agent 2, and the agent that the agent function unit 150-3 and the agent server 200-3 cooperate to make appear as agent 3.

表示制御部１１６は、エージェント機能部１５０からの指示に応じて第１ディスプレイ２２または第２ディスプレイ２４に画像を表示させる。以下では、第１ディスプレイ２２を使用するものとする。表示制御部１１６は、一部のエージェント機能部１５０の制御により、例えば、車室内で乗員とのコミュニケーションを行う擬人化されたエージェントの画像（以下、エージェント画像と称する）を生成し、生成したエージェント画像を第１ディスプレイ２２に表示させる。エージェント画像は、例えば、乗員に対して話しかける態様の画像である。エージェント画像は、例えば、少なくとも観者（乗員）によって表情や顔向きが認識される程度の顔画像を含んでよい。例えば、エージェント画像は、顔領域の中に目や鼻に擬したパーツが表されており、顔領域の中のパーツの位置に基づいて表情や顔向きが認識されるものであってよい。また、エージェント画像は、立体的に感じられ、観者によって三次元空間における頭部画像を含むことでエージェントの顔向きが認識されたり、本体（胴体や手足）の画像を含むことで、エージェントの動作や振る舞い、姿勢等が認識されるものであってもよい。また、エージェント画像は、アニメーション画像であってもよい。 The display control unit 116 causes the first display 22 or the second display 24 to display an image in response to an instruction from the agent function unit 150. In the following, it is assumed that the first display 22 is used. The display control unit 116 generates, for example, an image of an anthropomorphic agent (hereinafter referred to as an agent image) that communicates with an occupant in the vehicle interior under the control of a part of the agent function unit 150, and the generated agent. The image is displayed on the first display 22. The agent image is, for example, an image of a mode of talking to an occupant. The agent image may include, for example, a facial image such that the facial expression and the facial orientation are recognized by the viewer (occupant) at least. For example, in the agent image, parts imitating eyes and nose are represented in the face area, and the facial expression and face orientation may be recognized based on the positions of the parts in the face area. In addition, the agent image is felt three-dimensionally, and the viewer can recognize the face orientation of the agent by including the head image in the three-dimensional space, or the agent's image can be included by including the image of the main body (body and limbs). The movement, behavior, posture, etc. may be recognized. Further, the agent image may be an animation image.

音声制御部１１８は、エージェント機能部１５０からの指示に応じて、スピーカユニット３０に含まれるスピーカのうち一部または全部に音声を出力させる。音声制御部１１８は、複数のスピーカユニット３０を用いて、エージェント画像の表示位置に対応する位置にエージェント音声の音像を定位させる制御を行ってもよい。エージェント画像の表示位置に対応する位置とは、例えば、エージェント画像がエージェント音声を喋っていると乗員が感じると予測される位置であり、具体的には、エージェント画像の表示位置付近（例えば、２〜３［ｃｍ］以内）の位置である。また、音像が定位するとは、例えば、乗員の左右の耳に伝達される音の大きさを調節することにより、乗員が感じる音源の空間的な位置を定めることである。 The voice control unit 118 causes some or all of the speakers included in the speaker unit 30 to output voice in response to an instruction from the agent function unit 150. The voice control unit 118 may use a plurality of speaker units 30 to perform control that localizes the sound image of the agent voice at a position corresponding to the display position of the agent image. The position corresponding to the display position of the agent image is, for example, a position at which the occupant is expected to perceive the agent image as speaking the agent voice; specifically, it is a position near the display position of the agent image (for example, within 2 to 3 [cm]). Localizing the sound image means, for example, determining the spatial position of the sound source perceived by the occupant by adjusting the loudness of the sound transmitted to the occupant's left and right ears.

図６は、音像が定位する位置が定まる原理について説明するための図である。図６では、説明を簡略化するために、上述したスピーカ３０Ｂ、３０Ｄ、および３０Ｇを用いる例を示しているが、スピーカユニット３０に含まれる任意のスピーカが使用されてよい。音声制御部１１８は、各スピーカに接続されたアンプ（ＡＭＰ）３２およびミキサー３４を制御して音像を定位させる。例えば、図６に示す空間位置ＭＰ１に音像を定位させる場合、音声制御部１１８は、アンプ３２およびミキサー３４を制御することにより、スピーカ３０Ｂに最大強度の５％の出力を行わせ、スピーカ３０Ｄに最大強度の８０％の出力を行わせ、スピーカ３０Ｇに最大強度の１５％の出力を行わせる。この結果、乗員Ｐの位置からは、図６に示す空間位置ＭＰ１に音像が定位しているように感じることになる。 FIG. 6 is a diagram for explaining the principle of determining the position where the sound image is localized. Although FIG. 6 shows an example in which the speakers 30B, 30D, and 30G described above are used for simplification of the description, any speaker included in the speaker unit 30 may be used. The audio control unit 118 controls the amplifier (AMP) 32 and the mixer 34 connected to each speaker to localize the sound image. For example, when the sound image is localized at the spatial position MP1 shown in FIG. 6, the voice control unit 118 controls the amplifier 32 and the mixer 34 to cause the speaker 30B to output 5% of the maximum intensity, and the speaker 30D. The output is 80% of the maximum intensity, and the speaker 30G is made to output 15% of the maximum intensity. As a result, from the position of the occupant P, it seems that the sound image is localized at the spatial position MP1 shown in FIG.

また、図６に示す空間位置ＭＰ２に音像を定位させる場合、音声制御部１１８は、アンプ３２およびミキサー３４を制御することにより、スピーカ３０Ｂに最大強度の４５％の出力を行わせ、スピーカ３０Ｄに最大強度の４５％の出力を行わせ、スピーカ３０Ｇに最大強度の４５％の出力を行わせる。この結果、乗員Ｐの位置からは、図６に示す空間位置ＭＰ２に音像が定位しているように感じることになる。このように、車室内に設けられる複数のスピーカとそれぞれのスピーカから出力される音の大きさを調整することで、音像が定位される位置を変化させることができる。なお、より詳細には、音像の定位する位置は、音源が元々保有している音特性や、車室内環境の情報、頭部伝達関数（HRTF；Head-related transfer function）に基づいて定まるため、音声制御部１１８は、予め官能試験などで得られた最適な出力配分でスピーカユニット３０を制御することで、音像を所定の位置に定位させる。 Further, when the sound image is localized at the spatial position MP2 shown in FIG. 6, the voice control unit 118 controls the amplifier 32 and the mixer 34 to cause the speaker 30B to output 45% of the maximum intensity, and the speaker 30D. The output of 45% of the maximum intensity is performed, and the speaker 30G is made to output 45% of the maximum intensity. As a result, from the position of the occupant P, it seems that the sound image is localized at the spatial position MP2 shown in FIG. In this way, by adjusting the plurality of speakers provided in the vehicle interior and the loudness of the sound output from each speaker, the position where the sound image is localized can be changed. More specifically, the localization position of the sound image is determined based on the sound characteristics originally possessed by the sound source, the information on the vehicle interior environment, and the head-related transfer function (HRTF). The voice control unit 118 localizes the sound image at a predetermined position by controlling the speaker unit 30 with the optimum output distribution obtained in advance by a sensory test or the like.
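The output distributions given for MP1 and MP2 amount to a lookup table of per-speaker gains. The sketch below assumes such a table is tuned in advance, as the text says of the sensory test, and simply scales a mono signal by it. The table name, the function names, and the representation of a signal as a list of floats are illustrative assumptions; the amplifier 32 and the mixer 34 are not modeled.

```python
# Fraction of maximum intensity per speaker, taken from the MP1 and MP2 examples.
OUTPUT_DISTRIBUTION = {
    "MP1": {"30B": 0.05, "30D": 0.80, "30G": 0.15},
    "MP2": {"30B": 0.45, "30D": 0.45, "30G": 0.45},
}


def speaker_levels(position, max_level=1.0):
    """Return the output level of each speaker for the given sound image position."""
    return {speaker: ratio * max_level
            for speaker, ratio in OUTPUT_DISTRIBUTION[position].items()}


def distribute(samples, position):
    """Scale a mono signal into one channel per speaker."""
    return {speaker: [s * gain for s in samples]
            for speaker, gain in speaker_levels(position).items()}
```

Calling `distribute(agent_voice, "MP1")` sends most of the energy to speaker 30D, which is what places the perceived source at MP1 in the example; supporting a new position would mean adding another tuned row to the table.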

［エージェントサーバ］
図７は、エージェントサーバ２００の構成と、エージェント装置１００の構成の一部とを示す図である。以下、エージェントサーバ２００の構成と共にエージェント機能部１５０等の動作について説明する。ここでは、エージェント装置１００からネットワークＮＷまでの物理的な通信についての説明を省略する。 [Agent server]
FIG. 7 is a diagram showing a configuration of the agent server 200 and a part of the configuration of the agent device 100. Hereinafter, the operation of the agent function unit 150 and the like together with the configuration of the agent server 200 will be described. Here, the description of the physical communication from the agent device 100 to the network NW will be omitted.

エージェントサーバ２００は、通信部２１０を備える。通信部２１０は、例えばＮＩＣ（Network Interface Card）などのネットワークインターフェースである。更に、エージェントサーバ２００は、例えば、音声認識部２２０と、自然言語処理部２２２と、対話管理部２２４と、ネットワーク検索部２２６と、応答文生成部２２８とを備える。これらの構成要素は、例えば、ＣＰＵなどのハードウェアプロセッサがプログラム（ソフトウェア）を実行することにより実現される。これらの構成要素のうち一部または全部は、ＬＳＩやＡＳＩＣ、ＦＰＧＡ、ＧＰＵなどのハードウェア（回路部；circuitryを含む）によって実現されてもよいし、ソフトウェアとハードウェアの協働によって実現されてもよい。プログラムは、予めＨＤＤやフラッシュメモリなどの記憶装置（非一過性の記憶媒体を備える記憶装置）に格納されていてもよいし、ＤＶＤやＣＤ−ＲＯＭなどの着脱可能な記憶媒体（非一過性の記憶媒体）に格納されており、記憶媒体がドライブ装置に装着されることでインストールされてもよい。 The agent server 200 includes a communication unit 210. The communication unit 210 is a network interface such as a NIC (Network Interface Card). The agent server 200 further includes, for example, a voice recognition unit 220, a natural language processing unit 222, a dialogue management unit 224, a network search unit 226, and a response sentence generation unit 228. These components are realized, for example, by a hardware processor such as a CPU executing a program (software). Some or all of these components may be realized by hardware (including circuitry) such as an LSI, an ASIC, an FPGA, or a GPU, or may be realized by the cooperation of software and hardware. The program may be stored in advance in a storage device (a storage device including a non-transitory storage medium) such as an HDD or a flash memory, or it may be stored on a removable storage medium (a non-transitory storage medium) such as a DVD or a CD-ROM and installed when the storage medium is mounted in a drive device.

エージェントサーバ２００は、記憶部２５０を備える。記憶部２５０は、上記の各種記憶装置により実現される。記憶部２５０には、パーソナルプロファイル２５２、辞書ＤＢ（データベース）２５４、知識ベースＤＢ２５６、応答規則ＤＢ２５８などのデータやプログラムが格納される。 The agent server 200 includes a storage unit 250. The storage unit 250 is realized by the above-mentioned various storage devices. Data and programs such as a personal profile 252, a dictionary DB (database) 254, a knowledge base DB 256, and a response rule DB 258 are stored in the storage unit 250.

エージェント装置１００において、エージェント機能部１５０は、音声ストリーム、或いは圧縮や符号化などの処理を行った音声ストリームを、エージェントサーバ２００に送信する。エージェント機能部１５０は、ローカル処理（エージェントサーバ２００を介さない処理）が可能な音声コマンドを認識した場合は、音声コマンドで要求された処理を行ってよい。ローカル処理が可能な音声コマンドとは、エージェント装置１００が備える記憶部（不図示）を参照することで回答可能な音声コマンドであったり、エージェント機能部１５０−１の場合は車両機器５０を制御する音声コマンド（例えば、空調装置をオンにするコマンドなど）であったりする。従って、エージェント機能部１５０は、エージェントサーバ２００が備える機能の一部を有してもよい。 In the agent device 100, the agent function unit 150 transmits a voice stream or a voice stream that has undergone processing such as compression or coding to the agent server 200. When the agent function unit 150 recognizes a voice command capable of local processing (processing that does not go through the agent server 200), the agent function unit 150 may perform the processing requested by the voice command. The voice command capable of local processing is a voice command that can be answered by referring to a storage unit (not shown) included in the agent device 100, or in the case of the agent function unit 150-1, the vehicle device 50 is controlled. It may be a voice command (for example, a command to turn on the air conditioner). Therefore, the agent function unit 150 may have a part of the functions provided in the agent server 200.

音声ストリームを取得すると、音声認識部２２０が音声認識を行ってテキスト化された文字情報を出力し、自然言語処理部２２２が文字情報に対して辞書ＤＢ２５４を参照しながら意味解釈を行う。辞書ＤＢ２５４は、文字情報に対して抽象化された意味情報が対応付けられたものである。辞書ＤＢ２５４は、同義語や類義語の一覧情報を含んでもよい。音声認識部２２０の処理と、自然言語処理部２２２の処理は、段階が明確に分かれるものではなく、自然言語処理部２２２の処理結果を受けて音声認識部２２０が認識結果を修正するなど、相互に影響し合って行われてよい。 When the voice stream is acquired, the voice recognition unit 220 performs voice recognition and outputs textual character information, and the natural language processing unit 222 interprets the character information with reference to the dictionary DB 254. The dictionary DB 254 is associated with abstract semantic information with respect to character information. The dictionary DB 254 may include list information of synonyms and synonyms. The processing of the voice recognition unit 220 and the processing of the natural language processing unit 222 are not clearly separated in stages, and the voice recognition unit 220 corrects the recognition result in response to the processing result of the natural language processing unit 222. It may be done by influencing each other.

自然言語処理部２２２は、例えば、認識結果として、「今日の天気は」、「天気はどうですか」等の意味が認識された場合、標準文字情報「今日の天気」に置き換えたコマンドを生成する。これにより、リクエストの音声に文字揺らぎがあった場合にも要求にあった対話をし易くすることができる。また、自然言語処理部２２２は、例えば、確率を利用した機械学習処理等の人工知能処理を用いて文字情報の意味を認識したり、認識結果に基づくコマンドを生成してもよい。 For example, when the natural language processing unit 222 recognizes the meanings such as "today's weather" and "how is the weather" as the recognition result, the natural language processing unit 222 generates a command replaced with the standard character information "today's weather". As a result, even if there is a character fluctuation in the voice of the request, it is possible to facilitate the dialogue according to the request. Further, the natural language processing unit 222 may recognize the meaning of character information by using artificial intelligence processing such as machine learning processing using probability, or may generate a command based on the recognition result.
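Replacing differently worded requests with one piece of standard character information can be illustrated with a small variant table. This is a deliberately simple sketch: the table `STANDARD_COMMANDS` and the function `to_standard_command` are hypothetical, the English phrases stand in for the Japanese examples, and the machine-learning-based interpretation mentioned above is not attempted.

```python
# Standard command -> phrasings that should map to it (English stand-ins).
STANDARD_COMMANDS = {
    "today's weather": ("today's weather", "how is the weather"),
}


def to_standard_command(recognized_text):
    """Map a recognized utterance to its standard command, or None if unknown."""
    text = recognized_text.casefold().strip(" ?.!")
    for command, variants in STANDARD_COMMANDS.items():
        if any(variant in text for variant in variants):
            return command
    return None
```

With this table, "How is the weather?" and "What is today's weather" both produce the command `"today's weather"`, so the later dialogue stages only need to handle one form.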

対話管理部２２４は、自然言語処理部２２２の処理結果（コマンド）に基づいて、パーソナルプロファイル２５２や知識ベースＤＢ２５６、応答規則ＤＢ２５８を参照しながら車両Ｍの乗員に対する発話の内容を決定する。パーソナルプロファイル２５２は、乗員ごとに保存されている乗員の個人情報、趣味嗜好、過去の対話の履歴などを含む。知識ベースＤＢ２５６は、物事の関係性を規定した情報である。応答規則ＤＢ２５８は、コマンドに対してエージェントが行うべき動作（回答や機器制御の内容など）を規定した情報である。 The dialogue management unit 224 determines the content of the utterance to the occupant of the vehicle M based on the processing result (command) of the natural language processing unit 222 with reference to the personal profile 252, the knowledge base DB 256, and the response rule DB 258. The personal profile 252 includes the personal information of the occupants, hobbies and preferences, the history of past dialogues, etc. stored for each occupant. The knowledge base DB 256 is information that defines the relationships between things. The response rule DB 258 is information that defines the actions (answers, device control contents, etc.) that the agent should perform in response to the command.

また、対話管理部２２４は、音声ストリームから得られる特徴情報を用いて、パーソナルプロファイル２５２と照合を行うことで、乗員を特定してもよい。この場合、パーソナルプロファイル２５２には、例えば、音声の特徴情報に、個人情報が対応付けられている。音声の特徴情報とは、例えば、声の高さ、イントネーション、リズム（音の高低のパターン）等の喋り方の特徴や、メル周波数ケプストラム係数（Mel Frequency Cepstrum Coefficients）等による特徴量に関する情報である。音声の特徴情報は、例えば、乗員の初期登録時に所定の単語や文章等を乗員に発声させ、発声させた音声を認識することで得られる情報である。 Further, the dialogue management unit 224 may identify the occupant by collating feature information obtained from the voice stream against the personal profile 252. In this case, the personal profile 252 associates, for example, personal information with voice feature information. The voice feature information is, for example, information on characteristics of the manner of speaking, such as voice pitch, intonation, and rhythm (the pattern of rises and falls in pitch), and on feature quantities such as Mel Frequency Cepstrum Coefficients. The voice feature information is obtained, for example, by having the occupant utter predetermined words or sentences at the time of the occupant's initial registration and recognizing the uttered voice.

対話管理部２２４は、コマンドが、ネットワークＮＷを介して検索可能な情報を要求するものである場合、ネットワーク検索部２２６に検索を行わせる。ネットワーク検索部２２６は、ネットワークＮＷを介して各種ウェブサーバ３００にアクセスし、所望の情報を取得する。「ネットワークＮＷを介して検索可能な情報」とは、例えば、車両Ｍの周辺にあるレストランの一般ユーザによる評価結果であったり、その日の車両Ｍの位置に応じた天気予報であったりする。 The dialogue management unit 224 causes the network search unit 226 to perform a search when the command requests information that can be searched via the network NW. The network search unit 226 accesses various web servers 300 via the network NW and acquires desired information. The "information searchable via the network NW" may be, for example, an evaluation result by a general user of a restaurant in the vicinity of the vehicle M, or a weather forecast according to the position of the vehicle M on that day.

応答文生成部２２８は、対話管理部２２４により決定された発話の内容が車両Ｍの乗員に伝わるように、応答文を生成し、エージェント装置１００に送信する。応答文生成部２２８は、乗員がパーソナルプロファイルに登録された乗員であることが特定されている場合に、乗員の名前を呼んだり、乗員の話し方に似せた話し方にした応答文を生成したりしてもよい。 The response sentence generation unit 228 generates a response sentence and transmits it to the agent device 100 so that the content of the utterance determined by the dialogue management unit 224 is transmitted to the occupant of the vehicle M. The response sentence generation unit 228 calls the occupant's name or generates a response sentence that resembles the occupant's way of speaking when the occupant is identified as a registered occupant in the personal profile. You may.

エージェント機能部１５０は、応答文を取得すると、音声合成を行って音声を出力するように音声制御部１１８に指示する。また、エージェント機能部１５０は、音声出力に合わせてエージェントの画像を表示するように表示制御部１１６に指示する。このようにして、仮想的に出現したエージェントが車両Ｍの乗員に応答するエージェント機能が実現される。 When the agent function unit 150 acquires the response sentence, the agent function unit 150 instructs the voice control unit 118 to perform voice synthesis and output the voice. Further, the agent function unit 150 instructs the display control unit 116 to display the image of the agent in accordance with the audio output. In this way, the agent function in which the virtually appearing agent responds to the occupant of the vehicle M is realized.

［エージェント機能部１５０の選択処理について：ウエイクアップワード無し］
以下、選択部１２２が、エージェント機能部１５０を選択する選択処理について説明する。図８は、地図検索機能を提供する場合のエージェントと乗員の対話の一例を示す図である。まず、乗員は、エージェントに対して、地図検索機能の提供を要求する旨を含む発話ＣＶ１を行う。発話ＣＶ１は、例えば、「地図検索機能を起動して？」等の言葉である。これを受けて、選択部１２２は、例えば、上述した処理によって機能特定部１２０が特定した機能（この一例では、地図検索機能）を検索キーとして、機能一覧情報１６２を検索し、当該機能が対応付けられているエージェントを特定する。図５の機能一覧情報１６２において、地図検索機能が対応付けられているエージェントは、エージェント１〜３のエージェントである。 [About the selection process of the agent function unit 150: No wake word]
Hereinafter, the selection process in which the selection unit 122 selects the agent function unit 150 will be described. FIG. 8 is a diagram showing an example of a dialogue between an agent and an occupant when providing a map search function. First, the occupant makes an utterance CV1 including requesting the agent to provide the map search function. The utterance CV1 is, for example, a word such as "Activate the map search function?". In response to this, the selection unit 122 searches the function list information 162 using, for example, the function specified by the function specifying unit 120 by the above-mentioned process (in this example, the map search function) as a search key, and the function corresponds to the function. Identify the attached agent. In the function list information 162 of FIG. 5, the agents associated with the map search function are the agents of agents 1 to 3.

次に、選択部１２２は、当該機能が対応付けられているエージェントのうち、既に当該機能の実行履歴が「実行済み」を示すエージェントが存在する場合であっても、当該機能の実行履歴が「未実行」を示すエージェントを優先的に選択する。図５の機能一覧情報１６２において、地図検索機能が「未実行」を示すエージェントは、エージェント１のみである。したがって、選択部１２２は、エージェント機能部１５０−１を乗員の音声に応答させるエージェント機能部として、エージェント機能部１５０−２やエージェント機能部１５０−３に対して優先的に選択し、起動させる。 Next, even when the agents associated with the function include an agent whose execution history for the function already indicates "executed", the selection unit 122 preferentially selects an agent whose execution history for the function indicates "not executed". In the function list information 162 of FIG. 5, agent 1 is the only agent whose map search function indicates "not executed". Therefore, the selection unit 122 selects and activates the agent function unit 150-1, in preference to the agent function unit 150-2 and the agent function unit 150-3, as the agent function unit that responds to the occupant's voice.
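The preference for "not executed" agents can be sketched as a filter over the function list. The dictionary layout and the name `select_agent` are assumptions for illustration, and the tie-break (the first agent in table order) is also an assumption, since the text does not say what happens when several agents are "not executed".

```python
# Execution history per agent and function, following the description of FIG. 5.
# True means "executed", False means "not executed"; the agent keys are made up.
FUNCTION_LIST = {
    "agent1": {"map search": False, "voice playback": False, "shiritori": False},
    "agent2": {"map search": True, "music playback": False},
    "agent3": {"map search": True, "music playback": True},
}


def select_agent(function_name, function_list=FUNCTION_LIST):
    """Pick an agent offering the function, preferring one that has not run it yet."""
    candidates = [agent for agent, functions in function_list.items()
                  if function_name in functions]
    if not candidates:
        return None
    not_executed = [agent for agent in candidates
                    if not function_list[agent][function_name]]
    # Fall back to any capable agent when every candidate is "executed".
    return (not_executed or candidates)[0]


print(select_agent("map search"))      # agent1
print(select_agent("music playback"))  # agent2
```

The first call reproduces the example in the text: agents 1 to 3 all offer map search, but only agent 1 is "not executed", so it is chosen ahead of agents 2 and 3.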

選択部１２２によって起動されたエージェント機能部１５０（この一例では、エージェント機能部１５０−１）は、発話ＣＶ１に対する応答文ＲＰ１を、対応するエージェントサーバ２００（この一例では、エージェントサーバ２００−１）から取得し、当該応答文ＲＰ１に音声合成を行って音声を出力するように音声制御部１１８に指示する。応答文ＲＰ１は、例えば、発話ＣＶ１において、要求されている機能を実行するエージェント機能部１５０のエージェントを紹介する言葉が含まれる。応答文ＲＰ１は、例えば、「こんにちは、△△（エージェント１）です。私が地図検索機能を提供します。」等の言葉である。 The agent function unit 150 activated by the selection unit 122 (in this example, the agent function unit 150-1) acquires the response sentence RP1 to the utterance CV1 from the corresponding agent server 200 (in this example, the agent server 200-1), and instructs the voice control unit 118 to perform voice synthesis on the response sentence RP1 and output the voice. The response sentence RP1 includes, for example, words introducing the agent of the agent function unit 150 that executes the function requested in the utterance CV1. The response sentence RP1 is, for example, "Hello, this is △△ (agent 1). I will provide the map search function."

エージェント機能部１５０−１は、応答文ＲＰ１に対する乗員の発話ＣＶ２が、肯定的な内容である場合、要求された機能（この一例では、地図検索機能）の提供を行う。また、エージェント機能部１５０−１は、応答文ＲＰ１に対する乗員の発話ＣＶ２が、否定的な内容である場合、選択部１２２に再度、エージェント機能部１５０の選択を指示する。この場合、選択部１２２は、一度選択したエージェント機能部１５０を除くエージェント機能部１５０から、乗員が要求する機能を提供するエージェント機能部１５０を選択する。 The agent function unit 150-1 provides the requested function (in this example, the map search function) when the utterance CV2 of the occupant with respect to the response sentence RP1 has a positive content. Further, when the occupant's utterance CV2 with respect to the response sentence RP1 has a negative content, the agent function unit 150-1 instructs the selection unit 122 to select the agent function unit 150 again. In this case, the selection unit 122 selects the agent function unit 150 that provides the function requested by the occupant from the agent function unit 150 excluding the agent function unit 150 that has been selected once.

［エージェント機能部１５０の選択処理について：ウエイクアップワード有り］
次に、乗員が、エージェントに対して、ウエイクアップワードと、地図検索機能の提供を要求する旨とを含む発話ＣＶ３を行う場合について説明する。図９は、ウエイクアップワードを含む発話ＣＶ３に対するエージェントの回答の一例を示す図である。発話ＣＶ３は、例えば、「『ねぇ〇〇（エージェント２）』（ウエイクアップワード）、地図検索機能を起動して？」等の言葉である。これを受けて、選択部１２２は、例えば、上述したように、地図検索機能が対応付けられているエージェントが、エージェント１〜３であると特定する。次に、選択部１２２は、当該機能が対応付けられているエージェントのうち、既に当該機能の実行履歴が「実行済み」を示すエージェントが存在し、ウエイクアップワードで指定されたエージェントが存在する場合であっても、当該機能の実行履歴が「未実行」を示すエージェントを優先的に選択する。図５の機能一覧情報１６２において、地図検索機能が「未実行」を示すエージェントは、エージェント１のみである。したがって、選択部１２２は、エージェント機能部１５０−１を乗員の音声に応答させるエージェント機能部として、エージェント機能部１５０−２やエージェント機能部１５０−３に対して優先的に選択し、起動させる。 [About the selection process of the agent function unit 150: There is a wake word]
Next, a case will be described in which the occupant makes an utterance CV3 that includes a wakeup word and a request to the agent to provide the map search function. FIG. 9 is a diagram showing an example of the agent's response to the utterance CV3 including the wakeup word. The utterance CV3 is, for example, "'Hey OO (agent 2)' (wakeup word), activate the map search function?" In response to this, the selection unit 122 identifies, as described above, that the agents associated with the map search function are agents 1 to 3. Next, even when the agents associated with the function include an agent whose execution history for the function already indicates "executed", and even when an agent has been designated by the wakeup word, the selection unit 122 preferentially selects an agent whose execution history for the function indicates "not executed". In the function list information 162 of FIG. 5, agent 1 is the only agent whose map search function indicates "not executed". Therefore, the selection unit 122 selects and activates the agent function unit 150-1, in preference to the agent function unit 150-2 and the agent function unit 150-3, as the agent function unit that responds to the occupant's voice.

選択部１２２によって起動されたエージェント機能部１５０（この一例では、エージェント機能部１５０−１）は、発話ＣＶ１に対する応答文ＲＰ２を、対応するエージェントサーバ２００（この一例では、エージェントサーバ２００−１）から取得し、当該応答文ＲＰ２に音声合成を行って音声を出力するように音声制御部１１８に指示する。ここで、応答文ＲＰ２は、例えば、発話ＣＶ１において、選択部１２２によって起動されたエージェント機能部１５０が実現するエージェント（この一例では、エージェント１）以外のエージェント２〜３を起動するウエイクアップワードが含まれていた場合、乗員の混乱を防ぐため、起動したエージェントがエージェント１であることを名乗る言葉が含まれる。また、応答文ＲＰ２は、例えば、要求されている機能が、選択部１２２によって起動されたエージェント機能部１５０によっても実行可能となったことを紹介する言葉が含まれる。応答文ＲＰ２は、例えば、「こんにちは、△△（エージェント１）です。私も地図検索機能が使えるようになったんですよ。よろしかったら使ってみませんか？」等の言葉である。 The agent function unit 150 activated by the selection unit 122 (in this example, the agent function unit 150-1) acquires the response sentence RP2 to the utterance CV1 from the corresponding agent server 200 (in this example, the agent server 200-1), and instructs the voice control unit 118 to perform voice synthesis on the response sentence RP2 and output the voice. Here, when the utterance CV1 includes a wakeup word that activates one of agents 2 to 3 rather than the agent realized by the agent function unit 150 activated by the selection unit 122 (in this example, agent 1), the response sentence RP2 includes words stating that the activated agent is agent 1, in order to avoid confusing the occupant. The response sentence RP2 also includes, for example, words explaining that the requested function has become executable by the agent function unit 150 activated by the selection unit 122 as well. The response sentence RP2 is, for example, "Hello, this is △△ (agent 1). I can now use the map search function too. Would you like to try it?"

When the occupant's utterance CV4 in reply to the response sentence RP2 is affirmative, the agent function unit 150-1 provides the requested function (in this example, the map search function). When the occupant's utterance CV4 in reply to the response sentence RP2 is negative, the agent function unit 150-1 instructs the selection unit 122 to select an agent function unit 150 again. In this case, the selection unit 122 selects an agent function unit 150 that provides the function requested by the occupant from among the agent function units 150 other than those already selected once.
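The re-selection after a negative reply can be sketched as follows. This is a minimal Python illustration under stated assumptions: the table `FUNCTION_LIST` stands in for the function list information 162 with assumed values, the names `select_agent` and `offer_until_accepted` are hypothetical, and the tie-break by agent number is a placeholder for whatever predetermined rule is actually used.

```python
from typing import Dict, Iterable, Optional, Set

NOT_EXECUTED = "not executed"

# Illustrative stand-in for the function list information 162 (values assumed).
FUNCTION_LIST: Dict[str, Dict[int, str]] = {
    "map search": {1: NOT_EXECUTED, 2: "executed", 3: "executed"},
}


def select_agent(function: str, excluded: Set[int]) -> Optional[int]:
    """Select an agent for `function`, skipping agents that were already offered."""
    candidates = {
        agent: history
        for agent, history in FUNCTION_LIST.get(function, {}).items()
        if agent not in excluded
    }
    if not candidates:
        return None
    # "not executed" agents sort first; the agent-number tie-break is an
    # assumption standing in for the embodiment's "predetermined rule".
    return min(candidates, key=lambda a: (candidates[a] != NOT_EXECUTED, a))


def offer_until_accepted(function: str, replies: Iterable[bool]) -> Optional[int]:
    """Offer agents one by one; `replies` holds the occupant's yes/no answers."""
    excluded: Set[int] = set()
    for accepted in replies:
        agent = select_agent(function, excluded)
        if agent is None:
            return None
        if accepted:
            return agent
        excluded.add(agent)  # negative reply: never re-offer this agent
    return None
```

With the assumed table, an immediate "yes" yields agent 1, while one "no" followed by "yes" yields agent 2, because agent 1 has been excluded.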

As described above, according to the agent device 100 of the present embodiment, an agent having a new function preferentially responds to the occupant, which makes the new function easier for the occupant to use.

[Operation flow]
FIG. 10 is a flowchart showing a series of operations of the agent device 100. First, the sound processing unit 112 performs sound processing on the sound picked up by the microphone 10 (step S100). Next, the function specifying unit 120 identifies, based on the processed audio stream, the agent function that the occupant is requesting (step S102). The selection unit 122 determines whether there is an agent capable of executing the function identified by the function specifying unit 120 (step S104). When no agent is capable of realizing the identified function, the selection unit 122 selects and activates an agent function unit 150 according to a predetermined rule, and provides the audio stream to the activated agent function unit 150 (step S106). The predetermined rule is, for example, a rule that selects an agent function unit 150 based on a predetermined selection order, or a rule that selects an agent function unit 150 at random.

In response, the agent server 200 generates a response sentence for telling the occupant that the function cannot be provided, and provides it to the management unit 110. Next, the agent function unit 150 acquires the response sentence provided by the agent server 200 (step S108). Next, the agent function unit 150 determines whether the agent's task has been completed (step S110). For example, the agent function unit 150 determines that the task has been completed when a response sentence to the occupant's utterance has been provided. The voice control unit 118 performs voice synthesis on the response sentence acquired by the agent function unit 150-1 and outputs the voice (step S112).

When the selection unit 122 determines that there are agents capable of realizing the identified function, it determines whether those agents include an agent whose execution history for the function indicates "not executed" (step S114). When the selection unit 122 determines that no agent's execution history indicates "not executed", it selects, according to a predetermined rule, the agent function unit 150 that is to realize the requested function from among the agent function units whose execution history indicates "executed" (step S116). The selection unit 122 provides the audio stream to the selected agent function unit 150 (step S118).

In response, the agent server 200 generates a response sentence for telling the occupant that the agent provides the requested function, and provides it to the management unit 110. Next, the selected agent function unit 150 acquires the response sentence provided by the agent server 200 (step S120). Next, the agent function unit 150 determines whether the agent's task has been completed (step S122). For example, the agent function unit 150 determines that the task has been completed when a response sentence to the occupant's utterance has been provided. The voice control unit 118 performs voice synthesis on the response sentence acquired by the agent function unit 150 and outputs the voice (step S124).

When the selection unit 122 determines that there is an agent indicating "not executed", it provides the audio stream to the agent function unit 150 that realizes the identified agent (step S126). Note that, when the selection unit 122 determines that there is an agent indicating "not executed", it may select, according to a predetermined rule, the agent function unit 150 that is to realize the requested function from among the agent function units 150 that realize the identified agents.

In response, the agent server 200 generates a response sentence for telling the occupant that the agent provides the requested function, and provides it to the management unit 110. Next, the agent function unit 150 acquires the response sentence provided by the agent server 200 (step S128). Next, the agent function unit 150 determines whether the agent's task has been completed (step S130). The voice control unit 118 performs voice synthesis on the response sentence acquired by the agent function unit 150 and outputs the voice (step S132).
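The three branches of the FIG. 10 flow (steps S106, S116, and S126) can be summarized in a short Python sketch. It is only an illustration: the names `apply_rule` and `select_fig10`, the dictionary layout, and the example functions "weather" and "karaoke" used in the note below are assumptions, and the "predetermined rule" is reduced to either a fixed order or a random choice, the two examples named in the description.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple

NOT_EXECUTED = "not executed"


def apply_rule(agents: List[int], rule: str, rng: Optional[random.Random] = None) -> int:
    """Predetermined rule: fixed selection order or random choice."""
    if rule == "random":
        return (rng or random).choice(agents)
    return agents[0]  # "order": first agent in the predetermined order


def select_fig10(
    function: str,
    function_list: Dict[str, Dict[int, str]],
    all_agents: Sequence[int],
    rule: str = "order",
    rng: Optional[random.Random] = None,
) -> Tuple[str, int]:
    """Return (step at which the agent is chosen, selected agent number)."""
    capable = function_list.get(function, {})
    if not capable:
        # S104 "no": nobody can realize the function; pick by rule anyway.
        return "S106", apply_rule(list(all_agents), rule, rng)
    unexecuted = sorted(a for a, h in capable.items() if h == NOT_EXECUTED)
    if not unexecuted:
        # S114 "no": choose among agents that already executed the function.
        return "S116", apply_rule(sorted(capable), rule, rng)
    # S114 "yes": an agent that has not yet executed the function is preferred.
    return "S126", apply_rule(unexecuted, rule, rng)
```

With a table in which only agent 1 is marked "not executed" for "map search", the sketch returns `("S126", 1)`; a function that only agents 2 and 3 have already executed is resolved at S116, and an unknown function at S106.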

[Priority of the agent function units 150]
When there are a plurality of agent function units 150 for which the function requested by the occupant is marked "not executed", the selection unit 122 may select an agent function unit 150 based on a priority assigned to each agent function unit 150. Among the plurality of agent function units 150, the one assigned a high priority is, for example, a vehicle agent function unit having a function of instructing the vehicle equipment 50 to operate (in this example, the agent function unit 150-1). In the following, the agent function unit 150 with the highest priority is assumed to be the agent function unit 150-1, and the priority relationship among the agent function units 150 is assumed to be agent function unit 150-1 > agent function unit 150-2 > agent function unit 150-3.

For example, when the function requested by the occupant is the "music playback function", the agents whose execution history indicates "not executed" are agents 1 and 2. Because the agent function unit 150-1 realizing agent 1 has a higher priority than the agent function unit 150-2 realizing agent 2, the selection unit 122 selects the agent function unit 150-1.
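The priority comparison in this example can be written as a one-line selection. The sketch below is illustrative only: the numeric values in `PRIORITY` are an assumed encoding of the ordering 150-1 > 150-2 > 150-3, and `pick_by_priority` is a hypothetical name.

```python
from typing import Dict, Iterable

# Larger value = higher priority; encodes 150-1 > 150-2 > 150-3 (values assumed).
PRIORITY: Dict[str, int] = {"150-1": 3, "150-2": 2, "150-3": 1}


def pick_by_priority(unexecuted_units: Iterable[str]) -> str:
    """Among units marked "not executed" for a function, pick the highest priority."""
    return max(unexecuted_units, key=lambda unit: PRIORITY[unit])
```

For the music playback example, `pick_by_priority(["150-2", "150-1"])` selects `"150-1"`.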

As described above, according to the agent device 100 of the present embodiment, a specific agent preferentially responds to the occupant, which increases the opportunities for the occupant to interact with a familiar agent.

[Operation flow]
FIG. 11 is a flowchart showing a series of operations of the agent device 100 when priorities are assigned to the agent function units 150. Processes that are the same as those shown in FIG. 10 are given the same step numbers, and their description is omitted.

When the selection unit 122 determines that there are agents whose execution history for the function indicates "not executed", it determines whether those agents include a high-priority agent (in this example, agent 1) (step S200). When the selection unit 122 determines that agent 1 is included, it provides the audio stream to the agent function unit 150-1 that realizes the high-priority agent 1 (step S202). In response, the agent server 200-1 generates a response sentence for telling the occupant that agent 1 provides the requested function, and provides it to the management unit 110. Next, the agent function unit 150 acquires the response sentence provided by the agent server 200 (step S204). Next, the agent function unit 150 determines whether the agent's task has been completed (step S206). For example, the agent function unit 150 determines that the task has been completed when a response sentence to the occupant's utterance has been provided. The voice control unit 118 performs voice synthesis on the response sentence acquired by the agent function unit 150 and outputs the voice (step S208).

When the selection unit 122 determines in step S114 that there is no agent whose execution history for the function indicates "not executed", or determines that agent 1 is not included among the agents capable of realizing the identified function, it selects, according to a predetermined rule, the agent function unit 150 that is to realize the requested function (step S210). The predetermined rule is, for example, a rule that selects an agent function unit 150 based on a predetermined selection order, a rule that selects an agent function unit 150 at random, or a rule that selects the agent function unit 150 realizing a high-priority agent among the agents whose execution history indicates "executed". The selection unit 122 provides the audio stream to the agent function unit 150 that realizes the selected agent (step S212).

In response, the agent server 200 generates a response sentence for telling the occupant that the agent provides the requested function, and provides it to the management unit 110. Next, the agent function unit 150 acquires the response sentence provided by the agent server 200 (step S214). Next, the agent function unit 150 determines whether the agent's task has been completed (step S216). The voice control unit 118 performs voice synthesis on the response sentence acquired by the agent function unit 150 and outputs the voice (step S218).
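The branch added in FIG. 11 (steps S200, S202, and S210) can be sketched in Python as follows. The function name `select_fig11` and the data layout are hypothetical, and the fallback used at S210 (first capable agent in priority order) is just one possible "predetermined rule", chosen here for illustration.

```python
from typing import Dict, Tuple

NOT_EXECUTED = "not executed"
PRIORITY_ORDER = [1, 2, 3]              # agent 1 > agent 2 > agent 3
HIGH_PRIORITY_AGENT = PRIORITY_ORDER[0]


def select_fig11(capable: Dict[int, str]) -> Tuple[str, int]:
    """`capable` maps each agent able to run the function to its execution history."""
    unexecuted = {a for a, h in capable.items() if h == NOT_EXECUTED}
    # S114 then S200: is the high-priority agent among the "not executed" agents?
    if HIGH_PRIORITY_AGENT in unexecuted:
        return "S202", HIGH_PRIORITY_AGENT
    # S210: fall back to a predetermined rule; here, the first capable agent
    # in priority order (an assumed rule, not prescribed by the description).
    for agent in PRIORITY_ORDER:
        if agent in capable:
            return "S210", agent
    raise ValueError("no agent can realize the function")
```

When agents 1 and 2 are both marked "not executed", the sketch stops at S202 with agent 1; when agent 1 is absent or has already executed the function, selection falls through to S210.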

[Processing for providing information on newly added functions: when there is an inquiry]
When a new function has been added, the agent function unit 150 may provide the occupant with information on the newly added function. FIG. 12 is a diagram showing an example of a dialogue between an agent and the occupant when information on a newly added function is provided. First, the occupant makes an utterance CV3 asking the agent about its newly added functions. The utterance CV3 is, for example, "Are there any new functions?" In response, the function specifying unit 120 determines whether the text obtained by transcribing the utterance contains a phrase such as "new function". When the transcribed text contains such a phrase, the function specifying unit 120 determines that the occupant is inquiring about a newly added function of the agent.
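A keyword check of the kind described above might look like the following Python sketch. The phrase "new function" comes from the description; the second phrase, the case-insensitive substring match, and the name `is_new_function_inquiry` are assumptions added for illustration.

```python
from typing import Sequence

# "new function" comes from the description; "new feature" is an assumed extra.
INQUIRY_PHRASES: Sequence[str] = ("new function", "new feature")


def is_new_function_inquiry(text: str) -> bool:
    """Return True when the transcribed utterance asks about newly added functions."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in INQUIRY_PHRASES)
```

Because the match is a substring test, "Are there any new functions?" is recognized as an inquiry, while "Start the map search function" is not.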

When the function specifying unit 120 determines that the occupant is inquiring about a newly added function of an agent, the selection unit 122 identifies, in the function list information 162, a function whose execution history is "not executed". In FIG. 5, a function whose execution history is "not executed" is, for example, the shiritori function executable by agent 1. The selection unit 122 selects and activates the agent function unit 150-1 as the agent function unit that responds to the occupant's voice.

The agent function unit 150 activated by the selection unit 122 (in this example, the agent function unit 150-1) acquires a response sentence RP2 to the utterance CV3 from the corresponding agent server 200 (in this example, the agent server 200-1), and instructs the voice control unit 118 to perform voice synthesis on the response sentence RP2 and output the voice. The response sentence RP2 includes, for example, words announcing that the newly added function can now be executed by the agent function unit 150 activated by the selection unit 122. The response sentence RP2 is, for example, "Hello, this is △△ (Agent 1). I can now run the 'shiritori function'. Would you like to use it?"

When the occupant's utterance CV4 in reply to the response sentence RP2 is affirmative, the agent function unit 150-1 provides the requested function (in this example, the shiritori function). When the occupant's utterance CV4 in reply to the response sentence RP2 is negative, the agent function unit 150-1 instructs the selection unit 122 to select an agent function unit 150 again. In this case, the selection unit 122 selects a function, other than the function already selected once, whose usage history is "not executed", and selects an agent function unit 150 capable of executing that function.
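Choosing the next "not executed" function while skipping functions the occupant has already declined can be sketched as below. This is an illustration under assumptions: the dictionary layout, the name `next_unexecuted`, and the iteration order (table order, then agent number) are not specified in the description.

```python
from typing import Dict, Optional, Set, Tuple

NOT_EXECUTED = "not executed"


def next_unexecuted(
    function_list: Dict[str, Dict[int, str]], declined: Set[str]
) -> Optional[Tuple[str, int]]:
    """Return (function, agent) for the next unexecuted function not yet declined."""
    for function, agents in function_list.items():
        if function in declined:
            continue  # already offered once and refused by the occupant
        for agent in sorted(agents):
            if agents[agent] == NOT_EXECUTED:
                return function, agent
    return None  # nothing left to introduce
```

Each negative reply adds the offered function to `declined`, so repeated calls walk through the remaining unexecuted functions until `None` is returned.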

As described above, according to the agent device 100 of the present embodiment, new functions are introduced to the occupant, which makes them easier for the occupant to use.

[Operation flow]
FIG. 13 is a flowchart showing a series of processes for introducing unexecuted functions of the agent device 100. First, the sound processing unit 112 performs sound processing on the sound picked up by the microphone 10 (step S300). Next, the function specifying unit 120 determines, based on the processed audio stream, whether the occupant has inquired about additional functions (step S302). When the occupant has not made such an inquiry, the agent device 100 ends the processing of the flowchart of FIG. 13. When the function specifying unit 120 determines that the occupant has inquired about additional functions, it determines, based on the function list information 162, whether any agent function is unexecuted (step S304). When the function specifying unit 120 determines that there is no unexecuted agent function, the voice control unit 118 performs voice synthesis on a response sentence notifying the occupant that there are no additional functions, and outputs the voice (step S306). For example, the function specifying unit 120 instructs an agent function unit 150 to generate the response sentence notifying that there are no additional functions, and receives that response sentence from the agent function unit 150. This response sentence may be provided by the agent function unit 150 with the highest priority, or by another agent function unit 150.

The function specifying unit 120 provides the audio stream to the agent function unit 150 that has an unexecuted function (step S308). In response, the agent server 200 generates a response sentence for telling the occupant that the agent provides the requested function, and provides it to the management unit 110. Next, the agent function unit 150 acquires the response sentence provided by the agent function unit 150 (step S310). Next, the agent function unit 150 determines whether the agent's task has been completed (step S312). The voice control unit 118 performs voice synthesis on the response sentence acquired by the agent function unit 150 and outputs the voice (step S314).
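The path taken through FIG. 13 can be traced with a small Python sketch that records the step numbers visited. Everything apart from the step numbers is assumed: the name `run_fig13`, the substring test for the inquiry, the choice of the first unexecuted entry, and the wording of the two response sentences.

```python
from typing import Dict, List, Optional, Tuple

NOT_EXECUTED = "not executed"


def run_fig13(
    text: str, function_list: Dict[str, Dict[int, str]]
) -> Tuple[List[str], Optional[str]]:
    """Return (steps visited, sentence to be spoken or None)."""
    steps = ["S300", "S302"]
    if "new function" not in text.lower():
        return steps, None  # no inquiry: the flow ends
    steps.append("S304")
    pending = [
        (function, agent)
        for function, agents in function_list.items()
        for agent, history in agents.items()
        if history == NOT_EXECUTED
    ]
    if not pending:
        steps.append("S306")
        return steps, "There are no newly added functions."
    function, agent = pending[0]  # first unexecuted entry (assumed choice)
    steps += ["S308", "S310", "S312", "S314"]
    return steps, f"Agent {agent} can now run the {function} function. Would you like to use it?"
```

An utterance without the inquiry phrase stops after S302, an inquiry with nothing unexecuted ends at S306, and an inquiry with an unexecuted function runs through S314.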

[Processing for providing information on newly added functions: when there is no inquiry]
The above describes the case in which the agent function unit 150 provides the occupant with information on a newly added function when the occupant inquires about additional functions, but the present invention is not limited to this. For example, the agent function unit 150 may provide the occupant with information on a newly added function while making a response unrelated to that function (for example, small talk). For example, when the newly added function is the "shiritori function" and the agent function unit 150 is responding to the occupant in connection with the "map search function", the agent function unit 150 may, after finishing the response concerning the map search function, provide the occupant with information on the newly added function by responding with, for example, "By the way, I can now run the 'shiritori function'. Would you like to use it?"
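Appending such a notice to an unrelated response can be sketched in a few lines of Python. The helper name `with_new_function_notice` and the exact wording are assumptions; the description only requires that the notice follow the unrelated response.

```python
from typing import Optional


def with_new_function_notice(main_response: str, new_function: Optional[str]) -> str:
    """Append a notice about a newly added function after an unrelated response."""
    if not new_function:
        return main_response  # nothing new to announce
    notice = f'By the way, I can now run the "{new_function}" function. Would you like to use it?'
    return f"{main_response} {notice}"
```

For example, `with_new_function_notice("Here is the route.", "shiritori")` returns the route response followed by the shiritori notice, and passing `None` leaves the response unchanged.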

Although modes for carrying out the present invention have been described above using embodiments, the present invention is in no way limited to these embodiments, and various modifications and substitutions can be made without departing from the gist of the present invention.

1 ... Agent system, 10 ... Microphone, 20 ... Display / operation device, 22 ... First display, 24 ... Second display, 30 ... Speaker unit, 32 ... Amplifier, 34 ... Mixer, 40 ... Navigation device, 50 ... Vehicle equipment, 60 ... In-vehicle communication device, 70 ... General-purpose communication device, 80 ... Occupant recognition device, 100 ... Agent device, 110 ... Management unit, 112 ... Sound processing unit, 114 ... Per-agent WU determination unit, 116 ... Display control unit, 118 ... Voice control unit, 120 ... Function specifying unit, 122 ... Selection unit, 150, 150-1, 150-2, 150-3 ... Agent function unit, 152 ... Pairing application execution unit, 160 ... Storage unit, 162 ... Function list information, 200, 200-1, 200-2, 200-3 ... Agent server, 210 ... Communication unit, 220 ... Voice recognition unit, 222 ... Natural language processing unit, 224 ... Dialogue management unit, 226 ... Network search unit, 228 ... Response sentence generation unit, 250 ... Storage unit, 252 ... Personal profile, 300 ... Web server

Claims

An agent device comprising:
a plurality of agent function units, each providing a service including a voice response in response to an utterance of an occupant of a vehicle; and
a selection unit that selects, from among the plurality of agent function units, the agent function unit corresponding to the occupant's utterance,
wherein, when a new function has been added to one agent function unit among the plurality of agent function units and the newly added function is to be provided to the occupant, the selection unit causes the function of the agent function unit to which the new function has been added to be provided to the occupant in preference to other agent function units that already have the same function as the newly added function.

An agent device comprising:
a plurality of agent function units, each providing a service including a voice response in response to an utterance of an occupant of a vehicle; and
a selection unit that selects, from among the plurality of agent function units, the agent function unit corresponding to the occupant's utterance,
wherein the plurality of agent function units include a vehicle agent function unit having a function of instructing vehicle equipment to operate, and
when a new function has been added to the vehicle agent function unit among the plurality of agent function units and the newly added function is to be provided to the occupant, the selection unit causes the function of the vehicle agent function unit to which the new function has been added to be provided to the occupant in preference to other agent function units that already have the same function as the newly added function.

The agent device according to claim 1 or 2,
wherein, even when the occupant's utterance designates a specific agent function unit among the plurality of agent function units, when the newly added function is to be provided to the occupant, the selection unit causes the function of the agent function unit to which the new function has been added to be provided to the occupant in preference to other agent function units that already have the same function as the newly added function.

The agent device according to any one of claims 1 to 3,
wherein, when a new function has been added to at least one agent function unit among the plurality of agent function units, that agent function unit provides the occupant with information on the newly added function in response to an inquiry that does not specify the content of the new function.

The agent device according to any one of claims 1 to 4,
wherein, when a new function has been added to at least one agent function unit among the plurality of agent function units, that agent function unit provides the occupant with information on the newly added function while making a response unrelated to the new function.

A control method for an agent device, in which a computer:
activates any of a plurality of agent function units and, as a function of the activated agent function unit, provides a service including a voice response in response to an utterance of an occupant of a vehicle;
selects, from among the plurality of agent function units, the agent function unit corresponding to the occupant's utterance; and
when a new function has been added to one agent function unit among the plurality of agent function units and the newly added function is to be provided to the occupant, causes the function of the agent function unit to which the new function has been added to be provided to the occupant in preference to other agent function units that already have the same function as the newly added function.

A program that causes a computer to:
activate any of a plurality of agent function units and, as a function of the activated agent function unit, provide a service including a voice response in response to an utterance of an occupant of a vehicle;
select, from among the plurality of agent function units, the agent function unit corresponding to the occupant's utterance; and
when a new function has been added to one agent function unit among the plurality of agent function units and the newly added function is to be provided to the occupant, cause the function of the agent function unit to which the new function has been added to be provided to the occupant in preference to other agent function units that already have the same function as the newly added function.