JP7418189B2

JP7418189B2 - Display image generation device and display image generation method

Info

Publication number: JP7418189B2
Application number: JP2019210845A
Authority: JP
Inventors: 裕史井上; 乘西山; 雄宇志小田; 剛仁寺口; 翔太大久保
Original assignee: Renault SAS
Current assignee: Renault SAS
Priority date: 2019-11-21
Filing date: 2019-11-21
Publication date: 2024-01-19
Anticipated expiration: 2039-11-21
Also published as: JP2021081372A

Description

The present disclosure relates to a display image generation device and a display image generation method.

Techniques for generating information on the position of a recognized object outside a vehicle are known. For example, Patent Document 1 discloses a technique that identifies, by gaze detection and voice recognition, an object outside the vehicle to which a vehicle occupant is paying attention, and generates a display image showing the position of the identified object relative to the vehicle.

Japanese Patent Application Publication No. 2006-90790

However, the conventional technique described above presupposes that the user is looking in the direction in which the object exists. It cannot generate information on the position of the object regardless of whether the object is within the user's visual field. Furthermore, the conventional technique aims to generate information on the position of an object recognized by the user himself or herself. It does not consider generating, for the user, information on the position of an object recognized by an entity other than the user.

The present disclosure has been made in view of these circumstances. Its object is to provide a display image generation device and a display image generation method that appropriately generate information on the position of an extraction target recognized by an entity other than the user, regardless of whether the extraction target is included in the user's visual field.

The display image generation device according to the present disclosure identifies an object included in an utterance made by a speaker as an extraction target and generates a display image relating to the extraction target. The device includes an utterance data acquisition unit, an object extraction unit, a visual field image acquisition unit, an object determination unit, and a display image generation unit. The utterance data acquisition unit acquires utterance data of an utterance made by the speaker to a user. The object extraction unit stores a plurality of object data in advance, compares the plurality of object data with the utterance data acquired by the utterance data acquisition unit, and extracts, as the extraction target, the data in the utterance data that matches the object data. The visual field image acquisition unit acquires an image that includes at least a visual field image corresponding to the user's visual field. The object determination unit determines whether the extraction target extracted by the object extraction unit is included in the visual field image. The display image generation unit acquires object information, which is information on the position of the extraction target, and generates a display image that is different from the visual field image and includes the object information. Based on the determination by the object determination unit as to whether the extraction target is included in the visual field image, the display image generation unit determines the display mode of the display image relating to the extraction target, and it selects different display modes for the case where the extraction target is included in the visual field image and the case where it is not.

According to the present disclosure, information on the position of an object recognized by an entity other than the user can be generated appropriately, regardless of whether the object is included in the user's visual field.

FIG. 1 is a block diagram showing a display image generation device according to a first embodiment.
FIG. 2 is a diagram showing a user and a fellow passenger who are wearing terminals and riding together in a vehicle.
FIG. 3 is a plan view for explaining the user's visual field as seen from above the vehicle.
FIG. 4A is a diagram showing the surrounding situation corresponding to the visual field of user X, with a display image superimposed in a first display mode.
FIG. 4B is a diagram showing the display image display device on which a first display image is displayed.
FIG. 5A is a diagram showing the surrounding situation corresponding to the visual field of user X, with a display image superimposed in a second display mode.
FIG. 5B is a diagram showing the display image display device on which a second display image is displayed.
FIG. 6A is a diagram showing the surrounding situation corresponding to the visual field of user X, with a display image superimposed in a third display mode.
FIG. 6B is a diagram showing the display image display device on which a third display image is displayed.
FIG. 7 is a flowchart showing display image generation processing according to the first embodiment.
FIG. 8 is a block diagram showing a display image generation device according to a second embodiment.
FIG. 9 is a flowchart showing display image generation processing according to the second embodiment.
FIG. 10 is a block diagram showing a display image generation device according to a third embodiment.
FIG. 11 is a flowchart showing display image generation processing according to the third embodiment.

Hereinafter, exemplary embodiments of the present disclosure will be described with reference to the drawings. In the following description, identical or equivalent parts are given the same reference signs, and redundant description is omitted.

[First embodiment]

FIG. 1 is a block diagram showing a display image generation device 1A according to the first embodiment. FIG. 2 is a diagram showing user X and user Y, who are wearing terminals and riding together in a vehicle 2A. FIG. 3 is a plan view for explaining the visual field Ex of user X as seen from above the vehicle 2A. FIGS. 4A, 5A, and 6A are diagrams showing the surrounding situation corresponding to the visual field Ex of user X, with a display image superimposed in each display mode. FIGS. 4B, 5B, and 6B are diagrams showing the display image display device on which each display image is displayed. As shown in FIGS. 1 to 6, the display image generation device 1A is a device that identifies an object T included in an utterance made by a speaker (that is, mentioned in the utterance) as an extraction target Te, and generates a display image P relating to the extraction target Te.

More specifically, the display image generation device 1A is a device that generates a display image P to be displayed superimposed on the surrounding situation corresponding to the visual field Ex of user X. User X is riding in the vehicle 2A together with user Y (the speaker), who is a person, and is looking at, for example, the scenery outside the vehicle. User X is wearing a user terminal 3A. User Y is wearing a speaker terminal 4 (see FIG. 2). In this embodiment, the display image generation device 1A is described using a situation in which user Y speaks to user X as an example.

Here, "the visual field Ex of user X" means the visible region that user X can see. The "visible region" is the effective visual field, that is, the range from which a person can effectively obtain information from the outside world while using the eyes with a fixation point (gaze point) set near the center of the physiological visual field. For example, the visual field Ex of user X may be set to the entire region visible above, below, left, and right of the central axis of the visual field Ex. FIG. 3 shows the horizontally visible region of user X as seen from above the vehicle 2A. The visual field Ex of user X changes as the vehicle 2A moves. For example, in FIG. 3, the current position of user X is indicated as current position X1, and the position of user X after moving from the current position is indicated as moved position X2. In the following description, the visual field Ex of user X is assumed to be set to the entire region, above, below, left, and right, that user X can see through the transmissive display of the user terminal 3A (described later) while wearing the user terminal 3A and facing a given direction. The visual field Ey of user Y, like the visual field Ex of user X, means the visible region that user Y can see (see FIG. 2).

The "surrounding situation" means the actual scenery outside the vehicle (the outside view) in the region around user X that user X can see. The surrounding situation is, for example, the scenery outside the vehicle over a region spanning 360 degrees horizontally around the current position of user X, including the regions above and below user X. The "surrounding situation corresponding to the visual field Ex of user X" means the scenery outside the vehicle that is included in the visual field Ex of user X. In other words, it is the scenery outside the vehicle within the visual field Ex of user X. As shown in FIG. 3, the surrounding situation corresponding to the visual field Ex of user X changes as the vehicle 2A moves.

"Generating a display image" means generating image information to be displayed on a display or the like. When the image information generated by the display image generation device 1A is transmitted to a display or the like by wired or wireless communication, the display image P corresponding to the transmitted image information can be displayed on that display. The "display image P" is an image that displays information relating to the extraction target Te and, more specifically, an image that displays information on the position of the object T. Here, the display image P is displayed superimposed on the surrounding situation corresponding to the visual field Ex of user X. The display image P may be, for example, an image containing text indicating whether the extraction target Te is included in the visual field image, or an image of a rectangular frame displayed so that a specific extraction target Te included in the visual field image appears enclosed by the frame. Details are described later. Here, the "visual field image" is an image corresponding to the visual field Ex of user X. That is, the visual field image is an image capturing the surrounding situation corresponding to the visual field Ex of user X. In this embodiment, the surrounding situation corresponding to the visual field Ex of user X is the actual scenery outside the vehicle, and the visual field image is an image of that surrounding situation captured by an imaging device (the visual field image acquisition device 32).

The display image generation device 1A is configured as, for example, a server, and includes a processor (processing device), a memory (storage device), and the like.

The processor may be configured by, for example, a CPU (Central Processing Unit) or an MPU (Micro-Processing Unit). The memory may include at least one of a semiconductor storage device, a magnetic storage device, and an optical storage device. The memory may also include registers, cache memory, and a ROM (Read Only Memory) or RAM (Random Access Memory) used as main storage.

The display image generation device 1A, the vehicle 2A, the user terminal 3A, and the speaker terminal 4 are connected so that they can communicate (transmit and receive) with one another by wire or wirelessly. The functional configuration of the display image generation device 1A is described later.

The vehicle 2A is a passenger car or the like in which user X and user Y are riding. The vehicle 2A may be switchable between manual driving and automated driving, or may support only one of the two. The vehicle 2A includes a navigation device 21 and a surrounding imaging device 22. The navigation device 21 sets a travel route for the vehicle 2A to a set destination based on, for example, position information of the vehicle 2A detected by GPS (Global Positioning System) or the like and map information, and guides the vehicle 2A along that route. The navigation device 21 stores (holds) a history of the position of the vehicle 2A (for example, position coordinates detected by GPS) in chronological order. The navigation device 21 may obtain the travel direction of the vehicle 2A based on the stored position history.
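As a rough illustration of how a travel direction could be derived from a chronological position history, the following Python sketch computes the initial great-circle bearing between the two most recent fixes. The class name `PositionHistory`, the buffer size, and the choice of the bearing formula are assumptions made for this example; the source does not specify how the navigation device 21 computes the direction.

```python
import math
from collections import deque
from typing import Optional


class PositionHistory:
    """Hypothetical store of recent (latitude, longitude) fixes, oldest first."""

    def __init__(self, maxlen: int = 100) -> None:
        # maxlen is an illustrative cap, not a value taken from the source.
        self._points = deque(maxlen=maxlen)

    def record(self, lat_deg: float, lon_deg: float) -> None:
        """Append the newest fix to the history."""
        self._points.append((lat_deg, lon_deg))

    def heading_deg(self) -> Optional[float]:
        """Bearing from the previous fix to the latest fix, in degrees
        clockwise from north. Returns None if fewer than two fixes exist."""
        if len(self._points) < 2:
            return None
        (lat1, lon1), (lat2, lon2) = self._points[-2], self._points[-1]
        phi1, phi2 = math.radians(lat1), math.radians(lat2)
        dlon = math.radians(lon2 - lon1)
        # Standard initial-bearing formula on a sphere.
        y = math.sin(dlon) * math.cos(phi2)
        x = (math.cos(phi1) * math.sin(phi2)
             - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
        return math.degrees(math.atan2(y, x)) % 360.0
```

In practice a real device would probably smooth over more than two fixes, since consecutive GPS readings are noisy at low speed; the two-point version is kept here only to show the idea.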

The surrounding imaging device 22 is a device that images the surrounding situation of user X and acquires a surrounding image. The "surrounding image" is an image of a region around user X that includes the visual field Ex of user X (that is, the visual field image). The surrounding image of user X may be, for example, an image capturing a region spanning 360 degrees horizontally around user X, and may further include the region above user X. Alternatively, the surrounding image of user X may cover the region around user X excluding regions that are difficult for user X to see (for example, the region behind user X while seated in a seat of the vehicle 2A). Alternatively, it may cover the same region as the region corresponding to the visual field Ex of user X. The extent of the "region that includes the visual field Ex of user X" is not particularly limited as long as the region includes the visual field Ex of user X.

The surrounding imaging device 22 is configured by, for example, one or more cameras. The camera of the surrounding imaging device 22 may be provided outside the cabin, for example on the roof of the vehicle 2A, or inside the cabin, for example behind the windshield. The vehicle 2A transmits the surrounding image of user X captured by the surrounding imaging device 22 to the display image generation device 1A. "Transmitting an image" means transmitting the image data of the image.

The user terminal 3A is a device worn on the head of user X, and includes a display image display device 31A and a visual field image acquisition device 32. The display image display device 31A has a display capable of displaying the display image P generated by the display image generation device 1A. The display of the display image display device 31A is a transmissive display, for example of a glasses type or goggle type, and is positioned directly in front of the eyes of user X when user X is wearing the user terminal 3A. User X can therefore see the surrounding situation corresponding to the visual field Ex of user X through the display image display device 31A. When the display image P is displayed on the display image display device 31A, the display image P (see FIGS. 4B, 5B, and 6B) appears to user X to be superimposed on the surrounding situation corresponding to the visual field Ex of user X. In other words, the display image display device 31A functions as an HMD (Head Mounted Display) of the kind used in AR (Augmented Reality) technology.

The visual field image acquisition device 32 is an imaging device that images the surrounding situation corresponding to the visual field Ex of user X and acquires a visual field image. The visual field image acquisition device 32 is provided on the user terminal 3A so that it faces a direction in which it can image the line-of-sight direction of user X when user X is wearing the user terminal 3A. The visual field image acquisition device 32 is provided, for example, on the side of the display image display device 31A. The user terminal 3A transmits the visual field image captured by the visual field image acquisition device 32 to the display image generation device 1A. "Transmitting a visual field image" means transmitting the image data of the visual field image. In addition, the visual field image acquisition device 32 may include a sensor (not shown) that detects the line-of-sight direction of user X, and may transmit the line-of-sight direction detected by the sensor together with the image data of the visual field image.

The speaker terminal 4 is a device worn on the head of user Y, and includes an utterance data acquisition device 41. The utterance data acquisition device 41 is a device that acquires, as utterance data, an utterance made by user Y to user X. The utterance data acquisition device 41 is configured by, for example, a microphone. Here, the speaker terminal 4 is a headset, and the utterance data acquisition device 41 is a microphone provided on the headset. The utterance data acquisition device 41 may instead be an in-vehicle microphone or an earphone. The speaker terminal 4 may further include a display image display device 31A and a visual field image acquisition device 32 similar to those of the user terminal 3A. "Utterance data" is data that carries information about the content of an utterance; here, the utterance data is the utterance signal data of the utterance. "Utterance signal data" means the audio signal of the utterance. The utterance data also includes data in which user Y has said nothing.

The speaker terminal 4 transmits the utterance acquired by the utterance data acquisition device 41 to the display image generation device 1A. At this time, the speaker terminal 4 also transmits, to the display image generation device 1A, information identifying that the speaker terminal 4 is being worn by user Y (information identifying user Y). The "information identifying that the speaker terminal 4 is being worn by user Y" is information associated with user Y, and may be, for example, the ID (identification) number of the speaker terminal 4 associated with user Y. "Transmitting an utterance" means transmitting the utterance signal data (described in detail later) of the utterance.

Next, the functional configuration of the display image generation device 1A is described. The display image generation device 1A has a surrounding image acquisition unit 11, an utterance data acquisition unit 12, an object extraction unit 13, a visual field image acquisition unit 14A, an object determination unit 15A, an existence determination unit 16A, a positional relationship acquisition unit 17A, and a display image generation unit 18A.

The surrounding image acquisition unit 11 acquires and stores the surrounding images transmitted from the vehicle 2A. More specifically, the surrounding image acquisition unit 11 acquires the surrounding image of user X by receiving, from the vehicle 2A, the surrounding image of user X captured by the surrounding imaging device 22 of the vehicle 2A. The surrounding image acquisition unit 11 stores the acquired surrounding images of user X in chronological order. That is, the surrounding image acquisition unit 11 acquires the current surrounding image of user X and also stores (accumulates) the acquired surrounding images as past surrounding images. The surrounding image acquisition unit 11 may delete stored information on past surrounding images at a preset timing.
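The chronological storage and timed deletion described above can be sketched as a simple time-ordered buffer. This is only an illustration: the class name `SurroundingImageStore`, the age-based deletion rule, and the 60-second default are assumptions, since the source says only that past images may be deleted at a preset timing.

```python
from collections import deque
from typing import Any, Optional, Tuple


class SurroundingImageStore:
    """Hypothetical time-ordered store of surrounding images (oldest first)."""

    def __init__(self, max_age_s: float = 60.0) -> None:
        # max_age_s is an assumed retention period, not a value from the source.
        self.max_age_s = max_age_s
        self._frames = deque()  # each entry is (timestamp_s, image)

    def add(self, timestamp_s: float, image: Any) -> None:
        """Store the newest frame; frames are assumed to arrive in time order."""
        self._frames.append((timestamp_s, image))

    def prune(self, now_s: float) -> None:
        """Delete frames older than max_age_s relative to now_s."""
        while self._frames and now_s - self._frames[0][0] > self.max_age_s:
            self._frames.popleft()

    def latest(self) -> Optional[Tuple[float, Any]]:
        """Return the current (most recent) frame, or None if empty."""
        return self._frames[-1] if self._frames else None

    def __len__(self) -> int:
        return len(self._frames)
```

Calling `prune` from a periodic timer would correspond to deletion "at a preset timing"; other policies, such as a fixed frame count, would fit the description equally well.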

The utterance data acquisition unit 12 acquires utterance data of an utterance made by user Y to user X. More specifically, the utterance data acquisition unit 12 acquires this utterance data by receiving, from the speaker terminal 4, the utterance signal data of user Y's utterance acquired by the utterance data acquisition device 41 of the speaker terminal 4. The utterance data acquisition unit 12 also determines whether the utterance data contains an utterance by user Y. That is, when user Y has not spoken, it determines that the utterance data does not contain an utterance by user Y.

The utterance data acquisition unit 12 also acquires information identifying user Y, who made the utterance to user X. For example, the utterance data acquisition unit 12 receives the information identifying user Y from the speaker terminal 4.

Based on the utterance data acquired by the utterance data acquisition unit 12, the object extraction unit 13 extracts a character string that represents a pre-stored object T and is contained in the utterance. In detail, the object extraction unit 13 stores in advance character strings representing a plurality of objects T (object data). It compares these character strings with the character string obtained by converting the utterance data (itself a form of utterance data), and extracts, as the extraction target Te, the character string (data) in the converted utterance that matches a character string representing an object T. An "object T" is an object that actually exists. The object may be, for example, a type of object expressed by a common noun (bicycle, streetlight, building, and so on) or the name of an object expressed by a proper noun (Mt. Fuji, the National Diet Building, and so on). The object may also be qualified by an attribute, a feature, or the like (for example, a blue bicycle or the summit of Mt. Fuji). The object extraction unit 13 has a storage unit and stores in advance the common nouns, proper nouns, attributes, or features that represent objects T. The object extraction unit 13 extracts, from the utterance data acquired by the utterance data acquisition unit 12, a pre-stored common noun, proper noun, attribute, or feature representing an object T.

As an example, consider the case where user Y says to user X, "There's a bicycle over there." In this case, based on the utterance data of user Y's utterance acquired by the utterance data acquisition unit 12, the object extraction unit 13 extracts the extraction target Te from the utterance made by user Y. Here, the object extraction unit 13 is assumed to have stored in advance that the word "bicycle" represents a type of object T (a bicycle). The object extraction unit 13 extracts the extraction target Te "bicycle" from the utterance made by user Y. In some cases, no extraction target Te can be extracted from the content of user Y's utterance.
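The matching step in this example can be sketched in Python as a lookup of pre-stored object strings inside the transcribed utterance. The list `OBJECT_STRINGS`, the function name `extract_target`, the case-insensitive substring test, and the preference for the longest match (so that "blue bicycle" wins over "bicycle") are all assumptions for illustration; the source states only that stored strings are compared with the converted utterance and the matching string is extracted.

```python
from typing import Optional, Sequence

# Hypothetical pre-stored object data: common nouns, proper nouns,
# and attribute-qualified objects.
OBJECT_STRINGS = ["blue bicycle", "bicycle", "streetlight", "building", "Mt. Fuji"]


def extract_target(transcript: str,
                   object_strings: Sequence[str] = OBJECT_STRINGS) -> Optional[str]:
    """Return the stored object string found in the transcript, or None.

    When several stored strings match, the longest one is returned so that
    a qualified object is preferred over its bare noun (an assumed policy).
    """
    text = transcript.lower()
    matches = [s for s in object_strings if s.lower() in text]
    return max(matches, key=len) if matches else None


print(extract_target("There's a bicycle over there."))  # bicycle
print(extract_target("Nice weather today."))            # None
```

A plain substring test is deliberately naive: it would also match "bicycle" inside an unrelated longer word. A real system working on Japanese text would more likely tokenize the transcript first.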

The object extraction unit 13 extracts the character strings representing pre-stored objects T that are mentioned in the utterance by means of, for example, speech recognition. Any known speech recognition technique can be applied here. For example, the object extraction unit 13 uses speech recognition to convert the utterance audio signal in the utterance data into a character string, and extracts the extraction target Te by comparing the recognized character string with the character strings representing the plurality of objects T.

The visual field image acquisition unit 14A acquires an image that includes at least the visual field image, which is the image corresponding to the visual field Ex of user X. The "image that includes at least the visual field image" may cover the same range as the visual field image or a wider range than the visual field image. The visual field image acquisition unit 14A acquires the visual field image by receiving, from the visual field image acquisition device 32 of the user terminal 3A, the visual field image captured by that device. The visual field image acquisition unit 14A may also acquire information on the line-of-sight direction of user X from the visual field image acquisition device 32.

The object determination unit 15A determines whether the extraction target Te is included in the visual field image of user X's visual field Ex acquired by the visual field image acquisition unit 14A. As described above, the "extraction target Te" is the object, among the plurality of objects T stored in the object extraction unit 13, that matches the utterance data. Here, the object extraction unit 13 has extracted "bicycle" as the extraction target Te.

The object determination unit 15A determines whether the extraction target Te is included in the visual field image, for example by image recognition. Any known image recognition technique can be applied as the "image recognition". For example, the object determination unit 15A may apply, as the image recognition, a machine learning model, a deep learning model, and an image processing algorithm using OpenCV (Open Source Computer Vision Library) that can detect identification information such as the name, type, shape, color, and direction of objects contained in an image.

For example, the object determination unit 15A detects identification information of a plurality of objects included in the visual field image, and compares the data representing the extraction target Te acquired by the object extraction unit 13 (such as the object type) with the detected identification information. The object determination unit 15A then determines whether the extraction target Te is included in the visual field image based on whether the data representing the extraction target Te matches any of the objects in the visual field image in at least one of type and name. The object determination unit 15A may also use OCR (Optical Character Recognition) as the image recognition to recognize the text of signboards included in the visual field image, compare the data representing the name of the extraction target Te acquired by the object extraction unit 13 with the recognized signboard text, and determine whether the extraction target Te is included in the visual field image based on whether the name of the extraction target Te matches part of the text of at least one signboard in the visual field image.
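The two comparisons described above (type or name match, and partial match against signboard text) can be sketched as follows. The sketch assumes that an image recognizer and an OCR engine have already run; the `Detection` record and the function `target_in_view` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Detection:
    """Identification information for one detected object (hypothetical record)."""
    name: str
    kind: str


def target_in_view(target: str,
                   detections: Iterable[Detection],
                   sign_texts: Iterable[str] = ()) -> bool:
    """Return True if the target matches a detected name or type,
    or appears as part of the text of at least one signboard."""
    wanted = target.lower()
    for det in detections:
        if wanted in (det.name.lower(), det.kind.lower()):
            return True
    # Fallback: partial match against OCR-recognized signboard text.
    return any(wanted in text.lower() for text in sign_texts)


detections = [Detection(name="road bike", kind="bicycle"),
              Detection(name="sedan", kind="car")]
visible = target_in_view("bicycle", detections)
```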

The object determination unit 15A outputs information on the result of determining whether the extraction target Te is included in the visual field image to user X's display image display device 31A. The "determination result as to whether it is included in the visual field image" means information on whether the extraction target Te is visible to user X (that is, within user X's visual field Ex). Here, the object determination unit 15A outputs the determination result to user X's user terminal 3A. The determination result is also output to the speaker terminal 4 of user Y, who is the speaker.

When the object determination unit 15A determines that the extraction target Te is not included in the visual field image, the presence/absence determination unit 16A determines whether the extraction target Te exists within a preset target range. Specifically, the presence/absence determination unit 16A makes this determination based on the current or past peripheral images acquired by the peripheral image acquisition unit 11. The "target range" is a predetermined range set in advance around the position of user X or the vehicle 2A. For example, the target range may be a predetermined range, centered on the position of user X or the vehicle 2A, that is visible to user X (the range shown by the two-dot chain line in FIG. 3). This range may be a circular range of, for example, 50 kilometers from user X or the vehicle 2A, or a range of any non-circular shape. The target range may also be set, according to the size of the extraction target Te, to the range within which user X can see the extraction target Te. For example, if the extraction target Te is Mt. Fuji, the target range may be set to extend 300 kilometers from the position of user X or the vehicle 2A (the center). In this example, the target range is a radius of 300 kilometers from the center.

First, the presence/absence determination unit 16A determines whether the extraction target Te is included in the current and past peripheral images of user X that were acquired by the peripheral image acquisition unit 11 and stored in chronological order. More specifically, the presence/absence determination unit 16A acquires the current and past peripheral images stored by the peripheral image acquisition unit 11 and determines whether the extraction target Te is included in them. The presence/absence determination unit 16A may perform this determination by, for example, image recognition. Alternatively, it may have the object determination unit 15A perform image recognition on the plurality of images making up the stored current and past peripheral images, and perform the determination based on the result.

The presence/absence determination unit 16A also detects various objects in the plurality of images making up the current and past peripheral images stored by the peripheral image acquisition unit 11, detects identification information such as the name, type, shape, color, and direction of each object, assigns one or more image tags to the detected identification information, and generates and stores tagged images. The presence/absence determination unit 16A then determines whether the extraction target Te is included in the acquired current and past peripheral images, based on whether there is a peripheral image whose image tags contain an object name or type that matches the utterance data representing the extraction target Te acquired by the object extraction unit 13.
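A tag lookup of this kind can be sketched as follows. This is only an illustration under stated assumptions: each stored peripheral image is reduced to a `(timestamp, tags)` pair, and the helper name `latest_capture_with_target` is hypothetical. Returning the most recent matching timestamp is a design choice that also serves the later step in which the last image containing the extraction target Te is needed.

```python
from typing import Iterable, Optional, Set, Tuple

TaggedImage = Tuple[float, Set[str]]  # (capture time in seconds, image tags)


def latest_capture_with_target(tagged_images: Iterable[TaggedImage],
                               target: str) -> Optional[float]:
    """Return the capture time of the most recent image tagged with the target,
    or None if no stored image carries a matching name/type tag."""
    wanted = target.lower()
    hits = [ts for ts, tags in tagged_images
            if wanted in {tag.lower() for tag in tags}]
    return max(hits) if hits else None


history = [
    (1.0, {"car"}),
    (2.0, {"Bicycle", "tree"}),
    (3.0, {"bicycle"}),
    (4.0, {"signboard"}),
]
last_seen = latest_capture_with_target(history, "bicycle")
```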

When it is determined that the extraction target Te is not included in the acquired current and past peripheral images, the presence/absence determination unit 16A determines that the extraction target Te does not exist within the preset target range.

Next, when it is determined that the extraction target Te is included in the acquired current and past peripheral images, the presence/absence determination unit 16A determines whether the position of the extraction target Te is within the target range. When the extraction target Te is included in the current peripheral image, the presence/absence determination unit 16A can obtain the direction and distance from user X or the vehicle 2A to the extraction target Te by a known method. For example, the presence/absence determination unit 16A may estimate the direction and distance from user X or the vehicle 2A to the extraction target Te based on the current peripheral image of user X acquired by the peripheral image acquisition unit 11, and determine whether the position of the extraction target Te is within the target range. Alternatively, the presence/absence determination unit 16A may measure the direction and distance from user X or the vehicle 2A to the extraction target Te using RADAR (Radio Detection and Ranging), LIDAR (Light Detection and Ranging), or the like provided in the vehicle 2A (not shown), and determine whether the position of the extraction target Te is within the target range.

When the extraction target Te is not included in the current peripheral image, the presence/absence determination unit 16A acquires from the peripheral image acquisition unit 11 the chronologically last peripheral image that includes the extraction target Te. Next, using the actual vehicle position history acquired from the navigation device 21, the presence/absence determination unit 16A calculates the relative direction and distance between the current position of user X or the vehicle 2A and the position of user X or the vehicle 2A at the time that image was captured. Then, based on this relative direction and distance and on the relative direction and distance from user X or the vehicle 2A to the extraction target Te at the time of capture, the presence/absence determination unit 16A estimates the direction and distance from the current position of user X or the vehicle 2A to the extraction target Te. The presence/absence determination unit 16A may then determine whether the distance from user X or the vehicle 2A to the extraction target Te is within the target range.
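The estimate described above amounts to a vector sum, sketched below. The sketch makes assumptions that the source does not state: positions are expressed in a flat local frame in meters (x east, y north), bearings are measured clockwise from north, and the target range is circular. The function names are hypothetical.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]  # (east, north) in meters, hypothetical local frame


def relative_to_current(past_vehicle: Vec, current_vehicle: Vec,
                        target_offset_at_capture: Vec) -> Tuple[float, float]:
    """Estimate (distance, bearing) from the current position to the target.

    The target's position is the capture-time vehicle position plus the offset
    observed at capture; subtracting the current position gives the vector
    from the current position to the target.
    """
    target_x = past_vehicle[0] + target_offset_at_capture[0]
    target_y = past_vehicle[1] + target_offset_at_capture[1]
    dx = target_x - current_vehicle[0]
    dy = target_y - current_vehicle[1]
    distance = math.hypot(dx, dy)
    bearing_deg = math.degrees(math.atan2(dx, dy)) % 360.0  # clockwise from north
    return distance, bearing_deg


def within_target_range(distance: float, radius: float) -> bool:
    """Circular target range check (one of the shapes the text allows)."""
    return distance <= radius


# The vehicle saw the target 10 m ahead, then drove 30 m north past it.
dist, bearing = relative_to_current((0.0, 0.0), (0.0, 30.0), (0.0, 10.0))
```

In this usage the target ends up 20 m directly behind the vehicle, so the bearing is due south.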

The positional relationship acquisition unit 17A acquires the relative positional relationship between the extraction target Te and user X. The "positional relationship" may be expressed as the direction and distance of the extraction target Te relative to a reference position set at the position of user X or near user X (for example, the center of the vehicle 2A), or as information indicating that the extraction target Te does not exist within the preset target range. The positional relationship acquisition unit 17A may acquire the direction and distance from user X or the vehicle 2A to the extraction target Te from the presence/absence determination unit 16A. It may also estimate that direction and distance based on the current or past peripheral images acquired by the peripheral image acquisition unit 11, or by using a radar, lidar, or the like provided in the vehicle 2A. In addition, the positional relationship acquisition unit 17A may acquire, from the presence/absence determination unit 16A, information indicating that the extraction target Te does not exist within the preset target range.

The positional relationship acquisition unit 17A calculates the direction of the extraction target Te relative to user X's line-of-sight direction. The positional relationship acquisition unit 17A may estimate user X's line-of-sight direction based on the visual field image of user X acquired from the visual field image acquisition unit 14A and the peripheral image of user X acquired from the peripheral image acquisition unit 11. Alternatively, it may acquire user X's line-of-sight direction from the visual field image acquisition unit 14A. Based on the calculated direction from user X or the vehicle 2A to the extraction target Te and on user X's line-of-sight direction, the positional relationship acquisition unit 17A estimates the direction of the extraction target Te relative to user X's line-of-sight direction. This relative direction may be limited to two categories: rear left and rear right of the line-of-sight direction.
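The relative direction can be sketched as a signed angle difference. The following is a hedged illustration: bearings in degrees clockwise from north are an assumed convention, and the sign-based split into the two categories (including the treatment of a target exactly on the gaze axis as "rear right") is a hypothetical simplification.

```python
def signed_offset_deg(target_bearing_deg: float, gaze_bearing_deg: float) -> float:
    """Angle from the gaze direction to the target, normalized to [-180, 180).

    Positive values mean the target lies clockwise (to the right) of the gaze.
    """
    return (target_bearing_deg - gaze_bearing_deg + 180.0) % 360.0 - 180.0


def rear_side(offset_deg: float) -> str:
    """Classify an out-of-view target into the two categories named in the text."""
    return "rear right" if offset_deg >= 0.0 else "rear left"


# Gaze due north (0 deg), target at bearing 200 deg: behind and to the left.
side = rear_side(signed_offset_deg(200.0, 0.0))
```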

The display image generation unit 18A acquires extraction target information and generates a display image P including that information. "Extraction target information" means information regarding the position of the extraction target Te. It may be information indicating the position of the extraction target Te itself, information indicating the direction or distance of the extraction target Te, or information indicating whether the extraction target Te exists within a predetermined area.

The display image generation unit 18A determines the display mode of the display image P for the extraction target Te based on the determination result of the object determination unit 15A. The "display mode" is the manner in which the image showing the extraction target information is displayed. It may be an image showing the position of the extraction target Te itself, an image showing the distance and direction of the extraction target Te as seen from the user, or an image showing whether the extraction target Te exists within a predetermined area.

When the object determination unit 15A determines that the extraction target Te is included in the visual field image, the display image generation unit 18A acquires the visual field image of the visual field Ex from the visual field image acquisition unit 14A, recognizes the extraction target Te in the visual field image, and generates a first display image P1 that shows the extraction target information in a display mode that is superimposed on the extraction target Te and emphasizes the extraction target Te itself. The "display mode that emphasizes the extraction target itself" may be, for example, a display mode in which the extraction target Te is enclosed in a rectangle, circle, or the like, or one in which an arrow points directly at the extraction target Te (see FIG. 4).

When the object determination unit 15A determines that the extraction target Te is not included in the visual field image, the display image generation unit 18A determines the display mode of the extraction target information based on whether the presence/absence determination unit 16A has determined that the extraction target Te exists within the target range. More specifically, when the presence/absence determination unit 16A determines that the extraction target Te exists within the target range, the display image generation unit 18A generates a second display image P2 that shows the extraction target information in a display mode displaying the positional relationship, including the direction and distance of the extraction target Te relative to the reference position (see FIG. 5). The "display mode displaying the positional relationship" is a display mode of an image showing the direction and distance of the extraction target Te relative to the reference position. The display image generation unit 18A acquires, via the positional relationship acquisition unit 17A, positional relationship information including the direction and distance of the extraction target Te relative to the reference position, and generates the second display image P2 displaying that positional relationship. For example, when the extraction target Te is located to the rear left of user X's visual field Ex, a symbol image indicating the rear left of user X's visual field Ex and an image indicating the distance are generated and displayed on the left side of the visual field image, as shown in FIG. 5.

When the presence/absence determination unit 16A determines that the extraction target Te does not exist within the target range, the display image generation unit 18A generates a third display image P3 showing information that the extraction target Te does not exist within the preset target range (see FIG. 6).

The display image generation unit 18A generates a display image P that includes the information identifying the speaker acquired by the utterance data acquisition unit 12. For example, when the speaker identified by the utterance data acquisition unit 12 is user Y, the display image generation unit 18A may generate first to third display images P1 to P3 containing the text "Mentioned by Y." (see FIGS. 4 to 6).

Based on the object determination unit 15A's determination of whether the extraction target Te is included in the visual field image, the display image generation unit 18A generates a display image P that includes information indicating whether the extraction target Te is visible to user X (see FIGS. 4 to 6). More specifically, when the object determination unit 15A determines that the extraction target Te is included in the visual field image, the display image generation unit 18A generates a first display image P1 including information indicating that the extraction target Te is visible to user X. When the object determination unit 15A determines that the extraction target Te is not included in the visual field image, the display image generation unit 18A generates a second display image P2 or a third display image P3 including information indicating that the extraction target Te is not visible to user X. For example, in the former case the display image generation unit 18A may generate a first display image P1 containing the text "Bicycle is visible now." (see FIG. 4). In the latter case it may generate a second display image P2 or third display image P3 containing the text "Bicycle is invisible now." (see FIGS. 5 and 6).
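How the three display images differ in their text content can be sketched as follows. This is an illustrative assumption about structure only: the function `build_captions`, its return format, and the wording of the direction and distance line are hypothetical, while the "visible", "invisible", and "Mentioned by" strings follow the examples quoted from FIGS. 4 to 6.

```python
from typing import List, Optional, Tuple


def build_captions(target_name: str, speaker: str, in_view: bool, in_range: bool,
                   direction: Optional[str] = None,
                   distance_m: Optional[int] = None) -> Tuple[str, List[str]]:
    """Return (display image label, caption lines) for the P1/P2/P3 cases."""
    lines: List[str] = []
    if in_view:
        label = "P1"  # target emphasized directly in the visual field image
        lines.append(f"{target_name} is visible now.")
    else:
        lines.append(f"{target_name} is invisible now.")
        if in_range:
            label = "P2"  # direction symbol and distance are shown
            lines.append(f"{direction}, {distance_m} m")
        else:
            label = "P3"  # no direction or distance is shown
    lines.append(f"Mentioned by {speaker}.")
    return label, lines


label, captions = build_captions("Bicycle", "Y", in_view=False, in_range=True,
                                 direction="rear left", distance_m=20)
```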

Next, the display image generation processing executed by the display image generation device 1A will be described. FIG. 7 is a flowchart showing the display image generation processing. The processing shown in the flowchart of FIG. 7 is started, for example, when the vehicle 2A is started.

As shown in FIG. 7, in step S101, the display image generation device 1A acquires a peripheral image of user X using the peripheral image acquisition unit 11. The peripheral image acquisition unit 11 acquires the peripheral image captured by the peripheral imaging device 22 of the vehicle 2A. The display image generation device 1A then proceeds to step S102.

In step S102, the display image generation device 1A uses the utterance data acquisition unit 12 to acquire utterance data of the utterance made by user Y (the speaker) to user X. The utterance data acquisition unit 12 acquires this utterance data from the utterance data acquisition device 41 of the speaker terminal 4. As described above, the utterance data may also be data in which user Y has said nothing. In addition, the utterance data acquisition unit 12 acquires information identifying the fellow passenger Y and transmits it to the display image generation device 1A. The process then proceeds to step S103.

In step S103, the display image generation device 1A uses the utterance data acquisition unit 12 to determine whether the utterance data includes an utterance by user Y (the speaker). If it is determined that an utterance by user Y is included, the process proceeds to step S104. If it is determined that no utterance by user Y is included, the process ends.

In step S104, the display image generation device 1A uses the object extraction unit 13 to determine whether an extraction target Te matching one of the objects T can be extracted from the utterance data. If it is determined that the extraction target Te can be extracted, the process proceeds to step S105. If it is determined that the extraction target Te cannot be extracted, the process ends.

In step S105, the display image generation device 1A acquires the visual field image of user X using the visual field image acquisition unit 14A. The visual field image acquisition unit 14A acquires the visual field image of user X from the visual field image acquisition device 32 of the user terminal 3A worn by user X. The process then proceeds to step S106.

In step S106, the display image generation device 1A uses the object determination unit 15A to determine whether the extraction target Te extracted by the object extraction unit 13 is included in the visual field image of user X acquired by the visual field image acquisition unit 14A. If it is determined that the extraction target Te is included in the visual field image of user X, the process proceeds to step S107. If it is determined that the extraction target Te is not included in the visual field image of user X, the process proceeds to step S108.

When it is determined that the extraction target Te is included in the visual field image of user X, in step S107 the display image generation device 1A uses the display image generation unit 18A to generate a first display image P1 that emphasizes the extraction target Te itself. The display image generation unit 18A acquires the visual field image of the visual field Ex from the visual field image acquisition unit 14A, recognizes the extraction target Te in the visual field image, and generates the first display image P1, which shows the extraction target information in a first display mode that is superimposed on the visual field image and emphasizes the extraction target Te itself (see FIG. 4). The display image generation unit 18A may generate the first display image P1 so that it further includes information indicating that the extraction target Te is visible to user X ("Bicycle is visible now." in FIG. 4) and the information identifying the speaker acquired by the utterance data acquisition unit 12 ("Mentioned by Y." in FIG. 4). The display image generation unit 18A transmits the generated first display image P1 to the display image display device 31A of the user terminal 3A.

When it is determined that the extraction target Te is not included in the visual field image of user X, in step S108 the display image generation device 1A uses the presence/absence determination unit 16A to determine, based on the current or past peripheral images acquired by the peripheral image acquisition unit 11, whether the extraction target Te exists within the target range. If the presence/absence determination unit 16A determines that the extraction target Te does not exist within the target range, the process proceeds to step S111. If it determines that the extraction target Te exists within the target range, the process proceeds to step S109.

When it is determined that the position of the extraction target Te is within the target range, in step S109 the display image generation device 1A uses the positional relationship acquisition unit 17A to acquire the positional relationship between the extraction target Te and user X. Based on the current or past peripheral images of user X acquired from the peripheral image acquisition unit 11, the positional relationship acquisition unit 17A estimates the distance from the extraction target Te to user X or the vehicle 2A and its direction relative to user X's visual field Ex. The positional relationship acquisition unit 17A may instead acquire the distance from the extraction target Te to user X or the vehicle 2A from the presence/absence determination unit 16A. The process then proceeds to step S110.

In step S110, the display image generation device 1A uses the display image generation unit 18A to generate a second display image P2 that displays the positional relationship acquired from the positional relationship acquisition unit 17A, including the direction and distance of the extraction target Te relative to the reference position. The display image generation unit 18A generates the second display image P2, which shows the extraction target information in a second display mode displaying a symbol image indicating the direction relative to user X's visual field Ex (the arrow in FIG. 5) and the distance ("20 m" in FIG. 5), both acquired from the positional relationship acquisition unit 17A. The display image generation unit 18A may generate the second display image P2 so that it also includes information indicating that the extraction target Te is not visible to user X ("Bicycle is invisible now." in FIG. 5) and the information identifying the speaker acquired by the utterance data acquisition unit 12 ("Mentioned by Y." in FIG. 5). The display image generation unit 18A transmits the generated second display image P2 to the display image display device 31A of the user terminal 3A.

抽出対象物Ｔｅが存在する位置が対象範囲内ではないと判定された場合には、ステップＳ１１１において、表示画像生成装置１Ａは、位置関係取得部１７Ａにより、抽出対象物ＴｅとユーザＸとの位置関係を取得する。具体的には、位置関係取得部１７Ａは、存否判定部１６Ａから抽出対象物Ｔｅが予め設定された対象範囲内に存在しない位置関係情報を取得する。その後、ステップＳ１１２に進む。 If it is determined that the position of the extraction target Te is not within the target range, then in step S111 the display image generation device 1A uses the positional relationship acquisition unit 17A to acquire the positional relationship between the extraction target Te and the user X. Specifically, the positional relationship acquisition unit 17A acquires, from the presence/absence determination unit 16A, positional relationship information indicating that the extraction target Te does not exist within the preset target range. The process then proceeds to step S112.

ステップＳ１１２において、表示画像生成装置１Ａは、位置関係取得部１７Ａから取得された抽出対象物Ｔｅが予め設定された対象範囲内に存在しないという抽出対象物ＴｅとユーザＸとの位置関係を表示する第３表示画像Ｐ３を生成する。表示画像生成部１８Ａは、視野画像から抽出対象物ＴｅがユーザＸにより視認不可能であることを示す情報（図６の「Bicycle is invisible now.」）及び発言データ取得部１２により取得された発言主体を特定する情報（図６の「Mentioned by Y.」）を含む第３表示画像Ｐ３を生成する。なお、抽出対象物Ｔｅの位置の方向及び距離を含む位置関係（第２表示態様に係る位置関係）は表示されない。表示画像生成部１８Ａは、生成した第３表示画像Ｐ３をユーザ用端末３Ａの表示画像表示装置３１Ａに送信する。 In step S112, the display image generation device 1A generates a third display image P3 that displays the positional relationship between the extraction target Te and the user X acquired from the positional relationship acquisition unit 17A, namely that the extraction target Te does not exist within the preset target range. The display image generation unit 18A generates the third display image P3 including information indicating, based on the visual field image, that the extraction target Te cannot be seen by the user X ("Bicycle is invisible now." in FIG. 6) and information identifying the speaker acquired by the statement data acquisition unit 12 ("Mentioned by Y." in FIG. 6). The positional relationship including the direction and distance of the position of the extraction target Te (the positional relationship of the second display mode) is not displayed. The display image generation unit 18A transmits the generated third display image P3 to the display image display device 31A of the user terminal 3A.

表示画像生成装置１Ａは、表示画像生成部１８Ａの上述した処理が終了すると、今回の処理を終了して、再びステップＳ１０１から表示画像生成処理を繰り返す。 When the display image generation unit 18A completes the above-described processing, the display image generation device 1A ends the current processing and repeats the display image generation processing from step S101 again.
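
The branching among the first to third display images described in steps S107 to S112 can be sketched as follows. This is a minimal illustration only, not the implementation of the disclosed device: the names `DisplayMode`, `select_display_mode`, and `overlay_lines` are hypothetical, and the text shown for the first display image P1 is an assumption. The "invisible now" and "Mentioned by" strings follow the wording quoted from FIGS. 5 and 6.

```python
from enum import Enum
from typing import List, Optional


class DisplayMode(Enum):
    P1_HIGHLIGHT = "first display image P1"
    P2_DIRECTION_DISTANCE = "second display image P2"
    P3_NOT_IN_RANGE = "third display image P3"


def select_display_mode(in_view: bool, in_target_range: bool) -> DisplayMode:
    """Pick a display mode from the two determination results."""
    if in_view:
        # Target is in the visual field image: emphasize the target itself.
        return DisplayMode.P1_HIGHLIGHT
    if in_target_range:
        # Not in view but within the preset range: show direction and distance.
        return DisplayMode.P2_DIRECTION_DISTANCE
    # Not in view and outside the preset range: no direction or distance.
    return DisplayMode.P3_NOT_IN_RANGE


def overlay_lines(mode: DisplayMode, label: str, speaker: str,
                  distance_m: Optional[float] = None) -> List[str]:
    """Build the text lines for an overlay (the P1 text is an assumption)."""
    mentioned = f"Mentioned by {speaker}."
    if mode is DisplayMode.P1_HIGHLIGHT:
        return [mentioned]
    invisible = f"{label} is invisible now."
    if mode is DisplayMode.P2_DIRECTION_DISTANCE:
        if distance_m is None:
            raise ValueError("P2 requires a distance")
        return [invisible, f"{distance_m:.0f} m", mentioned]
    return [invisible, mentioned]
```

For example, a target that is out of view but within range selects `P2_DIRECTION_DISTANCE`, and `overlay_lines` then includes the distance line alongside the two messages.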

上記のとおり、本実施形態では、発言主体により発せられた発言に含まれる対象物Ｔを抽出対象物Ｔｅとして特定し、当該抽出対象物Ｔｅに関する表示画像Ｐを生成する表示画像生成装置１Ａを開示する。表示画像生成装置１Ａは、発言データ取得部１２と、対象物抽出部１３と、視野画像取得部１４Ａと、対象物判定部１５Ａと、表示画像生成部１８Ａと、を備える。発言データ取得部１２は、発言主体であるユーザＹによりユーザＸに対して発せられた発言の発言データを取得する。対象物抽出部１３は、予め複数の対象物データ（文字列）を記憶し、複数の対象物データと発言データ取得部１２により取得された発言データ（文字列）とを対比して、発言データのうち対象物データと一致するデータを抽出対象物Ｔｅとして抽出する。視野画像取得部１４Ａは、ユーザＸの視野画像を少なくとも含む画像を取得する。対象物判定部１５Ａは、対象物抽出部１３により抽出された抽出対象物Ｔｅが視野画像に含まれるか否かを判定する。表示画像生成部１８Ａは、抽出対象物Ｔｅの位置に関する情報である抽出対象物情報を取得し、視野画像とは異なる当該抽出対象物情報を含む表示画像Ｐを生成する。更に、表示画像生成部１８Ａは、対象物判定部１５Ａによる抽出対象物Ｔｅが視野画像に含まれるか否かの判定結果に基づいて、抽出対象物Ｔｅに関する表示画像Ｐの表示態様を決定する。 As described above, this embodiment discloses a display image generation device 1A that identifies an object T included in a statement made by a speaker as an extraction target Te and generates a display image P relating to the extraction target Te. The display image generation device 1A includes a statement data acquisition unit 12, a target object extraction unit 13, a visual field image acquisition unit 14A, a target object determination unit 15A, and a display image generation unit 18A. The statement data acquisition unit 12 acquires statement data of a statement made to the user X by the user Y, who is the speaker. The target object extraction unit 13 stores a plurality of object data items (character strings) in advance, compares them with the statement data (character string) acquired by the statement data acquisition unit 12, and extracts the portion of the statement data that matches an object data item as the extraction target Te. The visual field image acquisition unit 14A acquires an image that includes at least the visual field image of the user X. The target object determination unit 15A determines whether the extraction target Te extracted by the target object extraction unit 13 is included in the visual field image. The display image generation unit 18A acquires extraction target information, which is information on the position of the extraction target Te, and generates a display image P that includes the extraction target information, which is distinct from the visual field image. Furthermore, the display image generation unit 18A determines the display mode of the display image P relating to the extraction target Te based on the determination by the target object determination unit 15A as to whether the extraction target Te is included in the visual field image.
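
The comparison performed by the target object extraction unit 13, matching stored object strings against the statement text, might look like the sketch below. It assumes the statement has already been converted to text and uses a simple case-insensitive substring match. The function name `extract_targets` and the sample object list are hypothetical, and the description does not specify the matching rule beyond comparing character strings.

```python
from typing import List


def extract_targets(statement: str, object_names: List[str]) -> List[str]:
    """Return every stored object name that appears in the statement text."""
    lowered = statement.lower()
    return [name for name in object_names if name.lower() in lowered]


# Hypothetical pre-stored object data (character strings).
STORED_OBJECTS = ["bicycle", "convenience store", "traffic light"]

targets = extract_targets("There's a bicycle over there.", STORED_OBJECTS)
```

A substring match is the simplest reading of "data that matches the object data". A real system would likely need tokenization or morphological analysis, particularly for Japanese input.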

この結果、表示画像生成装置１Ａは、発言データ取得部１２と対象物抽出部１３によりユーザＸ以外の主体（ユーザＹ）により認識されている抽出対象物Ｔｅを特定することができる。表示画像生成装置１Ａは、視野画像取得部１４Ａと対象物判定部１５Ａにより、抽出対象物Ｔｅが視野画像に含まれるか否かの判定結果を得ることができる。そして、表示画像生成部１８Ａは、対象物判定部１５Ａの判定結果に基づいて、抽出対象物Ｔｅに関する表示画像Ｐの表示態様を決定する。これにより、表示画像生成装置１Ａは、ユーザＸ以外の主体によって認識されている抽出対象物ＴｅがユーザＸの視野Ｅｘ内に含まれているか否かにかかわらず、当該抽出対象物Ｔｅの位置に関する情報を適切に生成することができる（図４～図６）。 As a result, the display image generation device 1A can specify the extraction target Te recognized by a subject other than the user X (user Y) using the statement data acquisition unit 12 and the target object extraction unit 13. The display image generation device 1A can obtain a determination result as to whether the extraction target Te is included in the visual field image using the visual field image acquisition unit 14A and the target object determining unit 15A. Then, the display image generation unit 18A determines the display mode of the display image P regarding the extraction target Te based on the determination result of the target object determination unit 15A. As a result, the display image generation device 1A can determine the position of the extraction target Te, regardless of whether the extraction target Te recognized by a subject other than the user X is included in the user X's field of view Ex. Information can be appropriately generated (FIGS. 4 to 6).

また、上記した実施形態においては、表示画像生成部１８Ａは、抽出対象物Ｔｅが視野画像に含まれると対象物判定部１５Ａにより判定された場合に、抽出対象物Ｔｅそのものを強調する表示態様で抽出対象物情報を示した第１表示画像Ｐ１を生成する。この結果、表示画像生成装置１Ａは、対象物判定部１５Ａにより抽出対象物ＴｅがユーザＸの視野画像に含まれると判定された場合には、ユーザＸが抽出対象物Ｔｅを特定することができる（図７のＳ１０７）。 Further, in the embodiment described above, the display image generation unit 18A uses a display mode that emphasizes the extraction target Te itself when the target object determination unit 15A determines that the extraction target Te is included in the visual field image. A first display image P1 showing extraction target object information is generated. As a result, if the target object determination unit 15A determines that the extraction target Te is included in the visual field image of the user X, the display image generation device 1A allows the user X to specify the extraction target Te. (S107 in FIG. 7).

また、上記した実施形態においては、表示画像生成装置１Ａは、抽出対象物ＴｅとユーザＸとの相対的な位置関係を取得する位置関係取得部１７Ａを備える。表示画像生成部１８Ａは、抽出対象物Ｔｅが視野画像に含まれないと対象物判定部１５Ａにより判定された場合に、位置関係を表示する表示態様で抽出対象物情報を示した第２表示画像Ｐ２を生成する。この結果、表示画像生成装置１Ａは、対象物判定部１５Ａにより抽出対象物Ｔｅが視野画像に含まれないと判定された場合に、位置関係取得部１７Ａにより抽出対象物ＴｅとユーザＸとの相対的な位置関係を取得する。表示画像生成装置１Ａは、取得された位置関係を表示する表示態様で抽出対象物情報を示した第２表示画像Ｐ２を生成する。これにより、表示画像生成装置１Ａは、対象物ＴがユーザＸの視野Ｅｘ内に含まれていないときでも、抽出対象物Ｔｅの位置に関する情報を適切に生成することができる。 Furthermore, in the embodiment described above, the display image generation device 1A includes a positional relationship acquisition unit 17A that acquires the relative positional relationship between the extraction target Te and the user X. When the object determination section 15A determines that the extraction object Te is not included in the visual field image, the display image generation section 18A generates a second display image showing the extraction object information in a display mode that displays the positional relationship. Generate P2. As a result, when the display image generation device 1A determines that the extraction target Te is not included in the visual field image by the target object determination unit 15A, the positional relationship acquisition unit 17A determines the relative relationship between the extraction target Te and the user X. Get the positional relationship. The display image generation device 1A generates a second display image P2 showing extraction target object information in a display mode that displays the acquired positional relationship. Thereby, the display image generation device 1A can appropriately generate information regarding the position of the extraction target Te even when the target T is not included in the user's X visual field Ex.

また、上記した実施形態においては、表示画像生成部１８Ａは、対象物判定部１５Ａによる抽出対象物Ｔｅが視野画像に含まれるか否かの判定結果に基づいて、抽出対象物ＴｅがユーザＸにより視認可能であるか否かを示す情報を含む表示画像Ｐ（第１表示画像Ｐ１～第３表示画像Ｐ３）を生成する。この結果、表示画像生成装置１Ａは、対象物判定部１５Ａの判定結果に基づいて、視野画像から抽出対象物ＴｅがユーザＸにより視認可能であるか否かを示す情報を含む表示画像Ｐを生成する。これにより、表示画像生成装置１Ａは、ユーザＸは抽出対象物Ｔｅが視認可能か否か情報を簡単に把握することができる。 In the embodiment described above, the display image generation unit 18A generates a display image P (first display image P1 to third display image P3) that includes information indicating whether the extraction target Te can be seen by the user X, based on the determination by the target object determination unit 15A as to whether the extraction target Te is included in the visual field image. As a result, the display image generation device 1A generates, based on the determination result of the target object determination unit 15A, a display image P that includes information indicating whether the extraction target Te can be seen by the user X in the visual field image. This allows the user X to easily tell whether the extraction target Te is visible.

また、上記した実施形態においては、表示画像生成装置１Ａは、抽出対象物Ｔｅが視野画像に含まれないと対象物判定部１５Ａにより判定された場合に、抽出対象物Ｔｅが予め設定された対象範囲内に存在するか否かを判定する存否判定部１６Ａを備える。表示画像生成部１８Ａは、抽出対象物Ｔｅが対象範囲内に存在するか否かの判定結果に基づいて、抽出対象物情報の表示態様を決定する。この結果、表示画像生成装置１Ａは、存否判定部１６Ａの判定結果に基づいて、抽出対象物情報の表示態様を決定することにより、抽出対象物ＴｅがユーザＸにより視認可能であるか否かを示す情報を含む表示画像Ｐ（第２表示画像Ｐ２，第３表示画像Ｐ３）を生成する。これにより、表示画像生成装置１Ａは、抽出対象物Ｔｅが対象範囲に存在するか否かにかかわらず、当該抽出対象物Ｔｅの位置に関する情報を適切に生成することができる。 In the embodiment described above, the display image generation device 1A includes a presence/absence determination unit 16A that determines whether the extraction target Te exists within a preset target range when the target object determination unit 15A determines that the extraction target Te is not included in the visual field image. The display image generation unit 18A determines the display mode of the extraction target information based on the determination of whether the extraction target Te exists within the target range. As a result, by determining the display mode of the extraction target information based on the determination result of the presence/absence determination unit 16A, the display image generation device 1A generates a display image P (second display image P2, third display image P3) that includes information indicating whether the extraction target Te can be seen by the user X. The display image generation device 1A can thus appropriately generate information on the position of the extraction target Te regardless of whether the extraction target Te exists within the target range.

また、上記した実施形態においては、表示画像生成装置１Ａは、周辺画像を取得して、取得した周辺画像を記憶する周辺画像取得部１１を備える。存否判定部１６Ａは、周辺画像取得部１１により取得された現在または過去の周辺画像に基づいて、抽出対象物Ｔｅが対象範囲内に存在するか否かを判定する。この結果、存否判定部１６Ａは、取得された現在または過去の周辺画像に基づいて、抽出対象物Ｔｅが対象範囲内に存在するか否かをより詳細に判定することができる。 Further, in the embodiment described above, the display image generation device 1A includes the peripheral image acquisition unit 11 that acquires peripheral images and stores the acquired peripheral images. The presence/absence determination unit 16A determines whether or not the extraction target Te exists within the target range based on the current or past peripheral images acquired by the peripheral image acquisition unit 11. As a result, the presence/absence determining unit 16A can determine in more detail whether or not the extraction target Te exists within the target range based on the acquired current or past surrounding images.

また、上記した実施形態においては、表示画像生成装置１Ａは、抽出対象物ＴｅとユーザＸとの相対的な位置関係を取得する位置関係取得部１７Ａを備える。表示画像生成部１８Ａは、抽出対象物Ｔｅが対象範囲内に存在すると存否判定部１６Ａにより判定された場合に、基準位置を基準として抽出対象物Ｔｅの位置の方向及び距離を含む位置関係を表示する表示態様で抽出対象物情報を示した表示画像Ｐ（第２表示画像Ｐ２，第３表示画像Ｐ３）を生成する。この結果、表示画像生成装置１Ａは、存否判定部１６Ａにより抽出対象物Ｔｅが対象範囲内に存在すると判定された場合に、位置関係取得部１７Ａにより基準位置を基準として抽出対象物Ｔｅの位置の方向及び距離を取得する。次に、表示画像生成装置１Ａは、基準位置を基準として抽出対象物Ｔｅの位置の方向及び距離を含む位置関係情報を生成することができる。これにより、表示画像生成装置１Ａは、存否判定部１６Ａにより抽出対象物Ｔｅが対象範囲内に存在すると判定された場合には、ユーザＸは抽出対象物Ｔｅの位置関係を把握することができる。 Furthermore, in the embodiment described above, the display image generation device 1A includes a positional relationship acquisition unit 17A that acquires the relative positional relationship between the extraction target Te and the user X. When the presence/absence determination unit 16A determines that the extraction target Te exists within the target range, the display image generation unit 18A displays the positional relationship including the direction and distance of the extraction target Te with the reference position as a reference. A display image P (second display image P2, third display image P3) showing extraction target object information in a display mode is generated. As a result, when the presence/absence determination unit 16A determines that the extraction target Te exists within the target range, the display image generation device 1A uses the positional relationship acquisition unit 17A to determine the position of the extraction target Te based on the reference position. Get direction and distance. Next, the display image generation device 1A can generate positional relationship information including the direction and distance of the position of the extraction target Te with reference to the reference position. Thereby, in the display image generation device 1A, when the presence/absence determining unit 16A determines that the extraction target Te exists within the target range, the user X can grasp the positional relationship of the extraction target Te.

また、上記した実施形態においては、発言主体は人（ユーザＹ）であり、発言データは、発言の発言信号データである。この結果、表示画像生成装置１Ａは、人である発言主体から発言の発言信号データを取得することができる。これにより、表示画像生成装置１Ａは、発言主体が人であっても、ユーザＸ以外の主体によって認識されている抽出対象物ＴｅがユーザＸの視野内に含まれているか否かにかかわらず、当該抽出対象物Ｔｅの位置に関する情報を適切に生成することができる。 In the embodiment described above, the speaker is a person (user Y), and the statement data is statement signal data of the statement. As a result, the display image generation device 1A can acquire the statement signal data of a statement from a speaker who is a person. Thus, even when the speaker is a person, the display image generation device 1A can appropriately generate information on the position of the extraction target Te recognized by a subject other than the user X, regardless of whether the extraction target Te is within the visual field of the user X.

また、上記した実施形態においては、対象物判定部１５Ａは、抽出対象物ＴｅがユーザＸの視野画像に含まれるか否かの判定結果の情報を発言主体のユーザＹに出力する。この結果、表示画像生成装置１Ａは、対象物判定部１５Ａにより抽出対象物ＴｅがユーザＸの視野画像に含まれるか否かの判定結果を発言主体のユーザＹに出力することにより、発言主体は、ユーザＸが対象物を視認できるか否かの情報を取得することができ、ユーザＸが対象物を視認できるか否かに応じて話題の進み方を決めることができる。 In the embodiment described above, the target object determination unit 15A outputs information on the result of determining whether the extraction target Te is included in the visual field image of the user X to the user Y, who is the speaker. As a result, because the display image generation device 1A has the target object determination unit 15A output this determination result to the user Y, the speaker can learn whether the user X can see the object and can decide how to proceed with the conversation accordingly.

また、上記した実施形態においては、発言データ取得部１２は、ユーザＸに対して発言を発したユーザＹを特定する情報を取得する。表示画像生成部１８Ａは、発言データ取得部１２により取得されたユーザＹを特定する情報を含む表示画像Ｐ（第１表示画像Ｐ１～第３表示画像Ｐ３）を生成する。この結果、表示画像生成装置１Ａは、発言データ取得部１２によりユーザＹを特定する情報を取得し、表示画像生成部１８ＡによりユーザＹを特定する情報を含む表示画像Ｐを生成することができる。これにより、ユーザＸがユーザＹを把握することができる。 Furthermore, in the embodiment described above, the comment data acquisition unit 12 acquires information that identifies the user Y who made the comment to the user X. The display image generation unit 18A generates a display image P (first display image P1 to third display image P3) including information identifying the user Y acquired by the comment data acquisition unit 12. As a result, the display image generation device 1A can acquire information that specifies user Y using the comment data acquisition section 12, and can generate a display image P that includes information that specifies user Y using the display image generation section 18A. This allows user X to understand user Y.

また、上記した実施形態においては、表示画像生成装置１Ａは、発言主体により発せられた発言に含まれる抽出対象物Ｔｅを特定し、当該抽出対象物Ｔｅに関する表示画像Ｐを生成する表示画像生成方法を開示する。表示画像生成装置１Ａは、発言データ取得ステップと、対象物抽出ステップと、視野画像取得ステップと、対象物判定ステップと、表示画像生成ステップと、を実行する。発言データ取得ステップは、発言主体であるユーザＹによりユーザＸに対して発せられた発言の発言データを取得する（図７のＳ１０３）。対象物抽出ステップは、予め記憶された複数の対象物データ（文字列）と取得された発言データ（文字列）とを対比して、発言データのうち対象物データと一致するデータを抽出対象物Ｔｅとして抽出する（図７のＳ１０４）。視野画像取得ステップは、ユーザＸの視野画像を取得する（図７のＳ１０５）。対象物判定ステップは、抽出された抽出対象物Ｔｅが視野画像に含まれるか否かを判定する（図７のＳ１０６）。表示画像生成ステップは、抽出対象物Ｔｅの位置に関する情報である抽出対象物情報を取得し、視野画像とは異なる当該抽出対象物情報を含む表示画像Ｐを生成する（図７のＳ１０７，Ｓ１１０，Ｓ１１２）。更に、表示画像生成ステップにおいては、対象物判定ステップにおける抽出対象物Ｔｅが視野画像に含まれるか否かの判定結果に基づいて、抽出対象物Ｔｅに関する表示画像Ｐの表示態様を決定する（図７のＳ１０７，Ｓ１１０，Ｓ１１２）。 The embodiment described above also discloses a display image generation method in which the display image generation device 1A identifies an extraction target Te included in a statement made by a speaker and generates a display image P relating to the extraction target Te. The display image generation device 1A executes a statement data acquisition step, a target object extraction step, a visual field image acquisition step, a target object determination step, and a display image generation step. The statement data acquisition step acquires statement data of a statement made to the user X by the user Y, who is the speaker (S103 in FIG. 7). The target object extraction step compares a plurality of pre-stored object data items (character strings) with the acquired statement data (character string) and extracts the portion of the statement data that matches an object data item as the extraction target Te (S104 in FIG. 7). The visual field image acquisition step acquires the visual field image of the user X (S105 in FIG. 7). The target object determination step determines whether the extracted extraction target Te is included in the visual field image (S106 in FIG. 7). The display image generation step acquires extraction target information, which is information on the position of the extraction target Te, and generates a display image P that includes the extraction target information, which is distinct from the visual field image (S107, S110, and S112 in FIG. 7). Furthermore, in the display image generation step, the display mode of the display image P relating to the extraction target Te is determined based on the determination in the target object determination step as to whether the extraction target Te is included in the visual field image (S107, S110, and S112 in FIG. 7).

この結果、表示画像生成装置１Ａは、発言データ取得ステップと対象物抽出ステップにより、ユーザＸ以外の主体（ユーザＹ）により認識されている抽出対象物Ｔｅを特定することができる。表示画像生成装置１Ａは、視野画像取得ステップと対象物判定ステップにより、抽出対象物Ｔｅが視野画像に含まれるか否かの判定結果を得ることができる。そして、表示画像生成ステップにおいて、対象物判定ステップの判定結果に基づいて、抽出対象物Ｔｅに関する表示画像Ｐの表示態様を決定する。これにより、表示画像生成装置１Ａは、ユーザＸ以外の主体によって認識されている抽出対象物ＴｅがユーザＸの視野Ｅｘ内に含まれているか否かにかかわらず、当該抽出対象物Ｔｅの位置に関する情報を適切に生成することができる（図４～図６）。 As a result, the display image generation device 1A can identify the extraction target Te recognized by a subject other than the user X (user Y) through the statement data acquisition step and the target object extraction step. The display image generation device 1A can obtain a determination result as to whether the extraction target Te is included in the visual field image through the visual field image acquisition step and the target object determination step. Then, in the display image generation step, the display mode of the display image P relating to the extraction target Te is determined based on the determination result of the target object determination step. The display image generation device 1A can thus appropriately generate information on the position of the extraction target Te recognized by a subject other than the user X, regardless of whether the extraction target Te is within the visual field Ex of the user X (FIGS. 4 to 6).

［第２実施形態］ [Second embodiment]

図８は、第２実施形態に係る表示画像生成装置１Ｂを示すブロック図である。本実施形態では、ＰＯＩ（Point of Interest）情報を用いて表示画像生成処理を実行可能な表示画像生成装置１Ｂについて説明する。ここで、「ＰＯＩ」とは、ＰＯＩ情報記憶部１９に名称、位置情報（緯度経度）が登録されている地図上の店舗、施設、興味ある名所などの特定な場所を意味する。また、第１実施形態の一例とした、ユーザＹによりユーザＸに対して発せられた発言「向こうに自転車があるね。」を、第２実施形態では一例として「向こうにコンビニエンスストアがあるね。」とする。そして、対象物抽出部１３は、ユーザＹにより発せられた発言から「コンビニエンスストア」という抽出対象物Ｔｅを抽出するものとする。なお、第２実施形態において、第１実施形態と同様の説明は省略又は簡略化する。 FIG. 8 is a block diagram showing a display image generation device 1B according to the second embodiment. This embodiment describes a display image generation device 1B that can execute display image generation processing using POI (Point of Interest) information. Here, "POI" means a specific place on a map, such as a store, a facility, or a place of interest, whose name and position information (latitude and longitude) are registered in the POI information storage unit 19. The example statement made by the user Y to the user X in the first embodiment, "There's a bicycle over there.", is replaced in the second embodiment by the example "There's a convenience store over there." The target object extraction unit 13 is assumed to extract the extraction target Te "convenience store" from the statement made by the user Y. In the second embodiment, descriptions that duplicate the first embodiment are omitted or simplified.

図８において、表示画像生成装置１Ｂは、第１実施形態に係る表示画像生成装置１Ａと比較して、周辺画像取得部１１を備えていない点、視野画像取得部１４Ａに代えて視野画像取得部１４Ｂを備えている点、対象物判定部１５Ａに代えて対象物判定部１５Ｂを備えている点、存否判定部１６Ａに代えて存否判定部１６Ｂを備えている点、位置関係取得部１７Ａに代えて位置関係取得部１７Ｂを備えている点、表示画像生成部１８Ａに代えて表示画像生成部１８Ｂを備えている点、及び、ＰＯＩ情報記憶部１９を更に備えている点で相違しており、その他の点で同一である。 In FIG. 8, the display image generation device 1B is different from the display image generation device 1A according to the first embodiment in that it does not include the peripheral image acquisition unit 11, and in place of the visual field image acquisition unit 14A, a visual field image acquisition unit 14B, a target object determination section 15B is provided in place of the target object determination section 15A, a presence/absence determination section 16B is provided in place of the presence/absence determination section 16A, and a positional relationship acquisition section 17A is provided in place of the positional relationship acquisition section 17A. They are different in that they include a positional relationship acquisition section 17B, a display image generation section 18B instead of the display image generation section 18A, and a POI information storage section 19. are otherwise identical.

表示画像生成装置１Ｂ、車両２Ｂ、ユーザ用端末３Ｂ、及び発言主体用端末４は、相互に有線又は無線により通信（送受信）可能に接続されている。 The display image generation device 1B, the vehicle 2B, the user terminal 3B, and the speaker terminal 4 are connected to each other so that they can communicate (transmit and receive) by wire or wirelessly.

車両２Ｂは、第１実施形態に係る車両２Ａと比較して、周辺撮像装置２２を備えていない点で相違しており、その他の点で同一である。 The vehicle 2B is different from the vehicle 2A according to the first embodiment in that it does not include the peripheral imaging device 22, and is otherwise the same.

ユーザ用端末３Ｂは、第１実施形態に係るユーザ用端末３Ａと比較して、表示画像表示装置３１Ａに代えて表示画像表示装置３１Ｂを備えている点で相違しており、その他の点で同一である。 The user terminal 3B differs from the user terminal 3A according to the first embodiment in that it includes a display image display device 31B instead of the display image display device 31A, and is otherwise the same.

発言主体用端末４は、第１実施形態に係る発言主体用端末４と同一である。 The speaker terminal 4 is the same as the speaker terminal 4 according to the first embodiment.

ＰＯＩ情報記憶部１９は、地図情報に含まれる対象物Ｔの位置に関する情報を少なくとも含むＰＯＩ情報を記憶する。この「ＰＯＩ情報」は、少なくともＰＯＩであるランドマークの名称、ランドマークの用途分類、ランドマークの特徴情報、ランドマークの画像、ランドマークの位置情報を含まれている。なお、ランドマークとは、建物や公園や商業施設や小売業の店舗（コンビニエンスストア等）等である。ＰＯＩ情報記憶部１９は、ＰＯＩ情報を車両２Ｂの外部から通信により取得してもよく、ナビゲーション装置２１に記憶されたランドマーク情報を当該ナビゲーション装置２１から取得してもよい。ＰＯＩ情報記憶部１９は、取得した車両２Ｂの位置に応じて、車両２Ｂが位置する区域のＰＯＩ情報をリアルタイムに更新してもよい。また、ナビゲーション装置２１によって経路探索が行われた場合、ＰＯＩ情報記憶部１９は、ナビゲーション装置２１によりダウンロードされた経路上のＰＯＩ情報を取得してもよい。 The POI information storage unit 19 stores POI information including at least information regarding the position of the target object T included in the map information. This "POI information" includes at least the name of the landmark that is the POI, the usage classification of the landmark, the characteristic information of the landmark, the image of the landmark, and the position information of the landmark. Note that landmarks include buildings, parks, commercial facilities, retail stores (convenience stores, etc.), and the like. The POI information storage unit 19 may acquire POI information from outside the vehicle 2B through communication, or may acquire landmark information stored in the navigation device 21 from the navigation device 21. The POI information storage unit 19 may update the POI information of the area where the vehicle 2B is located in real time according to the acquired position of the vehicle 2B. Furthermore, when the navigation device 21 performs a route search, the POI information storage unit 19 may acquire POI information on the route downloaded by the navigation device 21.

視野画像取得部１４Ｂは、第１実施形態に係る視野画像取得部１４Ａと同一である。 The visual field image acquisition unit 14B is the same as the visual field image acquisition unit 14A according to the first embodiment.

対象物判定部１５Ｂは、第１実施形態に係る対象物判定部１５Ａとは以下の点で異なるが、その他は同一である。対象物判定部１５Ｂは、ＰＯＩ情報記憶部１９からＰＯＩ情報を取得し、対象物抽出部１３により取得された抽出対象物ＴｅがＰＯＩである否かを判定する。そして、抽出対象物Ｔｅの画像が視野画像取得部１４Ｂにより取得された視野画像に含まれるか否かを判定する。なお、抽出対象物ＴｅがＰＯＩではない場合、又は、抽出対象物Ｔｅが視野画像に含まれない場合には、対象物判定部１５Ｂは、抽出対象物Ｔｅの画像が視野画像に含まれないと判定する。 The target object determining section 15B differs from the target object determining section 15A according to the first embodiment in the following points, but is otherwise the same. The target object determination unit 15B acquires POI information from the POI information storage unit 19, and determines whether the extraction target Te acquired by the target object extraction unit 13 is a POI. Then, it is determined whether the image of the extraction target Te is included in the visual field image acquired by the visual field image acquisition unit 14B. Note that when the extraction target Te is not a POI or when the extraction target Te is not included in the visual field image, the target object determination unit 15B determines that the image of the extraction target Te is not included in the visual field image. judge.
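
The two-stage check performed by the target object determination unit 15B can be sketched as below. The sketch assumes that POI names and the labels recognized in the visual field image are both available as sets of strings. The function name and both parameters are hypothetical, and how image recognition produces the labels is outside the scope of this illustration.

```python
from typing import Set


def target_in_view_15b(target: str, poi_names: Set[str],
                       recognized_labels: Set[str]) -> bool:
    """Return True only if the target is a POI and appears in the visual field image."""
    if target not in poi_names:
        # Not a POI: treated as not included in the visual field image.
        return False
    return target in recognized_labels


in_view = target_in_view_15b(
    "convenience store",
    poi_names={"convenience store", "park"},
    recognized_labels={"convenience store", "car"},
)
```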

存否判定部１６Ｂは、第１実施形態に係る存否判定部１６Ａとは以下の点（ＰＯＩ情報を用いる点）で異なるが、その他は同一である。存否判定部１６Ｂは、抽出対象物Ｔｅが視野画像に含まれないと対象物判定部１５Ｂにより判定された場合に、抽出対象物Ｔｅが予め設定された対象範囲内に存在するか否かを判定する。具体的には、存否判定部１６Ｂは、ＰＯＩ情報に基づいて、抽出対象物Ｔｅが対象範囲内に存在するか否かを判定する。 The presence/absence determination unit 16B differs from the presence/absence determination unit 16A according to the first embodiment in the following respect (it uses POI information), but is otherwise the same. When the target object determination unit 15B determines that the extraction target Te is not included in the visual field image, the presence/absence determination unit 16B determines whether the extraction target Te exists within a preset target range. Specifically, the presence/absence determination unit 16B makes this determination based on the POI information.

まず、存否判定部１６Ｂは、抽出対象物ＴｅがＰＯＩ情報記憶部１９により取得されたＰＯＩ情報に含まれるか否かを判定する。より詳細には、存否判定部１６Ｂは、ＰＯＩ情報記憶部１９により取得されたＰＯＩ情報を取得し、取得されたＰＯＩ情報に対象物抽出部１３により取得された抽出対象物Ｔｅが含まれているか否かを判定する。 First, the presence/absence determination unit 16B determines whether or not the extraction target Te is included in the POI information acquired by the POI information storage unit 19. More specifically, the presence/absence determination unit 16B acquires the POI information acquired by the POI information storage unit 19, and determines whether the acquired POI information includes the extraction target Te acquired by the target object extraction unit 13. Determine whether or not.

また、存否判定部１６Ｂは、取得されたＰＯＩ情報に抽出対象物Ｔｅが含まれていないと判定された場合には、抽出対象物Ｔｅが予め設定された対象範囲内に存在しないと判定する。 Further, when it is determined that the extraction target Te is not included in the acquired POI information, the presence/absence determining unit 16B determines that the extraction target Te does not exist within the preset target range.

次に、存否判定部１６Ｂは、取得されたＰＯＩ情報に抽出対象物Ｔｅ（ここでは例えばコンビニエンスストア）が含まれていると判定された場合に、取得されたＰＯＩ情報に基づいて、抽出対象物Ｔｅが存在する位置が対象範囲内であるか否かを判定する。存否判定部１６Ｂは、抽出対象物ＴｅがＰＯＩ情報に含まれる場合に、ナビゲーション装置２１から取得された車両２Ｂの位置情報とＰＯＩ情報記憶部１９に記憶されたＰＯＩ情報に含まれる抽出対象物Ｔｅの位置情報を用いて、車両２Ｂから抽出対象物Ｔｅまでの距離を算出する。また、存否判定部１６Ｂは、算出した距離に基づいて抽出対象物Ｔｅが予め設定された対象範囲内であるか否かを判定する。 Next, when it is determined that the acquired POI information includes the extraction target Te (here, for example, a convenience store), the presence/absence determination unit 16B determines the extraction target Te (here, for example, a convenience store) based on the acquired POI information. It is determined whether the position where Te exists is within the target range. When the extraction target Te is included in the POI information, the presence/absence determining unit 16B determines whether the extraction target Te included in the position information of the vehicle 2B acquired from the navigation device 21 and the POI information stored in the POI information storage unit 19 The distance from the vehicle 2B to the extraction target Te is calculated using the position information. Furthermore, the presence/absence determination unit 16B determines whether or not the extraction target Te is within a preset target range based on the calculated distance.
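
The distance calculation and range check described above could be realized as in the following sketch. The description does not state which distance formula is used. The haversine great-circle distance on a spherical Earth is an assumption made here, and the function names and the `(latitude, longitude)` tuple format are hypothetical.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_within_target_range(vehicle: Tuple[float, float],
                           poi: Tuple[float, float],
                           range_m: float) -> bool:
    """Check whether the POI lies within range_m of the vehicle position."""
    return haversine_m(vehicle[0], vehicle[1], poi[0], poi[1]) <= range_m
```

For the short distances involved (tens to hundreds of meters), a planar approximation would also be adequate. The haversine form is used here only because it works directly on latitude and longitude.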

位置関係取得部１７Ｂは、抽出対象物ＴｅとユーザＸとの相対的な位置関係を取得する。位置関係取得部１７Ｂは、ＰＯＩ情報記憶部１９により抽出対象物Ｔｅの位置情報を取得し、ナビゲーション装置２１から車両２Ｂの位置を取得し、取得された抽出対象物Ｔｅの位置情報と車両２Ｂの位置情報に基づいて、車両２Ｂから抽出対象物Ｔｅまでの方向及び距離を算出してもよい。 The positional relationship acquisition unit 17B acquires the relative positional relationship between the extraction target Te and the user X. The positional relationship acquisition unit 17B acquires the position information of the extraction target Te from the POI information storage unit 19, acquires the position of the vehicle 2B from the navigation device 21, and combines the acquired position information of the extraction target Te with the position of the vehicle 2B. The direction and distance from the vehicle 2B to the extraction target Te may be calculated based on the position information.
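
One way to obtain the direction from the vehicle 2B to the extraction target Te from the two positions is the initial great-circle bearing, sketched below. The vehicle heading used to turn the bearing into a left/right offset is an assumption, since the description does not say how the heading is obtained. Both function names are hypothetical.

```python
import math


def initial_bearing_deg(lat1: float, lon1: float,
                        lat2: float, lon2: float) -> float:
    """Bearing from point 1 to point 2, in degrees clockwise from north (0 to 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlam))
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0


def relative_direction_deg(bearing_deg: float, heading_deg: float) -> float:
    """Offset of the target from the heading, in [-180, 180); positive means to the right."""
    return (bearing_deg - heading_deg + 180.0) % 360.0 - 180.0
```

The signed offset can then drive an arrow symbol such as the one in FIG. 5, with the distance obtained separately from the two positions.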

表示画像生成部１８Ｂは、抽出対象物情報を取得し、当該抽出対象物情報を含む表示画像Ｐを生成する。 The display image generation unit 18B acquires the extraction object information and generates a display image P including the extraction object information.

表示画像生成部１８Ｂは、対象物判定部１５Ｂの判定結果に基づいて、抽出対象物Ｔｅの表示画像Ｐの表示態様を決定する。なお、「抽出対象物情報」等の用語の意味は第１実施形態と同様である。また、図４～図６に表示されている「Bicycle」を、第２実施形態では「Convenience store」とする。対象物判定部１５Ｂにより、抽出対象物ＴｅがＰＯＩであり、かつ、抽出対象物Ｔｅが視野画像に含まれると判定された場合には、表示画像生成部１８Ｂは、第１表示画像Ｐ１を生成する。この場合、表示画像生成部１８Ｂは、視野画像取得部１４Ｂから視野Ｅｘの視野画像を取得し、視野画像から抽出対象物Ｔｅを画像認識し、抽出対象物Ｔｅに重畳して表示される抽出対象物Ｔｅそのものを強調する表示態様で抽出対象物情報を示した第１表示画像Ｐ１を生成する（図４参照）。なお、対象物判定部１５Ｂにより抽出対象物ＴｅがＰＯＩではないと判定された場合、又は、抽出対象物Ｔｅが視野画像に含まれないと判定された場合には、第２実施形態において第１表示画像Ｐ１は生成されない。 The display image generation unit 18B determines the display mode of the display image P for the extraction target Te based on the determination result of the target object determination unit 15B. Terms such as "extraction target information" have the same meaning as in the first embodiment. In addition, "Bicycle" shown in FIGS. 4 to 6 is read as "Convenience store" in the second embodiment. When the target object determination unit 15B determines that the extraction target Te is a POI and is included in the visual field image, the display image generation unit 18B generates the first display image P1. In this case, the display image generation unit 18B acquires the visual field image of the visual field Ex from the visual field image acquisition unit 14B, recognizes the extraction target Te in the visual field image, and generates the first display image P1, which is displayed superimposed on the extraction target Te and presents the extraction target information in a display mode that emphasizes the extraction target Te itself (see FIG. 4). When the target object determination unit 15B determines that the extraction target Te is not a POI or is not included in the visual field image, the first display image P1 is not generated in the second embodiment.

In addition, when the target object determination unit 15B determines that the extraction target Te is not included in the visual field image, the display image generation unit 18B determines the display mode of the extraction target object information based on whether the presence/absence determination unit 16B has determined that the extraction target Te is included in the POI information and exists within the target range. More specifically, when the presence/absence determination unit 16B determines that the POI information includes the extraction target Te and that the extraction target Te exists within the target range, the display image generation unit 18B generates, based on the acquired position information of the extraction target Te and the position information of the vehicle 2B, a second display image P2 that shows the extraction target object information in a display mode that displays the positional relationship, including the direction and distance, of the extraction target Te with respect to the reference position. The display image generation unit 18B acquires, via the positional relationship acquisition unit 17B, positional relationship information including the direction and distance of the extraction target Te with respect to the reference position, and generates the second display image P2 that displays this positional relationship (see FIG. 5).

In addition, when the presence/absence determination unit 16B determines that the extraction target Te is not included in the POI information or that the extraction target Te does not exist within the target range, the display image generation unit 18B generates a third display image P3 indicating that the extraction target Te does not exist within the preset target range (see FIG. 6).

Further, the display image generation unit 18B may generate a display image P (first display image P1 to third display image P3) that includes information identifying the speaker acquired by the statement data acquisition unit 12 (see FIGS. 4 to 6).

In addition, based on the result of the determination by the target object determination unit 15B as to whether the extraction target Te is included in the visual field image, the display image generation unit 18B generates a display image P (first display image P1 to third display image P3) that includes information indicating whether the extraction target Te is visible to the user X (see FIGS. 4 to 6).
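The selection among the three display images described for the display image generation unit 18B can be summarized as a small decision function. The following is only a minimal sketch: the names `DisplayMode` and `select_display_mode` and the four boolean inputs are hypothetical and stand in for the results of the target object determination unit 15B and the presence/absence determination unit 16B; they are not taken from the disclosure.

```python
from enum import Enum


class DisplayMode(Enum):
    """Hypothetical labels for the three display images P1 to P3."""
    P1_HIGHLIGHT = "first display image P1"
    P2_DIRECTION_DISTANCE = "second display image P2"
    P3_OUT_OF_RANGE = "third display image P3"


def select_display_mode(is_poi, in_visual_field, in_poi_info, in_target_range):
    """Pick a display image from the determination results.

    is_poi / in_visual_field: assumed outputs of the target object determination.
    in_poi_info / in_target_range: assumed outputs of the presence/absence determination.
    """
    # P1: the target is a POI and appears in the user's visual field image.
    if is_poi and in_visual_field:
        return DisplayMode.P1_HIGHLIGHT
    # P2: the target is registered in the POI information and lies within the target range.
    if in_poi_info and in_target_range:
        return DisplayMode.P2_DIRECTION_DISTANCE
    # P3: otherwise, report that the target is not within the preset target range.
    return DisplayMode.P3_OUT_OF_RANGE
```

For example, a convenience store that is registered as a POI but is outside the visual field would yield the second display image, while a moving bicycle, which is not registered as a POI, would yield the third.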

Next, the display image generation process executed by the display image generation device 1B will be described. FIG. 9 is a flowchart showing the display image generation process. The process shown in the flowchart of FIG. 9 is started, for example, when the vehicle 2B is started.

図９に示されるように、ステップＳ２０１において、ＰＯＩ情報記憶部１９は、外部又は車両２ＢからＰＯＩ情報を取得して記憶する。その後、表示画像生成装置１Ｂは、ステップＳ２０２に進む。 As shown in FIG. 9, in step S201, the POI information storage unit 19 acquires and stores POI information from the outside or the vehicle 2B. After that, the display image generation device 1B proceeds to step S202.

In step S202, the display image generation device 1B uses the statement data acquisition unit 12 to acquire statement data of the voice uttered by the user (speaker) Y to the user X. The statement data acquisition unit 12 acquires this statement data from the statement data acquisition device 41 of the speaker terminal 4. Note that, as described above, the statement data also includes data in which the user Y has not uttered anything. Furthermore, the statement data acquisition unit 12 acquires information that identifies the user Y and transmits it to the display image generation device 1B. After that, the process advances to step S203.

ステップＳ２０３において、表示画像生成装置１Ｂは、発言データ取得部１２により、発言データにユーザ（発言主体）Ｙの発言が含まれるか否かを判定する。ユーザＹの発言が含まれると判定された場合には、ステップＳ２０４に進む。ユーザＹの発言が含まれないと判定された場合には、エンドに進む。 In step S203, the display image generation device 1B uses the statement data acquisition unit 12 to determine whether or not the statement data includes a statement by the user (the subject of the statement) Y. If it is determined that the statement by user Y is included, the process advances to step S204. If it is determined that the statement of user Y is not included, the process advances to the end.

ステップＳ２０４において、表示画像生成装置１Ｂは、対象物抽出部１３により、発言データのうち対象物Ｔと一致する抽出対象物Ｔｅを抽出できるか否かを判定する。抽出対象物Ｔｅを抽出できると判定された場合には、ステップＳ２０５に進む。抽出対象物Ｔｅを抽出できないと判定された場合には、エンドに進む。 In step S204, the display image generation device 1B determines whether the object extraction unit 13 can extract the extraction object Te that matches the object T from the statement data. If it is determined that the extraction target Te can be extracted, the process advances to step S205. If it is determined that the extraction target Te cannot be extracted, the process proceeds to the end.

ステップＳ２０５において、表示画像生成装置１Ｂは、視野画像取得部１４Ｂにより、ユーザＸの視野画像を取得する。視野画像取得部１４Ｂは、ユーザＸが装着しているユーザ用端末３Ｂの視野画像取得装置３２からユーザＸの視野画像を取得する。その後、ステップＳ２０６に進む。 In step S205, the display image generation device 1B acquires the visual field image of the user X using the visual field image acquisition unit 14B. The visual field image acquisition unit 14B acquires the visual field image of the user X from the visual field image acquisition device 32 of the user terminal 3B worn by the user X. After that, the process advances to step S206.

In step S206, the display image generation device 1B uses the target object determination unit 15B to determine whether the extraction target Te is a POI. Furthermore, the display image generation device 1B uses the target object determination unit 15B to determine whether the extraction target Te extracted by the target object extraction unit 13 is included in the visual field image of the user X acquired by the visual field image acquisition unit 14B. If it is determined that the extraction target Te is not a POI, or that the extraction target Te is not included in the visual field image of the user X, the process advances to step S208. If it is determined that the extraction target Te is a POI and that the extraction target Te is included in the visual field image of the user X, the process advances to step S207. Here, for example, if the extraction target Te is a convenience store and the convenience store is stored as POI information in the POI information storage unit 19, it is determined that the extraction target Te is a POI. On the other hand, if the extraction target Te is, for example, a moving bicycle, it is not stored as POI information in the POI information storage unit 19, so it is determined that the extraction target Te is not a POI.

If it is determined that the extraction target Te is a POI and that the extraction target Te is included in the visual field image of the user X, then in step S207 the display image generation device 1B uses the display image generation unit 18B to generate a first display image P1 that emphasizes the extraction target Te itself. The display image generation unit 18B acquires the visual field image of the visual field Ex from the visual field image acquisition unit 14B, recognizes the extraction target Te in the visual field image by image recognition, and generates the first display image P1, which shows the extraction target object information in a first display mode that is superimposed on the visual field image and emphasizes the extraction target Te itself. Note that the display image generation unit 18B may generate a first display image P1 that further includes information indicating that the extraction target Te is visible to the user X and information identifying the speaker acquired by the statement data acquisition unit 12. The display image generation unit 18B transmits the generated first display image P1 to the display image display device 31B of the user terminal 3B.

If it is determined that the extraction target Te is not a POI, or that the extraction target Te is not included in the visual field image of the user X, then in step S208 the display image generation device 1B first uses the presence/absence determination unit 16B to determine, based on the POI information stored in the POI information storage unit 19, whether the extraction target Te is included in the POI information. If it is determined that the extraction target Te is included in the POI information, the display image generation device 1B further uses the presence/absence determination unit 16B to determine, based on the POI information stored in the POI information storage unit 19, whether the extraction target Te exists within the target range. If it is determined that the extraction target Te is not included in the POI information, or that the extraction target Te does not exist within the target range, the process advances to step S211. If it is determined that the extraction target Te is included in the POI information and that the extraction target Te exists within the target range, the process advances to step S209. Here, for example, if the extraction target Te is a convenience store and the convenience store is stored as POI information in the POI information storage unit 19, it is determined that the extraction target Te is included in the POI information. On the other hand, if the extraction target Te is, for example, a moving bicycle, it is not stored as POI information in the POI information storage unit 19, so it is determined that the extraction target Te is not included in the POI information.
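The two-stage check in step S208 (first whether the extraction target is registered in the POI information, then whether it lies within the target range) could be sketched as follows. This is an illustration under stated assumptions: the POI information is modeled as a dictionary from a name to a list of positions in a local planar frame in meters, and `find_poi_in_range`, `poi_store`, and `range_m` are hypothetical names. The disclosure does not specify the data layout or how the target range is defined.

```python
import math


def find_poi_in_range(poi_store, name, reference_xy, range_m):
    """Return the nearest registered position of `name` within `range_m`, else None.

    poi_store: assumed mapping {name: [(x_m, y_m), ...]} in a local planar frame.
    reference_xy: assumed reference position (for example, the vehicle) in the same frame.
    """
    # Stage 1: is the extraction target included in the POI information at all?
    candidates = poi_store.get(name)
    if not candidates:
        return None
    # Stage 2: does the nearest registered instance lie within the target range?
    nearest = min(candidates, key=lambda p: math.dist(p, reference_xy))
    if math.dist(nearest, reference_xy) <= range_m:
        return nearest
    return None
```

A return value of `None` corresponds to the branch toward step S211, and a returned position corresponds to the branch toward step S209.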

If it is determined that the extraction target Te is included in the POI information and that the position of the extraction target Te is within the target range, then in step S209 the display image generation device 1B uses the positional relationship acquisition unit 17B to acquire the positional relationship between the extraction target Te and the user X. The positional relationship acquisition unit 17B estimates, by calculation, the direction and distance between the extraction target Te and the user X or the vehicle 2B, based on the acquired position information of the extraction target Te and the position information of the vehicle 2B. After that, the process advances to step S210.
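One common way to calculate a direction and distance from two positions is the haversine formula together with the initial great-circle bearing. The sketch below assumes that both positions are given as latitude and longitude in degrees and that the direction is expressed as a compass bearing from the vehicle toward the extraction target; the disclosure does not state the coordinate system or the calculation method, and `direction_and_distance` is a hypothetical name.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters (spherical approximation)


def direction_and_distance(vehicle_lat, vehicle_lon, target_lat, target_lon):
    """Return (bearing_deg, distance_m) from the vehicle to the target.

    The bearing is measured clockwise from north, in the range [0, 360).
    """
    phi1 = math.radians(vehicle_lat)
    phi2 = math.radians(target_lat)
    dphi = phi2 - phi1
    dlmb = math.radians(target_lon - vehicle_lon)

    # Haversine formula for the great-circle distance.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    distance_m = 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))

    # Initial bearing from the vehicle toward the target.
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    bearing_deg = (math.degrees(math.atan2(y, x)) + 360.0) % 360.0
    return bearing_deg, distance_m
```

To express the direction relative to the visual field of the user X rather than relative to north, the heading of the vehicle or the gaze direction of the user would have to be subtracted from the bearing, which is outside this sketch.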

In step S210, the display image generation device 1B uses the display image generation unit 18B to generate a second display image P2 that displays the positional relationship, including the direction and distance, from the vehicle 2B to the extraction target Te, based on the position information of the extraction target Te acquired from the positional relationship acquisition unit 17B and the position information of the vehicle 2B. The display image generation unit 18B generates the second display image P2, which shows the extraction target object information in a second display mode that displays the distance together with a symbol image indicating the direction relative to the visual field Ex of the user X, as acquired from the positional relationship acquisition unit 17B. Note that the display image generation unit 18B may generate a second display image P2 that includes information indicating that the extraction target Te is not visible to the user X and information identifying the speaker acquired by the statement data acquisition unit 12. The display image generation unit 18B transmits the generated second display image P2 to the display image display device 31B of the user terminal 3B.

If it is determined that the extraction target Te is not included in the POI information, or that the position of the extraction target Te is not within the target range, then in step S211 the display image generation device 1B uses the positional relationship acquisition unit 17B to acquire the positional relationship between the extraction target Te and the user X. Specifically, the positional relationship acquisition unit 17B acquires, from the presence/absence determination unit 16B, positional relationship information indicating that the extraction target Te does not exist within the preset target range. After that, the process advances to step S212.

In step S212, the display image generation device 1B generates a third display image P3 that displays the positional relationship between the extraction target Te and the user X acquired from the positional relationship acquisition unit 17B, namely that the extraction target Te does not exist within the preset target range. The display image generation unit 18B generates the third display image P3 so that it includes information indicating that the extraction target Te is not visible to the user X and information identifying the speaker acquired by the statement data acquisition unit 12. Note that the positional relationship including the direction and distance of the extraction target Te (the positional relationship according to the second display mode) is not displayed. The display image generation unit 18B transmits the generated third display image P3 to the display image display device 31B of the user terminal 3B.

表示画像生成装置１Ｂは、表示画像生成部１８Ｂの上述した処理が終了すると、今回の処理を終了して、再びステップＳ２０１から表示画像生成処理を繰り返す。 When the display image generation unit 18B completes the above-described processing, the display image generation device 1B ends the current processing and repeats the display image generation processing from step S201 again.

As described above, this embodiment includes the POI information storage unit 19, which stores POI information including at least information regarding the position of the extraction target Te. The presence/absence determination unit 16B determines whether the extraction target Te exists within the target range based on the POI information stored in the POI information storage unit 19. As a result, the presence/absence determination unit 16B can reliably determine whether the extraction target Te exists within the target range.

[Third embodiment]

FIG. 10 is a block diagram showing a display image generation device 1C according to the third embodiment. In this embodiment, a display image generation device 1C that can execute the display image generation process using a user terminal 3C, which is a display device installed in a vehicle 2C, will be described. Note that in the third embodiment, descriptions that duplicate those of the first embodiment are omitted or simplified.

In FIG. 10, the display image generation device 1C differs from the display image generation device 1A according to the first embodiment in that it includes a visual field image acquisition unit 14C instead of the visual field image acquisition unit 14A, a target object determination unit 15C instead of the target object determination unit 15A, a presence/absence determination unit 16C instead of the presence/absence determination unit 16A, a positional relationship acquisition unit 17C instead of the positional relationship acquisition unit 17A, and a display image generation unit 18C instead of the display image generation unit 18A, and in that it includes a line of sight recognition unit 20. It is the same in other respects.

表示画像生成装置１Ｃ、車両２Ｃ、ユーザ用端末３Ｃ、及び発言主体用端末４は、相互に有線又は無線により通信（送受信）可能に接続されている。 The display image generation device 1C, the vehicle 2C, the user terminal 3C, and the speaker terminal 4 are connected to each other so that they can communicate (transmit and receive) by wire or wirelessly.

車両２Ｃは、第１実施形態に係る車両２Ａと比較して、姿勢取得装置２３を備えている点で相違しており、その他の点で同一である。 The vehicle 2C is different from the vehicle 2A according to the first embodiment in that it includes an attitude acquisition device 23, and is the same in other respects.

The user terminal 3C differs from the user terminal 3A according to the first embodiment in that it does not include the visual field image acquisition device 32 and in that it includes a display image display device 31C instead of the display image display device 31A. It is the same in other respects.

The posture acquisition device 23 acquires image information including a face image of the user X. The posture acquisition device 23 captures an image including the face image of the user X with an in-vehicle camera installed in the vehicle 2C.

The line of sight recognition unit 20 recognizes the line of sight of the user X. The "line of sight" is the gaze direction that passes through the midpoint between both eyes of the user X and indicates the direction in which the face of the user X is oriented. The line of sight recognition unit 20 acquires image information including the face image of the user X from the posture acquisition device 23 and recognizes the gaze direction of the user X.

The visual field image acquisition unit 14C acquires a visual field image based on the real-time peripheral image acquired by the peripheral image acquisition unit 11 and the line of sight of the user X recognized by the line of sight recognition unit 20. More specifically, the visual field image acquisition unit 14C acquires the gaze direction of the user X from the line of sight recognition unit 20 and estimates the visual field Ex of the user X. The visual field image acquisition unit 14C acquires a real-time image of the vehicle surroundings from the peripheral image acquisition unit 11, cuts out from that image the area corresponding to the estimated visual field Ex of the user X, and thereby acquires the visual field image. Here, the "area corresponding to the estimated visual field Ex of the user X" is, for example, the area that spans, with the eyes held still, 60 degrees upward and 70 degrees downward in the vertical field of view and 100 degrees to each of the left and right in the horizontal field of view.
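The cut-out of the area corresponding to the visual field Ex can be illustrated with the example angles given above (60 degrees up, 70 degrees down, 100 degrees to each side). The sketch below assumes, purely for illustration, that the peripheral image is an equirectangular panorama covering 360 degrees horizontally and 180 degrees vertically, that yaw 0 maps to the center column with positive yaw to the right, and that pitch 0 maps to the middle row with positive pitch upward. None of these conventions, nor the name `visual_field_box`, come from the disclosure.

```python
# Example angular extents of the visual field, taken from the description above.
UP_DEG, DOWN_DEG, SIDE_DEG = 60.0, 70.0, 100.0


def visual_field_box(gaze_yaw_deg, gaze_pitch_deg, width_px, height_px):
    """Return (left_col, right_col, top_row, bottom_row) of the visual field area.

    Assumes an equirectangular panorama of size width_px x height_px.
    If left_col > right_col, the area wraps around the left/right image seam.
    """
    def col(yaw_deg):
        # Map yaw in degrees to a pixel column, wrapping at +/-180 degrees.
        return round(((yaw_deg + 180.0) % 360.0) * width_px / 360.0)

    def row(pitch_deg):
        # Clamp pitch to [-90, 90] and map it to a pixel row (row 0 is straight up).
        clamped = max(-90.0, min(90.0, pitch_deg))
        return round((90.0 - clamped) * height_px / 180.0)

    left_col = col(gaze_yaw_deg - SIDE_DEG)
    right_col = col(gaze_yaw_deg + SIDE_DEG)
    top_row = row(gaze_pitch_deg + UP_DEG)
    bottom_row = row(gaze_pitch_deg - DOWN_DEG)
    return left_col, right_col, top_row, bottom_row
```

With a 3600 x 1800 panorama and a gaze straight ahead, this yields columns 800 to 2800 and rows 300 to 1600. A real system would also need camera calibration and the mapping between the in-vehicle camera frame and the peripheral image, which this sketch leaves out.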

対象物判定部１５Ｃは、抽出対象物Ｔｅが視野画像取得部１４Ｃにより取得されたユーザＸの視野Ｅｘの視野画像に抽出対象物Ｔｅが含まれるか否かを判定する。対象物判定部１５Ｃは、第１実施形態に係る対象物判定部１５Ａと同一の方法で判定すればよい。 The target object determining unit 15C determines whether the extraction target Te is included in the visual field image of the visual field Ex of the user X acquired by the visual field image acquiring unit 14C. The target object determining section 15C may perform determination using the same method as the target object determining section 15A according to the first embodiment.

存否判定部１６Ｃは、抽出対象物Ｔｅが視野画像に含まれないと対象物判定部１５Ｃにより判定された場合に、抽出対象物Ｔｅが予め設定された対象範囲内に存在するか否かを判定する。存否判定部１６Ｃは、第１実施形態に係る存否判定部１６Ａと同一の方法で判定すればよい。 The presence/absence determining unit 16C determines whether the extraction target Te exists within a preset target range when the target object determining unit 15C determines that the extraction target Te is not included in the visual field image. do. The presence/absence determining section 16C may perform the determination using the same method as the presence/absence determining section 16A according to the first embodiment.

位置関係取得部１７Ｃは、抽出対象物ＴｅとユーザＸとの相対的な位置関係を取得する。位置関係取得部１７Ｃは、第１実施形態に係る位置関係取得部１７Ａと同一の方法で、ユーザＸ又は車両２Ｃから抽出対象物Ｔｅまでの方向及び距離を推定すればよい。また、位置関係取得部１７Ｃは、第１実施形態に係る位置関係取得部１７Ａと同一の方法で、存否判定部１６Ｃから、抽出対象物Ｔｅが予め設定された対象範囲内に存在しない情報を取得してもよい。 The positional relationship acquisition unit 17C acquires the relative positional relationship between the extraction target Te and the user X. The positional relationship acquisition unit 17C may estimate the direction and distance from the user X or the vehicle 2C to the extraction target Te using the same method as the positional relationship acquisition unit 17A according to the first embodiment. Further, the positional relationship acquisition unit 17C acquires information that the extraction target Te does not exist within a preset target range from the presence/absence determination unit 16C using the same method as the positional relationship acquisition unit 17A according to the first embodiment. You may.

表示画像生成部１８Ｃは、抽出対象物情報を取得し、当該抽出対象物情報を含む表示画像Ｐを生成する。 The display image generation unit 18C acquires the extraction object information and generates a display image P including the extraction object information.

表示画像生成部１８Ｃは、対象物判定部１５Ｃの判定結果に基づいて、第１実施形態と同様に抽出対象物Ｔｅの表示画像Ｐの表示態様を決定する。なお、「抽出対象物情報」等の用語の意味は第１実施形態と同様である。対象物判定部１５Ｃにより抽出対象物Ｔｅが視野画像に含まれると判定された場合には、表示画像生成部１８Ｃは、視野画像取得部１４Ｃから視野Ｅｘの視野画像を取得し、視野画像から抽出対象物Ｔｅを画像認識し、抽出対象物Ｔｅに重畳して表示される抽出対象物Ｔｅそのものを強調する表示態様で抽出対象物情報を示した第１表示画像Ｐ１を生成する（図４参照）。 The display image generation unit 18C determines the display mode of the display image P of the extraction target Te, based on the determination result of the target object determination unit 15C, similarly to the first embodiment. Note that the meanings of terms such as "extraction target object information" are the same as in the first embodiment. When the target object determination unit 15C determines that the extraction target Te is included in the visual field image, the display image generation unit 18C acquires the visual field image of the visual field Ex from the visual field image acquisition unit 14C, and extracts it from the visual field image. Image recognition is performed on the target object Te, and a first display image P1 is generated that shows the extraction target information in a display mode that emphasizes the extraction target Te itself, which is displayed superimposed on the extraction target Te (see FIG. 4). .

また、表示画像生成部１８Ｃは、対象物判定部１５Ｃにより抽出対象物Ｔｅが視野画像に含まれないと判定された場合には、抽出対象物Ｔｅが対象範囲内に存在すると存否判定部１６Ｃにより判定されたか否かに基づいて、第１実施形態と同様に抽出対象物情報の表示態様を決定する。より詳細には、表示画像生成部１８Ｃは、抽出対象物Ｔｅが対象範囲内に存在すると存否判定部１６Ｃにより判定された場合に、基準位置を基準として抽出対象物Ｔｅの位置の方向及び距離を含む位置関係を表示する表示態様で抽出対象物情報を示した第２表示画像Ｐ２を生成する（図５参照）。表示画像生成部１８Ｃは、位置関係取得部１７Ｃにより基準位置を基準として抽出対象物Ｔｅの位置の方向及び距離を含む位置関係情報を取得し、取得された基準位置を基準として抽出対象物Ｔｅの位置の方向及び距離を含む位置関係を表示する第２表示画像Ｐ２を生成する。 Further, when the target object determining unit 15C determines that the extraction target Te is not included in the visual field image, the display image generating unit 18C determines that the extraction target Te exists within the target range by the presence/absence determining unit 16C. Based on whether the determination has been made or not, the display mode of the extraction target object information is determined in the same manner as in the first embodiment. More specifically, when the presence/absence determination unit 16C determines that the extraction target Te exists within the target range, the display image generation unit 18C determines the direction and distance of the extraction target Te with reference to the reference position. A second display image P2 is generated that shows the extraction target object information in a display mode that displays the positional relationship including the extracted object information (see FIG. 5). The display image generation unit 18C acquires positional relationship information including the direction and distance of the position of the extraction target Te with the reference position as a reference by the positional relationship acquisition unit 17C, and calculates the position of the extraction target Te with the acquired reference position as a reference. A second display image P2 is generated that displays the positional relationship including the direction and distance of the positions.

また、表示画像生成部１８Ｃは、抽出対象物Ｔｅが対象範囲内に存在しないと存否判定部１６Ｃにより判定された場合に、抽出対象物Ｔｅが予め設定された対象範囲内に存在しない情報を示す第３表示画像Ｐ３を生成する（図６参照）。 Furthermore, when the presence/absence determination unit 16C determines that the extraction target Te does not exist within the target range, the display image generation unit 18C indicates information that the extraction target Te does not exist within the preset target range. A third display image P3 is generated (see FIG. 6).

また、表示画像生成部１８Ｃは、第１実施形態と同様に発言データ取得部１２により取得された発言主体を特定する情報を含む表示画像Ｐ（第１表示画像Ｐ１～第３表示画像Ｐ３）を生成する（図４～図６参照）。 In addition, the display image generation unit 18C generates a display image P (first display image P1 to third display image P3) that includes information for specifying the speaking subject acquired by the statement data acquisition unit 12, as in the first embodiment. (See Figures 4 to 6).

In addition, based on the result of the determination by the target object determination unit 15C as to whether the extraction target Te is included in the visual field image, the display image generation unit 18C generates a display image P (first display image P1 to third display image P3) that includes information indicating whether the extraction target Te is visible to the user X (see FIGS. 4 to 6).

Next, the display image generation process executed by the display image generation device 1C will be described. FIG. 11 is a flowchart showing the display image generation process. The process shown in the flowchart of FIG. 11 is started, for example, when the vehicle 2C is started.

図１１に示されるように、ステップＳ３０１において、表示画像生成装置１Ｃは、周辺画像取得部１１により、ユーザＸの周辺画像を取得する。周辺画像取得部１１は、車両２Ｃの周辺撮像装置２２が撮像した周辺画像を取得する。その後、表示画像生成装置１Ｃは、ステップＳ３０２に進む。 As shown in FIG. 11, in step S301, the display image generation device 1C acquires a peripheral image of the user X using the peripheral image acquisition unit 11. The surrounding image acquisition unit 11 obtains a surrounding image captured by the surrounding imaging device 22 of the vehicle 2C. After that, the display image generation device 1C proceeds to step S302.

In step S302, the display image generation device 1C uses the statement data acquisition unit 12 to acquire statement data of the voice uttered by the user (speaker) Y to the user X. The statement data acquisition unit 12 acquires this statement data from the statement data acquisition device 41 of the speaker terminal 4. Note that, as described above, the statement data also includes data in which the user Y has not uttered anything. Furthermore, the statement data acquisition unit 12 acquires information that identifies the user Y and transmits it to the display image generation device 1C. After that, the process advances to step S303.

ステップＳ３０３において、表示画像生成装置１Ｃは、発言データ取得部１２により、発言データにユーザ（発言主体）Ｙの発言が含まれるか否かを判定する。ユーザＹの発言が含まれると判定された場合には、ステップＳ３０４に進む。ユーザＹの発言が含まれないと判定された場合には、エンドに進む。 In step S303, the display image generation device 1C uses the statement data acquisition unit 12 to determine whether or not the statement data includes a statement by the user (the subject of the statement) Y. If it is determined that the statement by user Y is included, the process advances to step S304. If it is determined that the statement of user Y is not included, the process advances to the end.

ステップＳ３０４において、表示画像生成装置１Ｃは、対象物抽出部１３により、発言データのうち対象物Ｔと一致する抽出対象物Ｔｅを抽出できるか否かを判定する。抽出対象物Ｔｅを抽出できると判定された場合には、ステップＳ３０５に進む。抽出対象物Ｔｅを抽出できないと判定された場合には、エンドに進む。 In step S304, the display image generation device 1C determines whether the object extraction unit 13 can extract the extraction object Te that matches the object T from the statement data. If it is determined that the extraction target Te can be extracted, the process advances to step S305. If it is determined that the extraction target Te cannot be extracted, the process proceeds to the end.

ステップＳ３０５において、表示画像生成装置１Ｃは、視線認識部２０により、ユーザＸの視線を認識する。視線認識部２０は、姿勢取得装置２３からユーザＸの顔画像を含む画像情報を取得し、取得された画像情報に基づいてユーザＸの視線方向を認識する。その後、ステップＳ３０６に進む。 In step S305, the display image generation device 1C recognizes the line of sight of the user X using the line of sight recognition unit 20. The line of sight recognition unit 20 acquires image information including the face image of the user X from the posture acquisition device 23, and recognizes the direction of the line of sight of the user X based on the acquired image information. After that, the process advances to step S306.

ステップＳ３０６において、表示画像生成装置１Ｃは、視野画像取得部１４Ｃにより、ユーザＸの視野画像を取得する。視野画像取得部１４Ｃは、視線認識部２０からユーザＸの視線方向を取得し、ユーザＸの視野Ｅｘを推定する。視野画像取得部１４Ｃは、周辺画像取得部１１からリアルタイムの車両周辺の画像を取得し、車両周辺の画像から推定されたユーザＸの視野Ｅｘに対する領域を切り出し、視野画像を取得する。その後、ステップＳ３０７に進む。 In step S306, the display image generation device 1C acquires the visual field image of the user X using the visual field image acquisition unit 14C. The visual field image acquisition unit 14C acquires the visual line direction of the user X from the visual line recognition unit 20, and estimates the visual field Ex of the user X. The visual field image acquisition unit 14C acquires a real-time image around the vehicle from the peripheral image acquisition unit 11, cuts out an area for the visual field Ex of the user X estimated from the image around the vehicle, and acquires a visual field image. After that, the process advances to step S307.
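
One way to picture the cropping in step S306 is the sketch below. It rests on assumptions that the source does not state: the peripheral image is taken to be a 360-degree panorama whose column 0 corresponds to a gaze yaw of 0 degrees, and the horizontal extent of the visual field Ex is taken to be 120 degrees. The function name `view_columns` is hypothetical.

```python
def view_columns(width, yaw_deg, fov_deg=120.0):
    """Return half-open column ranges of a 360-degree panorama covering the view.

    width   : panorama width in pixels (assumed to span 360 degrees)
    yaw_deg : gaze direction in degrees (assumed 0 at column 0)
    fov_deg : assumed horizontal extent of the visual field Ex
    """
    if not 0 < fov_deg <= 360:
        raise ValueError("fov_deg must be in (0, 360]")
    center = (yaw_deg % 360.0) / 360.0 * width
    half = fov_deg / 360.0 * width / 2.0
    start = int(round(center - half)) % width
    end = start + int(round(2 * half))
    if end <= width:
        return [(start, end)]
    # The view wraps around the panorama seam, so two slices are needed.
    return [(start, width), (0, end - width)]
```

Two ranges are returned when the view straddles the seam of the panorama; the caller would concatenate the two slices to form the visual field image.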

ステップＳ３０７において、表示画像生成装置１Ｃは、対象物判定部１５Ｃにより、対象物抽出部１３から抽出された抽出対象物Ｔｅが視野画像取得部１４Ｃから取得したユーザＸの視野画像に含まれるか否かを判定する。抽出対象物ＴｅがユーザＸの視野画像に含まれると判定された場合には、ステップＳ３０８に進む。抽出対象物ＴｅがユーザＸの視野画像に含まれないと判定された場合には、ステップＳ３０９に進む。 In step S307, the display image generation device 1C uses the object determination unit 15C to determine whether the extraction target Te extracted by the object extraction unit 13 is included in the visual field image of user X acquired from the visual field image acquisition unit 14C. If it is determined that the extraction target Te is included in the visual field image of user X, the process advances to step S308. If it is determined that the extraction target Te is not included in the visual field image of user X, the process advances to step S309.

抽出対象物ＴｅがユーザＸの視野画像に含まれると判定された場合には、ステップＳ３０８において、表示画像生成装置１Ｃは、表示画像生成部１８Ｃにより、抽出対象物Ｔｅそのものを強調する第１表示画像Ｐ１を生成する。表示画像生成部１８Ｃは、視野画像取得部１４Ｃから視野Ｅｘの視野画像を取得し、視野画像から抽出対象物Ｔｅを画像認識し、視野画像に重畳して表示される抽出対象物Ｔｅそのものを強調する第１表示態様で抽出対象物情報を示した第１表示画像Ｐ１を生成する。なお、表示画像生成部１８Ｃは、視野画像から抽出対象物ＴｅがユーザＸにより視認可能であることを示す情報及び発言データ取得部１２により取得された発言主体を特定する情報をさらに含む第１表示画像Ｐ１を生成してもよい。表示画像生成部１８Ｃは、生成した第１表示画像Ｐ１をユーザ用端末３Ｃの表示画像表示装置３１Ｃに送信する。 If it is determined that the extraction target Te is included in the visual field image of user X, in step S308 the display image generation device 1C uses the display image generation unit 18C to generate a first display image P1 that emphasizes the extraction target Te itself. The display image generation unit 18C acquires the visual field image of the visual field Ex from the visual field image acquisition unit 14C, recognizes the extraction target Te in the visual field image by image recognition, and generates the first display image P1, which is displayed superimposed on the visual field image and shows the extraction target information in a first display mode that emphasizes the extraction target Te itself. Note that the display image generation unit 18C may generate a first display image P1 that further includes information indicating that the extraction target Te is visible to user X in the visual field image and the information identifying the speaker acquired by the utterance data acquisition unit 12. The display image generation unit 18C transmits the generated first display image P1 to the display image display device 31C of the user terminal 3C.

抽出対象物ＴｅがユーザＸの視野画像に含まれないと判定された場合には、ステップＳ３０９において、表示画像生成装置１Ｃは、存否判定部１６Ｃにより、周辺画像取得部１１により取得された現在または過去の周辺画像に基づいて、抽出対象物Ｔｅが対象範囲内に存在するか否かを判定する。存否判定部１６Ｃは、抽出対象物Ｔｅが対象範囲内に存在しないと判定した場合には、ステップＳ３１２に進む。存否判定部１６Ｃは、抽出対象物Ｔｅが対象範囲内に存在すると判定した場合には、ステップＳ３１０に進む。 If it is determined that the extraction target Te is not included in the visual field image of user X, in step S309 the display image generation device 1C uses the presence/absence determination unit 16C to determine, based on the current or past peripheral images acquired by the peripheral image acquisition unit 11, whether the extraction target Te exists within the target range. If the presence/absence determination unit 16C determines that the extraction target Te does not exist within the target range, the process proceeds to step S312. If the presence/absence determination unit 16C determines that the extraction target Te exists within the target range, the process proceeds to step S310.

抽出対象物Ｔｅが存在する位置が対象範囲内であると判定された場合には、ステップＳ３１０において、表示画像生成装置１Ｃは、位置関係取得部１７Ｃにより、抽出対象物ＴｅとユーザＸとの位置関係を取得する。位置関係取得部１７Ｃは、周辺画像取得部１１から取得された現在または過去のユーザＸの周辺画像に基づいて、抽出対象物ＴｅからユーザＸ又は車両２Ｃまでの距離とユーザＸの視野Ｅｘに対する方向を推定する。また、位置関係取得部１７Ｃは、存否判定部１６Ｃより抽出対象物ＴｅからユーザＸ又は車両２Ｃまでの距離を取得してもよい。その後、ステップＳ３１１に進む。 If it is determined that the position of the extraction target Te is within the target range, in step S310 the display image generation device 1C uses the positional relationship acquisition unit 17C to acquire the positional relationship between the extraction target Te and user X. Based on the current or past peripheral images of user X acquired from the peripheral image acquisition unit 11, the positional relationship acquisition unit 17C estimates the distance from the extraction target Te to user X or the vehicle 2C and the direction of the extraction target Te relative to the visual field Ex of user X. The positional relationship acquisition unit 17C may instead acquire the distance from the extraction target Te to user X or the vehicle 2C from the presence/absence determination unit 16C. After that, the process advances to step S311.
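
The distance and direction obtained in step S310 can be illustrated with plain plane geometry. The sketch below assumes that the positions of user X and the extraction target Te are already known as 2D coordinates in a common frame and that the gaze yaw is measured counterclockwise from the +x axis. The source estimates these quantities from peripheral images and does not specify a coordinate convention, so the function name `relative_position` and the conventions are assumptions.

```python
import math


def relative_position(user_xy, gaze_yaw_deg, target_xy):
    """Return (distance, bearing_deg) of the target relative to the user's gaze.

    bearing_deg is normalized to [-180, 180); positive values are to the
    left of the gaze direction, negative values to the right.
    """
    dx = target_xy[0] - user_xy[0]
    dy = target_xy[1] - user_xy[1]
    distance = math.hypot(dx, dy)
    absolute_deg = math.degrees(math.atan2(dy, dx))
    # Wrap the difference so that "just to the right" is a small negative angle.
    bearing_deg = (absolute_deg - gaze_yaw_deg + 180.0) % 360.0 - 180.0
    return distance, bearing_deg
```

The two returned values correspond to the distance and the direction relative to the visual field Ex that the second display mode presents.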

ステップＳ３１１において、表示画像生成装置１Ｃは、表示画像生成部１８Ｃにより、位置関係取得部１７Ｃから取得された基準位置を基準として抽出対象物Ｔｅの位置の方向及び距離を含む位置関係を表示する第２表示画像Ｐ２を生成する。表示画像生成部１８Ｃは、位置関係取得部１７Ｃから取得されたユーザＸの視野Ｅｘに対する方向を示す記号画像と距離を表示する第２表示態様で抽出対象物情報を示した第２表示画像Ｐ２を生成する。なお、表示画像生成部１８Ｃは、ユーザＸから抽出対象物Ｔｅが視認不可能であることを示す情報及び発言データ取得部１２により取得された発言主体を特定する情報を含む第２表示画像Ｐ２を生成してもよい。表示画像生成部１８Ｃは、生成した第２表示画像Ｐ２をユーザ用端末３Ｃの表示画像表示装置３１Ｃに送信する。 In step S311, the display image generation device 1C uses the display image generation unit 18C to generate a second display image P2 that displays the positional relationship, including the direction and distance of the position of the extraction target Te relative to the reference position acquired from the positional relationship acquisition unit 17C. The display image generation unit 18C generates the second display image P2, which shows the extraction target information in a second display mode that displays the distance together with a symbol image indicating the direction relative to the visual field Ex of user X, both acquired from the positional relationship acquisition unit 17C. Note that the display image generation unit 18C may generate a second display image P2 that includes information indicating that the extraction target Te is not visible to user X and the information identifying the speaker acquired by the utterance data acquisition unit 12. The display image generation unit 18C transmits the generated second display image P2 to the display image display device 31C of the user terminal 3C.

抽出対象物Ｔｅが存在する位置が対象範囲内ではないと判定された場合には、ステップＳ３１２において、表示画像生成装置１Ｃは、位置関係取得部１７Ｃにより、抽出対象物ＴｅとユーザＸとの位置関係を取得する。具体的には、位置関係取得部１７Ｃは、存否判定部１６Ｃから抽出対象物Ｔｅが予め設定された対象範囲内に存在しない位置関係情報を取得する。その後、ステップＳ３１３に進む。 If it is determined that the position of the extraction target Te is not within the target range, in step S312 the display image generation device 1C uses the positional relationship acquisition unit 17C to acquire the positional relationship between the extraction target Te and user X. Specifically, the positional relationship acquisition unit 17C acquires, from the presence/absence determination unit 16C, positional relationship information indicating that the extraction target Te does not exist within the preset target range. After that, the process advances to step S313.

ステップＳ３１３において、表示画像生成装置１Ｃは、位置関係取得部１７Ｃから取得された抽出対象物Ｔｅが予め設定された対象範囲内に存在しないという抽出対象物ＴｅとユーザＸとの位置関係を表示する第３表示画像Ｐ３を生成する。表示画像生成部１８Ｃは、視野画像から抽出対象物ＴｅがユーザＸにより視認不可能であることを示す情報及び発言データ取得部１２により取得された発言主体を特定する情報を含む第３表示画像Ｐ３を生成する。なお、抽出対象物Ｔｅの位置の方向及び距離を含む位置関係（第２表示態様に係る位置関係）は表示されない。表示画像生成部１８Ｃは、生成した第３表示画像Ｐ３をユーザ用端末３Ｃの表示画像表示装置３１Ｃに送信する。 In step S313, the display image generation device 1C generates a third display image P3 that displays the positional relationship between the extraction target Te and user X acquired from the positional relationship acquisition unit 17C, namely that the extraction target Te does not exist within the preset target range. The display image generation unit 18C generates the third display image P3 so that it includes information indicating that the extraction target Te is not visible to user X in the visual field image and the information identifying the speaker acquired by the utterance data acquisition unit 12. Note that the positional relationship including the direction and distance of the position of the extraction target Te (the positional relationship according to the second display mode) is not displayed. The display image generation unit 18C transmits the generated third display image P3 to the display image display device 31C of the user terminal 3C.

表示画像生成装置１Ｃは、表示画像生成部１８Ｃの上述した処理が終了すると、今回の処理を終了して、再びステップＳ３０１から表示画像生成処理を繰り返す。 When the display image generation unit 18C completes the above-described processing, the display image generation device 1C ends the current processing and repeats the display image generation processing from step S301 again.
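
The branching of steps S303 to S313 can be summarized as a small decision function. This is a minimal sketch of the control flow only. The four boolean inputs stand for the results of the respective determinations, and the names `DisplayMode` and `select_display_mode` are hypothetical.

```python
from enum import Enum
from typing import Optional


class DisplayMode(Enum):
    P1_HIGHLIGHT = "first display image P1"      # emphasize the target itself
    P2_POSITION = "second display image P2"      # show direction and distance
    P3_OUT_OF_RANGE = "third display image P3"   # target outside the target range


def select_display_mode(has_speech: bool, target_found: bool,
                        in_view: bool, in_range: bool) -> Optional[DisplayMode]:
    """Mirror the flow S303 -> S304 -> S307 -> S309 and return the image to generate."""
    if not has_speech:        # S303: no utterance by user Y, go to end
        return None
    if not target_found:      # S304: no extraction target Te, go to end
        return None
    if in_view:               # S307 yes -> S308
        return DisplayMode.P1_HIGHLIGHT
    if in_range:              # S309 yes -> S310, S311
        return DisplayMode.P2_POSITION
    return DisplayMode.P3_OUT_OF_RANGE  # S309 no -> S312, S313
```

Returning `None` corresponds to ending the current pass without generating a display image, after which the processing repeats from step S301.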

上記のとおり、本実施形態では、表示画像生成装置１Ｃは、周辺画像を取得して記憶する周辺画像取得部１１と、ユーザＸの視線を認識する視線認識部２０と、を備える。視野画像取得部１４Ｃは、周辺画像取得部１１により取得された現在の周辺画像と視線認識部２０により認識されたユーザＸの現在の視線とに基づいて視野画像を取得する。この結果、表示画像生成装置１Ｃは、周辺画像取得部１１によりユーザＸの視野Ｅｘを含む領域の画像である周辺画像を取得し、視線認識部２０によりユーザＸの視線を認識し、取得された周辺画像が含まれたユーザＸの視線に応じる視野画像を取得することができる。これにより、ユーザ用端末３Ｃに視野画像取得装置３２が無くても、視線認識部２０によりユーザＸの視野画像を取得することができる。 As described above, in this embodiment, the display image generation device 1C includes the peripheral image acquisition unit 11, which acquires and stores peripheral images, and the line-of-sight recognition unit 20, which recognizes the line of sight of user X. The visual field image acquisition unit 14C acquires a visual field image based on the current peripheral image acquired by the peripheral image acquisition unit 11 and the current line of sight of user X recognized by the line-of-sight recognition unit 20. As a result, the display image generation device 1C can acquire, with the peripheral image acquisition unit 11, a peripheral image that covers an area including the visual field Ex of user X, recognize the line of sight of user X with the line-of-sight recognition unit 20, and acquire from the acquired peripheral image a visual field image corresponding to the line of sight of user X. Thereby, even if the user terminal 3C does not have the visual field image acquisition device 32, the visual field image of user X can be acquired by means of the line-of-sight recognition unit 20.

以上、本開示の表示画像生成装置及び表示画像生成方法を上述した各実施形態に基づき説明してきたが、具体的な構成については、これらの各実施形態に限られるものではなく、特許請求の範囲の各請求項に係る発明の要旨を逸脱しない限り、設計の変更や追加等は許容される。 The display image generation device and display image generation method of the present disclosure have been described above based on the above-mentioned embodiments, but the specific configuration is not limited to these embodiments, and the scope of the claims Changes and additions to the design are permitted as long as they do not depart from the gist of the invention claimed in each claim.

各実施形態において、ユーザＸとユーザＹの両方とも、車両に乗車している例を示したが、これに限られない。例えば、ユーザＸ、及び、発言主体であるユーザＹの一方又は両方が、車両２Ａ～Ｃの車外（すなわち、車両２Ａ～Ｃから離間した場所）に存在（位置）してもよい。この場合、ユーザＸのユーザ用端末又はユーザＸのユーザ用端末が接続可能なサーバは、発言データ取得部と、対象物抽出部と、対象物判定部と、表示画像生成部と、の構成を少なくとも有する必要がある。なお、視野画像取得部は、例えば、ユーザ用端末が有する視野画像取得装置に含める。そして、周辺撮像装置により得られる周辺画像を、ユーザＸの視野Ｅｘに対応する視野画像としてもよいし、ユーザＸが周辺撮像装置を有しておりユーザ用端末に送信してもよい。更に、ユーザＸが車外にいる場合、姿勢取得装置２３はユーザ用端末３ＣまたはユーザＸの周辺に設置し、ユーザＸの顔画像またはセンサによりユーザＸの顔向き情報を取得する。そして、視線認識部２０は、姿勢取得装置２３により取得したユーザＸの顔画像または顔向き情報によりユーザＸの視線方向を認識する。視野画像取得部は、周辺撮像装置２２が撮像した周辺画像と視線認識部２０が認識したユーザＸの視線方向に基づいて、ユーザＸの視野画像を生成する。なお、ユーザＸが車外にいる場合、視線認識部２０は、ユーザＸのユーザ用端末又はユーザＸのユーザ用端末が接続可能なサーバが有するものとする。そして、ユーザＸが車外に存在する場合でも、ユーザ用端末は、発言主体により発せられた発言に含まれる抽出対象物Ｔｅに関する表示画像Ｐが生成される。そして、表示画像表示装置に表示画像Ｐが表示される。 In each embodiment, an example has been shown in which both user X and user Y are riding in the vehicle, but the present disclosure is not limited to this. For example, one or both of user X and user Y, who is the speaker, may be located outside the vehicles 2A to 2C (that is, at a place away from the vehicles 2A to 2C). In this case, the user terminal of user X, or a server to which the user terminal of user X can connect, needs to have at least the utterance data acquisition unit, the object extraction unit, the object determination unit, and the display image generation unit. Note that the visual field image acquisition unit is included in, for example, the visual field image acquisition device of the user terminal. The peripheral image obtained by the peripheral imaging device may be used as the visual field image corresponding to the visual field Ex of user X, or user X may carry the peripheral imaging device, which transmits the peripheral image to the user terminal. Further, when user X is outside the vehicle, the posture acquisition device 23 is installed on the user terminal 3C or in the vicinity of user X, and acquires a face image of user X or, by means of a sensor, face orientation information of user X. The line-of-sight recognition unit 20 then recognizes the line-of-sight direction of user X based on the face image or face orientation information of user X acquired by the posture acquisition device 23. The visual field image acquisition unit generates a visual field image of user X based on the peripheral image captured by the peripheral imaging device 22 and the line-of-sight direction of user X recognized by the line-of-sight recognition unit 20. Note that when user X is outside the vehicle, the line-of-sight recognition unit 20 is provided in the user terminal of user X or in a server to which the user terminal of user X can connect. Even when user X is outside the vehicle, a display image P regarding the extraction target Te included in the utterance made by the speaker is generated for the user terminal. The display image P is then displayed on the display image display device.

各実施形態において、対象物判定部は、抽出対象物ＴｅがユーザＸの視野画像に含まれるか否かの判定結果の情報を発言主体であるユーザＹの発言主体用端末４へ出力する例を示したが、これに限られない。例えば、ユーザＹへ出力する情報としては、ユーザＸの視野画像や表示画像Ｐや周辺画像などを出力しても良い。また、ユーザＹが特に車外に存在する場合には、ユーザＹの発言主体用端末４やＶＲ（Virtual Reality、画像表示装置）などに画像を表示する。このように、発言主体であるユーザＹに画像を表示することにより、ユーザＹはユーザＸの視認可能領域や視線方向の情報をえることができるので、ユーザＸとユーザＹとの話題の進み方をより決めやすくなる。 In each embodiment, an example has been shown in which the object determination unit outputs information on the determination result as to whether or not the extraction target Te is included in the visual field image of user X to the speaker terminal 4 of user Y, who is the speaker, but the present disclosure is not limited to this. For example, the visual field image of user X, the display image P, the peripheral image, or the like may be output as the information for user Y. In particular, when user Y is outside the vehicle, the image is displayed on the speaker terminal 4 of user Y, on a VR (Virtual Reality) image display device, or the like. By displaying an image to user Y, the speaker, in this way, user Y can obtain information on the area visible to user X and on the line-of-sight direction of user X, which makes it easier to decide how the conversation between user X and user Y should proceed.

また、周辺撮像装置２２により撮像された周辺画像は上記の各実施形態において説明したものに限定されず、例えばユーザＸの視野Ｅｘに対応する視野画像としてもよい。ここで、例えば、発言主体であるユーザＹが車両２Ａの車外に存在する場合には、発言主体用端末４には、周辺撮像装置２２により撮像された周辺画像の一部またはすべての画像が表示されてもよい。これにより、ユーザＸとユーザＹとの話題の進み方を決めることができる。 Further, the peripheral image captured by the peripheral imaging device 22 is not limited to those described in the above embodiments, and may be, for example, a visual field image corresponding to the visual field Ex of user X. Here, for example, when user Y, who is the speaker, is outside the vehicle 2A, part or all of the peripheral image captured by the peripheral imaging device 22 may be displayed on the speaker terminal 4. This makes it possible to decide how the conversation between user X and user Y should proceed.

また、ユーザ用端末３Ａ～３Ｃの表示画像表示装置３１Ａ～３１Ｃは、透過型ディスプレイとする例を示したが、車両２Ａ～２Ｃに設置されたヘッドアップディスプレイでもよい。例えば、ヘッドアップディスプレイは、車両２Ａ～２Ｃのフロントウィンドウの下部位置に設定され、灯光器でウィンドシールドに画像を表示する。この場合、画像は、表示画像生成部１８Ａ～１８Ｃが生成したユーザＸのＥｘの視野に対応する表示画像Ｐを表示する。 Furthermore, although an example has been shown in which the display image display devices 31A to 31C of the user terminals 3A to 3C are transmissive displays, they may be head-up displays installed in the vehicles 2A to 2C. For example, the head-up display is installed at the lower part of the front window of the vehicles 2A to 2C and uses a light projector to display an image on the windshield. In this case, the displayed image is the display image P generated by the display image generation units 18A to 18C and corresponding to the visual field Ex of user X.

また、発言主体は、人ではなく、ユーザＸに対して発言を発する発言装置でもよい。発言装置の場合、発言データは出力文データである。出力文データは、発言装置が出力文（文字列）を音声として出力する音声データであってもよいし、出力文（文字列）であってもよい。このため、表示画像生成装置１Ａ～１Ｃは、発言データ取得装置によりユーザＸに対して発言を発する発言装置から出力文データを取得することができる。また、この場合、「発言主体により発せられた発言」は、「発言装置により発せられた（出力された）音声」である。また、表示画像生成部１８Ａ～１８ＣがユーザＸに対する音声を発する発言装置を特定する情報を取得し、例えば「Mentioned by Speech output device.」という表示画像Ｐを生成してもよい。この結果、発言装置の発言に含まれる抽出対象物ＴｅをユーザＸに対する適切な表示態様で抽出対象物情報を表示させることができる。具体的には、発言装置は、ユーザＸと音声対話可能な、いわゆる対話型エージェント装置であってもよい。 Furthermore, the speaking subject may be a speaking device that makes a comment to user X instead of a person. In the case of a comment device, the comment data is output sentence data. The output sentence data may be audio data in which the speaking device outputs an output sentence (character string) as voice, or may be an output sentence (character string). Therefore, the display image generation devices 1A to 1C can acquire output sentence data from the comment device that makes a comment to the user X using the comment data acquisition device. Furthermore, in this case, the "utterance uttered by the speaking subject" is the "voice uttered (outputted) by the speaking device." Further, the display image generation units 18A to 18C may obtain information specifying a speech device that emits a voice to the user X, and generate a display image P that reads, for example, "Mentioned by Speech output device." As a result, it is possible to display extraction target object information for the extraction target Te included in the utterance of the speaking device in an appropriate display manner for the user X. Specifically, the speaking device may be a so-called interactive agent device capable of voice interaction with user X.

また、上記では、発言主体は、１人のユーザＹのみ又は１つの発言装置のみであったが、発言主体の対象としては複数であってもよい。例えば、発言主体の対象として、２人以上の同乗者（ユーザ）であってもよいし、１人の同乗者（ユーザ）と１つの発言装置であってもよい。この場合、発言データ取得部１２は、ユーザＸに対して発言を発した発言主体を特定する情報を取得する。次に、表示画像生成部１８Ａ～１８Ｃは、発言データ取得部１２により取得された発言主体を特定する情報を含む表示画像Ｐを生成する。この結果、表示画像生成装置１Ａ～１Ｃは、発言データ取得部１２により発言主体を特定する情報を取得し、表示画像生成部１８Ａ～１８Ｃにより発言主体を特定する情報を含む表示画像Ｐを生成することができる。これにより、発言主体の対象が複数であるとき、ユーザＸが発言主体を明確に把握することができる。 Furthermore, in the above description the speaker was a single user Y or a single speaking device, but there may be a plurality of speakers. For example, the speakers may be two or more fellow passengers (users), or one fellow passenger (user) and one speaking device. In this case, the utterance data acquisition unit 12 acquires information that identifies the speaker who made the utterance to user X. Next, the display image generation units 18A to 18C generate a display image P that includes the information identifying the speaker acquired by the utterance data acquisition unit 12. As a result, the display image generation devices 1A to 1C can acquire the information identifying the speaker with the utterance data acquisition unit 12 and can generate, with the display image generation units 18A to 18C, a display image P that includes that information. Thereby, when there are multiple speakers, user X can clearly identify who made the utterance.
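
When several speakers are possible, the speaker-identifying information can simply be carried into the text shown in the display image P. The sketch below is illustrative only: the "Mentioned by ..." wording follows the example given in the description, while the function name `build_caption` and the visibility wording are assumptions.

```python
def build_caption(speaker_name, target_name, visible):
    """Compose caption text for display image P that names the speaker.

    speaker_name : information identifying the speaker (a user or a speaking device)
    target_name  : the extraction target Te mentioned in the utterance
    visible      : result of the visual field determination for user X
    """
    visibility = "visible" if visible else "not visible"
    return f"{target_name} ({visibility}). Mentioned by {speaker_name}."
```

For example, `build_caption("Speech output device", "station", False)` yields a caption that tells user X both that the station is not currently visible and that the speaking device mentioned it.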

上記では、発言データ取得部１２と発言データ取得装置４１を有する例を示したが、発言データ取得部１２が発言データ取得装置４１の機能を備えていれば、発言データ取得装置４１を備えていなくてもよい。また、視野画像取得部１４Ａ，１４Ｂと視野画像取得装置３２を有する例を示したが、視野画像取得部１４Ａ，１４Ｂが視野画像取得装置３２の機能を備えていれば、視野画像取得装置３２を備えなくても良い。更に、視線認識部２０と姿勢取得装置２３を有する例を示したが、視線認識部２０が姿勢取得装置２３の機能を備えていれば、姿勢取得装置２３を備えなくても良い。更にまた、周辺画像取得部１１と周辺撮像装置２２を有する例を示したが、周辺画像取得部１１が周辺撮像装置２２の機能を備えていれば、周辺撮像装置２２を備えていなくても良い。 In the above, an example was shown that has both the utterance data acquisition unit 12 and the utterance data acquisition device 41, but the utterance data acquisition device 41 may be omitted if the utterance data acquisition unit 12 has the function of the utterance data acquisition device 41. Likewise, although an example was shown that has the visual field image acquisition units 14A and 14B and the visual field image acquisition device 32, the visual field image acquisition device 32 may be omitted if the visual field image acquisition units 14A and 14B have the function of the visual field image acquisition device 32. Further, although an example was shown that has the line-of-sight recognition unit 20 and the posture acquisition device 23, the posture acquisition device 23 may be omitted if the line-of-sight recognition unit 20 has the function of the posture acquisition device 23. Furthermore, although an example was shown that has the peripheral image acquisition unit 11 and the peripheral imaging device 22, the peripheral imaging device 22 may be omitted if the peripheral image acquisition unit 11 has the function of the peripheral imaging device 22.

第２実施形態では、対象物判定部１５Ｂは、抽出対象物ＴｅがＰＯＩであるか否かを判定すると共に、抽出対象物Ｔｅの画像が視野画像取得部１４Ｂにより取得された視野画像に含まれるか否かを判定する例を示したが、これに限定されない。例えば、対象物判定部は、抽出対象物がＰＯＩであるか否かを判定せず、抽出対象物の画像が視野画像取得部により取得された視野画像に含まれるか否かのみを判定しても良い。このように判定する場合、抽出対象物がＰＯＩでなくても、視野画像に含まれていると判定されれば、第１表示画像が生成される。 In the second embodiment, an example was shown in which the object determination unit 15B determines whether or not the extraction target Te is a POI and also determines whether or not the image of the extraction target Te is included in the visual field image acquired by the visual field image acquisition unit 14B, but the present disclosure is not limited to this. For example, the object determination unit may skip the determination of whether the extraction target is a POI and determine only whether the image of the extraction target is included in the visual field image acquired by the visual field image acquisition unit. With this determination, the first display image is generated whenever the extraction target is determined to be included in the visual field image, even if it is not a POI.

１Ａ，１Ｂ，１Ｃ表示画像生成装置 1A, 1B, 1C Display image generation device
１１周辺画像取得部 11 Surrounding image acquisition unit
１２発言データ取得部 12 Speech data acquisition unit
１３対象物抽出部 13 Object extraction unit
１４Ａ，１４Ｂ，１４Ｃ視野画像取得部 14A, 14B, 14C Visual field image acquisition unit
１５Ａ，１５Ｂ，１５Ｃ対象物判定部 15A, 15B, 15C Object determination unit
１６Ａ，１６Ｂ，１６Ｃ存否判定部 16A, 16B, 16C Existence determination unit
１７Ａ，１７Ｂ，１７Ｃ位置関係取得部 17A, 17B, 17C Positional relationship acquisition unit
１８Ａ，１８Ｂ，１８Ｃ表示画像生成部 18A, 18B, 18C Display image generation unit
１９ＰＯＩ情報記憶部 19 POI information storage unit
２０視線認識部 20 Line of sight recognition unit
２Ａ，２Ｂ，２Ｃ車両 2A, 2B, 2C Vehicle
２１ナビゲーション装置 21 Navigation device
２２周辺撮像装置 22 Peripheral imaging device
２３姿勢取得装置 23 Attitude acquisition device
３Ａ，３Ｂ，３Ｃユーザ用端末 3A, 3B, 3C User terminal
３１Ａ，３１Ｂ，３１Ｃ表示画像表示装置 31A, 31B, 31C Display image display device
３２視野画像取得装置 32 Visual field image acquisition device
４発言主体用端末 4 Speaker terminal
４１発言データ取得装置 41 Speech data acquisition device

Claims

A display image generation device that identifies an object included in an utterance made by a speaker as an extraction target and generates a display image regarding the extraction target, the device comprising:
an utterance data acquisition unit that acquires utterance data of the utterance made to a user by the speaker;
an object extraction unit that stores a plurality of object data in advance, compares the plurality of object data with the utterance data acquired by the utterance data acquisition unit, and extracts, from the utterance data, data that matches the object data as the extraction target;
a visual field image acquisition unit that acquires an image including at least a visual field image corresponding to the visual field of the user;
an object determination unit that determines whether the extraction target extracted by the object extraction unit is included in the visual field image; and
a display image generation unit that acquires extraction target information, which is information regarding the position of the extraction target, and generates the display image including the extraction target information different from the visual field image,
wherein the display image generation unit determines a display mode of the display image regarding the extraction target based on a determination result by the object determination unit as to whether the extraction target is included in the visual field image, the display mode differing depending on whether or not the extraction target is included in the visual field image.

The display image generation device according to claim 1, wherein, when the object determination unit determines that the extraction target is included in the visual field image, the display image generation unit generates the display image showing the extraction target information in the display mode that emphasizes the extraction target itself.

The display image generation device according to claim 1 or 2, comprising a positional relationship acquisition unit that acquires a relative positional relationship between the extraction target and the user,
wherein, when the object determination unit determines that the extraction target is not included in the visual field image, the display image generation unit generates the display image showing the extraction target information in the display mode that displays the positional relationship.

The display image generation device according to any one of claims 1 to 3, wherein the display image generation unit generates the display image including information indicating whether or not the extraction target is visible to the user in the visual field image, based on the determination result of the object determination unit as to whether the extraction target is included in the visual field image.

The display image generation device according to any one of claims 1 to 4, comprising a presence/absence determination unit that determines whether or not the extraction target exists within a preset target range when the object determination unit determines that the extraction target is not included in the visual field image,
wherein the display image generation unit determines the display mode of the extraction target information based on a determination result of the presence/absence determination unit as to whether or not the extraction target exists within the target range.

The display image generation device according to claim 5, comprising a peripheral image acquisition unit that acquires a peripheral image, which is an image of an area around the user including the visual field image, and stores the acquired peripheral image,
wherein the presence/absence determination unit determines whether or not the extraction target exists within the target range based on the current or past peripheral images acquired by the peripheral image acquisition unit.

The display image generation device according to claim 5 or 6, comprising a positional relationship acquisition unit that acquires a relative positional relationship between the extraction target and the user,
wherein, when the presence/absence determination unit determines that the extraction target exists within the target range, the display image generation unit generates the display image showing the extraction target information in the display mode that displays the positional relationship, including the direction and distance of the position of the extraction target relative to a reference position set at the user's position or at a position in the vicinity of the user.

The display image generation device according to any one of claims 5 to 7, wherein the extraction target is a POI (Points of Interest) that is a landmark associated with a position on a map,
the device comprises a POI information storage unit that stores POI information of the POI including at least information regarding the location of the extraction target, and
the presence/absence determination unit determines whether or not the extraction target exists within the target range based on the POI information stored in the POI information storage unit.

The display image generation device according to any one of claims 1 to 8, wherein the speaker is a person, and
the utterance data is utterance signal data of the utterance made by the person to the user.

The display image generation device according to any one of claims 1 to 8, wherein the speaker is a speaking device that makes the utterance to the user, and
the utterance data is output sentence data indicating the content of an output sentence output as the utterance.

The display image generation device described above, wherein the object determination unit outputs, to the speaker, information on the determination result as to whether or not the extraction target is included in the visual field image.

The display image generation device according to any one of claims 1 to 11, wherein the utterance data acquisition unit acquires information identifying the speaker who made the utterance to the user, and
the display image generation unit generates the display image including the information identifying the speaker acquired by the utterance data acquisition unit.

The display image generation device according to any one of claims 1 to 12, comprising a peripheral image acquisition unit that acquires a peripheral image, which is an image of an area around the user including the visual field image, and stores the acquired peripheral image, and
a line-of-sight recognition unit that recognizes the line of sight of the user,
wherein the visual field image acquisition unit acquires the visual field image based on the current peripheral image acquired by the peripheral image acquisition unit and the current line of sight of the user recognized by the line-of-sight recognition unit.

A display image generation method using a display image generation device that identifies an object included in an utterance made by a speaker as an extraction target and generates a display image regarding the extraction target, the method comprising:
an utterance data acquisition step of acquiring utterance data of the utterance made to a user by the speaker;
an object extraction step of comparing a plurality of pre-stored object data with the acquired utterance data, and extracting, from the utterance data, data that matches the object data as the extraction target;
a visual field image acquisition step of acquiring a visual field image corresponding to the user's visual field;
an object determination step of determining whether the extracted extraction target is included in the visual field image; and
a display image generation step of acquiring extraction target information, which is information regarding the position of the extraction target, and generating the display image including the extraction target information different from the visual field image,
wherein, in the display image generation step, a display mode of the display image regarding the extraction target is determined based on a determination result in the object determination step as to whether the extraction target is included in the visual field image, the display mode differing depending on whether or not the extraction target is included in the visual field image.