JP2021081372A

JP2021081372A - Display image generator and display image generation method

Info

Publication number: JP2021081372A
Application number: JP2019210845A
Authority: JP
Inventors: Yasushi Inoue; Nori Nishiyama; Yuu Shioda; Takehito Teraguchi; Shota Okubo
Original assignee: Renault SAS; Nissan Motor Co Ltd
Current assignee: Renault SAS; Nissan Motor Co Ltd
Priority date: 2019-11-21
Filing date: 2019-11-21
Publication date: 2021-05-27
Anticipated expiration: 2039-11-21
Also published as: JP7418189B2

Abstract

PROBLEM TO BE SOLVED: To appropriately generate information relating to the position of an extraction target recognized by an entity other than the user, irrespective of whether or not the extraction target is included in the visual field of the user. SOLUTION: A display image generator 1A comprises: an utterance data acquisition unit 12 for acquiring the utterance data of a statement uttered to a user by an uttering entity; a target extraction unit 13 for extracting, as an extraction target, data in the utterance data that matches the target data; a visual field image acquisition unit 14A for acquiring the visual field image of the user; a target determination unit 15A for determining whether or not the extraction target is included in the visual field image; and a display image generation unit 18A for acquiring extraction target information, which is information relating to the position of the extraction target, and generating a display image that includes the extraction target information. The display image generation unit 18A determines the display mode of the display image relating to the extraction target on the basis of the determination result of whether or not the extraction target is included in the visual field image. SELECTED DRAWING: Figure 1

Description

The present disclosure relates to a display image generation device and a display image generation method.

Techniques for generating information on the position of a recognized object outside a vehicle are known. For example, Patent Document 1 discloses a technique that identifies, by line-of-sight detection and voice recognition, an object outside the vehicle to which a vehicle occupant is paying attention, and generates a display image showing the position of the identified object relative to the vehicle.

Japanese Unexamined Patent Publication No. 2006-90790

However, the conventional technique described above presupposes that the user is looking in the direction in which the object exists; it cannot generate information on the position of the object regardless of whether or not the object is included in the user's field of view. Moreover, the conventional technique is intended to generate information on the position of an object recognized by the user himself or herself, and does not consider generating, for the user, information on the position of an object recognized by a subject other than the user.

The present disclosure has been made in view of these circumstances, and its object is to provide a display image generation device and a display image generation method that appropriately generate information on the position of an extraction target recognized by a subject other than the user, regardless of whether or not the extraction target is included in the user's field of view.

The display image generation device according to the present disclosure identifies an object included in a statement made by a speaking subject as an extraction target and generates a display image relating to the extraction target. The device includes a speech data acquisition unit, an object extraction unit, a visual field image acquisition unit, an object determination unit, and a display image generation unit. The speech data acquisition unit acquires speech data of a statement made to a user by the speaking subject. The object extraction unit stores a plurality of object data in advance, compares the plurality of object data with the speech data acquired by the speech data acquisition unit, and extracts, as the extraction target, the data in the speech data that matches the object data. The visual field image acquisition unit acquires an image that includes at least a visual field image corresponding to the user's field of view. The object determination unit determines whether or not the extraction target extracted by the object extraction unit is included in the visual field image. The display image generation unit acquires object information, which is information on the position of the extraction target, and generates a display image that is distinct from the visual field image and includes the object information. The display image generation unit determines the display mode of the display image relating to the extraction target based on the object determination unit's determination of whether or not the extraction target is included in the visual field image.
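The last step above, choosing a display mode from the determination result, can be illustrated with a minimal sketch. The two mode names, the `DisplayImage` type, and the message strings are hypothetical and introduced only for illustration; the disclosure itself does not define them here.

```python
from dataclasses import dataclass
from enum import Enum


class DisplayMode(Enum):
    # Hypothetical mode names, not taken from the disclosure.
    IN_VIEW = "in_view"          # extraction target is in the visual field image
    OUT_OF_VIEW = "out_of_view"  # extraction target is not in the visual field image


@dataclass
class DisplayImage:
    target: str
    mode: DisplayMode
    text: str


def generate_display_image(target: str, in_visual_field: bool) -> DisplayImage:
    """Pick a display mode from the object determination result."""
    if in_visual_field:
        return DisplayImage(target, DisplayMode.IN_VIEW, f"{target}: in view")
    return DisplayImage(target, DisplayMode.OUT_OF_VIEW, f"{target}: not in view")
```

The point of the sketch is only that the same extraction target yields a different display image depending on the boolean determination result.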

According to the present disclosure, information on the position of an object recognized by a subject other than the user can be appropriately generated regardless of whether or not the object is included in the user's field of view.

A block diagram showing the display image generation device according to the first embodiment.
A diagram showing a user and a fellow passenger riding in a vehicle while wearing terminals.
A plan view for explaining the user's field of view as seen from above the vehicle.
A diagram showing the surrounding situation corresponding to the field of view of the user X, with the display image superimposed in the first display mode.
A diagram showing the display image display device on which the first display image is displayed.
A diagram showing the surrounding situation corresponding to the field of view of the user X, with the display image superimposed in the second display mode.
A diagram showing the display image display device on which the second display image is displayed.
A diagram showing the surrounding situation corresponding to the field of view of the user X, with the display image superimposed in the third display mode.
A diagram showing the display image display device on which the third display image is displayed.
A flowchart showing the display image generation process according to the first embodiment.
A block diagram showing the display image generation device according to the second embodiment.
A flowchart showing the display image generation process according to the second embodiment.
A block diagram showing the display image generation device according to the third embodiment.
A flowchart showing the display image generation process according to the third embodiment.

Hereinafter, exemplary embodiments of the present disclosure will be described with reference to the drawings. In the following description, identical or corresponding parts are given the same reference numerals, and duplicate description is omitted.
[First Embodiment]

FIG. 1 is a block diagram showing a display image generation device 1A according to the first embodiment. FIG. 2 is a diagram showing a user X and a user Y riding together in a vehicle 2A while wearing terminals. FIG. 3 is a plan view for explaining the visual field Ex of the user X as seen from above the vehicle 2A. FIGS. 4A, 5A, and 6A are diagrams showing the surrounding situation corresponding to the visual field Ex of the user X, with the display image superimposed in each display mode. FIGS. 4B, 5B, and 6B are diagrams showing the display image display device on which each display image is displayed. As shown in FIGS. 1 to 6, the display image generation device 1A is a device that identifies an object T included in a statement made by a speaking subject (that is, mentioned in the statement) as an extraction target Te, and generates a display image P relating to the extraction target Te.

More specifically, the display image generation device 1A generates a display image P that is displayed superimposed on the surrounding situation corresponding to the visual field Ex of the user X. The user X is riding in the vehicle 2A together with a user Y, a person who is the speaking subject, and is, for example, looking at the scenery outside the vehicle. The user X wears a user terminal 3A. The user Y wears a speaking-subject terminal 4 (see FIG. 2). In the present embodiment, the display image generation device 1A is described using a situation in which the user Y speaks to the user X as an example.

Here, the "visual field Ex of the user X" means the region that the user X can see. This "visible region" is the effective visual field: the range from which a person can effectively obtain information from the outside world while fixating on a point (gaze point) near the center of the physiological visual field. For example, the visual field Ex of the user X may be set to all regions, above, below, left, and right, that are visible around the central axis of the visual field Ex. FIG. 3 shows the horizontal visible region of the user X as seen from above the vehicle 2A. The visual field Ex of the user X changes as the vehicle 2A moves. For example, in FIG. 3, the current position of the user X is indicated by a current position X1, and the position of the user X after moving from the current location is indicated by a moved position X2. In the following description, the visual field Ex of the user X is assumed to be set to all regions, above, below, left, and right, that the user X can see through the transmissive display of the user terminal 3A (described later) while wearing the user terminal 3A and facing a given direction. The visual field Ey of the user Y likewise means the region that the user Y can see (see FIG. 2).

The "surrounding situation" means the actual scenery outside the vehicle (the outside view) in the area around the user X that the user X can see. The surrounding situation is, for example, the scenery outside the vehicle over a region spanning 360 degrees horizontally around the current position of the user X, including the regions above and below the user X. The "surrounding situation corresponding to the visual field Ex of the user X" means the scenery outside the vehicle that falls within the visual field Ex of the user X. As shown in FIG. 3, the surrounding situation corresponding to the visual field Ex of the user X changes as the vehicle 2A moves.

"Generating a display image" means generating image information to be displayed on a display or the like. When the image information generated by the display image generation device 1A is transmitted to a display or the like by wired or wireless communication, the display image P corresponding to the transmitted image information can be displayed on that display. The "display image P" is an image that displays information relating to the extraction target Te, more specifically, information on the position of the object T. Here, the display image P is displayed superimposed on the surrounding situation corresponding to the visual field Ex of the user X. The display image P may be, for example, an image containing text indicating whether or not the extraction target Te is included in the visual field image, or an image of a rectangular frame displayed so that a specific extraction target Te included in the visual field image appears enclosed by the frame. Details are described later. Here, the "visual field image" is an image corresponding to the visual field Ex of the user X, that is, an image capturing the surrounding situation corresponding to the visual field Ex of the user X. In the present embodiment, the surrounding situation corresponding to the visual field Ex of the user X is the actual scenery outside the vehicle, and the visual field image is an image of that surrounding situation captured by an imaging device (visual field image acquisition device 32).

The display image generation device 1A is configured as, for example, a server, and includes a processor (processing device), a memory (storage device), and the like.

The processor may be, for example, a CPU (Central Processing Unit) or an MPU (Micro-Processing Unit). The memory may include at least one of a semiconductor storage device, a magnetic storage device, and an optical storage device. The memory may also include registers, cache memory, and ROM (Read Only Memory) or RAM (Random Access Memory) used as a main storage device.

The display image generation device 1A, the vehicle 2A, the user terminal 3A, and the speaking-subject terminal 4 are connected to one another so that they can communicate (transmit and receive) by wire or wirelessly. The functional configuration of the display image generation device 1A is described later.

The vehicle 2A is a passenger car or the like in which the user X and the user Y are riding. The vehicle 2A may support both manual driving and automated driving by switching between them, or may support only one of the two. The vehicle 2A includes a navigation device 21 and a peripheral imaging device 22. The navigation device 21 sets a travel route for the vehicle 2A to a set destination based on, for example, position information of the vehicle 2A detected by GPS (Global Positioning System) or the like and map information, and guides the vehicle 2A along that travel route. The navigation device 21 stores (holds) a time-series history of the position of the vehicle 2A (for example, position coordinates detected by GPS). The navigation device 21 may obtain the traveling direction of the vehicle 2A from the stored position history.
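As one illustration of how a traveling direction could be derived from a stored position history, the sketch below computes the initial great-circle bearing between the two most recent GPS fixes. The function name, the `(latitude, longitude)` tuple layout, and the choice of the standard bearing formula are assumptions made for this example; the disclosure does not specify how the navigation device 21 performs the calculation.

```python
import math


def travel_direction_deg(history):
    """Estimate heading from a position history.

    history: list of (latitude, longitude) tuples in degrees, oldest first
    (hypothetical layout). Returns the bearing in degrees clockwise from
    north, or None if the history is too short or the vehicle has not moved.
    """
    if len(history) < 2:
        return None
    (lat1, lon1), (lat2, lon2) = history[-2], history[-1]
    if (lat1, lon1) == (lat2, lon2):
        return None
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    # Standard initial-bearing formula on a sphere.
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0
```

A real system would likely smooth over more than two fixes to suppress GPS jitter; the two-point version is kept here for brevity.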

The peripheral imaging device 22 captures the surrounding situation of the user X and acquires a peripheral image. A "peripheral image" is an image of a region around the user X that includes the visual field Ex of the user X (that is, the visual field image). The peripheral image of the user X may be, for example, an image capturing a region spanning 360 degrees horizontally around the user X, and may additionally cover the region above the user X. Alternatively, the peripheral image may cover the region around the user X excluding areas that are difficult for the user X to see (for example, the area behind the user X while seated in the vehicle 2A). Alternatively, it may cover exactly the same region as the visual field Ex of the user X. The extent of the "region that includes the visual field Ex of the user X" is not particularly limited as long as the region includes the visual field Ex.

The peripheral imaging device 22 consists of, for example, one or more cameras. The cameras may be installed outside the cabin, for example on the roof of the vehicle 2A, or inside the cabin, for example behind the windshield. The vehicle 2A transmits the peripheral image of the user X captured by the peripheral imaging device 22 to the display image generation device 1A. "Transmitting an image" means transmitting the image data of that image.

The user terminal 3A is a device worn on the head of the user X and includes a display image display device 31A and a visual field image acquisition device 32. The display image display device 31A has a display capable of showing the display image P generated by the display image generation device 1A. This display is a transmissive display, for example of the glasses or goggles type, and sits directly in front of the eyes of the user X when the user terminal 3A is worn. The user X can therefore see the surrounding situation corresponding to the visual field Ex through the display image display device 31A. When the display image P is shown on the display image display device 31A, the display image P (see FIGS. 4B, 5B, and 6B) appears to the user X superimposed on the surrounding situation corresponding to the visual field Ex. In other words, the display image display device 31A functions as an HMD (Head Mounted Display) of the kind used in AR (Augmented Reality) technology.

The visual field image acquisition device 32 is an imaging device that captures the surrounding situation corresponding to the visual field Ex of the user X and acquires a visual field image. It is mounted on the user terminal 3A so that, when the user X wears the user terminal 3A, it is oriented to capture the line-of-sight direction of the user X. The visual field image acquisition device 32 is provided, for example, on the side of the display image display device 31A. The user terminal 3A transmits the visual field image captured by the visual field image acquisition device 32 to the display image generation device 1A. "Transmitting the visual field image" means transmitting the image data of the visual field image. The visual field image acquisition device 32 may further include a sensor (not shown) that detects the line-of-sight direction of the user X, and may transmit the detected line-of-sight direction together with the image data of the visual field image.

The speaking-subject terminal 4 is a device worn on the head of the user Y and includes a speech data acquisition device 41. The speech data acquisition device 41 acquires, as speech data, statements made by the user Y to the user X, and consists of, for example, a microphone. Here, the speaking-subject terminal 4 is a headset, and the speech data acquisition device 41 is a microphone provided on the headset. The speech data acquisition device 41 may instead be an in-vehicle microphone or an earphone. The speaking-subject terminal 4 may further include a display image display device 31A and a visual field image acquisition device 32 like those of the user terminal 3A. "Speech data" is data carrying information about the content of a statement; here, the speech data is the speech signal data of the statement. "Speech signal data" means the audio signal of the statement. Speech data also includes data captured while the user Y is saying nothing.

The speaking-subject terminal 4 transmits the statement acquired by the speech data acquisition device 41 to the display image generation device 1A. At this time, the speaking-subject terminal 4 also transmits to the display image generation device 1A information specifying that the terminal is worn by the user Y (information identifying the user Y). This is information linked to the user Y and may be, for example, the ID (identification) number of the speaking-subject terminal 4 associated with the user Y. "Transmitting a statement" means transmitting the speech signal data of the statement (described in detail later).

Next, the functional configuration of the display image generation device 1A is described. The display image generation device 1A has a peripheral image acquisition unit 11, a speech data acquisition unit 12, an object extraction unit 13, a visual field image acquisition unit 14A, an object determination unit 15A, a presence determination unit 16A, a positional relationship acquisition unit 17A, and a display image generation unit 18A.

The peripheral image acquisition unit 11 acquires the peripheral images of the user X transmitted from the vehicle 2A and stores them in time series. More specifically, it acquires the peripheral image of the user X by receiving from the vehicle 2A the peripheral image captured by the peripheral imaging device 22, and stores the acquired peripheral images in chronological order. That is, the peripheral image acquisition unit 11 acquires the current peripheral image of the user X and accumulates previously acquired peripheral images as past peripheral images. The peripheral image acquisition unit 11 may delete stored past peripheral image information at a preset timing.
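The store-then-purge behavior described above can be sketched as a small time-ordered buffer. The class and field names, and the age-based deletion policy (`max_age_s`), are assumptions for this example; the disclosure says only that past images may be deleted at a preset timing.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class PeripheralFrame:
    timestamp: float  # capture time in seconds (hypothetical field)
    image: bytes      # encoded image data


class PeripheralImageStore:
    """Keeps peripheral images in chronological order and purges old ones."""

    def __init__(self, max_age_s: float = 60.0):
        self.max_age_s = max_age_s
        self._frames = deque()

    def add(self, frame: PeripheralFrame) -> None:
        # Frames are assumed to arrive in chronological order.
        self._frames.append(frame)

    def current(self):
        """Return the most recent frame, or None if nothing is stored."""
        return self._frames[-1] if self._frames else None

    def purge(self, now: float) -> None:
        """Drop frames older than max_age_s relative to `now`."""
        while self._frames and now - self._frames[0].timestamp > self.max_age_s:
            self._frames.popleft()

    def __len__(self) -> int:
        return len(self._frames)
```

A deque is used because both appending the newest frame and dropping the oldest are constant-time operations.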

The speech data acquisition unit 12 acquires speech data of statements made by the user Y to the user X. More specifically, it does so by receiving from the speaking-subject terminal 4 the speech signal data of the user Y's statement acquired by the speech data acquisition device 41. The speech data acquisition unit 12 also determines whether or not the speech data contains a statement by the user Y; if the user Y has not spoken, it determines that the speech data contains no statement by the user Y.
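The disclosure does not state how the presence of a statement in the speech data is determined. Purely as an illustrative assumption, the sketch below uses a simple root-mean-square energy threshold on normalized audio samples; the function name and the threshold value are hypothetical.

```python
import math


def contains_speech(samples, threshold=0.02):
    """Return True if the RMS level of the samples reaches the threshold.

    samples: sequence of floats in [-1.0, 1.0] (assumed normalized audio).
    threshold: hypothetical RMS level separating silence from speech.
    """
    if not samples:
        return False
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return rms >= threshold
```

An energy threshold cannot tell speech from road noise, so a practical system would more plausibly rely on a dedicated voice activity detector.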

The speech data acquisition unit 12 also acquires information identifying the user Y who made the statement to the user X. For example, it receives the information identifying the user Y from the speaking-subject terminal 4.

Based on the speech data acquired by the speech data acquisition unit 12, the object extraction unit 13 extracts character strings representing pre-stored objects T that appear in the corresponding statement. In detail, the object extraction unit 13 stores in advance character strings representing a plurality of objects T (object data), compares these with the character string obtained by converting the speech data (itself a form of speech data), and extracts, as the extraction target Te, any character string (data) in the converted text that matches a character string representing an object T. An "object T" is a physical object that actually exists. It may be, for example, a type of object expressed by a common noun (bicycle, street light, building, and so on) or the name of an object expressed by a proper noun (Mt. Fuji, the National Diet Building, and so on). The object may also be qualified by its attributes or features (for example, a blue bicycle or the summit of Mt. Fuji). The object extraction unit 13 has a storage unit in which common nouns, proper nouns, attributes, or features representing objects T are stored in advance, and it extracts these from the speech data acquired by the speech data acquisition unit 12.

As an example, consider a case in which the user Y says to the user X, "There is a bicycle over there." In this case, based on the speech data of the user Y's statement acquired by the speech data acquisition unit 12, the object extraction unit 13 extracts the extraction target Te from the statement. Here, the object extraction unit 13 is assumed to have stored in advance that the word "bicycle" represents a type of object T. The object extraction unit 13 therefore extracts "bicycle" from the user Y's statement as the extraction target Te. Depending on the content of the user Y's statement, it may not be possible to extract any extraction target Te.

The object extraction unit 13 extracts the character strings representing the pre-stored objects T that are mentioned in the speech, for example by speech recognition. Any known speech recognition technique can be used. For example, the object extraction unit 13 uses speech recognition to recognize the speech audio signal in the speech data as a character string, and extracts the extraction target Te by comparing the recognized character string with the character strings representing the plurality of objects T.
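The matching step performed by the object extraction unit 13 can be illustrated with a minimal sketch. It assumes that speech recognition has already produced a text string; the dictionary `OBJECT_DATA` and the function name `extract_target` are hypothetical names introduced only for this illustration and do not appear in the disclosure.

```python
from typing import Optional

# Hypothetical pre-stored object data: each character string maps to an object T.
OBJECT_DATA = {
    "bicycle": "bicycle",
    "street light": "street light",
    "Mt. Fuji": "Mt. Fuji",
}


def extract_target(recognized_text: str,
                   object_data: dict[str, str] = OBJECT_DATA) -> Optional[str]:
    """Return the stored object whose string occurs in the text, or None."""
    lowered = recognized_text.lower()
    # Try longer strings first so a more specific entry wins over a shorter one.
    for surface in sorted(object_data, key=len, reverse=True):
        if surface.lower() in lowered:
            return object_data[surface]
    # No stored string matched: no extraction target Te can be extracted.
    return None
```

Calling `extract_target("There is a bicycle over there.")` yields `"bicycle"`, while a statement that mentions no stored object yields `None`, corresponding to the case where no extraction target Te can be extracted.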

The visual field image acquisition unit 14A acquires an image that includes at least the visual field image, that is, the image corresponding to the visual field Ex of user X. An "image that includes at least the visual field image" may cover the same range as the visual field image or a wider range. The visual field image acquisition unit 14A acquires the visual field image by receiving it from the visual field image acquisition device 32 of the user terminal 3A, which captures it. The visual field image acquisition unit 14A may also acquire information on the line-of-sight direction of user X from the visual field image acquisition device 32.

The object determination unit 15A determines whether the extraction target Te is included in the visual field image of the visual field Ex of user X acquired by the visual field image acquisition unit 14A. As described above, the "extraction target Te" is the object, among the plurality of objects T stored in the object extraction unit 13, that matches the speech data. Here, the object extraction unit 13 has extracted "bicycle" as the extraction target Te.

The object determination unit 15A determines whether the extraction target Te is included in the visual field image by, for example, image recognition. Any known image recognition technique can be used. For example, the object determination unit 15A may apply, as the image recognition, a machine learning model and a deep learning model capable of detecting identification information such as the name, type, shape, color, and orientation of objects in an image, and an image processing algorithm using OpenCV (Open Source Computer Vision Library).

For example, the object determination unit 15A detects the identification information of the objects in the visual field image and compares it with the data representing the extraction target Te (object type, etc.) acquired by the object extraction unit 13. It then determines whether the visual field image includes the extraction target Te based on whether the data representing the extraction target Te matches any of the objects in the visual field image in at least one of type and name. The object determination unit 15A may also use OCR (Optical Character Recognition) as the image recognition: it recognizes the text on signboards in the visual field image, compares the data representing the name of the extraction target Te acquired by the object extraction unit 13 with the recognized signboard text, and determines whether the visual field image includes the extraction target Te based on whether the name of the extraction target Te matches part of the text of at least one signboard in the visual field image.
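The comparison described above can be sketched as follows. The sketch assumes that an image recognizer and an OCR engine have already produced their outputs; the class `DetectedObject` and the function `target_in_view` are hypothetical names used only for illustration, and no particular recognition library is implied.

```python
from collections.abc import Sequence
from dataclasses import dataclass


@dataclass
class DetectedObject:
    """Identification information for one object found in the visual field image."""
    name: str  # proper name if recognized, otherwise an empty string
    kind: str  # object type, e.g. "bicycle"


def target_in_view(target: str,
                   detections: Sequence[DetectedObject],
                   sign_texts: Sequence[str] = ()) -> bool:
    """Return True if the target matches a detected name/type or signboard text."""
    t = target.casefold()
    # Match on at least one of name and type.
    if any(t in (d.name.casefold(), d.kind.casefold()) for d in detections):
        return True
    # OCR variant: the target name matches part of at least one signboard.
    return any(t in text.casefold() for text in sign_texts)
```

The case-insensitive comparison is a design choice of this sketch; the disclosure only requires that the data representing the extraction target Te match in at least one of type and name.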

The object determination unit 15A outputs information on the result of determining whether the extraction target Te is included in the visual field image to the display image display device 31A of user X. The "result of determining whether it is included in the visual field image" means information on whether the extraction target Te is visible to user X (that is, within the visual field Ex of user X). Here, the object determination unit 15A outputs the determination result to the user terminal 3A of user X. The determination result is also output to the speaker terminal 4 of user Y, who is the speaker.

When the object determination unit 15A determines that the extraction target Te is not included in the visual field image, the presence/absence determination unit 16A determines whether the extraction target Te exists within a preset target range. Specifically, it makes this determination based on the current or past peripheral images acquired by the peripheral image acquisition unit 11. The "target range" is a predetermined range set in advance around the position of user X or the vehicle 2A. For example, the target range may be a predetermined range, centered on the position of user X or the vehicle 2A, that is visible to user X (the range shown by the two-dot chain line in FIG. 3). The range may be circular, extending for example 50 kilometers from user X or the vehicle 2A, or may have any non-circular shape. The target range may also be set according to the size of the extraction target Te, as the range within which user X can see the extraction target Te; for example, if the extraction target Te is Mt. Fuji, the target range may be set to extend up to 300 kilometers from the position of user X or the vehicle 2A (the center). In this example, the target range is a radius of 300 kilometers from the center.
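A circular target range whose radius depends on the extraction target can be sketched as below. The 50 km and 300 km values are the examples given in the text; the table `RADIUS_BY_TARGET_KM`, the default value, and the function names are assumptions made for this illustration.

```python
import math

# Assumed default radius (the 50 km example) and a per-target override
# (the 300 km example for Mt. Fuji). Both tables are hypothetical.
DEFAULT_RADIUS_KM = 50.0
RADIUS_BY_TARGET_KM = {"Mt. Fuji": 300.0}


def target_radius_km(target: str) -> float:
    """Radius of the circular target range for the given extraction target."""
    return RADIUS_BY_TARGET_KM.get(target, DEFAULT_RADIUS_KM)


def within_target_range(target: str,
                        offset_east_km: float,
                        offset_north_km: float) -> bool:
    """True if the target's offset from the center lies inside the range."""
    distance_km = math.hypot(offset_east_km, offset_north_km)
    return distance_km <= target_radius_km(target)
```

With these assumed values, an offset of (120 km, 160 km), i.e. 200 km from the center, is inside the range for Mt. Fuji but outside the default range used for a bicycle.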

First, the presence/absence determination unit 16A determines whether the extraction target Te is included in the current and past peripheral images of user X that were acquired by the peripheral image acquisition unit 11 and stored in time series. More specifically, it retrieves the current and past peripheral images stored by the peripheral image acquisition unit 11 and determines whether the extraction target Te appears in them. The presence/absence determination unit 16A may make this determination by, for example, image recognition. It may also have the object determination unit 15A perform image recognition on the images making up the stored current and past peripheral images, and make the determination based on the result.

The presence/absence determination unit 16A also detects the various objects in the images making up the current and past peripheral images stored by the peripheral image acquisition unit 11, detects identification information such as each object's name, type, shape, color, and orientation, and assigns one or more image tags to the detected identification information to generate and store tagged images. It then determines whether the extraction target Te is included in the current and past peripheral images based on whether there is a peripheral image whose image tags match, in at least one of object name and type, the speech data representing the extraction target Te acquired by the object extraction unit 13.
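The tag-based lookup can be sketched as follows, assuming the tags have already been produced by an object detector. `TaggedImage`, `images_containing`, and `latest_image_containing` are hypothetical names for this illustration; the last helper also anticipates the case where only a past peripheral image contains the target.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TaggedImage:
    """A stored peripheral image with its capture time and image tags."""
    timestamp: float
    tags: set[str] = field(default_factory=set)


def images_containing(target: str,
                      images: list[TaggedImage]) -> list[TaggedImage]:
    """Return the peripheral images whose tags match the extraction target."""
    t = target.casefold()
    return [img for img in images if t in {tag.casefold() for tag in img.tags}]


def latest_image_containing(target: str,
                            images: list[TaggedImage]) -> Optional[TaggedImage]:
    """Return the most recent matching image, or None if there is none."""
    hits = images_containing(target, images)
    return max(hits, key=lambda img: img.timestamp) if hits else None
```

An empty result from `images_containing` corresponds to the determination that the extraction target Te is not included in any stored peripheral image.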

When the presence/absence determination unit 16A determines that the extraction target Te is not included in the acquired current and past peripheral images, it determines that the extraction target Te does not exist within the preset target range.

Next, when it determines that the extraction target Te is included in the acquired current and past peripheral images, the presence/absence determination unit 16A determines whether the position of the extraction target Te is within the target range. When the extraction target Te is included in the current peripheral image, the presence/absence determination unit 16A can obtain the direction and distance from user X or the vehicle 2A to the extraction target Te by a known method. For example, it may estimate the direction and distance from user X or the vehicle 2A to the extraction target Te based on the current peripheral image of user X acquired by the peripheral image acquisition unit 11, and determine whether the position of the extraction target Te is within the target range. Alternatively, it may measure the direction and distance from user X or the vehicle 2A to the extraction target Te using RADAR (Radio Detection and Ranging), LIDAR (Light Detection and Ranging), or the like provided on the vehicle 2A (not shown), and determine whether the position of the extraction target Te is within the target range.

When the extraction target Te is not included in the current peripheral image, the presence/absence determination unit 16A retrieves from the peripheral image acquisition unit 11 the most recent peripheral image, in chronological order, that includes the extraction target Te. It then uses the actual vehicle position history acquired from the navigation device 21 to calculate the relative direction and distance between the current position of user X or the vehicle 2A and their position at the time that image was captured. Next, based on this relative direction and distance and on the relative direction and distance from user X or the vehicle 2A to the extraction target Te at the time of capture, it estimates the current direction and distance from user X or the vehicle 2A to the extraction target Te. The presence/absence determination unit 16A may then determine whether the distance from user X or the vehicle 2A to the extraction target Te is within the target range.
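The estimate from a past peripheral image amounts to adding two displacement vectors. The sketch below assumes a flat local coordinate frame (east, north) in meters and a bearing measured clockwise from north; these conventions and the function name `current_offset_to_target` are assumptions for illustration and are not specified in the disclosure.

```python
import math

Vec = tuple[float, float]  # (east, north) in meters, local flat frame


def current_offset_to_target(pos_at_capture: Vec,
                             target_offset_at_capture: Vec,
                             pos_now: Vec) -> tuple[float, float]:
    """Estimate (distance_m, bearing_deg) from the current position to the target.

    pos_at_capture: vehicle position when the past image was captured.
    target_offset_at_capture: target position relative to the vehicle then.
    pos_now: current vehicle position from the position history.
    """
    # Absolute target position = capture position + offset seen at capture.
    tx = pos_at_capture[0] + target_offset_at_capture[0]
    ty = pos_at_capture[1] + target_offset_at_capture[1]
    # Offset from the current position to that target position.
    dx, dy = tx - pos_now[0], ty - pos_now[1]
    distance_m = math.hypot(dx, dy)
    # Bearing clockwise from north, normalized to [0, 360).
    bearing_deg = math.degrees(math.atan2(dx, dy)) % 360.0
    return distance_m, bearing_deg
```

For instance, if the target was 10 m north of the vehicle at capture and the vehicle has since moved 30 m north, the target is now 20 m away on a bearing of 180 degrees, that is, directly behind a vehicle heading north.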

The positional relationship acquisition unit 17A acquires the relative positional relationship between the extraction target Te and user X. The "positional relationship" may be expressed as the direction and distance of the extraction target Te relative to a reference position set at or near the position of user X (for example, the center of the vehicle 2A), or as information indicating that the extraction target Te does not exist within the preset target range. The positional relationship acquisition unit 17A may acquire the direction and distance from user X or the vehicle 2A to the extraction target Te from the presence/absence determination unit 16A. It may also estimate that direction and distance based on the current or past peripheral images acquired by the peripheral image acquisition unit 11, or by means of radar, lidar, or the like provided on the vehicle 2A. It may further acquire, from the presence/absence determination unit 16A, information indicating that the extraction target Te does not exist within the preset target range.

The positional relationship acquisition unit 17A calculates the direction of the extraction target Te relative to the line-of-sight direction of user X. It may estimate user X's line-of-sight direction based on the visual field image of user X acquired from the visual field image acquisition unit 14A and the peripheral image of user X acquired from the peripheral image acquisition unit 11, or it may acquire the line-of-sight direction from the visual field image acquisition unit 14A. Based on the calculated direction from user X or the vehicle 2A to the extraction target Te and on user X's line-of-sight direction, it estimates the direction of the extraction target Te relative to the line-of-sight direction. This relative direction may be one of just two types: left rear or right rear of the line-of-sight direction.
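The relative direction can be sketched as an angle difference followed by a two-way classification. The sketch assumes that both the direction to the target and the line-of-sight direction are given as bearings in degrees clockwise from north; the function names and the handling of a target exactly behind the user (classified here as right rear) are assumptions of this illustration.

```python
def relative_angle_deg(target_bearing_deg: float,
                       gaze_bearing_deg: float) -> float:
    """Angle of the target relative to the gaze, in [-180, 180).

    Negative values are to the left of the line of sight, positive to the right.
    """
    return (target_bearing_deg - gaze_bearing_deg + 180.0) % 360.0 - 180.0


def rear_side(target_bearing_deg: float, gaze_bearing_deg: float) -> str:
    """Classify the target as 'left rear' or 'right rear' of the line of sight."""
    rel = relative_angle_deg(target_bearing_deg, gaze_bearing_deg)
    return "left rear" if rel < 0 else "right rear"
```

With the user looking north (bearing 0), a target on bearing 200 lies at a relative angle of -160 degrees and is classified as left rear, while a target on bearing 160 is classified as right rear.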

The display image generation unit 18A acquires extraction target information and generates a display image P that includes it. "Extraction target information" means information on the position of the extraction target Te. It may indicate the position of the extraction target Te itself, the direction or distance at which the extraction target Te exists, or whether the extraction target Te exists within a predetermined area.

The display image generation unit 18A determines the display mode of the display image P for the extraction target Te based on the determination result of the object determination unit 15A. The "display mode" is the manner in which the image showing the extraction target information is displayed. It may be an image showing the position of the extraction target Te itself, an image showing the distance and direction of the extraction target Te as seen from the user, or an image showing whether the extraction target Te exists within a predetermined area.

When the object determination unit 15A determines that the extraction target Te is included in the visual field image, the display image generation unit 18A acquires the visual field image of the visual field Ex from the visual field image acquisition unit 14A, recognizes the extraction target Te in the visual field image, and generates a first display image P1 that shows the extraction target information in a display mode that highlights the extraction target Te itself and is displayed superimposed on the extraction target Te. The "display mode that highlights the extraction target itself" may, for example, enclose the extraction target Te in a rectangle, circle, or the like, or point directly to the extraction target Te with an arrow (see FIG. 4).

When the object determination unit 15A determines that the extraction target Te is not included in the visual field image, the display image generation unit 18A determines the display mode of the extraction target information based on whether the presence/absence determination unit 16A has determined that the extraction target Te exists within the target range. More specifically, when the presence/absence determination unit 16A determines that the extraction target Te exists within the target range, the display image generation unit 18A generates a second display image P2 that shows the extraction target information in a display mode that displays the positional relationship, including the direction and distance of the extraction target Te relative to the reference position (see FIG. 5). The "display mode that displays the positional relationship" is a display mode of an image showing the direction and distance of the position of the extraction target Te relative to the reference position. The display image generation unit 18A acquires, via the positional relationship acquisition unit 17A, positional relationship information including the direction and distance of the extraction target Te relative to the reference position, and generates the second display image P2 displaying that positional relationship. For example, when the extraction target Te is located to the rear left of the visual field Ex of user X, a symbol image indicating the rear left of the visual field Ex and an image indicating the distance are generated and displayed on the left of the visual field image, as shown in FIG. 5.

When the presence/absence determination unit 16A determines that the extraction target Te does not exist within the target range, the display image generation unit 18A generates a third display image P3 showing information that the extraction target Te does not exist within the preset target range (see FIG. 6).

The display image generation unit 18A generates a display image P that includes the information identifying the speaker acquired by the speech data acquisition unit 12. For example, when the speaker identified by the speech data acquisition unit 12 is user Y, the display image generation unit 18A may generate first to third display images P1 to P3 containing the text "Mentioned by Y." (see FIGS. 4 to 6).

Based on the object determination unit 15A's determination of whether the extraction target Te is included in the visual field image, the display image generation unit 18A generates a display image P that includes information indicating whether the extraction target Te is visible to user X (see FIGS. 4 to 6). More specifically, when the object determination unit 15A determines that the extraction target Te is included in the visual field image, the display image generation unit 18A generates a first display image P1 including information indicating that the extraction target Te is visible to user X; when the object determination unit 15A determines that the extraction target Te is not included in the visual field image, it generates a second display image P2 or a third display image P3 including information indicating that the extraction target Te is not visible to user X. For example, when the extraction target Te is determined to be included in the visual field image, the display image generation unit 18A may generate a first display image P1 containing the text "Bicycle is visible now." (see FIG. 4). Conversely, when the extraction target Te is determined not to be included in the visual field image, it may generate a second display image P2 or a third display image P3 containing the text "Bicycle is invisible now." (see FIGS. 5 and 6).
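The choice among the first to third display images and the accompanying text can be summarized in a short sketch. The enum `DisplayMode` and the functions `select_display_mode` and `caption_lines` are hypothetical names introduced for illustration; only the two caption strings are taken from the examples in FIGS. 4 to 6.

```python
from enum import Enum


class DisplayMode(Enum):
    P1_HIGHLIGHT = 1           # target in the visual field image: highlight it
    P2_DIRECTION_DISTANCE = 2  # not in view, but within the target range
    P3_NOT_IN_RANGE = 3        # not in view and not within the target range


def select_display_mode(in_view: bool, in_range: bool) -> DisplayMode:
    """Pick the display image from the two determination results."""
    if in_view:
        return DisplayMode.P1_HIGHLIGHT
    if in_range:
        return DisplayMode.P2_DIRECTION_DISTANCE
    return DisplayMode.P3_NOT_IN_RANGE


def caption_lines(target: str, speaker: str, mode: DisplayMode) -> list[str]:
    """Build the visibility and speaker captions shown in the display image."""
    visibility = "visible" if mode is DisplayMode.P1_HIGHLIGHT else "invisible"
    label = target[:1].upper() + target[1:]  # capitalize the first letter only
    return [f"{label} is {visibility} now.", f"Mentioned by {speaker}."]
```

Note that `in_range` is only consulted when the target is not in view, which mirrors the order in which the object determination unit 15A and the presence/absence determination unit 16A are applied.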

Next, the image generation process executed by the display image generation device 1A will be described. FIG. 7 is a flowchart showing the display image generation process. The process shown in the flowchart of FIG. 7 is started, for example, when the vehicle 2A is started.

As shown in FIG. 7, in step S101 the display image generation device 1A acquires the peripheral image of user X by means of the peripheral image acquisition unit 11. The peripheral image acquisition unit 11 acquires the peripheral image captured by the peripheral imaging device 22 of the vehicle 2A. The display image generation device 1A then proceeds to step S102.

In step S102, the display image generation device 1A uses the speech data acquisition unit 12 to acquire the speech data of a statement made by user Y (the speaker) to user X. The speech data acquisition unit 12 acquires this speech data from the speech data acquisition device 41 of the speaker terminal 4. As described above, the speech data also includes data in which user Y says nothing. The speech data acquisition unit 12 further acquires information identifying passenger Y and transmits it to the display image generation device 1A. The process then proceeds to step S103.

In step S103, the display image generation device 1A uses the speech data acquisition unit 12 to determine whether the speech data contains a statement by user Y (the speaker). If it is determined that a statement by user Y is contained, the process proceeds to step S104. If it is determined that no statement by user Y is contained, the process ends.

In step S104, the display image generation device 1A uses the object extraction unit 13 to determine whether an extraction target Te matching an object T can be extracted from the speech data. If it is determined that the extraction target Te can be extracted, the process proceeds to step S105. If it is determined that the extraction target Te cannot be extracted, the process ends.

In step S105, the display image generation device 1A acquires the visual field image of user X by means of the visual field image acquisition unit 14A. The visual field image acquisition unit 14A acquires the visual field image of user X from the visual field image acquisition device 32 of the user terminal 3A worn by user X. The process then proceeds to step S106.

In step S106, the display image generation device 1A uses the object determination unit 15A to determine whether the extraction target Te extracted by the object extraction unit 13 is included in the visual field image of user X acquired from the visual field image acquisition unit 14A. If it is determined that the extraction target Te is included in the visual field image of user X, the process proceeds to step S107. If it is determined that the extraction target Te is not included in the visual field image of user X, the process proceeds to step S108.

When it is determined that the extraction target Te is included in the visual field image of user X, in step S107 the display image generation device 1A uses the display image generation unit 18A to generate the first display image P1 highlighting the extraction target Te itself. The display image generation unit 18A acquires the visual field image of the visual field Ex from the visual field image acquisition unit 14A, recognizes the extraction target Te in the visual field image, and generates the first display image P1, which shows the extraction target information in a first display mode that highlights the extraction target Te itself and is displayed superimposed on the visual field image (see FIG. 4). The display image generation unit 18A may generate a first display image P1 that further includes information indicating that the extraction target Te is visible to user X ("Bicycle is visible now." in FIG. 4) and the information identifying the speaker acquired by the speech data acquisition unit 12 ("Mentioned by Y." in FIG. 4). The display image generation unit 18A transmits the generated first display image P1 to the display image display device 31A of the user terminal 3A.

When it is determined that the extraction target Te is not included in the visual field image of user X, in step S108 the display image generation device 1A uses the presence/absence determination unit 16A to determine, based on the current or past peripheral images acquired by the peripheral image acquisition unit 11, whether the extraction target Te exists within the target range. If the presence/absence determination unit 16A determines that the extraction target Te does not exist within the target range, the process proceeds to step S111. If it determines that the extraction target Te exists within the target range, the process proceeds to step S109.

抽出対象物Ｔｅが存在する位置が対象範囲内であると判定された場合には、ステップＳ１０９において、表示画像生成装置１Ａは、位置関係取得部１７Ａにより、抽出対象物ＴｅとユーザＸとの位置関係を取得する。位置関係取得部１７Ａは、周辺画像取得部１１から取得された現在または過去のユーザＸの周辺画像に基づいて、抽出対象物ＴｅからユーザＸ又は車両２Ａまでの距離とユーザＸの視野Ｅｘに対する方向を推定する。また、位置関係取得部１７Ａは、存否判定部１６Ａより抽出対象物ＴｅからユーザＸ又は車両２Ａまでの距離を取得してもよい。その後、ステップＳ１１０に進む。 When it is determined that the position where the extraction target Te exists is within the target range, in step S109 the display image generation device 1A uses the positional relationship acquisition unit 17A to acquire the positional relationship between the extraction target Te and the user X. Based on the current or past peripheral images of the user X acquired from the peripheral image acquisition unit 11, the positional relationship acquisition unit 17A estimates the distance from the extraction target Te to the user X or the vehicle 2A and the direction relative to the visual field Ex of the user X. The positional relationship acquisition unit 17A may instead acquire the distance from the extraction target Te to the user X or the vehicle 2A from the presence / absence determination unit 16A. The process then proceeds to step S110.

ステップＳ１１０において、表示画像生成装置１Ａは、表示画像生成部１８Ａにより、位置関係取得部１７Ａから取得された基準位置を基準として抽出対象物Ｔｅの位置の方向及び距離を含む位置関係を表示する第２表示画像Ｐ２を生成する。表示画像生成部１８Ａは、位置関係取得部１７Ａから取得されたユーザＸの視野Ｅｘに対する方向を示す記号画像（図５の矢印）と距離（図５の「２０ｍ」）を表示する第２表示態様で抽出対象物情報を示した第２表示画像Ｐ２を生成する。なお、表示画像生成部１８Ａは、視野画像から抽出対象物ＴｅがユーザＸにより視認不可能であることを示す情報（図５の「Bicycle is invisible now.」）及び発言データ取得部１２により取得された発言主体を特定する情報（図５の「Mentioned by Y.」）を含む第２表示画像Ｐ２を生成してもよい。表示画像生成部１８Ａは、生成した第２表示画像Ｐ２をユーザ用端末３Ａの表示画像表示装置３１Ａに送信する。 In step S110, the display image generation device 1A uses the display image generation unit 18A to generate a second display image P2 that displays the positional relationship, including the direction and distance of the position of the extraction target Te relative to the reference position, acquired from the positional relationship acquisition unit 17A. The display image generation unit 18A generates the second display image P2, which shows the extraction target information in a second display mode that displays a symbol image indicating the direction relative to the visual field Ex of the user X (the arrow in FIG. 5) and the distance (“20 m” in FIG. 5), both acquired from the positional relationship acquisition unit 17A. The display image generation unit 18A may generate a second display image P2 that includes information indicating that the extraction target Te is not visible to the user X in the visual field image (“Bicycle is invisible now.” in FIG. 5) and information identifying the speaking subject acquired by the speech data acquisition unit 12 (“Mentioned by Y.” in FIG. 5). The display image generation unit 18A transmits the generated second display image P2 to the display image display device 31A of the user terminal 3A.

抽出対象物Ｔｅが存在する位置が対象範囲内ではないと判定された場合には、ステップＳ１１１において、表示画像生成装置１Ａは、位置関係取得部１７Ａにより、抽出対象物ＴｅとユーザＸとの位置関係を取得する。具体的には、位置関係取得部１７Ａは、存否判定部１６Ａから抽出対象物Ｔｅが予め設定された対象範囲内に存在しない位置関係情報を取得する。その後、ステップＳ１１２に進む。 When it is determined that the position where the extraction target Te exists is not within the target range, in step S111, the display image generation device 1A uses the positional relationship acquisition unit 17A to determine the position between the extraction target Te and the user X. Get a relationship. Specifically, the positional relationship acquisition unit 17A acquires the positional relationship information in which the extraction target Te does not exist within the preset target range from the existence / non-existence determination unit 16A. Then, the process proceeds to step S112.

ステップＳ１１２において、表示画像生成装置１Ａは、位置関係取得部１７Ａから取得された抽出対象物Ｔｅが予め設定された対象範囲内に存在しないという抽出対象物ＴｅとユーザＸとの位置関係を表示する第３表示画像Ｐ３を生成する。表示画像生成部１８Ａは、視野画像から抽出対象物ＴｅがユーザＸにより視認不可能であることを示す情報（図６の「Bicycle is invisible now.」）及び発言データ取得部１２により取得された発言主体を特定する情報（図６の「Mentioned by Y.」）を含む第３表示画像Ｐ３を生成する。なお、抽出対象物Ｔｅの位置の方向及び距離を含む位置関係（第２表示態様に係る位置関係）は表示されない。表示画像生成部１８Ａは、生成した第３表示画像Ｐ３をユーザ用端末３Ａの表示画像表示装置３１Ａに送信する。 In step S112, the display image generation device 1A displays the positional relationship between the extraction target Te and the user X that the extraction target Te acquired from the positional relationship acquisition unit 17A does not exist within the preset target range. The third display image P3 is generated. The display image generation unit 18A indicates that the extraction target Te is invisible to the user X from the visual field image (“Bicycle is invisible now.” In FIG. 6) and the remarks acquired by the remark data acquisition unit 12. A third display image P3 including information for identifying the subject (“Mentioned by Y.” in FIG. 6) is generated. The positional relationship including the direction and distance of the position of the extraction target Te (the positional relationship according to the second display mode) is not displayed. The display image generation unit 18A transmits the generated third display image P3 to the display image display device 31A of the user terminal 3A.

表示画像生成装置１Ａは、表示画像生成部１８Ａの上述した処理が終了すると、今回の処理を終了して、再びステップＳ１０１から表示画像生成処理を繰り返す。 When the above-described processing of the display image generation unit 18A is completed, the display image generation device 1A ends the current processing and repeats the display image generation processing from step S101 again.
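
The branching in steps S106 to S112 can be illustrated with a minimal Python sketch. It is not the implementation of the disclosure: the names `DisplayMode`, `select_display_mode`, and `build_labels` are hypothetical, and the two Boolean inputs stand in for the results of the object determination unit 15A and the presence / absence determination unit 16A. The label strings follow the examples shown in FIGS. 4 to 6.

```python
from enum import Enum
from typing import List, Optional


class DisplayMode(Enum):
    """Hypothetical labels for the three display images P1 to P3."""
    FIRST = "P1"    # target is in the visual field image: emphasize it (S107)
    SECOND = "P2"   # not in view but within the target range: direction and distance (S110)
    THIRD = "P3"    # not in view and outside the target range: notice only (S112)


def select_display_mode(in_visual_field: bool, in_target_range: bool) -> DisplayMode:
    """Mirror the S106 / S108 branches of the flow described above."""
    if in_visual_field:
        return DisplayMode.FIRST
    if in_target_range:
        return DisplayMode.SECOND
    return DisplayMode.THIRD


def build_labels(mode: DisplayMode, target_name: str, speaker: str,
                 distance_m: Optional[float] = None) -> List[str]:
    """Assemble the text items placed on the display image (assumed layout)."""
    visibility = "visible" if mode is DisplayMode.FIRST else "invisible"
    labels = [f"{target_name} is {visibility} now.", f"Mentioned by {speaker}."]
    # Only the second display mode carries a distance; the third omits it.
    if mode is DisplayMode.SECOND and distance_m is not None:
        labels.append(f"{distance_m:.0f} m")
    return labels
```

For example, `select_display_mode(False, True)` selects the second display mode, and `build_labels` then adds the distance item that the third display image P3 omits.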

上記のとおり、本実施形態では、発言主体により発せられた発言に含まれる対象物Ｔを抽出対象物Ｔｅとして特定し、当該抽出対象物Ｔｅに関する表示画像Ｐを生成する表示画像生成装置１Ａを開示する。表示画像生成装置１Ａは、発言データ取得部１２と、対象物抽出部１３と、視野画像取得部１４Ａと、対象物判定部１５Ａと、表示画像生成部１８Ａと、を備える。発言データ取得部１２は、発言主体であるユーザＹによりユーザＸに対して発せられた発言の発言データを取得する。対象物抽出部１３は、予め複数の対象物データ（文字列）を記憶し、複数の対象物データと発言データ取得部１２により取得された発言データ（文字列）とを対比して、発言データのうち対象物データと一致するデータを抽出対象物Ｔｅとして抽出する。視野画像取得部１４Ａは、ユーザＸの視野画像を少なくとも含む画像を取得する。対象物判定部１５Ａは、対象物抽出部１３により抽出された抽出対象物Ｔｅが視野画像に含まれるか否かを判定する。表示画像生成部１８Ａは、抽出対象物Ｔｅの位置に関する情報である抽出対象物情報を取得し、視野画像とは異なる当該抽出対象物情報を含む表示画像Ｐを生成する。更に、表示画像生成部１８Ａは、対象物判定部１５Ａによる抽出対象物Ｔｅが視野画像に含まれるか否かの判定結果に基づいて、抽出対象物Ｔｅに関する表示画像Ｐの表示態様を決定する。 As described above, the present embodiment discloses the display image generation device 1A, which identifies the object T included in a remark made by the speaking subject as the extraction target Te and generates the display image P relating to the extraction target Te. The display image generation device 1A includes a speech data acquisition unit 12, an object extraction unit 13, a visual field image acquisition unit 14A, an object determination unit 15A, and a display image generation unit 18A. The speech data acquisition unit 12 acquires the speech data of the remark made to the user X by the user Y, who is the speaking subject. The object extraction unit 13 stores a plurality of object data (character strings) in advance, compares the plurality of object data with the speech data (character string) acquired by the speech data acquisition unit 12, and extracts the data in the speech data that matches the object data as the extraction target Te. The visual field image acquisition unit 14A acquires an image including at least the visual field image of the user X. The object determination unit 15A determines whether or not the extraction target Te extracted by the object extraction unit 13 is included in the visual field image. The display image generation unit 18A acquires the extraction target information, which is information regarding the position of the extraction target Te, and generates the display image P including the extraction target information different from the visual field image. Further, the display image generation unit 18A determines the display mode of the display image P relating to the extraction target Te based on the result of the determination by the object determination unit 15A as to whether or not the extraction target Te is included in the visual field image.
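
The comparison performed by the object extraction unit 13 can be sketched as simple string matching. The following Python fragment is an illustration under stated assumptions, not the matching method of the disclosure: the function name `extract_targets` is hypothetical, the speech data is assumed to be already converted to a character string, and case-insensitive substring matching is assumed as the matching rule.

```python
from typing import List


def extract_targets(speech_text: str, object_data: List[str]) -> List[str]:
    """Return every stored object string that occurs in the speech text.

    Assumption: a stored object string "matches" when it appears as a
    case-insensitive substring of the speech text. The text does not
    specify the matching rule, so this is only one possible choice.
    """
    lowered = speech_text.casefold()
    return [name for name in object_data if name.casefold() in lowered]
```

With the stored strings `["bicycle", "convenience store"]` and the remark "There is a bicycle over there.", the sketch returns `["bicycle"]`, which would then be treated as the extraction target Te. A remark that contains none of the stored strings yields an empty list.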

この結果、表示画像生成装置１Ａは、発言データ取得部１２と対象物抽出部１３によりユーザＸ以外の主体（ユーザＹ）により認識されている抽出対象物Ｔｅを特定することができる。表示画像生成装置１Ａは、視野画像取得部１４Ａと対象物判定部１５Ａにより、抽出対象物Ｔｅが視野画像に含まれるか否かの判定結果を得ることができる。そして、表示画像生成部１８Ａは、対象物判定部１５Ａの判定結果に基づいて、抽出対象物Ｔｅに関する表示画像Ｐの表示態様を決定する。これにより、表示画像生成装置１Ａは、ユーザＸ以外の主体によって認識されている抽出対象物ＴｅがユーザＸの視野Ｅｘ内に含まれているか否かにかかわらず、当該抽出対象物Ｔｅの位置に関する情報を適切に生成することができる（図４〜図６）。 As a result, the display image generation device 1A can identify the extraction target Te recognized by the subject (user Y) other than the user X by the speech data acquisition unit 12 and the object extraction unit 13. The display image generation device 1A can obtain a determination result of whether or not the extraction target Te is included in the visual field image by the visual field image acquisition unit 14A and the object determination unit 15A. Then, the display image generation unit 18A determines the display mode of the display image P regarding the extraction target Te based on the determination result of the object determination unit 15A. As a result, the display image generation device 1A relates to the position of the extraction target Te recognized by a subject other than the user X regardless of whether or not the extraction target Te is included in the field of view Ex of the user X. Information can be appropriately generated (FIGS. 4 to 6).

また、上記した実施形態においては、表示画像生成部１８Ａは、抽出対象物Ｔｅが視野画像に含まれると対象物判定部１５Ａにより判定された場合に、抽出対象物Ｔｅそのものを強調する表示態様で抽出対象物情報を示した第１表示画像Ｐ１を生成する。この結果、表示画像生成装置１Ａは、対象物判定部１５Ａにより抽出対象物ＴｅがユーザＸの視野画像に含まれると判定された場合には、ユーザＸが抽出対象物Ｔｅを特定することができる（図７のＳ１０７）。 Further, in the above-described embodiment, the display image generation unit 18A emphasizes the extraction target Te itself when the target determination unit 15A determines that the extraction target Te is included in the visual field image. The first display image P1 showing the extraction target information is generated. As a result, when the display image generation device 1A determines that the extraction target Te is included in the visual field image of the user X by the object determination unit 15A, the user X can specify the extraction target Te. (S107 in FIG. 7).

また、上記した実施形態においては、表示画像生成装置１Ａは、抽出対象物ＴｅとユーザＸとの相対的な位置関係を取得する位置関係取得部１７Ａを備える。表示画像生成部１８Ａは、抽出対象物Ｔｅが視野画像に含まれないと対象物判定部１５Ａにより判定された場合に、位置関係を表示する表示態様で抽出対象物情報を示した第２表示画像Ｐ２を生成する。この結果、表示画像生成装置１Ａは、対象物判定部１５Ａにより抽出対象物Ｔｅが視野画像に含まれないと判定された場合に、位置関係取得部１７Ａにより抽出対象物ＴｅとユーザＸとの相対的な位置関係を取得する。表示画像生成装置１Ａは、取得された位置関係を表示する表示態様で抽出対象物情報を示した第２表示画像Ｐ２を生成する。これにより、表示画像生成装置１Ａは、対象物ＴがユーザＸの視野Ｅｘ内に含まれていないときでも、抽出対象物Ｔｅの位置に関する情報を適切に生成することができる。 Further, in the above-described embodiment, the display image generation device 1A includes a positional relationship acquisition unit 17A that acquires a relative positional relationship between the extraction target Te and the user X. When the object determination unit 15A determines that the extraction object Te is not included in the visual field image, the display image generation unit 18A shows the extraction object information in a display mode for displaying the positional relationship. Generate P2. As a result, when the display image generation device 1A determines that the extraction target Te is not included in the visual field image by the object determination unit 15A, the positional relationship acquisition unit 17A determines that the extraction target Te and the user X are relative to each other. Get the positional relationship. The display image generation device 1A generates a second display image P2 showing the extraction target information in a display mode that displays the acquired positional relationship. As a result, the display image generation device 1A can appropriately generate information regarding the position of the extraction target Te even when the object T is not included in the field of view Ex of the user X.

また、上記した実施形態においては、表示画像生成部１８Ａは、対象物判定部１５Ａによる抽出対象物Ｔｅが視野画像に含まれるか否かの判定結果に基づいて、抽出対象物ＴｅがユーザＸにより視認可能であるか否かを示す情報を含む表示画像Ｐ（第１表示画像Ｐ１〜第３表示画像Ｐ３）を生成する。この結果、表示画像生成装置１Ａは、対象物判定部１５Ａの判定結果に基づいて、視野画像から抽出対象物ＴｅがユーザＸにより視認可能であるか否かを示す情報を含む表示画像Ｐを生成する。これにより、表示画像生成装置１Ａは、ユーザＸは抽出対象物Ｔｅが視認可能か否か情報を簡単に把握することができる。 Further, in the above-described embodiment, the display image generation unit 18A generates, based on the result of the determination by the object determination unit 15A as to whether or not the extraction target Te is included in the visual field image, a display image P (first display image P1 to third display image P3) including information indicating whether or not the extraction target Te is visible to the user X. As a result, the display image generation device 1A generates, based on the determination result of the object determination unit 15A, a display image P including information indicating whether or not the extraction target Te is visible to the user X in the visual field image. The display image generation device 1A thereby allows the user X to easily grasp whether or not the extraction target Te is visible.

また、上記した実施形態においては、表示画像生成装置１Ａは、抽出対象物Ｔｅが視野画像に含まれないと対象物判定部１５Ａにより判定された場合に、抽出対象物Ｔｅが予め設定された対象範囲内に存在するか否かを判定する存否判定部１６Ａを備える。表示画像生成部１８Ａは、抽出対象物Ｔｅが対象範囲内に存在するか否かの判定結果に基づいて、抽出対象物情報の表示態様を決定する。この結果、表示画像生成装置１Ａは、存否判定部１６Ａの判定結果に基づいて、抽出対象物情報の表示態様を決定することにより、抽出対象物ＴｅがユーザＸにより視認可能であるか否かを示す情報を含む表示画像Ｐ（第２表示画像Ｐ２，第３表示画像Ｐ３）を生成する。これより、表示画像生成装置１Ａは、抽出対象物Ｔｅが対象範囲に存在するか否かにかかわらず、当該抽出対象物Ｔｅの位置に関する情報を適切に生成することができる。 Further, in the above-described embodiment, when the display image generation device 1A determines by the object determination unit 15A that the extraction target Te is not included in the visual field image, the extraction target Te is a preset target. The presence / absence determination unit 16A for determining whether or not it exists within the range is provided. The display image generation unit 18A determines the display mode of the extraction target information based on the determination result of whether or not the extraction target Te exists within the target range. As a result, the display image generation device 1A determines whether or not the extraction target Te is visible to the user X by determining the display mode of the extraction target information based on the determination result of the presence / absence determination unit 16A. A display image P (second display image P2, third display image P3) including the information to be shown is generated. As a result, the display image generation device 1A can appropriately generate information regarding the position of the extraction target Te regardless of whether or not the extraction target Te exists in the target range.

また、上記した実施形態においては、表示画像生成装置１Ａは、周辺画像を取得して、取得した周辺画像を記憶する周辺画像取得部１１を備える。存否判定部１６Ａは、周辺画像取得部１１により取得された現在または過去の周辺画像に基づいて、抽出対象物Ｔｅが対象範囲内に存在するか否かを判定する。この結果、存否判定部１６Ａは、取得された現在または過去の周辺画像に基づいて、抽出対象物Ｔｅが対象範囲内に存在するか否かをより詳細に判定することができる。 Further, in the above-described embodiment, the display image generation device 1A includes a peripheral image acquisition unit 11 that acquires a peripheral image and stores the acquired peripheral image. The existence / non-existence determination unit 16A determines whether or not the extraction target Te exists within the target range based on the current or past peripheral image acquired by the peripheral image acquisition unit 11. As a result, the presence / absence determination unit 16A can determine in more detail whether or not the extraction target Te exists within the target range based on the acquired current or past peripheral image.

また、上記した実施形態においては、表示画像生成装置１Ａは、抽出対象物ＴｅとユーザＸとの相対的な位置関係を取得する位置関係取得部１７Ａを備える。表示画像生成部１８Ａは、抽出対象物Ｔｅが対象範囲内に存在すると存否判定部１６Ａにより判定された場合に、基準位置を基準として抽出対象物Ｔｅの位置の方向及び距離を含む位置関係を表示する表示態様で抽出対象物情報を示した表示画像Ｐ（第２表示画像Ｐ２，第３表示画像Ｐ３）を生成する。この結果、表示画像生成装置１Ａは、存否判定部１６Ａにより抽出対象物Ｔｅが対象範囲内に存在すると判定された場合に、位置関係取得部１７Ａにより基準位置を基準として抽出対象物Ｔｅの位置の方向及び距離を取得する。次に、表示画像生成装置１Ａは、基準位置を基準として抽出対象物Ｔｅの位置の方向及び距離を含む位置関係情報を生成することができる。これにより、表示画像生成装置１Ａは、存否判定部１６Ａにより抽出対象物Ｔｅが対象範囲内に存在すると判定された場合には、ユーザＸは抽出対象物Ｔｅの位置関係を把握することができる。 Further, in the above-described embodiment, the display image generation device 1A includes a positional relationship acquisition unit 17A that acquires a relative positional relationship between the extraction target Te and the user X. When the presence / absence determination unit 16A determines that the extraction target Te exists within the target range, the display image generation unit 18A displays the positional relationship including the direction and distance of the position of the extraction target Te with reference to the reference position. A display image P (second display image P2, third display image P3) showing information on the object to be extracted is generated in the display mode. As a result, when the presence / absence determination unit 16A determines that the extraction target Te exists within the target range, the display image generation device 1A determines that the extraction target Te position is based on the reference position by the positional relationship acquisition unit 17A. Get direction and distance. Next, the display image generation device 1A can generate positional relationship information including the direction and distance of the position of the extraction target Te with reference to the reference position. As a result, when the presence / absence determination unit 16A determines that the extraction target Te exists within the target range, the display image generation device 1A can grasp the positional relationship of the extraction target Te.

また、上記した実施形態においては、発言主体は人（ユーザＹ）であり、発言データは、発言の発言信号データである。この結果、表示画像生成装置１Ａは、人である発信主体から発言の発言信号データを取得することができる。これにより、表示画像生成装置１Ａは、発言主体が人であっても、ユーザＸ以外の主体によって認識されている抽出対象物ＴｅがユーザＸの視野内に含まれているか否かにかかわらず、当該抽出対象物Ｔｅの位置に関する情報を適切に生成することができる。 Further, in the above-described embodiment, the speaking subject is a person (user Y), and the speaking data is the speaking signal data of the speaking. As a result, the display image generation device 1A can acquire the speech signal data of the speech from the transmitting subject who is a person. As a result, in the display image generation device 1A, even if the speaking subject is a person, regardless of whether or not the extraction target Te recognized by the subject other than the user X is included in the field of view of the user X. Information regarding the position of the extraction target Te can be appropriately generated.

また、上記した実施形態においては、対象物判定部１５Ａは、抽出対象物ＴｅがユーザＸの視野画像に含まれるか否かの判定結果の情報を発言主体のユーザＹに出力する。この結果、表示画像生成装置１Ａは、対象物判定部１５Ａにより抽出対象物ＴｅがユーザＸの視野画像に含まれるか否かの判定結果を発言主体のユーザＹに出力することにより、発言主体は、ユーザＸが対象物を視認できるか否かの情報を取得することができ、ユーザＸが対象物を視認できるか否かに応じて話題の進み方を決めることができる。 Further, in the above-described embodiment, the object determination unit 15A outputs, to the user Y who is the speaking subject, information on the result of determining whether or not the extraction target Te is included in the visual field image of the user X. As a result, because the display image generation device 1A uses the object determination unit 15A to output this determination result to the user Y, the speaking subject can learn whether or not the user X can see the object and can decide how to continue the conversation accordingly.

また、上記した実施形態においては、発言データ取得部１２は、ユーザＸに対して発言を発したユーザＹを特定する情報を取得する。表示画像生成部１８Ａは、発言データ取得部１２により取得されたユーザＹを特定する情報を含む表示画像Ｐ（第１表示画像Ｐ１〜第３表示画像Ｐ３）を生成する。この結果、表示画像生成装置１Ａは、発言データ取得部１２によりユーザＹを特定する情報を取得し、表示画像生成部１８ＡによりユーザＹを特定する情報を含む表示画像Ｐを生成することができる。これにより、ユーザＸがユーザＹを把握することができる。 Further, in the above-described embodiment, the speech data acquisition unit 12 acquires information that identifies the user Y who has made a speech to the user X. The display image generation unit 18A generates a display image P (first display image P1 to third display image P3) including information for identifying the user Y acquired by the speech data acquisition unit 12. As a result, the display image generation device 1A can acquire the information that identifies the user Y by the speech data acquisition unit 12, and can generate the display image P that includes the information that identifies the user Y by the display image generation unit 18A. As a result, the user X can grasp the user Y.

また、上記した実施形態においては、表示画像生成装置１Ａは、発言主体により発せられた発言に含まれる抽出対象物Ｔｅを特定し、当該抽出対象物Ｔｅに関する表示画像Ｐを生成する表示画像生成方法を開示する。表示画像生成装置１Ａは、発言データ取得ステップと、対象物抽出ステップと、視野画像取得ステップと、対象物判定ステップと、表示画像生成ステップと、を実行する。発言データ取得ステップは、発言主体であるユーザＹによりユーザＸに対して発せられた発言の発言データを取得する（図７のＳ１０３）。対象物抽出ステップは、予め記憶された複数の対象物データ（文字列）と取得された発言データ（文字列）とを対比して、発言データのうち対象物データと一致するデータを抽出対象物Ｔｅとして抽出する（図７のＳ１０４）。視野画像取得ステップは、ユーザＸの視野画像を取得する（図７のＳ１０５）。対象物判定ステップは、抽出された抽出対象物Ｔｅが視野画像に含まれるか否かを判定する（図７のＳ１０６）。表示画像生成ステップは、抽出対象物Ｔｅの位置に関する情報である抽出対象物情報を取得し、視野画像とは異なる当該抽出対象物情報を含む表示画像Ｐを生成する（図７のＳ１０７，Ｓ１１０，Ｓ１１２）。更に、表示画像生成ステップにおいては、対象物判定ステップにおける抽出対象物Ｔｅが視野画像に含まれるか否かの判定結果に基づいて、抽出対象物Ｔｅに関する表示画像Ｐの表示態様を決定する（図７のＳ１０７，Ｓ１１０，Ｓ１１２）。 Further, the above-described embodiment discloses a display image generation method in which the display image generation device 1A identifies the extraction target Te included in a remark made by the speaking subject and generates the display image P relating to the extraction target Te. The display image generation device 1A executes a speech data acquisition step, an object extraction step, a visual field image acquisition step, an object determination step, and a display image generation step. The speech data acquisition step acquires the speech data of the remark made to the user X by the user Y, who is the speaking subject (S103 in FIG. 7). The object extraction step compares a plurality of object data (character strings) stored in advance with the acquired speech data (character string), and extracts the data in the speech data that matches the object data as the extraction target Te (S104 in FIG. 7). The visual field image acquisition step acquires the visual field image of the user X (S105 in FIG. 7). The object determination step determines whether or not the extracted extraction target Te is included in the visual field image (S106 in FIG. 7). The display image generation step acquires the extraction target information, which is information regarding the position of the extraction target Te, and generates the display image P including the extraction target information different from the visual field image (S107, S110, and S112 in FIG. 7). Further, in the display image generation step, the display mode of the display image P relating to the extraction target Te is determined based on the result of the determination in the object determination step as to whether or not the extraction target Te is included in the visual field image (S107, S110, and S112 in FIG. 7).

この結果、表示画像生成装置１Ａは、発言データ取得ステップと対象物抽出ステップにより、ユーザＸ以外の主体（ユーザＹ）により認識されている抽出対象物Ｔｅを特定することができる。表示画像生成装置１Ａは、視野画像取得ステップと対象物判定ステップにより、抽出対象物Ｔｅが視野画像に含まれるか否かの判定結果を得ることができる。そして、表示画像生成ステップにおいて、対象物判定ステップの判定結果に基づいて、抽出対象物Ｔｅに関する表示画像Ｐの表示態様を決定する。これにより、表示画像生成装置１Ａは、ユーザＸ以外の主体によって認識されている抽出対象物ＴｅがユーザＸの視野Ｅｘ内に含まれているか否かにかかわらず、当該抽出対象物Ｔｅの位置に関する情報を適切に生成することができる（図４〜図６）。 As a result, the display image generation device 1A can identify, through the speech data acquisition step and the object extraction step, the extraction target Te recognized by a subject other than the user X (the user Y). Through the visual field image acquisition step and the object determination step, the display image generation device 1A can obtain the result of determining whether or not the extraction target Te is included in the visual field image. Then, in the display image generation step, the display mode of the display image P relating to the extraction target Te is determined based on the determination result of the object determination step. The display image generation device 1A can therefore appropriately generate information regarding the position of the extraction target Te recognized by a subject other than the user X, regardless of whether or not the extraction target Te is included in the visual field Ex of the user X (FIGS. 4 to 6).

［第２実施形態］ [Second Embodiment]

図８は、第２実施形態に係る表示画像生成装置１Ｂを示すブロック図である。本実施形態では、ＰＯＩ（Point of Interest）情報を用いて表示画像生成処理を実行可能な表示画像生成装置１Ｂについて説明する。ここで、「ＰＯＩ」とは、ＰＯＩ情報記憶部１９に名称、位置情報（緯度経度）が登録されている地図上の店舗、施設、興味ある名所などの特定な場所を意味する。また、第１実施形態の一例とした、ユーザＹによりユーザＸに対して発せられた発言「向こうに自転車があるね。」を、第２実施形態では一例として「向こうにコンビニエンスストアがあるね。」とする。そして、対象物抽出部１３は、ユーザＹにより発せられた発言から「コンビニエンスストア」という抽出対象物Ｔｅを抽出するものとする。なお、第２実施形態において、第１実施形態と同様の説明は省略又は簡略化する。 FIG. 8 is a block diagram showing a display image generation device 1B according to the second embodiment. In the present embodiment, the display image generation device 1B capable of executing the display image generation process using POI (Point of Interest) information will be described. Here, the "POI" means a specific place such as a store, a facility, or a famous place of interest on a map in which a name and location information (latitude / longitude) are registered in the POI information storage unit 19. Further, as an example of the first embodiment, the remark "There is a bicycle over there" made by the user Y to the user X, and as an example in the second embodiment, "There is a convenience store over there." ". Then, the object extraction unit 13 extracts the extraction object Te called "convenience store" from the remarks made by the user Y. In the second embodiment, the same description as in the first embodiment will be omitted or simplified.

図８において、表示画像生成装置１Ｂは、第１実施形態に係る表示画像生成装置１Ａと比較して、周辺画像取得部１１を備えていない点、視野画像取得部１４Ａに代えて視野画像取得部１４Ｂを備えている点、対象物判定部１５Ａに代えて対象物判定部１５Ｂを備えている点、存否判定部１６Ａに代えて存否判定部１６Ｂを備えている点、位置関係取得部１７Ａに代えて位置関係取得部１７Ｂを備えている点、表示画像生成部１８Ａに代えて表示画像生成部１８Ｂを備えている点、及び、ＰＯＩ情報記憶部１９を更に備えている点で相違しており、その他の点で同一である。 In FIG. 8, the display image generation device 1B does not include the peripheral image acquisition unit 11 as compared with the display image generation device 1A according to the first embodiment, and the visual field image acquisition unit replaces the visual field image acquisition unit 14A. 14B is provided, an object determination unit 15B is provided instead of the object determination unit 15A, an existence / absence determination unit 16B is provided instead of the existence / absence determination unit 16A, and a positional relationship acquisition unit 17A is used instead. The difference is that the positional relationship acquisition unit 17B is provided, the display image generation unit 18B is provided instead of the display image generation unit 18A, and the POI information storage unit 19 is further provided. It is otherwise the same.

表示画像生成装置１Ｂ、車両２Ｂ、ユーザ用端末３Ｂ、及び発言主体用端末４は、相互に有線又は無線により通信（送受信）可能に接続されている。 The display image generator 1B, the vehicle 2B, the user terminal 3B, and the speaking subject terminal 4 are connected to each other so as to be able to communicate (transmit and receive) by wire or wirelessly.

車両２Ｂは、第１実施形態に係る車両２Ａと比較して、周辺撮像装置２２を備えていない点で相違しており、その他の点で同一である。 The vehicle 2B is different from the vehicle 2A according to the first embodiment in that it is not provided with the peripheral imaging device 22, and is the same in other respects.

ユーザ用端末３Ｂは、第１実施形態に係るユーザ用端末３Ａと比較して、表示画像表示装置３１Ａに代えて表示画像表示装置３１Ｂを備えている点で相違しており、その他の点で同一である。 The user terminal 3B is different from the user terminal 3A according to the first embodiment in that it includes a display image display device 31B instead of the display image display device 31A, and is the same in other respects. Is.

発言主体用端末４は、第１実施形態に係る発言主体用端末４と同一である。 The speaking subject terminal 4 is the same as the speaking subject terminal 4 according to the first embodiment.

ＰＯＩ情報記憶部１９は、地図情報に含まれる対象物Ｔの位置に関する情報を少なくとも含むＰＯＩ情報を記憶する。この「ＰＯＩ情報」は、少なくともＰＯＩであるランドマークの名称、ランドマークの用途分類、ランドマークの特徴情報、ランドマークの画像、ランドマークの位置情報を含まれている。なお、ランドマークとは、建物や公園や商業施設や小売業の店舗（コンビニエンスストア等）等である。ＰＯＩ情報記憶部１９は、ＰＯＩ情報を車両２Ｂの外部から通信により取得してもよく、ナビゲーション装置２１に記憶されたランドマーク情報を当該ナビゲーション装置２１から取得してもよい。ＰＯＩ情報記憶部１９は、取得した車両２Ｂの位置に応じて、車両２Ｂが位置する区域のＰＯＩ情報をリアルタイムに更新してもよい。また、ナビゲーション装置２１によって経路探索が行われた場合、ＰＯＩ情報記憶部１９は、ナビゲーション装置２１によりダウンロードされた経路上のＰＯＩ情報を取得してもよい。 The POI information storage unit 19 stores POI information including at least information regarding the position of the object T included in the map information. This "POI information" includes at least the name of the landmark which is the POI, the usage classification of the landmark, the feature information of the landmark, the image of the landmark, and the location information of the landmark. Landmarks are buildings, parks, commercial facilities, retail stores (convenience stores, etc.), and the like. The POI information storage unit 19 may acquire POI information from the outside of the vehicle 2B by communication, or may acquire landmark information stored in the navigation device 21 from the navigation device 21. The POI information storage unit 19 may update the POI information of the area where the vehicle 2B is located in real time according to the acquired position of the vehicle 2B. Further, when the route search is performed by the navigation device 21, the POI information storage unit 19 may acquire the POI information on the route downloaded by the navigation device 21.
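
The fields listed for the POI information can be pictured as a small record type. The Python sketch below is only an illustration: the class name `PoiRecord`, its field names, and the lookup helper `find_pois` are hypothetical, and the rule that a target matches either the landmark name or its usage classification is an assumption, since the text does not state how an extraction target is looked up in the POI information.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PoiRecord:
    """One landmark entry; field names are illustrative, not from the source."""
    name: str                  # landmark name
    usage_category: str        # usage classification, e.g. "convenience store"
    feature_info: str          # free-form feature information
    image_ref: Optional[str]   # reference to a landmark image, if any
    latitude: float            # position information (degrees)
    longitude: float


def find_pois(records: List[PoiRecord], target: str) -> List[PoiRecord]:
    """Return records whose name contains the target or whose category equals it."""
    key = target.casefold()
    return [
        r for r in records
        if key in r.name.casefold() or key == r.usage_category.casefold()
    ]
```

Under these assumptions, looking up "convenience store" returns every stored landmark classified as a convenience store, and an empty result corresponds to the case in which the extraction target is not contained in the POI information.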

視野画像取得部１４Ｂは、第１実施形態に係る視野画像取得部１４Ａと同一である。 The visual field image acquisition unit 14B is the same as the visual field image acquisition unit 14A according to the first embodiment.

対象物判定部１５Ｂは、第１実施形態に係る対象物判定部１５Ａとは以下の点で異なるが、その他は同一である。対象物判定部１５Ｂは、ＰＯＩ情報記憶部１９からＰＯＩ情報を取得し、対象物抽出部１３により取得された抽出対象物ＴｅがＰＯＩであるか否かを判定する。そして、抽出対象物Ｔｅの画像が視野画像取得部１４Ｂにより取得された視野画像に含まれるか否かを判定する。なお、抽出対象物ＴｅがＰＯＩではない場合、又は、抽出対象物Ｔｅが視野画像に含まれない場合には、対象物判定部１５Ｂは、抽出対象物Ｔｅの画像が視野画像に含まれないと判定する。 The object determination unit 15B differs from the object determination unit 15A according to the first embodiment in the following points and is otherwise the same. The object determination unit 15B acquires POI information from the POI information storage unit 19 and determines whether or not the extraction target Te acquired by the object extraction unit 13 is a POI. It then determines whether or not the image of the extraction target Te is included in the visual field image acquired by the visual field image acquisition unit 14B. If the extraction target Te is not a POI, or if the extraction target Te is not included in the visual field image, the object determination unit 15B determines that the image of the extraction target Te is not included in the visual field image.

存否判定部１６Ｂは、第１実施形態に係る存否判定部１６Ａとは以下の点（ＰＯＩ情報を用いる点）で異なるが、その他は同一である。存否判定部１６Ｂは、抽出対象物Ｔｅが視野画像に含まれないと対象物判定部１５Ｂにより判定された場合に、抽出対象物Ｔｅが予め設定された対象範囲内に存在するか否かを判定する。具体的には、存否判定部１６Ｂは、ＰＯＩ情報に基づいて、抽出対象物Ｔｅが対象範囲内に存在するか否かを判定する。 The presence / absence determination unit 16B differs from the presence / absence determination unit 16A according to the first embodiment in that it uses POI information, and is otherwise the same. When the object determination unit 15B determines that the extraction target Te is not included in the visual field image, the presence / absence determination unit 16B determines whether or not the extraction target Te exists within the preset target range. Specifically, the presence / absence determination unit 16B makes this determination based on the POI information.

まず、存否判定部１６Ｂは、抽出対象物ＴｅがＰＯＩ情報記憶部１９により取得されたＰＯＩ情報に含まれるか否かを判定する。より詳細には、存否判定部１６Ｂは、ＰＯＩ情報記憶部１９により取得されたＰＯＩ情報を取得し、取得されたＰＯＩ情報に対象物抽出部１３により取得された抽出対象物Ｔｅが含まれているか否かを判定する。 First, the presence / absence determination unit 16B determines whether or not the extraction target Te is included in the POI information acquired by the POI information storage unit 19. More specifically, the presence / absence determination unit 16B acquires the POI information acquired by the POI information storage unit 19, and whether the acquired POI information includes the extraction target Te acquired by the object extraction unit 13. Judge whether or not.

また、存否判定部１６Ｂは、取得されたＰＯＩ情報に抽出対象物Ｔｅが含まれていないと判定された場合には、抽出対象物Ｔｅが予め設定された対象範囲内に存在しないと判定する。 Further, when it is determined that the acquired POI information does not include the extraction target Te, the existence / non-existence determination unit 16B determines that the extraction target Te does not exist within the preset target range.

次に、存否判定部１６Ｂは、取得されたＰＯＩ情報に抽出対象物Ｔｅ（ここでは例えばコンビニエンスストア）が含まれていると判定された場合に、取得されたＰＯＩ情報に基づいて、抽出対象物Ｔｅが存在する位置が対象範囲内であるか否かを判定する。存否判定部１６Ｂは、抽出対象物ＴｅがＰＯＩ情報に含まれる場合に、ナビゲーション装置２１から取得された車両２Ｂの位置情報とＰＯＩ情報記憶部１９に記憶されたＰＯＩ情報に含まれる抽出対象物Ｔｅの位置情報を用いて、車両２Ｂから抽出対象物Ｔｅまでの距離を算出する。また、存否判定部１６Ｂは、算出した距離に基づいて抽出対象物Ｔｅが予め設定された対象範囲内であるか否かを判定する。 Next, the presence / absence determination unit 16B determines that the acquired POI information includes the extraction target Te (here, for example, a convenience store), and the extraction target is based on the acquired POI information. It is determined whether or not the position where Te exists is within the target range. When the existence / non-existence determination unit 16B includes the extraction target Te in the POI information, the extraction target Te included in the position information of the vehicle 2B acquired from the navigation device 21 and the POI information stored in the POI information storage unit 19. The distance from the vehicle 2B to the extraction target Te is calculated using the position information of. In addition, the presence / absence determination unit 16B determines whether or not the extraction target Te is within the preset target range based on the calculated distance.
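
The distance calculation and range check described above can be sketched as follows. The text does not state which distance formula is used, so the great-circle (haversine) formula on a spherical Earth is an assumption here, as are the function names `haversine_m` and `is_within_target_range` and the representation of positions as (latitude, longitude) pairs in degrees.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical model assumed


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_within_target_range(vehicle: Tuple[float, float],
                           poi: Tuple[float, float],
                           range_m: float) -> bool:
    """True when the POI lies no farther than range_m from the vehicle."""
    return haversine_m(vehicle[0], vehicle[1], poi[0], poi[1]) <= range_m
```

In this sketch the preset target range is modelled as a single radius `range_m` around the vehicle 2B. A target range of another shape, for example one limited to the road ahead, would need a different test.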

The positional relationship acquisition unit 17B acquires the relative positional relationship between the extraction target Te and the user X. The positional relationship acquisition unit 17B may acquire the position information of the extraction target Te from the POI information storage unit 19, acquire the position of the vehicle 2B from the navigation device 21, and calculate the direction and distance from the vehicle 2B to the extraction target Te based on these two pieces of position information.
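As a rough illustration of the direction calculation, the sketch below computes the compass bearing from the vehicle to the target and converts it to an angle relative to the vehicle. The source does not specify the formula; the initial great-circle bearing, the function names, and the `vehicle_heading_deg` input (a heading assumed to be available from the navigation device) are assumptions made for this example.

```python
import math


def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Compass bearing in degrees [0, 360) from point 1 to point 2 (0 = north, 90 = east)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlam)
    return math.degrees(math.atan2(y, x)) % 360.0


def relative_direction_deg(bearing_deg, vehicle_heading_deg):
    """Signed angle in [-180, 180) of the target relative to the vehicle heading.

    Positive values mean the target is to the right of the heading,
    negative values mean it is to the left. The heading input is hypothetical.
    """
    return (bearing_deg - vehicle_heading_deg + 180.0) % 360.0 - 180.0
```

A signed relative angle is convenient for choosing a direction symbol (for example a left or right arrow) in a display image such as the second display image P2.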

The display image generation unit 18B acquires extraction target information and generates a display image P including the extraction target information.

The display image generation unit 18B determines the display mode of the display image P for the extraction target Te based on the determination result of the object determination unit 15B. The meanings of terms such as "extraction target information" are the same as in the first embodiment. In the second embodiment, the "Bicycle" shown in FIGS. 4 to 6 is read as "Convenience store". When the object determination unit 15B determines that the extraction target Te is a POI and that the extraction target Te is included in the visual field image, the display image generation unit 18B generates the first display image P1. In this case, the display image generation unit 18B acquires the visual field image of the visual field Ex from the visual field image acquisition unit 14B, recognizes the extraction target Te in the visual field image, and generates the first display image P1, which shows the extraction target information in a display mode that is superimposed on the extraction target Te and emphasizes the extraction target Te itself (see FIG. 4). When the object determination unit 15B determines that the extraction target Te is not a POI, or that the extraction target Te is not included in the visual field image, the first display image P1 is not generated in the second embodiment.

Further, when the object determination unit 15B determines that the extraction target Te is not included in the visual field image, the display image generation unit 18B determines the display mode of the extraction target information based on whether the presence/absence determination unit 16B has determined that the POI information includes the extraction target Te and that the extraction target Te exists within the target range. More specifically, when the presence/absence determination unit 16B determines that the POI information includes the extraction target Te and that the extraction target Te exists within the target range, the display image generation unit 18B generates, based on the acquired position information of the extraction target Te and the position information of the vehicle 2B, the second display image P2, which shows the extraction target information in a display mode that displays the positional relationship, including the direction and distance, of the extraction target Te with respect to the reference position. The display image generation unit 18B acquires, from the positional relationship acquisition unit 17B, positional relationship information including the direction and distance of the extraction target Te with respect to the reference position, and generates the second display image P2 displaying that positional relationship (see FIG. 5).

Further, when the presence/absence determination unit 16B determines that the POI information does not include the extraction target Te, or that the extraction target Te does not exist within the target range, the display image generation unit 18B generates the third display image P3, which shows information indicating that the extraction target Te does not exist within the preset target range (see FIG. 6).
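The three-way choice of display image in the second embodiment can be summarized in a short sketch. The enum and function names are hypothetical, and the four boolean inputs stand in for the determination results of the object determination unit 15B and the presence/absence determination unit 16B; this is an illustration, not the claimed implementation.

```python
from enum import Enum


class DisplayMode(Enum):
    P1_HIGHLIGHT = 1            # emphasize Te itself in the visual field image (FIG. 4)
    P2_DIRECTION_DISTANCE = 2   # show direction and distance to Te (FIG. 5)
    P3_NOT_IN_RANGE = 3         # show that Te is not within the target range (FIG. 6)


def select_display_mode(is_poi, in_visual_field, in_poi_info, in_target_range):
    """Choose which display image to generate from the determination results.

    is_poi, in_visual_field: results attributed to the object determination unit.
    in_poi_info, in_target_range: results attributed to the presence/absence
    determination unit, consulted only when P1 is not generated.
    """
    if is_poi and in_visual_field:
        return DisplayMode.P1_HIGHLIGHT
    if in_poi_info and in_target_range:
        return DisplayMode.P2_DIRECTION_DISTANCE
    return DisplayMode.P3_NOT_IN_RANGE
```

For instance, a convenience store that is registered as a POI, is out of view, and lies within the target range yields `P2_DIRECTION_DISTANCE`, while a moving bicycle that is absent from the POI information yields `P3_NOT_IN_RANGE`.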

The display image generation unit 18B may also generate the display image P (first display image P1 to third display image P3) so that it includes information identifying the speaking subject acquired by the speech data acquisition unit 12 (see FIGS. 4 to 6).

The display image generation unit 18B also generates the display image P (first display image P1 to third display image P3) so that it includes information indicating whether the extraction target Te is visible to the user X, based on the determination by the object determination unit 15B of whether the extraction target Te is included in the visual field image (see FIGS. 4 to 6).

Next, the image generation process executed by the display image generation device 1B will be described. FIG. 9 is a flowchart showing the display image generation process. The display image generation process by the display image generation device 1B shown in the flowchart of FIG. 9 is started, for example, when the vehicle 2B is started.

As shown in FIG. 9, in step S201, the POI information storage unit 19 acquires POI information from the outside or from the vehicle 2B and stores it. The display image generation device 1B then proceeds to step S202.

In step S202, the display image generation device 1B uses the speech data acquisition unit 12 to acquire speech data of the voice uttered to the user X by the user (speaking subject) Y. The speech data acquisition unit 12 acquires this speech data from the speech data acquisition device 41 of the speaking subject terminal 4. As described above, the speech data also includes data in which the user Y utters nothing. The speech data acquisition unit 12 further acquires information identifying the user Y and transmits it to the display image generation device 1B. The process then proceeds to step S203.

In step S203, the display image generation device 1B uses the speech data acquisition unit 12 to determine whether the speech data includes an utterance by the user (speaking subject) Y. If it is determined that an utterance by the user Y is included, the process proceeds to step S204. If it is determined that no utterance by the user Y is included, the process proceeds to End.

In step S204, the display image generation device 1B uses the object extraction unit 13 to determine whether an extraction target Te matching the object T can be extracted from the speech data. If it is determined that the extraction target Te can be extracted, the process proceeds to step S205. If it is determined that the extraction target Te cannot be extracted, the process proceeds to End.

In step S205, the display image generation device 1B uses the visual field image acquisition unit 14B to acquire the visual field image of the user X. The visual field image acquisition unit 14B acquires the visual field image of the user X from the visual field image acquisition device 32 of the user terminal 3B worn by the user X. The process then proceeds to step S206.

In step S206, the display image generation device 1B uses the object determination unit 15B to determine whether the extraction target Te is a POI. The display image generation device 1B further uses the object determination unit 15B to determine whether the extraction target Te extracted by the object extraction unit 13 is included in the visual field image of the user X acquired from the visual field image acquisition unit 14B. If it is determined that the extraction target Te is not a POI, or that the extraction target Te is not included in the visual field image of the user X, the process proceeds to step S208. If it is determined that the extraction target Te is a POI and that the extraction target Te is included in the visual field image of the user X, the process proceeds to step S207. Here, for example, if the extraction target Te is a convenience store and that convenience store is stored in the POI information storage unit 19 as POI information, the extraction target Te is determined to be a POI. If, for example, the extraction target Te is a moving bicycle, it is not stored in the POI information storage unit 19 as POI information, so the extraction target Te is determined not to be a POI.

When it is determined that the extraction target Te is a POI and that the extraction target Te is included in the visual field image of the user X, in step S207 the display image generation device 1B uses the display image generation unit 18B to generate the first display image P1, which emphasizes the extraction target Te itself. The display image generation unit 18B acquires the visual field image of the visual field Ex from the visual field image acquisition unit 14B, recognizes the extraction target Te in the visual field image, and generates the first display image P1 showing the extraction target information in a first display mode that is superimposed on the visual field image and emphasizes the extraction target Te itself. The display image generation unit 18B may generate a first display image P1 that further includes information indicating that the extraction target Te is visible to the user X in the visual field image and information identifying the speaking subject acquired by the speech data acquisition unit 12. The display image generation unit 18B transmits the generated first display image P1 to the display image display device 31B of the user terminal 3B.

When it is determined that the extraction target Te is not a POI, or that the extraction target Te is not included in the visual field image of the user X, in step S208 the display image generation device 1B first uses the presence/absence determination unit 16B to determine, based on the POI information stored in the POI information storage unit 19, whether the POI information includes the extraction target Te. If it is determined that the POI information includes the extraction target Te, the display image generation device 1B further uses the presence/absence determination unit 16B to determine, based on the POI information stored in the POI information storage unit 19, whether the extraction target Te exists within the target range. If it is determined that the POI information does not include the extraction target Te, or that the extraction target Te does not exist within the target range, the process proceeds to step S211. If it is determined that the POI information includes the extraction target Te and that the extraction target Te exists within the target range, the process proceeds to step S209. Here, for example, if the extraction target Te is a convenience store and that convenience store is stored in the POI information storage unit 19 as POI information, it is determined that the POI information includes the extraction target Te. If, for example, the extraction target Te is a moving bicycle, it is not stored in the POI information storage unit 19 as POI information, so it is determined that the POI information does not include the extraction target Te.

When it is determined that the POI information includes the extraction target Te and that the position of the extraction target Te is within the target range, in step S209 the display image generation device 1B uses the positional relationship acquisition unit 17B to acquire the positional relationship between the extraction target Te and the user X. The positional relationship acquisition unit 17B estimates, by calculation, the direction and distance from the extraction target Te to the user X or the vehicle 2B based on the acquired position information of the extraction target Te and the position information of the vehicle 2B. The process then proceeds to step S210.

In step S210, the display image generation device 1B uses the display image generation unit 18B to generate the second display image P2, which displays the positional relationship including the direction and distance from the vehicle 2B to the extraction target Te, based on the position information of the extraction target Te and the position information of the vehicle 2B acquired from the positional relationship acquisition unit 17B. The display image generation unit 18B generates the second display image P2 showing the extraction target information in a second display mode that displays a symbol image indicating the direction relative to the visual field Ex of the user X, acquired from the positional relationship acquisition unit 17B, together with the distance. The display image generation unit 18B may generate a second display image P2 that includes information indicating that the extraction target Te is not visible to the user X and information identifying the speaking subject acquired by the speech data acquisition unit 12. The display image generation unit 18B transmits the generated second display image P2 to the display image display device 31B of the user terminal 3B.

When it is determined that the POI information does not include the extraction target Te, or that the position of the extraction target Te is not within the target range, in step S211 the display image generation device 1B uses the positional relationship acquisition unit 17B to acquire the positional relationship between the extraction target Te and the user X. Specifically, the positional relationship acquisition unit 17B acquires, from the presence/absence determination unit 16B, positional relationship information indicating that the extraction target Te does not exist within the preset target range. The process then proceeds to step S212.

In step S212, the display image generation device 1B generates the third display image P3, which displays the positional relationship between the extraction target Te and the user X acquired from the positional relationship acquisition unit 17B, namely that the extraction target Te does not exist within the preset target range. The display image generation unit 18B generates the third display image P3 including information indicating that the extraction target Te is not visible to the user X in the visual field image and information identifying the speaking subject acquired by the speech data acquisition unit 12. The positional relationship including the direction and distance of the position of the extraction target Te (the positional relationship of the second display mode) is not displayed. The display image generation unit 18B transmits the generated third display image P3 to the display image display device 31B of the user terminal 3B.

When the above processing of the display image generation unit 18B is completed, the display image generation device 1B ends the current processing and repeats the display image generation process from step S201.

As described above, the present embodiment includes a POI information storage unit 19 that stores POI information including at least information on the position of the extraction target Te. The presence/absence determination unit 16B determines whether the extraction target Te exists within the target range based on the POI information stored in the POI information storage unit 19. As a result, the presence/absence determination unit 16B can reliably determine whether the extraction target Te exists within the target range.

[Third Embodiment]

FIG. 10 is a block diagram showing the display image generation device 1C according to the third embodiment. This embodiment describes a display image generation device 1C that can execute the display image generation process using a user terminal 3C, which is a display device installed in the vehicle 2C. In the third embodiment, descriptions that are the same as in the first embodiment are omitted or simplified.

In FIG. 10, the display image generation device 1C differs from the display image generation device 1A according to the first embodiment in that it includes a visual field image acquisition unit 14C instead of the visual field image acquisition unit 14A, an object determination unit 15C instead of the object determination unit 15A, a presence/absence determination unit 16C instead of the presence/absence determination unit 16A, a positional relationship acquisition unit 17C instead of the positional relationship acquisition unit 17A, and a display image generation unit 18C instead of the display image generation unit 18A, and in that it includes a line-of-sight recognition unit 20. It is otherwise the same.

The display image generation device 1C, the vehicle 2C, the user terminal 3C, and the speaking subject terminal 4 are connected to one another so that they can communicate (transmit and receive) by wire or wirelessly.

The vehicle 2C differs from the vehicle 2A according to the first embodiment in that it includes a posture acquisition device 23, and is otherwise the same.

The user terminal 3C differs from the user terminal 3A according to the first embodiment in that it does not include the visual field image acquisition device 32 and in that it includes a display image display device 31C instead of the display image display device 31A, and is otherwise the same.

The posture acquisition device 23 acquires image information including a face image of the user X. The posture acquisition device 23 captures an image including the face image of the user X with an in-vehicle camera installed in the vehicle 2C.

The line-of-sight recognition unit 20 recognizes the line of sight of the user X. The "line of sight" is the direction that passes through the center between both eyes of the user X and indicates the face orientation of the user X. The line-of-sight recognition unit 20 acquires image information including the face image of the user X from the posture acquisition device 23 and recognizes the line-of-sight direction of the user X.

The visual field image acquisition unit 14C acquires a visual field image based on the real-time peripheral image acquired by the peripheral image acquisition unit 11 and the line of sight of the user X recognized by the line-of-sight recognition unit 20. More specifically, the visual field image acquisition unit 14C acquires the line-of-sight direction of the user X from the line-of-sight recognition unit 20 and estimates the visual field Ex of the user X. The visual field image acquisition unit 14C acquires a real-time image of the vehicle surroundings from the peripheral image acquisition unit 11, cuts out from that image the region corresponding to the estimated visual field Ex of the user X, and thereby obtains the visual field image. Here, the "region corresponding to the estimated visual field Ex of the user X" is, for example, the region spanning 60 degrees upward and 70 degrees downward in the vertical field of view and 100 degrees to each of the left and right in the horizontal field of view, with the eyes held still.
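One way to picture the cut-out step is as a pixel window in the peripheral image. The sketch below is an assumption-heavy illustration: the source does not describe the format of the peripheral image, so it assumes an equirectangular panorama covering 360 degrees horizontally and 180 degrees vertically, with column 0 at yaw -180 degrees and row 0 at pitch +90 degrees. The function name `fov_pixel_window` and its parameters are hypothetical; only the 60/70/100 degree limits come from the text.

```python
def fov_pixel_window(width, height, gaze_yaw_deg, gaze_pitch_deg=0.0,
                     left_deg=100.0, right_deg=100.0, up_deg=60.0, down_deg=70.0):
    """Return (col_start, col_count, row_top, row_bottom) for the visual field window.

    Assumes an equirectangular panorama: column 0 is yaw -180 degrees and
    row 0 is pitch +90 degrees. Columns wrap around modulo width, so the
    caller should take col_count columns starting at col_start, wrapping
    at the right edge of the image.
    """
    px_per_deg_x = width / 360.0
    px_per_deg_y = height / 180.0

    # Horizontal extent: from (gaze - left) to (gaze + right), with wraparound.
    col_start = int(round((gaze_yaw_deg - left_deg + 180.0) * px_per_deg_x)) % width
    col_count = min(width, int(round((left_deg + right_deg) * px_per_deg_x)))

    # Vertical extent: clamp to the poles of the panorama.
    top_pitch = min(90.0, gaze_pitch_deg + up_deg)
    bottom_pitch = max(-90.0, gaze_pitch_deg - down_deg)
    row_top = int(round((90.0 - top_pitch) * px_per_deg_y))
    row_bottom = int(round((90.0 - bottom_pitch) * px_per_deg_y))
    return col_start, col_count, row_top, row_bottom
```

With a 3600 x 1800 panorama and the gaze pointing straight ahead (yaw 0, pitch 0), the window starts at column 800, spans 2000 columns, and covers rows 300 to 1600.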

The object determination unit 15C determines whether the extraction target Te is included in the visual field image of the visual field Ex of the user X acquired by the visual field image acquisition unit 14C. The object determination unit 15C may make this determination by the same method as the object determination unit 15A according to the first embodiment.

When the object determination unit 15C determines that the extraction target Te is not included in the visual field image, the presence/absence determination unit 16C determines whether the extraction target Te exists within the preset target range. The presence/absence determination unit 16C may make this determination by the same method as the presence/absence determination unit 16A according to the first embodiment.

The positional relationship acquisition unit 17C acquires the relative positional relationship between the extraction target Te and the user X. The positional relationship acquisition unit 17C may estimate the direction and distance from the user X or the vehicle 2C to the extraction target Te by the same method as the positional relationship acquisition unit 17A according to the first embodiment. The positional relationship acquisition unit 17C may also acquire, from the presence/absence determination unit 16C and by the same method as the positional relationship acquisition unit 17A according to the first embodiment, information indicating that the extraction target Te does not exist within the preset target range.

The display image generation unit 18C acquires extraction target information and generates a display image P including the extraction target information.

The display image generation unit 18C determines the display mode of the display image P for the extraction target Te based on the determination result of the object determination unit 15C, in the same manner as in the first embodiment. The meanings of terms such as "extraction target information" are the same as in the first embodiment. When the object determination unit 15C determines that the extraction target Te is included in the visual field image, the display image generation unit 18C acquires the visual field image of the visual field Ex from the visual field image acquisition unit 14C, recognizes the extraction target Te in the visual field image, and generates the first display image P1, which shows the extraction target information in a display mode that is superimposed on the extraction target Te and emphasizes the extraction target Te itself (see FIG. 4).

Further, when the object determination unit 15C determines that the extraction target Te is not included in the visual field image, the display image generation unit 18C determines the display mode of the extraction target information, in the same manner as in the first embodiment, based on whether the presence/absence determination unit 16C has determined that the extraction target Te exists within the target range. More specifically, when the presence/absence determination unit 16C determines that the extraction target Te exists within the target range, the display image generation unit 18C generates the second display image P2, which shows the extraction target information in a display mode that displays the positional relationship, including the direction and distance, of the position of the extraction target Te relative to the reference position (see FIG. 5). The display image generation unit 18C acquires, from the positional relationship acquisition unit 17C, positional relationship information including the direction and distance of the position of the extraction target Te relative to the reference position, and generates the second display image P2 displaying that positional relationship.

Further, when the presence/absence determination unit 16C determines that the extraction target Te does not exist within the target range, the display image generation unit 18C generates the third display image P3, which shows information indicating that the extraction target Te does not exist within the preset target range (see FIG. 6).

As in the first embodiment, the display image generation unit 18C also generates the display image P (first display image P1 to third display image P3) so that it includes information identifying the speaking subject acquired by the speech data acquisition unit 12 (see FIGS. 4 to 6).

The display image generation unit 18C also generates the display image P (first display image P1 to third display image P3) so that it includes information indicating whether the extraction target Te is visible to the user X, based on the determination by the object determination unit 15C of whether the extraction target Te is included in the visual field image (see FIGS. 4 to 6).

Next, the display image generation process executed by the display image generation device 1C will be described. FIG. 11 is a flowchart showing the display image generation process. The process shown in the flowchart of FIG. 11 is started, for example, when the vehicle 2C is started.

As shown in FIG. 11, in step S301 the display image generation device 1C uses the peripheral image acquisition unit 11 to acquire a peripheral image of the user X. The peripheral image acquisition unit 11 acquires the peripheral image captured by the peripheral imaging device 22 of the vehicle 2C. The display image generation device 1C then proceeds to step S302.

In step S302, the display image generation device 1C uses the speech data acquisition unit 12 to acquire speech data of the voice uttered to the user X by the user (speaking subject) Y. The speech data acquisition unit 12 acquires this speech data from the speech data acquisition device 41 of the speaking subject terminal 4. As described above, the speech data also includes data in which the user Y utters nothing. The speech data acquisition unit 12 further acquires information identifying the user Y and passes it to the display image generation device 1C. The process then proceeds to step S303.

In step S303, the display image generation device 1C uses the speech data acquisition unit 12 to determine whether the speech data includes speech by the user (speaking subject) Y. If it is determined that speech by the user Y is included, the process proceeds to step S304. If it is determined that no speech by the user Y is included, the process proceeds to END.

In step S304, the display image generation device 1C uses the object extraction unit 13 to determine whether an extraction target Te that matches an object T can be extracted from the speech data. If it is determined that the extraction target Te can be extracted, the process proceeds to step S305. If it is determined that the extraction target Te cannot be extracted, the process proceeds to END.
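
The matching performed in step S304 can be pictured with the short sketch below. It is an illustration under stated assumptions, not the method of the disclosure: it assumes the speech data has already been converted to text, and the function name `extract_targets` and the sample object list are hypothetical.

```python
from typing import List, Sequence


def extract_targets(utterance_text: str, object_data: Sequence[str]) -> List[str]:
    """Return the stored object names that appear in the utterance text.

    Assumption: the speech data is already transcribed, and a plain
    case-insensitive substring match stands in for the comparison.
    """
    text = utterance_text.casefold()
    return [name for name in object_data if name.casefold() in text]


# Hypothetical pre-stored object data (objects T).
stored_objects = ["tower", "bridge", "stadium"]
found = extract_targets("Look at that red Tower on the left", stored_objects)
print(found)
```

An empty result corresponds to the branch in which no extraction target Te can be extracted and the process proceeds to END.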

In step S305, the display image generation device 1C uses the line-of-sight recognition unit 20 to recognize the line of sight of the user X. The line-of-sight recognition unit 20 acquires image information including a face image of the user X from the posture acquisition device 23 and recognizes the line-of-sight direction of the user X based on the acquired image information. The process then proceeds to step S306.

In step S306, the display image generation device 1C uses the visual field image acquisition unit 14C to acquire the visual field image of the user X. The visual field image acquisition unit 14C acquires the line-of-sight direction of the user X from the line-of-sight recognition unit 20 and estimates the visual field Ex of the user X. It then acquires a real-time image of the vehicle surroundings from the peripheral image acquisition unit 11 and crops, from that image, the region corresponding to the estimated visual field Ex of the user X to obtain the visual field image. The process then proceeds to step S307.
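
One way to picture the cropping in step S306 is the sketch below. It assumes, purely for illustration, that the peripheral image is a 360-degree panorama whose columns map linearly to yaw angle and that the visual field Ex has a fixed horizontal width. Neither assumption is stated in the disclosure, and `visual_field_columns`, `crop_visual_field`, and the 120-degree default are hypothetical.

```python
from typing import List


def visual_field_columns(image_width: int, gaze_yaw_deg: float,
                         fov_deg: float = 120.0) -> List[int]:
    """Return the panorama column indices covered by the estimated visual field."""
    if image_width <= 0:
        raise ValueError("image_width must be positive")
    if not 0.0 < fov_deg <= 360.0:
        raise ValueError("fov_deg must be in (0, 360]")
    px_per_deg = image_width / 360.0
    center = (gaze_yaw_deg % 360.0) * px_per_deg
    half = fov_deg * px_per_deg / 2.0
    start = int(round(center - half))
    count = int(round(2.0 * half))
    # Modulo handles a visual field that wraps around the panorama seam.
    return [(start + i) % image_width for i in range(count)]


def crop_visual_field(panorama: List[List[int]], gaze_yaw_deg: float,
                      fov_deg: float = 120.0) -> List[List[int]]:
    """Cut the visual field region out of a row-major panorama."""
    if not panorama:
        return []
    cols = visual_field_columns(len(panorama[0]), gaze_yaw_deg, fov_deg)
    return [[row[c] for c in cols] for row in panorama]
```

The wrap-around handling matters when the user looks toward the seam of the panorama, for example straight ahead if column 0 is defined as the forward direction.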

In step S307, the display image generation device 1C uses the object determination unit 15C to determine whether the extraction target Te extracted by the object extraction unit 13 is included in the visual field image of the user X acquired by the visual field image acquisition unit 14C. If it is determined that the extraction target Te is included in the visual field image of the user X, the process proceeds to step S308. If it is determined that the extraction target Te is not included in the visual field image of the user X, the process proceeds to step S309.

When it is determined that the extraction target Te is included in the visual field image of the user X, in step S308 the display image generation device 1C uses the display image generation unit 18C to generate a first display image P1 that emphasizes the extraction target Te itself. The display image generation unit 18C acquires the visual field image of the visual field Ex from the visual field image acquisition unit 14C, recognizes the extraction target Te in the visual field image by image recognition, and generates the first display image P1, which shows the extraction target information in a first display mode that emphasizes the extraction target Te itself and is displayed superimposed on the visual field. The display image generation unit 18C may generate the first display image P1 so that it further includes information indicating that the extraction target Te is visible to the user X and the information identifying the speaking subject acquired by the speech data acquisition unit 12. The display image generation unit 18C transmits the generated first display image P1 to the display image display device 31C of the user terminal 3C.

When it is determined that the extraction target Te is not included in the visual field image of the user X, in step S309 the display image generation device 1C uses the presence/absence determination unit 16C to determine whether the extraction target Te exists within the target range, based on the current or past peripheral images acquired by the peripheral image acquisition unit 11. If the presence/absence determination unit 16C determines that the extraction target Te does not exist within the target range, the process proceeds to step S312. If it determines that the extraction target Te exists within the target range, the process proceeds to step S310.

When it is determined that the position of the extraction target Te is within the target range, in step S310 the display image generation device 1C uses the positional relationship acquisition unit 17C to acquire the positional relationship between the extraction target Te and the user X. Based on the current or past peripheral images of the user X acquired from the peripheral image acquisition unit 11, the positional relationship acquisition unit 17C estimates the distance from the extraction target Te to the user X or the vehicle 2C and the direction of the extraction target Te relative to the visual field Ex of the user X. The positional relationship acquisition unit 17C may instead acquire the distance from the extraction target Te to the user X or the vehicle 2C from the presence/absence determination unit 16C. The process then proceeds to step S311.
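
The distance and direction obtained in step S310 can be illustrated with a small geometric sketch. The disclosure estimates these values from peripheral images; the sketch instead assumes that planar coordinates (in metres) of the reference position and of the extraction target Te are already known, and that the gaze direction is given as a yaw angle measured clockwise from the +y axis. The function name `positional_relationship` and these conventions are hypothetical.

```python
import math
from typing import Tuple


def positional_relationship(reference_xy: Tuple[float, float],
                            target_xy: Tuple[float, float],
                            gaze_yaw_deg: float) -> Tuple[float, float]:
    """Return (distance, relative direction in degrees) of the target.

    The relative direction lies in [-180, 180); positive values mean the
    target is to the right of the gaze direction, negative to the left.
    """
    dx = target_xy[0] - reference_xy[0]
    dy = target_xy[1] - reference_xy[1]
    distance = math.hypot(dx, dy)
    # Bearing measured clockwise from the +y axis (assumed "forward").
    bearing_deg = math.degrees(math.atan2(dx, dy))
    relative_deg = (bearing_deg - gaze_yaw_deg + 180.0) % 360.0 - 180.0
    return distance, relative_deg


distance, direction = positional_relationship((0.0, 0.0), (30.0, 40.0), 0.0)
print(f"{distance:.0f} m, {direction:.1f} deg from the gaze direction")
```

A display image could then map the relative direction to an arrow symbol and print the distance next to it, which is the kind of information the second display mode presents.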

In step S311, the display image generation device 1C uses the display image generation unit 18C to generate a second display image P2 that displays the positional relationship acquired from the positional relationship acquisition unit 17C, including the direction and distance of the position of the extraction target Te relative to the reference position. The display image generation unit 18C generates the second display image P2 showing the extraction target information in a second display mode that displays the distance together with a symbol image indicating the direction relative to the visual field Ex of the user X, both acquired from the positional relationship acquisition unit 17C. The display image generation unit 18C may generate the second display image P2 so that it includes information indicating that the extraction target Te is not visible to the user X and the information identifying the speaking subject acquired by the speech data acquisition unit 12. The display image generation unit 18C transmits the generated second display image P2 to the display image display device 31C of the user terminal 3C.

When it is determined that the position of the extraction target Te is not within the target range, in step S312 the display image generation device 1C uses the positional relationship acquisition unit 17C to acquire the positional relationship between the extraction target Te and the user X. Specifically, the positional relationship acquisition unit 17C acquires, from the presence/absence determination unit 16C, positional relationship information indicating that the extraction target Te does not exist within the preset target range. The process then proceeds to step S313.

In step S313, the display image generation device 1C generates a third display image P3 that displays the positional relationship between the extraction target Te and the user X acquired from the positional relationship acquisition unit 17C, namely that the extraction target Te does not exist within the preset target range. The display image generation unit 18C generates the third display image P3 so that it includes information indicating that the extraction target Te is not visible to the user X and the information identifying the speaking subject acquired by the speech data acquisition unit 12. The positional relationship including the direction and distance of the position of the extraction target Te (the positional relationship of the second display mode) is not displayed. The display image generation unit 18C transmits the generated third display image P3 to the display image display device 31C of the user terminal 3C.

When the above processing by the display image generation unit 18C is completed, the display image generation device 1C ends the current cycle and repeats the display image generation process from step S301.
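
The branching of steps S303 to S313 can be condensed into the following sketch. It models only the decisions, taking each determination result as a ready-made boolean; the class and function names (`Observation`, `DisplayImage`, `select_display_image`) are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DisplayImage(Enum):
    P1 = "emphasize the extraction target itself"
    P2 = "show direction and distance from the reference position"
    P3 = "show that the target is not within the target range"


@dataclass
class Observation:
    speech_present: bool      # result of step S303
    target_extracted: bool    # result of step S304
    in_visual_field: bool     # result of step S307
    in_target_range: bool     # result of step S309


def select_display_image(obs: Observation) -> Optional[DisplayImage]:
    """Return the display image to generate, or None when the flow ends early."""
    if not obs.speech_present or not obs.target_extracted:
        return None              # S303 / S304: proceed to END
    if obs.in_visual_field:
        return DisplayImage.P1   # S308
    if obs.in_target_range:
        return DisplayImage.P2   # S310 and S311
    return DisplayImage.P3       # S312 and S313


print(select_display_image(Observation(True, True, False, True)))
```

Keeping the decision separate from image rendering makes it easy to check each branch of the flowchart in isolation.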

As described above, in the present embodiment the display image generation device 1C includes the peripheral image acquisition unit 11, which acquires and stores peripheral images, and the line-of-sight recognition unit 20, which recognizes the line of sight of the user X. The visual field image acquisition unit 14C acquires the visual field image based on the current peripheral image acquired by the peripheral image acquisition unit 11 and the current line of sight of the user X recognized by the line-of-sight recognition unit 20. As a result, the display image generation device 1C can acquire, with the peripheral image acquisition unit 11, a peripheral image that covers a region including the visual field Ex of the user X, recognize the line of sight of the user X with the line-of-sight recognition unit 20, and obtain from the acquired peripheral image the visual field image corresponding to the line of sight of the user X. The visual field image of the user X can therefore be obtained by means of the line-of-sight recognition unit 20 even if the user terminal 3C has no visual field image acquisition device 32.

The display image generation device and the display image generation method of the present disclosure have been described above based on the embodiments. The specific configuration, however, is not limited to these embodiments, and design changes, additions, and the like are permitted as long as they do not depart from the gist of the invention according to each claim.

Each embodiment describes an example in which both the user X and the user Y are riding in the vehicle, but the present disclosure is not limited to this. For example, one or both of the user X and the user Y, who is the speaking subject, may be located outside the vehicles 2A to 2C (that is, at a place away from the vehicles 2A to 2C). In this case, the user terminal of the user X, or a server to which that user terminal can connect, needs to include at least the speech data acquisition unit, the object extraction unit, the object determination unit, and the display image generation unit. The visual field image acquisition unit is included in, for example, the visual field image acquisition device of the user terminal. The peripheral image obtained by the peripheral imaging device may be used as the visual field image corresponding to the visual field Ex of the user X, or the user X may carry the peripheral imaging device, which transmits the peripheral image to the user terminal. Further, when the user X is outside the vehicle, the posture acquisition device 23 is installed on the user terminal 3C or in the vicinity of the user X, and acquires face orientation information of the user X from a face image of the user X or from a sensor. The line-of-sight recognition unit 20 then recognizes the line-of-sight direction of the user X from the face image or face orientation information acquired by the posture acquisition device 23. The visual field image acquisition unit generates the visual field image of the user X based on the peripheral image captured by the peripheral imaging device 22 and the line-of-sight direction of the user X recognized by the line-of-sight recognition unit 20. When the user X is outside the vehicle, the line-of-sight recognition unit 20 is provided in the user terminal of the user X or in a server to which that user terminal can connect. Thus, even when the user X is outside the vehicle, a display image P relating to the extraction target Te included in the speech uttered by the speaking subject is generated for the user terminal and displayed on the display image display device.

Each embodiment describes an example in which the object determination unit outputs information on the result of determining whether the extraction target Te is included in the visual field image of the user X to the speaking subject terminal 4 of the user Y, who is the speaking subject, but the present disclosure is not limited to this. For example, the visual field image of the user X, the display image P, the peripheral image, or the like may be output to the user Y. In particular, when the user Y is outside the vehicle, the image is displayed on the speaking subject terminal 4 of the user Y, on a VR (virtual reality) image display device, or the like. Displaying an image to the user Y in this way lets the user Y learn the area visible to the user X and the line-of-sight direction of the user X, which makes it easier to decide how to steer the conversation between the user X and the user Y.

The peripheral image captured by the peripheral imaging device 22 is not limited to that described in the above embodiments and may be, for example, a visual field image corresponding to the visual field Ex of the user X. For example, when the user Y, who is the speaking subject, is outside the vehicle 2A, part or all of the peripheral image captured by the peripheral imaging device 22 may be displayed on the speaking subject terminal 4. This helps the user Y decide how to steer the conversation between the user X and the user Y.

The display image display devices 31A to 31C of the user terminals 3A to 3C have been described as transmissive displays, but a head-up display installed in the vehicles 2A to 2C may be used instead. For example, the head-up display is installed at a lower position of the front window of the vehicles 2A to 2C and uses a light projector to display an image on the windshield. In this case, the displayed image is the display image P, generated by the display image generation units 18A to 18C, that corresponds to the visual field Ex of the user X.

The speaking subject need not be a person and may be a speaking device that utters speech to the user X. For a speaking device, the speech data is output sentence data. The output sentence data may be audio data in which the speaking device outputs an output sentence (character string) as voice, or may be the output sentence (character string) itself. The display image generation devices 1A to 1C can therefore acquire, through the speech data acquisition device, the output sentence data from the speaking device that utters speech to the user X. In this case, the "speech uttered by the speaking subject" is the "voice uttered (output) by the speaking device". The display image generation units 18A to 18C may also acquire information identifying the speaking device that utters voice to the user X and generate a display image P reading, for example, "Mentioned by Speech output device." As a result, the extraction target information for the extraction target Te included in the speech of the speaking device can be displayed in a display mode appropriate for the user X. Specifically, the speaking device may be a so-called interactive agent device capable of voice dialogue with the user X.

In the above description the speaking subject is a single user Y or a single speaking device, but there may be a plurality of speaking subjects. For example, the speaking subjects may be two or more fellow passengers (users), or one fellow passenger (user) and one speaking device. In this case, the speech data acquisition unit 12 acquires information identifying the speaking subject that uttered the speech to the user X, and the display image generation units 18A to 18C generate a display image P that includes the information identifying the speaking subject acquired by the speech data acquisition unit 12. Consequently, when there are a plurality of speaking subjects, the user X can clearly tell which speaking subject uttered the speech.

The above description shows an example that has both the speech data acquisition unit 12 and the speech data acquisition device 41, but the speech data acquisition device 41 may be omitted if the speech data acquisition unit 12 has the function of the speech data acquisition device 41. Likewise, although an example having the visual field image acquisition units 14A and 14B and the visual field image acquisition device 32 is shown, the visual field image acquisition device 32 may be omitted if the visual field image acquisition units 14A and 14B have its function. Although an example having the line-of-sight recognition unit 20 and the posture acquisition device 23 is shown, the posture acquisition device 23 may be omitted if the line-of-sight recognition unit 20 has its function. Similarly, although an example having the peripheral image acquisition unit 11 and the peripheral imaging device 22 is shown, the peripheral imaging device 22 may be omitted if the peripheral image acquisition unit 11 has its function.

In the second embodiment, the object determination unit 15B determines whether the extraction target Te is a POI and also determines whether an image of the extraction target Te is included in the visual field image acquired by the visual field image acquisition unit 14B, but the present disclosure is not limited to this. For example, the object determination unit may determine only whether an image of the extraction target is included in the visual field image acquired by the visual field image acquisition unit, without determining whether the extraction target is a POI. With this determination, the first display image is generated whenever the extraction target is determined to be included in the visual field image, even if the extraction target is not a POI.

1A, 1B, 1C  Display image generation device
11  Peripheral image acquisition unit
12  Speech data acquisition unit
13  Object extraction unit
14A, 14B, 14C  Visual field image acquisition unit
15A, 15B, 15C  Object determination unit
16A, 16B, 16C  Presence/absence determination unit
17A, 17B, 17C  Positional relationship acquisition unit
18A, 18B, 18C  Display image generation unit
19  POI information storage unit
20  Line-of-sight recognition unit
2A, 2B, 2C  Vehicle
21  Navigation device
22  Peripheral imaging device
23  Posture acquisition device
3A, 3B, 3C  User terminal
31A, 31B, 31C  Display image display device
32  Visual field image acquisition device
4  Speaking subject terminal
41  Speech data acquisition device

Claims

A display image generation device that identifies, as an extraction target, an object included in speech uttered by a speaking subject and generates a display image relating to the extraction target, the display image generation device comprising:
a speech data acquisition unit that acquires speech data of the speech uttered to a user by the speaking subject;
an object extraction unit that stores a plurality of object data in advance, compares the plurality of object data with the speech data acquired by the speech data acquisition unit, and extracts, as the extraction target, data in the speech data that matches the object data;
a visual field image acquisition unit that acquires an image including at least a visual field image corresponding to the user's field of view;
an object determination unit that determines whether the extraction target extracted by the object extraction unit is included in the visual field image; and
a display image generation unit that acquires extraction target information, which is information on the position of the extraction target, and generates the display image including the extraction target information, which is different from the visual field image,
wherein the display image generation unit determines a display mode of the display image relating to the extraction target based on the result of the determination by the object determination unit as to whether the extraction target is included in the visual field image.

The display image generation device according to claim 1, wherein, when the object determination unit determines that the extraction target is included in the visual field image, the display image generation unit generates the display image showing the extraction target information in the display mode that emphasizes the extraction target itself.

The display image generation device according to claim 1 or 2, further comprising a positional relationship acquisition unit that acquires a relative positional relationship between the extraction target and the user,
wherein, when the object determination unit determines that the extraction target is not included in the visual field image, the display image generation unit generates the display image showing the extraction target information in the display mode that displays the positional relationship.

The display image generation device according to any one of claims 1 to 3, wherein the display image generation unit generates the display image including information indicating whether the extraction target is visible to the user, based on the result of the determination by the object determination unit as to whether the extraction target is included in the visual field image.

The display image generation device according to any one of claims 1 to 4, further comprising a presence/absence determination unit that, when the object determination unit determines that the extraction target is not included in the visual field image, determines whether the extraction target exists within a preset target range,
wherein the display image generation unit determines the display mode of the extraction target information based on the result of the determination by the presence/absence determination unit as to whether the extraction target exists within the target range.

The display image generation device according to claim 5, further comprising a peripheral image acquisition unit that acquires a peripheral image, which is an image of a region around the user including the visual field image, and stores the acquired peripheral image,
wherein the presence/absence determination unit determines whether the extraction target exists within the target range based on the current or past peripheral image acquired by the peripheral image acquisition unit.

The display image generation device according to claim 5 or 6, further comprising a positional relationship acquisition unit that acquires a relative positional relationship between the extraction target and the user,
wherein, when the presence/absence determination unit determines that the extraction target exists within the target range, the display image generation unit generates the display image showing the extraction target information in the display mode that displays the positional relationship including the direction and distance of the position of the extraction target relative to a reference position set at the user's position or at a position in the vicinity of the user.

The display image generation device according to any one of claims 5 to 7, wherein the extraction target is a POI (Point of Interest), which is a landmark associated with a position on a map,
the device further comprising a POI information storage unit that stores POI information of the POI including at least information on the position of the extraction target,
wherein the presence/absence determination unit determines whether or not the extraction target exists within the target range based on the POI information stored in the POI information storage unit.
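
A range determination based on stored POI positions could look like the sketch below. It assumes that the target range is a circle of fixed radius around the user and that POI positions are stored as latitude and longitude. The claims do not specify either point, and the names `Poi`, `haversine_m` and `poi_within_range` are hypothetical.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius used by the haversine formula


@dataclass(frozen=True)
class Poi:
    # Hypothetical POI record holding only the position information.
    name: str
    lat_deg: float
    lon_deg: float


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def poi_within_range(poi: Poi, user_lat: float, user_lon: float, range_m: float) -> bool:
    """True if the POI lies within range_m of the user (assumed circular range)."""
    return haversine_m(user_lat, user_lon, poi.lat_deg, poi.lon_deg) <= range_m
```

A POI located 0.001 degrees of longitude east of a user on the equator is roughly 111 m away, so it would be inside a 200 m range and outside a 100 m range.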

The display image generation device according to any one of claims 1 to 8, wherein the speaking subject is a person, and
the remark data is speech signal data of the remark made to the user by the person.

The display image generation device according to any one of claims 1 to 8, wherein the speaking subject is a speaking device that makes the remark to the user, and
the remark data is output sentence data indicating the content of an output sentence output as the remark.

The display image generation device according to any one of claims 1 to 10, wherein the object determination unit outputs, to the speaking subject, information on the result of the determination of whether or not the extraction target is included in the visual field image.

The display image generation device according to any one of claims 1 to 11, wherein the remark data acquisition unit acquires information that identifies the speaking subject who made the remark to the user, and
the display image generation unit generates the display image including the information identifying the speaking subject acquired by the remark data acquisition unit.

The display image generation device according to any one of claims 1 to 12, further comprising: a peripheral image acquisition unit that acquires a peripheral image, which is an image of a peripheral area of the user including the visual field image, and stores the acquired peripheral image; and
a line-of-sight recognition unit that recognizes the user's line of sight,
wherein the visual field image acquisition unit acquires the visual field image based on the current peripheral image acquired by the peripheral image acquisition unit and the current line of sight of the user recognized by the line-of-sight recognition unit.
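
As a rough illustration of deriving a visual field image from a peripheral image and a line of sight, the sketch below crops a horizontal window out of a 360-degree panorama centered on the gaze direction. The panorama representation (a list of pixel rows covering 360 degrees), the fixed field-of-view angle, and the function name `crop_visual_field` are all assumptions made for this example and are not specified in the claims.

```python
from typing import List, Sequence, TypeVar

Pixel = TypeVar("Pixel")


def crop_visual_field(
    panorama: Sequence[Sequence[Pixel]],
    gaze_yaw_deg: float,
    fov_deg: float,
) -> List[List[Pixel]]:
    """Cut the columns around the gaze direction out of a 360-degree panorama.

    Column 0 is assumed to correspond to a yaw of 0 degrees, with yaw
    increasing with the column index. The window wraps around the seam.
    """
    width = len(panorama[0])
    center = int(round((gaze_yaw_deg % 360.0) / 360.0 * width)) % width
    # Half-width of the window in columns on each side of the center column.
    half = max(1, int(round(fov_deg / 360.0 * width))) // 2
    columns = [(center + offset) % width for offset in range(-half, half + 1)]
    return [[row[c] for c in columns] for row in panorama]
```

Because the column indices are taken modulo the panorama width, a gaze direction near 0 degrees still yields a contiguous field of view that spans the left and right edges of the stored peripheral image.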

A display image generation method performed by a display image generation device that identifies an object included in a remark made by a speaking subject as an extraction target and generates a display image relating to the extraction target, the method comprising:
a remark data acquisition step of acquiring remark data of the remark made to a user by the speaking subject;
an object extraction step of comparing a plurality of object data stored in advance with the acquired remark data and extracting, as the extraction target, data in the remark data that matches the object data;
a visual field image acquisition step of acquiring a visual field image corresponding to the user's field of view;
an object determination step of determining whether or not the extracted extraction target is included in the visual field image; and
a display image generation step of acquiring extraction target information, which is information about the position of the extraction target, and generating the display image, which includes the extraction target information and is different from the visual field image,
wherein, in the display image generation step, the display mode of the display image relating to the extraction target is determined based on the result of the determination, in the object determination step, of whether or not the extraction target is included in the visual field image.
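
The sequence of steps in the method can be sketched end to end as follows. This is a simplified illustration under several assumptions: the remark data is already available as text, matching is done by case-insensitive substring search, the visual field image is represented by the set of object labels recognized in it, and the display image is represented by a small record. The names `DisplayImage`, `extract_targets` and `generate_display_images`, and the mode strings, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Set, Tuple


@dataclass(frozen=True)
class DisplayImage:
    # Stand-in for a rendered display image: what to show, where, and how.
    target: str
    position: Optional[Tuple[float, float]]
    mode: str


def extract_targets(remark_text: str, object_data: Sequence[str]) -> List[str]:
    """Object extraction step: keep stored object data that occur in the remark."""
    lowered = remark_text.lower()
    return [name for name in object_data if name.lower() in lowered]


def generate_display_images(
    remark_text: str,
    object_data: Sequence[str],
    visible_labels: Set[str],
    positions: Dict[str, Tuple[float, float]],
) -> List[DisplayImage]:
    """Run extraction, determination and display image generation in order."""
    images = []
    for target in extract_targets(remark_text, object_data):
        # Object determination step: is the target in the visual field image?
        in_view = target in visible_labels
        # The display mode depends on the determination result.
        mode = "highlight" if in_view else "guide"
        images.append(DisplayImage(target, positions.get(target), mode))
    return images
```

In this sketch, a remark such as "Look at the tower next to the station" with only the tower in view would yield a "highlight" entry for the tower and a "guide" entry for the station.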