JP2014233045A - Video display system and video display method

Info

Publication number: JP2014233045A
Application number: JP2013114118A
Authority: JP
Inventors: Tomofumi Saegusa; Shiro Ozawa; Hideaki Takada; Munekazu Date; Akira Kojima
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2013-05-30
Filing date: 2013-05-30
Publication date: 2014-12-11

Abstract

PROBLEM TO BE SOLVED: To achieve smooth communication in multi-party dialogue over a video communication system by processing the displayed image on the basis of the positional relationship among the users and the directions of their faces.
SOLUTION: A main point device 100 and a slave point device 200, connected through a network, each calculate the position and direction of the face in a captured image, deform the image on the basis of the calculation result, and transmit it to the opposite device. On receiving the image, the device combines it with a background image on the basis of the positional relationship and face directions of the users, and displays the result.

Description

The present invention relates to video communication technology.

Video conference systems have conventionally been used as systems in which a plurality of users take part in discussions and conversations. In a video conference system, video of a user captured by a single fixed camera is transmitted to the conversation partner, so that discussions and conversations can take place among a plurality of users. With this method, however, the user's line-of-sight direction is difficult to convey to the conversation partner. To address this problem, a technique has been proposed in which the user is photographed from multiple directions with a plurality of cameras, so that the user's line-of-sight direction is conveyed to the conversation partner and communication proceeds naturally (see, for example, Non-Patent Document 1). A video communication technique has also been proposed in which a single camera photographs a user holding a window frame that serves as an interface between the user's side and the partner's side, and the user's space is displayed inside the window frame on the screen viewed by the partner (see, for example, Non-Patent Document 2).

Non-Patent Document 1: Vertegaal, R., et al., “GAZE-2: conveying eye contact in group video conferencing using eye-controlled camera direction”, CHI ‘03, pp. 521-528. 2002.
Non-Patent Document 2: Shingo Fujita, Osamu Yoshino (藤田真吾、吉野考), “Three-dimensional movement of frame and effect of existence in superimposed video chat” (「重畳表示型ビデオチャットにおける枠の３次元的な移動と存在の効果」), Information Processing Society of Japan.

In a conversation, knowing the partner's line-of-sight direction and direction of interest is important, because it makes natural conversation easier and lets the conversation proceed smoothly by revealing what the partner is interested in. With conventional multi-party video communication systems, one-on-one conversation is easy when the users already know each other, or have used the video communication system before and are sufficiently accustomed to it. However, it is difficult to recognize a partner located somewhere other than directly in front of the screen, the positional relationship among a plurality of users, and their line-of-sight directions.

In the technique of Non-Patent Document 1, the user is photographed with cameras installed in multiple directions, and the user's line-of-sight direction is conveyed to the conversation partner by changing the orientation of the window frame that displays the user's video according to each user's line-of-sight direction. Consequently, the camera positions are fixed, and the communication environment can be realized only when the positional relationship among the users is known. In addition, in an environment that uses small devices such as terminal devices (for example, smartphones or tablet terminals), it is difficult to provide a plurality of cameras. In the technique of Non-Patent Document 2, displaying the window frame on the screen makes the user and the conversation partner aware that they are clearly in different spaces. Moreover, the offset between the work area in real space and the observation position on the screen makes it difficult to convey clear pointing gestures and lines of sight. As a result, smooth communication cannot be achieved.

In view of the above circumstances, an object of the present invention is to provide a technique that enables smooth communication.

One aspect of the present invention is a video display system including: an acquisition unit that acquires the orientation of a user who is a conversation partner; a processing unit that processes video of the user based on the acquired orientation; and a display unit that displays the processed video.

One aspect of the present invention is the video display system described above, wherein the processing unit changes the transparency of the region of the video that does not include the user when the orientation is toward the screen.

One aspect of the present invention is the video display system described above, wherein the processing unit changes the color of the frame of the video when the orientation is toward the screen.

One aspect of the present invention is the video display system described above, wherein the processing unit performs blurring on the region of the video that does not include the user, and on the frame of that region, when the orientation is toward the screen.

One aspect of the present invention is a video display method including: an acquisition step of acquiring the orientation of a user who is a conversation partner; a processing step of processing video of the user based on the acquired orientation; and a display step of displaying the processed video.

According to the present invention, smooth communication can be achieved.

FIG. 1 is a block diagram showing the system configuration of the video display system in the present embodiment.
FIG. 2 is a diagram showing a specific method of deforming an image based on the coordinates of a person's face position.
FIG. 3 is a diagram showing a specific example of a face position deformed image.
FIG. 4 is a diagram showing a specific example in which the transparency of the background region is changed.
FIG. 5 is a diagram showing an example in which images of conversation partners are arranged in a virtual space.
FIG. 6 is a diagram showing a specific example of displaying a colored frame around an image.
FIG. 7 is a diagram showing a specific example of an image in which the background region and the edge region are blurred.
FIG. 8 is a diagram showing how users at the respective sites observe an object at the main point and converse using the video communication system according to the present embodiment.
FIG. 9 is a sequence diagram showing the flow of operation of the video display system in the present embodiment.
FIG. 10 is a sequence diagram showing the flow of operation of the video display system in the present embodiment.

In the video display system of the present embodiment, the video of the conversation partner changes according to the direction in which the partner is paying attention. The user can therefore tell where the partner's attention is directed, which facilitates smooth communication. Details of one embodiment of the present invention are described below.

FIG. 1 is a block diagram showing the system configuration of the video display system in the present embodiment. The video display system of the present invention is configured by connecting, via a network, a main point device 100 installed at a main point and a slave point device 200 installed at a slave point. The main point is the point that has the image generation unit 106. The image generation unit 106 generates a background image shared among the users. Each device (the main point device 100 and the slave point device 200) displays the background image generated by the image generation unit 106 together with the video of each user. There are one or more slave points; a slave point is a point that does not have the image generation unit 106. In the following description, the user at the main point is referred to as user A and the user at a slave point as user B.

The functional configuration of the main point device 100 is described next. The main point device 100 includes a CPU (Central Processing Unit), a memory, an auxiliary storage device, and the like connected by a bus, and executes a main point program. By executing the main point program, the main point device 100 functions as a device including an imaging unit 101, a face position acquisition unit 102, an image deformation unit 103, a communication unit 104, a region division unit 105, an image generation unit 106, a position/direction calculation unit 107, a determination unit 108, a processing unit 109, a video space generation unit 110, and a display unit 111. All or some of the functions of the main point device 100 may be implemented in hardware such as an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array). The main point program may be recorded on a computer-readable recording medium. A computer-readable recording medium is, for example, a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk built into a computer system. The main point program may also be transmitted and received via a telecommunication line.

The imaging unit 101 captures an image that includes user A at the main point and the background. The imaging unit 101 outputs the captured image to the face position acquisition unit 102.
The face position acquisition unit 102 acquires the coordinate values of the face position of the person (user A) from the image captured by the imaging unit 101. The technique described in Reference 1 below may be used to acquire the face position. Reference 1: Paul Viola and Michael J. Jones. Rapid Object Detection using a Boosted Cascade of Simple Features. IEEE CVPR, 2001.

The image deformation unit 103 generates a face position deformed image based on the acquired coordinate values of the person's face position. Specifically, the image deformation unit 103 deforms the image so that the acquired face position is emphasized, thereby generating the face position deformed image. As the deformation method, a method such as a Bezier curve that uses the left and right ends of the image as reference points and the coordinates of the face position as a control point may be used.

The processing of the image deformation unit 103 is described in detail below with reference to FIG. 2. FIG. 2 is a diagram showing a specific method of deforming an image based on the coordinates of a person's face position.
FIG. 2(A) shows an image in which a user is captured, together with the coordinate system. As shown in FIG. 2(A), the left end of the image is denoted A and the right end B. A is the coordinate of the left end of the image and is written A (x_A, y_A). B is the coordinate of the right end of the image and is written B (x_B, y_B). Here and below, a notation such as x_A means that A is a subscript of x, and likewise for y_A, x_B, and y_B. Note that x_A < x_B and y_A < y_B. In the coordinate system shown in FIG. 2(A), the horizontal direction is the x axis, the vertical direction is the z axis, and the depth direction is the y axis.

The image deformation unit 103 deforms the image along a quadratic Bezier curve that uses the left end A and the right end B of the image as reference points and one point in the image as the control point, for example the center position of the face of the person in the image or the midpoint between the left and right eyes. The center coordinate of the person's face identified in the image is written C1 (x_C1, z_C1), where x_C1 and z_C1 denote x and z with subscript C1.

As shown in FIG. 2(B), C is a point on the straight lines that pass through the midpoint of line segment AB at angles of 45 degrees and 315 degrees in the x-y plane, and the x coordinate of C is given by the x coordinate of the position of the person's face in the image. The image deformation unit 103 calculates y_C by substituting x_C1 for x_C in Equation (1). Here x_C and y_C denote x and y with subscript C. The image deformation unit 103 then calculates x_C by substituting the calculated y_C into Equation (1). Through this processing, the image deformation unit 103 obtains C (x_C, y_C).

After calculating the x and y coordinates of the three points A, B, and C, the image deformation unit 103 calculates a point P (x_P, y_P) on the quadratic Bezier curve that has A as its start point, B as its end point, and C as its control point, based on Equations (2) and (3). Here x_P and y_P denote x and y with subscript P.
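Equations (2) and (3) are not reproduced in this text. For reference, a quadratic Bezier curve with start point A, end point B, and control point C has the standard form below. This is presumably what Equations (2) and (3) express, but it is a reconstruction from the definition of the curve, not a copy of the original formulas.

```latex
\begin{align}
x_P &= (1-t)^2\, x_A + 2t(1-t)\, x_C + t^2\, x_B \\
y_P &= (1-t)^2\, y_A + 2t(1-t)\, y_C + t^2\, y_B
\end{align}
```

At t = 0 the point P coincides with A, and at t = 1 it coincides with B.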

Here, t in Equations (2) and (3) is a parameter that increases from 0 to 1 and is given by Equation (4). X in Equation (4) corresponds to the position X of an image column in the captured image. The image deformation unit 103 obtains P for each image-column position X, and deforms and arranges the image along the resulting curve.
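The mapping from image columns to points on the curve can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the control point C is taken as an input, and the relation t = X / (width - 1) is an assumption standing in for Equation (4), which is not reproduced in this text.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in the x-y plane of FIG. 2(B)


def bezier_point(a: Point, c: Point, b: Point, t: float) -> Point:
    """Point on the quadratic Bezier curve with start a, control c, end b."""
    u = 1.0 - t
    x = u * u * a[0] + 2.0 * u * t * c[0] + t * t * b[0]
    y = u * u * a[1] + 2.0 * u * t * c[1] + t * t * b[1]
    return (x, y)


def column_positions(width: int, a: Point, c: Point, b: Point) -> List[Point]:
    """Map each image column X (0 .. width-1) to a point P on the curve.

    Assumption: t = X / (width - 1), so t runs from 0 to 1 across the image.
    """
    if width < 2:
        raise ValueError("width must be at least 2")
    return [bezier_point(a, c, b, col / (width - 1)) for col in range(width)]
```

Each returned point gives the position in the x-y plane at which the corresponding image column would be placed, so columns near the control point are pulled toward it.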

FIG. 3 shows a specific example of a face position deformed image. As FIG. 3 shows, the image is deformed so that the person's face position is emphasized.
After generating the face position deformed image, the image deformation unit 103 rotates it to match user A's line-of-sight direction. The deformation method is not limited to the one described above. For example, deforming the image using the viewpoint position, eyeball position, head position, or the like of the person in the image can reinforce the conveyance of the person's position and line-of-sight direction.

The communication unit 104 transmits data to and receives data from the slave point device 200. For example, the communication unit 104 transmits the face position deformed image data generated by the image deformation unit 103 to the slave point device 200, and receives face position deformed image data from the slave point device 200.
The region division unit 105 performs region division on the face position deformed image generated by the image deformation unit 203 at another point (for example, the slave point device 200). Specifically, the region division unit 105 separates the person region from the background region by taking the difference between the received face position deformed image and information on the background region and person region acquired in advance. The person region is the region of the person captured in the foreground of the image. The background region is the region other than the person region; a wall is one example. The region division unit 105 may instead distinguish the person region from the background region by a method for identifying person regions, such as that of Reference 1 above, and divide the image accordingly. The region division unit 105 outputs the person region image and the background region image to the processing unit 109.
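The difference-based division described above can be sketched as a simple background subtraction. This is a minimal illustration under stated assumptions: images are grayscale nested lists, the function name `split_regions` is hypothetical, and the fixed threshold of 30 is an arbitrary value not taken from the source.

```python
from typing import List

Image = List[List[int]]  # grayscale image, row-major, values 0..255


def split_regions(frame: Image, background: Image,
                  threshold: int = 30) -> List[List[bool]]:
    """Return a mask that is True where a pixel is judged to be the person.

    A pixel is treated as part of the person region when it differs from the
    pre-captured background by more than `threshold` (an assumed value).
    """
    if len(frame) != len(background):
        raise ValueError("frame and background must have the same height")
    mask = []
    for frame_row, bg_row in zip(frame, background):
        if len(frame_row) != len(bg_row):
            raise ValueError("frame and background must have the same width")
        mask.append([abs(f - b) > threshold
                     for f, b in zip(frame_row, bg_row)])
    return mask
```

Pixels where the mask is False form the background region that the later processing steps operate on.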

The image generation unit 106 generates a background image (hereinafter referred to as the "virtual space environment") by photographing the space that serves as the main point with a camera in advance and pasting the result onto a hemisphere or polyhedron prepared so as to surround the persons in the virtual space. The virtual space environment can be generated from still camera images prepared in advance or from video shot in advance. Real-time video can also be used by installing a plurality of cameras in the environment shared by the users.

The position/direction calculation unit 107 acquires the position of the user's viewpoint and the line-of-sight direction relative to the environment (background image) at each point. The position/direction calculation unit 107 also acquires the position and orientation of the display. Methods of acquiring a three-dimensional position include, for example, using a device that measures three-dimensional position with a plurality of cameras or a device that measures three-dimensional position magnetically. Specifically, the position/direction calculation unit 107 calculates the position of user A's viewpoint and the line-of-sight direction relative to the environment from the head position and the positions of both eyes of the person in the image.

For the origin (reference point) that serves as the basis of three-dimensional positions, a point set by user A when the system starts may be used, or a preset value may be used. The reference point can also be reset while the system is running. The viewpoint position represents the distance between the reference point and the user, and the user's direction relative to the environment. The position of the user's viewpoint and the line-of-sight direction relative to the reference point may also be acquired using a three-dimensional position sensor attached to the user or placed around the user.

The position/direction calculation unit 107 also acquires the position of the display at the main point (user A) relative to the environment. The display position indicates the distances from the reference point to the top, bottom, left, and right edges of the display, and the orientation of the display relative to the reference point. This position can be acquired using, for example, image processing, a position sensor, ultrasound, or an infrared light sensor.
The position/direction calculation unit 107 detects which part of the communication environment the user (user A) is observing, according to the position of the display relative to the position of the person, and outputs the detection result to the video space generation unit 110, the processing unit 109, and the slave point device 200.

The determination unit 108 determines whether an observation target such as a person or an object exists in user B's line-of-sight direction, as calculated by the position/direction calculation unit 206 at another point (for example, a slave point). An observation target is a person taking part in the conversation or an object that is a topic of the conversation.
The processing unit 109 processes the image in which user B is captured, based on the determination result of the determination unit 108. Specifically, when an observation target exists in user B's line-of-sight direction, the processing unit 109 processes the background region of the image in which user B is captured. Three processing methods are given as specific examples: changing the transparency of the background region, displaying a colored frame around the image, and blurring the background region and the frame around the image. When no observation target exists in user B's line-of-sight direction, the processing unit 109 performs no processing.
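The determination step can be sketched as an angular test between the line-of-sight direction and the direction to each candidate target. The source does not state how the presence of a target is decided, so the whole approach here is an assumption: the function name `has_observation_target`, the 3D vector representation, and the 10-degree tolerance are all hypothetical.

```python
import math
from typing import Iterable, Sequence

Vector = Sequence[float]  # 3D coordinates relative to the reference point


def _norm(v: Vector) -> float:
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))


def has_observation_target(eye: Vector, gaze: Vector,
                           targets: Iterable[Vector],
                           max_angle_deg: float = 10.0) -> bool:
    """Return True if any target lies within max_angle_deg of the gaze.

    Assumption: a target counts as "in the line-of-sight direction" when the
    angle between the gaze vector and the eye-to-target vector is small.
    """
    gaze_len = _norm(gaze)
    if gaze_len == 0.0:
        return False
    for target in targets:
        offset = [t - e for t, e in zip(target, eye)]
        dist = _norm(offset)
        if dist == 0.0:
            continue  # target coincides with the eye; direction undefined
        cos_angle = sum(g * o for g, o in zip(gaze, offset)) / (gaze_len * dist)
        cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
        if math.degrees(math.acos(cos_angle)) <= max_angle_deg:
            return True
    return False
```

In this sketch, the processing unit would apply one of the three background processes only when the function returns True, and leave the image unchanged otherwise.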

The user's interest can be judged from the presence of an observation target in the user's line-of-sight direction. A user participating from a remote site can also set the degree of interest explicitly by operating a button, a slide bar, or the like.

Three processes are described below as specific examples of the processing performed by the processing unit 109.
(1. Processing to change the transparency of the background region)
When an observation target, such as a person taking part in the conversation or an object that is a topic of the conversation, exists in the conversation partner's line-of-sight direction, the processing unit 109 changes the transparency of the background region of the partner's image. Specifically, the processing unit 109 increases the transparency of the background region. This makes the user feel as if only the partner's video is displayed in the virtual space, which strengthens the partner's sense of presence in the virtual space and makes communication easier. When no observation target exists in the partner's line-of-sight direction, the processing unit 109 does not change the transparency of the background region of the partner's image, so the partner's video and the background region are both displayed in the virtual space. FIG. 4 shows a specific example in which the transparency of the background region is changed.

FIG. 4 is a diagram showing a specific example in which the transparency of the background region is changed.
As FIG. 4 shows, the transparency increases in the order of FIG. 4(A), FIG. 4(B), and FIG. 4(C). FIG. 4(A) shows an image whose transparency has not been changed, so the partner's region and the background region are both clearly displayed. FIG. 4(B) shows an image whose background region is more transparent than in FIG. 4(A), so the background region appears fainter than in FIG. 4(A). FIG. 4(C) shows an image whose background region is more transparent than in FIG. 4(B), so the background region appears fainter still. As the transparency increases, the background region fades, and the user can be made to feel as if only the partner's video is displayed in the virtual space.
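The effect of raising the background transparency can be sketched as alpha blending of the partner's image over the virtual space. This is an illustrative sketch only: the source does not specify the compositing formula, so the linear blend, the grayscale representation, and the names `composite` and `background_opacity` are assumptions.

```python
from typing import List

Image = List[List[float]]      # grayscale image, row-major
Mask = List[List[bool]]        # True where the pixel belongs to the person


def composite(partner: Image, virtual: Image, person_mask: Mask,
              background_opacity: float) -> Image:
    """Blend the partner image over the virtual space.

    Person pixels stay fully opaque. Background pixels are blended with the
    virtual space using `background_opacity` (1.0 = unchanged background,
    0.0 = fully transparent background). Linear blending is an assumption.
    """
    if not 0.0 <= background_opacity <= 1.0:
        raise ValueError("background_opacity must be between 0 and 1")
    result = []
    for p_row, v_row, m_row in zip(partner, virtual, person_mask):
        row = []
        for p, v, is_person in zip(p_row, v_row, m_row):
            alpha = 1.0 if is_person else background_opacity
            row.append(alpha * p + (1.0 - alpha) * v)
        result.append(row)
    return result
```

Lowering `background_opacity` step by step corresponds to the progression from FIG. 4(A) to FIG. 4(C): the person stays unchanged while the background gives way to the virtual space.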

FIG. 5 is a diagram showing an example in which images of conversation partners are arranged in a virtual space.
As shown in FIG. 5, two images 10 and 20 are arranged in the virtual space. In image 10, the partner's region and the background region are both clearly displayed, so it feels as if a separate space exists inside the virtual space. In image 20, the background region is fainter than in image 10, so it feels as if only the partner's video is displayed in the virtual space.

(2. Processing to display a colored frame around the image)
When an observation target exists in the conversation partner's line-of-sight direction, the processing unit 109 changes the color of the frame of the partner's image. When no observation target exists in the partner's line-of-sight direction, the processing unit 109 does not change the color of the frame. FIG. 6 shows a specific example in which the colored frame around the image is changed.

図６は、画像の周囲に色を付けた枠を表示する具体例を示す図である。
図６には、２枚の画像が配置されている。加工部１０９は、対話の相手の視線方向に対して観察対象が存在する場合に、仮想空間に配置されている画像（例えば、画像３０）の枠の色を変化させる。例えば、加工部１０９は、画像の枠の色を緑に変化させる。また、例えば、加工部１０９は、画像の枠の色を赤に変化させる。このような処理が行われることにより、ユーザは、対話の相手の状態を一見して理解することができる。 FIG. 6 is a diagram illustrating a specific example of displaying a frame with a color around the image.
In FIG. 6, two images are arranged. When there is an observation target in the line-of-sight direction of the conversation partner, the processing unit 109 changes the color of the frame of an image (for example, the image 30) arranged in the virtual space. For example, the processing unit 109 changes the frame color to green. Alternatively, the processing unit 109 may change the frame color to red. This processing lets the user understand the state of the conversation partner at a glance.
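As a rough illustration of the frame processing, the following sketch recolors the border pixels of a partner image only when an observation target exists in the partner's line of sight. The helper names `draw_frame` and `frame_for_partner`, the default green color, and the border thickness are hypothetical choices; the specification only states that the frame color changes.

```python
def draw_frame(image, color, thickness=1):
    """Return a copy of image (rows of RGB tuples) with a colored border."""
    height = len(image)
    width = len(image[0]) if height else 0
    framed = [list(row) for row in image]
    for y in range(height):
        for x in range(width):
            on_border = (
                y < thickness or y >= height - thickness
                or x < thickness or x >= width - thickness
            )
            if on_border:
                framed[y][x] = color
    return framed


def frame_for_partner(image, has_observation_target,
                      color=(0, 255, 0), thickness=1):
    """Color the frame only when the partner is looking at some target."""
    if not has_observation_target:
        # No target in the line of sight: the frame is left unchanged.
        return [list(row) for row in image]
    return draw_frame(image, color, thickness)
```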

（３．背景領域、及び、縁の領域をぼかして表現する処理）
対話の相手の視線方向に対して観察対象が存在する場合、加工部１０９は対話の相手の画像の背景領域、及び、背景領域の縁にぼかし処理を行う。一方、対話の相手の視線方向に対して観察対象が存在しない場合、加工部１０９はぼかし処理を行わない。図７に、背景領域、及び、縁の領域をぼかした場合の具体例を示す。 (3. Processing to blur the background area and the edge area)
When there is an observation target for the line-of-sight direction of the conversation partner, the processing unit 109 performs a blurring process on the background area of the conversation partner image and the edge of the background area. On the other hand, when there is no observation target for the line-of-sight direction of the conversation partner, the processing unit 109 does not perform the blurring process. FIG. 7 shows a specific example when the background area and the edge area are blurred.

図７は、背景領域、及び、縁の領域をぼかした画像の具体例を示す図である。
図７（Ａ）から図７（Ｃ）に示されるように、仮想空間内に配置されている対話の相手の画像の背景領域、及び、背景領域の縁に対してぼかし処理が行われている。このような処理が行われることにより、対話の相手の領域と仮想空間の領域との接合の度合いを高め、同じ空間に存在しているように感じさせることができる。また、この処理については、上述した透明度を変化させる処理や背景領域の枠に色を付けて表示する処理を組み合わせて実行されてもよい。 FIG. 7 is a diagram illustrating a specific example of an image in which the background region and the edge region are blurred.
As shown in FIG. 7A to FIG. 7C, the blurring process is performed on the background area of the conversation partner image arranged in the virtual space and on the edge of the background area. This processing blends the conversation partner area more closely with the virtual space area and makes the partner feel as if they exist in the same space. This process may also be executed in combination with the above-described process of changing the transparency and the process of displaying a colored frame around the background area.
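The blurring process can be sketched as a box blur that is applied only to pixels outside the conversation partner area. The specification does not name a particular blur filter, so the box filter, the grayscale pixel format, and the function name `blur_background` are assumptions for illustration.

```python
def blur_background(image, person_mask, radius=1):
    """Box-blur the background of a grayscale image (rows of ints).

    person_mask marks partner pixels with True; those pixels are copied
    unchanged, while every other pixel becomes the integer mean of its
    (2 * radius + 1) square neighborhood, clipped at the image border.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [list(row) for row in image]
    for y in range(height):
        for x in range(width):
            if person_mask[y][x]:
                continue  # keep the conversation partner sharp
            total = 0
            count = 0
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[ny][nx]
                    count += 1
            result[y][x] = total // count
    return result
```

In this sketch the neighborhood of a background pixel may include partner pixels, which softens the boundary between the two regions. Whether the actual system blurs the boundary this way is not stated in the source.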

以上で、加工部１０９の処理の具体例についての説明を終了する。
なお、上述した３つの処理は、ユーザの設定により任意に変更することができる。 This concludes the description of specific examples of the processing performed by the processing unit 109.
The three processes described above can be arbitrarily changed according to user settings.

映像空間生成部１１０は、加工部１０９の出力結果である画像、位置方向算出部１０７の出力結果である他の地点（例えば、従地点）の映像、及び、他の地点に位置するユーザＢにおける視点の位置と視点方向（３次元座標位置）を用いて、仮想空間環境内に他の地点のユーザの画像を配置することで人物の映像を含んだ仮想空間環境を生成する。 The video space generation unit 110 generates a virtual space environment containing video of people by placing images of users at other points in the virtual space environment. To do so, it uses the image output by the processing unit 109, the video of the other point (for example, a slave point) output by the position / direction calculation unit 107, and the viewpoint position and viewing direction (three-dimensional coordinate position) of user B located at the other point.
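One way to picture the placement performed by the video space generation unit is to treat each partner image as an upright rectangle in the virtual space, centered on the partner's viewpoint position and turned to face the partner's viewing direction. The sketch below computes the four corners of such a rectangle. The rectangle model, the y-up coordinate convention, and the name `place_partner_image` are assumptions and are not described in the specification.

```python
import math


def place_partner_image(position, facing, width, height):
    """Return the corners of an upright image quad in virtual space.

    position: (x, y, z) center of the quad (assumed: partner's viewpoint).
    facing: (x, y, z) viewing direction; only its horizontal part is used.
    Corners are returned as bottom-left, bottom-right, top-right, top-left,
    as seen by someone standing in front of the quad.
    """
    fx, fz = facing[0], facing[2]
    norm = math.hypot(fx, fz)
    if norm == 0.0:
        raise ValueError("facing direction has no horizontal component")
    fx, fz = fx / norm, fz / norm
    # Horizontal vector perpendicular to the facing direction (up x facing).
    rx, rz = fz, -fx
    half_w, half_h = width / 2.0, height / 2.0
    cx, cy, cz = position
    return [
        (cx - rx * half_w, cy - half_h, cz - rz * half_w),
        (cx + rx * half_w, cy - half_h, cz + rz * half_w),
        (cx + rx * half_w, cy + half_h, cz + rz * half_w),
        (cx - rx * half_w, cy + half_h, cz - rz * half_w),
    ]
```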

表示部１１１は、ＣＲＴ（Cathode Ray Tube）ディスプレイ、液晶ディスプレイ、有機ＥＬ（Electro Luminescence）ディスプレイ等の画像表示装置である。表示部１１１は、生成された主地点のユーザＡ用の映像をディスプレイに表示する。具体的には、表示部１１１は、映像空間生成部１１０によって生成された３次元の仮想空間環境を、自地点であるユーザＡにおける視線の位置と視線方向に基づいて、２次元のスクリーン上に透視投影変換し描画結果をディスプレイ上に表示する。 The display unit 111 is an image display device such as a CRT (Cathode Ray Tube) display, a liquid crystal display, or an organic EL (Electro Luminescence) display. The display unit 111 displays the generated video for the user A at the main point on the display. Specifically, the display unit 111 displays the three-dimensional virtual space environment generated by the video space generation unit 110 on a two-dimensional screen based on the position of the line of sight and the line-of-sight direction of the user A who is the local point. Perspective projection conversion is performed and the drawing result is displayed on the display.

次に、従地点装置２００の機能構成を説明する。従地点装置２００は、バスで接続されたＣＰＵやメモリや補助記憶装置などを備え、従地点プログラムを実行する。従地点プログラムの実行によって、従地点装置２００は、撮像部２０１、顔位置取得部２０２、画像変形部２０３、通信部２０４、領域分割部２０５、位置方向算出部２０６、判断部２０７、加工部２０８、映像空間生成部２０９、表示部２１０を備える装置として機能する。なお、従地点装置２００の各機能の全て又は一部は、ＡＳＩＣやＰＬＤやＦＰＧＡ等のハードウェアを用いて実現されてもよい。また、従地点プログラムは、コンピュータ読み取り可能な記録媒体に記録されてもよい。コンピュータ読み取り可能な記録媒体とは、例えばフレキシブルディスク、光磁気ディスク、ＲＯＭ、ＣＤ−ＲＯＭ等の可搬媒体、コンピュータシステムに内蔵されるハードディスク等の記憶装置である。また、主地点プログラムは、電気通信回線を介して送受信されてもよい。 Next, the functional configuration of the follower device 200 will be described. The follower device 200 includes a CPU, a memory, an auxiliary storage device, and the like connected by a bus, and executes a follower program. By executing the follow-up program, the follow-up device 200 includes an imaging unit 201, a face position acquisition unit 202, an image transformation unit 203, a communication unit 204, a region division unit 205, a position / direction calculation unit 206, a determination unit 207, and a processing unit 208. , Function as an apparatus including a video space generation unit 209 and a display unit 210. Note that all or part of the functions of the follower device 200 may be realized using hardware such as an ASIC, a PLD, or an FPGA. The follower program may be recorded on a computer-readable recording medium. The computer-readable recording medium is, for example, a portable medium such as a flexible disk, a magneto-optical disk, a ROM, a CD-ROM, or a storage device such as a hard disk built in the computer system. The main point program may be transmitted / received via a telecommunication line.

撮像部２０１は、従地点のユーザＢ及び背景を含む画像を撮影する。撮像部２０１は、撮影した画像を顔位置取得部２０２に出力する。
顔位置取得部２０２は、顔位置取得部１０２と同様の処理により、撮像部２０１によって撮影された画像から人物（ユーザＢ）の顔位置の座標値を取得する。 The imaging unit 201 captures an image including the user B and the background at the slave point. The imaging unit 201 outputs the captured image to the face position acquisition unit 202.
The face position acquisition unit 202 acquires the coordinate value of the face position of the person (user B) from the image captured by the imaging unit 201 by the same processing as the face position acquisition unit 102.

画像変形部２０３は、画像変形部１０３と同様の処理により、取得されたユーザＢの顔位置の座標値に基づいて顔位置変形画像を生成する。
通信部２０４は、主地点装置１００との間でデータの送受信を行う。また、通信部２０４は、他の従地点装置２００との間でデータの送受信を行う。 The image deforming unit 203 generates a face position deformed image based on the acquired coordinate value of the face position of the user B by the same processing as the image deforming unit 103.
The communication unit 204 transmits / receives data to / from the main point device 100. In addition, the communication unit 204 transmits and receives data to and from another slave device 200.

領域分割部２０５は、領域分割部１０５と同様の処理により、他の地点（例えば、主地点装置１００）の画像変形部１０３で生成された顔位置変形画像に対して領域分割を行う。
位置方向算出部２０６は、予め決められた従地点内の基準点に対する従地点のユーザＢの視点の位置と視線方向を取得する。また、位置方向算出部２０６は、基準点に対する従地点（ユーザＢ）のディスプレイの位置を取得する。ディスプレイは、物体やユーザＡなど任意の視線方向に設置され、移動可能である。 The region dividing unit 205 performs region division on the face position deformed image generated by the image deforming unit 103 at another point (for example, the main point device 100) by the same processing as the region dividing unit 105.
The position / direction calculation unit 206 acquires the viewpoint position and line-of-sight direction of user B at the slave point relative to a predetermined reference point within the slave point. The position / direction calculation unit 206 also acquires the position of the display at the slave point (user B) relative to the reference point. The display is movable and can be set up facing any direction the user looks, such as toward an object or user A.

判断部２０７は、他の地点（例えば、主地点）の位置方向算出部１０７によって算出されたユーザＡの視線方向に対して、人物や物体などの観察対象が存在するか否かを判断する。
加工部２０８は、判断部２０７の判断結果に基づいてユーザＡが撮像されている画像を加工する。 The determination unit 207 determines whether or not there is an observation target such as a person or an object with respect to the line-of-sight direction of the user A calculated by the position / direction calculation unit 107 of another point (for example, the main point).
The processing unit 208 processes the image in which user A is captured, based on the determination result of the determination unit 207.
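The determination of whether an observation target exists in a user's line-of-sight direction could, for example, be modeled as a ray test against bounding spheres placed around each person or object in the virtual space. The specification does not say how the determination is made, so the sphere model and the names `gaze_hits_target` and `has_observation_target` are purely illustrative assumptions.

```python
import math


def gaze_hits_target(viewpoint, gaze_direction, target_center, target_radius):
    """Return True if the gaze ray passes within target_radius of the target."""
    norm = math.sqrt(sum(c * c for c in gaze_direction))
    if norm == 0.0:
        raise ValueError("gaze_direction must be non-zero")
    direction = [c / norm for c in gaze_direction]
    to_target = [t - v for t, v in zip(target_center, viewpoint)]
    # Distance along the ray to the point closest to the target center.
    along = sum(a * b for a, b in zip(to_target, direction))
    if along < 0.0:
        return False  # the target lies behind the viewer
    dist_sq = sum(c * c for c in to_target) - along * along
    return dist_sq <= target_radius ** 2


def has_observation_target(viewpoint, gaze_direction, targets):
    """targets: iterable of (center, radius) pairs for people and objects."""
    return any(
        gaze_hits_target(viewpoint, gaze_direction, center, radius)
        for center, radius in targets
    )
```

The boolean result would then select whether the processing unit applies the transparency, frame, or blur processing to the partner image.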

映像空間生成部２０９は、加工部１０９の出力結果である画像、位置方向算出部１０７の出力結果である他の地点（例えば、主地点）の画像及び他の地点に位置するユーザＡにおける観察位置と方向（３次元座標位置）を用いて、仮想空間環境内に他の地点のユーザの画像を配置することで人物の映像を含んだ仮想空間環境を生成する。 The video space generation unit 209 generates a virtual space environment containing video of people by placing images of users at other points in the virtual space environment. To do so, it uses the image output by the processing unit 109, the image of the other point (for example, the main point) output by the position / direction calculation unit 107, and the observation position and direction (three-dimensional coordinate position) of user A located at the other point.

表示部２１０は、ＣＲＴディスプレイ、液晶ディスプレイ、有機ＥＬディスプレイ等の画像表示装置である。表示部２１０は、映像空間生成部２０９によって生成された従地点のユーザＢ用の映像をディスプレイに表示する。具体的には、表示部２１０は、映像空間生成部２０９によって作成された３次元の仮想空間環境を、自地点であるユーザＢにおける視線の位置と視線方向に基づいて、２次元のスクリーン上に透視投影変換し描画結果をディスプレイ上に表示する。 The display unit 210 is an image display device such as a CRT display, a liquid crystal display, or an organic EL display. The display unit 210 displays the video for the user B at the slave point generated by the video space generation unit 209 on the display. Specifically, the display unit 210 displays the three-dimensional virtual space environment created by the video space generation unit 209 on a two-dimensional screen based on the position of the line of sight and the line of sight direction of the user B who is the local point. Perspective projection conversion is performed and the drawing result is displayed on the display.

図８は、本発明の映像表示システムを用いて、主地点、従地点１、従地点２の３地点にいる３人のユーザが主地点にある物体を観察し、対話する様子を示す図である。
符号３００は、仮想空間における物体と主地点のユーザ、従地点１のユーザ及び従地点２のユーザの位置関係を示している。また、符号３０１は主地点の様子であり、符号３０２は従地点１の様子である。主地点、及び、従地点１のそれぞれにおいて、ユーザは大型のディスプレイを用いて物体と対話の相手の様子を観察している。符号３０３は、従地点２の様子であり、ユーザは、タブレット型の端末を用いて物体と相手の様子を観察している。各拠点のユーザは、ディスプレイに表示されている映像により、物体を中心とした仮想空間における他の対話の相手の表示位置や、顔の表情、視線及び体の向きによって表される視線方向を一目で把握できる。 FIG. 8 is a diagram showing how three users at three locations, the main point, slave point 1, and slave point 2, observe an object at the main point and converse using the video display system of the present invention.
Reference numeral 300 indicates the positional relationship in the virtual space among the object, the user at the main point, the user at slave point 1, and the user at slave point 2. Reference numeral 301 shows the main point, and reference numeral 302 shows slave point 1. At the main point and at slave point 1, the user observes the object and the conversation partners on a large display. Reference numeral 303 shows slave point 2, where the user observes the object and the conversation partners on a tablet-type terminal. From the video shown on the display, the user at each site can grasp at a glance where the other conversation partners are displayed in the virtual space centered on the object, along with their facial expressions and the gaze direction indicated by their line of sight and body orientation.

このように、本発明の映像表示システムは、ユーザが利用するディスプレイの大きさや、ユーザとの距離や角度に応じた表現で対話の相手の映像を表示させ、さらには、人物と物体や空間を連続的に表現することで、コミュニケーションを活性化することができる。 In this way, the video display system of the present invention displays the video of the other party of conversation in an expression according to the size of the display used by the user, the distance and angle with the user, and further displays the person, object, and space. By expressing continuously, communication can be activated.

図９及び図１０は、本実施形態における映像表示システムの動作の流れを示すシーケンス図である。なお、図９及び図１０では、説明の簡単化のため、主地点装置１００と１つの従地点装置２００が存在する２地点の場合について説明する。まず、図９を用いて主地点装置１００に映像を表示させる具体例について説明する。
まず、従地点装置２００の撮像部２０１は、従地点のユーザＢ及び背景を含んだ画像を撮影する（ステップＳ１０１）。顔位置取得部２０２は、撮影された画像から人物の顔位置の座標値を取得する（ステップＳ１０２）。画像変形部２０３は、取得された人物の顔位置の座標に基づいて、撮影された画像を変形することによって顔位置変形画像を生成する（ステップＳ１０３）。 FIGS. 9 and 10 are sequence diagrams showing the flow of operations of the video display system in the present embodiment. For simplicity of explanation, FIGS. 9 and 10 describe the case of two points, in which the main point device 100 and one slave point device 200 exist. First, a specific example in which video is displayed on the main point device 100 will be described with reference to FIG. 9.
First, the imaging unit 201 of the follower device 200 takes an image including the follower user B and the background (step S101). The face position acquisition unit 202 acquires the coordinate value of the person's face position from the captured image (step S102). The image deforming unit 203 generates a face position deformed image by deforming the captured image based on the acquired coordinates of the face position of the person (step S103).

位置方向算出部２０６は、環境に対するユーザＢの視点の位置や視線方向を取得する。また、位置方向算出部２０６は、環境に対する従地点のディスプレイの位置を取得する（ステップＳ１０４）。位置方向算出部２０６は、取得したユーザＢの視点の位置や視線方向、及び、ディスプレイの位置を表示部２１０に出力する。また、位置方向算出部２０６は、取得したユーザＢの視点の位置や視線方向、及び、ディスプレイの位置を通信部２０４に出力する。通信部２０４は、生成された顔位置変形画像データ、ユーザＢの視点の位置や視線方向、及び、ディスプレイの位置を主地点装置１００に送信する（ステップＳ１０５）。 The position direction calculation unit 206 acquires the position of the viewpoint of the user B and the line-of-sight direction with respect to the environment. Further, the position / direction calculation unit 206 acquires the position of the slave point display with respect to the environment (step S104). The position / direction calculation unit 206 outputs the acquired viewpoint position, line-of-sight direction, and display position of the user B to the display unit 210. In addition, the position / direction calculation unit 206 outputs the acquired viewpoint position and line-of-sight direction of the user B and the display position to the communication unit 204. The communication unit 204 transmits the generated face position deformed image data, the viewpoint position and line-of-sight direction of the user B, and the display position to the main point device 100 (step S105).

主地点装置１００の通信部１０４は、従地点装置２００から送信された情報を受信する（ステップＳ１０６）。通信部１０４は、受信した情報からユーザＢの顔位置変形画像データ、ユーザＢの視点の位置や視線方向、及び、ディスプレイの位置を取得する（ステップＳ１０７）。加工部１０９は、判断部１０８の判断結果に基づいてユーザＢが撮像されている顔位置変形画像を加工する（ステップＳ１０８）。映像空間生成部１１０は、加工部１０９の出力結果である画像、位置方向算出部２０６の出力結果である他の地点（例えば、従地点）の画像、及び、他の地点に位置するユーザＢにおける視点の位置と視線方向（３次元座標位置）を用いて、仮想空間環境内に他の地点のユーザＢの画像を配置することで人物の映像を含んだ仮想空間環境を生成する（ステップＳ１０９）。位置方向算出部１０７は、環境に対するユーザＡの視点の位置、及び、視線方向を取得する。表示部１１１は、取得された主地点のユーザＡの視点の位置、及び、視線方向に基づいて、２次元のスクリーン上に透視投影変換することによって映像空間生成部１１０によって生成された仮想空間環境をディスプレイに表示する（ステップＳ１１０）。 The communication unit 104 of the main point device 100 receives the information transmitted from the slave point device 200 (step S106). The communication unit 104 obtains user B's face position deformed image data, user B's viewpoint position and line-of-sight direction, and the display position from the received information (step S107). The processing unit 109 processes the face position deformed image in which user B is captured, based on the determination result of the determination unit 108 (step S108). The video space generation unit 110 generates a virtual space environment containing video of a person by placing the image of user B at the other point in the virtual space environment, using the image output by the processing unit 109, the image of the other point (for example, the slave point) output by the position / direction calculation unit 206, and the viewpoint position and line-of-sight direction (three-dimensional coordinate position) of user B located at the other point (step S109). The position / direction calculation unit 107 acquires the position of the viewpoint of user A with respect to the environment and the line-of-sight direction. 
The display unit 111 displays the virtual space environment generated by the video space generation unit 110 on the display by performing perspective projection conversion onto a two-dimensional screen based on the acquired viewpoint position and line-of-sight direction of user A at the main point (step S110).
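The perspective projection conversion in step S110 can be illustrated with a simple pinhole model: a point in the virtual space is expressed in a camera frame built from the user's viewpoint position and line-of-sight direction, then divided by its depth. The pinhole model, the world up vector, the `focal_length` parameter, and the function name `project_point` are assumptions. The actual projection used by the display unit is not detailed in the source.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _normalize(v):
    norm = math.sqrt(_dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(c / norm for c in v)


def project_point(point, viewpoint, gaze_direction,
                  focal_length=1.0, up=(0.0, 1.0, 0.0)):
    """Project a 3D point onto the 2D screen of a viewer.

    Returns (x, y) screen coordinates, or None when the point is not in
    front of the viewer. Raises ValueError if the gaze is parallel to up.
    """
    forward = _normalize(gaze_direction)
    right = _normalize(_cross(up, forward))
    true_up = _cross(forward, right)
    relative = _sub(point, viewpoint)
    depth = _dot(relative, forward)
    if depth <= 0.0:
        return None
    return (focal_length * _dot(relative, right) / depth,
            focal_length * _dot(relative, true_up) / depth)
```

Because the projection depends on the viewpoint and gaze arguments, moving the user or the display changes what is drawn, which matches the behavior the description attributes to the display unit.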

次に、図１０を用いて従地点装置２００に映像を表示させる具体例について説明する。
まず、主地点装置１００の撮像部１０１は、主地点のユーザＡ及び背景を含んだ画像を撮影する（ステップＳ２０１）。顔位置取得部１０２は、撮影された画像から人物の顔位置の座標値を取得する（ステップＳ２０２）。画像変形部１０３は、取得された人物の顔位置の座標に基づいて、撮影された画像を変形することによって顔位置変形画像を生成する（ステップＳ２０３）。 Next, a specific example in which an image is displayed on the follower device 200 will be described with reference to FIG.
First, the imaging unit 101 of the main point device 100 captures an image including the user A and the background at the main point (step S201). The face position acquisition unit 102 acquires the coordinate value of the person's face position from the captured image (step S202). The image deforming unit 103 generates a face position deformed image by deforming the captured image based on the acquired face position coordinates of the person (step S203).

位置方向算出部１０７は、環境に対するユーザＡの視点の位置や視線方向を取得する。また、位置方向算出部１０７は、環境に対する主地点のディスプレイの位置を取得する（ステップＳ２０４）。位置方向算出部１０７は、取得したユーザＡの視点の位置や視線方向、及び、ディスプレイの位置を表示部１１１に出力する。また、位置方向算出部１０７は、取得したユーザＡの視点の位置や視線方向、及び、ディスプレイの位置を通信部１０４に出力する。通信部１０４は、生成された顔位置変形画像データ、ユーザＡの視点の位置や視線方向、及び、ディスプレイの位置を従地点装置２００に送信する（ステップＳ２０５）。 The position / direction calculation unit 107 acquires the position of the viewpoint of the user A and the line-of-sight direction with respect to the environment. Further, the position / direction calculation unit 107 acquires the position of the display of the main point with respect to the environment (step S204). The position / direction calculation unit 107 outputs the acquired viewpoint position, line-of-sight direction, and display position of the user A to the display unit 111. Further, the position / direction calculation unit 107 outputs the acquired position of the viewpoint of the user A, the line-of-sight direction, and the display position to the communication unit 104. The communication unit 104 transmits the generated face position deformation image data, the position and line-of-sight direction of the user A's viewpoint, and the display position to the follower device 200 (step S205).

従地点装置２００の通信部２０４は、主地点装置１００から送信された情報を受信する（ステップＳ２０６）。通信部２０４は、受信した情報からユーザＡの顔位置変形画像データ、ユーザＡの視点の位置や視線方向、及び、ディスプレイの位置を取得する（ステップＳ２０７）。加工部２０８は、判断部２０７の判断結果に基づいてユーザＡが撮像されている顔位置変形画像を加工する（ステップＳ２０８）。映像空間生成部２０９は、加工部２０８の出力結果である画像、位置方向算出部１０７の出力結果である他の地点（例えば、主地点）の画像、及び、他の地点に位置するユーザＡにおける視点の位置と視線方向（３次元座標位置）を用いて、仮想空間環境内に他の地点のユーザＡの画像を配置することで人物の映像を含んだ仮想空間環境を生成する（ステップＳ２０９）。位置方向算出部２０６は、環境に対するユーザＢの視点の位置、及び、視線方向を取得する。表示部２１０は、取得した従地点のユーザＢの視点の位置、及び、視線方向に基づいて、２次元のスクリーン上に透視投影変換することによって映像空間生成部２０９によって生成された仮想空間環境をディスプレイに表示する（ステップＳ２１０）。 The communication unit 204 of the slave point device 200 receives the information transmitted from the main point device 100 (step S206). The communication unit 204 acquires user A's face position deformed image data, user A's viewpoint position and line-of-sight direction, and the display position from the received information (step S207). The processing unit 208 processes the face position deformed image in which user A is captured, based on the determination result of the determination unit 207 (step S208). The video space generation unit 209 generates a virtual space environment containing video of a person by placing the image of user A at the other point in the virtual space environment, using the image output by the processing unit 208, the image of the other point (for example, the main point) output by the position / direction calculation unit 107, and the viewpoint position and line-of-sight direction (three-dimensional coordinate position) of user A located at the other point (step S209). The position / direction calculation unit 206 acquires the position of the viewpoint of user B with respect to the environment and the line-of-sight direction. 
The display unit 210 displays the virtual space environment generated by the video space generation unit 209 on the display by performing perspective projection conversion onto a two-dimensional screen based on the acquired viewpoint position and line-of-sight direction of user B at the slave point (step S210).
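In both sequences, the sending device transmits three kinds of information to the other point (steps S105 and S205): the face position deformed image data, the user's viewpoint position and line-of-sight direction, and the display position. A minimal sketch of such a message is given here. The class name `SiteUpdate`, the field names, and the JSON/base64 encoding are hypothetical, since the specification does not define a wire format.

```python
import base64
import json
from dataclasses import dataclass, asdict


@dataclass
class SiteUpdate:
    """Information one point sends to another (assumed layout)."""
    site_id: str                 # hypothetical identifier of the sending point
    image: bytes                 # encoded face position deformed image
    viewpoint_position: tuple    # (x, y, z) of the user's viewpoint
    gaze_direction: tuple        # (x, y, z) line-of-sight direction
    display_position: tuple      # (x, y, z) of the display

    def to_json(self) -> str:
        payload = asdict(self)
        # Raw bytes are not valid JSON, so the image is base64-encoded.
        payload["image"] = base64.b64encode(self.image).decode("ascii")
        return json.dumps(payload)

    @classmethod
    def from_json(cls, text: str) -> "SiteUpdate":
        payload = json.loads(text)
        return cls(
            site_id=payload["site_id"],
            image=base64.b64decode(payload["image"]),
            viewpoint_position=tuple(payload["viewpoint_position"]),
            gaze_direction=tuple(payload["gaze_direction"]),
            display_position=tuple(payload["display_position"]),
        )
```

The receiving communication unit would decode such a message (steps S106 to S107, or S206 to S207) before handing the image and pose data to the processing and video space generation units.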

以上のように構成された映像表示システムによれば、複数の利用者が同じ空間を共有し、自由な位置関係を取りながらコミュニケーションを取ることができる。具体的には、ユーザの視線方向に対して、他のユーザや物体などが存在するかに応じて表示されているユーザの背景領域が加工される。したがって、ユーザは、画面に表示されている他のユーザの画像を見るだけで容易にそのユーザの状態を把握することができる。そのため、円滑なコミュニケーションを図ることができる。 According to the video display system configured as described above, a plurality of users can share the same space and communicate while taking a free positional relationship. Specifically, the displayed background area of the user is processed according to whether there are other users or objects with respect to the user's line-of-sight direction. Therefore, the user can easily grasp the state of the user just by looking at the image of the other user displayed on the screen. Therefore, smooth communication can be achieved.

また、画像変形部１０３及び２０３が、取得された画像中のユーザの顔位置に基づいて２次元の画像を変形し顔位置変形画像を生成する。その後、生成された画像が仮想空間内に表示されることにより、各ユーザの視線方向を容易に伝達することが可能になる。
また、ユーザの観察位置や視線方向に応じて、仮想空間中に人物を配置することにより、１つの空間を共有することができる。また、ユーザが観察したい空間中の人物や物体が存在する際には、ユーザが移動したり、ディスプレイの向きを変更するだけで用に所望の映像を見ることが可能になる。 Further, the image deforming units 103 and 203 deform the two-dimensional image based on the user's face position in the acquired image and generate a face position deformed image. Thereafter, the generated image is displayed in the virtual space, so that it becomes possible to easily transmit the line-of-sight direction of each user.
Moreover, one space can be shared by arranging a person in the virtual space according to the user's observation position and line-of-sight direction. In addition, when there is a person or object in the space that the user wants to observe, the user can view a desired video just by moving or changing the orientation of the display.

＜変形例＞
主地点装置１００は、撮像部１０１に代えて画像取得部を備えるように構成されてもよい。画像取得部は、主地点装置１００の周囲に設置された１つの固定カメラから、ユーザＡ及び背景を含む画像を取得する。従地点装置２００は、撮像部２０１に代えて画像取得部を備えるように構成されてもよい。画像取得部は、従地点装置２００の周囲に設置された１つの固定カメラから、ユーザＢ及び背景を含む画像を取得する。 <Modification>
The main point device 100 may be configured to include an image acquisition unit instead of the imaging unit 101. The image acquisition unit acquires an image including the user A and the background from one fixed camera installed around the main point device 100. The follower device 200 may be configured to include an image acquisition unit instead of the imaging unit 201. The image acquisition unit acquires an image including the user B and the background from one fixed camera installed around the slave point device 200.

以上、この発明の実施形態について図面を参照して詳述してきたが、具体的な構成はこの実施形態に限られるものではなく、この発明の要旨を逸脱しない範囲の設計等も含まれる。 The embodiment of the present invention has been described in detail with reference to the drawings. However, the specific configuration is not limited to this embodiment, and includes designs and the like that do not depart from the gist of the present invention.

１００…主地点装置，２００…従地点装置，１０１…撮像部，１０２…顔位置取得部，１０３…画像変形部，１０４…通信部，１０５…領域分割部，１０６…画像生成部，１０７…位置方向算出部，１０８…判断部，１０９…加工部，１１０…映像空間生成部，１１１…表示部，２０１…撮像部，２０２…顔位置取得部，２０３…画像変形部，２０４…通信部，２０５…領域分割部，２０６…位置方向算出部，２０７…判断部，２０８…加工部，２０９…映像空間生成部，２１０…表示部 DESCRIPTION OF SYMBOLS 100 ... main point device, 200 ... slave point device, 101 ... imaging unit, 102 ... face position acquisition unit, 103 ... image deforming unit, 104 ... communication unit, 105 ... region dividing unit, 106 ... image generation unit, 107 ... position / direction calculation unit, 108 ... determination unit, 109 ... processing unit, 110 ... video space generation unit, 111 ... display unit, 201 ... imaging unit, 202 ... face position acquisition unit, 203 ... image deforming unit, 204 ... communication unit, 205 ... region dividing unit, 206 ... position / direction calculation unit, 207 ... determination unit, 208 ... processing unit, 209 ... video space generation unit, 210 ... display unit

Claims

1. A video display system comprising:
an acquisition unit that acquires the orientation of a user who is a conversation partner;
a processing unit that processes video of the user based on the acquired orientation; and
a display unit that displays the processed video.

2. The video display system according to claim 1, wherein the processing unit changes the transparency of an area of the video that does not include the user when the orientation is toward the screen.

3. The video display system according to claim 1, wherein the processing unit changes the color of a frame of the video when the orientation is toward the screen.

4. The video display system according to claim 1, wherein the processing unit performs a blurring process on an area of the video that does not include the user and on a frame of that area when the orientation is toward the screen.

5. A video display method comprising:
an acquisition step of acquiring the orientation of a user who is a conversation partner;
a processing step of processing video of the user based on the acquired orientation; and
a display step of displaying the processed video.