JP2020065229A - Video communication method, video communication device, and video communication program

Info

Publication number: JP2020065229A
Application number: JP2018197595A
Authority: JP
Inventors: Masatoshi Yokoi
Original assignee: Nippon Telegraph and Telephone West Corp
Current assignee: Nippon Telegraph and Telephone West Corp
Priority date: 2018-10-19
Filing date: 2018-10-19
Publication date: 2020-04-23

Abstract

To reduce the amount of data processing and the amount of data sent and received when a user's figure is displayed at a remote location. SOLUTION: The skeleton position of a user, detected from images taken from multiple directions at another base, is received; the posture of the user's 3D real avatar is changed on the basis of the skeleton position; and the 3D real avatar is superimposed on the real landscape and displayed. The skeleton position is detected from each image in which the user is photographed, and the three-dimensional coordinates of the skeleton position are obtained by triangulation from the skeleton positions detected in a plurality of images. The user is 3D-scanned to obtain 3D scan data, and a rigged base model is deformed to match the shape of the 3D scan data to generate the 3D real avatar. The user's coordinates in the coordinate system of the other base are received, converted into a space sharing coordinate system corresponding to the real landscape, and the 3D real avatar is placed at the converted coordinates. SELECTED DRAWING: Figure 12

Description

The present invention relates to video communication technology between remote locations, and more particularly to technology for transmitting and receiving three-dimensional video.

In Japan, as the aging of society advances, a shortage of successors in every industry, including traditional crafts, and widening regional disparities in medical care and education have become social problems.

Meanwhile, remote communication has evolved with technological innovation from telephones to videophones and then to communication using 3D avatars in virtual worlds. Devices have likewise progressed from personal computers (PCs) to smartphones, and augmented reality (AR) devices and virtual reality (VR) devices have since appeared.

In conventional 3D remote communication using 3D characters, a known approach uses stylized (deformed) 3D characters that share a 3D virtual space; each user operates a 3D character with a handheld controller and converses from that character's viewpoint (Non-Patent Document 1).

As 3D remote communication using an actual human figure, a known approach generates a huge amount of 3D data representing the sender's own figure with a camera equipped with a depth sensor, exchanges that data with the other party every frame, and displays it on each party's device (Non-Patent Document 2).

Non-Patent Document 1: "VRChat", [online], VRChat Inc., [retrieved July 17, 2018], Internet <URL: https://www.vrchat.net/>
Non-Patent Document 2: "Holoportation", [online], Microsoft, [retrieved July 11, 2018], Internet <URL: https://www.microsoft.com/en-us/research/project/holoportation-3/>

For remote technical instruction and remote medical care, a method that uses the sender's own figure, as in Non-Patent Document 2, is considered preferable to a method that uses stylized 3D characters, as in Non-Patent Document 1, because it allows more accurate instruction and care.

In Non-Patent Document 2, making the sender appear at a remote location requires generating an enormous 3D point cloud representing the sender and then transmitting and receiving that large amount of 3D data every frame. Generating the 3D data requires a high-specification computer, and a high-speed line is also needed to transmit and receive the large amount of 3D data.

The present invention has been made in view of the above, and its object is to reduce the amount of data processing and the amount of data transmitted and received when a user's figure is displayed at a remote location.

A video communication method according to the present invention includes: a step of receiving skeleton position information of a communication partner detected from images captured from a plurality of directions; a step of changing, based on the skeleton position information, the posture of a rigged three-dimensional model having the appearance of the communication partner; and a step of displaying the three-dimensional model superimposed on a real landscape.

A video communication device according to the present invention includes: a receiving unit that receives skeleton position information of a communication partner detected from images captured from a plurality of directions; a model control unit that changes, based on the skeleton position information, the posture of a rigged three-dimensional model having the appearance of the communication partner; and a display unit that displays the three-dimensional model superimposed on a real landscape.

A video communication program according to the present invention causes a computer to execute the above video communication method.

According to the present invention, it is possible to reduce the amount of data processing and the amount of data transmitted and received when a user's figure is displayed at a remote location.

FIG. 1 is a diagram showing the configuration for generating a 3D real avatar in the video communication system of the present embodiment.
FIG. 2 is a sequence diagram showing the process of generating a 3D real avatar.
FIG. 3(a) is a diagram showing an example of 3D scan data, FIG. 3(b) is a diagram showing an example of a base model, and FIG. 3(c) is a diagram showing an example of a 3D real avatar.
FIG. 4 is a diagram showing the configuration for performing 3D communication using the video communication system of the present embodiment.
FIG. 5 is a functional block diagram showing the functions executed by the PC arranged at each base.
FIG. 6 is a functional block diagram showing the configuration of the coordinate processing unit.
FIG. 7 is a diagram showing the coordinate system of each base.
FIG. 8 is a diagram showing a shared coordinate system whose origin is a user at one base.
FIG. 9 is a diagram showing an example in which a user's position is moved in the shared coordinate system.
FIG. 10 is a sequence diagram showing the flow of the process of determining the display position of a 3D real avatar.
FIG. 11 is a functional block diagram showing the configuration of the avatar processing unit.
FIG. 12 is a diagram showing how skeleton positions are reflected in a 3D real avatar.
FIG. 13 is a sequence diagram showing the flow of the process of displaying a 3D real avatar.

In the video communication system according to the present invention, a realistic 3D model of each user at each base (hereinafter referred to as a "3D real avatar") is created in advance, and at each base an AR device displays the 3D real avatars of users at other bases superimposed on the real landscape. The system estimates skeletal movement from video of the user captured at each base and has the 3D real avatar reproduce that movement. Because the 3D real avatar can be displayed from various angles, the system can be used for remote instruction in techniques that involve body movement and for remote medical care.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

[Generation of 3D Real Avatar]
FIG. 1 is a diagram showing the configuration for generating a 3D real avatar in the video communication system of the present embodiment. The video communication system of FIG. 1 includes a personal computer (PC) 10, a 3D scanner 21, a 3D real avatar generation server 30, and a 3D real avatar management server 40.

The PC 10 transmits the user's three-dimensional data scanned by the 3D scanner 21 (hereinafter referred to as "3D scan data") to the 3D real avatar generation server 30, has the server generate the user's 3D real avatar, and has that 3D real avatar registered in the 3D real avatar management server 40. In addition to the function of generating the 3D real avatar, the PC 10 may have the functions, described later, of determining where the 3D real avatar is placed and of controlling the 3D real avatar.

The 3D scanner 21 is a device that scans the user to obtain 3D data; a camera with a depth sensor can be used.

The 3D real avatar generation server 30 generates the 3D real avatar by deforming a humanoid base model with a full-body rig so that it matches the shape of the 3D scan data. A 3D model can be deformed freely by operating its rig. The 3D scan data is merely 3D data obtained by scanning the user's appearance, so it is difficult to change the posture of that model directly. The 3D real avatar, in contrast, carries a rig for moving the 3D model in accordance with the movement of the human skeleton, so once the user's skeleton position is known, the rig can be operated based on that skeleton position to make the avatar perform the same motion as the user.

The 3D real avatar management server 40 accepts registration of the generated 3D real avatars and manages them. To display the 3D real avatar of a remote communication partner, the partner's 3D real avatar is downloaded from the 3D real avatar management server 40 in advance. Hereinafter, communication that displays 3D real avatars using this video communication system is referred to as "3D communication".

FIG. 2 is a sequence diagram showing the process of generating a 3D real avatar.

First, the user is 3D-scanned using the 3D scanner 21 (step S11). For the 3D scan, for example, a RealSense depth camera (https://www.intel.co.jp/content/www/jp/ja/architecture-and-technology/realsense-overview.html) can be used. Alternatively, a human body scanning application such as itSeez3D (https://itseez3d.com/) can be used. The 3D scan yields 3D scan data that captures the user's appearance in three dimensions, as shown in FIG. 3(a). By capturing the user's entire body from all directions with the 3D scanner 21, 3D scan data containing both the appearance and the 3D shape of the user's entire body is obtained.

The PC 10 receives the 3D scan data from the 3D scanner 21 and transmits it to the 3D real avatar generation server 30 (step S12).

The 3D real avatar generation server 30 receives the 3D scan data and generates the 3D real avatar (step S13). More specifically, the 3D real avatar generation server 30 deforms a humanoid base model with a full-body rig so that it matches the shape of the 3D scan data, thereby generating the 3D real avatar. FIG. 3(b) shows an example of the base model. The rigged base model is created in advance using a 3D model creation tool such as Blender (https://blender.jp/) or Unity (https://unity3d.com/jp/). When the base model is fitted to the shape of the 3D scan data, the rig is also stretched or shrunk to match the 3D scan data. FIG. 3(c) shows an example of a 3D real avatar.
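The stretching of the rig to match the scan can be pictured with a minimal sketch. It assumes, purely for illustration, that a rig is a set of named joints with rest positions and parent links, and that target bone lengths have already been measured from the 3D scan data. The function name `fit_rig` and the joint names are hypothetical; the fitting actually performed by the 3D real avatar generation server 30 is not described at this level of detail.

```python
import math

def fit_rig(base_joints, parents, target_lengths):
    """Rescale each bone of a rest-pose rig to a target length.

    base_joints: joint name -> (x, y, z), listed parent before child.
    parents: joint name -> parent joint name, or None for the root.
    target_lengths: joint name -> desired length of the bone that ends
        at that joint (hypothetical values measured from the scan).
    Bones keep their original direction; zero-length bones are not handled.
    """
    fitted = {}
    for name, pos in base_joints.items():  # dicts keep insertion order
        parent = parents[name]
        if parent is None:
            fitted[name] = pos  # the root joint stays where it is
            continue
        bone = tuple(c - p for c, p in zip(pos, base_joints[parent]))
        length = math.sqrt(sum(v * v for v in bone))
        scale = target_lengths.get(name, length) / length
        # Re-attach the rescaled bone to the already fitted parent joint.
        fitted[name] = tuple(fp + v * scale for fp, v in zip(fitted[parent], bone))
    return fitted
```

Because each child is re-attached to its already fitted parent, lengthening an upper bone moves every joint below it, which is the behavior expected of a jointed skeleton.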

The 3D real avatar generation server 30 transmits the generated 3D real avatar to the 3D real avatar management server 40 for registration (step S14). The 3D real avatar management server 40 manages each 3D real avatar by associating it with an identifier that identifies the user.
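The association between a user identifier and a 3D real avatar can be sketched as a simple keyed store. The class name `AvatarRegistry`, its methods, and the use of raw bytes for the avatar are assumptions made for this example; the storage format and interface of the 3D real avatar management server 40 are not specified here.

```python
class AvatarRegistry:
    """Hypothetical in-memory stand-in for the avatar management server."""

    def __init__(self):
        self._avatars = {}  # user identifier -> serialized 3D real avatar

    def register(self, user_id, avatar_data):
        """Store (or overwrite) the avatar registered for one user."""
        self._avatars[user_id] = avatar_data

    def download(self, user_id):
        """Return the avatar for a user; raises KeyError if none is registered."""
        return self._avatars[user_id]
```

In this sketch a base would call `download` once per remote participant before 3D communication starts, so that only lightweight data has to be exchanged during the session.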

Each user who takes part in 3D communication generates his or her own 3D real avatar in advance.

In the present embodiment, the 3D real avatar generation server 30 generates the 3D real avatar from the 3D scan data, but the PC 10 may instead have the function of generating the 3D real avatar. Likewise, in the present embodiment the 3D real avatar management server 40 holds and manages the 3D real avatars, but the PC 10 at each base may manage its own 3D real avatars, and the bases may exchange 3D real avatars with each other before 3D communication.

[3D communication]
FIG. 4 is a diagram showing the configuration for performing 3D communication using the video communication system of the present embodiment. In FIG. 4, 3D communication is performed among three bases. Users A and B participate from base 1, users C and D from base 2, and users E and F from base 3. Any number of bases may be used as long as there are at least two, and each base needs at least one user.

Each base includes a PC 10, cameras 22A and 22B, and AR devices 23A and 23B. Although not shown in FIG. 4, bases 2 and 3 are equipped with the same devices as base 1. A plurality of cameras 22A and 22B are arranged at each base. During 3D communication, each of users A to F wears an AR device 23A or 23B. The cameras 22A and 22B and the AR devices 23A and 23B are communicably connected to the PC 10. The PC 10 may be the same one that was used to generate the 3D real avatar.

From the video captured by the plurality of cameras 22A and 22B, the coordinates of users A to F at their respective bases 1, 2, and 3 are obtained, and the three-dimensional skeleton positions of users A to F are estimated. The coordinates and skeleton positions of users A to F are reflected in their 3D real avatars.

At each base, the AR devices 23A and 23B display the 3D real avatars of users at other bases superimposed on the real landscape of that base. The AR devices 23A and 23B have a built-in microphone and speaker; the voices of users A to F picked up by the microphones are transmitted and received among bases 1, 2, and 3 and reproduced by the speakers. The AR devices 23A and 23B also contain devices that allow them to estimate their own position (for example, a gyro sensor or a camera). Eyeglass-type AR devices can be used as the AR devices 23A and 23B, in which case the 3D real avatar is displayed superimposed on the real landscape the user is looking at. Alternatively, smartphones may be used as the AR devices 23A and 23B, in which case the 3D real avatar is displayed superimposed on the real landscape captured by the smartphone.

In addition to the devices arranged at each base, the video communication system of FIG. 4 includes a 3D real avatar management server 40, a coordinate management server 50, and an image recognition server 60.

The 3D real avatar management server 40 manages the 3D real avatars of users A to F. The PC 10 at each base acquires the 3D real avatars of the users at the other bases in advance.

The coordinate management server 50 receives the coordinates of users A to F and the coordinate conversion formulas from each base, and transmits the coordinates and coordinate conversion formulas to the other bases. The PC 10 at each base receives the coordinates and coordinate conversion formulas for the users at the other bases, performs the coordinate conversion, and determines the display positions of the 3D real avatars.

The image recognition server 60 receives video of users A to F captured at each base and estimates the skeleton positions of users A to F. The PC 10 at each base receives the skeleton positions of the users at the other bases and reflects them in the 3D real avatars.

In the present embodiment, as shown in FIG. 5, the PC 10, which includes an arithmetic processing unit, a storage device, and the like, is made to function as an avatar generation unit 11, a coordinate processing unit 12, and an avatar processing unit 13. The processing of each unit may be executed by a program. This program is stored in the storage device of the PC 10; it can also be recorded on a recording medium such as a magnetic disk, an optical disk, or a semiconductor memory, or provided over a network.

As described above, the avatar generation unit 11 receives the 3D scan data from the 3D scanner 21 and transmits it to the 3D real avatar generation server 30 so that the user's 3D real avatar is generated. The avatar generation unit 11 may also generate the 3D real avatar itself.

The coordinate processing unit 12 obtains the coordinates of the users within the base from the video captured by the cameras 22A and 22B. The coordinate processing unit 12 also receives the coordinates of the users at other bases from the coordinate management server 50, converts those coordinates into the shared coordinate system, and obtains the display positions, in the augmented reality space, of the 3D real avatars of the users at the other bases.

The avatar processing unit 13 receives the skeleton positions of the users at other bases from the image recognition server 60 and reflects the skeleton positions in the 3D real avatars, so that the 3D real avatars reproduce the movements of the users at the other bases.

[Position coordinates of 3D real avatar]
FIG. 6 is a functional block diagram showing the configuration of the coordinate processing unit 12. The coordinate processing unit 12 includes a measuring unit 121, a calculating unit 122, a transmitting unit 123, and a receiving unit 124.

The measuring unit 121 obtains the coordinates of the users within the base by triangulation from the video captured by the two cameras 22A and 22B. As shown in FIG. 7, the coordinate systems in which the coordinates of users A to F are obtained differ from base to base. The coordinates obtained for users A and B at base 1 are called the coordinates of users A and B in the base 1 coordinate system. Similarly, the coordinates of users C and D at base 2 are called the coordinates of users C and D in the base 2 coordinate system, and the coordinates of users E and F at base 3 are called the coordinates of users E and F in the base 3 coordinate system.
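As an illustration of locating a user from two camera views, the following sketch uses the midpoint method, one common form of triangulation. It is an assumption that each camera's position and the direction of its ray toward the user are already known in the base coordinate system (that is, that the cameras are calibrated); the function name is hypothetical, and the text does not say which triangulation variant the measuring unit 121 uses.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Estimate a 3D point from two camera rays.

    c1, c2: camera positions; d1, d2: ray directions toward the target
    (they need not be unit length). Returns the midpoint of the shortest
    segment joining the two rays.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def along(origin, direction, t):
        return tuple(o + t * d for o, d in zip(origin, direction))

    w = tuple(a - b for a, b in zip(c1, c2))
    uu, uv, vv = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    uw, vw = dot(d1, w), dot(d2, w)
    denom = uu * vv - uv * uv
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; cannot triangulate")
    t1 = (uv * vw - vv * uw) / denom  # parameter of closest point on ray 1
    t2 = (uu * vw - uv * uw) / denom  # parameter of closest point on ray 2
    p1, p2 = along(c1, d1, t1), along(c2, d2, t2)
    return tuple((a + b) / 2 for a, b in zip(p1, p2))
```

With noisy measurements the two rays rarely intersect exactly, which is why the sketch returns the midpoint between their closest points rather than requiring an intersection.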

The transmitting unit 123 transmits the coordinates of each user obtained at each of bases 1, 2, and 3 to the coordinate management server 50.

The receiving unit 124 receives the coordinates of the users at the other bases from the coordinate management server 50. Specifically, the receiving unit 124 of the PC 10 at base 1 receives the coordinates of users C and D in the base 2 coordinate system and the coordinates of users E and F in the base 3 coordinate system from the coordinate management server 50. Similarly, the PC 10 at base 2 receives the coordinates of users A and B in the base 1 coordinate system and the coordinates of users E and F in the base 3 coordinate system, and the PC 10 at base 3 receives the coordinates of users A and B in the base 1 coordinate system and the coordinates of users C and D in the base 2 coordinate system.
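The relay pattern described here, in which each base receives every other base's user coordinates but not its own, can be sketched as a small filter. The function name and the dictionary layout are hypothetical choices for this example, not the actual interface of the coordinate management server 50.

```python
def coordinates_for(receiving_base, latest):
    """Select the user coordinates that one base should receive.

    latest: base id -> {user name: (x, y, z)}, where each position is
    expressed in that base's own coordinate system.
    """
    return {
        base: dict(users)
        for base, users in latest.items()
        if base != receiving_base  # a base already knows its own users
    }
```

For example, with entries for bases 1, 2, and 3, calling `coordinates_for(1, latest)` yields only the entries for bases 2 and 3, each still expressed in its original base coordinate system.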

The calculating unit 122 converts the coordinates of the users at other bases into a shared coordinate system (hereinafter referred to as the "space sharing coordinate system") and obtains their display positions in the augmented reality space.

The conversion from the coordinate system of each of bases 1, 2, and 3 to the space sharing coordinate system is described next. For example, as shown in FIG. 8, the coordinate system whose origin is the position of user A is taken as the space sharing coordinate system. If the origin of the space sharing coordinate system (the position of user A) is (Tx1a, Ty1a, Tz1a) in the base 1 coordinate system, the coordinates (x1a, y1a, z1a) of any point in the base 1 coordinate system can be converted into coordinates (x'1a, y'1a, z'1a) in the space sharing coordinate system by the following expression (1).
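Expression (1) itself does not appear in this text. As a hedged illustration only, the sketch below assumes the conversion is a pure translation, meaning the axes of the base 1 coordinate system and the space sharing coordinate system are parallel and use the same scale, so that x'1a = x1a - Tx1a and likewise for y and z. Any rotation or scaling in the original expression is not covered, and the function name is hypothetical.

```python
def base1_to_shared(point, origin_in_base1):
    """Translate a base 1 point so that user A's position becomes the origin.

    point: (x1a, y1a, z1a) in the base 1 coordinate system.
    origin_in_base1: (Tx1a, Ty1a, Tz1a), user A's position in base 1.
    Assumes translation only (no rotation, no scaling).
    """
    return tuple(p - t for p, t in zip(point, origin_in_base1))
```

Under this assumption, user A maps to (0, 0, 0) by construction, and user B's shared coordinates are simply B's offset from A measured within base 1.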

The coordinates of user B, who is at the same base 1, in the base 1 coordinate system can also be converted into coordinates in the space sharing coordinate system using expression (1) above. User B's coordinates in the base 1 coordinate system can be obtained by the measuring unit 121.

User A at base 1 operates the controller provided by the AR device 23A to decide where the 3D real avatar of user C at base 2 is displayed in the augmented reality space. Once user A has decided the position of user C's 3D real avatar in the augmented reality space, the coordinates of user C's 3D real avatar in the space sharing coordinate system are determined.

If the coordinates of user C in the base 2 coordinate system are (xs2c, ys2c, zs2c) at the moment user A places user C's 3D real avatar at the coordinates (Sx2c, Sy2c, Sz2c) in the space sharing coordinate system, the coordinates (x2c, y2c, z2c) of any point in the base 2 coordinate system can be converted into coordinates (x'2c, y'2c, z'2c) in the space sharing coordinate system by the following expression (2).
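Expression (2) is likewise absent from this text. The sketch below again assumes a translation-only conversion: a single correspondence between user C's position in the base 2 coordinate system and the chosen avatar position in the space sharing coordinate system fixes the offset, giving x'2c = x2c - xs2c + Sx2c and likewise for y and z. The helper name `make_base_to_shared` is hypothetical.

```python
def make_base_to_shared(anchor_in_base, anchor_in_shared):
    """Build a base-to-shared conversion from one matched pair of points.

    anchor_in_base: (xs2c, ys2c, zs2c), user C's position in base 2 at
        the moment the avatar was placed.
    anchor_in_shared: (Sx2c, Sy2c, Sz2c), where the avatar was placed.
    Assumes translation only (no rotation, no scaling).
    """
    offset = tuple(s - b for s, b in zip(anchor_in_shared, anchor_in_base))

    def convert(point):
        return tuple(p + o for p, o in zip(point, offset))

    return convert
```

The returned function can then be applied to every later position of users C and D, since under this assumption all points in the base 2 coordinate system share the same offset.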

The coordinates of user D at base 2 in the base 2 coordinate system can also be converted into coordinates in the space sharing coordinate system using expression (2) above.

Next, consider determining the position of user E at base 3 in the space sharing coordinate system after the coordinates, in the space sharing coordinate system, of the 3D real avatars of users A to D at bases 1 and 2 have been determined.

The PC 10 at base 3 receives from the coordinate management server 50 the coordinate conversion formula from the base 1 coordinate system to the space sharing coordinate system and the coordinate conversion formula from the base 2 coordinate system to the space sharing coordinate system. Therefore, once coordinates in the base 1 and base 2 coordinate systems are known, they can be converted into coordinates in the space sharing coordinate system.

The PC 10 at base 3 receives from the coordinate management server 50 the coordinates of users A to D at bases 1 and 2 in their respective coordinate systems. Because the position of user E in the space sharing coordinate system has not yet been determined, user E is assumed to be at an arbitrary position in the space sharing coordinate system (for example, the origin), and the AR device displays the 3D real avatars of users A to D on that basis.

User E operates the controller provided by the AR device and moves to a position from which the 3D real avatars of users A to D are easy to see. For example, at the position on the left side of FIG. 9, user A is too close to be seen comfortably, so user E operates the controller to choose a display position from which the 3D real avatars of users A to D are easy to see. For example, user E moves near user B, as in the position on the right side of FIG. 9.

Once the coordinates of user E in the space sharing coordinate system are determined, the correspondence between coordinates in the base 3 coordinate system and coordinates in the space sharing coordinate system is determined, so the coordinate conversion formula from the base 3 coordinate system to the space sharing coordinate system can be obtained in the same way as for base 2 above. The PC 10 at base 3 transmits the obtained coordinate conversion formula to the coordinate management server 50, and the PCs at bases 1 and 2 receive this coordinate conversion formula for user E, which allows them to convert coordinates in the base 3 coordinate system into coordinates in the space sharing coordinate system. The coordinates of user F at base 3 in the base 3 coordinate system can also be converted into coordinates in the space sharing coordinate system using the same coordinate conversion formula.

In addition to transmitting and receiving coordinates in the coordinate system of each base, the transmitting unit 123 and the receiving unit 124 also transmit and receive the coordinate conversion formulas from the coordinate system of each base to the space sharing coordinate system.
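One way to picture exchanging a coordinate conversion formula between bases is to serialize its parameters. The sketch below assumes a translation-only formula represented by a single offset vector and uses JSON as the wire format; both choices, as well as the function and field names, are assumptions made for illustration and are not taken from the text.

```python
import json

def encode_conversion(base_id, offset):
    """Serialize a translation-only conversion formula as JSON text."""
    return json.dumps({"base": base_id, "offset": list(offset)})

def apply_conversion(message, point):
    """Decode a received conversion formula and apply it to one point."""
    offset = json.loads(message)["offset"]
    return tuple(p + o for p, o in zip(point, offset))
```

A receiving base would keep the decoded offset and reuse it for every coordinate update from the sending base, so the formula needs to be sent only when it changes.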

図１０を参照して、３Ｄリアルアバタの表示位置を決める処理について説明する。 A process of determining the display position of the 3D real avatar will be described with reference to FIG.

各拠点１，２，３において、カメラ２２Ａ，２２Ｂで撮影した映像から利用者Ａ−Ｆの各拠点における座標を求め（ステップＳ２１）、利用者Ａ−Ｆの各拠点での座標を座標管理サーバ５０へ送信する（ステップＳ２２）。 At each of the bases 1, 2, and 3, the coordinates of the users A to F at their respective bases are obtained from the images captured by the cameras 22A and 22B (step S21), and these coordinates are transmitted to the coordinate management server 50 (step S22).

座標管理サーバ５０は、各拠点１，２，３へ各利用者Ａ−Ｆの各拠点における座標を送信する（ステップＳ２３）。 The coordinate management server 50 transmits the coordinates of the users A to F at their respective bases to the bases 1, 2, and 3 (step S23).

ステップＳ２１，Ｓ２２，Ｓ２３の処理は、通信終了まで続けられる。 The processes of steps S21, S22, and S23 are continued until the communication ends.

拠点１では、利用者Ａの位置を空間共有座標系の原点として、利用者ＡがＡＲデバイス２３Ａ上での拠点２の利用者Ｃの３Ｄリアルアバタの表示位置を決め、拠点１座標系から空間共有座標系への座標変換式および拠点２座標系から空間共有座標系への座標変換式を求める（ステップＳ２４）。以降、拠点１では、拠点２の利用者Ｃ，Ｄの座標を空間共有座標系へ変換する処理が継続して実行される。 At the base 1, with the position of the user A as the origin of the space sharing coordinate system, the user A determines the display position, on the AR device 23A, of the 3D real avatar of the user C of the base 2, and a coordinate conversion formula from the base 1 coordinate system to the space sharing coordinate system and a coordinate conversion formula from the base 2 coordinate system to the space sharing coordinate system are obtained (step S24). Thereafter, the base 1 continuously converts the coordinates of the users C and D of the base 2 into the space sharing coordinate system.

求めた利用者Ａ，Ｂ，Ｃ，Ｄの座標変換式は、拠点１から座標管理サーバ５０へ送信される。その後、座標管理サーバ５０から拠点２、拠点３へ送信される（ステップＳ２５，Ｓ２７）。 The obtained coordinate conversion formulas for the users A, B, C, and D are transmitted from the base 1 to the coordinate management server 50, and then from the coordinate management server 50 to the bases 2 and 3 (steps S25 and S27).

拠点２では、受信した座標変換式を用いて拠点１の利用者Ａ，Ｂの座標を空間共有座標系へ変換する処理が開始される（ステップＳ２６）。 At the base 2, the process of converting the coordinates of the users A and B of the base 1 into the space sharing coordinate system using the received coordinate conversion formula is started (step S26).

拠点３では、受信した座標変換式を用いて拠点１，２の利用者Ａ−Ｄの座標を空間共有座標系へ変換する処理が開始される（ステップＳ２８）。 At the base 3, the process of converting the coordinates of the users A to D of the bases 1 and 2 into the space sharing coordinate system using the received coordinate conversion formulas is started (step S28).

拠点３では、利用者Ｅの空間共有座標系における位置を決定し、拠点３座標系から空間共有座標系への座標変換式を求める（ステップＳ２９）。 At the base 3, the position of the user E in the space sharing coordinate system is determined, and the coordinate conversion formula from the base 3 coordinate system to the space sharing coordinate system is obtained (step S29).

拠点３から座標管理サーバ５０へ求めた利用者Ｅ，Ｆの座標変換式が送信され、座標管理サーバ５０から拠点１、拠点２へ送信される（ステップＳ３０，Ｓ３１，Ｓ３３）。 The coordinate conversion formulas of the users E and F thus obtained are transmitted from the base 3 to the coordinate management server 50, and are transmitted from the coordinate management server 50 to the base 1 and the base 2 (steps S30, S31, S33).

拠点２では、受信した座標変換式を用いた拠点３の利用者Ｅ，Ｆの座標を空間共有座標系へ変換する処理が開始される（ステップＳ３２）。 At the base 2, the process of converting the coordinates of the users E and F of the base 3 into the space sharing coordinate system using the received coordinate conversion formula is started (step S32).

同様に、拠点１では、受信した座標変換式を用いた拠点３の利用者Ｅ，Ｆの座標を空間共有座標系へ変換する処理が開始される（ステップＳ３４）。 Similarly, at the base 1, the process of converting the coordinates of the users E and F of the base 3 into the space sharing coordinate system using the received coordinate conversion formula is started (step S34).

以降、各拠点１，２，３では、映像から各拠点１，２，３における利用者Ａ−Ｆの座標を求めて送信するとともに、他の拠点１，２，３の利用者Ａ−Ｆの座標を受信して空間共有座標系の座標に変換する処理が続けられる。 Thereafter, each of the bases 1, 2, and 3 continues to obtain the coordinates of its own users among the users A to F from the video and transmit them, and to receive the coordinates of the users at the other bases and convert them into coordinates in the space sharing coordinate system.
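
The message flow of steps S21 to S34 can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: the classes `Site` and `CoordinateServer`, their method names, and the representation of a coordinate conversion formula as a `(theta, tx, ty)` tuple are all hypothetical and do not come from the source. The point it shows is that the server only relays data, and that each base can convert another base's coordinates as soon as it has received that base's conversion formula.

```python
import math


def apply_transform(t, p):
    """Apply an assumed (theta, tx, ty) conversion formula to a point (x, y)."""
    theta, tx, ty = t
    c, s = math.cos(theta), math.sin(theta)
    x, y = p
    return (c * x - s * y + tx, s * x + c * y + ty)


class Site:
    """One base: stores received local coordinates and conversion formulas."""

    def __init__(self, site_id):
        self.site_id = site_id
        self.transforms = {}  # site id -> (theta, tx, ty)
        self.latest = {}      # user name -> (site id, local (x, y))

    def on_transform(self, site_id, t):
        self.transforms[site_id] = t

    def on_coords(self, site_id, coords):
        for user, p in coords.items():
            self.latest[user] = (site_id, p)

    def shared_positions(self):
        """Convert every user whose base already has a known formula."""
        out = {}
        for user, (site_id, p) in self.latest.items():
            t = self.transforms.get(site_id)
            if t is not None:
                out[user] = apply_transform(t, p)
        return out


class CoordinateServer:
    """Relays coordinates and conversion formulas to all other bases."""

    def __init__(self, sites):
        self.sites = {s.site_id: s for s in sites}

    def relay_coords(self, sender, coords):
        for sid, site in self.sites.items():
            if sid != sender:
                site.on_coords(sender, coords)

    def relay_transform(self, sender, site_id, t):
        for sid, site in self.sites.items():
            if sid != sender:
                site.on_transform(site_id, t)
```

In this sketch, a base that receives coordinates before the matching formula simply leaves that user unplaced until the formula arrives, which mirrors how the conversion at the bases 2 and 3 starts only after steps S25 and S27.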

［３Ｄリアルアバタの表示］ [Display of 3D Real Avatar]
図１１は、アバタ処理部１３の構成を示す機能ブロック図である。アバタ処理部１３は、制御部１３１、表示部１３２、送信部１３３、受信部１３４、及び記憶部１３５を備える。 FIG. 11 is a functional block diagram showing the configuration of the avatar processing unit 13. The avatar processing unit 13 includes a control unit 131, a display unit 132, a transmission unit 133, a reception unit 134, and a storage unit 135.

送信部１３３は、各拠点において複数のカメラ２２Ａ，２２Ｂで撮影した映像を画像認識サーバ６０に送信する。 The transmission unit 133 transmits the images captured by the plurality of cameras 22A and 22B at each site to the image recognition server 60.

画像認識サーバ６０では、各拠点から受信した映像に基づき、各拠点の各利用者の骨格位置を推定する。具体的には、各利用者について、例えば、openpose（https://github.com/CMU-Perceptual-Computing-Lab/openpose）を用いて複数の映像のそれぞれにおいて利用者の骨格を検出する。そして、複数の映像のそれぞれで検出した骨格の３次元座標を三角法を用いて求める。 The image recognition server 60 estimates the skeleton position of each user at each base based on the video received from that base. Specifically, for each user, the skeleton of the user is detected in each of the plurality of videos using, for example, openpose (https://github.com/CMU-Perceptual-Computing-Lab/openpose). Then, the three-dimensional coordinates of the skeleton are obtained by trigonometry (triangulation) from the skeleton positions detected in the respective videos.
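
One common way to recover a 3D joint position from two views is the midpoint method, sketched below in Python. The source only states that trigonometry is used, so this is an assumed technique, not necessarily the one the image recognition server 60 applies. The sketch further assumes that the cameras are calibrated and that each detected 2D keypoint has already been converted into a viewing ray (camera position plus direction). The function name `triangulate_midpoint` is hypothetical.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _along(origin, direction, s):
    return tuple(o + s * d for o, d in zip(origin, direction))


def triangulate_midpoint(o1, d1, o2, d2):
    """Estimate a 3D point from two viewing rays.

    o1, o2: camera positions; d1, d2: ray directions toward the joint.
    Returns the midpoint of the shortest segment joining the two rays,
    which equals their intersection when the rays meet exactly.
    """
    w = _sub(o1, o2)
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; depth cannot be recovered")
    s = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point on ray 2
    p1 = _along(o1, d1, s)
    p2 = _along(o2, d2, t)
    return tuple((u + v) / 2 for u, v in zip(p1, p2))


# Two cameras two units apart, both looking at a joint at (1, 1, 2).
joint = triangulate_midpoint((0, 0, 0), (1, 1, 2), (2, 0, 0), (-1, 1, 2))
```

Running this per joint and per frame yields the list of 3D joint coordinates that is sent to the other bases in place of video.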

受信部１３４は、画像認識サーバ６０が映像から求めた利用者の骨格位置を受信する。ここで受信する骨格位置は、他の拠点の利用者の骨格の３次元座標である。例えば、拠点１のＰＣ１０は、画像認識サーバ６０から拠点２，３の利用者Ｃ−Ｆの骨格位置を受信する。 The receiving unit 134 receives the skeleton position of the user obtained from the image by the image recognition server 60. The skeleton position received here is the three-dimensional coordinates of the skeleton of the user at another base. For example, the PC 10 at the site 1 receives the skeleton positions of the users C-F at the sites 2 and 3 from the image recognition server 60.

なお、画像認識サーバ６０を用いずに、アバタ処理部１３が映像から骨格位置を推定してもよい。具体的には、アバタ処理部１３が、複数の映像に基づき自拠点の利用者の骨格位置を求め、求めた骨格位置を他の拠点に送信する。他の拠点からは、他の拠点の利用者の骨格位置を受信する。 The avatar processing unit 13 may estimate the skeleton position from the video without using the image recognition server 60. Specifically, the avatar processing unit 13 obtains the skeleton position of the user of the own site based on the plurality of videos, and transmits the obtained skeleton position to another site. The skeleton position of the user of the other base is received from the other base.

制御部１３１は、記憶部１３５から他の拠点の利用者の３Ｄリアルアバタを読み出し、受信した骨格位置を３Ｄリアルアバタに反映させる。３Ｄリアルアバタは、利用者の外見を持つリグ付けされた３Ｄモデルであるので、図１２に示すように、骨格位置に合わせてリグを操作することで、３Ｄリアルアバタに所望の姿勢を取らせることができる。制御部１３１、表示部１３２は、ＡＲデバイスで行ってもよい。 The control unit 131 reads the 3D real avatars of the users at the other bases from the storage unit 135 and applies the received skeleton positions to the 3D real avatars. Since a 3D real avatar is a rigged 3D model having the appearance of the user, operating the rig according to the skeleton position makes the 3D real avatar take the desired posture, as shown in FIG. 12. The processing of the control unit 131 and the display unit 132 may be performed by the AR device.
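
The following Python sketch illustrates one simple way received skeleton positions could drive a rig. It is an assumption for illustration only: real rigs are usually driven by joint rotations, whereas this sketch merely keeps the direction of each received bone and re-applies it using the avatar's own bone lengths, so that the avatar keeps its proportions. The joint names, the `BONES` table, and `pose_avatar` are hypothetical.

```python
import math

# (child, parent) pairs, listed so that every parent is placed before its child.
BONES = [("elbow", "shoulder"), ("wrist", "elbow")]


def pose_avatar(received, root, avatar_bone_length):
    """Place avatar joints so that each bone points the way the received one does.

    received: joint name -> (x, y, z) from the skeleton estimate.
    root: name of the joint whose received position is used as-is.
    avatar_bone_length: child joint name -> length of the bone ending there.
    """
    posed = {root: tuple(received[root])}
    for child, parent in BONES:
        direction = [c - p for c, p in zip(received[child], received[parent])]
        norm = math.sqrt(sum(v * v for v in direction))
        if norm == 0.0:
            raise ValueError(f"joints {parent!r} and {child!r} coincide")
        scale = avatar_bone_length[child] / norm
        posed[child] = tuple(p + v * scale for p, v in zip(posed[parent], direction))
    return posed


received = {"shoulder": (0, 0, 0), "elbow": (0, -2, 0), "wrist": (2, -2, 0)}
posed = pose_avatar(received, "shoulder", {"elbow": 1.0, "wrist": 1.5})
# posed["wrist"] is (1.5, -1.0, 0.0): same arm shape, avatar's own bone lengths
```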

記憶部１３５には、事前に、他の拠点の各利用者の３Ｄリアルアバタを記憶させておく。 The storage unit 135 stores in advance the 3D real avatars of users at other bases.

表示部１３２は、他の拠点の各利用者について、骨格位置を反映した３Ｄリアルアバタを空間共有座標系の利用者の座標に配置し、ＡＲデバイス２３Ａ，２３Ｂを装着した利用者の空間共有座標系の座標を視点として、ＡＲデバイス２３Ａ，２３Ｂの視線方向の３Ｄリアルアバタを描画する。各利用者の空間共有座標系の座標は、座標処理部１２の算出結果を用いる。 For each user at the other bases, the display unit 132 places the 3D real avatar reflecting the skeleton position at that user's coordinates in the space sharing coordinate system, and renders the 3D real avatar as seen in the line-of-sight direction of the AR device 23A or 23B, using as the viewpoint the space sharing coordinate system coordinates of the user wearing that AR device. The coordinates of each user in the space sharing coordinate system are those calculated by the coordinate processing unit 12.

表示部１３２は、描画結果をＡＲデバイス２３Ａ，２３Ｂに送信する。ＡＲデバイス２３Ａ，２３Ｂは、実風景に３Ｄリアルアバタを重畳して表示する。表示部１３２が視差のある右眼用画像と左眼用画像を作成し、ＡＲデバイス２３Ａ，２３Ｂに表示させると、３Ｄリアルアバタを立体視できる。 The display unit 132 transmits the drawing result to the AR devices 23A and 23B. The AR devices 23A and 23B superimpose and display the 3D real avatar on the real landscape. When the display unit 132 creates a right-eye image and a left-eye image with parallax and displays them on the AR devices 23A and 23B, the 3D real avatar can be viewed stereoscopically.
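
The parallax between the right-eye and left-eye images can be illustrated with a simplified top-down model in Python. This sketch is an assumption-laden illustration, not the rendering method of the display unit 132: it works on the floor plane only, uses an ideal pinhole projection, and takes an interpupillary distance of 0.064 m as an arbitrary example value. The function names are hypothetical.

```python
import math


def eye_positions(head, yaw, ipd=0.064):
    """Return (left_eye, right_eye) floor-plane positions.

    head: (x, y) of the wearer; yaw: facing angle in radians, measured
    counterclockwise from the +x axis; ipd: assumed eye separation in metres.
    """
    right = (math.sin(yaw), -math.cos(yaw))  # unit vector to the wearer's right
    half = ipd / 2
    left_eye = (head[0] - right[0] * half, head[1] - right[1] * half)
    right_eye = (head[0] + right[0] * half, head[1] + right[1] * half)
    return left_eye, right_eye


def horizontal_image_coord(point, eye, yaw):
    """Project a point onto a pinhole image at unit distance in front of the eye.

    Returns the horizontal coordinate (positive means right of centre),
    or None when the point is behind the eye.
    """
    forward = (math.cos(yaw), math.sin(yaw))
    right = (math.sin(yaw), -math.cos(yaw))
    dx, dy = point[0] - eye[0], point[1] - eye[1]
    depth = dx * forward[0] + dy * forward[1]
    if depth <= 0:
        return None
    lateral = dx * right[0] + dy * right[1]
    return lateral / depth


# Wearer at the origin facing +y; an avatar stands 2 m straight ahead.
yaw = math.pi / 2
left_eye, right_eye = eye_positions((0.0, 0.0), yaw)
u_left = horizontal_image_coord((0.0, 2.0), left_eye, yaw)
u_right = horizontal_image_coord((0.0, 2.0), right_eye, yaw)
disparity = u_left - u_right  # grows as the avatar gets closer
```

The same avatar lands slightly right of centre in the left-eye image and slightly left of centre in the right-eye image, and that difference is what produces the stereoscopic impression.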

図１３を参照して、３Ｄリアルアバタを表示する処理について説明する。図１３に示す処理は、通信の終了まで継続して行われる。 Processing for displaying a 3D real avatar will be described with reference to FIG. The process shown in FIG. 13 is continuously performed until the end of communication.

画像認識サーバ６０は、各拠点１，２，３からカメラ２２Ａ，２２Ｂで撮影した映像を受信し（ステップＳ４１）、各利用者Ａ−Ｆの骨格位置を推定し（ステップＳ４２）、推定した骨格位置を各拠点１，２，３に送信する（ステップＳ４３）。 The image recognition server 60 receives the images captured by the cameras 22A and 22B from the bases 1, 2, and 3 (step S41), estimates the skeleton positions of the users A to F (step S42), and transmits the estimated skeleton positions to the bases 1, 2, and 3 (step S43).

拠点１では、拠点２，３の各利用者Ｃ−Ｆの骨格位置を３Ｄリアルアバタに反映するとともに（ステップＳ４４）、各利用者Ｃ−Ｆの位置に基づいて３Ｄリアルアバタを空間共有座標系に配置する（ステップＳ４５）。 At the base 1, the skeleton positions of the users C to F of the bases 2 and 3 are applied to their 3D real avatars (step S44), and the 3D real avatars are placed in the space sharing coordinate system based on the positions of the users C to F (step S45).

拠点１の利用者Ａの位置を視点として、利用者Ｃ−Ｆの３Ｄリアルアバタを描画し、ＡＲデバイス２３Ａに表示させる（ステップＳ４６）。ＡＲデバイス２３Ａを装着した利用者Ａが拠点１内で移動して回り込んだ場合は、その方向から見た３Ｄリアルアバタが表示される。 The 3D real avatars of the users C to F are rendered with the position of the user A of the base 1 as the viewpoint and displayed on the AR device 23A (step S46). When the user A wearing the AR device 23A moves within the base 1 and walks around an avatar, the 3D real avatar as seen from that direction is displayed.

ステップＳ４４，Ｓ４５，Ｓ４６の処理は、拠点１の利用者Ｂおよび拠点２，３の各利用者Ｃ−Ｆについても同様に実施される。 The processes of steps S44, S45, and S46 are similarly performed for the user B of the base 1 and the users C-F of the bases 2 and 3.

以上説明したように、本実施の形態によれば、他の拠点の、複数方向から撮影した映像に基づいて検出した利用者の骨格位置を受信し、骨格位置に基づいて利用者の３Ｄリアルアバタの姿勢を変更し、３Ｄリアルアバタを実風景に重畳させて表示することにより、遠隔地間で利用者の姿を表示する際に、各拠点のＰＣ１０のデータ処理量および送受信するデータ量を削減できる。利用者の骨格位置は利用者を撮影した映像から検出し、複数の映像から検出した骨格位置から三角法を用いて骨格位置の３次元座標を得られる。 As described above, according to the present embodiment, the skeleton position of a user at another base, detected based on images captured from a plurality of directions, is received, the posture of the user's 3D real avatar is changed based on the skeleton position, and the 3D real avatar is displayed superimposed on the real landscape. This reduces the data processing amount of the PC 10 at each base and the amount of data transmitted and received when a user's figure is displayed at a remote location. The skeleton position of the user is detected from images of the user, and the three-dimensional coordinates of the skeleton position are obtained by trigonometry from the skeleton positions detected in the plurality of images.
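
A rough back-of-the-envelope calculation in Python shows why sending skeleton positions is light compared with sending images. All figures below are illustrative assumptions, not values from the source: 25 joints per user (the size of OpenPose's BODY_25 model), three 4-byte floating-point coordinates per joint, 30 updates per second, and uncompressed 1280x720 RGB frames as the point of comparison. Real video is compressed, so the actual gap is smaller than the raw ratio computed here.

```python
def skeleton_bytes_per_second(joints=25, fps=30, bytes_per_value=4):
    """Bytes per second for one user's 3D joint coordinates (assumed format)."""
    return joints * 3 * bytes_per_value * fps


def raw_video_bytes_per_second(width=1280, height=720, fps=30, bytes_per_pixel=3):
    """Bytes per second for one uncompressed RGB video stream."""
    return width * height * bytes_per_pixel * fps


skeleton_rate = skeleton_bytes_per_second()  # 9000 bytes per second
video_rate = raw_video_bytes_per_second()    # 82944000 bytes per second
ratio = video_rate // skeleton_rate          # 9216
```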

本実施形態によれば、利用者を３Ｄスキャンして３Ｄスキャンデータを取得し、リグ付きベースモデルを変形させて３Ｄスキャンデータの形状に合わせて３Ｄリアルアバタを生成することにより、利用者を３Ｄスキャンするだけで利用者の外見を有するリグ付きの３Ｄリアルアバタが得られる。リグを操作すれば、３Ｄリアルアバタの姿勢を制御できる。 According to the present embodiment, the user is 3D-scanned to obtain 3D scan data, and a base model with a rig is deformed to fit the shape of the 3D scan data to generate the 3D real avatar. A rigged 3D real avatar having the user's appearance is thus obtained simply by 3D-scanning the user. The posture of the 3D real avatar can be controlled by operating the rig.

本実施形態によれば、利用者の他の拠点での座標系の座標を受信し、他の拠点での座標系を実風景に対応させた空間共有座標系に変換して、変換後の座標に３Ｄリアルアバタを配置することにより、実風景内に３Ｄリアルアバタを表示できる。 According to the present embodiment, the coordinates of a user in the coordinate system of another base are received, the coordinate system of the other base is converted into the space sharing coordinate system associated with the real landscape, and the 3D real avatar is placed at the converted coordinates, so that the 3D real avatar can be displayed within the real landscape.

１０…パーソナルコンピュータ（ＰＣ）１１…アバタ生成部１２…座標処理部１２１…計測部１２２…算出部１２３…送信部１２４…受信部１３…アバタ処理部１３１…制御部１３２…表示部１３３…送信部１３４…受信部１３５…記憶部２１…３Ｄスキャナ２２Ａ，２２Ｂ…カメラ２３Ａ，２３Ｂ…ＡＲデバイス３０…３Ｄリアルアバタ生成サーバ４０…３Ｄリアルアバタ管理サーバ５０…座標管理サーバ６０…画像認識サーバ 10: personal computer (PC); 11: avatar generation unit; 12: coordinate processing unit; 121: measurement unit; 122: calculation unit; 123: transmission unit; 124: reception unit; 13: avatar processing unit; 131: control unit; 132: display unit; 133: transmission unit; 134: reception unit; 135: storage unit; 21: 3D scanner; 22A, 22B: camera; 23A, 23B: AR device; 30: 3D real avatar generation server; 40: 3D real avatar management server; 50: coordinate management server; 60: image recognition server

本発明に係る映像通信方法は、拠点間で通信する映像通信方法であって、複数方向から撮影した映像に基づいて検出した通信相手の骨格位置情報を受信するステップと、前記骨格位置情報に基づいて前記通信相手の外見を有するリグ付き３次元モデルの姿勢を変更するステップと、前記通信相手のローカル座標を受信するステップと、前記ローカル座標を拠点間で共通の共有空間座標系に変換するステップと、変換後の座標に対応する位置に前記３次元モデルを配置し、前記３次元モデルを実風景に重畳させて表示するステップを有し、いずれかの拠点のローカル座標系に基づいて前記共有空間座標系を定めて、別拠点の通信相手の３次元モデルの前記共有空間座標系内での位置を決め、別拠点の通信相手のローカル座標を共有空間座標系に変換するための座標変換式を求めて、当該座標変換式を前記拠点間で共有することを特徴とする。 A video communication method according to the present invention is a video communication method for communicating between bases, the method comprising: a step of receiving skeleton position information of a communication partner detected based on images captured from a plurality of directions; a step of changing, based on the skeleton position information, the posture of a three-dimensional model with a rig having the appearance of the communication partner; a step of receiving local coordinates of the communication partner; a step of converting the local coordinates into a shared space coordinate system common to the bases; and a step of placing the three-dimensional model at a position corresponding to the converted coordinates and displaying the three-dimensional model superimposed on a real landscape, wherein the shared space coordinate system is defined based on the local coordinate system of one of the bases, the position within the shared space coordinate system of the three-dimensional model of a communication partner at another base is determined, a coordinate conversion formula for converting the local coordinates of the communication partner at the other base into the shared space coordinate system is obtained, and the coordinate conversion formula is shared among the bases.

本発明に係る映像通信装置は、拠点間で通信する映像通信装置であって、複数方向から撮影した映像に基づいて検出した通信相手の骨格位置情報を受信する受信部と、前記骨格位置情報に基づいて前記通信相手の外見を有するリグ付き３次元モデルの姿勢を変更するモデル制御部と、前記通信相手のローカル座標を受信する座標受信部と、前記ローカル座標を拠点間で共通の共有空間座標系に変換する算出部と、変換後の座標に対応する位置に前記３次元モデルを配置し、前記３次元モデルを実風景に重畳させて表示する表示部を有し、いずれかの拠点のローカル座標系に基づいて前記共有空間座標系を定めて、別拠点の通信相手の３次元モデルの前記共有空間座標系内での位置を決め、別拠点の通信相手のローカル座標を共有空間座標系に変換するための座標変換式を求めて、当該座標変換式を前記拠点間で共有することを特徴とする。 A video communication device according to the present invention is a video communication device that communicates between bases, the device comprising: a receiving unit that receives skeleton position information of a communication partner detected based on images captured from a plurality of directions; a model control unit that changes, based on the skeleton position information, the posture of a three-dimensional model with a rig having the appearance of the communication partner; a coordinate receiving unit that receives local coordinates of the communication partner; a calculation unit that converts the local coordinates into a shared space coordinate system common to the bases; and a display unit that places the three-dimensional model at a position corresponding to the converted coordinates and displays the three-dimensional model superimposed on a real landscape, wherein the shared space coordinate system is defined based on the local coordinate system of one of the bases, the position within the shared space coordinate system of the three-dimensional model of a communication partner at another base is determined, a coordinate conversion formula for converting the local coordinates of the communication partner at the other base into the shared space coordinate system is obtained, and the coordinate conversion formula is shared among the bases.

Claims

1. A video communication method comprising:
receiving skeleton position information of a communication partner detected based on images captured from a plurality of directions;
changing the posture of a three-dimensional model with a rig having the appearance of the communication partner based on the skeleton position information; and
displaying the three-dimensional model superimposed on a real landscape.

2. The video communication method according to claim 1, further comprising:
three-dimensionally scanning the appearance of the communication partner to obtain an appearance model; and
generating the three-dimensional model with a rig by deforming a base model with a rig according to the appearance model.

3. The video communication method according to claim 1, wherein the skeleton position information is three-dimensional coordinates of a skeleton position obtained by trigonometry based on skeleton positions detected in the plurality of images.

4. The video communication method according to any one of claims 1 to 3, further comprising:
receiving local coordinates of the communication partner; and
converting the local coordinates into a coordinate system of the real landscape,
wherein, in the displaying step, the three-dimensional model is arranged at a position corresponding to the converted coordinates.

5. A video communication device comprising:
a receiving unit that receives skeleton position information of a communication partner detected based on images captured from a plurality of directions;
a model control unit that changes the posture of a three-dimensional model with a rig having the appearance of the communication partner based on the skeleton position information; and
a display unit that displays the three-dimensional model superimposed on a real landscape.

6. A video communication program for causing a computer to execute the video communication method according to claim 1.