JP7241628B2

JP7241628B2 - MOVIE SYNTHESIS DEVICE, MOVIE SYNTHESIS METHOD, AND MOVIE SYNTHESIS PROGRAM

Info

Publication number: JP7241628B2
Application number: JP2019131762A
Authority: JP
Inventors: 量生川上; 進之介岩城; 尚小嶋; 俊博清水; 寛明齊藤
Original assignee: Dwango Co Ltd
Current assignee: Dwango Co Ltd
Priority date: 2019-07-17
Filing date: 2019-07-17
Publication date: 2023-03-17
Anticipated expiration: 2038-11-30
Also published as: JP2020087428A

Description

The present invention relates to technology for generating augmented reality videos.

In recent years, video distribution services that allow individuals to distribute video over a network have become widespread. For such services, applications are known that let a user distribute video in which a computer graphics (CG) character appears in place of the user. Techniques are also known that face-track a selfie taken with a smartphone and reflect the user's facial expression on a CG character.

JP 2017-188787 A

When a CG character is composited onto selfie video using face tracking technology, a CG character that reflects the user's own facial expression can easily be composited onto live-action video in real time.

However, the live-action video captured in this way shows the scenery behind the user. To composite a CG character reflecting the user's own facial expression against the scenery in front of the user, the user had to use a selfie stick and shoot with his or her back to that scenery, so that the user was also in the frame.

The present invention has been made in view of the above, and its object is to make it easier to generate video in which an expressive computer graphics character is composited.

A video synthesizer according to the present invention generates an augmented reality video by compositing an avatar onto a live-action video, and comprises: a video acquisition unit that acquires the live-action video serving as the background of the augmented reality video and an image of an operator who operates the avatar; a control unit that controls the avatar based on the image of the operator; a placement unit that, based on an operation by the operator, places the avatar at an arbitrary initial position in a coordinate system corresponding to the real space of the live-action video; and a synthesizing unit that generates the augmented reality video by compositing the avatar onto the live-action video, wherein, when the shooting direction of the live-action video changes, the synthesizing unit changes the position at which the avatar is composited in the augmented reality video according to the position of the avatar in the coordinate system corresponding to the real space.

A video synthesis method according to the present invention generates an augmented reality video by compositing an avatar onto a live-action video, and comprises the following steps performed by a computer: acquiring the live-action video serving as the background of the augmented reality video and an image of an operator who operates the avatar; controlling the avatar based on the image of the operator; placing the avatar, based on an operation by the operator, at an arbitrary initial position in a coordinate system corresponding to the real space of the live-action video; and generating the augmented reality video by compositing the avatar onto the live-action video, wherein, in the generating step, when the shooting direction of the live-action video changes, the position at which the avatar is composited in the augmented reality video is changed according to the position of the avatar in the coordinate system corresponding to the real space.

A video synthesis program according to the present invention generates an augmented reality video by compositing an avatar onto a live-action video, and causes a computer to execute: a process of acquiring the live-action video serving as the background of the augmented reality video and an image of an operator who operates the avatar; a process of controlling the avatar based on the image of the operator; a process of placing the avatar, based on an operation by the operator, at an arbitrary initial position in a coordinate system corresponding to the real space of the live-action video; and a process of generating the augmented reality video by compositing the avatar onto the live-action video, wherein, in the generating process, when the shooting direction of the live-action video changes, the position at which the avatar is composited in the augmented reality video is changed according to the position of the avatar in the coordinate system corresponding to the real space.

According to the present invention, video in which an expressive computer graphics character is composited can be generated more easily.

FIG. 1 is an overall configuration diagram of a video distribution system including the video synthesizer of this embodiment.
FIG. 2 is a diagram for explaining how a distributor distributes an AR video.
FIG. 3 is a diagram showing an example of an AR video when the shooting direction is panned to the right.
FIG. 4 is a functional block diagram showing a configuration example of the video synthesizer of this embodiment.
FIG. 5 is a flowchart showing the flow of the avatar initial placement processing.
FIG. 6 is a diagram showing an example in which the detected floor portion is superimposed on the captured video.
FIG. 7 is a diagram showing an example in which the avatar stands on the floor portion of FIG. 6.
FIG. 8 is a diagram showing an example in which the shooting direction is tilted upward from the state of FIG. 7.
FIG. 9 is a flowchart showing the flow of the AR video generation processing.
FIG. 10 is a flowchart showing the flow of the control processing for the avatar's facial expression and posture.
FIG. 11 is a diagram showing an example in which operation buttons are displayed on the touch panel.
FIG. 12 is a diagram showing an example in which the avatar is displayed facing backward while the distributor is moving.

Embodiments of the present invention will be described below with reference to the drawings.

The overall configuration of a video distribution system including the video synthesizer of this embodiment will be described with reference to FIG. 1. The video distribution system comprises a video synthesizer 1 and a video distribution server 3.

The video synthesizer 1 composites a three-dimensional computer graphics character (avatar) onto live-action video captured by the video synthesizer 1 to generate an augmented reality video (hereinafter referred to as an "AR video").

The video distribution server 3 receives the AR video from the video synthesizer 1 and distributes it to viewer terminals 9. The video distribution server 3 may distribute the received AR video in real time (so-called live broadcasting), or may store the AR video and distribute it in response to a request from a viewer terminal 9.

The video distribution system may include an avatar management server 5 and a comment management server 7.

The avatar management server 5 manages three-dimensional data of avatars. The video synthesizer 1 may combine avatar parts provided by the avatar management server 5 to generate its own avatar.

The comment management server 7 receives comments on the AR video from a viewer terminal 9 and distributes them to the video synthesizer 1 and the other viewer terminals 9.

The video synthesizer 1, the video distribution server 3, the avatar management server 5, the comment management server 7, and the viewer terminals 9 are communicably connected via a network.

The AR video generated by the video synthesizer 1 will be described with reference to FIGS. 2 and 3.

The video synthesizer 1 includes an out-camera and an in-camera whose shooting directions are opposite to each other, a microphone, a touch panel, and various sensors for detecting its own position (for example, an acceleration sensor and a gyro sensor). A mobile terminal such as a smartphone or tablet equipped with an out-camera and an in-camera can be used as the video synthesizer 1.

As shown in FIG. 2, a distributor 200 holds the video synthesizer 1, shoots the scenery that the distributor 200 is looking at with the out-camera, and shoots the distributor 200 himself or herself with the in-camera. The video synthesizer 1 generates an AR video by compositing an avatar 100 onto the live-action video captured by the out-camera. The video synthesizer 1 reflects the facial expression of the distributor 200 captured by the in-camera on the facial expression of the avatar 100. For example, while the distributor 200 is speaking, the video synthesizer 1 face-tracks the face of the distributor 200 captured by the in-camera and moves the mouth of the avatar 100 in step with the mouth of the distributor 200. The video synthesizer 1 may reflect the head movement of the distributor 200 on the avatar 100, or may reflect the gestures of the distributor 200 on the avatar 100. As a result, the distributor 200 can control the avatar 100 while shooting the scenery in front of him or her.

The video synthesizer 1 fixes the avatar 100 in a coordinate system corresponding to the real space and composites the avatar 100 onto the video. As shown in FIG. 3, when the shooting direction of the out-camera is panned to the right, the avatar 100 moves to the left in the video, just like an object that exists in the real space.
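The behavior in FIG. 3 follows from ordinary perspective projection of a point whose world coordinates do not change. The following is a minimal sketch, assuming a pinhole camera that rotates only about the vertical axis; the function name `project_to_screen`, the focal length, and the image size are illustrative assumptions and do not come from the patent.

```python
import math

def project_to_screen(point, cam_pos, yaw_deg, focal_px=800.0, width=1280, height=720):
    """Project a world point (x right, y up, z forward) to pixel coordinates.

    yaw_deg > 0 means the camera has panned to the right.
    Returns None when the point is behind the camera.
    All parameters are hypothetical values chosen for illustration.
    """
    dx = point[0] - cam_pos[0]
    dy = point[1] - cam_pos[1]
    dz = point[2] - cam_pos[2]
    yaw = math.radians(yaw_deg)
    # Express the offset in the camera frame (dot products with the
    # camera's right and forward vectors after the pan).
    cx = dx * math.cos(yaw) - dz * math.sin(yaw)
    cz = dx * math.sin(yaw) + dz * math.cos(yaw)
    if cz <= 0:
        return None
    u = width / 2 + focal_px * cx / cz
    v = height / 2 - focal_px * dy / cz
    return (u, v)
```

With the avatar fixed at `(0, 0, 3)` and the camera at the origin, increasing `yaw_deg` from 0 to 10 lowers the horizontal pixel coordinate, so the avatar shifts left in the frame while its world position stays the same.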

[Configuration of the video synthesizer]
A configuration example of the video synthesizer 1 will be described with reference to FIG. 4. The video synthesizer 1 shown in the figure includes a space measurement unit 11, an initial placement unit 12, an avatar control unit 13, a synthesizing unit 14, a position detection unit 15, an in-camera 16, an out-camera 17, an input unit 18, a display unit 19, a communication control unit 20, and a storage unit 21. Each unit of the video synthesizer 1 may be implemented by a computer including an arithmetic processing unit, a storage device, and the like, with the processing of each unit executed by a program. This program is stored in a storage device of the video synthesizer 1, and can also be recorded on a recording medium such as a magnetic disk, an optical disk, or a semiconductor memory, or provided through a network. For example, an application may be installed on a smartphone so that the smartphone functions as the video synthesizer 1.

The space measurement unit 11 measures three-dimensional spatial information of the real space in which the avatar is to be placed, sets a real-space coordinate system corresponding to the real space, and detects an area where the avatar can be placed (hereinafter referred to as the "avatar placeable area"). For example, the out-camera 17 captures video of the real space, and the three-dimensional spatial information of the shooting location can be measured by markerless AR technology using a monocular camera. From the three-dimensional spatial information obtained by the measurement, the space measurement unit 11 detects a flat portion, such as a floor, as the avatar placeable area. The location detected as the avatar placeable area may be inclined with respect to the ground or may be uneven, as long as placing the avatar there does not look unnatural.
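As a rough illustration of detecting a flat portion from measured 3D points, the sketch below groups points by height and takes the most populated height band as the floor. This is a deliberately simplified stand-in, not the method of the patent: it assumes a horizontal floor, whereas the description also allows inclined or uneven areas, and the function name `detect_floor` and its thresholds are hypothetical.

```python
from collections import Counter

def detect_floor(points, bin_size=0.05, min_points=4):
    """Return (floor_y, (min_x, max_x, min_z, max_z)) or None.

    points: iterable of (x, y, z) feature points in metres, y pointing up.
    The most populated height bin is treated as the floor (an assumption).
    """
    points = list(points)
    if not points:
        return None
    bins = Counter(round(p[1] / bin_size) for p in points)
    best_bin, count = bins.most_common(1)[0]
    if count < min_points:
        return None
    floor_pts = [p for p in points if round(p[1] / bin_size) == best_bin]
    floor_y = sum(p[1] for p in floor_pts) / len(floor_pts)
    xs = [p[0] for p in floor_pts]
    zs = [p[2] for p in floor_pts]
    # The axis-aligned extent stands in for the avatar placeable area.
    return floor_y, (min(xs), max(xs), min(zs), max(zs))
```

The returned rectangle plays the role of the avatar placeable area; a real AR framework would typically report detected planes directly.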

If the video synthesizer 1 is equipped with a depth camera or a stereo camera, the space measurement unit 11 may obtain the three-dimensional spatial information from the depth camera's measurements or from stereo images. The space measurement unit 11 stores the measured three-dimensional spatial information in the storage unit 21.

The initial placement unit 12 determines the initial position of the avatar so that the avatar lies within the avatar placeable area detected by the space measurement unit 11. For example, it displays an image in which a figure indicating the avatar placeable area (for example, a frame showing the extent of the floor) is superimposed on the real-space video captured by the out-camera 17, and prompts the distributor to specify the initial position of the avatar. When the distributor taps inside the avatar placeable area, the initial placement unit 12 calculates the coordinates of the tapped position in the real-space coordinate system and sets them as the initial position of the avatar. The initial placement unit 12 may place the avatar at any position in the avatar placeable area. When the distributor shakes the video synthesizer 1, the position of the avatar may be changed at random.
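One common way to turn a tapped pixel into real-space coordinates is to cast a ray through that pixel and intersect it with the detected floor plane. The sketch below assumes an unrotated pinhole camera and a horizontal floor at height `floor_y`; the name `tap_to_floor` and the camera intrinsics are assumptions made for this example.

```python
def tap_to_floor(tap_px, cam_pos, floor_y, focal_px=800.0, width=1280, height=720):
    """Intersect the ray through a tapped pixel with the plane y = floor_y.

    tap_px: (u, v) pixel, v growing downward. cam_pos: (x, y, z), y up.
    The camera is assumed to look along +z without rotation.
    Returns the (x, y, z) hit point, or None if the ray never reaches the floor.
    """
    dir_x = (tap_px[0] - width / 2) / focal_px
    dir_y = -(tap_px[1] - height / 2) / focal_px
    dir_z = 1.0
    if dir_y == 0:
        return None  # ray is parallel to the floor
    t = (floor_y - cam_pos[1]) / dir_y
    if t <= 0:
        return None  # the floor lies behind the ray's direction of travel
    return (cam_pos[0] + t * dir_x, floor_y, cam_pos[2] + t * dir_z)
```

For a camera held 1.5 m above the floor, a tap at pixel `(640, 760)` lands 3 m in front of the camera. A caller would then check that the hit point lies inside the avatar placeable area before accepting it.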

The avatar control unit 13 face-tracks the distributor's face captured by the in-camera 16 and reflects the distributor's facial expression on the avatar's facial expression. The avatar control unit 13 may control the avatar according to operations that the distributor enters through a menu or the like. The avatar control unit 13 may also control the posture and position of the avatar based on the movement of the video synthesizer 1. For example, when the distributor is moving forward while shooting the scenery, the avatar is turned to face forward and made to walk.

The synthesizing unit 14 places the avatar in the real-space coordinate system and composites the avatar onto the video captured by the out-camera 17 to generate the AR video. The AR video is displayed on the display unit 19 and transmitted from the communication control unit 20 to the video distribution server 3. The synthesizing unit 14 may store the AR video in the storage unit 21. The data needed to render the avatar is received from the avatar management server 5 and stored in the storage unit 21. Avatar data stored in the storage unit 21 in advance may also be used.

The position detection unit 15 detects the position and orientation of the video synthesizer 1 itself in the real-space coordinate system (which are also the position and orientation of the out-camera 17). The synthesizing unit 14 renders the avatar based on the position and orientation detected by the position detection unit 15.

The in-camera 16 shoots the distributor (who is also the operator of the video synthesizer 1).

The out-camera 17 shoots the scenery and subjects to be transmitted as the AR video.

The input unit 18 accepts operations from the touch panel of the video synthesizer 1.

The display unit 19 displays, on the touch panel, the AR video in which the avatar is composited onto the live-action video captured by the out-camera 17. It may also display various buttons for operating the avatar.

The communication control unit 20 transmits the AR video to the video distribution server 3.

[Initial placement of the avatar]
An example of the avatar initial placement processing will be described with reference to FIG. 5.

The processing shown in FIG. 5 is executed when the distributor determines the position of the avatar in the real-space coordinate system, before generation of the AR video begins.

The distributor starts the video synthesizer 1, shoots video of the place where the avatar is to be placed with the out-camera 17, and acquires three-dimensional spatial information of that place (step S11). Specifically, the distributor launches the application so that the smartphone operates as the video synthesizer 1, and shoots a flat place where the avatar is to be placed with the out-camera 17. The distributor shoots the placement location while moving the video synthesizer 1 slightly. Three-dimensional spatial information is acquired from the movement of the video synthesizer 1 and the movement of feature points detected in the captured video, and an avatar placeable area, in which placement of the avatar is permitted, is detected. Here, a flat "floor" is detected as the avatar placeable area.

When the space measurement unit 11 has acquired the three-dimensional spatial information and detected the floor, the display unit 19 superimposes a figure indicating the floor area on the video captured by the out-camera 17 and displays it (step S12). For example, as shown in FIG. 6, a frame 110 indicating the floor area is superimposed on the video captured by the out-camera 17.

When the distributor taps inside the frame 110, the avatar control unit 13 places the avatar 100 at the tapped position, as shown in FIG. 7 (step S13). The coordinates of the tapped position in the real-space coordinate system become the coordinates of the avatar's standing position. If the distributor taps another place inside the frame 110, the newly tapped position becomes the avatar's standing position. Once the coordinates of the avatar in the real-space coordinate system are determined, the synthesizing unit 14 superimposes the avatar on the live-action video captured by the out-camera 17 and displays it. From then on, the avatar is superimposed on the live-action video as if it existed in the real space. For example, when the shooting direction of the out-camera 17 is tilted upward from the state of FIG. 7, the upper body of the avatar 100 is displayed while the standing position of the avatar 100 remains fixed in the real space, as shown in FIG. 8. Likewise, when the shooting direction of the out-camera 17 is panned left or right, the avatar 100 is superimposed on the live-action video with its standing position fixed in the real space.

While the avatar 100 is displayed, its standing position may be adjusted by dragging it. For example, when the avatar 100 is displayed as shown in FIG. 8 and the distributor taps the avatar 100 and slides a finger left or right across the screen, the standing position of the avatar 100 moves left or right. When the distributor slides the finger up or down, the standing position of the avatar 100 moves away from or toward the camera. When the avatar 100 is moved in any of these directions, its movement is stopped so that it does not leave the range recognized as the floor.
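The drag adjustment with the floor boundary can be sketched as a translation followed by a clamp. In this example the floor is approximated by an axis-aligned rectangle, dragging upward is taken to mean moving deeper into the scene, and the pixel-to-metre factor is arbitrary; all three are assumptions, as is the name `drag_avatar`.

```python
def drag_avatar(position, drag_px, area, metres_per_px=0.005):
    """Move the avatar's standing position by a screen drag, clamped to the floor.

    position: (x, z) in the real-space coordinate system.
    drag_px: (dx, dy) in pixels, dy positive when the finger moves down.
    area: (min_x, max_x, min_z, max_z) of the detected floor.
    """
    dx, dy = drag_px
    x = position[0] + dx * metres_per_px
    z = position[1] - dy * metres_per_px  # finger up means farther away (assumed)
    min_x, max_x, min_z, max_z = area
    # Clamp so the avatar never leaves the range recognized as the floor.
    return (min(max(x, min_x), max_x), min(max(z, min_z), max_z))
```

A long drag to the right therefore stops the avatar at the right edge of the floor rather than carrying it past the boundary.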

When the distributor shakes the video synthesizer 1, the initial placement unit 12 may determine the avatar's standing position at random. When the distributor tilts the video synthesizer 1, the initial placement unit 12 may move the avatar's standing position according to the tilt. For example, when the distributor tilts the video synthesizer 1 to the right, the avatar 100 moves to the right, and when the distributor tips the video synthesizer 1 toward himself or herself, the avatar 100 moves toward the camera.

The avatar's standing position may also be adjusted based on the image of the distributor captured by the in-camera 16. For example, when the distributor turns to the right, the avatar's standing position moves to the right. When the distributor looks down, the avatar's standing position moves toward the camera.

Once the distributor has decided the position of the avatar, the initial placement unit 12 determines the size and orientation of the avatar according to the distributor's operations (step S14). For example, when the distributor flicks the touch panel up or down, the avatar is enlarged or reduced. When the distributor flicks the touch panel left or right, the avatar is rotated. When the distributor taps the touch panel with two fingers, the size and orientation of the avatar are reset to their initial values.
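The gesture handling in step S14 can be sketched as a small state update. The gesture names, the 10% scale step, the 15-degree rotation step, and the choice that an upward flick enlarges the avatar are all assumptions made for illustration; the description only states which gesture family affects size, orientation, or reset.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AvatarTransform:
    scale: float = 1.0    # 1.0 is the initial size
    yaw_deg: float = 0.0  # 0.0 is the initial orientation

def apply_gesture(t, gesture, scale_step=1.1, yaw_step=15.0):
    """Return the transform after one touch gesture (hypothetical gesture names)."""
    if gesture == "flick_up":
        return AvatarTransform(t.scale * scale_step, t.yaw_deg)
    if gesture == "flick_down":
        return AvatarTransform(t.scale / scale_step, t.yaw_deg)
    if gesture == "flick_right":
        return AvatarTransform(t.scale, (t.yaw_deg + yaw_step) % 360.0)
    if gesture == "flick_left":
        return AvatarTransform(t.scale, (t.yaw_deg - yaw_step) % 360.0)
    if gesture == "two_finger_tap":
        return AvatarTransform()  # reset to the initial size and orientation
    return t  # unknown gestures leave the avatar unchanged
```

Returning a new immutable transform on each gesture keeps the reset behavior trivial: a two-finger tap simply yields the default instance.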

An object such as a tatami mat, a carpet, or a stage may be placed on the floor at the avatar's standing position. If a raised platform is placed as the object underfoot, the avatar's standing position is raised by the height of the platform.

The avatar may be allowed to move freely over the floor within a predetermined range centered on its initial position. For example, when the distributor has been silent for a while, the avatar may be controlled to wander around within the predetermined range.

[Generation of the AR video]
When the distributor finishes the initial placement of the avatar, the video synthesizer 1 starts shooting the distributor with the in-camera 16 and starts generating the AR video.

An example of the AR video generation processing will be described with reference to FIG. 9.

The out-camera 17 shoots the scenery (step S21) and, at the same time, the in-camera 16 shoots the distributor (step S22). The microphone picks up the audio that accompanies the video.

The position detection unit 15 detects the position and orientation of the video synthesizer 1 (step S23).

The avatar control unit 13 controls the facial expression and posture of the avatar based on the video of the distributor captured by the in-camera 16 (step S24). The processing of the avatar control unit 13 is described in detail later.

The synthesizing unit 14 composites the avatar onto the live-action video captured by the out-camera 17 to generate the AR video (step S25).
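Steps S21 to S25 can be read as a per-frame pipeline. The sketch below shows only the data flow between the steps; every callable is a placeholder supplied by the caller and is an assumption of this example. The two captures run one after the other here, whereas the description has them occur at the same time.

```python
def generate_ar_frame(capture_scene, capture_operator, detect_pose, control_avatar, compose):
    """Produce one AR frame by running steps S21 to S25 in order.

    All five arguments are hypothetical callables standing in for the
    out-camera, in-camera, position detection unit, avatar control unit,
    and synthesizing unit respectively.
    """
    scene = capture_scene()                 # S21: out-camera shoots the scenery
    operator_image = capture_operator()     # S22: in-camera shoots the distributor
    pose = detect_pose()                    # S23: device position and orientation
    avatar = control_avatar(operator_image) # S24: expression and posture
    return compose(scene, avatar, pose)     # S25: composite avatar onto the video
```

Calling this function once per captured frame yields the sequence of composited frames that makes up the AR video.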

[Control of the avatar]
An example of the processing by which the avatar control unit 13 controls the avatar's facial expression and posture will be described with reference to FIG. 10.

The avatar control unit 13 determines whether the distributor is moving (step S31). Whether the distributor is moving can be determined from the movement of the video synthesizer 1 detected by the position detection unit 15.
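One simple way to make the judgment in step S31 is to compare the device's speed between two position samples with a threshold. The description does not specify the criterion, so the threshold value and the function name `is_moving` below are assumptions.

```python
import math

def is_moving(prev_pos, curr_pos, dt, speed_threshold=0.3):
    """Judge movement from two device positions sampled dt seconds apart.

    prev_pos, curr_pos: (x, y, z) in metres in the real-space coordinate system.
    speed_threshold: metres per second, a hypothetical tuning value.
    """
    if dt <= 0:
        return False
    speed = math.dist(prev_pos, curr_pos) / dt
    return speed > speed_threshold
```

A threshold above zero keeps small hand tremors from being treated as the distributor walking.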

If the distributor is moving (YES in step S31), the avatar control unit 13 determines whether the avatar is in the position-fixed state (step S32). In the position-fixed state, the position of the avatar in the real-space coordinate system is not changed: even if the distributor moves, the avatar's standing position stays where it is. When the avatar is not in the position-fixed state, a predetermined distance is kept between the video synthesizer 1 and the avatar in the real-space coordinate system, and the avatar's position is moved to follow the movement of the video synthesizer 1. That is, when the distributor, and therefore the video synthesizer 1, moves, the avatar control unit 13 moves the avatar's position accordingly. The movement of the avatar is described later.

The position-fixed state of the avatar can be switched by operating the position fixing button 130 displayed on the touch panel, as shown in FIG. 11. When the position fixing button 130 is operated while the avatar is in the position-fixed state, the position fixing is released. When the position fixing button 130 is operated while the avatar is not position-fixed, the avatar is put into the position-fixed state.
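The position-fixed state and the follow behavior can be sketched together as a small state object. Placing the avatar directly ahead of the device at the predetermined distance is an assumption; the description only says that a predetermined distance is kept. The class and method names are likewise hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class AvatarState:
    x: float
    z: float
    position_fixed: bool = True

    def toggle_fixed(self):
        """Mimic one press of the position fixing button 130."""
        self.position_fixed = not self.position_fixed

    def follow(self, device_x, device_z, device_yaw_deg, distance=2.0):
        """Keep the avatar `distance` metres ahead of the device unless fixed."""
        if self.position_fixed:
            return  # a fixed avatar ignores device movement
        yaw = math.radians(device_yaw_deg)
        self.x = device_x + distance * math.sin(yaw)
        self.z = device_z + distance * math.cos(yaw)
```

Calling `follow` on every frame leaves a fixed avatar where it stands, while an unfixed avatar is carried along as the device moves.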

If the distributor is not moving (NO in step S31), or if the avatar's position is fixed (YES in step S32), the avatar control unit 13 determines whether an expression button has been operated (step S33). In this embodiment, expression buttons 120A, 120B, and 120C are displayed on the touch panel, as shown in FIG. 11. The expression buttons 120A, 120B, and 120C correspond to laughing, crying, and angry expressions, respectively.

If one of the expression buttons 120A, 120B, and 120C has been operated (YES in step S33), the avatar control unit 13 changes the avatar's facial expression to the one corresponding to the operated button (step S34). By preparing exaggerated expression animations in advance and having the avatar play them in response to the expression buttons, emotions can be expressed more clearly. The avatar may also be made to gesture in addition to changing its expression. For example, when the crying expression button is operated, the avatar makes a gesture of wiping away tears with its hand.
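The branch structure of steps S31 to S34 can be summarized as a single decision function. The button identifiers follow the reference numerals in the description, but the returned dictionaries, the gesture table, and the fallback to face tracking when no button is pressed are assumptions made for this sketch.

```python
# Expressions assigned to the buttons in the description.
EXPRESSION_BUTTONS = {"120A": "laugh", "120B": "cry", "120C": "angry"}
# Optional gesture played with an expression (only crying is given as an example).
GESTURES = {"cry": "wipe_tears"}

def decide_avatar_action(distributor_moving, position_fixed, pressed_button=None):
    """Return the action for one control cycle as a small dict."""
    # S31 YES and S32 NO: the avatar moves along with the device.
    if distributor_moving and not position_fixed:
        return {"action": "move_with_device"}
    # S33: has an expression button been operated?
    expression = EXPRESSION_BUTTONS.get(pressed_button)
    if expression is not None:
        # S34: play the matching expression, plus a gesture if one is defined.
        return {
            "action": "play_expression",
            "expression": expression,
            "gesture": GESTURES.get(expression),
        }
    # Assumed fallback: keep mirroring the distributor's tracked face.
    return {"action": "face_tracking"}
```

For instance, pressing button 120B while standing still yields the crying expression together with the tear-wiping gesture.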

Buttons for controlling the avatar's posture (including its orientation) may be displayed on the touch panel. For example, a button for making the avatar turn around, or a button for turning the avatar's face or upper body to the right or left, may be displayed on the touch panel. When one of these buttons is operated, the avatar control unit 13 changes the avatar's posture to the one corresponding to the operated button.

The avatar's expression and posture may also be controllable from a menu. For example, the distributor swipes the edge of the touch panel to bring up a menu bar and selects the item corresponding to the expression or posture the avatar should take.

The avatar may also be controllable by hand signs captured by the in-camera 16. The avatar control unit 13 associates hand signs with avatar control details (such as expressions and postures) in advance and, when it detects a specific hand sign in the video captured by the in-camera 16, controls the avatar according to the detected hand sign. For example, when the in-camera 16 captures a clenched fist, the avatar is made to show an angry expression.
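The association between hand signs and control details described above can be sketched as a lookup table. This is a minimal illustration, not the implementation of the avatar control unit 13: the names `AvatarControl`, `HAND_SIGN_CONTROLS`, and `control_for_hand_sign` are hypothetical, hand-sign detection itself is assumed to happen elsewhere, and only the fist-to-anger pairing comes from the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class AvatarControl:
    """Control details associated with one hand sign (hypothetical type)."""
    expression: Optional[str] = None
    posture: Optional[str] = None


# Hypothetical association table. Only "fist" -> anger is taken from the
# description; the other entry is an assumed example.
HAND_SIGN_CONTROLS: Dict[str, AvatarControl] = {
    "fist": AvatarControl(expression="anger"),
    "open_palm": AvatarControl(posture="wave"),  # assumption for illustration
}


def control_for_hand_sign(detected_sign: Optional[str]) -> Optional[AvatarControl]:
    """Return the control for a detected hand sign, or None if there is none.

    `detected_sign` is the label produced by some hand-sign detector running
    on the in-camera video; None means no hand sign was detected.
    """
    if detected_sign is None:
        return None
    return HAND_SIGN_CONTROLS.get(detected_sign)
```

Returning `None` for an undetected or unregistered sign lets the caller fall through to other control sources, such as face tracking.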

The avatar may also be controllable by characters or figures captured by the in-camera 16. When the avatar control unit 13 detects specific characters or figures in the video captured by the in-camera 16, it controls the avatar according to what it detected. For example, the distributor writes "smile" on a piece of paper and captures the paper with the in-camera 16 to make the avatar smile.

The avatar may also be controllable by the movement of the video synthesizer 1. The avatar control unit 13 associates movements of the video synthesizer 1 with avatar control details in advance and controls the avatar according to the movement of the video synthesizer 1 detected by the position detection unit 15. For example, when the video synthesizer 1 is tilted, the avatar is made to bow. Because tilting the video synthesizer 1 also tilts the scenery captured by the out-camera 17, one frame of the video captured immediately before the tilt was detected may be used as a still-image background onto which the bowing avatar is synthesized. Instead of a still image, a video of about several seconds captured immediately before the tilt was detected may be used as the background.
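The tilt handling described above can be sketched as follows. This is a rough illustration under stated assumptions: the class name `BackgroundSelector`, the 20-degree threshold, and the string action `"bow"` are hypothetical, and the tilt angle is assumed to be supplied by something like the position detection unit 15. Only the still-image variant is shown.

```python
from typing import Any, Optional, Tuple


class BackgroundSelector:
    """Selects the background frame and freezes it while the device is tilted."""

    def __init__(self, tilt_threshold_deg: float = 20.0) -> None:
        # The threshold value is an assumption; the description gives none.
        self.tilt_threshold_deg = tilt_threshold_deg
        self.last_level_frame: Optional[Any] = None  # newest frame captured while level
        self.frozen_frame: Optional[Any] = None      # background held during a tilt

    def update(self, frame: Any, tilt_deg: float) -> Tuple[Any, Optional[str]]:
        """Return (background_frame, avatar_action) for one captured frame."""
        if abs(tilt_deg) >= self.tilt_threshold_deg:
            if self.frozen_frame is None:
                # Freeze on the frame captured just before the tilt was detected;
                # fall back to the current frame if no level frame exists yet.
                self.frozen_frame = (
                    self.last_level_frame if self.last_level_frame is not None else frame
                )
            return self.frozen_frame, "bow"
        # Device is level again: release the frozen background.
        self.frozen_frame = None
        self.last_level_frame = frame
        return frame, None
```

Keeping a short buffer of recent frames instead of a single frame would give the several-second video variant mentioned in the text.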

The avatar control unit 13 may control the avatar based on information from a sensor attached to the distributor, or the avatar may be controlled with an external input device such as a keyboard. The sensor and the input device communicate wirelessly with the video synthesizer 1.

If no expression button is operated (NO in step S33), the avatar control unit 13 performs face tracking on the distributor's face captured by the in-camera 16 and reflects the distributor's expression on the avatar (step S35). Reflecting the distributor's expression captured by the in-camera 16 on the avatar allows the avatar to be controlled expressively without requiring any manual operation by the distributor.

The avatar control unit 13 may make the avatar perform speaking motions based on the distributor's voice picked up by the microphone.

If the distributor is moving (YES in step S31) and the avatar is not in the position-fixed state (NO in step S32), the avatar control unit 13 turns the avatar 100 toward the distributor's direction of travel (step S36) and moves the standing position of the avatar 100 in the direction of travel (step S37). Specifically, the avatar control unit 13 moves the avatar so that the distance between the position of the video synthesizer 1 and the standing position of the avatar in the real space coordinate system is kept at a predetermined spacing. The avatar control unit 13 moves the avatar away from the distributor when the distributor (the video synthesizer 1) moves forward, moves the avatar toward the distributor when the distributor moves backward, and stops the avatar when the distributor stops. The processing of steps S33 to S35 may be performed independently of the movement of the avatar. Specifically, the processing of step S33 may be executed after the processing of step S37.
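The branching of steps S31 to S37 and the distance-keeping rule can be sketched as below. This is an illustrative reading, not the patented implementation: the function names, the string action labels, and the 2D ground-plane coordinates are assumptions. In particular, placing the avatar on the line from the device through the avatar's current position is one assumed way to keep the predetermined spacing.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) on the ground plane of the real space coordinate system


def avatar_actions(moving: bool, position_fixed: bool,
                   pressed_expression: Optional[str],
                   tracked_expression: str) -> List[str]:
    """Return the actions for one pass through steps S31 to S37."""
    if moving and not position_fixed:          # S31 YES and S32 NO
        return ["face_travel_direction",       # S36
                "move_in_travel_direction"]    # S37
    if pressed_expression is not None:         # S33 YES
        return ["expression:" + pressed_expression]   # S34
    return ["expression:" + tracked_expression]       # S33 NO -> S35 (face tracking)


def follow_position(device: Point, avatar: Point, spacing: float) -> Point:
    """Move the avatar so that its distance from the device equals `spacing`.

    The avatar stays on the line from the device through its current position
    (an assumption), so it moves away when the device approaches and moves
    closer when the device retreats.
    """
    dx, dy = avatar[0] - device[0], avatar[1] - device[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return avatar  # direction undefined; leave the avatar where it is
    scale = spacing / dist
    return (device[0] + dx * scale, device[1] + dy * scale)
```

For instance, with a spacing of 2.0 and the avatar at (0.0, 2.0), a device that advances from the origin to (0.0, 0.5) pushes the avatar to about (0.0, 2.5), while a device that stays at the origin leaves the avatar in place.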

The avatar control unit 13 may fix the avatar's orientation, or may turn the avatar toward the avatar's direction of travel. For example, when the avatar's position is not fixed but its orientation is fixed, and the distributor moves forward from the state shown in FIG. 11, the avatar control unit 13 moves the avatar backward while keeping it facing the video synthesizer 1. When neither the avatar's position nor its orientation is fixed, and the distributor moves forward from the state shown in FIG. 11, the avatar control unit 13 turns the avatar toward the direction of travel and moves the avatar forward, as shown in FIG. 12.
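The two orientation modes above can be sketched as a single heading rule. This is a small illustration with assumed conventions: `avatar_heading` is a hypothetical name, headings are measured in degrees counterclockwise from the +x axis, and the avatar's velocity on the ground plane is assumed to be known.

```python
import math
from typing import Tuple


def avatar_heading(orientation_fixed: bool, current_heading_deg: float,
                   avatar_velocity: Tuple[float, float]) -> float:
    """Return the avatar's heading in degrees within [0, 360).

    With a fixed orientation the avatar keeps its current heading, so it walks
    backward if it is moved away from the camera it faces. Otherwise it turns
    toward its own direction of travel.
    """
    vx, vy = avatar_velocity
    if orientation_fixed or (vx == 0.0 and vy == 0.0):
        return current_heading_deg % 360.0
    return math.degrees(math.atan2(vy, vx)) % 360.0
```

An avatar facing the camera at heading 270 while being pushed along +y keeps heading 270 in the fixed mode and turns to heading 90 in the free mode.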

If the microphone picks up the voice of a person other than the distributor, the avatar control unit 13 may turn the avatar toward the direction of that voice.

When the avatar's destination is not floor, for example when the destination is a wall, the avatar control unit 13 may stop the avatar at the edge of the floor instead of moving it further. When the distributor distributes an AR video while walking along a road, the space measurement unit 11 detects a flat portion in the direction of travel from the video of the out-camera 17, and the avatar control unit 13 moves the avatar onto that flat portion. As a result, when the distributor distributes an AR video while walking along a road, an AR video in which the avatar appears to be walking along the road can be generated.
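Stopping the avatar at the edge of the floor can be sketched on a grid of ground cells. This is a simplified illustration: `advance_on_floor` and the `is_floor` predicate are hypothetical, and the predicate stands in for whatever flat-portion detection the space measurement unit 11 provides.

```python
from typing import Callable, Tuple

Cell = Tuple[int, int]  # grid cell on the ground plane (assumed discretization)


def advance_on_floor(start: Cell, step: Cell, n_steps: int,
                     is_floor: Callable[[Cell], bool]) -> Cell:
    """Advance up to `n_steps` cells in direction `step`, stopping at the floor edge.

    The avatar halts on the last cell for which `is_floor` is true, for
    example just in front of a wall.
    """
    pos = start
    for _ in range(n_steps):
        nxt = (pos[0] + step[0], pos[1] + step[1])
        if not is_floor(nxt):
            break  # next cell is not floor: stay at the edge
        pos = nxt
    return pos
```

With a floor spanning rows 0 to 3, an avatar asked to walk ten cells along +y from (0, 0) stops at (0, 3).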

When the avatar is not in the position-fixed state, even if the distributor is not moving, the avatar control unit 13 may move the avatar left or right to follow the shooting direction when the shooting direction of the out-camera 17 is panned left or right.

As described above, according to the present embodiment, the out-camera 17 captures a live-action video while the in-camera 16 captures the distributor, the avatar control unit 13 controls the avatar based on the image of the distributor captured by the in-camera 16, and the synthesis unit 14 places the avatar at a predetermined position in the real space coordinate system and synthesizes the avatar with the live-action video. The distributor can thereby generate an AR video that combines an avatar reflecting the distributor's own expression with the scenery in front of the distributor while shooting that scenery. By face tracking the distributor's face captured by the in-camera 16 and reflecting it on the avatar, an expressive avatar can be synthesized.

According to the present embodiment, the space measurement unit 11 acquires three-dimensional spatial information of the shooting location and detects an avatar placeable area in which placement of the avatar is permitted, and the initial placement unit 12 places the avatar in the avatar placeable area and determines the position of the avatar in the real space coordinate system. The avatar can thereby be fixed in real space without installing a marker for determining the avatar's standing position.

According to the present embodiment, the avatar control unit 13 moves the position of the avatar in the real space coordinate system according to the position of the video synthesizer 1 in the real space coordinate system. When the distributor shoots while walking with the video synthesizer 1 in hand, an AR video can thereby be generated in which the avatar moves along with the distributor.

In the present embodiment, the video synthesizer 1 generates the AR video. Alternatively, the video synthesizer 1 may transmit the data necessary for generating the AR video, such as the live-action video captured by the out-camera 17 and the images captured by the in-camera 16, to a server, and the AR video may be generated on the server or in the cloud.

REFERENCE SIGNS LIST
1: video synthesizer
11: space measurement unit
12: initial placement unit
13: avatar control unit
14: synthesis unit
15: position detection unit
16: in-camera
17: out-camera
18: input unit
19: display unit
20: communication control unit
21: storage unit
3: video distribution server
5: avatar management server
7: comment management server
9: viewer terminal

Claims

1. A video synthesizer for generating an augmented reality video by synthesizing an avatar with a live-action video, comprising:
a video acquisition unit that acquires a live-action video serving as a background of the augmented reality video and an image of an operator who operates the avatar;
a control unit that controls the avatar based on the image of the operator;
a placement unit that places the avatar at an arbitrary initial position in a coordinate system corresponding to the real space of the live-action video based on an operation by the operator; and
a synthesizing unit that generates the augmented reality video by synthesizing the avatar with the live-action video,
wherein, when a shooting direction of the live-action video changes, the synthesizing unit changes a composite position of the avatar in the augmented reality video according to the position of the avatar in the coordinate system corresponding to the real space.

2. A video synthesizer for generating an augmented reality video by synthesizing an avatar with a live-action video, comprising:
a video acquisition unit that acquires a live-action video serving as a background of the augmented reality video and an image of an operator who operates the avatar;
a control unit that controls the avatar based on the image of the operator;
a space measurement unit that acquires three-dimensional spatial information of the shooting location of the live-action video and detects an area where the avatar can be placed;
a placement unit that places the avatar at an initial position where the avatar can be placed in a coordinate system corresponding to the real space of the live-action video based on an operation by the operator; and
a synthesizing unit that generates the augmented reality video by synthesizing the avatar with the live-action video.

3. The video synthesizer according to claim 2, wherein the space measurement unit detects a flat portion as the area where the avatar can be placed.

4. The video synthesizer according to claim 2, wherein the space measurement unit acquires the three-dimensional spatial information of the shooting location from movements of feature points detected in the live-action video.

5. The video synthesizer according to claim 1, wherein the control unit moves the avatar according to movement of the operator.

6. The video synthesizer according to claim 5, wherein the control unit turns the avatar toward the direction in which the operator has advanced and moves the position of the avatar according to the movement of the operator.

7. A video synthesis method for generating an augmented reality video by synthesizing an avatar with a live-action video, the method comprising, by a computer:
a step of acquiring an image of an operator who operates the avatar and a live-action video serving as a background of the augmented reality video;
a step of controlling the avatar based on the image of the operator;
a step of placing the avatar at an arbitrary initial position in a coordinate system corresponding to the real space of the live-action video based on an operation by the operator; and
a step of generating the augmented reality video by synthesizing the avatar with the live-action video,
wherein, in the step of generating the augmented reality video, when a shooting direction of the live-action video changes, a composite position of the avatar in the augmented reality video is changed according to the position of the avatar in the coordinate system corresponding to the real space.

8. A video synthesis method for generating an augmented reality video by synthesizing an avatar with a live-action video, the method comprising, by a computer:
a step of acquiring an image of an operator who operates the avatar and a live-action video serving as a background of the augmented reality video;
a step of controlling the avatar based on the image of the operator;
a step of acquiring three-dimensional spatial information of the shooting location of the live-action video and detecting an area where the avatar can be placed;
a step of placing the avatar at an initial position where the avatar can be placed in a coordinate system corresponding to the real space of the live-action video based on an operation by the operator; and
a step of generating the augmented reality video by synthesizing the avatar with the live-action video.

9. A video synthesis program for generating an augmented reality video by synthesizing an avatar with a live-action video, the program causing a computer to execute:
a process of acquiring an image of an operator who operates the avatar and a live-action video serving as a background of the augmented reality video;
a process of controlling the avatar based on the image of the operator;
a process of placing the avatar at an arbitrary initial position in a coordinate system corresponding to the real space of the live-action video based on an operation by the operator; and
a process of generating the augmented reality video by synthesizing the avatar with the live-action video,
wherein, in the process of generating the augmented reality video, when a shooting direction of the live-action video changes, a composite position of the avatar in the augmented reality video is changed according to the position of the avatar in the coordinate system corresponding to the real space.

10. A video synthesis program for generating an augmented reality video by synthesizing an avatar with a live-action video, the program causing a computer to execute:
a process of acquiring an image of an operator who operates the avatar and a live-action video serving as a background of the augmented reality video;
a process of controlling the avatar based on the image of the operator;
a process of acquiring three-dimensional spatial information of the shooting location of the live-action video and detecting an area where the avatar can be placed;
a process of placing the avatar at an initial position where the avatar can be placed in a coordinate system corresponding to the real space of the live-action video based on an operation by the operator; and
a process of generating the augmented reality video by synthesizing the avatar with the live-action video.