JP7261261B2 - Information processing device, switching system, program and method

Info

Publication number: JP7261261B2
Application number: JP2021071382A
Authority: JP
Inventors: 大樹加藤; 崇文久野; 秀樹横山
Original assignee: Nippon Television Network Corp
Current assignee: Nippon Television Network Corp
Priority date: 2019-05-15
Filing date: 2021-04-20
Publication date: 2023-04-19
Anticipated expiration: 2039-05-15
Also published as: JP2021119686A; JP6873186B2; JP2020187592A

Description

The present invention relates to an information processing device, a switching system, a program, and a method.

In recent years, camera pixel counts have increased, making it possible to capture high-quality video such as 4K and 8K. Techniques have been proposed that cut out part of such high-quality video to generate moving images mimicking camera work such as close-ups, pans, and tilts without requiring dedicated equipment (for example, Patent Document 1).

Patent Document 1: JP 2016-28539 A

However, the above technique requires the user to specify, in the video, the subject to be cut out.

Furthermore, the user must specify the size and position of the video to be cut out for each camera work, so video with the desired camera work could not be cut out automatically. In particular, close-up video comes in various shots such as full shots, bust shots, and waist shots, and such advanced camera work could not be performed automatically.

Accordingly, an object of the present invention is to provide an information processing device, a switching system, a program, and a method capable of automatically generating video with a wide variety of camera work.

One aspect of the present invention is a video processing device comprising: an acquisition unit that acquires captured video shot by a camera; a specific person designation reception unit that designates a specific person to be the subject; a virtual camerawork designation reception unit that designates a virtual camerawork, which trims part of the captured video to realize a video effect of a virtual camera operation; a person recognition unit that recognizes persons in the captured video; a skeleton determination unit that uses the captured video to determine the skeletons of the persons in the video; a person identification unit that identifies the specific person in the video using the recognition result of the person recognition unit and, using the specific person's skeleton from among the skeletons determined by the skeleton determination unit, specifies the positional relationship of the specific person's body parts on the captured video; a composition information storage unit that stores composition information defining a composition for each virtual camerawork; a trimming frame size determination unit that uses the positional relationship of the specific person's body parts on the captured video and the composition information corresponding to the designated virtual camerawork to determine the size of a trimming frame for trimming part of the captured video; a trimming frame control unit that uses the positional relationship of the specific person's body parts on the captured video and the composition information corresponding to the designated virtual camerawork to control the position, on the captured video, of the trimming frame corresponding to the designated virtual camerawork; and a trimming unit that trims the video inside the trimming frame from the captured video and outputs the trimmed video as virtual camerawork video produced by the virtual camerawork.

One aspect of the present invention is a switching system comprising at least two video processing devices, a display unit, and a switching unit. Each video processing device includes: an acquisition unit that acquires captured video shot by a camera; a specific person designation reception unit that designates a specific person to be the subject; a virtual camerawork designation reception unit that designates a virtual camerawork, which trims part of the captured video to realize a video effect of a virtual camera operation; a person recognition unit that recognizes persons in the captured video; a skeleton determination unit that uses the captured video to determine the skeletons of the persons in the video; a person identification unit that identifies the specific person in the video using the recognition result of the person recognition unit and, using the specific person's skeleton from among the skeletons determined by the skeleton determination unit, specifies the positional relationship of the specific person's body parts on the captured video; a composition information storage unit that stores composition information defining a composition for each virtual camerawork; a trimming frame size determination unit that uses the positional relationship of the specific person's body parts on the captured video and the composition information corresponding to the designated virtual camerawork to determine the size of a trimming frame for trimming part of the captured video; a trimming frame control unit that uses the positional relationship of the specific person's body parts on the captured video and the composition information corresponding to the designated virtual camerawork to control the position, on the captured video, of the trimming frame corresponding to the designated virtual camerawork; and a trimming unit that trims the video inside the trimming frame from the captured video and outputs the trimmed video as virtual camerawork video produced by the virtual camerawork. The display unit displays the captured video and at least two of the virtual camerawork videos, and the switching unit outputs the one virtual camerawork video designated by the user from among the at least two virtual camerawork videos.

One aspect of the present invention is a program that causes a computer to execute: an acquisition process that acquires captured video shot by a camera; a specific person designation reception process that receives designation of a specific person to be the subject; a virtual camerawork designation reception process that receives designation of a virtual camerawork, which trims part of the captured video to realize a video effect of a virtual camera operation; a person recognition process that recognizes persons in the captured video; a skeleton determination process that uses the captured video to determine the skeletons of the persons in the video; a person identification process that identifies the specific person in the video using the recognition result of the person recognition process and, using the specific person's skeleton from among the skeletons determined by the skeleton determination process, specifies the positional relationship of the specific person's body parts on the captured video; a trimming frame size determination process that uses the positional relationship of the specific person's body parts on the captured video and, from among composition information defining a composition for each virtual camerawork, the composition information corresponding to the designated virtual camerawork to determine the size of a trimming frame for trimming part of the captured video; a trimming frame control process that uses the positional relationship of the specific person's body parts on the captured video and the composition information corresponding to the designated virtual camerawork to control the position, on the captured video, of the trimming frame corresponding to the designated virtual camerawork; and a trimming process that trims the video inside the trimming frame from the captured video and outputs the trimmed video as virtual camerawork video produced by the virtual camerawork.

One aspect of the present invention is a program that causes a computer to execute: an acquisition process that acquires captured video shot by a camera; a specific person designation process that receives designation of a specific person to be the subject; a virtual camerawork designation process that receives designation of a virtual camerawork, which trims part of the captured video to realize a video effect of a virtual camera operation; a person recognition process that recognizes persons in the captured video; a skeleton determination process that uses the captured video to determine the skeletons of the persons in the video; a person identification process that identifies the specific person in the video using the recognition result of the person recognition process and, using the specific person's skeleton from among the skeletons determined by the skeleton determination process, specifies the positional relationship of the specific person's whole body on the captured video; a trimming frame size determination process that uses the positional relationship of the specific person's whole body on the captured video and, from among composition information defining a composition of the subject for each virtual camerawork, the composition information corresponding to the designated virtual camerawork to determine the size of a trimming frame for trimming part of the captured video; a trimming frame control process that uses the positional relationship of the specific person's whole body on the captured video and the composition information corresponding to the designated virtual camerawork to control the position, on the captured video, of the trimming frame corresponding to the designated virtual camerawork; a trimming process that trims the video inside the trimming frame from the captured video and outputs the trimmed video as virtual camerawork video produced by the virtual camerawork; a display process that displays the captured video and at least two virtual camerawork videos corresponding to at least two specific persons or at least two virtual cameraworks; and a switching process that outputs the one virtual camerawork video designated by the user from among the displayed at least two virtual camerawork videos.

One aspect of the present invention is a video processing method in which a computer: acquires captured video shot by a camera; receives designation of a specific person to be the subject; receives designation of a virtual camerawork, which trims part of the captured video to realize a video effect of a virtual camera operation; recognizes persons in the captured video; uses the captured video to determine the skeletons of the persons in the video; identifies the specific person in the video using the person recognition result and, using the specific person's skeleton from among the determined skeletons, specifies the positional relationship of the specific person's body parts on the captured video; uses the positional relationship of the specific person's body parts on the captured video and, from among composition information defining a composition for each virtual camerawork, the composition information corresponding to the designated virtual camerawork to determine the size of a trimming frame for trimming part of the captured video; uses the positional relationship of the specific person's body parts on the captured video and the composition information corresponding to the designated virtual camerawork to control the position, on the captured video, of the trimming frame corresponding to the designated virtual camerawork; and trims the video inside the trimming frame from the captured video and outputs the trimmed video as virtual camerawork video produced by the virtual camerawork.

One aspect of the present invention is a switching method in which a computer: acquires captured video shot by a camera; receives designation of a specific person to be the subject; receives designation of a virtual camerawork, which trims part of the captured video to realize a video effect of a virtual camera operation; recognizes persons in the captured video; uses the captured video to determine the skeletons of the persons in the video; identifies the specific person in the video using the person recognition result and, using the specific person's skeleton from among the determined skeletons, specifies the positional relationship of the specific person's whole body on the captured video; uses the positional relationship of the specific person's whole body on the captured video and, from among composition information defining a composition of the subject for each virtual camerawork, the composition information corresponding to the designated virtual camerawork to determine the size of a trimming frame for trimming part of the captured video; uses the positional relationship of the specific person's whole body on the captured video and the composition information corresponding to the designated virtual camerawork to control the position, on the captured video, of the trimming frame corresponding to the designated virtual camerawork; trims the video inside the trimming frame from the captured video and outputs the trimmed video as virtual camerawork video produced by the virtual camerawork; displays the captured video and at least two virtual camerawork videos corresponding to at least two specific persons or at least two virtual cameraworks; and outputs the one virtual camerawork video designated by the user from among the displayed at least two virtual camerawork videos.

The present invention can automatically generate video with a wide variety of camera work.

FIG. 1 is a block diagram showing the overall configuration of the first embodiment.
FIG. 2 is a block diagram of the video processing device 2.
FIG. 3 is a diagram for explaining face recognition.
FIG. 4 is a diagram for explaining skeleton determination.
FIG. 5 is a diagram for explaining how the person identification unit 26 specifies the positional relationship of the specific person's body parts on the captured video.
FIG. 6 is a block diagram of the video processing device 2 configured as a computer system.
FIG. 7 is a diagram for explaining the display unit of the tablet terminal (computer 100).
FIG. 8 is a flowchart of the overall operation of the video processing device 2.
FIG. 9 is a diagram for explaining the overall operation of the video processing device 2.
FIG. 10 is a diagram for explaining the first trimming frame.
FIG. 11 is a diagram for explaining the second trimming frame.
FIG. 12 is a diagram for explaining the third trimming frame.
FIG. 13 is a diagram for explaining the fourth trimming frame.
FIGS. 14 to 27 are diagrams for explaining the operation of the first embodiment.
FIGS. 28 and 29 are diagrams for explaining the operation of the modification of the first embodiment.
FIG. 30 is a diagram for explaining a modification of the first embodiment.
FIGS. 31 and 32 are diagrams for explaining the operation of the modification of the first embodiment.
FIG. 33 is a block diagram of the information processing device 2 of the modification of the first embodiment.
FIG. 34 is a block diagram of the information processing device 2 of the second embodiment.
FIGS. 35 to 37 are diagrams for explaining the operation of the second embodiment.
FIG. 38 is a block diagram of the switching system of the third embodiment.
FIGS. 39 to 41 are diagrams for explaining the operation of the third embodiment.

<First Embodiment>
A first embodiment will be described.

FIG. 1 is a block diagram showing the overall configuration of the first embodiment. In FIG. 1, reference numeral 1 denotes a camera, 2 a video processing device, and 3 a display device.

Camera 1 is a camera that shoots a program. In principle, a single camera 1 shoots the program with an angle of view wide enough to capture all of the subjects (for example, the performers of the program). As described later, this embodiment generates virtual camerawork video by trimming part of the video shot by camera 1 (hereinafter, captured video), so camera 1 is preferably a 4K or 8K camera capable of shooting high-quality video, although it is not limited to these.

The video processing device 2 receives the captured video from camera 1 and outputs the video (hereinafter, virtual camerawork video) that would be obtained if the camera work designated by the user (hereinafter, virtual camerawork) were performed on the subject. Virtual camerawork realizes video effects obtained by operating a virtual camera (for example, changing the composition or angle of view of the video). Examples include the up shot, bust shot, waist shot, full shot, pan (left and right), tilt (up and down), roll, zoom in, and zoom out. Virtual camerawork video is obtained by designating a virtual camerawork for a subject and trimming (cutting out) the video corresponding to that virtual camerawork from the input captured video.
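The trimming step described above can be illustrated with a minimal sketch. It assumes one frame of captured video is held as a list of pixel rows and the trimming frame is a left/top/width/height rectangle in captured-video pixels. The `TrimFrame` name and the clamping behaviour are illustrative assumptions, not taken from this specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrimFrame:
    """Hypothetical trimming frame in captured-video pixel coordinates."""
    left: int
    top: int
    width: int
    height: int


def clamp_to_video(frame: TrimFrame, video_w: int, video_h: int) -> TrimFrame:
    """Shift (and, if needed, shrink) the frame so it lies inside the video."""
    width = min(frame.width, video_w)
    height = min(frame.height, video_h)
    left = max(0, min(frame.left, video_w - width))
    top = max(0, min(frame.top, video_h - height))
    return TrimFrame(left, top, width, height)


def trim(image, frame: TrimFrame):
    """Cut the pixels inside the trimming frame out of one captured frame.

    `image` is a list of rows, each row a list of pixel values.
    """
    video_h = len(image)
    video_w = len(image[0]) if video_h else 0
    f = clamp_to_video(frame, video_w, video_h)
    return [row[f.left:f.left + f.width] for row in image[f.top:f.top + f.height]]


# Toy 8x6 "captured frame" whose pixel value encodes its own (x, y) position.
captured = [[(x, y) for x in range(8)] for y in range(6)]
trimmed = trim(captured, TrimFrame(left=2, top=1, width=4, height=3))
```

Applying `trim` to every frame with a trimming frame that changes over time would give pan, tilt, or zoom-like effects; how the frame is sized and moved is the subject of the units described below.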

The display device 3 is a display that shows the captured video and the virtual camerawork video output from the video processing device 2. The display device 3 need not be display-only; it may be a display that also has a touch panel function, as in a tablet terminal.

Next, the video processing device 2 will be described. FIG. 2 is a block diagram of the video processing device 2.

The video processing device 2 includes a captured video input unit 20, a specific person designation reception unit 21, a virtual camerawork designation reception unit 22, a face image dictionary unit 23, a face recognition unit 24, a skeleton determination unit 25, a person identification unit 26, a composition database 27, a trimming frame size determination unit 28, a trimming control unit 29, and a trimming unit 30.

The captured video input unit 20 receives the captured video shot by camera 1.

The specific person designation reception unit 21 receives from the user the designation of the person who will be the subject of the virtual camerawork. It can be realized not only by elements that the user operates directly with a finger, such as a touch pad, mouse, or keyboard, but also by a microphone that picks up the user's voice or by elements that detect motion or posture, such as an acceleration sensor, angular velocity sensor, tilt sensor, or geomagnetic sensor.

The person who is the subject of the virtual camerawork may be any person appearing in the captured video. The person who will be the subject of the virtual camerawork (hereinafter, the specific person) can be designated by, for example, entering the specific person's name directly, selecting from names prepared in advance, voice input, or selecting a person in the displayed captured video with input means such as a touch panel. The methods are not limited to these.

The virtual camerawork designation reception unit 22 receives the designation of the virtual camerawork from the user. It can be realized not only by elements that the user operates directly with a finger, such as a touch pad, mouse, or keyboard, but also by a microphone that picks up the user's voice or by elements that detect motion or posture, such as an acceleration sensor, angular velocity sensor, tilt sensor, or geomagnetic sensor.

The virtual camerawork can be designated by, for example, entering the type of camera work directly, selecting from camera work prepared in advance, or voice input. The methods are not limited to these.

The face image dictionary unit 23 is a storage unit in which image data of the faces of the subjects shot by camera 1, for example the faces of the program's performers, is registered. The image data is, for example, feature data of each subject's face.

The face recognition unit 24 uses the face image dictionary unit 23 to recognize the persons in the captured video. Any recognition method may be used, such as pattern matching or recognition by an algorithm obtained through machine learning. Deep learning is a typical machine learning method, but the method is not limited to it. Using such a method, the face recognition unit 24 recognizes the persons in the captured video. Specifically, for captured video showing three performers as in FIG. 3, it recognizes the faces of the three performers. In the example of FIG. 3, person A, person B, and person C in the captured video are recognized.
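As one hedged illustration of matching detected faces against registered feature data, the sketch below compares feature vectors by cosine similarity and accepts the best match above a threshold. The specification leaves the recognition method open, so the similarity measure, the threshold value, the three-element vectors, and the `recognize` helper are all assumptions made for illustration. Feature extraction itself is outside the sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recognize(face_feature, dictionary, threshold=0.8):
    """Return the registered name most similar to the feature, or None."""
    best_name, best_score = None, threshold
    for name, registered in dictionary.items():
        score = cosine_similarity(face_feature, registered)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy dictionary standing in for the registered facial feature data.
face_dictionary = {
    "A": [0.9, 0.1, 0.0],
    "B": [0.1, 0.9, 0.1],
    "C": [0.0, 0.2, 0.9],
}
# Toy features for three faces detected in one frame of captured video.
detected = [[0.88, 0.15, 0.02], [0.12, 0.85, 0.15], [0.05, 0.25, 0.85]]
names = [recognize(feature, face_dictionary) for feature in detected]
```

A face whose best similarity stays below the threshold is reported as `None`, which corresponds to a person who is not registered in the dictionary.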

The skeleton determination unit 25 determines the skeleton of each person appearing in the captured video. Skeleton determination methods include, for example, OpenPose, VisionPose, and tf-pose-estimation, but are not limited to these. For example, for captured video showing three performers as in FIG. 4, the skeleton determination unit 25 determines the skeletons of the three performers.

The person identification unit 26 uses the result of the face recognition unit 24 to identify, in the captured video, the specific person received by the specific person designation reception unit 21. It then uses the skeleton determination result of the skeleton determination unit 25 for that specific person to specify the positional relationship of the specific person's body parts on the captured video.
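One way to connect the two results, sketched here under stated assumptions, is to pick the skeleton whose head keypoint falls inside the face box recognized for the designated person. The specification does not say how the association is made; the face-box format, the `nose` keypoint name, and the `find_skeleton_of` helper are hypothetical.

```python
def find_skeleton_of(person, faces, skeletons, head_keypoint="nose"):
    """Pick the skeleton whose head keypoint lies inside the person's face box.

    faces:     {name: (left, top, right, bottom)} from face recognition
    skeletons: list of {keypoint_name: (x, y)} from skeleton determination
    Returns None if the person was not recognized or no skeleton matches.
    """
    if person not in faces:
        return None
    left, top, right, bottom = faces[person]
    for skeleton in skeletons:
        head = skeleton.get(head_keypoint)
        if head is None:
            continue
        x, y = head
        if left <= x <= right and top <= y <= bottom:
            return skeleton
    return None


# Toy results for a frame showing persons A, B, and C.
faces = {
    "A": (300, 200, 420, 340),
    "B": (1850, 380, 1990, 540),
    "C": (3300, 260, 3420, 400),
}
skeletons = [
    {"nose": (3360, 330), "neck": (3360, 430)},
    {"nose": (1920, 470), "neck": (1920, 560)},
    {"nose": (360, 270), "neck": (360, 370)},
]
skeleton_b = find_skeleton_of("B", faces, skeletons)
```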

人物特定部２６による特定人物の部位の撮影映像上の位置関係の特定について説明する。図５は人物特定部２６による特定人物の部位の撮影映像上の位置関係の特定を説明するための図である。図５は特定人物が人物Ｂの場合を示しており、人物Ｂの部位の骨格判定の結果を示している。図５では、骨格の頂点から足の骨格の最下部までを人物Ｂの縦方向の全身としている。また、縦方向のうち、骨格の頂点から骨格の首部までを縦方向の人物Ｂの顔としている。また、縦方向のうち、骨格の首部から骨格の股関節に対応する部分までを人物Ｂの上半身としている。また、縦方向のうち、骨格の股関節に対応する部分から足の先端部までを人物Ｂの下半身としている。また、骨格が横方向に占める最大の範囲を人物Ｂの全身の横幅としている。また、顔の骨格のうち横方向に占める最大の範囲を人物Ｂの顔の横幅としている。また、縦方向の右手先から右腕の肘関節までの長さを縦右前腕部の長さとしている。横方向の右手先から右腕の肘関節までの長さを横右前腕部の長さとしている。また、縦方向の左手先から左腕の肘関節までの長さを縦左前腕部の長さとしている。横方向の左手先から左腕の肘関節までの長さを横左前腕部の長さとしている。また、骨格の首部を通る垂線を、人物Ｂの骨格中心線としている。そして、人物特定部２６は、特定人物の各部位の位置及び長さを、撮影映像の画素位置に対応させて特定する。以上の人物特定部２６による特定人物の部位の判定は一例であり、適時、判定する部位を変更しても良い。 The identification of the positional relationship of the parts of the specific person on the captured image by the person identification unit 26 will now be described. FIG. 5 is a diagram for explaining this identification. FIG. 5 shows the case where the specific person is person B, together with the result of the skeleton determination for the parts of person B. In FIG. 5, the span from the top of the skeleton to the lowest point of the leg skeleton is defined as the whole body of person B in the vertical direction. Within the vertical direction, the span from the top of the skeleton to the neck of the skeleton is defined as the face of person B, the span from the neck of the skeleton to the part corresponding to the hip joint is defined as the upper body of person B, and the span from the part corresponding to the hip joint to the tip of the foot is defined as the lower body of person B. The maximum range that the skeleton occupies in the horizontal direction is defined as the width of the whole body of person B, and the maximum range that the facial skeleton occupies in the horizontal direction is defined as the width of the face of person B. The vertical distance from the tip of the right hand to the elbow joint of the right arm is defined as the vertical right forearm length, and the horizontal distance between the same two points is defined as the horizontal right forearm length. Likewise, the vertical distance from the tip of the left hand to the elbow joint of the left arm is defined as the vertical left forearm length, and the horizontal distance between the same two points is defined as the horizontal left forearm length. The vertical line passing through the neck of the skeleton is defined as the skeleton center line of person B. The person identification unit 26 then identifies the position and length of each part of the specific person in correspondence with pixel positions in the captured image. The parts determined by the person identification unit 26 as described above are only an example, and the parts to be determined may be changed as appropriate.
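The part definitions given for FIG. 5 can be illustrated with a short sketch. It assumes that the skeleton determination result is available as a mapping from joint names to pixel coordinates with the y axis pointing downward. The joint names (`head_top`, `neck`, `hip`, and so on), the `BodyMetrics` type, and both functions are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Point = Tuple[float, float]  # (x, y) pixel position, y grows downward


@dataclass
class BodyMetrics:
    full_height: float        # top of the skeleton to the lowest foot point
    face_height: float        # top of the skeleton to the neck
    upper_body_height: float  # neck to the hip joint
    lower_body_height: float  # hip joint to the lowest foot point
    body_width: float         # widest horizontal extent of the skeleton
    center_line_x: float      # x position of the vertical line through the neck


def measure_body(keypoints: Dict[str, Point]) -> BodyMetrics:
    """Derive the part lengths described for FIG. 5 from skeleton keypoints."""
    xs = [x for x, _ in keypoints.values()]
    top_y = keypoints["head_top"][1]
    neck_x, neck_y = keypoints["neck"]
    hip_y = keypoints["hip"][1]
    foot_y = max(keypoints["left_foot"][1], keypoints["right_foot"][1])
    return BodyMetrics(
        full_height=foot_y - top_y,
        face_height=neck_y - top_y,
        upper_body_height=hip_y - neck_y,
        lower_body_height=foot_y - hip_y,
        body_width=max(xs) - min(xs),
        center_line_x=neck_x,
    )


def forearm_extent(hand: Point, elbow: Point) -> Tuple[float, float]:
    """Return the (vertical, horizontal) forearm lengths as axis-aligned extents."""
    return abs(hand[1] - elbow[1]), abs(hand[0] - elbow[0])
```

All values are in pixels of the image that was analyzed, which matches the statement that positions and lengths are identified in correspondence with pixel positions.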

尚、顔認識部２４、骨格判定部２５及び人物特定部２６による処理に用いる画像は、かならずしも撮影映像である必要はない。撮影映像よりも解像度の低い映像を用いることも可能である。撮影映像よりも解像度の低い映像を用いることにより、顔認識部２４、骨格判定部２５及び人物特定部２６の処理速度が速くなるという効果を得られる。 The images used for the processing by the face recognition unit 24, the skeleton determination unit 25, and the person identification unit 26 do not necessarily have to be captured images. It is also possible to use an image with a resolution lower than that of the captured image. By using an image with a resolution lower than that of the captured image, the processing speed of the face recognition unit 24, the skeleton determination unit 25, and the person identification unit 26 can be increased.
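If recognition runs on a lower-resolution copy, the positions it yields presumably have to be mapped back to the captured image before a trimming frame is placed. The specification does not describe this step; the following is only a minimal sketch under that assumption, and the function name and the sizes in the usage note are hypothetical.

```python
from typing import Tuple


def to_capture_coordinates(
    point: Tuple[float, float],
    analysis_size: Tuple[int, int],
    capture_size: Tuple[int, int],
) -> Tuple[float, float]:
    """Scale an (x, y) position found on the low-resolution image to the captured image.

    Sizes are given as (width, height) in pixels.
    """
    scale_x = capture_size[0] / analysis_size[0]
    scale_y = capture_size[1] / analysis_size[1]
    return point[0] * scale_x, point[1] * scale_y
```

For example, a neck position of (480, 270) found on a 960x540 analysis image corresponds to (1920.0, 1080.0) on a 3840x2160 captured image.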

構図データベース２７は、仮想カメラワークの種類毎の構図情報が記憶されたデータベースである。構図情報は、被写体をどのような構図で映すかの情報である。具体的には、構図情報は、被写体フレームの大きさ（例えば、縦又は横の長さ）、使用するトリミングフレームの種類、トリミングフレームの位置（例えば、骨格中心線との位置関係）、トリミングフレームの移動の制御情報等がある。 The composition database 27 stores composition information for each type of virtual camerawork. Composition information specifies how the subject is to be framed. Specifically, it includes the size of the subject frame (e.g., its vertical or horizontal length), the type of trimming frame to be used, the position of the trimming frame (e.g., its positional relationship to the skeleton center line), control information for moving the trimming frame, and the like.
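One possible in-memory shape for such a database is sketched below. The key layout, the `CompositionInfo` type, and the field names are assumptions made for illustration. The ratios and trimming frame types are the ones given in the worked examples later in this description (close-up shot, waist shot, and full shot for a single person).

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class CompositionInfo:
    reference_part: str  # body part whose length sets the subject frame height
    height_ratio: float  # subject frame height = ratio * length of the reference part
    frame_type: int      # which trimming frame (1 to 4) to use
    alignment: str       # how the trimming frame is positioned


# Keyed by (virtual camerawork, number of specific persons).
COMPOSITION_DB: Dict[Tuple[str, int], CompositionInfo] = {
    ("close-up shot", 1): CompositionInfo("face", 1.1, 1, "center_line"),
    ("waist shot", 1): CompositionInfo("face", 1.7, 1, "center_line"),
    ("full shot", 1): CompositionInfo("whole_body", 1.0, 2, "center_line"),
}


def lookup_composition(camerawork: str, person_count: int) -> CompositionInfo:
    """Read the composition information for the designated virtual camerawork."""
    return COMPOSITION_DB[(camerawork, person_count)]
```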

トリミングフレームサイズ決定部２８は、指定された仮想カメラワークと、特定人物の骨格と、構図データベース２７の構図情報とから、撮影映像から被写体を切り出す（トリミング）ためのトリミングフレームのサイズを決定する。 A trimming frame size determination unit 28 determines the size of a trimming frame for cutting out (trimming) a subject from a photographed video from the specified virtual camera work, the skeleton of a specific person, and the composition information of the composition database 27.

トリミング制御部２９は、トリミングフレームサイズ決定部２８により決定されたサイズのトリミングフレームの映像上の位置を制御する。 The trimming control unit 29 controls the position of the trimming frame of the size determined by the trimming frame size determination unit 28 on the video.

トリミング部３０は、撮影映像から、トリミング制御部２９により制御されて映像上の所定の位置に配置されたトリミングフレーム内の映像をトリミングする。 The trimming unit 30 trims the image within the trimming frame arranged at a predetermined position on the image under the control of the trimming control unit 29 from the captured image.
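Trimming itself amounts to cutting a rectangle out of each captured frame. A minimal sketch follows, assuming a frame is held as a list of pixel rows. Clamping the rectangle to the picture bounds is an added assumption and is not stated in the specification.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def trim(frame: Sequence[Sequence[T]], left: int, top: int, width: int, height: int) -> List[List[T]]:
    """Return the pixels inside the trimming frame, clamped to the picture bounds."""
    rows = len(frame)
    cols = len(frame[0]) if rows else 0
    x0, y0 = max(0, left), max(0, top)
    x1, y1 = min(cols, left + width), min(rows, top + height)
    return [list(row[x0:x1]) for row in frame[y0:y1]]
```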

上述した映像処理装置２は、具体的には、各種の演算処理等を行うプロセッサを有するコンピュータシステムによって実現することができる。図６はコンピュータシステムによって構成された映像処理装置２のブロック図である。 Specifically, the video processing device 2 described above can be realized by a computer system having a processor that performs various kinds of arithmetic processing. FIG. 6 is a block diagram of the video processing device 2 configured by a computer system.

映像処理装置２は、図６に示す如く、プロセッサ１０１、メモリ（ＲＯＭやＲＡＭ）１０２、記憶装置（ハードディスク、半導体ディスクなど）１０３、入力装置（キーボード、マウス、タッチパネルなど）１０４、通信装置１０５を有するコンピュータ１００により構成することができる。 As shown in FIG. 6, the video processing device 2 can be configured by a computer 100 having a processor 101, a memory (ROM or RAM) 102, a storage device (hard disk, semiconductor disk, etc.) 103, an input device (keyboard, mouse, touch panel, etc.) 104, and a communication device 105.

映像処理装置２は、記憶装置１０３に格納されたプログラムがメモリ１０２にロードされ、プロセッサ１０１により実行されることにより、撮影映像入力処理１１０、特定人物指定受付処理１１１、仮想カメラワーク指定受付処理１１２、顔認識処理１１３、骨格判定処理１１４、人物特定処理１１５、トリミングフレームサイズ決定処理１１６、トリミング制御処理１１７及びトリミング処理１１８が実現されるものである。ここで、撮影映像入力処理１１０は撮影映像入力部２０に対応し、特定人物指定受付処理１１１は特定人物指定受付部２１に対応し、仮想カメラワーク指定受付処理１１２は仮想カメラワーク指定受付部２２に対応し、顔認識処理１１３は顔認識部２４に対応し、骨格判定処理１１４は骨格判定部２５に対応し、人物特定処理１１５は人物特定部２６に対応し、トリミングフレームサイズ決定処理１１６はトリミングフレームサイズ決定部２８に対応し、トリミング制御処理１１７はトリミング制御部２９に対応し、トリミング処理１１８はトリミング部３０に対応する。また、顔画像辞書部２３及び構図データベース２７は、記憶装置１０３に対応する。尚、顔画像辞書部２３及び構図データベース２７は、コンピュータ１００と物理的に外部に設けられ、ＬＡＮ等のネットワークを介してコンピュータ１００と接続されていても良い。 In the video processing device 2, a program stored in the storage device 103 is loaded into the memory 102 and executed by the processor 101, thereby realizing a captured image input process 110, a specific person designation reception process 111, a virtual camerawork designation reception process 112, a face recognition process 113, a skeleton determination process 114, a person identification process 115, a trimming frame size determination process 116, a trimming control process 117, and a trimming process 118. Here, the captured image input process 110 corresponds to the captured image input unit 20, the specific person designation reception process 111 corresponds to the specific person designation reception unit 21, the virtual camerawork designation reception process 112 corresponds to the virtual camerawork designation reception unit 22, the face recognition process 113 corresponds to the face recognition unit 24, the skeleton determination process 114 corresponds to the skeleton determination unit 25, the person identification process 115 corresponds to the person identification unit 26, the trimming frame size determination process 116 corresponds to the trimming frame size determination unit 28, the trimming control process 117 corresponds to the trimming control unit 29, and the trimming process 118 corresponds to the trimming unit 30. The face image dictionary unit 23 and the composition database 27 correspond to the storage device 103. The face image dictionary unit 23 and the composition database 27 may instead be provided physically outside the computer 100 and connected to the computer 100 via a network such as a LAN.

次に、第１の実施の形態における映像処理装置２の動作を説明する。 Next, the operation of the video processing device 2 according to the first embodiment will be described.

まず、映像処理装置２の全体の動作を説明する。 First, the overall operation of the video processing device 2 will be described.

以下の説明では、情報処理装置２と表示装置３とが一体に構成されたタブレット端末（コンピュータ１００）を想定して説明する。そして、タブレット端末（コンピュータ１００）の表示部（表示装置３）には、図７に示す如く、撮影映像が表示される表示部４０と、仮想カメラワーク映像が表示される表示部４１と、仮想カメラワークを指定する仮想カメラワーク指定受付部４２（仮想カメラワーク指定受付部２２）と、被写体となる特定人物を指定する特定人物指定受付部４３（特定人物指定受付部２１）とが表示されるものとする。尚、指定できる仮想カメラワークは、ロングショット、アップショット、バストショット、ウェストショット、ティルトアップ、ティルトダウン、パンである。また、特定人物指定受付部４３（特定人物指定受付部２１）は、顔認識部２４により認識された人物の名前が表示されるものとする。尚、人物の名前に代えて又は加えて、顔認識部２４により認識された人物の顔を表示しても良い。 The following description assumes a tablet terminal (computer 100) in which the information processing device 2 and the display device 3 are integrated. As shown in FIG. 7, the display (display device 3) of the tablet terminal (computer 100) shows a display unit 40 on which the captured image is displayed, a display unit 41 on which the virtual camerawork image is displayed, a virtual camerawork designation reception unit 42 (virtual camerawork designation reception unit 22) for designating the virtual camerawork, and a specific person designation reception unit 43 (specific person designation reception unit 21) for designating the specific person to be the subject. The virtual camerawork that can be designated is long shot, close-up shot, bust shot, waist shot, tilt up, tilt down, and pan. The specific person designation reception unit 43 (specific person designation reception unit 21) displays the names of the persons recognized by the face recognition unit 24. Instead of or in addition to the names, the faces of the persons recognized by the face recognition unit 24 may be displayed.

図８は映像処理装置２の全体的な動作フローチャートであり、図９は映像処理装置２の全体的な動作を説明するための図である。 FIG. 8 is a flowchart of the overall operation of the video processing device 2, and FIG. 9 is a diagram for explaining that overall operation.

撮影映像入力部２０は撮影映像を入力する（Ｓｔｅｐ１）。入力された撮影映像は、表示部４０に表示される。 The captured image input unit 20 inputs a captured image (Step 1). The captured image that has been input is displayed on the display unit 40 .

顔認識部２４は、撮影映像上の人物を認識する（Ｓｔｅｐ２）。図９の例では、人物Ａ、人物Ｂ及び人物Ｃが認識され、その結果として、図９の例では、特定人物指定部４３として、認識された人物Ａ、人物Ｂ及び人物Ｃが表示されている。 The face recognition unit 24 recognizes the persons in the captured image (Step 2). In the example of FIG. 9, person A, person B, and person C are recognized, and as a result the recognized person A, person B, and person C are displayed as the specific person designation unit 43.

骨格判定部２５は、撮影映像上の人物の骨格を判定する（Ｓｔｅｐ３）。図９の例では、人物Ａ、人物Ｂ及び人物Ｃの骨格を判定している。 The skeleton determination unit 25 determines the skeleton of the person on the captured image (Step 3). In the example of FIG. 9, the skeletons of person A, person B, and person C are determined.

ユーザは、表示部４０に表示されている撮影映像を見ながら、特定人物指定受付部４３に表示されている人物をタッチすることにより、被写体となる特定人物を指定する（Ｓｔｅｐ４）。 The user designates a specific person to be a subject by touching the person displayed in the specific person designation acceptance unit 43 while watching the captured image displayed on the display unit 40 (Step 4).

人物特定部２６は、指定された特定人物を認識された人物から特定し、特定人物の部位の撮影映像における長さ、位置を判定する（Ｓｔｅｐ５）。 The person identification unit 26 identifies the designated specific person from the recognized persons, and determines the length and position of the part of the specific person in the captured image (Step 5).

ユーザは、仮想カメラワークを、仮想カメラワーク指定受付部４２に表示されている仮想カメラワークのいずれかをタッチすることにより、仮想カメラワークを指定する（Ｓｔｅｐ６）。 The user designates a virtual camerawork by touching any one of the virtual camerawork displayed in the virtual camerawork designation receiving section 42 (Step 6).

トリミングフレームサイズ決定部２８は、構図データベース２７から指定された仮想カメラワークの構図情報を読み出し、構図情報及び特定人物の部位の撮影映像における長さ、位置を用いて、トリミングフレームサイズを決定する（Ｓｔｅｐ７）。 The trimming frame size determination unit 28 reads the composition information of the designated virtual camera work from the composition database 27, and determines the trimming frame size using the composition information and the length and position of the specific person's part in the captured video ( Step 7).

トリミング制御部２９は、仮想カメラワークの構図情報を読み出し、構図情報及び特定人物の部位の撮影映像における長さ、位置を用いて、トリミングフレームサイズ決定部２８で決定されたトリミングフレームの撮影映像上における位置を制御する（Ｓｔｅｐ８）。 The trimming control unit 29 reads the composition information of the virtual camerawork and, using the composition information and the length and position of the specific person's parts in the captured image, controls the position on the captured image of the trimming frame determined by the trimming frame size determination unit 28 (Step 8).

トリミング部３０は、撮影映像からトリミングフレーム内の映像をトリミングし（Ｓｔｅｐ９）、トリミングした映像を仮想カメラワーク映像として出力し、表示部４１に表示する（Ｓｔｅｐ１０）。 The trimming unit 30 trims the image within the trimming frame from the captured image (Step 9), outputs the trimmed image as a virtual camerawork image, and displays it on the display unit 41 (Step 10).

以上が、映像処理装置２の概略の動作である。 The outline of the operation of the video processing device 2 has been described above.

次に、トリミングフレームサイズ決定部２８、トリミング制御部２９、トリミング部３０の具体的な動作を説明する。 Next, specific operations of the trimming frame size determination unit 28, the trimming control unit 29, and the trimming unit 30 will be described.

具体的な動作の説明に先立って、トリミングに使用されるトリミングフレームについて説明する。第１の実施の形態では、以下の第１のトリミングフレーム、第２のトリミングフレーム、第３のトリミングフレーム及び第４のトリミングフレームが用いられる。 Before describing specific operations, a trimming frame used for trimming will be described. In the first embodiment, the following first trimming frame, second trimming frame, third trimming frame and fourth trimming frame are used.

（１）第１のトリミングフレーム
第１のトリミングフレームは、アップショット、バストショット、ウェストショット等に用いられ、フレームの下部にマージン（余白）を取る必要がない仮想カメラワークに用いられる。具体的な第１のトリミングフレームを、図１０に示す。第１のトリミングフレームの特徴は以下の通りである。
・アスペクト比：１６：９
・上部マージン：縦のフレーム長に対して、０．１から０．１５倍の長さ（好ましくは、０．１５倍の長さ）
・左右部マージン：横のフレーム長に対して、０．０５から０．１倍の長さ（好ましくは、０．１倍の長さ）
・下部マージン：なし
（２）第２のトリミングフレーム
第２のトリミングフレームは、ロングショット等に用いられ、フレームの上下左右にマージン（余白）を取る必要がある仮想カメラワークに用いられる。具体的な第２のトリミングフレームを、図１１に示す。第２のトリミングフレームの特徴は以下の通りである。
・アスペクト比：１６：９
・上部マージン：縦のフレーム長に対して、０．１から０．１５倍の長さ（好ましくは、０．１５倍の長さ）
・左右部マージン：縦のフレーム長に対して、０．０５から０．１倍の長さ（好ましくは、０．１倍の長さ）
・下部マージン：縦のフレーム長に対して、０．０５から０．１倍の長さ（好ましくは、０．１倍の長さ）
（３）第３のトリミングフレーム
第３のトリミングフレームは、ティルトアップ又はティルトダウン等の最初又は最後のフレームに用いられ、フレームの上部にマージン（余白）を取る必要がない仮想カメラワークに用いられる。具体的な第３のトリミングフレームを、図１２に示す。第３のトリミングフレームの特徴は以下の通りである。
・アスペクト比：１６：９
・上部マージン：なし
・左右部マージン：横のフレーム長に対して、０．０５から０．１倍の長さ（好ましくは、０．１倍の長さ）
・下部マージン：縦のフレーム長に対して、０．２から０．２５倍の長さ（好ましくは、０．２５倍の長さ）
（４）第４のトリミングフレーム
第４のトリミングフレームは、ティルトアップ、ティルトダウンの途中のフレームに用いられ、フレームの上下部にマージン（余白）を取る必要がない仮想カメラワークに用いられる。具体的な第４のトリミングフレームを、図１３に示す。第４のトリミングフレームの特徴は以下の通りである。
・アスペクト比：１６：９
・上部マージン：なし
・左右部マージン：横のフレーム長に対して、０．０５から０．１倍の長さ（好ましくは、０．１倍の長さ）
・下部マージン：なし
第１のトリミングフレーム、第２のトリミングフレーム、第３のトリミングフレーム及び第４のトリミングフレームの特徴は、トリミングフレームサイズ決定部２８に設定されている。
(1) First trimming frame
The first trimming frame is used for close-up shots, bust shots, waist shots, and the like, that is, for virtual camerawork that does not require a margin at the bottom of the frame. A specific first trimming frame is shown in FIG. 10. The features of the first trimming frame are as follows.
・Aspect ratio: 16:9
・Top margin: 0.1 to 0.15 times the vertical frame length (preferably 0.15 times)
・Left and right margins: 0.05 to 0.1 times the horizontal frame length (preferably 0.1 times)
・Bottom margin: none
(2) Second trimming frame
The second trimming frame is used for long shots and the like, that is, for virtual camerawork that requires margins at the top, bottom, left, and right of the frame. A specific second trimming frame is shown in FIG. 11. The features of the second trimming frame are as follows.
・Aspect ratio: 16:9
・Top margin: 0.1 to 0.15 times the vertical frame length (preferably 0.15 times)
・Left and right margins: 0.05 to 0.1 times the vertical frame length (preferably 0.1 times)
・Bottom margin: 0.05 to 0.1 times the vertical frame length (preferably 0.1 times)
(3) Third trimming frame
The third trimming frame is used for the first or last frame of a tilt up, tilt down, or the like, that is, for virtual camerawork that does not require a margin at the top of the frame. A specific third trimming frame is shown in FIG. 12. The features of the third trimming frame are as follows.
・Aspect ratio: 16:9
・Top margin: none
・Left and right margins: 0.05 to 0.1 times the horizontal frame length (preferably 0.1 times)
・Bottom margin: 0.2 to 0.25 times the vertical frame length (preferably 0.25 times)
(4) Fourth trimming frame
The fourth trimming frame is used for the intermediate frames of a tilt up or tilt down, that is, for virtual camerawork that does not require margins at the top and bottom of the frame. A specific fourth trimming frame is shown in FIG. 13. The features of the fourth trimming frame are as follows.
・Aspect ratio: 16:9
・Top margin: none
・Left and right margins: 0.05 to 0.1 times the horizontal frame length (preferably 0.1 times)
・Bottom margin: none
The features of the first, second, third, and fourth trimming frames are set in the trimming frame size determination unit 28.
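The four trimming frames can be held as plain data. The sketch below is one hypothetical representation: the `TrimmingFrameSpec` type and its field names are assumptions, while the numeric ranges are those listed for each frame. The lower bound of the second frame's bottom margin is taken as 0.05, and the second frame's side margins are recorded relative to the vertical frame length, as the list for that frame states.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Range = Tuple[float, float]  # (minimum, preferred) fraction of a frame length


@dataclass(frozen=True)
class TrimmingFrameSpec:
    top: Optional[Range]     # fraction of the vertical frame length, None = no margin
    side: Optional[Range]    # left and right margin, applied on each side
    bottom: Optional[Range]  # fraction of the vertical frame length, None = no margin
    side_basis: str = "width"          # frame length the side margin is measured against
    aspect: Tuple[int, int] = (16, 9)  # every trimming frame is 16:9


FRAME_SPECS: Dict[int, TrimmingFrameSpec] = {
    # First frame: close-up, bust, and waist shots (no bottom margin).
    1: TrimmingFrameSpec(top=(0.10, 0.15), side=(0.05, 0.10), bottom=None),
    # Second frame: long shots (margins on all four sides).
    2: TrimmingFrameSpec(top=(0.10, 0.15), side=(0.05, 0.10), bottom=(0.05, 0.10),
                         side_basis="height"),
    # Third frame: first or last frame of a tilt (no top margin).
    3: TrimmingFrameSpec(top=None, side=(0.05, 0.10), bottom=(0.20, 0.25)),
    # Fourth frame: intermediate frames of a tilt (no top or bottom margin).
    4: TrimmingFrameSpec(top=None, side=(0.05, 0.10), bottom=None),
}
```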

次に、具体的な仮想カメラワークを想定したトリミングフレームサイズ決定部２８、トリミング制御部２９、トリミング部３０の動作を説明する。 Next, operations of the trimming frame size determination unit 28, the trimming control unit 29, and the trimming unit 30 assuming specific virtual camera work will be described.

１．特定人物が「人物Ａ（一人）」であり、仮想カメラワークが「アップショット」の場合の具体的な動作
以下の説明では、映像処理装置２には、図１４に示すような撮影映像が入力されており、特定人物指定受付部２１が受け付けた特定人物（仮想カメラワークの被写体）が「人物Ａ」であり、仮想カメラワーク指定受付部２２により指定された仮想カメラワークが「アップショット」である場合を説明する。また、各部が読み出す具体的な構図情報は、構図データベース２７に予め記憶されているものとする。
1. Specific operation when the specific person is "person A (one person)" and the virtual camerawork is "close-up shot"
The following description covers the case where a captured image such as that shown in FIG. 14 is input to the video processing device 2, the specific person (the subject of the virtual camerawork) accepted by the specific person designation reception unit 21 is "person A", and the virtual camerawork designated via the virtual camerawork designation reception unit 22 is "close-up shot". It is assumed that the specific composition information read by each unit is stored in the composition database 27 in advance.

人物特定部２６は、顔認識部２４により認識された人物及び骨格判定部２５により判定された骨格を用いて、図１５に示すように撮影映像から「人物Ａ」の骨格を判定する。また、「人物Ａ」の骨格の縦方向の骨格中心線も決定する。尚、骨格中心線は、骨格の首元を通過する線である。そして、これらの骨格の座標情報をトリミングフレームサイズ決定部２８に出力する。 Using the person recognized by the face recognition unit 24 and the skeleton determined by the skeleton determination unit 25, the person identification unit 26 determines the skeleton of "person A" from the captured video as shown in FIG. Also, the vertical skeletal centerline of the skeleton of "Person A" is determined. The skeleton centerline is a line passing through the neck of the skeleton. Then, the coordinate information of these skeletons is output to the trimming frame size determining section 28 .

トリミングフレームサイズ決定部２８は、特定人物が「一人」であり、仮想カメラワークが「アップショット」に対応する構図情報を、構図データベース２７から読み出す。ここで、読み出される「アップショット」に対応する構図情報は、以下の通りである。
・顔の縦方向の長さを１としたとき、被写体フレームの縦方向の長さを１．１倍とする。
・トリミングフレームは第１のトリミングフレームを使用する。
・トリミングフレームの縦方向の中心線と被写体フレームの縦方向の中心線とは一致させる。
The trimming frame size determination unit 28 reads from the composition database 27 the composition information for the case where the specific person is "one person" and the virtual camerawork is "close-up shot". The composition information read for "close-up shot" is as follows.
・With the vertical length of the face taken as 1, the vertical length of the subject frame is 1.1 times that length.
・The first trimming frame is used as the trimming frame.
・The vertical center line of the trimming frame is made to coincide with the vertical center line of the subject frame.

トリミングフレームサイズ決定部２８は、人物Ａの骨格より、顔の縦方向の長さを計算する。そして、図１５に示すように、縦方向の長さが、顔の長さの１．１倍の長さである被写体フレームの仮フレームを生成し、仮フレームの上辺が人物の顔の骨格の頂点に位置し、かつ、縦方向の中心線と骨格中心線とが一致する位置に、仮フレームを設置する。 The trimming frame size determining unit 28 calculates the length of the face in the vertical direction from the skeleton of the person A. Then, as shown in FIG. 15, a temporary frame of the subject frame whose length in the vertical direction is 1.1 times the length of the face is generated. A temporary frame is placed at the vertex and at a position where the vertical center line and the skeleton center line coincide.

次に、トリミングフレームサイズ決定部２８は、図１５に示すように、仮フレームの縦方向の長さをかえずに、横方向の人物Ａの骨格の最大範囲となるように横方向の長さを調整する。このようにして生成されたフレームが被写体フレームである。 Next, as shown in FIG. 15, the trimming frame size determining unit 28 adjusts the horizontal length of the temporary frame so that the maximum range of the horizontal skeleton of the person A is achieved without changing the vertical length of the temporary frame. to adjust. A frame generated in this manner is a subject frame.

トリミングフレームサイズ決定部２８は、図１５に示すように、構図情報を満足するように、第１のトリミングフレームのサイズを調整する。第１のトリミングフレームのサイズの調整は、最低のマージンを確保した上で、第１のトリミングフレームのマージンが被写体フレームに重ならない範囲で、なるべく好ましいマージンの大きさになるように調整する。このようにして、トリミングフレームサイズ決定部２８は、図１６に示すように、トリミングフレームのサイズを決定する。 As shown in FIG. 15, the trimming frame size determination unit 28 adjusts the size of the first trimming frame so as to satisfy the composition information. The size is adjusted so that, with at least the minimum margins secured and without the margins of the first trimming frame overlapping the subject frame, the margins come as close as possible to the preferred sizes. In this way, the trimming frame size determination unit 28 determines the size of the trimming frame as shown in FIG. 16.
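The sizing described above can be sketched as a small calculation. The specification does not give a formula, so the rule used here is an assumption: pick the smallest 16:9 frame in which the preferred top margin (0.15 of the frame height) and side margins (0.10 of the frame width each) stay clear of the subject frame. The function names are hypothetical.

```python
from typing import Tuple

ASPECT = 16 / 9  # width / height of every trimming frame


def subject_frame_size(face_height: float, skeleton_width: float,
                       height_ratio: float = 1.1) -> Tuple[float, float]:
    """Subject frame: height from the face length, width from the skeleton's horizontal extent."""
    return skeleton_width, face_height * height_ratio


def first_frame_size(subject_w: float, subject_h: float,
                     top: float = 0.15, side: float = 0.10) -> Tuple[float, float]:
    """Smallest 16:9 frame whose top and side margins do not overlap the subject frame."""
    # The first trimming frame has no bottom margin, so only the top margin reduces the height.
    h_for_height = subject_h / (1.0 - top)
    # Width needed for the subject plus both side margins, expressed as a frame height.
    h_for_width = subject_w / (1.0 - 2.0 * side) / ASPECT
    height = max(h_for_height, h_for_width)
    return ASPECT * height, height
```

With a face 100 pixels tall and a skeleton 120 pixels wide, the subject frame is 110 pixels tall and the height constraint dominates, giving a frame roughly 129 pixels tall and 230 pixels wide. Passing `height_ratio=1.7` gives the corresponding waist shot subject frame.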

トリミング制御部２９は、特定人物が「一人」であり、仮想カメラワークが「アップショット」に対応する構図情報のうち、制御に関する構図情報を、構図データベース２７から読み出す。ここで、読み出される構図情報は、以下の通りである。
・トリミングフレームを、トリミングフレームの縦方向の中心線と被写体の縦方向の骨格中心線とが一致するように配置する。 The trimming control unit 29 reads, from the composition database 27, the composition information related to control among the composition information corresponding to the specific person being "single person" and the virtual camera work being "close-up shot". Here, the read composition information is as follows.
・Arrange the trimming frame so that the vertical center line of the trimming frame and the vertical skeleton center line of the subject match.

トリミング制御部２９は、人物特定部２６からの骨格の位置情報（骨格中心線）と構図情報とを用いて、トリミングフレームサイズ決定部２８で決定されたサイズのトリミングフレームを、図１６に示すように、撮影映像上に配置する。 Using the skeleton position information (skeleton center line) from the person identification unit 26 and the composition information, the trimming control unit 29 places the trimming frame of the size determined by the trimming frame size determination unit 28 on the captured image, as shown in FIG. 16.

尚、撮影映像は動画であるため、人物は時間の経過にともなって移動する可能性があり、その移動に伴い骨格中心線も移動する可能性がある。骨格中心線の移動に伴って、トリミングフレームの位置も制御しても良いが、骨格中心線の移動が少ない場合にも、トリミングフレームを移動させると、視聴者にとって視聴し難い映像になる可能性がある。そこで、骨格中心線の移動の範囲が所定の距離範囲内である場合には、トリミングフレームの位置を維持するように制御しても良い。 Since the captured image is a moving image, the person may move over time, and the skeleton center line may move accordingly. The position of the trimming frame may be controlled to follow the movement of the skeleton center line; however, if the trimming frame is moved even when the skeleton center line moves only slightly, the resulting video may be hard for viewers to watch. Therefore, when the movement of the skeleton center line stays within a predetermined distance range, the trimming frame may be controlled so as to keep its position.
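The position hold described above behaves like a dead band. The following sketch assumes the range is measured from the current frame center and that the frame re-centers in one step once the range is exceeded; neither detail is stated in the specification, and the names are hypothetical.

```python
from typing import List, Sequence


def next_frame_center(frame_center_x: float, skeleton_center_x: float,
                      hold_range: float) -> float:
    """Keep the trimming frame still while the center line stays within hold_range of it."""
    if abs(skeleton_center_x - frame_center_x) <= hold_range:
        return frame_center_x
    return skeleton_center_x


def track(center_line_xs: Sequence[float], hold_range: float) -> List[float]:
    """Apply the hold rule to a sequence of per-frame skeleton center line positions."""
    frame_x = center_line_xs[0]
    positions = []
    for x in center_line_xs:
        frame_x = next_frame_center(frame_x, x, hold_range)
        positions.append(frame_x)
    return positions
```

With a hold range of 5 pixels, the center line sequence 100, 103, 98, 120, 122 leaves the frame at 100 for the first three frames, then moves it to 120 and holds it there.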

トリミング部３０は、撮影映像からトリミングフレーム内の映像をトリミングし、トリミングされたトリミング映像を、図１６に示されるように、指定された仮想カメラワークである「人物Ａ」の「アップショット」の映像として出力する。また、タブレット端末（コンピュータ１００）の表示部４１に仮想カメラワーク映像が表示されている状態を、図１７に示す。 The trimming unit 30 trims the video within the trimming frame from the captured video, and uses the trimmed video as a "close-up shot" of "person A", which is the designated virtual camerawork, as shown in FIG. Output as video. FIG. 17 shows a state in which virtual camerawork images are displayed on the display unit 41 of the tablet terminal (computer 100).

２．特定人物が「人物Ａ（一人）」であり、仮想カメラワークが「ウェストショット」の場合の具体的な動作
以下の説明では、映像処理装置２には、図１４に示すような撮影映像が入力されており、特定人物指定受付部２１が受け付けた特定人物（仮想カメラワークの被写体）が「人物Ａ」であり、仮想カメラワーク指定受付部２２により指定された仮想カメラワークが「ウェストショット」である場合を説明する。また、各部が読み出す具体的な構図情報は、構図データベース２７に予め記憶されているものとする。
2. Specific operation when the specific person is "person A (one person)" and the virtual camerawork is "waist shot"
The following description covers the case where a captured image such as that shown in FIG. 14 is input to the video processing device 2, the specific person (the subject of the virtual camerawork) accepted by the specific person designation reception unit 21 is "person A", and the virtual camerawork designated via the virtual camerawork designation reception unit 22 is "waist shot". It is assumed that the specific composition information read by each unit is stored in the composition database 27 in advance.

人物特定部２６は、顔認識部２４により認識された人物及び骨格判定部２５により判定された骨格を用いて、図１５に示すように撮影映像から「人物Ａ」の骨格を特定する。また、「人物Ａ」の骨格の縦方向の骨格中心線も決定する。尚、骨格中心線は、骨格の首元を通過する線である。そして、これらの骨格の座標情報をトリミングフレームサイズ決定部２８に出力する。 Using the person recognized by the face recognition unit 24 and the skeleton determined by the skeleton determination unit 25, the person identification unit 26 identifies the skeleton of "person A" from the captured video as shown in FIG. Also, the vertical skeletal centerline of the skeleton of "Person A" is determined. The skeleton centerline is a line passing through the neck of the skeleton. Then, the coordinate information of these skeletons is output to the trimming frame size determining section 28 .

トリミングフレームサイズ決定部２８は、特定人物が「一人」であり、仮想カメラワークが「ウェストショット」に対応する構図情報を、構図データベース２７から読み出す。ここで、読み出される「ウェストショット」に対応する構図情報は、以下の通りである。
・顔の長さを１としたとき、被写体フレームの縦方向の長さを１．７倍とする。
・トリミングフレームは第１のトリミングフレームを使用する。
・トリミングフレームの縦方向の中心線と被写体フレームの縦方向の中心線とは一致させる。
The trimming frame size determination unit 28 reads from the composition database 27 the composition information for the case where the specific person is "one person" and the virtual camerawork is "waist shot". The composition information read for "waist shot" is as follows.
・With the length of the face taken as 1, the vertical length of the subject frame is 1.7 times that length.
・The first trimming frame is used as the trimming frame.
・The vertical center line of the trimming frame is made to coincide with the vertical center line of the subject frame.

トリミングフレームサイズ決定部２８は、人物Ａの骨格より、顔の縦方向の長さを計算する。そして、図１８に示すように、縦方向の長さが、顔の長さの１．７倍の長さである被写体フレームの仮フレームを生成し、仮フレームの上辺が人物の顔の骨格の頂点に位置し、かつ、縦方向の中心線と骨格中心線とが一致する位置に、仮フレームを設置する。 The trimming frame size determining unit 28 calculates the length of the face in the vertical direction from the skeleton of the person A. Then, as shown in FIG. 18, a temporary frame of the subject frame whose length in the vertical direction is 1.7 times the length of the face is generated. A temporary frame is placed at the vertex and at a position where the vertical center line and the skeleton center line coincide.

次に、トリミングフレームサイズ決定部２８は、図１８に示すように、仮フレームの縦方向の長さをかえずに、横方向の人物Ａの骨格の最大範囲となるように横方向の長さを調整する。このようにして生成されたフレームが被写体フレームである。 Next, as shown in FIG. 18, the trimming frame size determining unit 28 adjusts the horizontal length of the temporary frame so that the maximum range of the horizontal skeleton of the person A is obtained without changing the vertical length of the temporary frame. to adjust. A frame generated in this manner is a subject frame.

トリミングフレームサイズ決定部２８は、図１８に示すように、構図情報を満足するように、第１のトリミングフレームのサイズを調整する。第１のトリミングフレームのサイズの調整は、最低のマージンを確保した上で、第１のトリミングフレームのマージンが被写体フレームに重ならない範囲で、なるべく好ましいマージンの大きさになるように調整する。このようにして、トリミングフレームサイズ決定部２８は、図１８に示すように、トリミングフレームのサイズを決定する。 As shown in FIG. 18, the trimming frame size determination unit 28 adjusts the size of the first trimming frame so as to satisfy the composition information. The size is adjusted so that, with at least the minimum margins secured and without the margins of the first trimming frame overlapping the subject frame, the margins come as close as possible to the preferred sizes. In this way, the trimming frame size determination unit 28 determines the size of the trimming frame as shown in FIG. 18.

トリミング制御部２９は、特定人物が「一人」であり、仮想カメラワークが「ウェストショット」に対応する構図情報のうち、制御に関する構図情報を、構図データベース２７から読み出す。ここで、読み出される構図情報は、以下の通りである。
・トリミングフレームを、トリミングフレームの縦方向の中心線と被写体の縦方向の骨格中心線とが一致するように配置する。 The trimming control unit 29 reads from the composition database 27 the composition information related to the control, among the composition information corresponding to the specific person being “single person” and the virtual camera work being “waist shot”. Here, the read composition information is as follows.
・Arrange the trimming frame so that the vertical center line of the trimming frame and the vertical skeleton center line of the subject match.

トリミング制御部２９は、人物特定部２６からの骨格の位置情報（骨格中心線）と構図情報とを用いて、トリミングフレームサイズ決定部２８で決定されたサイズのトリミングフレームを、図１９に示すように、撮影映像上に配置する。 Using the skeleton position information (skeleton center line) from the person identification unit 26 and the composition information, the trimming control unit 29 places the trimming frame of the size determined by the trimming frame size determination unit 28 on the captured image, as shown in FIG. 19.

尚、撮影映像は動画であるため、人物は時間の経過にともなって移動する可能性があり、骨格中心線も移動する可能性がある。骨格中心線の移動に伴って、トリミングフレームの位置も制御しても良いが、骨格中心線の移動が少ない場合にも、トリミングフレームを移動させると、視聴者にとって視聴し難い映像になる可能性がある。そこで、骨格中心線の移動の範囲が所定の距離範囲内である場合には、トリミングフレームの位置を維持するように制御しても良い。 Since the captured image is a moving image, the person may move over time, and the skeleton center line may move as well. The position of the trimming frame may be controlled to follow the movement of the skeleton center line; however, if the trimming frame is moved even when the skeleton center line moves only slightly, the resulting video may be hard for viewers to watch. Therefore, when the movement of the skeleton center line stays within a predetermined distance range, the trimming frame may be controlled so as to keep its position.

トリミング部３０は、撮影映像からトリミングフレーム内の映像をトリミングし、トリミングされたトリミング映像を、図１９に示されるように、指定された仮想カメラワークである「人物Ａ」の「ウェストショット」の映像として出力する。尚、タブレット端末（コンピュータ１００）の表示部４１に仮想カメラワーク映像が表示されている状態は、仮想カメラワーク映像が「人物Ａ」の「ウェストショット」の映像となるだけなので、省略する。 The trimming unit 30 trims the video within the trimming frame from the captured video, and converts the trimmed video to the "waist shot" of "person A", which is the designated virtual camera work, as shown in FIG. Output as video. The state in which the virtual camerawork image is displayed on the display unit 41 of the tablet terminal (computer 100) is omitted because the virtual camerawork image is only the image of the "person A"'s "waist shot".

３．仮想カメラワークが「人物Ａ（一人）」の「フルショット」の場合の具体的な動作
以下の説明では、映像処理装置２には、図１４に示すような撮影映像が入力されており、特定人物指定受付部２１が受け付けた特定人物（仮想カメラワークの被写体）が「人物Ａ」であり、仮想カメラワーク指定受付部２２により指定された仮想カメラワークが「フルショット」である場合を説明する。また、各部が読み出す具体的な構図情報は、構図データベース２７に予め記憶されているものとする。
3. Specific operation when the virtual camerawork is a "full shot" of "person A (one person)"
The following description covers the case where a captured image such as that shown in FIG. 14 is input to the video processing device 2, the specific person (the subject of the virtual camerawork) accepted by the specific person designation reception unit 21 is "person A", and the virtual camerawork designated via the virtual camerawork designation reception unit 22 is "full shot". It is assumed that the specific composition information read by each unit is stored in the composition database 27 in advance.

トリミングフレームサイズ決定部２８は、特定人物が「一人」であり、仮想カメラワークが「フルショット」に対応する構図情報を、構図データベース２７から読み出す。ここで、読み出される構図情報は、以下の通りである。
・特定人物の骨格の縦方向の長さ（全身）を、被写体フレームの縦方向の長さとする。
・トリミングフレームは第２のトリミングフレームを使用する。
・トリミングフレームの縦方向の中心線と被写体フレームの縦方向の中心線とは一致させる。
The trimming frame size determination unit 28 reads from the composition database 27 the composition information for the case where the specific person is "one person" and the virtual camerawork is "full shot". The composition information read is as follows.
・The vertical length of the specific person's skeleton (whole body) is used as the vertical length of the subject frame.
・The second trimming frame is used as the trimming frame.
・The vertical center line of the trimming frame is made to coincide with the vertical center line of the subject frame.

トリミングフレームサイズ決定部２８は、人物Ａの骨格より、骨格の縦方向の長さ（全身）を計算する。そして、図２０に示すように、縦方向の長さが、人物Ａの縦方向の骨格の長さと同一の長さである被写体フレームの仮フレームを生成し、仮フレームの上辺が人物の骨格の最上部の頂点に位置し、かつ、縦方向の中心線と骨格の中心線とが一致する位置に、仮フレームを設置する。 The trimming frame size determining unit 28 calculates the vertical length (whole body) of the skeleton of the person A from the skeleton. Then, as shown in FIG. 20, a temporary frame of the subject frame whose length in the vertical direction is the same as the length of the skeleton in the vertical direction of the person A is generated, and the upper side of the temporary frame is the skeleton of the person. A temporary frame is placed at the top vertex and at a position where the vertical center line and the center line of the skeleton coincide.

次に、トリミングフレームサイズ決定部２８は、図２０に示すように、仮フレームの縦方向の長さをかえずに、人物Ａの骨格が入る最大の範囲となるように、横方向の長さを調整する。このようにして生成されたフレームが被写体フレームである。 Next, as shown in FIG. 20, the trimming frame size determining unit 28 adjusts the horizontal length of the temporary frame so that the skeleton of the person A can be accommodated in the maximum range without changing the vertical length of the temporary frame. to adjust. A frame generated in this manner is a subject frame.

トリミングフレームサイズ決定部２８は、図２０に示すように、構図情報を満足するように、第２のトリミングフレームのサイズを調整する。第２のトリミングフレームのサイズの調整は、最低のマージンを確保した上で、第２のトリミングフレームのマージンが被写体フレームに重ならない範囲で、なるべく好ましいマージンの大きさになるように調整する。このようにして、トリミングフレームサイズ決定部２８は、図２１に示すように、トリミングフレームのサイズを決定する。 As shown in FIG. 20, the trimming frame size determination unit 28 adjusts the size of the second trimming frame so as to satisfy the composition information. The size is adjusted so that, with at least the minimum margins secured and without the margins of the second trimming frame overlapping the subject frame, the margins come as close as possible to the preferred sizes. In this way, the trimming frame size determination unit 28 determines the size of the trimming frame as shown in FIG. 21.
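For the full shot, margins are needed on all four sides, and the choice between preferred and minimum margins becomes relevant when the subject nearly fills the picture. The sketch below is one possible reading, not the specified procedure: it tries the preferred margins first and falls back to the minimum margins when the resulting frame would be taller than the captured image. Only the height is checked, side margins are taken relative to the frame height as listed for the second trimming frame, and all names are hypothetical.

```python
from typing import Tuple

ASPECT = 16 / 9  # width / height of every trimming frame


def second_frame_size(subject_w: float, subject_h: float,
                      top: float, bottom: float, side: float) -> Tuple[float, float]:
    """Smallest 16:9 frame whose four margins do not overlap the subject frame."""
    # Top and bottom margins are fractions of the frame height.
    h_for_height = subject_h / (1.0 - top - bottom)
    # Frame width is ASPECT * h and each side margin is side * h.
    h_for_width = subject_w / (ASPECT - 2.0 * side)
    height = max(h_for_height, h_for_width)
    return ASPECT * height, height


def choose_full_shot_frame(subject_w: float, subject_h: float,
                           capture_h: float) -> Tuple[float, float]:
    """Use the preferred margins if the frame fits the captured height, else the minimum ones."""
    preferred = second_frame_size(subject_w, subject_h, top=0.15, bottom=0.10, side=0.10)
    if preferred[1] <= capture_h:
        return preferred
    return second_frame_size(subject_w, subject_h, top=0.10, bottom=0.05, side=0.05)
```

For a 1080-pixel-tall captured image, a whole-body length of 750 pixels fits with the preferred margins (frame height about 1000 pixels), whereas a length of 900 pixels only fits with the minimum margins (frame height about 1059 pixels).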

The trimming control unit 29 reads from the composition database 27 the control-related composition information among the composition information for the case where the specific person is "one person" and the virtual camerawork is a "full shot". The composition information read out is as follows.
・Place the trimming frame so that its vertical centerline coincides with the vertical skeleton centerline of the subject.

Using the skeleton position information (skeleton centerline) from the person identification unit 26 and the composition information, the trimming control unit 29 places the trimming frame of the size determined by the trimming frame size determination unit 28 on the captured video, as shown in FIG. 21.

The trimming unit 30 trims the video inside the trimming frame from the captured video and outputs the trimmed video, as shown in FIG. 21, as the video of the designated virtual camerawork, namely the "full shot" of "person A". The state in which the virtual camerawork video is displayed on the display unit 41 of the tablet terminal (computer 100) differs only in that the virtual camerawork video becomes the "full shot" of "person A", so its description is omitted.
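The trimming step itself amounts to cropping a rectangle out of each captured frame. The sketch below illustrates this on a single image stored as a list of pixel rows; the function name `trim` and the integer pixel-coordinate convention are assumptions made for illustration, and a real system would apply the same crop to every frame of the video.

```python
from typing import List, Sequence


def trim(image: Sequence[Sequence[int]], left: int, top: int,
         width: int, height: int) -> List[List[int]]:
    """Return the pixels inside the trimming frame as a new list of rows."""
    if not image or not image[0]:
        raise ValueError("captured image is empty")
    inside = (left >= 0 and top >= 0
              and top + height <= len(image)
              and left + width <= len(image[0]))
    if not inside:
        # The text does not say what happens when the frame leaves the image,
        # so this sketch simply refuses to crop.
        raise ValueError("trimming frame extends outside the captured image")
    return [list(row[left:left + width]) for row in image[top:top + height]]
```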

4. Specific operation when the virtual camerawork is a "full shot" of "person A and person B (two people)"
The following description assumes that captured video as shown in FIG. 14 is input to the video processing device 2, that the specific persons (the subjects of the virtual camerawork) received by the specific person designation reception unit 21 are "person A and person B (two people)", and that the virtual camerawork designated through the virtual camerawork designation reception unit 22 is a "full shot". The specific composition information read by each unit is assumed to be stored in the composition database 27 in advance.

Using the persons recognized by the face recognition unit 24 and the skeletons determined by the skeleton determination unit 25, the person identification unit 26 identifies the skeletons of "person A" and "person B" in the captured video, as shown in FIG. 22. It also determines the vertical skeleton centerlines of the skeletons of "person A" and "person B". The skeleton centerline is a line passing through the base of the neck of the skeleton. The coordinate information of these skeletons is then output to the trimming frame size determination unit 28.

The trimming frame size determination unit 28 reads from the composition database 27 the vertical composition information among the composition information for the case where the specific persons are "two people" and the virtual camerawork is a "full shot". The composition information read out is as follows.
・The vertical length of the subject frame is the length that contains the skeletons of all the specific persons in the vertical direction.
・The vertical centerline of the subject frame lies midway between the skeleton centerlines of the two persons.
・The second trimming frame is used as the trimming frame.
・The vertical centerline of the trimming frame coincides with the vertical centerline of the subject frame.

The trimming frame size determination unit 28 calculates the vertical length of the skeletons from the skeletons of "person A" and "person B". Then, as shown in FIG. 22, it generates a temporary frame for the subject frame whose vertical length is just long enough to contain the skeletons of both "person A" and "person B" in the vertical direction, and places the temporary frame so that its upper side is at the topmost vertex of the persons' skeletons and its vertical centerline lies midway between the two skeleton centerlines.

Next, as shown in FIG. 22, the trimming frame size determination unit 28 adjusts the horizontal length of the frame, without changing its vertical length, so that it covers the maximum horizontal range occupied by the skeletons of person A and person B. The frame generated in this way is the subject frame.
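For two persons, the same idea applies to the union of both skeletons, with the frame centred midway between the two skeleton centerlines. The following is a minimal sketch under the same hypothetical conventions as before (keypoints as (x, y) pairs with y increasing downward, symmetric widening about the frame centerline); `two_person_subject_frame` is an illustrative name, not one taken from the embodiment.

```python
from typing import NamedTuple, Sequence, Tuple

Point = Tuple[float, float]


class Frame(NamedTuple):
    left: float
    top: float
    width: float
    height: float


def two_person_subject_frame(skeleton_a: Sequence[Point], center_a: float,
                             skeleton_b: Sequence[Point], center_b: float) -> Frame:
    """Subject frame for a two-person full shot (hypothetical helper)."""
    points = list(skeleton_a) + list(skeleton_b)
    ys = [y for _, y in points]
    top = min(ys)                          # topmost vertex of either skeleton
    height = max(ys) - top                 # tall enough for both skeletons
    mid_x = (center_a + center_b) / 2      # midway between the two centerlines
    # Assumption: widen symmetrically about mid_x until both skeletons fit.
    half_width = max(abs(x - mid_x) for x, _ in points)
    return Frame(mid_x - half_width, top, 2 * half_width, height)
```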

As shown in FIG. 22, the trimming frame size determination unit 28 adjusts the size of the second trimming frame so as to satisfy the composition information. The size is adjusted so that the minimum margin is secured and, within the range in which the margins of the second trimming frame do not overlap the subject frame, the margins come as close as possible to the preferred size. In this way, the trimming frame size determination unit 28 determines the size of the trimming frame as shown in FIG. 23.

The trimming control unit 29 reads from the composition database 27 the control-related composition information among the composition information for the case where the specific persons are "two people" and the virtual camerawork is a "full shot". The composition information read out is as follows.
・Place the trimming frame so that its vertical centerline lies midway between the skeleton centerlines of the two persons.

Using the skeleton position information (skeleton centerlines) from the person identification unit 26 and the composition information, the trimming control unit 29 places the trimming frame of the size determined by the trimming frame size determination unit 28 on the captured video, as shown in FIG. 23.

The trimming unit 30 trims the video inside the trimming frame from the captured video and outputs the trimmed video, as shown in FIG. 23, as the video of the designated virtual camerawork, namely the "full shot" of "person A and person B (two people)". The state in which the virtual camerawork video is displayed on the display unit 41 of the tablet terminal (computer 100) differs only in that the virtual camerawork video becomes the "full shot" of "person A and person B (two people)", so its description is omitted.

5. Specific operation when the virtual camerawork is a "tilt-up" of "person C (one person)"
The following description assumes that captured video as shown in FIG. 14 is input to the video processing device 2, that the specific person (the subject of the virtual camerawork) received by the specific person designation reception unit 21 is "person C", and that the virtual camerawork designated through the virtual camerawork designation reception unit 22 is a "tilt-up". The specific composition information read by each unit is assumed to be stored in the composition database 27 in advance.

Using the person recognized by the face recognition unit 24 and the skeleton determined by the skeleton determination unit 25, the person identification unit 26 identifies the skeleton of "person C" in the captured video, as shown in FIG. 24. It also determines the vertical skeleton centerline of the skeleton of "person C". The skeleton centerline is a line passing through the base of the neck of the skeleton. The coordinate information of the skeleton is then output to the trimming frame size determination unit 28.

The trimming frame size determination unit 28 reads from the composition database 27 the composition information for the case where the specific person is "one person" and the virtual camerawork is a "tilt-up". The composition information read out is as follows.
・Taking the vertical length of the face as 1, the vertical length of the subject frame is 1.1 times that length.
・The horizontal length of the subject frame is the full-body width of the person (the maximum horizontal range occupied by the person's skeleton).
・The third trimming frame is used as the frame at the start of the virtual camerawork.
・The first trimming frame is used as the frame at the end of the virtual camerawork.
・The fourth trimming frame is used as the frame at the start and at the end of the virtual camerawork.
・The vertical centerline of the trimming frame coincides with the vertical centerline of the subject frame.

The trimming frame size determination unit 28 calculates the horizontal width from the skeleton of person C. Then, as shown in FIG. 24, it determines a temporary frame whose width is the width of person C and whose centerline coincides with the skeleton centerline of person C. Furthermore, the trimming frame size determination unit 28 generates the subject frame by setting the vertical length of the temporary frame to 1.1 times the length of the face.

As shown in FIG. 24, the trimming frame size determination unit 28 adjusts the first trimming frame so as to satisfy the composition information. The size of the first trimming frame is adjusted so that the minimum margin is secured and, within the range in which the margins of the first trimming frame do not overlap the subject frame, the margins come as close as possible to the preferred size. The widths of the third and fourth trimming frames are made equal to the width of the first trimming frame after its size has been adjusted. In this way, the trimming frame size determination unit 28 determines the sizes of the trimming frames as shown in FIG. 24.

The trimming control unit 29 reads from the composition database 27 the composition information for the case where the specific person is "one person" and the virtual camerawork is a "tilt-up". The composition information read out is as follows.
・The third trimming frame is used as the frame at the start of the virtual camerawork.
・The first trimming frame is used as the frame at the end of the virtual camerawork.
・The fourth trimming frame is used as the frame at the start and at the end of the virtual camerawork.
・Place the trimming frame so that its vertical centerline coincides with the vertical skeleton centerline of the person.
・After holding the frame used at the start of the virtual camerawork still for 1 second, move it along the person's skeleton centerline with a spline motion (slow at the beginning and end, fast in between).
・The frame used at the end of the virtual camerawork is kept still.
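The "hold for 1 second, then move slowly, fast, slowly" behaviour in the list above can be illustrated with a simple easing function. The text only says "spline", so the cubic smoothstep curve used below is a stand-in chosen for this sketch, and the move duration `move_s` is a hypothetical parameter that the composition information shown here does not specify. The function returns the y coordinate of the top edge of the trimming frame at time `t`; the x position stays fixed on the skeleton centerline.

```python
def smoothstep(u: float) -> float:
    """Cubic ease-in/ease-out on [0, 1]: slow at both ends, fast in the middle."""
    u = max(0.0, min(1.0, u))
    return u * u * (3.0 - 2.0 * u)


def tilt_up_top(t: float, start_top: float, end_top: float,
                hold_s: float = 1.0, move_s: float = 2.0) -> float:
    """Top edge of the moving trimming frame, t seconds after the start.

    hold_s = 1.0 follows the composition information; move_s is assumed.
    """
    if t <= hold_s:
        return start_top                      # frame is held still first
    progress = smoothstep((t - hold_s) / move_s)
    return start_top + (end_top - start_top) * progress
```

Once `t` exceeds `hold_s + move_s` the function keeps returning `end_top`, which corresponds to the rule that the frame used at the end stays still.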

The trimming unit 30 trims the video inside the moving trimming frame from the captured video and outputs the trimmed video, as shown in FIG. 25, as the video of the designated virtual camerawork, namely the "tilt-up" of "person C". The state in which the virtual camerawork video is displayed on the display unit 41 of the tablet terminal (computer 100) differs only in that the virtual camerawork video becomes the "tilt-up" of "person C", so its description is omitted.

6. Specific operation when the virtual camerawork is a "pan"
The following description assumes that captured video as shown in FIG. 14 is input to the video processing device 2 and describes a pan from "person A" to "person B" in close-up shots. The specific composition information read by each unit is assumed to be stored in the composition database 27 in advance.

When the virtual camerawork is a "pan", a start point and an end point must be designated. Each is designated by the specific person who is to be the subject and by a shot (long shot, close-up shot, bust shot, or waist shot). In this example the start point is a close-up shot of "person A", so the user touches "Pan" on the virtual camerawork designation reception unit 42 (virtual camerawork designation reception unit 22), then touches the start point, and then touches "Up" (close-up shot). The user then touches "person A" on the specific person designation reception unit 43 (specific person designation reception unit 21). Next, because the end point is a close-up shot of "person B", the user touches "Pan" on the virtual camerawork designation reception unit 42 (virtual camerawork designation reception unit 22), then touches the end point, and then touches "Up" (close-up shot). The user then touches "person B" on the specific person designation reception unit 43 (specific person designation reception unit 21). This designates the start point and the end point of the "pan".

First, for "person A" at the start point, the person identification unit 26 and the trimming frame size determination unit 28 operate as follows.

The trimming frame size determination unit 28 reads from the composition database 27 the composition information for the virtual camerawork "pan" for the case where the specific person at the start point is "one person" and the shot at the start point is a "close-up shot". The composition information read out is as follows.
・Taking the vertical length of the face as 1, the vertical length of the subject frame is 1.1 times that length.
・The first trimming frame is used as the trimming frame.
・The vertical centerline of the trimming frame coincides with the vertical centerline of the subject frame.

Next, for "person B" at the end point, the person identification unit 26 and the trimming frame size determination unit 28 operate as follows.

Using the person recognized by the face recognition unit 24 and the skeleton determined by the skeleton determination unit 25, the person identification unit 26 determines the skeleton of "person B" in the captured video, as shown in FIG. 26. It also determines the vertical skeleton centerline of the skeleton of "person B". The skeleton centerline is a line passing through the base of the neck of the skeleton. The coordinate information of the skeleton is then output to the trimming frame size determination unit 28.

The trimming frame size determination unit 28 reads from the composition database 27 the composition information for the virtual camerawork "pan" for the case where the specific person at the end point is "one person" and the shot at the end point is a "close-up shot". The composition information read out is as follows.
・Taking the vertical length of the face as 1, the vertical length of the subject frame is 1.1 times that length.
・The first trimming frame is used as the trimming frame.
・The vertical centerline of the trimming frame coincides with the vertical centerline of the subject frame.

The trimming frame size determination unit 28 calculates the vertical length of the face from the skeleton of person B. Then, as shown in FIG. 26, it generates a temporary frame for the subject frame whose vertical length is 1.1 times the length of the face, and places the temporary frame so that its upper side is at the top vertex of the skeleton of the person's face and its vertical centerline coincides with the skeleton centerline.

Next, as shown in FIG. 26, the trimming frame size determination unit 28 adjusts the horizontal length of the temporary frame, without changing its vertical length, so that it covers the maximum horizontal range of the skeleton of person B. The frame generated in this way is the subject frame.
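The close-up subject frame described above can be sketched in the same way. The factor 1.1 comes from the composition information; everything else is an assumption made for illustration: `close_up_subject_frame` is a hypothetical name, keypoints are (x, y) pairs with y increasing downward, and "the maximum horizontal range of the skeleton" is interpreted here as the range of the keypoints that fall inside the vertical extent of the temporary frame.

```python
from typing import NamedTuple, Sequence, Tuple

Point = Tuple[float, float]
FACE_SCALE = 1.1  # subject-frame height relative to face length (from the text)


class Frame(NamedTuple):
    left: float
    top: float
    width: float
    height: float


def close_up_subject_frame(keypoints: Sequence[Point], face_top: float,
                           face_length: float, center_x: float) -> Frame:
    """Subject frame for a one-person close-up shot (hypothetical helper)."""
    height = FACE_SCALE * face_length
    bottom = face_top + height
    # Assumption: only keypoints inside the frame's vertical band set the width.
    offsets = [abs(x - center_x) for x, y in keypoints if face_top <= y <= bottom]
    half_width = max(offsets, default=0.0)
    return Frame(center_x - half_width, face_top, 2 * half_width, height)
```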

As shown in FIG. 26, the trimming frame size determination unit 28 adjusts the size of the first trimming frame so as to satisfy the composition information. The size is adjusted so that the minimum margin is secured and, within the range in which the margins of the first trimming frame do not overlap the subject frame, the margins come as close as possible to the preferred size. In this way, the trimming frame size determination unit 28 determines the size of the trimming frame as shown in FIG. 26.

Once the sizes of the trimming frames at the start point and the end point have been determined, the trimming control unit 29 reads from the composition database 27 the control-related composition information among the composition information for the virtual camerawork "pan" for the case where the specific person at the start point and at the end point is "one person" and the shot at the start point and at the end point is a "close-up shot". The composition information read out is as follows.
・Place the trimming frame of the specific person designated for the start point so that its vertical centerline coincides with the vertical skeleton centerline of that person.
・Place the trimming frame of the specific person designated for the end point so that its vertical centerline coincides with the vertical skeleton centerline of that person.
・From the start point until just before the end point, the trimming frame used at the start point is used.
・At the end point, the trimming frame of the specific person designated for the end point is used.
・After holding the trimming frame used at the start of the virtual camerawork still for 1 second, move it from the position of the trimming frame at the start point to the position of the trimming frame just before the end point with a spline motion (slow at the beginning and end, fast in between).
・The trimming frame used at the end of the virtual camerawork is kept still.
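The pan rules above combine a moving frame with a switch of frame at the end point. A minimal sketch follows, assuming that the start-point frame keeps its size while its centre travels to the centre of the end-point frame, and that the end-point frame replaces it once the move finishes. The cubic easing curve stands in for the unspecified "spline", `move_s` is a hypothetical duration, and interpolating the vertical position as well as the horizontal one is an assumption of this sketch.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    left: float
    top: float
    width: float
    height: float


def ease(u: float) -> float:
    """Slow at both ends, fast in the middle (cubic smoothstep)."""
    u = max(0.0, min(1.0, u))
    return u * u * (3.0 - 2.0 * u)


def pan_frame(t: float, start: Frame, end: Frame,
              hold_s: float = 1.0, move_s: float = 2.0) -> Frame:
    """Trimming frame t seconds into the pan (hypothetical helper)."""
    if t <= hold_s:
        return start                          # hold the start frame for 1 second
    if t >= hold_s + move_s:
        return end                            # end point: switch to the end frame
    s = ease((t - hold_s) / move_s)
    start_cx, start_cy = start.left + start.width / 2, start.top + start.height / 2
    end_cx, end_cy = end.left + end.width / 2, end.top + end.height / 2
    cx = start_cx + (end_cx - start_cx) * s
    cy = start_cy + (end_cy - start_cy) * s
    # Until just before the end point, the start frame's size is kept.
    return Frame(cx - start.width / 2, cy - start.height / 2,
                 start.width, start.height)
```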

Using the skeleton position information (skeleton centerlines) from the person identification unit 26 and the composition information, the trimming control unit 29 places the trimming frame of the size determined by the trimming frame size determination unit 28 on the captured video and controls it, as shown in FIG. 27.

The trimming unit 30 trims the video inside the trimming frame from the captured video and outputs the trimmed video, as shown in FIG. 27, as the virtual camerawork video corresponding to a close-up pan from the start point "person A" to the end point "person B".

This concludes the description of the first embodiment. The trimming frames and the numerical values in the composition information described above are examples, and it goes without saying that they can be changed as appropriate.

Because the first embodiment is configured to recognize persons in the captured video, the user does not need to search the captured video for the specific person (subject) as in the conventional technology, which reduces the user's effort.

Furthermore, the first embodiment determines the skeleton of the specific person (subject), identifies the size and position of the specific person's body parts in the video, and generates the virtual camerawork video using that information together with the composition information, so video with a wide variety of camerawork can be generated.

<Modification 1 of the first embodiment>
The virtual camerawork in the first embodiment described above does not take the orientation of the specific person's face into account. Taking it into account, however, allows more natural camerawork. For example, when shooting a close-up shot of a specific person, a better composition can be obtained by leaving more space on the side toward which the person's face is turned. Modification 1 of the first embodiment therefore describes an example of virtual camerawork in which information on the orientation of the specific person's face is obtained from the face recognition unit 24 and composition information that takes this orientation into account is used.

The following description focuses on the specific operations of the trimming frame size determination unit 28, the trimming control unit 29, and the trimming unit 30 when the specific person is "person B (one person)" and the virtual camerawork is a "close-up shot". It assumes that captured video as shown in FIG. 14 is input to the video processing device 2. The specific composition information read by each unit is assumed to be stored in the composition database 27 in advance.

Using the person recognized by the face recognition unit 24 and the skeleton determined by the skeleton determination unit 25, the person identification unit 26 determines the skeleton of "person B" in the captured video, as shown in FIG. 28. It also determines the vertical skeleton centerline of the skeleton of "person B". The skeleton centerline is a line passing through the base of the neck of the skeleton. The coordinate information of the skeleton is then output to the trimming frame size determination unit 28.

The face recognition unit 24 also outputs information on the orientation of the face of "person B" to the trimming frame size determination unit 28.

The trimming frame size determination unit 28 reads from the composition database 27 the composition information for the case where the specific person is "one person", the virtual camerawork is a "close-up shot", and the face orientation is "rightward". The composition information read out for the "close-up shot" is as follows.
・Taking the vertical length of the face as 1, the vertical length of the subject frame is 1.1 times that length.
・The first trimming frame is used as the trimming frame.
・The vertical centerline of the subject frame passes through the position that divides the trimming frame 0.65:0.35 from the left.

The trimming frame size determination unit 28 calculates the vertical length of the face from the skeleton of person B. Then, as shown in FIG. 28, it generates a temporary frame for the subject frame whose vertical length is 1.1 times the length of the face, and places the temporary frame so that its upper side is at the top vertex of the skeleton of the person's face and its vertical centerline coincides with the skeleton centerline.

Next, as shown in FIG. 28, the trimming frame size determination unit 28 adjusts the horizontal length of the temporary frame, without changing its vertical length, so that it covers the maximum horizontal range of the skeleton of person B. The frame generated in this way is the subject frame.

As shown in FIG. 28, the trimming frame size determination unit 28 adjusts the size of the first trimming frame so as to satisfy the composition information. With the vertical centerline of the subject frame passing through the position that divides the trimming frame 0.65:0.35 from the left, the size is adjusted so that the minimum margin is secured and, within the range in which the margins of the first trimming frame do not overlap the subject frame, the margins come as close as possible to the preferred size. In this way, the trimming frame size determination unit 28 determines the size of the trimming frame as shown in FIG. 28.

The trimming control unit 29 reads from the composition database 27 the control-related composition information among the composition information for the case where the specific person is "one person" and the virtual camerawork is a "close-up shot". The composition information read out is as follows.
・Place the trimming frame so that the vertical skeleton centerline of the specific person passes through the position that divides the trimming frame 0.65:0.35 from the left.
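The 0.65:0.35 placement rule reduces to a single subtraction once the size of the trimming frame is known. The sketch below shows that arithmetic; `place_offset_frame` and the `ratio_from_left` parameter are hypothetical names, and only the rightward-facing case (ratio 0.65) is taken from the text.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    left: float
    top: float
    width: float
    height: float


def place_offset_frame(center_x: float, top: float, width: float, height: float,
                       ratio_from_left: float = 0.65) -> Frame:
    """Place a trimming frame so that the skeleton centerline (x = center_x)
    divides the frame width ratio_from_left : (1 - ratio_from_left)."""
    left = center_x - ratio_from_left * width
    return Frame(left, top, width, height)
```

For example, with a skeleton centerline at x = 500 and a trimming frame 200 pixels wide, the left edge lands at about x = 370, so 130 pixels of the frame lie to the left of the centerline and 70 pixels to the right.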

Using the skeleton position information (skeleton centerline) from the person identification unit 26 and the composition information, the trimming control unit 29 places the trimming frame of the size determined by the trimming frame size determination unit 28 on the captured video, as shown in FIG. 29.

The trimming unit 30 trims the video inside the trimming frame from the captured video and outputs the trimmed video, as shown in FIG. 29, as the video of the designated virtual camerawork, namely the "close-up shot" of "person B".

According to Modification 1 of the first embodiment, compared with the "close-up shot" of "person B" that does not take the face orientation into account, as illustrated in FIG. 27, the "close-up shot" of "person B" that does take it into account, as shown in FIG. 29, yields video whose composition is closer to the ideal composition a human operator would shoot. In other words, a virtual camerawork video closer to camerawork video shot by a human can be obtained.

<Modification 2 of the first embodiment>
In the first embodiment described above, only a specific person was designated as the subject. However, the person identification unit 26 also determines each body part of the specific person. Modification 2 of the first embodiment therefore describes an example in which a body part of the specific person is designated in addition to the specific person, realizing camerawork whose subject is that body part.

As a concrete example of Modification 2 of the first embodiment, the case where the specific person is "person B (one person)", the body part serving as the subject is the "right hand", and the virtual camerawork is "close-up shot" will be described. The specific composition information read by each unit is assumed to be stored in the composition database 27 in advance.

First, regarding how a body part is designated: as shown in FIG. 30, a part designation unit 50 is provided, in addition to the specific person designation reception unit 43, on the tablet terminal (computer 100) in which the information processing device 2 and the display device 3 are integrated. The part designation unit 50 is a schematic drawing of a person's whole body, and a body part is designated by touching it on the drawing. For example, when the specific person is "person B (one person)" and the body part serving as the subject is the "right hand", touching "person B" on the specific person designation reception unit 43 and then touching the "right hand" on the part designation unit 50 designates "person B" and the "right hand".

When designation of the specific person, the body part of the specific person serving as the subject, and the virtual camerawork is complete, the person identification unit 26 determines the skeleton of "person B" from the captured video, as shown in FIG. 5, using the person recognized by the face recognition unit 24 and the skeletons determined by the skeleton determination unit 25. It then outputs the coordinate information of this skeleton to the trimming frame size determination unit 28.

The trimming frame size determination unit 28 reads, from the composition database 27, the composition information corresponding to the specific person being "one person", the body part serving as the subject being the "right hand", and the virtual camerawork being "close-up shot". The composition information read here is as follows.
・The intersection of the vertical and horizontal center lines of the subject frame (the center of the subject frame) is placed at the tip of the right hand.
・The vertical length of the subject frame is twice the vertical length of the right forearm.
・The horizontal length of the subject frame is twice the horizontal length of the right forearm.
・The first trimming frame is used as the trimming frame.
・The vertical center line of the trimming frame is made to coincide with the vertical center line of the subject frame.

The trimming frame size determination unit 28 calculates the vertical and horizontal lengths of the right forearm from the skeleton of person B. Then, as shown in FIG. 31, it adjusts the subject frame so that its vertical length is twice the vertical length of the right forearm, its horizontal length is twice the horizontal length of the right forearm, and the intersection of its vertical and horizontal center lines (the center of the subject frame) lies at the tip of the right hand.

As shown in FIG. 31, the trimming frame size determination unit 28 adjusts the size of the first trimming frame so that the composition information is satisfied. While securing the minimum margin, the size is adjusted so that the margins come as close as possible to the preferred size, within the range in which the margins of the first trimming frame do not overlap the subject frame. In this way, the trimming frame size determination unit 28 determines the size of the trimming frame, as shown in FIG. 31.
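The subject frame and trimming frame sizing for the right hand can be sketched as follows. This is only an illustration under stated assumptions: the forearm is taken to be the elbow-to-wrist segment of the skeleton, its vertical and horizontal lengths are taken as the projections of that segment, and the first trimming frame is modeled as a 16:9 rectangle with a single hypothetical margin ratio. None of these specifics are given in this section.

```python
def right_hand_subject_frame(elbow, wrist, hand_tip):
    """Subject frame centered on the tip of the right hand.
    Points are (x, y) pixels; returns (left, top, width, height)."""
    forearm_dx = abs(wrist[0] - elbow[0])  # horizontal length of the forearm
    forearm_dy = abs(wrist[1] - elbow[1])  # vertical length of the forearm
    width = 2 * forearm_dx                 # twice the horizontal length
    height = 2 * forearm_dy                # twice the vertical length
    return (hand_tip[0] - width / 2, hand_tip[1] - height / 2, width, height)


def trimming_frame_around(subject, margin_ratio=0.1, aspect=16 / 9):
    """Smallest frame of the given aspect ratio that contains the subject
    frame plus a margin on every side and shares its center (assumed rule)."""
    left, top, w, h = subject
    cx, cy = left + w / 2, top + h / 2
    need_w = w * (1 + 2 * margin_ratio)
    need_h = h * (1 + 2 * margin_ratio)
    # Grow whichever dimension is limiting so the aspect ratio is preserved.
    frame_h = max(need_h, need_w / aspect)
    frame_w = frame_h * aspect
    return (cx - frame_w / 2, cy - frame_h / 2, frame_w, frame_h)


subject = right_hand_subject_frame(elbow=(100, 200), wrist=(160, 120),
                                   hand_tip=(180, 100))
frame = trimming_frame_around(subject)
```

Because both frames share the same center, the vertical center lines coincide, as the composition information requires.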

The trimming control unit 29 reads, from the composition database 27, the control-related composition information among the composition information corresponding to the specific person being "one person", the body part serving as the subject being the "right hand", and the virtual camerawork being "close-up shot". The composition information read here is as follows.
・Place the trimming frame so that the intersection of its vertical and horizontal center lines coincides with the tip of the right hand.

The trimming control unit 29 uses the skeleton position information (the coordinates of the tip of the right hand) from the person identification unit 26, together with the composition information, to place a trimming frame of the size determined by the trimming frame size determination unit 28 on the captured video, as shown in FIG. 32.

Because the captured video is a moving image, the person may move as time passes, and the coordinates of the tip of the right hand may move accordingly. The position of the trimming frame may be controlled to follow the movement of those coordinates, but moving the trimming frame even when the coordinates move only slightly may produce video that is hard for viewers to watch. Therefore, when the movement of the coordinates of the tip of the right hand stays within a predetermined distance range, control may be performed so that the position of the trimming frame is maintained.
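One possible way to maintain the frame position under small movements is a dead zone around the last position used. The sketch below is an assumption about how such control might look; the class name and the Euclidean distance threshold are hypothetical.

```python
import math


class DeadZoneTracker:
    """Keeps the trimming frame anchor fixed while the tracked point
    stays within `threshold` pixels of the current anchor."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.anchor = None  # last point the frame was aligned to

    def update(self, point):
        # Move the anchor only on the first sample or when the point has
        # left the predetermined distance range.
        if self.anchor is None or math.dist(self.anchor, point) > self.threshold:
            self.anchor = point
        return self.anchor


tracker = DeadZoneTracker(threshold=20)
positions = [tracker.update(p) for p in [(100, 100), (105, 110), (130, 100), (140, 100)]]
```

In this example the anchor stays at (100, 100) for the second sample, jumps to (130, 100) on the third, and stays there for the fourth.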

The trimming unit 30 trims the video inside the trimming frame from the captured video and outputs the trimmed video as a "close-up shot" of "the right hand of person B", as shown in FIG. 32.

In Modification 2 of the first embodiment, virtual camerawork whose subject is a specific body part of the specific person can be realized by using the composition information together with the position and length information of each body part determined from the skeleton of the specific person (subject).

<Modification 3 of the first embodiment>
Modification 3 of the first embodiment will be described. FIG. 33 is a block diagram of the information processing device 2 of Modification 3 of the first embodiment.

Modification 3 of the first embodiment is an example in which a video effect is applied to the virtual camerawork video generated by trimming the captured video. Examples of video effects include focus-in, in which the framing itself does not change but the video changes over time from one in which the subject is out of focus to one in which the subject is in focus, and its reverse, focus-out.

To apply these video effects to the virtual camerawork video generated by trimming the captured video, the information processing device 2 of Modification 3 of the first embodiment further includes a video effect processing unit 31.

The video effect processing unit 31 acquires, from the composition database 27, the video effect information contained in the composition information corresponding to the designated virtual camerawork. The video effect processing unit 31 then applies that video effect to the trimmed video output from the trimming unit 30 and outputs the result as the virtual camerawork video.
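A pseudo focus-in can be approximated by blurring the trimmed video with a strength that decreases over time. The sketch below assumes a linear ramp and a simple box blur applied to one row of pixel values; both choices, and the function names, are hypothetical and are not taken from the specification.

```python
def focus_in_radii(num_frames, max_radius):
    """Blur radius for each frame, falling linearly from max_radius to 0."""
    if num_frames == 1:
        return [0]
    return [round(max_radius * (1 - i / (num_frames - 1)))
            for i in range(num_frames)]


def box_blur_row(row, radius):
    """1-D box blur of a list of pixel values; radius 0 leaves values unchanged."""
    n = len(row)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(row[lo:hi]) / (hi - lo))  # mean over the window
    return out


radii = focus_in_radii(5, 8)          # one radius per output frame
focus_out_radii = list(reversed(radii))  # focus-out is the reverse schedule
blurred = box_blur_row([0, 0, 9, 0, 0], 1)
```

A real implementation would blur full two-dimensional frames, typically with an image-processing library, but the time schedule would work the same way.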

<Second Embodiment>
A second embodiment will be described.

FIG. 34 is a block diagram of the information processing device 2 of the second embodiment.

The information processing device 2 of the second embodiment further has a function of superimposing CG (computer graphics) images, other edited video, and the like on the virtual camerawork video as telops or supers (superimposed graphics). To realize this function, the information processing device 2 of the second embodiment includes an image database 50, an image selection unit 51, and an adder 52 in addition to the configuration of the information processing device 2 of the first embodiment.

The image database 50 (image storage unit) stores CG images such as logos, edited video, and the like (hereinafter referred to as superimposed images).

The image selection unit 51 receives from the user the selection of which of the superimposed images stored in the image database 50 is to be superimposed on the virtual camerawork video. So that the user can make this selection, the image selection unit 51 displays on the display device 3 the superimposed images stored in the image database 50 that the user can select. Examples of what is displayed include identification information identifying each superimposed image and thumbnail images of the superimposed images. The image selection unit 51 also controls the position on the virtual camerawork video at which the selected superimposed image is superimposed, and its size. The position and size are determined by the user designating them on the virtual camerawork video.

The adder 52 adds (superimposes) the selected superimposed image to the virtual camerawork video and outputs the virtual camerawork video with the superimposed image on it.
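The superimposition step can be sketched with frames represented as two-dimensional lists of pixel values. The sketch assumes an opaque paste with nearest-neighbor scaling; the specification does not state how the adder 52 blends pixels, so this is only one possible reading, and the function names are hypothetical.

```python
def resize_nearest(image, new_w, new_h):
    """Nearest-neighbor resize of a 2-D list of pixel values."""
    old_h, old_w = len(image), len(image[0])
    return [[image[y * old_h // new_h][x * old_w // new_w]
             for x in range(new_w)]
            for y in range(new_h)]


def superimpose(frame, overlay, left, top, width, height):
    """Return a copy of `frame` with `overlay` scaled to width x height and
    pasted with its top-left corner at (left, top)."""
    scaled = resize_nearest(overlay, width, height)
    out = [row[:] for row in frame]  # leave the input frame untouched
    for dy in range(height):
        for dx in range(width):
            y, x = top + dy, left + dx
            # Pixels that fall outside the frame are simply dropped.
            if 0 <= y < len(out) and 0 <= x < len(out[0]):
                out[y][x] = scaled[dy][dx]
    return out


frame = [[0] * 4 for _ in range(4)]
logo = [[1, 2], [3, 4]]
result = superimpose(frame, logo, left=1, top=1, width=2, height=2)
```

Here `left`, `top`, `width`, and `height` stand in for the position and size that the user designates on the virtual camerawork video.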

Next, the operation of the information processing device 2 of the second embodiment will be described. As in the first embodiment, the following description assumes a tablet terminal (computer 100) in which the information processing device 2 and the display device 3 are integrated.

As shown in FIG. 35, the display (display device 3) of the tablet terminal (computer 100) shows a display unit 40 on which the captured video is displayed, a display unit 41 on which the virtual camerawork video is displayed, a virtual camerawork designation reception unit 42 (virtual camerawork designation reception unit 22) for designating the virtual camerawork, and a specific person designation reception unit 43 (specific person designation reception unit 21) for designating the specific person serving as the subject, and in addition thumbnail images 44 (image selection unit 51) for selecting a superimposed image.

The user selects, from the thumbnail images 44, the superimposed image to be superimposed on the virtual camerawork video displayed on the display unit 41. To select and place an image, the user drags the chosen thumbnail image onto the virtual camerawork video with a finger, as shown in FIG. 36, and drops it at the desired position on the virtual camerawork video. The size of the superimposed image is determined, after it has been dropped onto the virtual camerawork video, by pinching in or pinching out on the superimposed image, as shown in FIG. 37.

In this way, a video in which a logo or edited video is superimposed on the virtual camerawork video can be generated.

<Third Embodiment>
A third embodiment will be described.

The third embodiment is a switching system. FIG. 38 is a block diagram of the switching system of the third embodiment.

The switching system of the third embodiment includes a plurality of the information processing devices 2 of the first or second embodiment. The video captured by the camera 1 is input to each information processing device 2, and each information processing device 2 outputs one virtual camerawork video based on the virtual camerawork selected by the user.

The switching device 4 receives the virtual camerawork video from each information processing device 2 and outputs one of them, selected by the user, as the line video.

Next, the operation of the switching system of the third embodiment will be described. The following description uses two information processing devices 2 and, as in the first embodiment, assumes a tablet terminal (computer 100) in which the information processing devices 2 and the display device 3 are integrated. That is, the tablet terminal (computer 100) includes two information processing devices 2, a display device 3, and a switching device 4.

The display (display device 3) of the tablet terminal (computer 100) of the third embodiment differs from those of the first and second embodiments in that it has display units 41-1 and 41-2, on which the two kinds of virtual camerawork video output from the two information processing devices 2 are displayed. The display unit 41-1 displays the line video, and the display unit 41-2 displays a preview video that is a candidate for the line video. The display unit 41-1, on which the line video is displayed, is drawn with a thicker border than the display unit 41-2 so that it can easily be recognized as the line video.

To obtain a virtual camerawork video that is a candidate for the line video, the user first touches (selects) the screen of the display unit 41-2 with a finger, then designates the specific person with the specific person designation reception unit 43 (specific person designation reception unit 21) and designates the virtual camerawork with the virtual camerawork designation reception unit 42 (virtual camerawork designation reception unit 22). The selected display unit 41-2 then displays the virtual camerawork video corresponding to the designated specific person and virtual camerawork. To switch the virtual camerawork video (preview video) displayed on the display unit 41-2 to the line video, the user touches the display unit 41-2 to select it and then touches the "TAKE" button, whereupon the virtual camerawork video (preview video) displayed on the display unit 41-2 is output as the line video. At this point, the virtual camerawork video (preview video) that was displayed on the display unit 41-2 is no longer displayed there and is instead displayed on the display unit 41-1, on which the line video is displayed.

To obtain a further virtual camerawork video that is a candidate for the line video, the user touches (selects) the screen of the display unit 41-2, on which no virtual camerawork video (preview video) is now displayed, then designates the specific person with the specific person designation reception unit 43 (specific person designation reception unit 21) and designates the virtual camerawork with the virtual camerawork designation reception unit 42 (virtual camerawork designation reception unit 22). The selected display unit 41-2 then displays the virtual camerawork video corresponding to the designated specific person and virtual camerawork.

FIG. 40 shows an example in which two kinds of virtual camerawork video are displayed on the display unit 41-1 and the display unit 41-2. The virtual camerawork video displayed on the display unit 41-1 is the line video, and the virtual camerawork video displayed on the display unit 41-2 is the preview video. To switch the line video from the virtual camerawork video displayed on the display unit 41-1 to the one displayed on the display unit 41-2, the user touches the display unit 41-2 to select it and then touches the "TAKE" button, whereupon the virtual camerawork video (preview video) displayed on the display unit 41-2 is output as the line video. At this point, as shown in FIG. 41, the virtual camerawork video (preview video) that was displayed on the display unit 41-2 is no longer displayed there and is instead displayed on the display unit 41-1, on which the line video is displayed.
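The line/preview behavior of the "TAKE" operation can be summarized as a small state machine. The following is a minimal sketch with a single preview slot; the class and method names are hypothetical, and video sources are represented by plain labels.

```python
class Switcher:
    """Holds one line video and one preview video, as on display units
    41-1 (line) and 41-2 (preview)."""

    def __init__(self):
        self.line = None      # source currently output as the line video
        self.preview = None   # candidate source shown as the preview video

    def set_preview(self, source):
        """Assign a virtual camerawork video to the preview slot."""
        self.preview = source

    def take(self):
        """TAKE: the preview becomes the line video and the preview slot
        is emptied. With no preview set, the line video is unchanged."""
        if self.preview is not None:
            self.line, self.preview = self.preview, None
        return self.line


sw = Switcher()
sw.set_preview("person A / full shot")
sw.take()                                   # preview goes to line
sw.set_preview("person B / close-up shot")  # next candidate
```

After the first `take()`, the preview slot is empty until the user designates another specific person and virtual camerawork, matching the sequence described above.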

In the description above, the user first touches (selects) the screen of the display unit 41-2 with a finger in order to obtain a virtual camerawork video that is a candidate for the line video. This is so that the case where there are multiple display units for preview video can also be handled. When there is only one display unit for preview video, as in this example, a configuration in which the display unit need not be selected is also possible.

According to the third embodiment, a switching system can be realized without requiring a large number of cameras or users (operators).

The present invention has been described above with reference to preferred embodiments. It is not necessary to provide the configurations of all of the embodiments, and the embodiments can be combined as appropriate. Moreover, the present invention is not necessarily limited to the above embodiments and can be modified and implemented in various ways within the scope of its technical idea.

1 Camera
2 Video processing device
3 Display device
4 Switching device
20 Captured video input unit
21 Specific person designation reception unit
22 Virtual camerawork designation reception unit
23 Face image dictionary unit
24 Face recognition unit
25 Skeleton determination unit
26 Person identification unit
27 Composition database
28 Trimming frame size determination unit
29 Trimming control unit
30 Trimming unit
31 Video effect processing unit
50 Image database
51 Image selection unit
52 Adder

Claims

1. A video processing device comprising:
an acquisition unit that acquires a video;
a specific person designation reception unit that designates a specific person;
a camerawork designation reception unit that designates camerawork;
a person recognition unit that recognizes a person in the video;
a skeleton determination unit that determines a skeleton of a person in the video using the video;
a person identification unit that identifies the specific person in the video using the recognition result of the person recognition unit and, using the skeleton of the specific person among the skeletons determined by the skeleton determination unit, identifies the positional relationship of the parts of the specific person on the video;
a trimming frame size determination unit that determines the size of a trimming frame for trimming a portion of the video, using the positional relationship of the parts of the specific person on the video and composition information corresponding to the designated camerawork;
a trimming frame control unit that controls the position, on the video, of the trimming frame corresponding to the designated camerawork, using the positional relationship of the parts of the specific person on the video and the composition information corresponding to the designated camerawork; and
a trimming unit that trims the video within the trimming frame from the video and outputs the trimmed video as a camerawork video corresponding to the camerawork.

2. The video processing device according to claim 1, further comprising a composition information storage unit that stores composition information defining a composition corresponding to the camerawork.

3. The video processing device according to claim 1 or 2, wherein at least one of the person recognition unit and the skeleton determination unit recognizes the person or determines the skeleton using a video with a resolution lower than that of the video.

4. The video processing device according to any one of claims 1 to 3, wherein the person recognition unit determines the orientation of the face of the specific person, and the trimming frame control unit controls the position of the trimming frame in consideration of the orientation of the face of the specific person.

5. The video processing device according to any one of claims 1 to 4, further comprising a part designation reception unit that receives designation of a part of the specific person.

6. The video processing device according to any one of claims 1 to 5, further comprising an image processing unit that processes the video output from the trimming unit and outputs a pseudo focus-in or focus-out video.

7. The video processing device according to any one of claims 1 to 6, further comprising:
a superimposed image acquisition unit that acquires a superimposed image to be superimposed on the camerawork video; and
a superimposing unit that superimposes the superimposed image on the camerawork video.

8. A switching system comprising at least two video processing devices, a display unit, and a switching unit, wherein
each video processing device comprises:
an acquisition unit that acquires a video;
a specific person designation reception unit that designates a specific person;
a camerawork designation reception unit that designates camerawork;
a person recognition unit that recognizes a person in the video;
a skeleton determination unit that determines a skeleton of a person in the video using the video;
a person identification unit that identifies the specific person in the video using the recognition result of the person recognition unit and, using the skeleton of the specific person among the skeletons determined by the skeleton determination unit, identifies the positional relationship of the parts of the specific person on the video;
a trimming frame size determination unit that determines the size of a trimming frame for trimming a portion of the video, using the positional relationship of the parts of the specific person on the video and composition information corresponding to the designated camerawork;
a trimming frame control unit that controls the position, on the video, of the trimming frame corresponding to the designated camerawork, using the positional relationship of the parts of the specific person on the video and the composition information corresponding to the designated camerawork; and
a trimming unit that trims the video within the trimming frame from the video and outputs the trimmed video as a camerawork video corresponding to the camerawork,
the display unit displays the video and at least two of the camerawork videos, and
the switching unit outputs one camerawork video designated by a user from among the at least two camerawork videos.

9. The switching system according to claim 8, wherein the display unit displays the camerawork video being output as the line video so that it is distinguishable from the other camerawork videos.

10. A program for causing a computer to execute:
an acquisition process of acquiring a video;
a specific person designation reception process of receiving designation of a specific person;
a camerawork designation reception process of receiving designation of camerawork;
a person recognition process of recognizing a person in the video;
a skeleton determination process of determining a skeleton of a person in the video using the video;
a person identification process of identifying the specific person in the video using the recognition result of the person recognition process and, using the skeleton of the specific person among the skeletons determined by the skeleton determination process, identifying the positional relationship of the parts of the specific person on the video;
a trimming frame size determination process of determining the size of a trimming frame for trimming a portion of the video, using the positional relationship of the parts of the specific person on the video and composition information corresponding to the designated camerawork;
a trimming frame control process of controlling the position, on the video, of the trimming frame corresponding to the designated camerawork, using the positional relationship of the parts of the specific person on the video and the composition information corresponding to the designated camerawork; and
a trimming process of trimming the video within the trimming frame from the video and outputting the trimmed video as a camerawork video corresponding to the designated camerawork.

11. A program for causing a computer to execute:
an acquisition process of acquiring a video;
a specific person designation process of receiving designation of a specific person;
a camerawork designation process of receiving designation of camerawork;
a person recognition process of recognizing a person in the video;
a skeleton determination process of determining a skeleton of a person in the video using the video;
a person identification process of identifying the specific person in the video using the recognition result of the person recognition process and, using the skeleton of the specific person among the skeletons determined by the skeleton determination process, identifying the positional relationship of the whole body of the specific person on the video;
a trimming frame size determination process of determining the size of a trimming frame for trimming a portion of the video, using the positional relationship of the whole body of the specific person on the video and composition information corresponding to the designated camerawork;
a trimming frame control process of controlling the position, on the video, of the trimming frame corresponding to the designated camerawork, using the positional relationship of the whole body of the specific person on the video and the composition information corresponding to the designated camerawork;
a trimming process of trimming the video within the trimming frame from the video and outputting the trimmed video as a camerawork video corresponding to the designated camerawork;
a display process of displaying the video and at least two camerawork videos corresponding to at least two specific persons or at least two kinds of camerawork; and
a switching process of outputting one camerawork video designated by a user from among the at least two displayed camerawork videos.

12. A video processing method in which a computer:
acquires a video;
receives designation of a specific person;
receives designation of camerawork;
recognizes a person in the video;
determines a skeleton of a person in the video using the video;
identifies the specific person in the video using the result of the person recognition and, using the skeleton of the specific person among the determined skeletons, identifies the positional relationship of the parts of the specific person on the video;
determines the size of a trimming frame for trimming a portion of the video, using the positional relationship of the parts of the specific person on the video and composition information corresponding to the camerawork;
controls the position, on the video, of the trimming frame corresponding to the designated camerawork, using the positional relationship of the parts of the specific person on the video and the composition information corresponding to the designated camerawork; and
trims the video within the trimming frame from the video and outputs the trimmed video as a camerawork video corresponding to the designated camerawork.

13. A switching method in which a computer:
acquires a video;
receives designation of a specific person;
receives designation of camerawork;
recognizes a person in the video;
determines a skeleton of a person in the video using the video;
identifies the specific person in the video using the result of the person recognition and, using the skeleton of the specific person among the determined skeletons, identifies the positional relationship of the whole body of the specific person on the video;
determines the size of a trimming frame for trimming a portion of the video, using the positional relationship of the whole body of the specific person on the video and composition information corresponding to the designated camerawork;
controls the position, on the video, of the trimming frame corresponding to the designated camerawork, using the positional relationship of the whole body of the specific person on the video and the composition information corresponding to the designated camerawork;
trims the video within the trimming frame from the video and outputs the trimmed video as a camerawork video corresponding to the designated camerawork;
displays the video and at least two camerawork videos corresponding to at least two specific persons or at least two kinds of camerawork; and
outputs one camerawork video designated by a user from among the at least two displayed camerawork videos.