WO2024070761A1 - Information processing device, information processing method, and program - Google Patents

Information processing device, information processing method, and program

Info

Publication number: WO2024070761A1
Authority: WO (WIPO, PCT)
Application number: PCT/JP2023/033687
Prior art keywords: image, camera, overhead, frustum, view
Other languages: French (fr), Japanese (ja)
Inventors: 滉太 今枝, 和平 岡田, 大資 田原, 慧 柿谷
Original Assignee: Sony Group Corporation (ソニーグループ株式会社)
Application filed by Sony Group Corporation

Description

  • This technology relates to an information processing device, an information processing method, and a program, and in particular relates to the display of images of a shooting target space together with virtual images.
  • Japanese Patent Application Laid-Open No. 2003-233693 discloses a technique for displaying the depth of field and the angle of view based on shooting information.
  • Japanese Patent Application Laid-Open No. 2003-233633 discloses expressing the shooting range in a captured image using a trapezoidal figure.
  • Japanese Patent Laid-Open No. 2003-233633 discloses generating and displaying a map image for indicating the depth position and focus position of an object to be imaged.
  • This disclosure therefore proposes technology that displays images that make it easier to understand the correspondence between camera images and positions in space.
  • An information processing device according to the present technology includes an image processing unit that generates image data for simultaneously displaying, on a single screen, an overhead image of a space to be photographed, a shooting range presentation image that presents a camera's shooting range within the overhead image, and the image photographed by that camera.
  • The shooting range presentation image is an image showing the shooting range determined by the shooting direction and zoom angle of view of the camera. When this image showing the camera's shooting range is displayed in the overhead image, the image captured by the camera is also displayed on the same screen.
  • FIG. 1 is an explanatory diagram of photography by a photography system according to an embodiment of the present technology.
  • FIG. 2 is an explanatory diagram of AR (Augmented Reality) superimposed images.
  • FIG. 3 is an explanatory diagram of a system configuration according to the embodiment.
  • FIG. 4 is an explanatory diagram of another example of a system configuration according to the embodiment.
  • FIG. 5 is an explanatory diagram of an environment map according to the embodiment.
  • FIG. 6 is an explanatory diagram of drift correction using the environment map according to the embodiment.
  • FIG. 7 is a block diagram of an information processing apparatus according to the embodiment.
  • FIG. 8 is an explanatory diagram of a view frustum according to the embodiment.
  • FIG. 9 is an explanatory diagram of a display example of a captured image on the focus plane of a view frustum according to the embodiment.
  • FIG. 10 is an explanatory diagram of a display example of a captured image within the depth of field of a view frustum according to the embodiment.
  • FIG. 11 is an explanatory diagram of a display example of a captured image at a position close to the starting point of a view frustum according to the embodiment.
  • FIG. 12 is an explanatory diagram of a display example of a captured image on the far end surface of a view frustum according to the embodiment.
  • FIG. 13 is an explanatory diagram of a case where a view frustum according to the embodiment is set at infinity.
  • FIG. 14 is an explanatory diagram of a change in the display state of a captured image on the far end side of a view frustum according to the embodiment.
  • FIG. 15 is an explanatory diagram of a display example of a captured image outside a view frustum according to the embodiment.
  • FIG. 16 is an explanatory diagram of a display example of captured images inside and outside a plurality of view frustums according to the embodiment.
  • FIG. 17 is an explanatory diagram of a display example of a captured image outside a view frustum according to the embodiment.
  • FIG. 18 is an explanatory diagram of a display example of a captured image outside a view frustum according to the embodiment.
  • FIG. 19 is a flowchart of a processing example of the information processing apparatus according to the embodiment.
  • FIG. 20 is a flowchart of an example of a process for setting the display position of a captured image according to the embodiment.
  • FIG. 21 is a flowchart of an example of a process for setting the display position of a captured image according to the embodiment.
  • FIG. 22 is a flowchart of an example of a process for setting the display position of a captured image according to the embodiment.
  • FIG. 23 is a flowchart of an example of a process for setting the display position of a captured image according to the embodiment.
  • FIG. 24 is a flowchart of an example of a process for setting the display position of a captured image according to the embodiment.
  • FIG. 25 is an explanatory diagram of a collision determination according to the embodiment.
  • FIG. 26 is an explanatory diagram of a collision determination according to the embodiment.
  • 11A and 11B are explanatory diagrams of changes in an overhead view image in the embodiment.
  • FIG. 13 is an explanatory diagram of an overhead view from the director's side in the embodiment.
  • 11A and 11B are diagrams illustrating a determination of an image to be highlighted according to an embodiment.
  • 11 is a flowchart of a processing example of the information processing apparatus according to the embodiment.
  • 11 is a flowchart of an example of a process for highlighting according to an embodiment.
  • 11 is a flowchart of an example of a process for highlighting according to an embodiment.
  • FIG. 11 is an explanatory diagram of a display example based on feedback according to the embodiment.
  • 11 is a flowchart of an example of a display process based on feedback according to an embodiment.
  • 11A and 11B are explanatory diagrams of a display example of overlapping view frustums according to an embodiment;
  • 11 is a flowchart of a processing example of displaying overlapped view frustum according to an embodiment.
  • FIG. 13 is an explanatory diagram of a preferred display of one view frustum according to an embodiment. 13 is a flowchart of a processing example when performing priority display according to the embodiment.
  • FIG. 13 is an explanatory diagram of an example of a display of the instruction frustum on the director side in the embodiment.
  • 11 is an explanatory diagram of an example of a display on the cameraman's side of an instruction frustum according to an embodiment.
  • FIG. 13 is a flowchart of a process for generating an overhead view video according to another embodiment.
  • 11 is an explanatory diagram of an example of a display on the cameraman's side of an instruction frustum according to an embodiment.
  • FIG. 11 is a flowchart of a process for generating an overhead video from a cameraman's side according to an embodiment.
  • 11 is an explanatory diagram of an example of instruction information displayed on the cameraman's side according to the embodiment;
  • FIG. 11 is a flowchart of a process for generating an overhead video from a cameraman's side according to an embodiment.
  • 11 is an explanatory diagram of a display example of a marker frustum according to an embodiment.
  • 11 is an explanatory diagram of a display example of a marker according to an embodiment.
  • 13 is a flowchart of a process example of displaying marker information according to an embodiment.
  • 11A and 11B are explanatory diagrams of a display example of a different overhead view image according to an embodiment.
  • 11A and 11B are explanatory diagrams of a display example of a different overhead view image according to an embodiment.
  • FIG. 13 is an explanatory diagram of a display example on the director side of the embodiment. 13 is a flowchart of a process for generating an overhead view video according to another embodiment.
  • The description will proceed in the following order:
    1. System configuration
    2. Configuration of information processing device
    3. Display of view frustum
    4. Example of cameraman and director screens
       4-1: Highlighted display
       4-2: Priority display
       4-3: Instruction display
       4-4: Marker display
       4-5: Examples of various displays
    5. Summary and modifications
  • In this disclosure, "video" or "image" includes both moving images and still images, but the embodiment will be described taking moving images as an example.
  • FIG. 1 is a schematic diagram showing how an image is captured by the image capturing system.
  • FIG. 1 shows an example in which three cameras 2 are arranged to capture images of a real target space 8.
  • the number of cameras 2 is just an example, and one or more cameras 2 may be used.
  • the subject space 8 may be any location, but one example is a stadium for soccer, rugby, or the like.
  • One camera 2 is a mobile camera 2M that is suspended by a wire 9 and can move above the target space 8. Images and metadata captured by this mobile camera 2M are sent to a render node 7. Also shown as a camera 2 is a fixed camera 2F that is fixedly disposed on, for example, a tripod 6. Images and metadata captured by this fixed camera 2F are sent to the render node 7 via a CCU (Camera Control Unit) 3. The captured images and metadata from the mobile camera 2M may also be sent to the render node 7 via the CCU 3.
  • The term "camera 2" collectively refers to the fixed camera 2F and the mobile camera 2M.
  • The render node 7 here is a CG engine or image processor that generates CG (Computer Graphics) and synthesizes it with live-action video; for example, it is a device that generates AR video.
  • FIGS. 2A and 2B show examples of AR images.
  • In one example, a line that does not actually exist is composited as a CG image 38 into live-action footage of a game being played in a stadium.
  • In the other example, an advertising logo that does not actually exist is composited as a CG image 38 into the live-action footage of the stadium.
  • These CG images 38 can be rendered to look like they exist in reality by appropriately setting the shape, size and synthesis position depending on the position of the camera 2 at the time of shooting, the shooting direction, the angle of view, the structural object photographed, etc.
  • the process of generating AR overlay images by combining CG with such live-action footage is already known.
  • The filming system of this embodiment also enables the cameraman and director involved in the video production to perform production tasks such as shooting and giving instructions while viewing the AR overlay image. This allows filming to be performed while checking the fusion state of the real scene and the virtual image, making it possible to produce videos that are in line with the creative intent.
  • In this embodiment, a shooting range presentation image that is suitable for the viewer of the monitor image, such as the cameraman or director, is displayed.
  • Two configuration examples of the imaging system are shown in FIG. 3 and FIG. 4.
  • Camera systems 1 and 1A, a control panel 10, a GUI (Graphical User Interface) device 11, a network hub 12, a switcher 13, and a master monitor 14 are shown.
  • the dashed arrows indicate the flow of various control signals CS, while the solid arrows indicate the flow of each of the image data of the shot image V1, the AR superimposed image V2, and the overhead image V3.
  • Camera system 1 is configured to perform AR linkage, while camera system 1A is configured not to perform AR linkage.
  • A mobile camera 2M may also be used in the camera systems 1 and 1A.
  • the camera system 1 includes a camera 2, a CCU 3, for example an AI (artificial intelligence) board 4 built into the CCU 3, and an AR system 5.
  • the camera 2 sends video data of the shot video V1 and metadata MT to the CCU 3.
  • the CCU 3 sends the video data of the shot video V1 to the switcher 13.
  • the CCU 3 also sends the video data of the shot video V1 and metadata MT to the AR system 5.
  • The metadata MT includes lens information such as the zoom angle of view and focal length when the captured image V1 was captured, and sensor information from, for example, the IMU (Inertial Measurement Unit) mounted on the camera 2. Specifically, this information includes the 3DoF (Degrees of Freedom) attitude information of the camera 2, acceleration information, lens focal length, aperture value, zoom angle of view, lens distortion, and the like.
  • This metadata MT is output from the camera 2, for example, as frame-synchronized or asynchronous information.
  • When the camera 2 is a fixed camera 2F, its position does not change, so the camera position information only needs to be stored as a known value by the CCU 3 and the AR system 5.
  • In the case of the mobile camera 2M, the position information is also included in the metadata MT transmitted successively from the camera 2M.
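  • As a non-authoritative sketch, the per-frame metadata MT described above could be modeled as a simple record like the following; the field names and types are assumptions made for illustration, not definitions from this publication.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class CameraMetadata:
    """Illustrative shape of the metadata MT output by a camera 2 for each frame."""
    yaw_deg: float                                   # 3DoF attitude (shooting direction)
    pitch_deg: float
    roll_deg: float
    acceleration: Tuple[float, float, float]         # IMU acceleration information
    focal_length_mm: float
    aperture_f_number: float
    zoom_angle_of_view_deg: float
    lens_distortion: float
    position: Optional[Tuple[float, float, float]] = None  # sent successively only by the mobile camera 2M
```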
  • the AR system 5 is an information processing apparatus including a rendering engine that performs CG rendering.
  • The information processing apparatus as the AR system 5 is an example of the render node 7 shown in FIG. 1.
  • the AR system 5 generates video data of an AR superimposed video V2 by superimposing an image 38 generated by CG on a video V1 captured by the camera 2.
  • the AR system 5 sets the size and shape of the image 38 by referring to the metadata MT, and also sets the synthesis position within the captured video V1, thereby generating video data of an AR superimposed video V2 in which the image 38 is naturally synthesized with the actual scenery.
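  • The publication does not spell out the compositing step itself, but a generic way to superimpose a rendered CG image 38 onto a captured frame is a straightforward alpha blend, sketched below with assumed array layouts (RGB frame, RGBA CG layer).

```python
import numpy as np

def superimpose_cg(v1_frame: np.ndarray, cg_rgba: np.ndarray) -> np.ndarray:
    """Alpha-blend a rendered CG image 38 (H x W x 4, RGBA) onto a captured
    frame V1 (H x W x 3, RGB) to obtain one frame of the AR superimposed video V2."""
    alpha = cg_rgba[..., 3:4].astype(np.float32) / 255.0
    blended = (cg_rgba[..., :3].astype(np.float32) * alpha
               + v1_frame.astype(np.float32) * (1.0 - alpha))
    return blended.astype(np.uint8)
```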
  • the AR system 5 also generates video data of a CG overhead image V3, as described later.
  • This video data is the overhead image V3, which reproduces the target space 8 in CG.
  • the AR system 5 displays a view frustum 40 as shown in FIG. 8, which will be described later, in the overhead image V3 as a shooting range presentation image that visually presents the shooting range of the camera 2.
  • the AR system 5 calculates the shooting range in the shooting target space 8 from the metadata MT and position information of the camera 2.
  • the shooting range of the camera 2 can be obtained by acquiring the position information of the camera 2, the angle of view, and the attitude information (corresponding to the shooting direction) of the camera 2 in the three axial directions (yaw, pitch, roll) on the tripod 6.
  • the AR system 5 generates an image as a view frustum 40 in accordance with the calculation of the shooting range of the camera 2.
  • the AR system 5 generates image data of the overhead image V3 so that the view frustum 40 is presented from the position of the camera 2 in the overhead image V3 corresponding to the target space 8.
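  • As a rough illustration of how the shooting range can be derived from the metadata MT, the following sketch computes direction vectors for the four edges of the quadrangular pyramid from the attitude (yaw, pitch) and the zoom angle of view; roll and lens distortion are ignored, and the axis convention is an assumption.

```python
import numpy as np

def frustum_edge_directions(yaw_deg, pitch_deg, h_fov_deg, v_fov_deg):
    """Approximate unit direction vectors of the four edges of the view frustum 40,
    spreading from the frustum origin 46 (the camera position)."""
    edges = []
    for sy in (-0.5, 0.5):           # left/right half of the horizontal angle of view
        for sp in (-0.5, 0.5):       # bottom/top half of the vertical angle of view
            y = np.radians(yaw_deg + sy * h_fov_deg)
            p = np.radians(pitch_deg + sp * v_fov_deg)
            # x forward, y left, z up (convention assumed for this sketch)
            edges.append(np.array([np.cos(p) * np.cos(y),
                                   np.cos(p) * np.sin(y),
                                   np.sin(p)]))
    return edges
```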
  • the term "bird's-eye view image” refers to an image from a bird's-eye view of the target space 8, but does not necessarily have to display the entire target space 8 within the image.
  • An image that includes at least a portion of the view frustum 40 of the camera 2 and the surrounding space is referred to as a bird's-eye view image.
  • the overhead image V3 is generated as an image expressing the shooting target space 8 such as a stadium by CG, but the overhead image V3 may be generated by real-life images.
  • For example, a camera 2 serving as the viewpoint of the overhead image may be provided, and the image V1 shot by that camera 2 may be used to generate the overhead image V3.
  • the image V1 shot by a camera 2M moving in the sky on a wire 9 may be used as the overhead image V3. Furthermore, a 3D (three dimensions)-CG model of the shooting target space 8 may be generated using the images V1 shot by multiple cameras 2, and the viewpoint position may be set for the 3D-CG model and rendered to generate an overhead image V3 with a variable viewpoint position.
  • the video data of the AR superimposed image V2 and the overhead image V3 by the AR system 5 is supplied to a switcher 13. Furthermore, the image data of the AR superimposed image V2 and the overhead image V3 by the AR system 5 is supplied to the camera 2 via the CCU 3. This allows the cameraman of the camera 2 to visually recognize the AR superimposed image V2 and the overhead image V3 on a display unit such as a viewfinder.
  • the image data of the AR superimposed image V2 and the overhead image V3 by the AR system 5 may be supplied to the camera 2 without going through the CCU 3. Furthermore, there are also examples in which the CCU 3 is not used in the camera systems 1 and 1A.
  • the AI board 4 in the CCU 3 performs processing to calculate the amount of drift of the camera 2 from the captured image V1 and metadata MT.
  • the positional displacement of the camera 2 is obtained by integrating twice the acceleration information from the IMU mounted on the camera 2.
  • By accumulating the amount of displacement at each time point from a certain reference origin attitude (the reference attitude for each of the three axes of yaw, pitch, and roll), the yaw, pitch, and roll at each time point, that is, the attitude information corresponding to the shooting direction of the camera 2, is obtained.
  • However, repeated accumulation increases the deviation (accumulated error) between the actual attitude and the calculated attitude. This amount of deviation is called the drift amount.
  • the AI board 4 calculates the amount of drift using the captured image V1 and the metadata MT. Then, the calculated amount of drift is sent to the camera 2 side.
  • the camera 2 receives the drift amount from the CCU 3 (AI board 4) and corrects the attitude information of the camera 2. Then, the camera 2 outputs metadata MT including the corrected attitude information.
  • The above drift correction will be explained with reference to FIGS. 5 and 6. FIG. 5 shows the environment map 35.
  • the environment map 35 stores feature points and feature amounts in the coordinates of a virtual dome, and is generated for each camera 2.
  • the camera 2 is rotated 360 degrees, and an environment map 35 is generated in which feature points and feature quantities are registered in global position coordinates on the celestial sphere. This makes it possible to restore the orientation even if it is lost during feature point matching.
  • FIG. 6A shows a schematic diagram of a state in which a drift amount DA occurs between the imaging direction Pc in the correct attitude of the camera 2 and the imaging direction Pj calculated from the IMU data.
  • Information on the three-axis motion, angle, and field of view of the camera 2 is sent from the camera 2 to the AI board 4 as a guide for feature point matching.
  • the AI board 4 detects the accumulated drift amount DA by feature point matching of image recognition, as shown in FIG. 6B.
  • the "+" in the figure indicates a feature point of a certain feature amount registered in the environment map 35 and a feature point of the corresponding feature amount in the frame of the current captured image V1, and the arrow between them is the drift amount vector. In this way, by detecting a coordinate error by feature point matching and correcting the coordinate error, the drift amount can be corrected.
  • the AI board 4 determines the amount of drift by this type of feature point matching, and the camera 2 transmits corrected metadata MT based on this, thereby improving the accuracy of the attitude information of the camera 2 detected in the AR system 5 based on the metadata MT.
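  • The following is a minimal sketch of the idea behind this correction: the drift amount DA is estimated as the displacement between feature points registered in the environment map 35 and the matching feature points found in the current frame, and that amount is subtracted from the IMU-derived attitude. Representing feature points as (yaw, pitch) coordinates on the virtual celestial sphere is an assumption made for illustration.

```python
import numpy as np

def estimate_drift(map_points: np.ndarray, observed_points: np.ndarray) -> np.ndarray:
    """Drift amount DA as the mean displacement between matched feature points
    (both arrays are N x 2, in (yaw, pitch) degrees on the celestial sphere)."""
    return (observed_points - map_points).mean(axis=0)

def correct_attitude(imu_attitude: np.ndarray, drift: np.ndarray) -> np.ndarray:
    """Subtract the accumulated drift from the IMU-derived (yaw, pitch) attitude."""
    return imu_attitude - drift
```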
  • Camera system 1A in FIG. 3 is an example having a camera 2 and a CCU 3, but not an AR system 5.
  • Video data and metadata MT of the shot video V1 are transmitted from the camera 2 of camera system 1A to the CCU 3.
  • the CCU 3 transmits the video data of the shot video V1 to the switcher 13.
  • the video data of the captured image V1, AR superimposed image V2, and overhead image V3 output from the camera system 1, 1A is supplied to the GUI device 11 via the switcher 13 and network hub 12.
  • the switcher 13 selects the so-called main line video from among the images V1 captured by the multiple cameras 2, the AR superimposed video V2, and the overhead video V3.
  • the main line video is the video output for broadcasting or distribution.
  • the switcher 13 outputs the selected video data to a transmitting device or recording device (not shown) as the main line video for broadcasting or distribution.
  • the video data of the video selected as the main line video is sent to the master monitor 14 and displayed thereon, so that the video production staff can check the main line video.
  • the master monitor 14 may display an AR superimposed image V2, an overhead image V3, etc. in addition to the main line image.
  • the control panel 10 is a device that allows video production staff to operate the switcher 13 to give switching instructions, video processing instructions, and various other instructions.
  • the control panel 10 outputs a control signal CS in response to operations by the video production staff.
  • This control signal CS is sent via the network hub 12 to the switcher 13 and the camera systems 1 and 1A.
  • the GUI device 11 is, for example, a PC or a tablet device, and is a device that enables video production staff, such as a director, to check the video and give various instructions.
  • the captured image V1, the AR superimposed image V2, and the overhead image V3 are displayed on the display screen of the GUI device 11.
  • For example, the captured images V1 from the multiple cameras 2 are displayed as a list on a split screen, and the AR superimposed image V2 and the overhead image V3 are also displayed.
  • an image selected by the switcher 13 as a main line image is displayed.
  • the GUI device 11 is also provided with an interface for a director or the like to perform various instruction operations.
  • the GUI device 11 outputs a control signal CS in response to an operation by the director or the like.
  • This control signal CS is transmitted via a network hub 12 to a switcher 13 and the camera systems 1 and 1A.
  • a control signal CS corresponding to the instruction is transmitted to the AR system 5, and the AR system 5 generates video data of an overhead video V3 including a view frustum 40 in a display format corresponding to an instruction from a director or the like.
  • The configuration in FIG. 3 has camera systems 1 and 1A. Camera system 1 is a set of a camera 2, a CCU 3, and an AR system 5; in particular, because it has the AR system 5, video data of the AR superimposed video V2 and the overhead video V3 corresponding to the video V1 captured by its camera 2 is generated. The AR superimposed video V2 and the overhead video V3 are then displayed on a display unit such as the viewfinder of the camera 2, displayed on the GUI device 11, or selected as the main line video by the switcher 13. On the camera system 1A side, on the other hand, image data of the AR superimposed image V2 and the overhead image V3 corresponding to the captured image V1 of its camera 2 is not generated. Therefore, FIG. 3 shows a system in which cameras 2 that perform AR linkage and cameras 2 that perform normal shooting are mixed.
  • FIG. 4 is an example of a system in which one AR system 5 handles a plurality of cameras 2. In FIG. 4, a plurality of camera systems 1A are provided, and the AR system 5 is provided independently of the camera systems 1A.
  • the CCU 3 of each camera system 1A sends the video data and metadata MT of the shot video V1 from the camera 2 to the switcher 13.
  • the video data and metadata MT of the shot video V1 are then supplied from the switcher 13 to the AR system 5.
  • This allows the AR system 5 to acquire the video data and metadata MT of the captured video V1 for each camera system 1A, and generate video data of the AR superimposed video V2 corresponding to the captured video V1 of each camera system 1A, and video data of the overhead video V3 including the view frustum 40 corresponding to each camera system 1A.
  • the AR system 5 can generate video data of the overhead video V3 in which the view frustums 40 of the cameras 2 of the multiple camera systems 1A are collectively displayed.
  • the video data of the AR superimposed image V2 and the overhead image V3 generated by the AR system 5 is sent to the CCU 3 of the camera system 1A via the switcher 13, and then sent to the camera 2. This allows the cameraman to view the AR superimposed image V2 and the overhead image V3 on a display such as the viewfinder of the camera 2.
  • the video data of the AR overlay image V2 and the overhead image V3 generated by the AR system 5 is transmitted to the GUI device 11 via the switcher 13 and the network hub 12 and displayed. This allows the director and others to visually confirm the AR overlay image V2 and the overhead image V3.
  • In the drawings, the overhead image V3 is denoted as "V3-1" and "V3-2".
  • the video data of the overhead image V3-1 is the video data of the overhead image V3 to be displayed on the GUI device 11 or the master monitor 14, with a director or the like assumed as the viewer.
  • the video data of the overhead image V3-2 is the video data of the overhead image V3 to be displayed on the viewfinder of the camera 2, with a cameraman or the like assumed as the viewer.
  • the video data for these overhead images V3-1 and V3-2 may be video data that displays images of the same content. Both of these are video data that display an overhead image V3 of the target space 8 that includes at least the view frustum 40. However, in the embodiment, a case will also be described in which these are video data that include different display contents.
  • the AR system 5 may generate video data that will become an overhead image V3 with the same video content regardless of the transmission destination, or may generate, for example, video data of a first overhead image V3-1 to be transmitted to the GUI device 11 and video data of a second overhead image V3-2 to be transmitted to the camera 2 in parallel. Furthermore, in the case of the system of FIG. 4, it is also assumed that the AR system 5 generates multiple second overhead images V3-2 in parallel so that the content differs for each camera 2.
  • the information processing device 70 is a device capable of information processing, particularly video processing, such as a computer device.
  • Specific examples of the information processing device 70 include personal computers, workstations, mobile terminal devices such as smartphones and tablets, video editing devices, etc.
  • the information processing device 70 may also be a computer device configured as a server device or a computing device in cloud computing.
  • the CPU 71 of the information processing device 70 executes various processes according to programs stored in the ROM 72 or a non-volatile memory unit 74, such as an EEPROM (Electrically Erasable Programmable Read-Only Memory), or programs loaded from the storage unit 79 to the RAM 73.
  • the RAM 73 also stores data necessary for the CPU 71 to execute various processes, as appropriate.
  • the CPU 71 is configured as a processor that performs various types of processing.
  • the CPU 71 performs overall control processing and various types of calculation processing, but in this embodiment, it also has the functions of an image processing unit 71a and an image generation control unit 71b in order to execute image processing as the AR system 5 based on a program.
  • the video processing unit 71a has a processing function for performing various types of video processing. For example, it performs one or more of the following: 3D model generation processing, rendering, video processing including color and brightness adjustment processing, video editing processing, video analysis and detection processing, etc.
  • The video processing unit 71a also performs processing to generate video data that simultaneously displays, on a single screen, an overhead image V3 of the target space 8, a view frustum 40 that shows the shooting range of a camera 2 within the overhead image V3, and the captured image V1 of that camera 2.
  • The image generation control unit 71b in the CPU 71 variably sets the display position of the captured image V1 that is to be displayed simultaneously on one screen in the overhead image V3 including the view frustum 40 generated by the image processing unit 71a, and controls the generation of image data by the image processing unit 71a.
  • the image processing unit 71a generates the overhead image V3 including the view frustum 40 according to the settings of the image generation control unit 71b.
  • the image processing unit 71a may also perform in parallel a process of generating first image data that displays the view frustum 40 of the camera 2 within the target space 8, and a process of generating second image data that displays an image of the view frustum 40 within the target space 8, the image having a different display mode from the image generated by the first image data.
  • the first video data is, for example, video data of the overhead view V3-1
  • the second video data is, for example, video data of the overhead view V3-2.
  • the functions of the image processing unit 71a and the image generation control unit 71b may be realized by a CPU separate from the CPU 71, a GPU (Graphics Processing Unit), a GPGPU (General-purpose computing on graphics processing units), an AI (artificial intelligence) processor, etc.
  • The functions of the video processing unit 71a and the video generation control unit 71b may also be realized by a plurality of processors.
  • the CPU 71, ROM 72, RAM 73, and non-volatile memory unit 74 are interconnected via a bus 83.
  • the input/output interface 75 is also connected to this bus 83.
  • An input unit 76 consisting of operators and operation devices is connected to the input/output interface 75.
  • the input unit 76 may be various operators and operation devices such as a keyboard, a mouse, a key, a trackball, a dial, a touch panel, a touch pad, a remote controller, or the like.
  • An operation by the user is detected by the input unit 76, and a signal corresponding to the input operation is interpreted by the CPU 71.
  • a microphone may also be used as the input unit 76. Voice uttered by the user may also be input as operation information.
  • the input/output interface 75 is connected, either integrally or separately, to a display unit 77 formed of an LCD (Liquid Crystal Display) or an organic EL (electro-luminescence) panel, or the like, and an audio output unit 78 formed of a speaker, or the like.
  • the display unit 77 is a display unit that performs various displays, and is configured, for example, by a display device provided in the housing of the information processing device 70, or a separate display device connected to the information processing device 70, or the like.
  • the display unit 77 displays various images, operation menus, icons, messages, etc., on the display screen based on instructions from the CPU 71, that is, displays them as a GUI (Graphical User Interface).
  • The input/output interface 75 may also be connected to a storage unit 79, which may be configured using a hard disk drive (HDD) or solid-state memory, and to a communication unit 80.
  • the storage unit 79 can store various data and programs.
  • a database can also be configured in the storage unit 79.
  • The communication unit 80 performs communication processing via a transmission path such as the Internet, and communication with various devices such as external databases, editing devices, and information processing devices via wired/wireless communication, bus communication, and the like. In the case of the information processing device 70 as the AR system 5, for example, communication with the CCU 3 and the switcher 13 is performed via the communication unit 80.
  • a drive 81 is also connected to the input/output interface 75 as required, and a removable recording medium 82 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is appropriately mounted thereon.
  • the drive 81 allows video data, various computer programs, and the like to be read from the removable recording medium 82.
  • the read data is stored in the storage unit 79, and the video and audio contained in the data are output on the display unit 77 and the audio output unit 78.
  • the computer programs, etc. read from the removable recording medium 82 are installed in the storage unit 79 as necessary.
  • software for the processing of this embodiment can be installed via network communication by the communication unit 80 or via a removable recording medium 82.
  • the software may be stored in advance in the ROM 72, the storage unit 79, etc.
  • the AR system 5 generates the overhead image V3 and can transmit it to the viewfinder of the camera 2, the GUI device 11, or the like for display.
  • the AR system 5 generates video data for the overhead image V3 so as to display the view frustum 40 of the camera 2 within the overhead image V3.
  • Fig. 8 shows an example of a view frustum 40 displayed in the overhead image V3.
  • Fig. 8 shows an example of a CG image of the subject space 8 in Fig. 1 as viewed from above, but for the sake of explanation, it is shown in a simplified form.
  • the overhead image V3 in Fig. 8 includes an image showing a background 31, such as a stadium, and a person 32, such as a player.
  • the overhead image V3 may or may not include an image of the camera 2 itself.
  • the view frustum 40 visually presents the shooting range of the camera 2 within the overhead image V3, and has a pyramid shape that spreads in the direction of the shooting optical axis with the position of the camera 2 within the overhead image V3 as the frustum origin 46.
  • it is a pyramid shape extending from the frustum origin 46 to the frustum far end surface 45.
  • The reason it is a quadrangular pyramid is that the image sensor of the camera 2 is quadrangular.
  • the extent of the spread of the pyramid changes depending on the angle of view of the camera 2 at that time. Therefore, the range of the pyramid indicated by the view frustum 40 is the shooting range of the camera 2.
  • the view frustum 40 may be represented as a pyramid with a semi-transparent colored image.
  • the view frustum 40 displays a focus plane 41 and a depth of field range 42 at that time inside a quadrangular pyramid.
  • As the depth of field range 42, for example, the range from a near depth end surface 43 to a far depth end surface 44 is expressed in a translucent color different from the rest.
  • the focus plane 41 is also expressed in a semi-transparent color that is different from the others.
  • the focus plane 41 indicates the depth position at which the camera 2 is focused at that point in time.
  • a subject at a depth (distance in the depth direction as seen from the camera 2) equivalent to the focus plane 41 is in focus.
  • the depth of field range 42 makes it possible to confirm the range in the depth direction in which the subject is not blurred.
  • the in-focus depth and the depth of field vary depending on the focus operation and aperture operation of the camera 2. Therefore, the focus plane 41 and the depth of field range 42 in the view frustum 40 vary each time.
  • the AR system 5 can set the pyramidal shape of the view frustum 40, the display position of the focus plane 41, the display position of the depth of field range 42, and the like, by acquiring metadata MT from the camera 2, which includes information such as focal length, aperture value, and angle of view. Furthermore, since the metadata MT includes attitude information of the camera 2, the AR system 5 can set the direction of the view frustum 40 from the camera position (frustum origin 46) in the overhead image V3.
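  • For reference, the near and far limits that would bound the depth of field range 42 around the focus plane 41 can be estimated from the focal length, aperture value, and focus distance with the standard thin-lens approximation shown below; the circle-of-confusion value is an assumption and is not taken from this publication.

```python
def depth_of_field_limits(focal_length_mm, f_number, focus_distance_mm, coc_mm=0.03):
    """Approximate distances of the near depth end surface 43 and the far depth
    end surface 44, given the focus plane 41 distance (all values in millimeters)."""
    hyperfocal = focal_length_mm ** 2 / (f_number * coc_mm) + focal_length_mm
    s = focus_distance_mm
    near = hyperfocal * s / (hyperfocal + (s - focal_length_mm))
    far = float("inf") if s >= hyperfocal else hyperfocal * s / (hyperfocal - (s - focal_length_mm))
    return near, far
```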
  • the AR system 5 displays, together with the view frustum 40, the image V1 captured by the camera 2 in which the view frustum 40 is shown in the overhead image V3. That is, the AR system 5 generates an image of the CG space 30 to be used as the overhead image V3, synthesizes the image of the CG space 30 with the view frustum 40 generated based on the metadata MT supplied from the camera 2, and further synthesizes the image V1 captured by the camera 2. The image data of such a synthesized image is output as the overhead image V3.
  • With reference to FIG. 9 and subsequent figures, examples will be described in which a view frustum 40 in an image of the CG space 30 and a photographed image V1 are simultaneously displayed on one screen.
  • the AR system 5 generates video data of an overhead video V3 in which the captured video V1 is displayed within the view frustum 40.
  • In other words, this is an example of generating video data in which the captured video V1 is displayed arranged within the range of the view frustum 40.
  • Figure 9 shows an example in which the captured image V1 is displayed on the focus plane 41 in the view frustum 40. This makes it possible to view the image captured at the focus position.
  • the example in Figure 9 is also one example in which the captured image V1 is displayed within the depth of field range 42.
  • FIG. 10 shows an example in which a captured image V1 is displayed on a surface other than the focus surface 41 within the depth of field range 42 in the view frustum 40.
  • In this case, the captured image V1 is displayed on the far depth end surface 44 of the depth of field.
  • examples are also conceivable in which the captured image V1 is displayed on the near depth end surface 43, or at a depth position midway within the depth of field range 42.
  • FIG. 11 shows an example in which the captured image V1 is displayed within the view frustum 40 at a position (surface 47 near the frustum origin) closer to the frustum origin 46 than the near-depth end surface 43 of the depth-of-field range 42.
  • the size of the captured image V1 becomes smaller the closer it is to the frustum origin 46, but by displaying it on the surface 47 near the frustum origin in this way, the focus plane 41, depth-of-field range 42, etc. become easier to see.
  • FIG. 12 shows an example in which a captured image V1 is displayed on the far side of a far end surface 44 of a depth of field range 42 within a view frustum 40.
  • Here, "far" means far from the viewpoint of the camera 2 (the frustum starting point 46).
  • the captured image V1 is displayed on the frustum far end surface 45, which is located at the far side.
  • When the photographed image V1 is displayed on the far side of the depth of field range 42 within the view frustum 40, the area of the photographed image V1 can be made large. This is therefore suitable for checking the position of the focus plane 41 and the depth of field range 42 while carefully checking the content of the photographed image V1.
  • the distance of the rendered view frustum 40 may be finite or infinite.
  • the view frustum 40 may be rendered at a finite distance, such as the rendering distance d1 in Fig. 12.
  • For example, the rendering distance d1 may be twice the distance from the frustum starting point 46 to the focus plane 41. By doing so, the frustum far end surface 45 is determined, so that the photographed image V1 can be displayed in the widest area within the view frustum 40, as shown in FIG. 12.
  • the view frustum 40 may be rendered at infinity as shown in FIG. 13 without any particular rendering distance.
  • In this case, the frustum far end surface 45 is not fixed at a constant position.
  • the captured image V1 may be displayed at an indefinite position farther away than the depth of field range 42.
  • the far end of the rendering range is set as the frustum far end surface 45.
  • FIGS. 14A and 14B show that when the view frustum 40 is rendered up to the position of a wall W, the position at which it collides with the wall W becomes the frustum far end surface 45. In other words, the frustum far end surface 45 changes depending on the positional relationship with objects created by CG.
  • the far end of the range that can be drawn in the overhead image V3 is the frustum far end surface 45, and the captured image V1 is displayed on that frustum far end surface 45.
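  • One way to realize this behavior is to cast a ray along the frustum axis and clip the rendering distance at the nearest CG object it hits, as in the sketch below; the obstacle interface is an assumption made for illustration.

```python
def far_end_distance(origin, axis_dir, obstacles, default_distance=float("inf")):
    """Distance from the frustum origin 46 to the frustum far end surface 45:
    the nearest collision with a CG object (e.g. the wall W or the ground GR),
    or the default rendering distance when nothing is hit."""
    hits = [obstacle.intersect_ray(origin, axis_dir) for obstacle in obstacles]
    hits = [d for d in hits if d is not None and d > 0]
    return min(hits, default=default_distance)
```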
  • In the above examples, the photographed image V1 is displayed within the view frustum 40, but it may also be displayed at a position outside the view frustum 40 within the same screen as the overhead image V3.
  • FIG. 15 shows four examples (captured images V1w, V1x, V1y, and V1z) of display positions outside the view frustum 40. In particular, these four examples are ones in which the captured image V1 is displayed near the view frustum 40.
  • the captured image V1 may be displayed near the far end surface 45 of the frustum as captured image V1w.
  • the captured image V1 can also be displayed near the focus plane 41 (or depth of field range 42) as in the captured image V1y in FIG. 15. In this case, it becomes easier to view the captured image V1 together with the focus plane 41 or depth of field range 42, which are areas of the view frustum 40 that are likely to be noticed by the viewer.
  • the captured image V1 can also be displayed near the camera 2 (or the frustum starting point 46) as the captured image V1z. In this case, the relationship between the camera 2 and the captured image V1 by that camera 2 becomes easier to understand.
  • the color of the frame of the captured image V1 may be matched with the semi-transparent color or the color of the contour of the corresponding view frustum 40 to indicate the correspondence.
  • In FIG. 16, view frustums 40a, 40b, and 40c corresponding to three cameras 2 are displayed within the overhead image V3.
  • captured images V1a, V1b, and V1c corresponding to these view frustums 40a, 40b, and 40c are also displayed.
  • the photographed image V1a is displayed on a frustum far end surface 45 of the view frustum 40a.
  • the photographed image V1b is displayed in the vicinity of a frustum starting point 46 of the view frustum 40b (in the vicinity of the camera position).
  • the captured image V1c is displayed in a corner of the screen, but is displayed in the upper left corner, which is closest to the view frustum 40c, among the four corners of the overhead image V3.
  • the image V1 captured by the mobile camera 2 may be displayed fixedly in a corner of the screen, for example.
  • the above Figure 16 is an example of an overhead image V3 in which the target space 8 is viewed from diagonally above, but the AR system 5 may also display a planar overhead image V3 viewed from directly above, as shown in Figure 17.
  • cameras 2a, 2b, 2c, and 2d, their corresponding view frustums 40a, 40b, 40c, and 40d, and the captured images V1a, V1b, V1c, and V1d are displayed as an overhead image V3.
  • the captured images V1a, V1b, V1c, and V1d are displayed near the corresponding cameras 2a, 2b, 2c, and 2d, respectively.
  • the AR system 5 may be configured so that the viewpoint direction of the overhead image V3 shown in Figures 16 and 17 can be continuously changed by the viewer operating the GUI device 11, etc.
  • the view frustums 40a and 40b are displayed, and the images V1a and V1b captured by the cameras 2 of the view frustums 40a and 40b are displayed in the corners of the screen or near the camera positions.
  • the shooting conditions can be easily understood by displaying each of the view frustums 40 and shot images V1, as in the example shown in the figure.
  • the AR system 5 displays the view frustum 40 of the camera 2 in the CG space 30, and generates video data for an overhead image V3 so that the captured image V1 of the camera 2 is also displayed at the same time.
  • By viewing this overhead image V3 on the camera 2 or the GUI device 11, viewers such as the cameraman or director can easily understand the shooting situation.
  • the viewer can easily understand what each camera 2 is capturing, where the camera is focused, and so on.
  • the director can very easily grasp the relative positions of the cameras, the relationship between the shooting directions, the subject being shot, etc. This allows the director to give appropriate instructions. From the director's point of view, it is enough to know the general content of each shot image V1. Therefore, there is no problem even if the shot image V1 is relatively small in the overhead image V3.
  • the director can check and simulate the composition, standing position, and camera position while taking into consideration the overall situation of each camera 2.
  • the cameraman can perform the focusing operation by looking at the depth of field range 42 of the view frustum 40.
  • the user can easily check the location and direction being photographed within the overhead image V3 of the subject space 8 represented by CG.
  • the user can see the view frustum 40 and the captured image V1 of the other camera 2 and reflect them in the operation of his/her own camera.
  • the user can also grasp the relationship between the contents of the images captured by the other camera 2, the direction of the subject, etc., and therefore can perform preferable shooting in relation to the other camera 2.
  • the user can check the position and angle of view of the other camera 2 and shoot from a different position and angle of view with his/her own camera 2.
  • the overhead image V3 increases the amount of information (captured image V1, position, etc.), making it easier to grasp the situation on-site.
  • FIG. 19 shows an example of processing by the AR system 5 that generates video data for the overhead view video V3.
  • the video data for the overhead view video V3 is video data in which the view frustum 40 and the captured video V1 are synthesized into the CG space 30, which corresponds to the subject space 8.
  • it is video data for displaying the images shown in FIGS. 9 to 18.
  • the AR system 5 performs the processes from step S101 to step S107 in FIG. 19 for each frame of the video data of the overhead video V3, for example. These processes can be considered as control processes of the CPU 71 (video processing unit 71a, video generation control unit 71b) in the information processing device 70 in FIG. 7 as the AR system 5.
  • In step S101, the AR system 5 sets the CG space 30. For example, it sets the viewpoint position of the CG space 30 corresponding to the shooting target space 8, and renders an image of the CG space 30 from that viewpoint position. If there is no change in the viewpoint position or image content from the previous frame, the CG space image of the previous frame can be used for the current frame as well.
  • In step S102, the AR system 5 inputs the captured image V1 and metadata MT from the camera 2. That is, the captured image V1 of the current frame, and the attitude information, focal length, angle of view, aperture value, and the like of the camera 2 at that frame timing are acquired. When one AR system 5 displays the view frustums 40 and captured images V1 for a plurality of cameras 2 as shown in FIG. 4, the AR system 5 inputs the captured image V1 and metadata MT of each camera 2.
  • When there are multiple camera systems 1 in which the cameras 2 and AR systems 5 correspond 1:1 as shown in FIG. 3, and each AR system 5 generates an overhead image V3 including multiple view frustums 40 and captured images V1, it is preferable for these AR systems 5 to work together so as to share the metadata MT and captured images V1 of their corresponding cameras 2.
  • In step S103, the AR system 5 generates a view frustum 40 for the current frame.
  • the AR system 5 sets the direction of the view frustum 40 in the CG space 30 according to the attitude of the camera 2, the quadrangular pyramid shape according to the angle of view, the positions of the focus plane 41 and the depth of field range 42 based on the focal length and aperture value, and the like, and generates an image of the view frustum 40 according to the settings.
  • When displaying the view frustums 40 for a plurality of cameras 2, the AR system 5 generates an image of each view frustum 40 according to the metadata MT of the respective camera 2.
  • In step S104, the AR system 5 sets the display position for the captured image V1 acquired in step S102.
  • In step S105, the AR system 5 synthesizes the view frustums 40 corresponding to one or more cameras 2 and the captured images V1 into the CG space 30 that becomes the overhead image V3, generating one frame of image data of the overhead image V3.
  • In step S106, the AR system 5 outputs the one frame of video data of the overhead video V3.
  • The above process is repeated until the display of the view frustum 40 and the captured image V1 is finished.
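  • The per-frame flow of steps S101 to S106 can be summarized by the following sketch; every method name is an assumption introduced for illustration and does not come from this publication.

```python
def run_frustum_display_loop(ar_system, cameras):
    """Sketch of the per-frame processing of FIG. 19 for one or more cameras 2."""
    while ar_system.display_requested():                          # repeat until the display ends
        cg_space = ar_system.set_cg_space()                       # S101: set or reuse the CG space 30
        layers = []
        for camera in cameras:
            v1, metadata_mt = camera.read_frame()                 # S102: captured video V1 and metadata MT
            frustum = ar_system.build_view_frustum(metadata_mt)   # S103: direction, pyramid, focus plane, DoF
            placement = ar_system.set_display_position(frustum, v1)  # S104: where V1 is shown
            layers.append((frustum, v1, placement))
        frame_v3 = ar_system.composite(cg_space, layers)          # S105: one frame of the overhead image V3
        ar_system.output(frame_v3)                                # S106: output the frame
```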
  • In this way, the overhead image V3 as shown in FIGS. 9 to 18 is displayed.
  • FIGS. 23 and 24 show examples in which the display position of the photographed video V1 is set variably.
  • FIGS. 20, 21, 22, 23, and 24 are examples of display position setting processes for the captured image V1 corresponding to one camera 2.
  • The processes shown in FIGS. 20 to 24 may be performed for each camera 2.
  • the same display position setting process may be performed for each camera 2, or different display position setting processes may be performed.
  • FIG. 20 shows a display position setting process when the photographed image V1 is displayed on the focus plane 41 as in FIG. 9.
  • First, the AR system 5 determines the size and shape of the focus plane 41 in the view frustum 40 generated for the current frame in step S103 of FIG. 19.
  • The AR system 5 then sets the size and shape of the captured image V1 so as to match the focus plane 41.
  • the shape of the captured image V1 to be synthesized within the view frustum 40 may be the cross-sectional shape of that view frustum 40.
  • the shape of the focus plane 41 differs depending on the viewpoint of the overhead image V3 and the position and direction of the view frustum 40 to be displayed, but may be the shape of a cross section cut perpendicular to the optical axis of the camera 2 at the focus plane 41 of the view frustum 40 in that frame. Therefore, when the photographed image V1 is displayed within the view frustum 40, the photographed image V1 is transformed into a cross-sectional shape perpendicular to the optical axis and then synthesized.
  • As a result, in step S105 of FIG. 19, the size and shape of the captured image V1 are adjusted, and an overhead image V3 is generated by combining the captured image V1 with the focus plane 41 of the view frustum 40.
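  • A minimal sketch of fitting the captured image V1 to a frustum cross section such as the focus plane 41 is shown below, using a perspective warp; the corner ordering and output size handling are assumptions.

```python
import numpy as np
import cv2  # OpenCV is used here only for the perspective warp

def fit_v1_to_cross_section(v1_frame, dst_corners, out_size):
    """Warp the captured image V1 so that it fills the four screen-space corners
    of a frustum cross section (top-left, top-right, bottom-right, bottom-left).
    out_size is the (width, height) of the overhead image V3 being composited."""
    h, w = v1_frame.shape[:2]
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    dst = np.float32(dst_corners)
    matrix = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(v1_frame, matrix, out_size)
```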
  • FIG. 21 shows a display position setting process in the case where the photographed image V1 is displayed on the far depth end surface 44 as in FIG. 10.
  • In this case, the AR system 5 determines the size and shape of the far depth end surface 44 in the view frustum 40 generated for the current frame in step S103.
  • the AR system 5 sets the size and shape of the captured image V1 so as to match the size of the depth far end surface 44.
  • Then, in step S105 of FIG. 19, the size and shape of the captured image V1 are adjusted, and an overhead image V3 is generated in which the captured image V1 is composited onto the far depth end surface 44 of the view frustum 40.
  • FIG. 22 shows a display position setting process in the case where the photographed image V1 is displayed in the vicinity of the frustum starting point 46 as in FIG. 11.
  • the AR system 5 sets the display position of the captured image V1 within the view frustum 40 generated in step S103 in the current frame. That is, a certain position is set on the frustum origin 46 side of the depth of field range 42. In this case, the position may be set as a fixed distance from the frustum origin 46, or may be set as a position where a minimum area is obtained as a cross section of a quadrangular pyramid shape according to the angle of view.
  • In step S141, the AR system 5 determines the cross section at the set display position, that is, the size and shape of the display area.
  • In step S142, the AR system 5 sets the size and shape of the captured image V1 so as to match the cross section at the determined display position.
  • When the process then proceeds to step S105 of FIG. 19, the size and shape of the captured image V1 are adjusted, and an overhead image V3 is generated in which the captured image V1 is composited at a position near the frustum origin 46 of the view frustum 40.
  • FIG. 23 shows a display position setting process in which the display position of the captured image V1 is changed according to the operation of a user such as a cameraman or director.
  • In step S150, the AR system 5 checks whether or not a display position change operation has been performed on the captured image V1.
  • the GUI device 11 and the camera 2 are configured so that a director, cameraman, etc. can change the display position by performing a specified operation.
  • the AR system 5 checks the operation information for the display position change operation from the control signal CS that it receives.
  • an operation interface may be provided that allows each plane to be switched by a toggle operation, or an operation interface may be provided that allows each plane to be directly specified.
  • the display position setting may be switched not only to positions within the view frustum 40 but also to positions outside the view frustum 40 .
  • For example, it is possible to perform operations to change the display position to the focus plane 41, the frustum far end surface 45, a corner of the screen, the vicinity of the camera, and so on.
  • When the display position is switched to outside the view frustum 40, it is possible, for example, to change the position to "near the focus plane 41," "near the frustum far end surface 45," "a corner of the screen," or "near the camera 2."
  • If no operation to change the display position is confirmed at the time of processing the current frame, the AR system 5 proceeds to step S151, maintains the same display position setting as in the previous frame, and ends the processing of FIG. 23. As a result, when the process proceeds to step S105 in FIG. 19, a frame of the current overhead image V3 is generated in which the shot image V1 is displayed at the same position as in the previous frame.
  • When a display position change operation is detected, the AR system 5 proceeds from step S150 to step S152 in FIG. 23 and changes the display position setting in response to the operation. For example, the setting that had been the focus plane 41 until then may be switched to the frustum far end surface 45.
  • In step S153, the AR system 5 branches the process depending on whether or not the changed position setting is outside the view frustum 40. If the changed position setting is within the view frustum 40, the AR system 5 proceeds to step S154 and determines the size and shape of the display area as a cross section of the view frustum 40 at the set position. Then, in step S156, the AR system 5 sets the size and shape of the captured image V1 so as to match the cross section at the determined display position.
  • As a result, in step S105 of FIG. 19, the size of the captured image V1 is adjusted, and an overhead image V3 is generated in which the captured image V1 is composited at a position within the view frustum 40 that is different from that of the previous frame.
  • If the changed position setting is outside the view frustum 40, the AR system 5 proceeds from step S153 to step S155 in FIG. 23 and sets the display size and shape of the captured image V1 at the new position.
  • the shape of the captured image V1 to be synthesized is not limited to the cross-sectional shape of the view frustum 40, and may be, for example, a rectangle, or if it is near the view frustum 40, a parallelogram according to the angle of the view frustum 40.
  • the size of the captured image V1 can also be set relatively freely, but it is desirable to set it appropriately according to other displays on the screen.
  • As a result, in step S105 in FIG. 19, the size and shape of the captured image V1 are adjusted, and an overhead image V3 is generated in which the captured image V1 is composited at a position outside the view frustum 40 that is different from that in the previous frame.
  • the display position may be changed only outside the view frustum 40. In that case, steps S153 and S154 are unnecessary, and the process may proceed from step S152 to step S155.
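  • As an illustration only, the following Python sketch mirrors the flow of steps S150 to S156 described above; the position names, the cross_section_at helper, and the outside-frustum size are hypothetical placeholders and not part of this disclosure.

      from enum import Enum, auto

      class DisplayPosition(Enum):
          FOCUS_PLANE = auto()        # inside the view frustum 40
          FAR_END_PLANE = auto()      # inside the view frustum 40
          NEAR_ORIGIN_PLANE = auto()  # inside the view frustum 40
          SCREEN_CORNER = auto()      # outside the view frustum 40
          NEAR_CAMERA = auto()        # outside the view frustum 40

      INSIDE_FRUSTUM = {DisplayPosition.FOCUS_PLANE,
                        DisplayPosition.FAR_END_PLANE,
                        DisplayPosition.NEAR_ORIGIN_PLANE}

      def update_display_position(previous_setting, change_request, frustum):
          """Per-frame handling corresponding to steps S150-S156 of Fig. 23.

          previous_setting : (DisplayPosition, size, shape) used for the previous frame
          change_request   : DisplayPosition requested via the control signal CS, or None
          frustum          : object that can report its cross section at a given plane
          """
          if change_request is None:
              # S150 -> S151: no operation, keep the previous setting
              return previous_setting

          new_position = change_request  # S152: adopt the requested position

          if new_position in INSIDE_FRUSTUM:
              # S153 -> S154 -> S156: fit the captured image V1 to the frustum cross section
              size, shape = frustum.cross_section_at(new_position)
          else:
              # S153 -> S155: free size/shape outside the frustum (placeholder values)
              size, shape = (0.2, 0.2), "rectangle"

          return (new_position, size, shape)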
  • FIG. 24 shows an example of processing in which the AR system 5 automatically changes the display position of the captured image V1.
  • In step S160, the AR system 5 performs a display position change determination.
  • the display position change determination is a process of determining whether or not to change the display position setting of the photographed video V1 in the current frame from that in the previous frame. Examples of this determination process include the following processes (P1), (P2), and (P3).
  • (P1) Determination based on the positional relationship between the view frustum 40 and an object in the overhead image V3.
  • (P2) Determination based on the angle of the view frustum 40 in the overhead image V3.
  • (P3) Determination based on the viewpoint position of the overhead image V3.
  • As an example of (P1), Fig. 25 shows a state in which the frustum far end surface 45 of a view frustum 40 rendered with finite length collides with the ground GR and is partially embedded in it.
  • Fig. 26 shows a state in which the far end side of a view frustum 40, rendered with either finite or infinite length, collides with a structure CN, so that nothing beyond the collision point can be displayed.
  • When the pyramidal shape of the view frustum 40 widens or its direction changes due to a change in the angle of view or shooting direction of the camera 2, and the positional relationship between a specific position of the view frustum 40 (such as the frustum far end surface 45 or the focus plane 41) and other displayed objects indicates that the display position of the captured image V1 used up to that point is no longer appropriate, it may be determined that the display position needs to be changed.
  • Other view frustums 40 may also be treated as objects in the overhead image V3, and if the positional relationship with those other view frustums 40 makes the display position of the captured image V1 inappropriate, it may likewise be determined that the display position needs to be changed.
  • the example of (P2) takes into consideration the visibility of the captured image V1 that is adapted to the cross-sectional shape of the view frustum 40.
  • In some cases, the cross-sectional shape may not be appropriate as a display surface.
  • the shape and direction of the view frustum 40 change according to the angle of view and the shooting direction of the camera 2.
  • Accordingly, the angle of the view frustum 40 displayed in the overhead image V3 also changes; that is, the angle between the line-of-sight direction from the viewpoint of the entire overhead image V3 and the axial direction of the view frustum 40 changes.
  • This angle is the angle between the line-of-sight direction from the viewpoint set for the overhead image V3 at a given time (that is, the normal direction of the display screen) and the axial direction of the displayed view frustum 40.
  • the axial direction of the view frustum 40 is the direction of the perpendicular line drawn from the frustum starting point 46 to the frustum far end surface 45.
  • Fig. 27 shows the captured images V1a, V1b, and V1c corresponding to the view frustum 40a, 40b, and 40c.
  • In that case, the captured image V1a displayed according to the cross-sectional shape becomes a parallelogram with a large difference between its acute and obtuse angles because of the angle of the view frustum 40a in the overhead image V3. If left as it is, the visibility of the captured image V1a will be poor. In such a case, it is advisable to change the display position as shown by the dashed arrow and display the image at the position of the captured image V1a'. In this way, it is conceivable to determine that the display position needs to be changed when the difference between the acute and obtuse angles of the captured image V1 becomes equal to or greater than a predetermined value.
  • the example (P3) is based on the same idea as (P2).
  • the viewpoint position of the overhead view video V3 can be changed according to an operation by a director, etc.
  • the viewpoint position of the overhead view video V3 may be changed from the state shown in Fig. 16 to that shown in Fig. 27 by an operation.
  • the visibility of the captured image V1a is poor, as in the case described above.
  • the shape of the rendered view frustum 40 and the captured image V1 changes due to a change in the viewpoint of the overhead image V3, which may reduce visibility.
  • In this case as well, when the difference between the acute and obtuse angles of the captured image V1 becomes equal to or greater than a predetermined value, it is determined that the display position needs to be changed.
  • changing the viewpoint of the overhead image V3 may cause the size of the captured image V1 to become smaller. If the viewpoint position when rendering the overhead image V3 is changed to a distant position, causing the size of the captured image V1 to become equal to or smaller than a predetermined size, it may be determined that the display position needs to be changed.
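  • The determinations (P1) to (P3) above could, for example, be organized along the following lines. This is only a sketch: the FrustumState fields, the thresholds, and the way each condition is simplified (ground height for (P1), axis-to-line-of-sight angle for (P2), on-screen size for (P3)) are assumptions made for illustration.

      import math
      from dataclasses import dataclass

      AXIS_ANGLE_LIMIT_DEG = 70.0   # assumed (P2)/(P3) threshold: frustum axis too oblique
      MIN_IMAGE_SIZE_PX = 80.0      # assumed (P3) threshold: captured image too small

      @dataclass
      class FrustumState:
          axis: tuple             # unit vector from the frustum origin 46 to the far end surface 45
          far_end_height: float   # height of the frustum far end surface 45 in CG space
          ground_height: float = 0.0

      def _angle_deg(u, v):
          dot = sum(a * b for a, b in zip(u, v))
          nu = math.sqrt(sum(a * a for a in u))
          nv = math.sqrt(sum(b * b for b in v))
          return math.degrees(math.acos(max(-1.0, min(1.0, dot / (nu * nv)))))

      def needs_position_change(frustum: FrustumState, line_of_sight, image_size_px) -> bool:
          """Simplified stand-in for the display position change determination of step S160."""
          # (P1): the far end surface 45 is buried below the ground GR
          if frustum.far_end_height < frustum.ground_height:
              return True
          # (P2): the frustum axis is strongly inclined to the line of sight, so the
          #       cross-section-shaped captured image V1 becomes a heavily skewed parallelogram
          if _angle_deg(frustum.axis, line_of_sight) > AXIS_ANGLE_LIMIT_DEG:
              return True
          # (P3): after the overhead-image viewpoint moved away, the image is too small on screen
          if min(image_size_px) < MIN_IMAGE_SIZE_PX:
              return True
          return False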
  • In step S160 of FIG. 24, the AR system 5 performs a display position change determination as described above, and in step S161 the process branches depending on whether or not a change is required.
  • If it is determined that no change is necessary, the AR system 5 proceeds to step S162, maintains the same display position setting as in the previous frame, and ends the processing of FIG. 24. As a result, when the process proceeds to step S105 in FIG. 19, a frame of the current overhead image V3 is generated in which the captured image V1 is displayed at the same position as in the previous frame.
  • If it is determined that a change is necessary, the AR system 5 proceeds from step S161 to step S163 in FIG. 24 and selects a destination to which the display position setting is to be changed.
  • The destination of this change may be determined according to the reason why the change is required. For example, in the case of (P1) above, when a collision with an object in the overhead image V3 occurs, the position can be changed to one that is not affected by the collision point, such as the surface 47 near the frustum origin or a corner of the screen. When the visibility of the captured image V1 decreases as in (P2) and (P3) above, a position whose shape allows good visibility may be selected, such as a corner of the screen outside the view frustum 40 or a position near the focus plane 41.
  • the type information of the camera 2 can also be used to set the destination of the captured image V1.
  • For example, for the captured image V1 of the mobile camera 2M, the image can be displayed within the view frustum 40 while the camera is not moving and changed to a corner of the screen while the camera is moving. This is because the movement of the view frustum 40 within the overhead image V3 becomes larger while the camera is moving, which reduces the visibility of a captured image V1 displayed within the view frustum 40.
  • In step S164, the AR system 5 branches the process depending on whether or not the selected destination is outside the view frustum 40.
  • If the destination is within the view frustum 40, the AR system 5 proceeds to step S165 and determines the size and shape of the display area as a cross section of the view frustum 40 at the set position. Then, in step S167, the AR system 5 sets the size and shape of the captured image V1 so that it matches the cross section at the determined display position.
  • As a result, in step S105 in FIG. 19, the size of the captured image V1 is adjusted, and an overhead image V3 is generated in which the captured image V1 is composited at a position within the view frustum 40 different from that in the previous frame.
  • If the destination is outside the view frustum 40, the AR system 5 proceeds to step S166 in FIG. 24 and sets the display size and shape of the captured image V1 at the new set position (similar to step S155 in FIG. 23).
  • As a result, in step S105 in FIG. 19, the size and shape of the captured image V1 are adjusted, and an overhead image V3 is generated in which the captured image V1 is composited at a position outside the view frustum 40 that is different from that in the previous frame.
  • Note that if the display position is changed only within the view frustum 40, steps S164 and S166 are unnecessary, and the process may proceed from step S163 to step S165. Conversely, if the display position is changed only to positions outside the view frustum 40, steps S164 and S165 are unnecessary, and the process may proceed from step S163 to step S166.
  • the view frustum 40 and the captured image V1 may be displayed together all the time, or may be displayed only temporarily.
  • the cameraman or director may perform an operation to select the view frustum 40, so that the shot image V1 corresponding to the selected view frustum 40 is displayed.
  • a cameraman or director may be able to switch between a mode in which only the view frustum 40 is displayed and a mode in which the view frustum 40 and the shot video V1 are displayed simultaneously.
  • <Example of cameraman and director screens> In the system of this embodiment, an overhead image V3-1 is displayed on the GUI device 11 for the director, and an overhead image V3-2 is displayed on a display unit such as the viewfinder of the camera 2 for the cameraman.
  • the overhead images V3-1 and V3-2 are both images showing the view frustum 40 in the CG space 30 simulating the shooting target space 8, but are images with different display modes. This makes it possible to provide information appropriate for the role of the director or cameraman.
  • FIG. 28 shows an example in which an overhead view image V3-1 is displayed as the device display image 51 on the GUI device 11.
  • This overhead image V3-1 is an image that includes a CG space 30 overlooking the target shooting space 8, for example a stadium, and displays the view frustums 40 of a plurality of cameras 2 taking pictures at the stadium. View frustums 40a, 40b, and 40c for the three cameras 2 are displayed.
  • the view frustum 40a is displayed in a different manner from the other view frustums 40b and 40c.
  • the view frustum 40a is highlighted and made to stand out more than the other view frustums 40b and 40c.
  • the shape and direction of the view frustum 40 and the display positions of the focus plane 41 and depth of field range 42 are determined by the angle of view, shooting direction, focal length, depth of field, etc. of the camera 2 at that time, and therefore these differences are not included in the difference in display mode referred to here.
  • Different display modes of the view frustum 40 do not refer to differences determined by the state of the angle of view or shooting direction of the camera 2, but to differences in the display of the view frustum 40 itself. For example, differences in color, brightness, darkness, type and thickness of the outline, differences in the display of the pyramid faces, differences between normal display and flashing display, differences in the flashing cycle, etc.
  • For example, when the view frustum 40 is normally displayed in semi-transparent white, the view frustum 40a is highlighted in semi-transparent red. This allows the view frustum 40a to be emphasized and shown to the director or other viewers.
  • The AR system 5 configured as shown in FIG. 4 determines whether or not a particular subject of interest, such as a specific player, is being photographed by performing image recognition processing on the captured image V1 of each camera 2. For example, it determines whether or not the video V1 captured by each camera 2 shows the target subject, as shown in Fig. 29. Then, the AR system 5 generates the overhead image V3-1 so that the view frustum 40 of the camera 2 capturing the target subject is highlighted.
  • the video data for the overhead images V3-1 and V3-2 in this case refers to video data in which a view frustum 40 is synthesized with a CG space 30 that corresponds to the subject space 8.
  • the overhead views V3-1 and V3-2 may be further synthesized with the shot image V1.
  • the AR system 5 performs the processes from step S101 to step S107 in FIG. 30 for each frame of the video data of the overhead images V3-1 and V3-2, for example. These processes can be considered as control processes of the CPU 71 (video processing unit 71a) in the information processing device 70 in FIG. 7 as the AR system 5.
  • In step S101, the AR system 5 sets the CG space 30. For example, it sets the viewpoint position for the CG space 30 corresponding to the shooting target space 8 and renders an image of the CG space 30 from that viewpoint position. If there is no change in the viewpoint position or image content from the previous frame, the CG space image of the previous frame can be used for the current frame as well.
  • In step S102, the AR system 5 inputs the captured image V1 and metadata MT from the camera 2. That is, the captured image V1 of the current frame, and the attitude information, focal length, angle of view, aperture value, and the like of the camera 2 at that frame timing are acquired.
  • When there are a plurality of cameras 2, the AR system 5 inputs the captured image V1 and metadata MT of each camera 2.
  • In step S201, the AR system 5 generates the view frustum 40 for the cameraman for the current frame.
  • the view frustum 40 for the cameraman is a view frustum 40 to be synthesized with the overhead image V3-2 to be transmitted to the camera 2 and displayed.
  • a view frustum 40 for the cameraman is generated separately in correspondence with each of the cameras 2.
  • the AR system 5 in the camera system 1 generates a view frustum 40 to be displayed on the camera 2 of the camera system 1 .
  • the AR system 5 sets the direction of the view frustum 40 within the CG space 30 according to the attitude of the camera 2, the pyramid shape according to the angle of view, the position of the focus plane 41 and depth of field range 42 based on the focal length and aperture value, and so on, and generates an image of the view frustum 40 according to the settings.
  • When displaying the view frustums 40 for a plurality of cameras 2, the AR system 5 generates an image of each view frustum 40 according to the metadata MT of the corresponding camera 2.
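  • As a rough illustration of how these quantities can be derived from the metadata MT, the following sketch uses the standard thin-lens depth-of-field approximation; the CameraMetadata fields and the circle-of-confusion value are assumptions for illustration and are not specified in this disclosure.

      import math
      from dataclasses import dataclass

      @dataclass
      class CameraMetadata:
          yaw_deg: float            # attitude (shooting direction)
          pitch_deg: float
          focal_length_mm: float
          sensor_width_mm: float
          f_number: float           # aperture value
          focus_distance_m: float   # distance to the focus plane 41

      def frustum_parameters(mt: CameraMetadata, coc_mm: float = 0.03):
          """Derive the quantities used to draw the view frustum 40 in the CG space 30:
          shooting direction, horizontal angle of view, focus plane 41 distance, and
          depth of field range 42 (near/far limits)."""
          yaw, pitch = math.radians(mt.yaw_deg), math.radians(mt.pitch_deg)
          direction = (math.cos(pitch) * math.cos(yaw),
                       math.cos(pitch) * math.sin(yaw),
                       math.sin(pitch))

          # horizontal angle of view from focal length and sensor width
          h_fov_deg = math.degrees(2.0 * math.atan(mt.sensor_width_mm / (2.0 * mt.focal_length_mm)))

          # thin-lens depth of field around the focus plane 41
          f = mt.focal_length_mm / 1000.0
          coc = coc_mm / 1000.0
          s = mt.focus_distance_m
          hyperfocal = f * f / (mt.f_number * coc) + f
          near = hyperfocal * s / (hyperfocal + (s - f))
          far = hyperfocal * s / (hyperfocal - (s - f)) if s < hyperfocal else math.inf

          return direction, h_fov_deg, s, (near, far)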
  • In step S202, the AR system 5 generates the director's view frustum 40 for the current frame.
  • the director's view frustum 40 is a view frustum 40 to be transmitted to the GUI device 11 and synthesized with the overhead view video V3-1 to be displayed.
  • an image of the view frustum 40 is generated based on the attitude (shooting direction), angle of view, focal length, and aperture value of each camera 2.
  • the view frustum 40 for the cameraman generated in step S201 and the view frustum 40 for the director generated in step S202 may be displayed in different ways. A specific example will be described later.
  • In step S203, the AR system 5 synthesizes the view frustum 40 generated for the cameraman into the CG space 30 that will become the overhead image V3-2, to generate one frame of video data for the overhead image V3-2.
  • the captured image V1 may also be synthesized in correspondence with each view frustum 40.
  • In step S204, the AR system 5 synthesizes the view frustum 40 generated for the director into the CG space 30 that will become the overhead image V3-1, to generate one frame of video data for the overhead image V3-1.
  • the shot image V1 may also be synthesized in correspondence with each view frustum 40.
  • In step S205, the AR system 5 outputs one frame of video data for each of the overhead images V3-1 and V3-2. The above process is repeated until the display of the view frustum 40 is finished.
  • A process of highlighting one view frustum 40, for example the view frustum 40a as shown in FIG. 28, using the process of FIG. 30 will now be described.
  • FIG. 28 is an example of the overhead image V3-1 viewed by the director.
  • In the overhead image V3-2 on the cameraman side, this highlighting is not performed; the view frustums 40a, 40b, and 40c are all displayed in the same display mode, that is, semi-transparent white.
  • FIG. 31 shows a specific example of the processes in steps S201 and S202 in FIG. 30.
  • In step S201 of FIG. 30, the AR system 5 generates a view frustum 40 for each camera 2 as step S210 of FIG. 31. That is, the view frustums 40a, 40b, and 40c for the cameraman are generated, for example, as the same semi-transparent white image.
  • Next, as the process of step S202 in FIG. 30, the AR system 5 acquires the value of the screen occupancy rate of the target subject in the captured image V1 of each camera 2.
  • The AR system 5 constantly performs image recognition processing on the captured image V1 of each camera 2, determines whether or not the set target subject is captured, and calculates the screen occupancy rate in each frame.
  • The screen occupancy rate is calculated from whether the target subject is captured and from the area that the target subject occupies in the screen.
  • The AR system 5 obtains the screen occupancy rate of the target subject in each captured image V1 at the current time, calculated in this manner.
  • In step S211, the AR system 5 determines the optimal captured image V1. For example, the captured image V1 with the highest screen occupancy rate is determined to be optimal.
  • In step S212, the AR system 5 generates the images of the view frustums 40 for the director, including highlighting of the view frustum 40 corresponding to the camera 2 of the optimal captured image V1.
  • For example, the view frustum 40a is generated as a highlighted, semi-transparent red image, while the view frustums 40b and 40c are generated as normal semi-transparent white images.
  • After performing the processes of steps S201 and S202 in FIG. 30 as shown in FIG. 31, the AR system 5 performs the processes of steps S203, S204, and S205. As a result, the overhead image V3-1 displayed on the GUI device 11 becomes as shown in FIG. 28. On the other hand, in the overhead image V3-2 displayed by each camera 2, no view frustum 40 is highlighted. A sketch of this selection is shown below.
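  • A minimal sketch of the screen-occupancy-based selection; the detection format and the camera identifiers are hypothetical, and the detections are assumed to come from the image recognition processing already described.

      def screen_occupancy(detections, frame_width, frame_height, target_id):
          """Occupancy rate of the target subject: bounding-box area / frame area.
          `detections` is assumed to be a list of (subject_id, x, y, w, h) tuples
          produced by the image recognition processing on a captured image V1."""
          frame_area = float(frame_width * frame_height)
          area = sum(w * h for sid, x, y, w, h in detections if sid == target_id)
          return area / frame_area

      def choose_highlighted_frustum(occupancy_by_camera):
          """Steps S211/S212: pick the camera whose captured image V1 has the highest
          screen occupancy rate; its view frustum 40 is highlighted (for example in
          semi-transparent red) in the director's overhead image V3-1."""
          if not occupancy_by_camera or max(occupancy_by_camera.values()) <= 0.0:
              return None   # the target subject is not visible on any camera
          return max(occupancy_by_camera, key=occupancy_by_camera.get)

      # Usage example: camera "2b" would be highlighted
      # choose_highlighted_frustum({"2a": 0.04, "2b": 0.12, "2c": 0.0})  ->  "2b"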
  • the view frustum 40 for highlighting is selected based on the screen occupancy rate of the target subject, but the selection may be based on the continuous shooting time instead of the screen occupancy rate.
  • A specific example of the process of step S202 in that case is shown in Fig. 32.
  • The process of step S201 is the same as in Fig. 31.
  • As step S202 of FIG. 30, the AR system 5 acquires the value of the continuous shooting time of the target subject for the captured image V1 of each camera 2 in step S215 of FIG. 32.
  • the AR system 5 constantly performs image recognition processing on the captured images V1 of each camera 2, and determines whether or not the set target subject is captured. In this case, the AR system 5 calculates the duration (number of continuous frames) during which the target subject is recognized for each captured image V1. Then, in step S215, the AR system 5 obtains the continuous shooting time calculated in this manner.
  • In step S211, the AR system 5 determines the optimal captured image V1.
  • In this case, the captured image V1 with the longest continuous shooting time is determined to be optimal.
  • In step S212, the AR system 5 generates the images of the view frustums 40 for the director, including highlighting of the view frustum 40 corresponding to the camera 2 of the optimal captured image V1.
  • After that, the AR system 5 performs the processes of steps S203, S204, and S205 in Fig. 30.
  • As a result, the overhead image V3-1 displayed on the GUI device 11 becomes as shown in Fig. 28. This allows the director to recognize the camera 2 that has been capturing the subject of interest continuously for a long time. A simple way to track such durations is sketched below.
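  • A sketch of per-camera tracking of the continuous shooting time; the class and method names are illustrative only.

      from collections import defaultdict

      class ContinuousCaptureCounter:
          """Counts, per camera, how many consecutive frames the target subject has been
          recognized in the captured image V1 (the continuous shooting time used in
          step S215 of Fig. 32)."""

          def __init__(self):
              self.consecutive_frames = defaultdict(int)

          def update(self, camera_id: str, subject_detected: bool) -> int:
              # call once per frame with the image recognition result for that camera
              if subject_detected:
                  self.consecutive_frames[camera_id] += 1
              else:
                  self.consecutive_frames[camera_id] = 0
              return self.consecutive_frames[camera_id]

          def longest(self):
              """Camera with the longest continuous shooting time (step S211)."""
              if not self.consecutive_frames:
                  return None
              return max(self.consecutive_frames, key=self.consecutive_frames.get)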
  • FIG. 33A shows an overhead image V3-1 as the device display image 51 of the GUI device 11.
  • the view frustums 40a, 40b, and 40c are each displayed in the same display mode, for example, semi-transparent white.
  • When a specific operation is performed by the cameraman of the camera 2 corresponding to the view frustum 40a, the overhead image V3-1 becomes as shown in Fig. 33B. That is, the view frustum 40a is highlighted in a manner different from the view frustums 40b and 40c, so that this is clearly indicated to the director.
  • a specific operation by the cameraman is an operation in which the cameraman notifies the director that "good footage is now being taken.” If such an operation is made possible on the camera 2 side, when the operation is performed, the AR system 5 makes the display mode of the view frustum 40 of the camera 2 on which the operation was performed in the overhead image V3-1 different from the others.
  • Fig. 34 shows a concrete example of steps S201 and S202 in Fig. 30.
  • In step S201 of FIG. 30, the AR system 5 generates the images of the view frustums 40 for the cameraman as step S210 of Fig. 34.
  • For example, the view frustums 40a, 40b, and 40c are all generated as the same semi-transparent white image.
  • In step S202 in FIG. 30, the AR system 5 first checks, in step S220 in FIG. 34, whether or not there has been feedback from each camera 2, that is, whether or not a specific operation has been performed by a cameraman, and branches the process in step S221. If no specific operation has been performed, the AR system 5 proceeds from step S221 to step S223 and generates the images of the director's view frustums 40. For example, the view frustums 40a, 40b, and 40c are all generated as the same semi-transparent white image.
  • If a specific operation has been performed, the AR system 5 proceeds to step S222 and generates the images of the view frustums 40 for the director, including highlighting.
  • For example, the view frustum 40a is generated as a semi-transparent red image, and the view frustums 40b and 40c are generated as semi-transparent white images.
  • After that, the AR system 5 performs the processes of steps S203, S204, and S205 in Fig. 30.
  • As a result, the overhead image V3-1 displayed on the GUI device 11 becomes as shown in Fig. 33A or Fig. 33B.
  • That is, when no specific operation has been performed, the image becomes as shown in Fig. 33A, and the view frustums 40a, 40b, and 40c are displayed in the same display mode; when a specific operation has been performed, the image becomes as shown in Fig. 33B.
  • FIG. 35A shows an overhead image V3-1 as the device display image 51 of the GUI device 11.
  • In FIG. 35A, the view frustums 40a, 40b, and 40c are displayed in the same display mode.
  • In some situations, however, the view frustums 40a and 40b overlap on the image as shown in FIG. 35B.
  • In that case, the overlapping view frustums 40a and 40b are highlighted in a manner different from usual so that the director can easily recognize the overlap.
  • Fig. 36 shows a concrete example of steps S201 and S202 in Fig. 30.
  • In step S201 of Fig. 30, the AR system 5 generates the images of the view frustums 40 for the cameraman as step S210 of Fig. 36.
  • For example, the view frustums 40a, 40b, and 40c are all generated as the same semi-transparent white image.
  • In step S202 of FIG. 30, the AR system 5 first sets the size, shape, and orientation of the view frustum 40 of each camera 2 based on the metadata MT of each camera 2 in step S230 of FIG. 36.
  • In step S231, the AR system 5 checks the arrangement of each view frustum 40 within the three-dimensional coordinates of the CG space 30 for the current frame. This makes it possible to check whether the view frustums 40 overlap.
  • In step S232, the AR system 5 branches the process depending on whether or not there is an overlap. If there is no overlap of view frustums 40, the AR system 5 proceeds to step S234 and generates the images of the view frustums 40 for the director. For example, the view frustums 40a, 40b, and 40c are all generated as the same semi-transparent white image.
  • If there is an overlap, the AR system 5 proceeds to step S233 and generates the images of the view frustums 40 for the director, including highlighting.
  • For example, the overlapping view frustums 40 (here, the view frustums 40a and 40b) are generated as semi-transparent red images, and the non-overlapping view frustum 40c is generated as a semi-transparent white image.
  • the AR system 5 performs the processes of steps S203, S204, and S205 in Fig. 30.
  • the overhead image V3-1 displayed on the GUI device 11 becomes as shown in Fig. 35A or Fig. 35B.
  • That is, if there is no overlap of view frustums 40, the image becomes as shown in Fig. 35A, and if there is an overlap, the image becomes as shown in Fig. 35B.
  • This allows the director, etc. to easily recognize the situation in which the same subject is being shot from different viewpoints by multiple cameras 2. This makes it possible to clarify instructions to each cameraman. It is also convenient for switching main line images when it is desired to switch images of the same subject.
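  • An overlap between view frustums could be checked, for example, with the simple geometric test sketched below. Representing each frustum by six inward-facing planes and eight corners is an implementation assumption, and the corner-containment test is only approximate (an exact test would also need edge-face intersection, e.g. a separating-axis test).

      from dataclasses import dataclass
      from typing import List, Tuple

      Vec3 = Tuple[float, float, float]

      @dataclass
      class Frustum:
          # six planes (inward normal n, offset d): a point p is inside when dot(n, p) + d >= 0
          planes: List[Tuple[Vec3, float]]
          corners: List[Vec3]   # the eight corner points of the frustum

      def _inside(frustum: Frustum, p: Vec3) -> bool:
          return all(n[0] * p[0] + n[1] * p[1] + n[2] * p[2] + d >= 0.0
                     for n, d in frustum.planes)

      def frustums_overlap(a: Frustum, b: Frustum) -> bool:
          """Approximate version of the overlap check in step S231: true if any corner
          of one frustum lies inside the other."""
          return any(_inside(a, p) for p in b.corners) or \
                 any(_inside(b, p) for p in a.corners)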
  • Note that in the overhead image V3-2 on the cameraman side, the view frustums 40a, 40b, and 40c are displayed in the same display mode.
  • Next, an example will be described in which, when view frustums 40 overlap, one view frustum 40 is displayed preferentially.
  • FIG. 37 shows an overhead image V3-1 as the device display image 51 of the GUI device 11.
  • In this example, the view frustums 40a, 40b, 40c, and 40d overlap, but the view frustum 40a is set as the priority, and the focus plane 41 and depth of field range 42 of the view frustum 40a are displayed in the overlapping portion.
  • FIG. 38 shows a concrete example of steps S201 and S202 in Fig. 30.
  • In step S201 of FIG. 30, the AR system 5 generates the images of the view frustums 40 for the cameraman as step S210 of Fig. 38.
  • For example, images are generated as the view frustums 40a, 40b, 40c, and 40d. No particular priority setting is applied to the images of the view frustums 40 for the cameraman.
  • In step S202 of FIG. 30, the AR system 5 first sets the size, shape, and orientation of the view frustum 40 of each camera 2 based on the metadata MT of each camera 2 in step S240 of FIG. 38.
  • In step S241, the AR system 5 checks the arrangement of each view frustum 40 within the three-dimensional coordinates of the CG space 30 for the current frame. This makes it possible to check whether the view frustums 40 overlap.
  • In step S242, the AR system 5 branches the process depending on whether or not there is an overlap. If there is no overlap of view frustums 40, the AR system 5 proceeds to step S244 and generates the images of the view frustums 40 for the director. For example, images are generated as the view frustums 40a, 40b, 40c, and 40d.
  • If there is an overlap, the AR system 5 proceeds to step S245 and determines the view frustum 40 that has priority among the overlapping view frustums 40.
  • the AR system 5 may determine a view frustum 40 that has priority among all view frustum 40, including those that do not overlap.
  • the view frustum 40 selected as the one to be highlighted by photographing a subject of interest or by a specific operation of the cameraman may be set as a priority.
  • In step S246, the AR system 5 generates the images of the view frustums 40 for the director.
  • For the prioritized view frustum 40, an image is generated in which the focus plane 41 and depth of field range 42 are displayed as usual.
  • For the other view frustums 40, images are generated in which the focus plane 41 and depth of field range 42 are not displayed in the areas that overlap with the prioritized view frustum 40.
  • Alternatively, for the non-prioritized view frustums 40, images may be generated in which the focus plane 41 and depth of field range 42 are not displayed at all.
  • the AR system 5 performs the processes of steps S203, S204, and S205 in Fig. 30.
  • the overhead image V3-1 displayed on the GUI device 11 becomes an image in which the focus plane 41 and the depth of field range 42 can be clearly recognized for the view frustum 40 that has been prioritized, even if the view frustum 40 overlaps, as shown in Fig. 37.
  • the view frustums 40a, 40b, 40c, and 40d are displayed as shown in FIG.
  • In the above example, the priority is set in the director's overhead image V3-1, but a priority may also be set in the cameraman's overhead image V3-2.
  • For the cameraman, it is desirable that the view frustum 40 of the camera 2 he is operating be set as the priority. Therefore, in step S201 in Fig. 30, where the view frustums for the cameraman are generated, the same processing as steps S240 to S246 in Fig. 38 may be performed.
  • In that case, the view frustum given priority in step S245 is the view frustum 40 of the cameraman's own camera 2. This allows the cameraman to clearly view the focus plane 41 and depth of field range 42 of the camera 2 he is operating even if its view frustum 40 overlaps with the view frustum 40 of another camera 2.
  • When a priority is set in the overhead image V3-2 in this way, a priority may also be set in the overhead image V3-1 viewed by the director as described above, or no priority may be set there. Even if priorities are set for both the overhead images V3-1 and V3-2, the conditions for determining the prioritized view frustum 40 differ, so the overhead image V3-1 and the overhead images V3-2 displayed by the respective cameras 2 will not all be displayed in the same manner. A sketch of the per-frustum display decision is shown below.
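  • The decision of which frustums keep their focus plane 41 and depth of field range 42 could be expressed as in the following sketch; the identifiers are hypothetical, and the variation implemented here is the one in which a non-prioritized frustum that overlaps the prioritized one suppresses those planes entirely.

      def plane_display_flags(frustum_ids, overlapping_pairs, priority_id):
          """Decide, per view frustum 40, whether its focus plane 41 and depth of field
          range 42 should be drawn (behaviour around steps S245/S246): the prioritized
          frustum keeps them; a frustum overlapping the prioritized one suppresses them."""
          overlaps_priority = set()
          for a, b in overlapping_pairs:
              if a == priority_id:
                  overlaps_priority.add(b)
              elif b == priority_id:
                  overlaps_priority.add(a)
          return {fid: (fid == priority_id or fid not in overlaps_priority)
                  for fid in frustum_ids}

      # Usage example: 40a prioritized, 40b and 40d overlap it, 40c does not
      # plane_display_flags(["40a", "40b", "40c", "40d"],
      #                     [("40a", "40b"), ("40a", "40d")], "40a")
      # -> {"40a": True, "40b": False, "40c": True, "40d": False}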
  • FIG. 40A shows an overhead view image V3-1 as the device display image 51 of the GUI device 11.
  • view frustums 40a, 40b, and 40c are displayed.
  • Fig. 40A also shows an overhead image V3-2 as the viewfinder display image 50 of camera 2.
  • the overhead image V3-2 is synthesized in a corner of the screen of the shot image V1.
  • Fig. 40B shows an enlarged view of the overhead image V3-2.
  • FIG. 39A shows an example of a case where the director has performed an instruction operation on the camera 2 of the view frustum 40b.
  • the director may perform an operation such as dragging view frustum 40b on the GUI device 11 to cause instruction frustum 40DR to be displayed.
  • This is an instruction from the director to the cameraman of camera 2 of view frustum 40b to change the shooting direction to the direction of instruction frustum 40DR.
  • the AR system 5 displays the instruction frustum 40DR for the view frustum 40b also in the overhead image V3-2 viewed by the cameraman, as shown in Figures 40A and 40B.
  • the cameraman operating the camera 2 of the view frustum 40b can comply with the director's instructions by changing the shooting direction so that the view frustum 40b coincides with the instruction frustum 40DR.
  • the instruction frustum 40DR may be configured to be able to specify not only the shooting direction but also the angle of view and the focus plane 41.
  • the director may be able to move the focus plane 41 forward or backward, widen the angle of view (change the inclination of the pyramid), etc., by operating the instruction frustum 40DR.
  • In response, the cameraman can adjust the focus so that the focus plane 41 of the view frustum 40b coincides with that of the instruction frustum 40DR, and can adjust the angle of view so that the inclinations of the pyramids coincide with each other.
  • overhead image V3-1 in FIG. 39A and overhead image V3-2 in FIG. 40A and FIG. 40B show examples in which the viewpoint position relative to CG space 30 is different.
  • The director or cameraman can change the viewpoint position of the overhead images V3-1 and V3-2 by performing an operation.
  • the example in the figure shows that the CG space 30 is not necessarily displayed as seen from the same viewpoint position in overhead image V3-1 and overhead image V3-2.
  • FIG. 39B shows the state in which the director has also given instructions to the view frustum 40a, causing the instruction frustum 40DR to be displayed. In this way, in the overhead image V3-1, instructions can be given to each view frustum 40.
  • In this case, the instruction frustum 40DR of the previous instruction (the instruction for the view frustum 40b) may remain displayed as it is, so that the director can confirm the currently valid instructions. It is conceivable that the instruction frustum 40DR is erased from the overhead images V3-1 and V3-2 when the view frustum 40 of the designated camera 2 substantially coincides with the instruction frustum 40DR. Alternatively, the instruction frustum 40DR may be erased from the overhead images V3-1 and V3-2 by a cancellation operation by the director, for example, to accommodate cancellation or change of instructions.
  • In the overhead image V3-2 viewed by each cameraman, the instruction frustums 40DR for all the cameras 2 may be displayed, or only the instruction frustum 40DR for that cameraman's own camera 2 may be displayed.
  • In the former case, each cameraman can grasp the overall instructions being issued. In the latter case, the cameraman can easily recognize the instructions directed to him from the director.
  • FIG. 41 shows specific examples of steps S201, S202, S203, and S204 in FIG. 30.
  • In step S201 in FIG. 30, the AR system 5 performs the processes of steps S250 to S254 in FIG. 41.
  • In step S250, the AR system 5 generates the images of the view frustums 40 for the cameraman.
  • For example, images are generated as the view frustums 40a, 40b, and 40c.
  • In step S251, the AR system 5 checks whether or not an instruction operation has been performed by the director. If there is no instruction operation, the process proceeds to step S202 in FIG. 30. If an instruction operation has been performed, the AR system 5 proceeds from step S251 to step S252 in FIG. 41 and branches the process depending on the display mode of the instruction frustum 40DR.
  • The display mode in this case can be selected by the cameraman from a mode in which only the instruction frustum 40DR directed to the cameraman himself is displayed and a mode in which all instruction frustums 40DR are displayed.
  • If the mode is one in which only the instruction frustum 40DR directed to the cameraman is displayed, the AR system 5 proceeds to step S253 and generates an image of that instruction frustum 40DR. However, if the instruction from the director is not directed to the camera 2 that is the subject of the overhead image V3-2 generation process, it is not necessary to generate an image of the instruction frustum 40DR in step S253.
  • In this case, the video data transmitted to each camera 2 as the overhead image V3-2 will have different display contents.
  • If the mode is one in which all instruction frustums 40DR are displayed, the AR system 5 proceeds to step S254 and generates images of the instruction frustums 40DR that are valid at that time.
  • the AR system 5 performs the processing of step S202 in FIG. 30 as shown in steps S260 to S262 in FIG. 41.
  • In step S260, the AR system 5 generates the images of the view frustums 40 for the director.
  • For example, images are generated as the view frustums 40a, 40b, and 40c.
  • In step S261, the AR system 5 checks whether or not an instruction operation has been performed by the director. If there is no instruction operation, the process proceeds to step S203 in FIG. 30. If an instruction operation has been performed, the AR system 5 proceeds from step S261 to step S262 in FIG. 41 and generates images of the instruction frustums 40DR that are valid at that point.
  • In step S203 in FIG. 30, the AR system 5 performs the processes of steps S255 and S256 in FIG. 41.
  • In step S255, the AR system 5 synthesizes the view frustum 40 and the instruction frustum 40DR into the overhead image V3-2, thereby generating image data of the overhead image V3-2 as shown in FIG. 40B.
  • In step S256, the AR system 5 synthesizes the overhead image V3-2 and the captured image V1 to generate image data of a composite image as shown in FIG. 40A.
  • the overhead view image V3-2 and the photographed image V1 may be combined on the camera 2 side.
  • In step S204 in FIG. 30, the AR system 5 performs the process of step S265 in FIG. 41.
  • In step S265, the AR system 5 synthesizes the view frustum 40 and the instruction frustum 40DR into the overhead image V3-1, thereby generating image data of the overhead image V3-1 as shown in Figures 39A and 39B.
  • Then, in step S205 of FIG. 30, the overhead image V3-1 is transmitted to the GUI device 11, and the overhead image V3-2 corresponding to each camera 2 is transmitted to that camera 2.
  • This allows the director to check his/her own instructions on the instruction frustum 40DR in the overhead view V3-1, and each cameraman can visually check the instructions from the director through the instruction frustum 40DR.
  • The instruction frustum 40DR visible to the cameraman is displayed in the overhead image V3-2, and it is advisable to control the viewpoint position of the overhead image V3-2 so that the instructions are easier for the cameraman to understand.
  • Figures 42A and 42B show overhead images V3-2 as the viewfinder display image 50 of camera 2. These are overhead images V3-2 with the position of camera 2 on the view frustum 40c as the viewpoint position, and are images viewed by the cameraman of camera 2.
  • In FIG. 42A, an instruction frustum 40DR for the view frustum 40c is displayed, and an instruction frustum 40DR for the view frustum 40a of another camera 2 is also displayed.
  • In FIG. 42B, an instruction frustum 40DR for the view frustum 40c is displayed, but an instruction frustum 40DR for the view frustum 40a of another camera 2 is not displayed.
  • the AR system 5 performs steps S201 and S202 in FIG. 30 as shown in FIG. 41. Then, it performs step S203 in FIG. 30 as shown in FIG. 43.
  • In step S280, the AR system 5 branches the process depending on whether or not the instruction frustum 40DR is to be displayed in the current frame. If the instruction frustum 40DR is not to be displayed in the overhead image V3-2 for the camera 2 being processed, the AR system 5 proceeds to step S281 and generates video data in which the image of the view frustum 40 is synthesized with the overhead image V3-2.
  • If the instruction frustum 40DR is to be displayed, the AR system 5 proceeds to step S282 and sets the arrangement of the view frustum 40 and the instruction frustum 40DR within the 3D space coordinates for generating the overhead image V3-2. Then, in step S283, the AR system 5 sets the viewpoint position within the 3D space coordinates. That is, the coordinates of the position of the specific camera 2 to which this overhead image V3-2 is to be transmitted, among the multiple cameras, are set as the viewpoint position.
  • In step S284, the AR system 5 generates video data for the overhead image V3-2 as CG in which the view frustum 40 and the instruction frustum 40DR are combined, rendered from the set viewpoint position. One way to place the rendering viewpoint at the camera position is sketched below.
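  • Placing the rendering viewpoint at the selected camera's position (step S283) can be done with the standard look-at construction sketched below; the z-up, right-handed conventions are assumptions and the function is not taken from this disclosure.

      import numpy as np

      def look_at(eye, target, up=(0.0, 0.0, 1.0)):
          """Build a view matrix whose eye point is the position of the selected camera 2,
          so that the overhead image V3-2 is rendered from that camera's viewpoint."""
          eye, target, up = (np.asarray(v, dtype=float) for v in (eye, target, up))
          forward = target - eye
          forward /= np.linalg.norm(forward)
          right = np.cross(forward, up)
          right /= np.linalg.norm(right)
          true_up = np.cross(right, forward)

          view = np.eye(4)
          view[0, :3], view[1, :3], view[2, :3] = right, true_up, -forward
          view[:3, 3] = -view[:3, :3] @ eye
          return view

      # e.g. render the CG space 30 (with the view frustums 40 and the instruction
      # frustum 40DR) using look_at(camera_position, point_of_interest) as the view transform.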
  • The viewfinder display image 50 can be switched by the cameraman between the overhead image V3-2 as shown in FIG. 42A and the captured image V1 as shown in FIG. 44.
  • Since the cameraman needs to constantly check the captured image V1 (that is, the live view) of the camera 2 he is operating while shooting, the captured image V1 must be displayable in the viewfinder. It is conceivable to composite the overhead image V3-2 onto the captured image V1 and display them together as previously shown in FIG. 40A, but the overhead image V3-2 may then be small and the instruction frustum 40DR difficult to see. Therefore, it is advisable to allow switching between the overhead image V3-2 as shown in FIG. 42A and the captured image V1 as shown in FIG. 44 at any time, displaying each in full screen.
  • In FIG. 44, an instruction direction 54 and a coincidence rate 53 are displayed as instruction information on the captured image V1.
  • The instruction direction 54 indicates the shooting direction designated by the instruction frustum 40DR.
  • The coincidence rate 53 indicates the degree of coincidence between the current view frustum 40 and the instruction frustum 40DR. When the coincidence rate reaches 100%, the current view frustum 40 matches the instruction frustum 40DR. One way such a rate could be computed is sketched below.
  • the cameraman can confirm that the director has given instructions even when he is normally viewing the shot image V1, and can follow the instructions by relying on the instruction direction 54 and the coincidence rate 53. If necessary, the screen can also be switched to the overhead image V3-2 to check the instruction frustum 40DR.
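  • As an illustration, the coincidence rate 53 could be computed from the differences in shooting direction, angle of view, and focus distance, as in the sketch below; the tolerances and the equal weighting are assumptions, not values given in this disclosure.

      import math

      def coincidence_rate(current, instructed,
                           direction_tolerance_deg=30.0,
                           fov_tolerance_deg=10.0,
                           focus_tolerance_m=5.0):
          """Rough 0-100% score of how well the current view frustum 40 matches the
          instruction frustum 40DR. Each argument is a dict with a unit 'direction'
          vector, 'fov_deg', and 'focus_m'."""
          def clamp01(x):
              return max(0.0, min(1.0, x))

          dot = sum(a * b for a, b in zip(current["direction"], instructed["direction"]))
          angle_err = math.degrees(math.acos(max(-1.0, min(1.0, dot))))
          direction_score = clamp01(1.0 - angle_err / direction_tolerance_deg)
          fov_score = clamp01(1.0 - abs(current["fov_deg"] - instructed["fov_deg"]) / fov_tolerance_deg)
          focus_score = clamp01(1.0 - abs(current["focus_m"] - instructed["focus_m"]) / focus_tolerance_m)

          return 100.0 * (direction_score + fov_score + focus_score) / 3.0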
  • An example of this processing is shown in FIG. 45.
  • The AR system 5 performs the processes of steps S270 to S273 in FIG. 45 in step S201 in FIG. 30. Furthermore, the AR system 5 performs the processes of steps S275 to S278 in FIG. 45 in step S203 in FIG. 30.
  • In step S270, the AR system 5 checks whether the display of the view frustum 40 is OFF in the current frame. In other words, it checks whether the current frame is displaying the captured image V1 instead of the overhead image V3-2.
  • If the display of the view frustum 40 is OFF, the AR system 5 ends the processing of step S201. In other words, there is no need to generate images of the view frustum 40 and the instruction frustum 40DR.
  • If the overhead image V3-2 is selected as the viewfinder display image 50, the AR system 5 generates image data for the view frustum 40 based on the metadata MT in step S271.
  • In step S272, the AR system 5 determines whether or not to display the instruction frustum 40DR.
  • The instruction frustum 40DR is displayed when an instruction has been given by the director.
  • The selection between the mode in which all instruction frustums 40DR are displayed and the mode in which only the instruction frustum 40DR for the cameraman's own camera 2 is displayed is also confirmed here.
  • If the instruction frustum 40DR is not to be displayed, the process of step S201 ends. If the instruction frustum 40DR is to be displayed in the overhead image V3-2, the AR system 5 proceeds to step S273 and generates image data of the instruction frustum 40DR.
  • In step S203 in FIG. 30, the AR system 5 checks in step S275 in FIG. 45 whether the display of the view frustum 40 is OFF. This is to check whether the captured image V1 is currently being displayed.
  • If the camera 2 being processed is currently displaying the overhead image V3-2, the AR system 5 proceeds to step S278, where it synthesizes the image data of the view frustum 40 with the image data of the overhead image V3-2, and, if image data of the instruction frustum 40DR has been generated, also synthesizes the instruction frustum 40DR into that image data.
  • If the camera 2 being processed is currently displaying the captured image V1, the process branches in step S276 depending on whether or not there is an instruction from the director. If there is no instruction, the process of step S203 ends. If there is an instruction from the director, the captured image V1 is set in step S277 so that the instruction direction 54 and the coincidence rate 53 are displayed.
  • Then, in step S205 of FIG. 30, video data is output to the camera 2. That is, video data of the captured image V1 as shown in FIG. 44 or video data of the overhead image V3-2 as shown in FIG. 42A is output to the camera 2.
  • the viewfinder display image 50 may be switched between the shot image V1, the overhead image V3-2, and a composite image as shown in FIG. 40A by operation of the cameraman.
  • Fig. 46A shows a state in which a photographed image V1 and an overhead image V3-2 are displayed as a viewfinder display image 50 of camera 2.
  • the overhead image V3-2 is composited into a corner of the screen of the photographed image V1.
  • Fig. 46B shows an enlarged view of the overhead image V3-2.
  • In this example, only the view frustum 40 of that camera 2 itself is displayed in the overhead image V3-2, whereas in the overhead image V3-1 displayed on the GUI device 11 on the director's side, the view frustums 40 of all the cameras 2 are displayed as described with reference to FIG. 28 and the like.
  • The marker frustums 40M1 and 40M2 are displayed in response to the cameraman registering subject positions and directions to be photographed, that is, marking directions in which the cameraman frequently wishes to shoot.
  • The marker frustums 40M1 and 40M2 may be displayed in a manner different from the view frustum 40.
  • The marker frustum 40M1 and the marker frustum 40M2 may also be displayed in manners different from each other. For example, when the view frustum 40 is semi-transparent white, the marker frustum 40M1 may be semi-transparent yellow and the marker frustum 40M2 semi-transparent light blue.
  • the positions of the marker frustums 40M1 and 40M2 may be indicated by markers 55M1 and 55M2 on the captured image V1.
  • the correspondence may be clearly indicated by making the marker 55M1 yellow like the marker frustum 40M1 and making the marker 55M2 light blue like the marker frustum 40M2.
  • FIG. 48 shows a specific example of steps S201, S202, S203, and S204 in FIG. 30.
  • In step S201 of FIG. 30, the AR system 5 performs the processes of steps S300 to S303 in FIG. 48.
  • In step S300, the AR system 5 generates image data of the view frustum 40 based on the metadata MT.
  • For example, the view frustum 40 corresponding to the camera 2 being processed is generated; alternatively, the view frustums 40 corresponding to all of the cameras 2 may be generated.
  • In step S301, the AR system 5 determines whether or not a marking operation has been performed on the camera 2 being processed.
  • A marking operation is an operation for adding or deleting a marking. If no marking operation has been performed, the process of step S201 ends.
  • If a marking operation has been performed, the AR system 5 adds a marking point to the registered markings or deletes a marking from the registered markings for the camera 2 being processed in step S302. Then, in step S303, the AR system 5 generates image data of the marker frustums 40M as necessary. That is, if there are markings registered at that time, image data of the marker frustums 40M is generated.
  • In step S202 of FIG. 30, the AR system 5 generates the view frustums 40 for the director in step S310 of FIG. 48.
  • In this case, image data of the view frustums 40 corresponding to all cameras 2 is generated.
  • In step S203 in FIG. 30, the AR system 5 performs the processes of steps S320 and S321 in FIG. 48.
  • In step S320, the AR system 5 synthesizes the view frustum 40 with the CG data as the overhead image V3-2. If there are registered markings, the AR system 5 also synthesizes the image data of the marker frustums 40M.
  • In step S321, the AR system 5 combines the markers 55M with the captured image V1 in accordance with the marking registration. In this way, the video data of the overhead image V3-2 and the captured image V1 to be transmitted to the camera 2 is generated.
  • In step S204 in FIG. 30, the AR system 5 performs the process of step S330 in FIG. 48.
  • In step S330, the AR system 5 synthesizes the view frustums 40 with the CG data as the overhead image V3-1. As a result, video data for the overhead image V3-1 is generated.
  • Then, in step S205 of FIG. 30, the video data of the overhead image V3-2 and the captured image V1 is transmitted to the camera 2, and the video data of the overhead image V3-1 is transmitted to the GUI device 11.
  • This allows the cameraman to visually recognize the marker frustum 40M and the marker 55M in accordance with the marking registration operation. From the director's perspective, by not displaying the marker frustum 40M and marker 55M, the overhead image V3-1 does not become unnecessarily cluttered.
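  • The marking registration handled in steps S301 to S303 could be held in a per-camera store such as the following sketch; the structure and field names are illustrative only.

      from dataclasses import dataclass, field
      from typing import Dict, List, Tuple

      Vec3 = Tuple[float, float, float]

      @dataclass
      class Marking:
          label: str        # e.g. "40M1", "40M2"
          direction: Vec3   # registered shooting direction in CG-space coordinates
          color: str        # display mode, e.g. "yellow" or "light blue"

      @dataclass
      class MarkingRegistry:
          """Per-camera store of markings used to draw the marker frustums 40M in the
          overhead image V3-2 and the markers 55M on the captured image V1."""
          markings: Dict[str, List[Marking]] = field(default_factory=dict)

          def add(self, camera_id: str, marking: Marking) -> None:
              self.markings.setdefault(camera_id, []).append(marking)

          def delete(self, camera_id: str, label: str) -> None:
              self.markings[camera_id] = [m for m in self.markings.get(camera_id, [])
                                          if m.label != label]

          def for_camera(self, camera_id: str) -> List[Marking]:
              # only the camera's own markings appear in its overhead image V3-2;
              # the director's overhead image V3-1 simply does not query this registry
              return self.markings.get(camera_id, [])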
  • FIG. 49A shows an example in which an overhead image V3-1 is displayed as the device display image 51 of the GUI device 11
  • FIG. 49B shows an example in which an overhead image V3-2 is simultaneously displayed as the viewfinder display image 50 of the camera 2.
  • In the overhead image V3-1 of FIG. 49A, the view frustums 40a, 40b, and 40c of the cameras 2 are displayed in the same manner, for example, in semi-transparent white.
  • On the other hand, in the overhead image V3-2 of FIG. 49B displayed by the camera 2 corresponding to the view frustum 40b, that view frustum 40b is highlighted in, for example, semi-transparent red, while the view frustums 40a and 40c of the other cameras 2 are each displayed in the usual semi-transparent white.
  • Similarly, in the overhead image V3-2 displayed by the camera 2 corresponding to the view frustum 40a, that view frustum 40a is highlighted in, for example, semi-transparent red, and the view frustums 40b and 40c of the other cameras 2 are each displayed in the normal semi-transparent white.
  • Likewise, in the overhead image V3-2 displayed by the camera 2 corresponding to the view frustum 40c, that view frustum 40c is highlighted in, for example, semi-transparent red, and the view frustums 40a and 40b of the other cameras 2 are each displayed in the normal semi-transparent white.
  • the director can check the view frustum 40 of each camera 2 evenly, and the cameraman can easily check the view frustum 40 of the camera 2 he is operating.
  • FIG. 50A shows an example in which an overhead image V3-1 is displayed as the device display image 51 of the GUI device 11
  • FIG. 50B shows an example in which an overhead image V3-2 is simultaneously displayed as the viewfinder display image 50 of the camera 2.
  • In the overhead image V3-1 of FIG. 50A, the view frustums 40a, 40b, and 40c of the cameras 2 are displayed in the same manner, for example, in semi-transparent white.
  • In the overhead image V3-2 of FIG. 50B, the viewpoint position is set to the position of the camera 2 corresponding to the view frustum 40b.
  • In the overhead image V3-2 displayed by the camera 2 corresponding to the view frustum 40a, that view frustum 40a is highlighted in, for example, semi-transparent red, the view frustums 40b and 40c of the other cameras 2 are each displayed in the normal semi-transparent white, and the viewpoint position is set to the position of the camera 2 of the view frustum 40a.
  • the overhead view image V3-2 of the camera 2 corresponding to the view frustum 40c also has its own view frustum 40 highlighted, and the viewpoint position is the position of the camera 2 of the view frustum 40c.
  • the director can check the view frustum 40 of each camera 2 evenly, and the cameraman can check the view frustum 40 of the camera 2 he is operating from a viewpoint similar to his own.
  • FIG. 51 shows an example in which an overhead image V3-1 is displayed as the device display image 51 of the GUI device 11.
  • an example is shown in which two overhead images are synthesized and displayed as overhead images V3-1a and V3-1b.
  • For example, the overhead image V3-1a is an image from a viewpoint diagonally above the stadium, and the overhead image V3-1b is an image from a viewpoint directly above.
  • The director needs to grasp the overall situation of the cameras, so it is desirable to display multiple overhead images V3-1 from different viewpoints in this way.
  • FIG. 52 shows a specific example of steps S201, S202, S203, and S204 in FIG. 30.
  • In step S410, the AR system 5 generates image data of the view frustums 40 for the cameraman based on the metadata MT.
  • In this case, the image data is generated in a state in which the view frustum 40 corresponding to the camera 2 being processed is highlighted.
  • In step S202 of FIG. 30, the AR system 5 generates the view frustums 40 for the director in step S420 of FIG. 52.
  • In this case, image data with the same display mode is generated for the view frustums 40 corresponding to all cameras 2.
  • In step S203 in FIG. 30, the AR system 5 performs the processes of steps S430 and S431 in FIG. 52.
  • In step S430, the AR system 5 sets the arrangement of the image data of the view frustums 40 within the 3D coordinate space for the overhead image V3-2.
  • In step S431, the AR system 5 generates video data for the overhead image V3-2 with the position of the target camera 2 in the 3D coordinate space set as the viewpoint position. In this manner, the video data of the overhead image V3-2 to be transmitted to the camera 2 is generated.
  • In step S204 in FIG. 30, the AR system 5 performs the processes of steps S440, S441, and S442 in FIG. 52.
  • In step S440, the AR system 5 synthesizes the view frustums 40 with the CG data as the overhead image V3-1a.
  • In step S441, the AR system 5 synthesizes the view frustums 40 with the CG data as the overhead image V3-1b.
  • In step S442, the AR system 5 generates video data that combines the overhead image V3-1a and the overhead image V3-1b on one screen. This produces the video data of the overhead image V3-1 to be sent to the GUI device 11. A sketch of such a combination is shown below.
  • Then, in step S205 of FIG. 30, the video data of the overhead image V3-2 is transmitted to the camera 2, and the video data of the overhead image V3-1 is transmitted to the GUI device 11.
  • This allows the cameraman to view, for example, the overhead image V3-2 as shown in FIG. 50B, and the director to view, for example, the overhead images V3-1a and V3-1b as shown in FIG. 51.
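  • The combination of the two overhead images on one screen (step S442) could be done, for example, as in the sketch below; the side-by-side layout, output resolution, and nearest-neighbour resize are assumptions made only to keep the example self-contained.

      import numpy as np

      def combine_overhead_views(v3_1a: np.ndarray, v3_1b: np.ndarray,
                                 out_height: int = 1080, out_width: int = 1920) -> np.ndarray:
          """Place the rendered overhead images V3-1a and V3-1b side by side on one frame.
          Both inputs are H x W x 3 uint8 arrays."""
          def fit(img, h, w):
              ys = np.arange(h) * img.shape[0] // h   # nearest-neighbour row indices
              xs = np.arange(w) * img.shape[1] // w   # nearest-neighbour column indices
              return img[ys][:, xs]

          half_w = out_width // 2
          canvas = np.zeros((out_height, out_width, 3), dtype=np.uint8)
          canvas[:, :half_w] = fit(v3_1a, out_height, half_w)
          canvas[:, half_w:] = fit(v3_1b, out_height, out_width - half_w)
          return canvas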
  • the captured image V1 may be displayed together with the view frustum 40 as described in Fig. 9 to Fig. 27.
  • The examples described in the embodiments can also be implemented in combination with one another.
  • <5. Summary and Modifications> According to the above embodiment, the following effects can be obtained.
  • an information processing device 70 as an AR system 5 is equipped with an image processing unit 71a that generates image data for simultaneously displaying an overhead image V3 of the target space 8, a view frustum 40 (shooting range presentation image) that presents the shooting range of the camera 2 within the overhead image V3, and the captured image V1 of the camera 2 on one screen (see Figures 7 and 19).
  • This allows the viewer to easily grasp the correspondence between the image captured by the camera 2 and the position in space.
  • the video processor 71a generates video data that causes the captured video V1 to be displayed within the view frustum 40 (see FIGS. 9 to 14).
  • That is, the video processing unit 71a generates video data in which the captured image V1 is displayed arranged within the range of the shooting range presentation image (the view frustum 40).
  • the image processing unit 71a generates image data in which the captured image V1 is displayed at a position within the depth of field range shown on the view frustum 40 (see Figures 9 and 10).
  • the depth of field range 42 is displayed within the view frustum 40, and the captured image V1 is displayed inside the display of the depth of field range 42.
  • This causes the captured image V1 to be displayed at a position close to the actual position of the subject within the overhead image V3. Therefore, the viewer can easily grasp the relationship between the shooting range of the view frustum 40, the actual captured image V1, and the position of the captured subject.
  • the video processor 71a generates video data in which the captured video V1 is displayed on the focus plane 41 shown on the view frustum 40 (see FIG. 9).
  • a focus plane 41 is displayed within the view frustum 40, and the captured image V1 is displayed on the focus plane 41. This allows the viewer to easily confirm the focus position of the camera 2 and the image of the subject at that position.
  • the image processing unit 71a generates image data in which the captured image V1 is displayed farther away than the depth of field range 42 when viewed from the frustum starting point 46 (see Figures 12 to 14).
  • the view frustum 40 is an image that spreads in a quadrangular pyramid shape, and the area of the cross section increases as it goes farther. Therefore, by displaying the captured image V1 on or near the frustum far end surface 45, it is possible to display the captured image V1 relatively large within the view frustum 40. This is suitable, for example, when the contents of the captured image V1 are to be confirmed.
  • the image processing unit 71a generates image data in which the captured image V1 is displayed at a position closer to the frustum origin 46 (a surface 47 near the frustum origin) than the depth of field range 42 shown on the view frustum 40 (see Figure 11).
  • For example, when it is desired to check the depth of field range 42 or the focus plane 41 within the view frustum 40, or when it is difficult to display the image on the frustum far end surface 45, it is preferable to display the captured image V1 at a position close to the frustum starting point 46.
  • an image generation control unit 71b controls the generation of image data by variably setting the display position of the captured image V1, which is simultaneously displayed on one screen together with the overhead image V3 and the view frustum 40 (see Figures 7, 23, and 24).
  • the display position of the captured image V1 is set as any position inside the view frustum 40 or any position outside the view frustum 40. By setting an appropriate position, it is possible to make it easier for the viewer to grasp the captured image V1, and to prevent the view frustum 40 and the captured image V1 from interfering with each other.
  • In the embodiment, the image generation control unit 71b determines whether to change the display position of the captured image V1, and changes the setting of the display position of the captured image V1 in accordance with the determination result (see FIG. 24). For example, a change determination is performed so that the display position of the captured image V1 is automatically changed to an appropriate position, whereby the view frustum 40 and the captured image V1 are displayed in a positional relationship that is appropriate for the viewer, for example one that provides good visibility or that makes the correspondence easy to understand.
  • the image generation control unit 71b determines whether or not it is necessary to change the display position of the captured image V1 based on the positional relationship between the view frustum 40 and the object represented in the overhead image V3 (see steps S160 and P1 in Figure 24). For example, when the far end side of the view frustum 40 is embedded in the ground GR or a structure CN in the overhead image V3, the image may become unnatural or may not be displayed at all when displayed on the frustum far end surface 45. In such a case, the image generation control unit 71b determines that the position setting needs to be changed and changes the position setting of the captured image V1. This makes it possible to automatically provide an easily viewable captured image V1.
  • the image generation control unit 71b judges whether or not the display position of the captured image V1 needs to be changed based on the angle formed between the viewing direction of the overhead image V3 as a whole and the axial direction of the view frustum 40 (see steps S160 and P2 in FIG. 24). That is, this is the angle between the line-of-sight direction from the viewpoint set for the overhead image V3 at a given point in time and the axial direction of the displayed view frustum 40 (a minimal code sketch of this determination, together with the positional-relationship check above, is given after this list).
  • the axial direction of the view frustum 40 is the direction of a perpendicular line drawn from the frustum starting point 46 to the frustum far end surface 45.
  • the size and direction of the rendered view frustum 40 change according to the angle of view and shooting direction of the camera 2.
  • the image generation control unit 71b determines that the position setting needs to be changed according to the angle of the view frustum 40, and changes the position setting of the captured image V1. This makes it possible to automatically provide the captured image V1 in an easy-to-view state.
  • the image generation control unit 71b determines whether or not the display position of the captured image V1 needs to be changed based on a change in the viewpoint of the overhead image V3 (see steps S160 and P3 in Figure 24). For example, changing the viewpoint of the overhead image V3 changes the direction, size, angle, etc. of the view frustum 40.
  • the image generation control unit 71b judges whether the display of the captured image V1 up to that point is appropriate, and changes the settings if necessary. This makes it possible to provide the captured image V1 in a state that is always easy to view, even if the viewer arbitrarily changes the overhead image V3.
  • the image generation control unit 71b uses the type information of the camera 2 capturing the image V1 to set the destination to which the display position of the captured image V1 is changed (see step S163 in FIG. 24).
  • the change destination of the display position of the captured image V1 is set depending on whether the camera 2 is a fixed type using a tripod 6 or a mobile type. This makes it possible to set a position according to the fixed type camera 2F and the mobile type camera 2M.
  • for a mobile camera 2M, for example, the view frustum 40 changes frequently, so an easy-to-view display can be provided by displaying the captured image V1 at a position that is less affected by changes in the view frustum 40.
  • the image generation control unit 71b changes the setting of the display position of the captured image V1 in response to a user operation (see FIG. 23).
  • the user who is the viewer, can arbitrarily switch the display position of the captured image V1, thereby allowing the captured image V1 to be displayed at a position that suits the viewer's ease of viewing and purpose.
  • the image generation control unit 71b changes the display position of the captured image V1 within the view frustum 40 (see Figs. 23 and 24). For example, within the view frustum 40, switching is performed among the focus plane 41, the frustum far end plane 45, the plane on the frustum starting point 46 side, the plane within the depth of field, etc. This allows the captured image V1 to be displayed at an appropriate position while clarifying the correspondence between the view frustum 40 and the captured image V1.
  • the image generation control unit 71b changes the display position of the captured image V1 between inside and outside the view frustum 40 (see Figs. 23 and 24).
  • the display position of the captured image V1 is changed within the view frustum 40, such as the focus plane 41, the frustum far end plane 45, the plane on the frustum starting point 46 side, and the plane within the depth of field range, or further, at a position outside the view frustum 40, such as near the camera, in the corner of the screen, or near the focus plane 41.
  • the video processing unit 71a generates video data that simultaneously displays an overhead image V3, each view frustum 40 for each of the multiple cameras 2, and each captured image V1 for each of the multiple cameras 2 on a single screen (see Figures 16, 17, and 27).
  • the view frustum 40 and the captured images V1 of the multiple cameras 2 are displayed in the CG space 30 represented by the overhead image V3. This allows the viewer to easily understand the relationship between the shooting ranges of the cameras 2. This is convenient for a director, for example, to check the contents of the images captured by each camera 2.
  • the view frustum 40 is given as an example of a shooting range presentation image, and its shape is a quadrangular pyramid, but it is not limited to this.
  • it may be an image in which multiple rectangular outlines of a quadrangular pyramid cross section are arranged, or an image in which the outline of a quadrangular pyramid is expressed by a dashed line.
  • the shooting range presentation image may display only the focus plane 41 or only the depth of field range 42 .
  • the information processing device 70 as, for example, the AR system 5 in the embodiment is equipped with an image processing unit 71a that performs in parallel a process of generating first image data that displays the view frustum 40 (image presenting the shooting range) of the camera 2 within the target shooting space 8, and a process of generating second image data that displays an image that displays the view frustum 40 within the target shooting space 8 and has a display mode different from that of the first image data.
  • the first video data and the second video data are the video data of the overhead video V3-1 transmitted to the GUI device 11 and the video data of the overhead video V3-2 transmitted to the camera 2 in the embodiment.
  • the viewer can easily grasp the correspondence between the image of the camera 2 and the position in the space.
  • by generating video data with different display modes according to the role of each viewer of the overhead image V3 including the view frustum 40, it is possible to present information suited to each viewer through the video display.
  • for example, one of the video data of the overhead images V3-1 and V3-2 is video data of an image viewed by a video production director, and the other is video data of an image viewed by a camera operator of a camera 2 shooting the target space 8.
  • for example, the overhead image V3-1 has content intended for viewing by a video production director, such as a director, on the GUI device 11.
  • the overhead image V3-2 has content intended for viewing by a shooting operator such as a cameraman.
  • the video production director refers to staff involved in video production, such as a director and a switcher engineer, other than the camera operator.
  • the camera operator refers to a cameraman who directly operates the camera 2 and a staff member who remotely operates the camera 2.
  • At least one of the video data of the overhead images V3-1 and V3-2 is video data that displays an image including a plurality of view frustums 40 corresponding to a plurality of cameras 2, respectively.
  • one or both of the overhead images V3-1, V3-2 display view frustums 40 for multiple cameras 2.
  • the director, cameraman, etc. can easily grasp the positional relationship of each camera 2 and the subject.
  • a view frustum 40 is displayed for multiple cameras 2, allowing the director or the like to give various instructions and select main line images while recognizing the position and direction of the subject of each camera 2.
  • the view frustum 40 is displayed for the plurality of cameras 2, so that the cameraman can perform shooting operations while taking into consideration the relationship with the other cameras 2.
  • in the overhead image V3-2 viewed by a cameraman, the view frustum 40 may be displayed only for his or her own camera 2. In this way, the cameraman can easily grasp, within the whole image, the position of the subject in the image V1 captured by his or her own camera operation. Alternatively, in the overhead image V3-2 viewed by the cameraman, only the view frustums 40 of the cameras 2 of the other cameramen may be displayed. In this way, the cameraman can operate his or her own camera while recognizing the shooting locations and subjects of the other cameras 2.
  • the video processing unit 71a generates video data as at least one of the video data for the overhead images V3-1, V3-2, which displays an image in which a portion of a plurality of view frustums 40 corresponding to a plurality of cameras 2 is displayed in a different manner from the other view frustums 40. That is, when a plurality of view frustums 40 are displayed, some of them are displayed in a different manner from the other view frustums 40. This makes it possible to realize a display in which a specific view frustum 40 has meaning when displaying a plurality of view frustums 40.
  • the video processing unit 71a generates video data that displays an image in which a portion of a plurality of view frustums 40 corresponding to a plurality of cameras 2 is highlighted as at least one of the video data for the overhead images V3-1, V3-2.
  • a particular view frustum 40 can be clearly identified by displaying some of the view frustums 40 in a more emphasized manner than the other view frustums 40 .
  • Examples of highlighting include a display with increased brightness, a display using a conspicuous color, a display with emphasized contours, a blinking display, and the like.
  • the video processing unit 71a generates video data that displays, as an overhead image V3-1, an image in which the view frustum 40 of a specific camera, which is a camera 2 among multiple cameras 2 that contains a subject of interest in the captured image V1, is displayed in a different manner from the other view frustums 40 (see Figures 28 to 32).
  • by highlighting the view frustum 40 of a camera 2 selected from among the cameras 2 capturing the target subject, it becomes easy for the director to know which camera is appropriate when he or she wants to use the image of the target subject as the main line image. It also becomes easy for the director to understand the positional relationship between the camera 2 capturing the target subject and the shooting directions of the other cameras 2.
  • the specific camera whose view frustum 40 is highlighted is the camera 2 in which the screen occupancy rate of the target subject in the captured image V1 is highest (see FIGS. 29, 30, and 31); a simple sketch of this selection, together with the duration-based selection below, is given after this list.
  • the director can give instructions while grasping the status of the camera 2 mainly showing the target subject and the other cameras 2.
  • the specific camera for highlighting the view frustum 40 is the camera 2 that has the longest continuous shooting time of the target subject in the shot video V1 (see FIG. 32).
  • the director can grasp the status of the camera 2 that mainly films the subject of interest and other cameras 2 and give instructions accordingly.
  • the video processing unit 71a generates, as the video data for the overhead image V3-1, video data in which the view frustum 40 of a camera 2, among the multiple cameras 2, for which a specific operation by the shooting operator has been detected is displayed in a different manner from the other view frustums 40 (see Figures 33 and 34).
  • the video processing unit 71a generates video data as video data for the overhead video V3-1 in which, when the view frustums 40 of multiple cameras 2 overlap within the displayed image, the overlapping view frustums 40 are displayed in a different manner from the non-overlapping view frustums 40 (see Figures 35 and 36).
  • when the view frustums 40 of multiple cameras 2 overlap, it means that the multiple cameras 2 are shooting in the direction of a common subject (a simplified overlap check is sketched after this list).
  • the video processing unit 71a generates video data that preferentially displays one of the overlapping view frustums 40 as at least one of the overhead images V3-1, V3-2 when the view frustums 40 of multiple cameras 2 overlap on the displayed image (see Figures 37 and 38).
  • that is, when the view frustums 40 of multiple cameras 2 overlap, the video processing unit 71a preferentially displays one view frustum 40 in the overlapping portion.
  • the focus plane 41 and depth of field range 42 of only one view frustum 40 that has been set as the priority are displayed.
  • in the overlapping portion, it is possible to increase the brightness of only the view frustum 40 that has been set as the priority, or to give it a conspicuous color. The highlighted display described above may also be used. Alternatively, only the view frustum 40 set as the priority may be displayed in the overlapping portion. These measures also make it easier to view an overhead image V3 that includes multiple view frustums 40.
  • for example, in the overhead image V3-1 viewed by the director, the view frustum 40 of the camera 2 providing the main line image is displayed with priority, while no particular priority is set in the overhead image V3-2 viewed by the cameraman.
  • alternatively, in the overhead image V3-2, the view frustum 40 of the camera 2 that the cameraman himself or herself operates may be displayed with priority.
  • video processing unit 71a generates video data for displaying, as overhead images V3-1 and V3-2, images including instruction images in different display modes (see FIGS. 39 to 45).
  • on the cameraman side, the instruction frustum 40DR is displayed on the screen, so that the cameraman can visually understand the contents of the instruction.
  • the overhead images V3-1 and V3-2 are displayed in a way that is appropriate for each role, so that the shooting can proceed smoothly.
  • the video processing unit 71a sets the video data of the overhead image V3-1 as video data that displays instruction images for multiple cameras 2, and sets the video data of the overhead image V3-2 as video data that displays instruction images for a specific camera 2 among the multiple cameras (see Figures 39, 41, and 42). This allows the director to understand the instructions for each camera, while camera operators can easily understand the instructions by only seeing the instructions that are directed to them.
  • the video processing unit 71a sets the video data of the overhead image V3-2 as video data that displays an instruction image within an image from a viewpoint corresponding to the position of a specific camera 2 among the multiple cameras (see Figures 42 and 43).
  • for the cameraman, the instruction frustum 40DR is displayed in the overhead image V3-2 from his or her own viewpoint, so that the indicated direction can easily be grasped from that viewpoint.
  • the video processing unit 71a generates video data for the overhead video V3-2 that displays the current view frustum 40 and a marker image in the shooting direction based on the marking operation (see Figures 46 to 48).
  • the bird's-eye view image V3-2 including the marker images of the marker frustum 40M, the marker 55M, etc. is displayed. This allows the cameraman to mark the shooting position or subject that he or she has set, which is convenient for taking pictures of that position at the appropriate time.
  • the video processing unit 71a generates video data as the video data for the overhead video V3-2, which displays an overhead video from a viewpoint corresponding to the position of a specific camera 2 among multiple cameras, and generates video data as the video data for the overhead video V3-1, which displays an overhead video from a different viewpoint (see Figures 49 to 52).
  • for the cameraman, the overhead image V3-2 is displayed from a viewpoint corresponding to his or her own, making it easy to recognize the overall situation and his or her own shooting direction.
  • the bird's-eye view V3-1 is displayed from a viewpoint that makes it easy to grasp the whole picture, rather than from the viewpoint of a specific cameraman, making it ideal for directing the entire shoot.
  • the video processing unit 71a generates video data for displaying a plurality of overhead views V3-1a, V3-1b from a plurality of viewpoints as the video data for the overhead view V3-1 (see FIGS. 51 and 52). Since the director needs to understand the shooting conditions of each camera 2, an overhead image V3-1 that provides an overall bird's-eye view from a plurality of viewpoints as shown in FIG. 51 is extremely useful.
  • the video processor 71a generates the overhead view video V3 as a virtual video using CG. This makes it possible to generate an overhead image V3 from any viewpoint, and to display the view frustum 40 and the captured image V1 from a variety of viewpoints.
  • the view frustum 40 is configured to display the shooting direction and angle of view at the time of shooting in real time, but it may also be configured to display a past view frustum 40, for example, during a prior simulation of camera work.
  • the current view frustum 40 at the time of shooting and the past view frustum 40 may be displayed at the same time for comparison. In such a case, it is advisable to make the past view frustum 40 different from the current view frustum 40 by increasing its transparency, for example, so that the cameraman or the like can distinguish between them.
  • the program of the embodiment is a program that causes a processor such as a CPU or DSP, or a device including these, to execute the processes shown in Figures 20, 21, 22, 23, and 24 described above. That is, the program of the embodiment is a program that causes the information processing device 70 to execute a process of generating video data that simultaneously displays, on one screen, an overhead image V3 of the space to be photographed, a view frustum 40 (shooting range presentation image) that presents the shooting range of the camera 2 within the overhead image V3, and the captured image V1 of the camera 2.
  • the program of the embodiment is a program that causes a processor such as a CPU or DSP, or a device including these, to execute the processes shown in Figures 30, 31, 32, 34, 36, 38, 41, 43, 45, 48, and 52 described above. That is, the program of the embodiment is a program that causes the information processing device 70 to execute in parallel a process of generating first video data that displays a view frustum 40 (shooting range display image) that presents the shooting range of the camera 2 within the shooting target space, and a process of generating second video data that displays an image that displays the view frustum 40 within the shooting target space and has a display mode different from that of the image generated by the first video data.
  • Such a program can be pre-recorded on an HDD serving as a recording medium built into a device such as a computer device, or in a ROM in a microcomputer having a CPU. Such a program can also be temporarily or permanently stored (recorded) in a removable recording medium such as a flexible disk, a CD-ROM (Compact Disc Read Only Memory), an MO (Magneto Optical) disk, a DVD (Digital Versatile Disc), a Blu-ray Disc (registered trademark), a magnetic disk, a semiconductor memory, or a memory card.
  • such a removable recording medium can be provided as so-called package software.
  • Such a program can be installed in a personal computer or the like from a removable recording medium, or can be downloaded from a download site via a network such as a LAN (Local Area Network) or the Internet.
  • Such a program is suitable for the widespread provision of the information processing device 70 of the embodiment. For example, by downloading the program to personal computers, communication devices, mobile terminal devices such as smartphones and tablets, mobile phones, game devices, video devices, PDAs (Personal Digital Assistants), and the like, these devices can function as the information processing device 70 of the present disclosure.
  • An information processing device comprising: an image processing unit that generates image data for simultaneously displaying an overhead image of a space to be photographed, a shooting range presentation image that presents the shooting range of a camera within the overhead image, and the image photographed by the camera on a single screen.
  • the image processing unit generates image data in which the captured image is displayed within the shooting range presentation image.
  • the image processing unit generates image data in which the captured image is displayed at a position within a depth of field range shown in the shooting range presentation image.
  • the information processing device further comprising an image generation control unit that controls generation of image data by variably setting a display position of the captured image that is simultaneously displayed on one screen together with the overhead image and the shooting range presentation image.
  • the image generation control unit determines whether to change a display position of the shot image, and changes a setting of the display position of the shot image according to a result of the determination.
  • the image generation control unit determines whether or not it is necessary to change the display position of the captured image based on a positional relationship between the shooting range presentation image and an object represented in the overhead image.
  • a program that causes an information processing device to execute a process of generating video data that simultaneously displays an overhead image of a space to be photographed, a shooting range presentation image that presents the camera's shooting range within the overhead image, and the image captured by the camera on a single screen.
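The display-position change determination mentioned above (the positional-relationship check P1 and the angle check P2 of Fig. 24) can be illustrated with a minimal sketch. The function names, the simple ground-plane test, the angle threshold, and the assumption of unit direction vectors below are illustrative choices only and are not taken from the disclosure.

```python
import math

def frustum_far_corners(cam_pos, forward, up, right, far_dist, h_fov, v_fov):
    """Return the four corner points of the frustum far end surface (45).
    forward/up/right are assumed to be unit vectors of the camera frame."""
    cx = [cam_pos[i] + forward[i] * far_dist for i in range(3)]
    half_w = far_dist * math.tan(h_fov / 2)
    half_h = far_dist * math.tan(v_fov / 2)
    corners = []
    for sx, sy in ((-1, -1), (-1, 1), (1, -1), (1, 1)):
        corners.append([cx[i] + sx * half_w * right[i] + sy * half_h * up[i] for i in range(3)])
    return corners

def far_end_buried(corners, ground_z=0.0):
    """P1-style check: is the far end surface embedded in the ground plane (z-up assumed)?"""
    return any(c[2] < ground_z for c in corners)

def view_angle_too_shallow(view_dir, frustum_axis, limit_deg=75.0):
    """P2-style check: angle between the overhead-view line of sight and the frustum axis."""
    dot = sum(a * b for a, b in zip(view_dir, frustum_axis))
    norm = math.dist((0, 0, 0), view_dir) * math.dist((0, 0, 0), frustum_axis)
    angle = math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
    return angle > limit_deg

def decide_display_position(corners, view_dir, frustum_axis):
    """Choose where to place the captured image V1 within the view frustum 40."""
    if far_end_buried(corners) or view_angle_too_shallow(view_dir, frustum_axis):
        return "near_origin_plane"   # surface 47 close to the frustum starting point 46
    return "far_end_plane"           # frustum far end surface 45

# Example: a level camera 10 m above the ground whose far plane dips below the ground.
corners = frustum_far_corners((0, 0, 10), (1, 0, 0), (0, 0, 1), (0, 1, 0),
                              60.0, math.radians(40), math.radians(25))
print(decide_display_position(corners, view_dir=(0.9, 0.3, -0.3), frustum_axis=(1, 0, 0)))
```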
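The selection of the specific camera whose view frustum 40 is highlighted, either by the screen occupancy rate of the target subject (Figs. 29 to 31) or by its continuous shooting time (Fig. 32), can be sketched as follows. The data structure and field names are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class CameraStatus:
    camera_id: str
    occupancy: float        # screen occupancy rate of the target subject in V1 (0.0 - 1.0)
    continuous_secs: float  # time the subject has stayed continuously in this camera's V1

def camera_by_occupancy(statuses):
    """Pick the camera whose captured image has the largest subject occupancy (Figs. 29-31)."""
    visible = [s for s in statuses if s.occupancy > 0.0]
    return max(visible, key=lambda s: s.occupancy).camera_id if visible else None

def camera_by_duration(statuses):
    """Pick the camera that has kept the subject in frame the longest (Fig. 32)."""
    visible = [s for s in statuses if s.continuous_secs > 0.0]
    return max(visible, key=lambda s: s.continuous_secs).camera_id if visible else None

# Example: camera "2b" would be highlighted by occupancy, "2c" by duration.
statuses = [CameraStatus("2a", 0.10, 12.0),
            CameraStatus("2b", 0.35, 4.0),
            CameraStatus("2c", 0.20, 30.0)]
print(camera_by_occupancy(statuses), camera_by_duration(statuses))
```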
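As a rough illustration of detecting that the view frustums 40 of multiple cameras 2 overlap (Figs. 35 to 38), the sketch below approximates each frustum by a cone and samples points along one axis against the other. This cone approximation and the sampling scheme are simplifications assumed for illustration, not the method of the disclosure.

```python
import math

def point_in_frustum(point, apex, axis, half_angle_rad, far_dist):
    """Rough point-inside-cone test used as a stand-in for a frustum containment check.
    axis is assumed to be a unit vector."""
    v = [point[i] - apex[i] for i in range(3)]
    dist_along = sum(v[i] * axis[i] for i in range(3))
    if dist_along < 0 or dist_along > far_dist:
        return False
    radial = math.sqrt(max(0.0, sum(c * c for c in v) - dist_along ** 2))
    return radial <= dist_along * math.tan(half_angle_rad)

def frustums_overlap(f_a, f_b, samples=20):
    """Sample points along frustum A's axis and test them against frustum B, and vice versa."""
    def any_axis_point_inside(src, dst):
        for k in range(1, samples + 1):
            t = src["far_dist"] * k / samples
            p = [src["apex"][i] + src["axis"][i] * t for i in range(3)]
            if point_in_frustum(p, dst["apex"], dst["axis"], dst["half_angle"], dst["far_dist"]):
                return True
        return False
    return any_axis_point_inside(f_a, f_b) or any_axis_point_inside(f_b, f_a)

# Two cameras aimed at the same area around (25, 0, 0):
f1 = {"apex": (0, 0, 0),   "axis": (1.0, 0.0, 0.0), "half_angle": math.radians(20), "far_dist": 50.0}
f2 = {"apex": (25, -20, 0), "axis": (0.0, 1.0, 0.0), "half_angle": math.radians(20), "far_dist": 50.0}
print(frustums_overlap(f1, f2))
```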

Abstract

This information processing device includes a video processing unit generating video data causing simultaneous display, within the same screen, of a bird's-eye view of a space for image capturing, an image capturing range presentation video presenting the image capturing range of a camera within the bird's eye view, and a video captured by the camera.

Description

Information processing device, information processing method, and program
This technology relates to an information processing device, an information processing method, and a program, and relates to the display of images of the shooting target space and of virtual images.
There is known a technique for displaying the shooting direction and depth of field of a camera.
Patent Document 1 below discloses a technique for displaying the depth of field and the angle of view based on shooting information.
Patent Document 2 below discloses expressing the shooting range in a captured image using a trapezoidal figure.
Patent Document 3 below discloses generating and displaying a map image for indicating the depth position and focus position of an object to be imaged.
Patent Document 1: JP 2013-183217 A; Patent Document 2: JP 2009-60337 A; Patent Document 3: JP 2010-177741 A
For example, in a system that shoots video for broadcast or distribution, it is useful for cameramen, directors, and others to be able to grasp the shooting direction and angle of view of one or more cameras, the position of the subject being focused on, and so on. One way to achieve this is to display on the screen the shooting range, which changes depending on the shooting direction and angle of view. However, simply displaying such a shooting range does not allow users to simultaneously check what image is currently being shot by the camera.
This disclosure therefore proposes technology that displays images that make it easier to understand the correspondence between camera images and positions in space.
An information processing device according to the present technology includes an image processing unit that generates image data that simultaneously displays an overhead image of a space to be photographed, a shooting range presentation image that presents the camera's shooting range within the overhead image, and the image photographed by the camera on a single screen.
The shooting range presentation image is an image showing the shooting range determined by the shooting direction and zoom angle of view of the camera. When the image showing the shooting range of the camera is displayed in the overhead image, the camera's captured image is also displayed on the same screen.
Fig. 1 is an explanatory diagram of shooting by the shooting system according to an embodiment of the present technology.
Fig. 2 is an explanatory diagram of AR (Augmented Reality) superimposed images.
Fig. 3 is an explanatory diagram of a system configuration according to the embodiment.
Fig. 4 is an explanatory diagram of another example of the system configuration according to the embodiment.
Fig. 5 is an explanatory diagram of an environment map according to the embodiment.
Fig. 6 is an explanatory diagram of drift correction of the environment map according to the embodiment.
Fig. 7 is a block diagram of an information processing device according to the embodiment.
Fig. 8 is an explanatory diagram of a view frustum according to the embodiment.
Fig. 9 is an explanatory diagram of a display example of a captured image on the focus plane of the view frustum according to the embodiment.
Fig. 10 is an explanatory diagram of a display example of a captured image within the depth of field of the view frustum according to the embodiment.
Fig. 11 is an explanatory diagram of a display example of a captured image at a position close to the starting point of the view frustum according to the embodiment.
Fig. 12 is an explanatory diagram of a display example of a captured image on the far end surface of the view frustum according to the embodiment.
Fig. 13 is an explanatory diagram of a case where the view frustum according to the embodiment is set to infinity.
Fig. 14 is an explanatory diagram of changes in the display state of a captured image on the far end side of the view frustum according to the embodiment.
Fig. 15 is an explanatory diagram of a display example of a captured image outside the view frustum according to the embodiment.
Fig. 16 is an explanatory diagram of a display example of captured images inside and outside a plurality of view frustums according to the embodiment.
Figs. 17 and 18 are explanatory diagrams of display examples of captured images outside the view frustum according to the embodiment.
Fig. 19 is a flowchart of a processing example of the information processing device according to the embodiment.
Figs. 20 to 24 are flowcharts of examples of display position setting processing for the captured image according to the embodiment.
Figs. 25 and 26 are explanatory diagrams of collision determination according to the embodiment.
Fig. 27 is an explanatory diagram of changes in the overhead image according to the embodiment.
Fig. 28 is an explanatory diagram of the overhead image on the director side according to the embodiment.
Fig. 29 is an explanatory diagram of determination of the image to be highlighted according to the embodiment.
Fig. 30 is a flowchart of a processing example of the information processing device according to the embodiment.
Figs. 31 and 32 are flowcharts of processing examples for highlighting according to the embodiment.
Fig. 33 is an explanatory diagram of a display example based on feedback according to the embodiment.
Fig. 34 is a flowchart of a processing example of display based on feedback according to the embodiment.
Fig. 35 is an explanatory diagram of a display example of overlapping view frustums according to the embodiment.
Fig. 36 is a flowchart of a processing example of displaying overlapping view frustums according to the embodiment.
Fig. 37 is an explanatory diagram of priority display of one view frustum according to the embodiment.
Fig. 38 is a flowchart of a processing example when priority display is performed according to the embodiment.
Fig. 39 is an explanatory diagram of a display example of an instruction frustum on the director side according to the embodiment.
Fig. 40 is an explanatory diagram of a display example of the instruction frustum on the cameraman side according to the embodiment.
Fig. 41 is a flowchart of processing for generating different overhead images according to the embodiment.
Fig. 42 is an explanatory diagram of a display example of the instruction frustum on the cameraman side according to the embodiment.
Fig. 43 is a flowchart of processing for generating the overhead image on the cameraman side according to the embodiment.
Fig. 44 is an explanatory diagram of a display example of instruction information on the cameraman side according to the embodiment.
Fig. 45 is a flowchart of processing for generating the overhead image on the cameraman side according to the embodiment.
Fig. 46 is an explanatory diagram of a display example of a marker frustum according to the embodiment.
Fig. 47 is an explanatory diagram of a display example of a marker according to the embodiment.
Fig. 48 is a flowchart of a processing example of displaying marker information according to the embodiment.
Figs. 49 and 50 are explanatory diagrams of display examples of different overhead images according to the embodiment.
Fig. 51 is an explanatory diagram of a display example on the director side according to the embodiment.
Fig. 52 is a flowchart of processing for generating different overhead images according to the embodiment.
The embodiments will be described below in the following order.
1. System configuration
2. Configuration of information processing device
3. Display of view frustum
4. Example of cameraman and director screens
[4-1: Highlighted display]
[4-2: Priority display]
[4-3: Instruction display]
[4-4: Marker display]
[4-5: Examples of various displays]
5. Summary and modifications
In this disclosure, the term "video" or "image" includes both moving images and still images; however, the embodiment will be described using moving image shooting as an example.
1. System configuration
In the embodiment, a shooting system capable of generating so-called AR images, in which a virtual image is composited with a real image, will be taken as an example. Fig. 1 schematically shows how images are captured by the shooting system.
Fig. 1 shows an example in which three cameras 2 are arranged to capture images of the real shooting target space 8. Three cameras is merely an example; one or more cameras 2 may be used.
The shooting target space 8 may be any location; one example is a stadium for soccer, rugby, or the like.
In the example of Fig. 1, the camera 2 includes a mobile camera 2M that is suspended by a wire 9 so that it can move above the shooting target space 8. Images and metadata captured by this mobile camera 2M are sent to a render node 7.
Also shown as the camera 2 is a fixed camera 2F that is fixedly placed on, for example, a tripod 6. Images and metadata captured by this fixed camera 2F are sent to the render node 7 via a CCU (Camera Control Unit) 3.
The captured images and metadata from the mobile camera 2M may also be sent to the render node 7 via the CCU 3.
Hereinafter, "camera 2" collectively refers to the cameras 2F and 2M.
The render node 7 referred to here is a CG (Computer Graphics) engine or video processor that generates CG and composites it with live-action video, and is, for example, a device that generates AR video.
Figs. 2A and 2B show examples of AR images. In Fig. 2A, a line that does not actually exist is composited, as a CG image 38, into live-action footage of a game being played in a stadium. In Fig. 2B, an advertising logo that does not actually exist is composited, as an image 38, into live-action footage of the stadium.
These CG images 38 can be made to look as if they actually exist by rendering them with their shape, size, and compositing position appropriately set according to the position of the camera 2 at the time of shooting, the shooting direction, the angle of view, the photographed structures, and so on.
Generating an AR superimposed image by compositing CG with such live-action footage is itself known. The shooting system of the embodiment further enables the cameraman and director involved in video production to perform production work such as shooting and giving instructions while viewing the AR superimposed image. This makes it possible to shoot while checking how the real scene and the virtual image blend together, enabling video production in line with the creative intent.
In particular, in this embodiment, in a shooting system in which a cameraman or the like can check such AR superimposed images, a shooting range presentation image suitable for the viewer of the monitor image, such as the cameraman or the director, is displayed.
Two configuration examples of the shooting system are shown in Fig. 3 and Fig. 4.
The configuration example of Fig. 3 includes camera systems 1 and 1A, a control panel 10, a GUI (Graphical User Interface) device 11, a network hub 12, a switcher 13, and a master monitor 14.
The dashed arrows indicate the flow of various control signals CS, while the solid arrows indicate the flow of the video data of the captured image V1, the AR superimposed image V2, and the overhead image V3.
The camera system 1 is configured to perform AR linkage, while the camera system 1A is configured not to perform AR linkage.
Although Figs. 3 and 4 show an example of a fixed camera 2F on a tripod 6, a mobile camera 2M may also be used as the camera system 1 or 1A.
The camera system 1 includes a camera 2, a CCU 3, an AI (artificial intelligence) board 4 built into the CCU 3, for example, and an AR system 5.
The camera 2 sends video data of the captured image V1 and metadata MT to the CCU 3. The CCU 3 sends the video data of the captured image V1 to the switcher 13. The CCU 3 also sends the video data of the captured image V1 and the metadata MT to the AR system 5.
The metadata MT includes lens information such as the zoom angle of view and focal length at the time the image V1 was captured, and sensor information from an IMU (Inertial Measurement Unit) or the like mounted on the camera 2. Specifically, this includes the 3-DoF (Degree of Freedom) attitude information of the camera 2, acceleration information, the lens focal length, aperture value, zoom angle of view, lens distortion, and so on. The metadata MT is output from the camera 2 as, for example, frame-synchronized or asynchronous information.
In the case of Fig. 3, the camera 2 is a fixed camera 2F and its position does not change, so the camera position information only needs to be stored as a known value by the CCU 3 and the AR system 5. When a mobile camera 2M is used, the position information is also included in the metadata MT transmitted successively from the camera 2M.
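As a rough illustration only (not part of the disclosure), the per-frame metadata MT described above could be carried in a structure like the following; the field names, types, and units are assumptions.

```python
from dataclasses import dataclass

@dataclass
class CameraMetadata:
    """Illustrative container for the metadata MT sent per frame (field names are assumptions)."""
    yaw: float              # 3-DoF attitude of the camera 2 (degrees)
    pitch: float
    roll: float
    acceleration: tuple     # IMU acceleration (x, y, z)
    focal_length_mm: float
    aperture_f: float
    zoom_fov_deg: float     # zoom angle of view
    lens_distortion: float
    position: tuple = None  # carried only for a mobile camera 2M; a fixed camera uses a known value

# Example frame of metadata for a fixed camera 2F (no position field needed):
mt = CameraMetadata(yaw=30.0, pitch=-5.0, roll=0.0, acceleration=(0.0, 0.0, 9.8),
                    focal_length_mm=35.0, aperture_f=2.8, zoom_fov_deg=40.0, lens_distortion=0.02)
```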
The AR system 5 is an information processing device including a rendering engine that performs CG rendering. The information processing device serving as the AR system 5 is an example of the render node 7 shown in Fig. 1.
The AR system 5 generates video data of an AR superimposed image V2 in which an image 38 generated by CG is superimposed on the image V1 captured by the camera 2. In this case, the AR system 5 sets the size and shape of the image 38 with reference to the metadata MT, and also sets the compositing position within the captured image V1, thereby generating video data of an AR superimposed image V2 in which the image 38 is naturally composited with the real scenery.
The AR system 5 also generates video data of a CG overhead image V3, as described later, for example video data of an overhead image V3 that reproduces the shooting target space 8 by CG. Furthermore, the AR system 5 displays, within the overhead image V3, a view frustum 40 as shown in Fig. 8 (described later) as a shooting range presentation image that visually presents the shooting range of the camera 2.
For example, the AR system 5 calculates the shooting range within the shooting target space 8 from the metadata MT and position information of the camera 2. The shooting range of the camera 2 can be obtained by acquiring the position information of the camera 2, the angle of view, and the attitude information of the camera 2 in the three axial directions (yaw, pitch, roll) on the tripod 6 (corresponding to the shooting direction).
The AR system 5 generates an image as the view frustum 40 in accordance with the calculated shooting range of the camera 2. The AR system 5 generates the video data of the overhead image V3 so that the view frustum 40 is presented from the position of the camera 2 within the overhead image V3 corresponding to the shooting target space 8.
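A minimal sketch of deriving the shooting range (the axis and far end surface of the view frustum 40) from the camera position, the yaw/pitch attitude, and the angle of view might look like the following. The coordinate convention, function names, and the choice of a fixed far distance are assumptions for illustration only.

```python
import math

def direction_from_yaw_pitch(yaw_deg, pitch_deg):
    """Unit vector of the shooting direction from yaw/pitch (roll does not move the axis)."""
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (math.cos(pitch) * math.cos(yaw),
            math.cos(pitch) * math.sin(yaw),
            math.sin(pitch))

def shooting_range(camera_pos, yaw_deg, pitch_deg, h_fov_deg, v_fov_deg, far_dist):
    """Apex, axis, and far end surface extents of the view frustum 40."""
    d = direction_from_yaw_pitch(yaw_deg, pitch_deg)
    far_centre = tuple(camera_pos[i] + d[i] * far_dist for i in range(3))
    half_w = far_dist * math.tan(math.radians(h_fov_deg) / 2)
    half_h = far_dist * math.tan(math.radians(v_fov_deg) / 2)
    return {"apex": camera_pos, "axis": d, "far_centre": far_centre,
            "far_half_width": half_w, "far_half_height": half_h}

# Example: a camera 10 m up, yawed 30 degrees and pitched slightly downward.
print(shooting_range((0.0, 0.0, 10.0), yaw_deg=30.0, pitch_deg=-10.0,
                     h_fov_deg=40.0, v_fov_deg=25.0, far_dist=50.0))
```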
In this disclosure, an "overhead image" (bird's-eye view image) is an image from a viewpoint that looks down on the shooting target space 8, but it does not necessarily have to display the entire shooting target space 8 within the image. An image that includes at least the view frustum 40 of some camera 2 and the space around it is referred to as an overhead image.
In the embodiment, the overhead image V3 is generated as a CG image expressing the shooting target space 8, such as a stadium, but the overhead image V3 may also be generated from live-action images. For example, a camera 2 may be provided at a viewpoint for the overhead image, and its captured image V1 may be used as the overhead image V3. The image V1 captured by the camera 2M moving overhead on the wire 9 may also be used as the overhead image V3. Furthermore, a 3D (three-dimensional) CG model of the shooting target space 8 may be generated using the images V1 captured by multiple cameras 2, and an overhead image V3 with a variable viewpoint position can be generated by setting a viewpoint position for the 3D CG model and rendering it.
The video data of the AR superimposed image V2 and the overhead image V3 generated by the AR system 5 is supplied to the switcher 13.
The video data of the AR superimposed image V2 and the overhead image V3 generated by the AR system 5 is also supplied to the camera 2 via the CCU 3. This allows the cameraman to view the AR superimposed image V2 and the overhead image V3 on a display unit such as the viewfinder of the camera 2.
Note that the video data of the AR superimposed image V2 and the overhead image V3 generated by the AR system 5 may be supplied to the camera 2 without going through the CCU 3. Furthermore, there are also examples in which the CCU 3 is not used in the camera systems 1 and 1A.
The AI board 4 in the CCU 3 performs processing to calculate the amount of drift of the camera 2 from the captured image V1 and the metadata MT.
At each time point, the positional displacement of the camera 2 is obtained by integrating the acceleration information from the IMU mounted on the camera 2 twice. By accumulating the displacement at each time point from a reference origin attitude (the reference attitude position on each of the three axes of yaw, pitch, and roll), the position on the three axes of yaw, pitch, and roll at each time point, that is, attitude information corresponding to the shooting direction of the camera 2, is obtained. However, as the accumulation is repeated, the deviation (accumulated error) between the actual attitude position and the calculated attitude position grows. The amount of this deviation is called the drift amount.
To eliminate such drift, the AI board 4 calculates the drift amount using the captured image V1 and the metadata MT, and sends the calculated drift amount to the camera 2.
The camera 2 receives the drift amount from the CCU 3 (AI board 4), corrects its attitude information, and then outputs metadata MT including the corrected attitude information.
The above drift correction will be explained with reference to Figs. 5 and 6.
Fig. 5 shows the environment map 35. The environment map 35 stores feature points and feature amounts at the coordinates of a virtual dome, and is generated for each camera 2.
The camera 2 is rotated 360 degrees, and an environment map 35 is generated in which feature points and feature amounts are registered at global position coordinates on the celestial sphere. This makes it possible to recover the attitude by feature point matching even if it is lost.
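As an illustrative sketch (the binning scheme and data layout are assumptions, not the disclosed implementation), an environment map 35 that registers feature descriptors at directions on the celestial sphere could be organized as follows.

```python
class EnvironmentMap:
    """Sketch of an environment map 35: features registered on a virtual dome, one map per camera."""
    def __init__(self, bin_deg=1.0):
        self.bin_deg = bin_deg
        self.features = {}   # (yaw_bin, pitch_bin) -> list of feature descriptors

    def register(self, yaw_deg, pitch_deg, descriptor):
        """Register a feature observed in a given direction on the celestial sphere."""
        key = (round(yaw_deg / self.bin_deg), round(pitch_deg / self.bin_deg))
        self.features.setdefault(key, []).append(descriptor)

    def lookup(self, yaw_deg, pitch_deg):
        """Return descriptors registered near a given direction, for feature point matching."""
        key = (round(yaw_deg / self.bin_deg), round(pitch_deg / self.bin_deg))
        return self.features.get(key, [])

# Built by rotating the camera 360 degrees and registering features along the way.
env = EnvironmentMap()
env.register(10.0, 5.0, "descriptor_A")
print(env.lookup(10.0, 5.0))
```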
Fig. 6A schematically shows a state in which a drift amount DA has arisen between the shooting direction Pc of the correct attitude of the camera 2 and the shooting direction Pj calculated from the IMU data.
Information on the three-axis motion, angles, and angle of view of the camera 2 is sent from the camera 2 to the AI board 4 as a guide for feature point matching. The AI board 4 detects the accumulated drift amount DA by feature point matching in image recognition, as shown in Fig. 6B. Each "+" in the figure indicates a feature point of a certain feature amount registered in the environment map 35 and the feature point of the corresponding feature amount in the current frame of the captured image V1, and the arrow between them is the drift amount vector. By detecting the coordinate error by feature point matching and correcting it in this way, the drift amount can be corrected.
When the AI board 4 obtains the drift amount by such feature point matching and the camera 2 outputs corrected metadata MT based on it, the accuracy of the attitude information of the camera 2 detected by the AR system 5 on the basis of the metadata MT can be improved.
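A minimal sketch of this drift correction, assuming that feature point matching yields pairs of registered and observed feature directions, is shown below. Averaging the offsets as the drift amount DA is a simplification chosen for illustration, not the disclosed method.

```python
def estimate_drift(matches):
    """matches: list of ((map_yaw, map_pitch), (observed_yaw, observed_pitch)) pairs
    obtained by feature point matching against the environment map 35.
    Returns the mean (yaw, pitch) offset, taken here as the drift amount DA."""
    if not matches:
        return (0.0, 0.0)
    dy = sum(obs[0] - ref[0] for ref, obs in matches) / len(matches)
    dp = sum(obs[1] - ref[1] for ref, obs in matches) / len(matches)
    return (dy, dp)

def correct_attitude(yaw, pitch, drift):
    """Subtract the accumulated drift from the IMU-integrated attitude."""
    return (yaw - drift[0], pitch - drift[1])

# Example: two matched feature points suggest roughly +0.55/+0.15 degrees of drift.
matches = [((10.0, 5.0), (10.6, 5.2)), ((40.0, -3.0), (40.5, -2.9))]
drift = estimate_drift(matches)
print(correct_attitude(30.0, 0.0, drift))
```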
The camera system 1A in Fig. 3 is an example that has a camera 2 and a CCU 3 but no AR system 5. The camera 2 of the camera system 1A transmits the video data and metadata MT of the captured image V1 to the CCU 3, and the CCU 3 transmits the video data of the captured image V1 to the switcher 13.
The video data of the captured image V1, the AR superimposed image V2, and the overhead image V3 output from the camera systems 1 and 1A described above is supplied to the GUI device 11 via the switcher 13 and the network hub 12.
The switcher 13 selects the so-called main line video from among the images V1 captured by the multiple cameras 2, the AR superimposed image V2, and the overhead image V3. The main line video is the video output for broadcasting or distribution. The switcher 13 outputs the selected video data as the main line video for broadcasting or distribution to a transmitting device, recording device, or the like (not shown).
The video data of the video selected as the main line video is also sent to the master monitor 14 and displayed there, so that the video production staff can check the main line video.
The master monitor 14 may also display the AR superimposed image V2, the overhead image V3, and the like in addition to the main line video.
The control panel 10 is a device with which the video production staff perform operations for switching instructions to the switcher 13, instructions relating to video processing, and various other instructions. The control panel 10 outputs a control signal CS in response to operations by the video production staff. This control signal CS is transmitted to the switcher 13 and the camera systems 1 and 1A via the network hub 12.
The GUI device 11 is, for example, a PC or a tablet device, and is a device with which video production staff, such as a director, can check the video and perform various instruction operations.
The captured image V1, the AR superimposed image V2, and the overhead image V3 are displayed on the display screen of the GUI device 11. For example, on the GUI device 11, the captured images V1 of the multiple cameras 2 are displayed as a list on a split screen, the AR superimposed image V2 is displayed, or the overhead image V3 is displayed. Alternatively, the image selected by the switcher 13 as the main line image may be displayed on the GUI device 11.
The GUI device 11 is also provided with an interface for the director or the like to perform various instruction operations. The GUI device 11 outputs a control signal CS in response to an operation by the director or the like. This control signal CS is transmitted via the network hub 12 to the switcher 13 and the camera systems 1 and 1A.
Depending on the GUI device 11, it is also possible to give instructions regarding, for example, the display mode of the view frustum 40 in the overhead image V3.
A control signal CS corresponding to such an instruction is transmitted to the AR system 5, and the AR system 5 generates video data of an overhead image V3 including a view frustum 40 in the display mode corresponding to the instruction from the director or the like.
The example of Fig. 3 described above includes the camera systems 1 and 1A. In this case, the camera system 1 is a set consisting of the camera 2, the CCU 3, and the AR system 5; in particular, by having the AR system 5, video data of the AR superimposed image V2 and the overhead image V3 corresponding to the image V1 captured by the camera 2 is generated. The AR superimposed image V2 and the overhead image V3 are then displayed on a display unit such as the viewfinder of the camera 2, displayed on the GUI device 11, or selected as the main line video by the switcher 13.
On the camera system 1A side, on the other hand, no video data of an AR superimposed image V2 or an overhead image V3 corresponding to the image V1 captured by its camera 2 is generated.
Fig. 3 is therefore a system in which a camera 2 that performs AR linkage and a camera 2 that performs normal shooting coexist.
The example of Fig. 4 is an example of a system in which one AR system 5 handles each camera 2.
In the case of Fig. 4, a plurality of camera systems 1A are provided. The AR system 5 is provided independently of each camera system 1A.
The CCU 3 of each camera system 1A sends the video data and metadata MT of the image V1 captured by its camera 2 to the switcher 13. The video data and metadata MT of the captured image V1 are then supplied from the switcher 13 to the AR system 5.
This allows the AR system 5 to acquire the video data and metadata MT of the captured image V1 for each camera system 1A, and to generate video data of an AR superimposed image V2 corresponding to the captured image V1 of each camera system 1A and video data of an overhead image V3 including the view frustum 40 corresponding to each camera system 1A. Alternatively, the AR system 5 can generate video data of an overhead image V3 in which the view frustums 40 of the cameras 2 of the multiple camera systems 1A are displayed together.
The video data of the AR superimposed image V2 and the overhead image V3 generated by the AR system 5 is transmitted via the switcher 13 to the CCU 3 of each camera system 1A, and further to the camera 2. This allows the cameraman to view the AR superimposed image V2 and the overhead image V3 on a display unit such as the viewfinder of the camera 2.
 またARシステム5によって生成されるAR重畳映像V2や俯瞰映像V3の映像データは、スイッチャー13、ネットワークハブ12を介してGUIデバイス11に送信され、表示される。これによりディレクター等がAR重畳映像V2や俯瞰映像V3を視認できるようになる。 In addition, the video data of the AR overlay image V2 and the overhead image V3 generated by the AR system 5 is transmitted to the GUI device 11 via the switcher 13 and the network hub 12 and displayed. This allows the director and others to visually confirm the AR overlay image V2 and the overhead image V3.
 このような図4の構成では、それぞれのカメラシステム1AにARシステム5を設けなくとも、各カメラ2のAR重畳映像V2や、俯瞰映像V3を生成し、表示させることができる。 In the configuration shown in FIG. 4, it is possible to generate and display the AR superimposed image V2 and the overhead image V3 of each camera 2 without providing an AR system 5 to each camera system 1A.
 ところで図3,図4では、俯瞰映像V3について、「V3-1」「V3-2」と付記した。
 俯瞰映像V3-1の映像データは、ディレクター等を視認者と想定してGUIデバイス11やマスターモニタ14に表示させる俯瞰映像V3の映像データである。また俯瞰映像V3-2の映像データは、カメラマンを視認者と想定してカメラ2のビューファインダー等に表示させる俯瞰映像V3の映像データである。
Incidentally, in FIG. 3 and FIG. 4, the overhead view image V3 is denoted as "V3-1" and "V3-2".
The video data of the overhead image V3-1 is the video data of the overhead image V3 to be displayed on the GUI device 11 or the master monitor 14, with a director or the like assumed as the viewer. The video data of the overhead image V3-2 is the video data of the overhead image V3 to be displayed on the viewfinder of the camera 2, with a cameraman or the like assumed as the viewer.
 これら俯瞰映像V3-1、V3-2の映像データは同一内容の映像を表示させる映像データであるとしてもよい。これらはいずれも、少なくともビューフラスタム40を含む撮影対象空間8の俯瞰映像V3を表示させる映像データである。但し実施の形態では、これらが異なる表示内容を含む映像データとする場合についても説明する。 The video data for these overhead images V3-1 and V3-2 may be video data that displays images of the same content. Both of these are video data that display an overhead image V3 of the target space 8 that includes at least the view frustum 40. However, in the embodiment, a case will also be described in which these are video data that include different display contents.
In other words, the AR system 5 may generate video data resulting in an overhead video V3 with the same content regardless of the transmission destination, or it may generate, for example, video data of a first overhead video V3-1 to be transmitted to the GUI device 11 and video data of a second overhead video V3-2 to be transmitted to the camera 2 in parallel.
Furthermore, in the case of the system of FIG. 4, it is also assumed that the AR system 5 generates a plurality of second overhead videos V3-2 in parallel so that the content differs for each camera 2.

<2. Configuration of information processing device>
In the above imaging system, a configuration example of the information processing device 70 serving, for example, as the AR system 5 will be described with reference to FIG. 7.
The information processing device 70 is a device capable of information processing, in particular video processing, such as a computer device. Specific examples of the information processing device 70 include a personal computer, a workstation, a mobile terminal device such as a smartphone or a tablet, and a video editing device. The information processing device 70 may also be a computer device configured as a server device or a computing device in cloud computing.
The CPU 71 of the information processing device 70 executes various processes in accordance with programs stored in the ROM 72 or in a non-volatile memory unit 74 such as an EEP-ROM (Electrically Erasable Programmable Read-Only Memory), or programs loaded from the storage unit 79 into the RAM 73. The RAM 73 also stores, as appropriate, data necessary for the CPU 71 to execute the various processes.
The CPU 71 is configured as a processor that performs various kinds of processing. The CPU 71 performs overall control processing and various kinds of arithmetic processing, and in the present embodiment it has the functions of a video processing unit 71a and a video generation control unit 71b for executing video processing as the AR system 5 based on a program.
The video processing unit 71a represents a processing function for performing various kinds of video processing. For example, it performs one or more of 3D model generation processing, rendering, video processing including color and brightness adjustment processing, video editing processing, and video analysis and detection processing.
The video processing unit 71a also performs processing for generating the overhead video V3 as video data that simultaneously displays, on a single screen, an overhead view of the shooting target space 8, a view frustum 40 presenting the shooting range of a camera 2 within the overhead video V3, and the video V1 captured by that camera 2.
The video generation control unit 71b in the CPU 71 performs processing for variably setting the display position of the captured video V1 to be displayed simultaneously on one screen in the overhead video V3 including the view frustum 40 generated by the video processing unit 71a, thereby controlling the generation of the video data by the video processing unit 71a. The video processing unit 71a generates the overhead video V3 including the view frustum 40 in accordance with the settings made by the video generation control unit 71b.
The video processing unit 71a may also perform, in parallel, a process of generating first video data that displays the view frustum 40 of a camera 2 within the shooting target space 8, and a process of generating second video data that displays the view frustum 40 within the shooting target space 8 in a display mode different from that of the video based on the first video data.
In this case, the first video data is, for example, the video data of the overhead video V3-1, and the second video data is, for example, the video data of the overhead video V3-2.
The functions of the video processing unit 71a and the video generation control unit 71b may be realized by a CPU separate from the CPU 71, a GPU (Graphics Processing Unit), a GPGPU (General-purpose computing on graphics processing units), an AI (artificial intelligence) processor, or the like.
The functions of the video processing unit 71a and the video generation control unit 71b may also be realized by a plurality of processors.
The CPU 71, the ROM 72, the RAM 73, and the non-volatile memory unit 74 are interconnected via a bus 83. An input/output interface 75 is also connected to this bus 83.
An input unit 76 consisting of operators and operation devices is connected to the input/output interface 75. For example, various operators and operation devices such as a keyboard, a mouse, keys, a trackball, a dial, a touch panel, a touch pad, and a remote controller are assumed as the input unit 76.
A user operation is detected by the input unit 76, and a signal corresponding to the input operation is interpreted by the CPU 71.
A microphone is also assumed as the input unit 76. Voice uttered by the user can also be input as operation information.
The input/output interface 75 is also connected, either integrally or separately, to a display unit 77 formed of an LCD (Liquid Crystal Display), an organic EL (electro-luminescence) panel, or the like, and to an audio output unit 78 formed of a speaker or the like.
The display unit 77 is a display unit that performs various kinds of display, and is configured, for example, by a display device provided in the housing of the information processing device 70, or by a separate display device connected to the information processing device 70.
The display unit 77 displays various images, operation menus, icons, messages, and the like on the display screen based on instructions from the CPU 71, that is, it performs display as a GUI (Graphical User Interface).
The input/output interface 75 may also be connected to a storage unit 79 configured of an HDD (Hard Disk Drive), a solid-state memory, or the like, and to a communication unit 80.
The storage unit 79 can store various kinds of data and programs. A database can also be configured in the storage unit 79.
The communication unit 80 performs communication processing via a transmission path such as the Internet, and communication with various devices such as an external database, an editing device, or an information processing device by wired/wireless communication, bus communication, or the like.
For example, assuming the information processing device 70 serves as the AR system 5, the communication unit 80 performs communication with the CCU 3 and the switcher 13.
A drive 81 is also connected to the input/output interface 75 as required, and a removable recording medium 82 such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory is mounted thereon as appropriate.
The drive 81 allows video data, various computer programs, and the like to be read from the removable recording medium 82. The read data is stored in the storage unit 79, and video and audio contained in the data are output by the display unit 77 and the audio output unit 78. Computer programs and the like read from the removable recording medium 82 are installed in the storage unit 79 as necessary.
In this information processing device 70, for example, software for the processing of the present embodiment can be installed via network communication by the communication unit 80 or via the removable recording medium 82. Alternatively, the software may be stored in advance in the ROM 72, the storage unit 79, or the like.
<3. Display of view frustum>
The display of the view frustum 40 will now be described. As described above, the AR system 5 generates the overhead video V3 and can transmit it to the viewfinder of the camera 2, the GUI device 11, or the like for display. The AR system 5 generates the video data of the overhead video V3 so that the view frustum 40 of the camera 2 is displayed within the overhead video V3.
FIG. 8 shows an example of the view frustum 40 displayed in the overhead video V3. FIG. 8 is an example of a CG image of the shooting target space 8 of FIG. 1 viewed from above, shown in simplified form for the purpose of explanation. An example of an overhead video V3 of a stadium is shown in FIG. 16, which will be described later.
The overhead video V3 of FIG. 8 includes images representing a background 31, such as a stadium, and persons 32, such as players. Although the camera 2 is shown in FIG. 8, this is for the purpose of explanation; the overhead video V3 may or may not include an image of the camera 2 itself.
The view frustum 40 visually presents the shooting range of the camera 2 within the overhead video V3, and has the shape of a quadrangular pyramid that spreads in the direction of the shooting optical axis from the position of the camera 2 within the overhead video V3, which serves as the frustum origin 46. For example, it is a quadrangular pyramid extending from the frustum origin 46 to the frustum far end surface 45.
The shape is a quadrangular pyramid because the image sensor of the camera 2 is quadrangular.
The spread of the quadrangular pyramid changes with the angle of view of the camera 2 at that point in time. The range within the quadrangular pyramid indicated by the view frustum 40 is therefore the shooting range of the camera 2.
In practice, the view frustum 40 may be represented, for example, as a quadrangular pyramid rendered in a certain semi-transparent color.
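As a minimal sketch of this geometry, the following assumes that the camera attitude from the metadata MT has already been converted to a rotation matrix and that the angle of view is given as a horizontal value; the function name and parameters are illustrative, not part of the embodiment.

import numpy as np

def frustum_far_corners(cam_pos, cam_rot, h_fov_deg, aspect, far_dist):
    # cam_pos: (3,) frustum origin 46; cam_rot: (3,3) rotation from the camera attitude
    # h_fov_deg: horizontal angle of view; aspect: sensor width/height; far_dist: drawing distance
    half_w = np.tan(np.radians(h_fov_deg) / 2.0)   # half-width per unit depth
    half_h = half_w / aspect                        # half-height per unit depth
    corners_cam = np.array([[sx * half_w, sy * half_h, 1.0]
                            for sx in (-1, 1) for sy in (-1, 1)])  # optical axis = +z
    # rotate into the CG-space frame and push out to the far plane
    return np.asarray(cam_pos) + far_dist * corners_cam @ np.asarray(cam_rot).T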
Inside the quadrangular pyramid, the view frustum 40 displays the focus plane 41 and the depth of field range 42 at that point in time. As the depth of field range 42, for example, the range from the depth near end surface 43 to the depth far end surface 44 is expressed in a semi-transparent color different from the rest.
The focus plane 41 is also expressed in a semi-transparent color different from the others.
The focus plane 41 indicates the depth position at which the camera 2 is focused at that point in time. That is, by displaying the focus plane 41, it can be confirmed that a subject at the same depth as the focus plane 41 (the distance in the depth direction as seen from the camera 2) is in focus.
The depth of field range 42 also makes it possible to confirm the range in the depth direction in which the subject is not blurred.
The in-focus depth and the depth of field vary with the focus operation and aperture operation of the camera 2. The focus plane 41 and the depth of field range 42 in the view frustum 40 therefore change from moment to moment.
By acquiring from the camera 2 the metadata MT including information such as the focal length, the aperture value, and the angle of view, the AR system 5 can set the spread of the quadrangular pyramid of the view frustum 40, the display position of the focus plane 41, the display position of the depth of field range 42, and the like. Furthermore, because the metadata MT includes the attitude information of the camera 2, the AR system 5 can set the direction of the view frustum 40 from the camera position (frustum origin 46) within the overhead video V3.
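The description does not fix a particular formula for placing the depth near end surface 43 and depth far end surface 44; as one possible sketch, the standard thin-lens depth-of-field limits could be computed from the metadata MT as follows (the circle-of-confusion value is an assumed constant).

def depth_of_field_limits(focal_len_mm, f_number, focus_dist_mm, coc_mm=0.03):
    # hyperfocal distance for the given focal length, aperture value and circle of confusion
    hyperfocal = focal_len_mm ** 2 / (f_number * coc_mm) + focal_len_mm
    near = focus_dist_mm * (hyperfocal - focal_len_mm) / (hyperfocal + focus_dist_mm - 2 * focal_len_mm)
    if focus_dist_mm >= hyperfocal:
        far = float('inf')               # everything beyond the near limit is acceptably sharp
    else:
        far = focus_dist_mm * (hyperfocal - focal_len_mm) / (hyperfocal - focus_dist_mm)
    return near, far                      # candidate depths for surfaces 43 and 44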
The AR system 5 then causes the video V1 captured by the camera 2 whose view frustum 40 is shown to be displayed in the overhead video V3 together with that view frustum 40.
That is, the AR system 5 generates the video of the CG space 30 to be used as the overhead video V3, composites into that video of the CG space 30 the view frustum 40 generated based on the metadata MT supplied from the camera 2, and further composites the video V1 captured by the camera 2. The video data of such a composite video is output as the overhead video V3.
Examples will now be described in which the view frustum 40 in the video of the CG space 30 and the captured video V1 are displayed simultaneously on one screen.
First, an example will be described in which the AR system 5 generates video data of an overhead video V3 in which the captured video V1 is displayed within the view frustum 40.
In other words, this is an example of generating video data in which the captured video V1 is arranged within the range of the view frustum 40; that is, video data in which the captured video V1 is displayed in a state of being arranged within the range of the view frustum 40.
FIG. 9 is an example in which the captured video V1 is displayed on the focus plane 41 within the view frustum 40. This makes it possible to view the video being captured at the focus position. The example of FIG. 9 is also one example of displaying the captured video V1 within the depth of field range 42.
FIG. 10 is an example in which the captured video V1 is displayed within the depth of field range 42 of the view frustum 40 but not on the focus plane 41. In the illustrated example, the captured video V1 is displayed on the depth far end surface 44.
Besides this, examples are also conceivable in which the captured video V1 is displayed on the depth near end surface 43, or at an intermediate depth position within the depth of field range 42.
FIG. 11 is an example in which the captured video V1 is displayed within the view frustum 40 at a position closer to the frustum origin 46 than the depth near end surface 43 of the depth of field range 42 (a surface 47 near the frustum origin). When displaying within the view frustum 40, the size of the captured video V1 becomes smaller the closer it is to the frustum origin 46, but displaying it on the surface 47 near the frustum origin in this way makes the focus plane 41, the depth of field range 42, and the like easier to see.
FIG. 12 is an example in which the captured video V1 is displayed within the view frustum 40 on the far side of the depth far end surface 44 of the depth of field range 42. Here, "far" means far as seen from the camera 2 (frustum origin 46).
In the illustrated example, the captured video V1 is displayed on the frustum far end surface 45, which is a position on the far side.
When the captured video V1 is displayed on the far side of the depth of field range 42 within the view frustum 40 in this way, the area of the captured video V1 can be made large. This is therefore suitable, for example, when it is desired to check the positions of the focus plane 41 and the depth of field range 42 while closely checking the content of the captured video V1.
The distance over which the view frustum 40 is drawn may be finite or infinite. As one example, the view frustum 40 may be drawn over a finite distance, such as the drawing distance d1 in FIG. 12. For example, the drawing distance d1 may be set to twice the distance from the frustum origin 46 to the focus plane 41.
In this way, the frustum far end surface 45 is fixed, so that the captured video V1 can be displayed in the widest area within the view frustum 40, as shown in FIG. 12.
On the other hand, the view frustum 40 may be drawn to infinity, as shown in FIG. 13, without setting a particular drawing distance. In that case, the frustum far end surface 45 is not always fixed, and the captured video V1 may be displayed at an indefinite position on the far side of the depth of field range 42.
Even when the view frustum 40 is drawn to infinity, its far side may be drawn up to the portion where it hits a wall or the like represented in the CG. In that case, the far end of the drawn range may be treated as the frustum far end surface 45.
FIGS. 14A and 14B show that, when the view frustum 40 is drawn up to the position of a wall W, the position at which it collides with the wall W becomes the frustum far end surface 45. In other words, the frustum far end surface 45 changes depending on the positional relationship with objects in the CG.
When the view frustum 40 is drawn to infinity in this way, it is conceivable that the far end of the range that can be drawn within the overhead video V3 is treated as the frustum far end surface 45 and the captured video V1 is displayed on that frustum far end surface 45.
Even when the view frustum 40 is drawn over a finite distance as in FIG. 12, it may hit the wall W before the drawing distance d1. In that case, the position of the collision with the wall W may be treated as the frustum far end surface 45.
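As a sketch of how the drawing distance of the frustum far end surface 45 could be decided under these rules, the following assumes the CG structures are approximated by axis-aligned boxes and that the finite drawing distance is twice the focus distance; the function names and the box representation are assumptions for illustration.

import numpy as np

def frustum_far_distance(origin, axis, focus_dist, boxes, infinite=False):
    # origin: frustum origin 46, axis: unit optical-axis direction, boxes: [(min_xyz, max_xyz), ...]
    d_limit = np.inf if infinite else 2.0 * focus_dist          # drawing distance d1 when finite
    hit = min((_ray_box_distance(origin, axis, b) for b in boxes), default=np.inf)
    return min(d_limit, hit)                                     # cut at the first collision (wall W, etc.)

def _ray_box_distance(origin, direction, box):
    # slab test: distance along the ray to an axis-aligned box, or inf if the ray misses it
    lo = np.asarray(box[0], float) - np.asarray(origin, float)
    hi = np.asarray(box[1], float) - np.asarray(origin, float)
    with np.errstate(divide='ignore', invalid='ignore'):
        t1, t2 = lo / np.asarray(direction, float), hi / np.asarray(direction, float)
    t_near = np.nanmax(np.minimum(t1, t2))
    t_far = np.nanmin(np.maximum(t1, t2))
    return t_near if (t_far >= max(t_near, 0.0) and t_near >= 0.0) else np.inf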
The examples so far display the captured video V1 within the view frustum 40, but the captured video V1 may also be displayed at a position outside the view frustum 40 on the same screen as the overhead video V3.
FIG. 15 collects four examples (captured videos V1w, V1x, V1y, and V1z) of display positions outside the view frustum 40. In particular, these four examples display the captured video V1 in the vicinity of the view frustum 40.
The captured video V1 may be displayed in the vicinity of the frustum far end surface 45, as with the captured video V1w.
The captured video V1 may also be displayed farther away than the frustum far end surface 45, as with the captured video V1x. When the view frustum 40 is drawn over a finite distance, this means a position beyond the drawing distance d1 (see FIG. 12).
The captured video V1 may also be displayed in the vicinity of the focus plane 41 (or of the depth of field range 42), as with the captured video V1y in FIG. 15. In this case, the focus plane 41 or the depth of field range 42, which are the parts of the view frustum 40 that a viewer tends to pay attention to, and the captured video V1 can easily be viewed together.
The captured video V1 may also be displayed in the vicinity of the camera 2 (or of the frustum origin 46), as with the captured video V1z. In this case, the relationship between the camera 2 and the video V1 captured by that camera 2 becomes easy to understand.
It is desirable for the viewer to be able to easily grasp the correspondence between the view frustum 40 of a camera 2 (or the camera 2 itself) and the video V1 captured by that camera 2. Displaying the captured video V1 in the vicinity of the view frustum 40 makes this relationship easy to grasp.
Particularly in sports video production and the like, it is assumed that the view frustums 40 of a plurality of cameras 2 are displayed within the overhead video V3 as shown in FIG. 16. In such a case, if the relationship between each view frustum 40 and each captured video V1 is not clear, the viewer may become confused. It is therefore advisable to display the captured video V1 of a given camera 2 in the vicinity of the view frustum 40 of that camera 2.
However, depending on structures and the like in the overhead video V3, the direction or angle of a view frustum 40, or the positional relationship between view frustums 40, there are cases where the captured video V1 cannot be displayed in the vicinity of its view frustum 40, or where the correspondence does not become clear.
In such cases, for example, the color of the frame of the captured video V1 may be matched with the semi-transparent color or the outline color of the corresponding view frustum 40 to indicate the correspondence.
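One way to keep that pairing consistent, sketched below with an assumed per-camera color table, is to derive both the frustum tint and the border of the captured video V1 from a single color entry.

# hypothetical per-camera colors (RGBA); alpha < 1.0 keeps the frustum semi-transparent
CAMERA_COLORS = {
    "camera_2a": (1.0, 0.3, 0.3, 0.35),
    "camera_2b": (0.3, 1.0, 0.3, 0.35),
    "camera_2c": (0.3, 0.5, 1.0, 0.35),
}

def style_for(camera_id):
    # returns (frustum_fill_rgba, image_border_rgba) sharing one hue
    r, g, b, a = CAMERA_COLORS[camera_id]
    return (r, g, b, a), (r, g, b, 1.0)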
In the example of FIG. 16, view frustums 40a, 40b, and 40c corresponding to three cameras 2 are displayed within the overhead video V3. Captured videos V1a, V1b, and V1c corresponding to these view frustums 40a, 40b, and 40c are also displayed.
The captured video V1a is displayed on the frustum far end surface 45 of the view frustum 40a. The captured video V1b is displayed in the vicinity of the frustum origin 46 of the view frustum 40b (in the vicinity of the camera position).
The captured video V1c is displayed in a corner of the screen; of the four corners of the overhead video V3, it is displayed in the upper left corner, which is the one closest to the view frustum 40c.
In the case of a mobile camera 2M, for example, the view frustum 40 varies more strongly than the view frustum 40 of a fixed camera 2. The captured video V1 of such a mobile camera may therefore be displayed at a fixed position such as a corner of the screen.
FIG. 16 above is an example of an overhead video V3 in which the shooting target space 8 is viewed obliquely from above, but the AR system 5 may also display a planar overhead video V3 viewed from directly above, as in FIG. 17.
In this example, the overhead video V3 displays cameras 2a, 2b, 2c, and 2d, their corresponding view frustums 40a, 40b, 40c, and 40d, and the respective captured videos V1a, V1b, V1c, and V1d.
The captured videos V1a, V1b, V1c, and V1d are displayed near the corresponding cameras 2a, 2b, 2c, and 2d, respectively.
The AR system 5 may also allow the viewpoint direction of the overhead video V3 shown in FIG. 16 or FIG. 17 to be changed continuously by the viewer operating the GUI device 11 or the like.
FIG. 18 is another example of the overhead video V3. In an overhead video V3 representing a motor racing circuit in CG, view frustums 40a and 40b are displayed, and the videos V1a and V1b captured by the cameras 2 of the view frustums 40a and 40b are displayed in a corner of the screen or near the camera positions.
For example, when shooting a race, it is difficult to tell from the captured video V1 alone which part of the course is being shot, but when the overhead video V3, the view frustum 40, and the captured video V1 are displayed simultaneously, the relationship becomes easy to understand.
Particularly when a plurality of cameras 2 are arranged along the course, displaying the respective view frustums 40 and captured videos V1 as in the illustrated example makes the shooting situation easy to understand.
As illustrated in FIGS. 9 to 18 above, the AR system 5 displays the view frustum 40 of the camera 2 in the CG space 30 and generates the video data of the overhead video V3 so that the video V1 captured by the camera 2 is displayed at the same time. When this overhead video V3 is displayed on the camera 2 or the GUI device 11, viewers such as the cameraman and the director can easily grasp the shooting situation.
This will be described more specifically.
Displaying the view frustum 40 and the captured video V1 in the CG space 30 makes the correspondence between the video V1 captured by the camera 2 and the spatial position clear, so the viewer can easily grasp the correspondence between the captured video V1 of the camera 2 and positions in the shooting target space 8.
The viewer can also easily grasp what each camera 2 is showing and where it is focused.
In particular, a viewer with little experience in shooting with a camera 2 or in video production may find it difficult to relate the position of a camera 2 to its captured video V1, and may go back and forth between the screen of the captured video V1 and the screen of the overhead video V3. Displaying the captured video V1 within the CG space 30 as a single screen eliminates such switching between screens.
Furthermore, from the positions of the cameras 2 and the captured video V1, it is possible to predict which camera 2 will show the intended subject next.
For example, if a player runs to the right in the video V1a captured by the camera 2a, it can be predicted that the player will next appear in the camera 2b. Such a prediction is difficult from the captured video V1a alone.
From the viewpoint of a director or the like using the GUI device 11, for example, viewing an overhead video V3 that displays the view frustums 40 and captured videos V1 of a plurality of cameras 2 makes it extremely easy to grasp the positional relationship of the cameras, the relationship of their shooting directions, the subjects being shot, and so on. This makes it possible to give appropriate instructions.
For the director, it is sufficient to understand the rough content of each captured video V1, so there is no problem even if each captured video V1 is relatively small within the overhead video V3. Conversely, because the view frustum 40 of each camera 2 is displayed in the CG space 30, the director can check and simulate compositions, standing positions, and camera positions while comprehensively considering the situation of each camera 2.
For the cameraman, the depth of field range 42 of the view frustum 40 can be referred to when performing the focus operation.
Also, by checking the view frustum 40 of the camera 2 that he or she is operating, the cameraman can easily confirm the location and direction being shot within the overhead video V3 of the shooting target space 8 represented in CG.
The cameraman can also look at the view frustums 40 and captured videos V1 of other cameras 2 and reflect them in the operation of his or her own camera. Because the relationship with what the other cameras 2 are shooting, their subject directions, and so on can also be grasped, shooting can be performed that is preferable in relation to the other cameras 2; for example, checking the position and angle of view being shot by another camera 2 and shooting from a different position or angle of view with one's own camera 2.
From the viewpoint of operating staff who remotely operate a camera 2, for example performing the focus operation of a mobile camera 2 from a remote location, this is convenient when the on-site situation is difficult to see because of the remote operation. That is, the overhead video V3 increases the amount of available information (captured videos V1, positions, and the like), making it easier to grasp the on-site situation.
FIGS. 9 to 18 showed various display positions of the captured video V1 as examples of displaying the captured video V1 together with the view frustum 40, and it is preferable that this display position be changed appropriately according to the user's intention or an automatic determination.
In the following, processing examples of the AR system 5, including changing the display settings of the captured video V1, will be described.
FIG. 19 is a processing example of the AR system 5 that generates the video data of the overhead video V3. The video data of the overhead video V3 in this case is video data in which the view frustum 40 and the captured video V1 are composited into the CG space 30 corresponding to the shooting target space 8, that is, video data for performing display as in FIGS. 9 to 18.
The AR system 5 performs the processing of steps S101 to S107 in FIG. 19, for example, for every frame of the video data of the overhead video V3. This processing can be considered as control processing of the CPU 71 (the video processing unit 71a and the video generation control unit 71b) of the information processing device 70 of FIG. 7 serving as the AR system 5.
In step S101, the AR system 5 sets up the CG space 30. For example, it sets the viewpoint position for the CG space 30 corresponding to the shooting target space 8 and renders the video of the CG space 30 from that viewpoint position. If there is no change from the previous frame in the viewpoint position for the CG space 30 or in the video content, the CG space video of the previous frame may be used for the current frame as well.
In step S102, the AR system 5 inputs the captured video V1 and the metadata MT from the camera 2. That is, it acquires the captured video V1 of the current frame and the attitude information, focal length, angle of view, aperture value, and the like of the camera 2 at that frame timing.
When one AR system 5 displays the view frustums 40 and captured videos V1 of a plurality of cameras 2 as in FIG. 4, the AR system 5 inputs the captured video V1 and metadata MT of each camera 2.
When, as in FIG. 3, there are a plurality of camera systems 1 in which a camera 2 and an AR system 5 correspond one to one, and each generates an overhead video V3 including a plurality of view frustums 40 and captured videos V1, these AR systems 5 may cooperate so as to share the metadata MT and captured videos V1 of their respective cameras 2.
In step S103, the AR system 5 generates the view frustum 40 for the current frame. From the metadata MT acquired in step S102, the AR system 5 sets the direction of the view frustum 40 within the CG space 30 according to the attitude of the camera 2, the quadrangular pyramid shape according to the angle of view, the positions of the focus plane 41 and the depth of field range 42 based on the focal length and aperture value, and the like, and generates the image of the view frustum 40 according to these settings.
When displaying the view frustums 40 of a plurality of cameras 2, the AR system 5 generates the image of the view frustum 40 according to the metadata MT of each camera 2.
In step S104, the AR system 5 sets the display position of the captured video V1 acquired in step S102. Various examples of this processing will be described later.
In step S105, the AR system 5 composites the view frustums 40 and captured videos V1 corresponding to one or more cameras 2 into the CG space 30 that forms the overhead video V3, and generates one frame of video data of the overhead video V3.
In step S106, the AR system 5 then outputs the one frame of video data of the overhead video V3.
The above processing is repeated until the display of the view frustum 40 and the captured video V1 ends. As a result, an overhead video V3 such as those of FIGS. 9 to 18 is displayed on the GUI device 11 or the camera 2.
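The per-frame flow of FIG. 19 can be sketched roughly as follows; the objects and method names (ar_system, cam, and so on) are placeholders for illustration, not an API defined by the embodiment.

def generate_overhead_frame(ar_system, cameras):
    cg = ar_system.setup_cg_space()                        # S101: viewpoint and CG space 30
    for cam in cameras:
        v1, mt = cam.get_frame_and_metadata()              # S102: captured video V1 and metadata MT
        frustum = ar_system.build_frustum(mt)              # S103: view frustum 40 from attitude, angle of view, focus, aperture
        pos = ar_system.set_display_position(v1, frustum)  # S104: display position of V1
        cg.composite(frustum, v1, pos)                     # S105: composite into the CG space 30
    return cg.render()                                     # S106: one frame of the overhead video V3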
Examples of the display position setting of the captured video V1 in step S104 will now be described.
FIGS. 20, 21, and 22 are examples in which the display position of the captured video V1 is set in a fixed manner, while FIGS. 23 and 24 are examples in which the display position of the captured video V1 is set variably.
FIGS. 20 to 24 below are examples of the display position setting of the captured video V1 corresponding to one camera 2. When the view frustums 40 and captured videos V1 of a plurality of cameras 2 are displayed, processing such as that of FIGS. 20 to 24 may be performed for each camera 2. The same display position setting processing may be performed for every camera 2, or different display position setting processing may be performed for each.
First, FIG. 20 shows display position setting processing for the case where the captured video V1 is displayed on the focus plane 41 as in FIG. 9.
In step S120, the AR system 5 determines the size and shape of the focus plane 41 of the view frustum 40 generated in step S103 of FIG. 19 for the current frame. In step S121 of FIG. 20, the AR system 5 sets the size and shape of the captured video V1 so as to match the focus plane 41.
The shape of the captured video V1 composited within a view frustum 40 may be the cross-sectional shape of that view frustum 40. For example, the shape of the focus plane 41 differs depending on the viewpoint of the overhead video V3 and on the position and direction of the displayed view frustum 40, but it may be taken as the shape of a cross section cut at the focus plane 41 of the view frustum 40 perpendicular to the optical axis of the camera 2 in that frame.
Accordingly, when the captured video V1 is displayed within the view frustum 40, the captured video V1 is deformed into the cross-sectional shape perpendicular to the optical axis and then composited.
However, it does not necessarily have to be displayed as a cross section perpendicular to the optical axis; it may be displayed within the view frustum 40 as a cross section that is not perpendicular to the optical axis of the camera 2.
After the above processing, when the process proceeds to step S105 of FIG. 19, the size and shape of the captured video V1 are adjusted, and an overhead video V3 is generated in which the captured video V1 is composited onto the focus plane 41 of the view frustum 40.
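The size of that cross section follows directly from the angle of view and the depth; a small sketch (the parameter names and values are illustrative) is:

import numpy as np

def cross_section_size(h_fov_deg, aspect, depth):
    # width and height of the frustum cross section at a given depth along the optical axis,
    # e.g. the depth of the focus plane 41, used to scale the captured video V1
    width = 2.0 * depth * np.tan(np.radians(h_fov_deg) / 2.0)
    return width, width / aspect

# e.g. a camera with a 40-degree horizontal angle of view and a 16:9 sensor, focused at 25 m
w, h = cross_section_size(h_fov_deg=40.0, aspect=16 / 9, depth=25.0)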
FIG. 21 shows display position setting processing for the case where the captured video V1 is displayed on the depth far end surface 44 as in FIG. 10.
In step S130, the AR system 5 determines the size and shape of the depth far end surface 44 of the view frustum 40 generated in step S103 for the current frame.
In step S131, the AR system 5 sets the size and shape of the captured video V1 so as to match the size of the depth far end surface 44.
As a result, when the process proceeds to step S105 of FIG. 19, the size and shape of the captured video V1 are adjusted, and an overhead video V3 is generated in which the captured video V1 is composited onto the depth far end surface 44 of the view frustum 40.
FIG. 22 shows display position setting processing for the case where the captured video V1 is displayed in the vicinity of the frustum origin 46 as in FIG. 11.
In step S140, the AR system 5 sets the display position of the captured video V1 within the view frustum 40 generated in step S103 for the current frame; that is, it sets a position on the frustum origin 46 side of the depth of field range 42. The position in this case may be set as a fixed distance from the frustum origin 46, or may be set, for example, as a position at which at least a minimum required area is obtained for the cross section of the quadrangular pyramid shape corresponding to the angle of view.
In step S141, the AR system 5 determines the cross section at the set display position, that is, the size and shape of the display area.
In step S142, the AR system 5 sets the size and shape of the captured video V1 so as to match the cross section at the determined display position.
As a result, when the process proceeds to step S105, the size and shape of the captured video V1 are adjusted, and an overhead video V3 is generated in which the captured video V1 is composited at a position near the frustum origin 46 of the view frustum 40.
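If the "minimum required area" criterion is interpreted as a minimum cross-section width, one conceivable way to pick the depth of the surface 47 near the frustum origin is the following sketch (the minimum width and the cap at the depth near end surface 43 are assumptions).

import numpy as np

def near_origin_depth(h_fov_deg, min_width, depth_near_end):
    # smallest depth at which the frustum cross section reaches min_width,
    # capped so the surface stays on the frustum-origin side of the depth near end surface 43
    d = min_width / (2.0 * np.tan(np.radians(h_fov_deg) / 2.0))
    return min(d, depth_near_end)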
Next, FIG. 23 shows display position setting processing in which the display position of the captured video V1 is changed according to an operation by a user such as a cameraman or a director.
In step S150, the AR system 5 checks whether or not a display position change operation has been performed for the captured video V1. For example, the GUI device 11 and the camera 2 are configured so that the director, the cameraman, or the like can perform a display position change operation through a predetermined operation. The AR system 5 checks the received control signals CS for the operation information of such a display position change operation.
For example, an operation may be enabled that switches the display position setting within the view frustum 40, among the "focus plane 41", the "depth far end surface 44", the "surface 47 near the frustum origin", and the "frustum far end surface 45". An operation interface may be provided in which the surfaces are switched by a toggle operation, or an operation interface may be provided in which each surface can be designated directly.
The switching of the display position setting may also include positions outside the view frustum 40, not only positions within it.
For example, an operation may be enabled that switches among the "focus plane 41", the "frustum far end surface 45", a "screen corner", and "near the camera".
Furthermore, the switching of the display position setting may be performed only among positions outside the view frustum 40; for example, an operation may be enabled that switches among "near the focus plane 41", "near the frustum far end surface 45", a "screen corner", and "near the camera 2".
In FIGS. 9 to 18 above, various examples of display positions of the captured video V1 were given. Within the view frustum 40, the "focus plane 41", the "depth near end surface 43", the "depth far end surface 44", the "surface 47 near the frustum origin", and the "frustum far end surface 45" were given as examples. Outside the view frustum 40, a "screen corner", "near the camera", "near the focus plane 41", "farther than the frustum far end surface 45", and the like were given as examples.
Among these, the positions that the user can select by a switching operation may be made settable.
In addition, for example, the display position within the depth of field range 42 or the display position near the focus plane 41 may be made adjustable by the user.
If no display position change operation is confirmed at the time of processing the current frame, the AR system 5 proceeds to step S151, maintains the same display position setting as for the previous frame, and ends the processing of FIG. 23.
As a result, when the process proceeds to step S105 of FIG. 19, the frame of the current overhead video V3 is generated with the captured video V1 displayed at the same position as in the previous frame.
If a display position change operation is confirmed at the time of processing the current frame, the AR system 5 proceeds from step S150 to step S152 of FIG. 23 and changes the display position setting in accordance with the operation; for example, a setting that had been the focus plane 41 up to that point is switched to the frustum far end surface 45.
In step S153, the AR system 5 branches the processing depending on whether or not the changed position setting is outside the view frustum 40.
If the changed position setting is a position within the view frustum 40, the AR system 5 proceeds to step S154 and determines the size and shape of the display area as the cross section of the view frustum 40 at the set position.
Then, in step S156, the AR system 5 sets the size and shape of the captured video V1 so as to match the cross section at the determined display position.
As a result, when the process proceeds to step S105 of FIG. 19, the size of the captured video V1 is adjusted, and an overhead video V3 is generated in which the captured video V1 is composited at a position within the view frustum 40 different from that of the previous frame.
If the position setting changed in response to the operation is outside the view frustum 40, the AR system 5 proceeds from step S153 to step S155 of FIG. 23 and sets the display size and shape of the captured video V1 at the newly set position. Outside the view frustum 40, the shape of the composited captured video V1 is not limited to the cross-sectional shape of the view frustum 40; it may be, for example, a rectangle, or, if near the view frustum 40, a parallelogram corresponding to the angle of the view frustum 40. The size of the captured video V1 can also be set relatively freely, but it is desirable to set it appropriately in view of the other contents displayed on the screen.
As a result, when the process proceeds to step S105 of FIG. 19, the size and shape of the captured video V1 are adjusted, and an overhead video V3 is generated in which the captured video V1 is composited at a position outside the view frustum 40 different from that of the previous frame.
In the above processing example of FIG. 23, the display position can also be changed to positions outside the view frustum 40, but the display position change may be allowed only within the view frustum 40. In that case, steps S153 and S155 are unnecessary.
The display position change may also be allowed only outside the view frustum 40. In that case, steps S153 and S154 are unnecessary, and the process may proceed from step S152 to step S155.
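A rough sketch of this operation-driven switching (steps S150 to S152) is shown below; the position names and the field used to carry the operation information in the control signal CS are illustrative assumptions.

POSITIONS = ["focus_plane", "depth_far_end", "near_frustum_origin",
             "frustum_far_end", "screen_corner", "near_camera"]

def update_display_position(current, control_signal):
    # control_signal: dict carrying operation information from the control signal CS (assumed layout)
    op = control_signal.get("display_position_change")
    if op is None:
        return current                                           # S151: keep the previous setting
    if op == "toggle":                                           # toggle-style operation interface
        return POSITIONS[(POSITIONS.index(current) + 1) % len(POSITIONS)]
    return op if op in POSITIONS else current                    # direct designation of a position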
Next, FIG. 24 shows a processing example in which the AR system 5 automatically changes the display position of the captured video V1.
In step S160, the AR system 5 performs a display position change determination.
The display position change determination is a process of determining whether or not to change, in the current frame, the display position setting of the captured video V1 used in the previous frame.
Examples of this determination process include the following (P1), (P2), and (P3).
(P1) Determination based on the positional relationship between the view frustum 40 and objects in the overhead video V3
(P2) Determination based on the angle of the view frustum 40 within the overhead video V3
(P3) Determination based on the viewpoint position of the overhead video V3
First, consider an example of (P1).
For example, a collision between the view frustum 40 and the ground, a wall, or the like in the overhead image V3 is determined. Fig. 25 shows a state in which the frustum far end surface 45 of a finite-distance view frustum 40 has collided with the ground GR and is partially embedded in it. Fig. 26 shows a state in which the far end side of a finite- or infinite-distance view frustum 40 has collided with a structure CN, so that the portion beyond it can no longer be displayed.
Suppose, for example, that up to the previous frame the captured image V1 was displayed within the view frustum 40 on or near the frustum far end surface 45, and in the current frame the far end side of the view frustum 40 has collided with and become embedded in an object, as in Figs. 25 and 26. In such a case, displaying the captured image V1 with the same setting as before is no longer appropriate; part of the captured image V1 may be cut off, or the whole image may become invisible. The AR system therefore determines that the display position needs to be changed.
It may also be determined that the display position needs to be changed when the pyramidal shape of the view frustum 40 widens or its direction changes due to a change in the angle of view or shooting direction of the camera 2, and the display position of the captured image V1 used so far is judged to be no longer appropriate from the positional relationship between a specific part of the view frustum 40 (such as the frustum far end surface 45 or the focus plane 41) and other displayed objects.
Other view frustums 40 may also be treated as objects in the overhead image V3, and a change of display position may be determined to be necessary when the display position of the captured image V1 is judged to be inappropriate because of its positional relationship with another view frustum 40.
When the positional relationship with other view frustums 40 is considered, it may also be determined that the display position needs to be changed when, as in Fig. 17, multiple view frustums 40 overlap and the relationship between each view frustum 40 and its captured image V1 becomes difficult to understand.
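One way to realize the (P1) determination, sketched below under stated assumptions, is a coarse intersection test between the frustum geometry and the scene geometry of the overhead image V3. All names here (Frustum, p1_needs_reposition, the y-up ground plane) are illustrative placeholders rather than the embodiment's actual implementation.

```
# Minimal sketch of the (P1) collision-based determination (assumed helper names).
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class Frustum:
    corners: Sequence[Vec3]   # 8 corners of the (truncated) view frustum 40 in CG-space coordinates

def below_ground(p: Vec3, ground_height: float = 0.0) -> bool:
    # Assumption: the ground GR is the plane y = ground_height (y-up coordinates).
    return p[1] < ground_height

def intersects_structure(frustum: Frustum, structure_aabbs: List[Tuple[Vec3, Vec3]]) -> bool:
    # Very coarse test: any frustum corner inside an axis-aligned bounding box of a structure CN.
    for lo, hi in structure_aabbs:
        for p in frustum.corners:
            if all(lo[i] <= p[i] <= hi[i] for i in range(3)):
                return True
    return False

def p1_needs_reposition(frustum: Frustum, structure_aabbs: List[Tuple[Vec3, Vec3]]) -> bool:
    """Return True when the far end of the frustum is embedded in the ground or a structure."""
    far_corners = frustum.corners[4:]  # assume the last four corners form the far end surface 45
    if any(below_ground(p) for p in far_corners):
        return True
    return intersects_structure(frustum, structure_aabbs)
```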
Next, the example of (P2) takes into account the visibility of the captured image V1 when it is fitted to the cross-sectional shape of the view frustum 40.
Depending on the direction of the view frustum 40 in the overhead image V3, its cross-sectional shape may no longer be suitable as a display surface. The shape and direction of the view frustum 40 change according to the angle of view and shooting direction of the camera 2, and the angle at which the view frustum 40 is displayed in the overhead image V3 changes accordingly. In other words, the angle between the viewing direction of the overhead image V3 as a whole and the axial direction of the view frustum 40 changes. This angle is the angle between the normal direction of the display screen, as seen along the line of sight from the viewpoint set for the overhead image V3 at a given time, and the axial direction of the displayed view frustum 40. The axial direction of the view frustum 40 is the direction of the perpendicular drawn from the frustum origin 46 to the frustum far end surface 45.
For example, Fig. 27 shows captured images V1a, V1b, and V1c corresponding to view frustums 40a, 40b, and 40c. In this case, because of the angle of the view frustum 40a in the overhead image V3, the captured image V1a displayed to match its cross-sectional shape becomes a parallelogram with a large difference between its acute and obtuse angles, and its visibility is poor if left as it is. In such a case, the display position may be changed as indicated by the dashed arrow so that the image is displayed at the position of the captured image V1a'.
In this way, it is conceivable to determine that the display position needs to be changed when the skew of the captured image V1, that is, the difference between its acute and obtuse angles, becomes equal to or greater than a predetermined value.
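A possible realization of the (P2) criterion is to measure the skew of the on-screen quadrilateral onto which the captured image V1 would be mapped and compare it against a threshold. The following is a minimal sketch; the 50-degree threshold and the helper names are assumptions.

```
import math
from typing import Sequence, Tuple

Vec2 = Tuple[float, float]

def interior_angle(prev_pt: Vec2, pt: Vec2, next_pt: Vec2) -> float:
    """Interior angle (degrees) at pt of the polygon prev_pt -> pt -> next_pt."""
    ax, ay = prev_pt[0] - pt[0], prev_pt[1] - pt[1]
    bx, by = next_pt[0] - pt[0], next_pt[1] - pt[1]
    cos_a = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))

def p2_needs_reposition(quad: Sequence[Vec2], max_skew_deg: float = 50.0) -> bool:
    """quad: 4 screen-space corners of the cross section onto which V1 would be mapped."""
    angles = [interior_angle(quad[i - 1], quad[i], quad[(i + 1) % 4]) for i in range(4)]
    skew = max(angles) - min(angles)   # 0 for a rectangle, large for a heavily sheared parallelogram
    return skew >= max_skew_deg
```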
The example of (P3) is based on the same idea as (P2).
The viewpoint position of the overhead image V3 can be changed in response to operations by the director or others. For example, the viewpoint position of the overhead image V3 may be changed by an operation from the state shown in Fig. 16 to that shown in Fig. 27.
In the case of Fig. 27, the visibility of the captured image V1a is poor, as described above. That is, even if the angle of view and shooting direction of the camera 2 do not change, a change of viewpoint of the overhead image V3 changes the shapes of the rendered view frustum 40 and captured image V1, which may reduce visibility. In such a case as well, it is determined that the display position needs to be changed when, for example, the resulting difference between the acute and obtuse angles of the captured image V1 becomes equal to or greater than a predetermined value.
A change of viewpoint of the overhead image V3 may also make the captured image V1 smaller. It may be determined that the display position needs to be changed when moving the viewpoint used for rendering the overhead image V3 farther away makes the size of the captured image V1 equal to or smaller than a predetermined size.
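The size condition can be checked in a similar way, for example by comparing the projected screen area of the composited image against a minimum area. A small sketch, with an assumed pixel-area threshold:

```
def polygon_area(quad) -> float:
    """Shoelace area of the screen-space quadrilateral onto which V1 is mapped."""
    area = 0.0
    for i in range(len(quad)):
        x0, y0 = quad[i]
        x1, y1 = quad[(i + 1) % len(quad)]
        area += x0 * y1 - x1 * y0
    return abs(area) / 2.0

def p3_needs_reposition(quad, min_area_px: float = 40_000.0) -> bool:
    # Assumed threshold: roughly a 200 x 200 pixel region of the overhead image V3.
    return polygon_area(quad) <= min_area_px
```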
In step S160 of Fig. 24, the AR system 5 performs a display position change determination such as the ones described above, and in step S161 the process branches depending on whether a change is required.
If it is determined that no change is necessary, the AR system 5 proceeds to step S162, maintains the same display position setting as in the previous frame, and ends the processing of Fig. 24.
As a result, when the process proceeds to step S105 in Fig. 19, the current frame of the overhead image V3 is generated with the captured image V1 displayed at the same position as in the previous frame.
If the display position change determination finds that a change is required, the AR system 5 proceeds from step S161 to step S163 in Fig. 24 and selects the destination of the display position setting.
This destination may be decided according to the reason why the display position change determination found a change to be necessary.
For example, in the case of (P1) above, where the cause is a collision with an object in the overhead image V3, the position may be changed to one unaffected by the collision, such as the near-origin surface 47 of the frustum or a corner of the screen.
In the cases of (P2) and (P3) above, where the visibility of the captured image V1 deteriorates, a position outside the view frustum 40 that allows a geometrically well-visible display, such as a corner of the screen or the vicinity of the focus plane 41, may be selected.
The type information of the camera 2 can also be used to set the destination of the captured image V1. For example, if the camera requiring the change is a mobile camera 2M, the destination can be a corner of the screen. The captured image V1 of the mobile camera 2M may, for example, be displayed within the view frustum 40 while the camera is not moving and moved to a corner of the screen while it is moving, because while the camera is moving the view frustum 40 also moves substantially within the overhead image V3 and the visibility of the captured image V1 inside it decreases.
In step S164, the AR system 5 branches the process depending on whether the selected destination is outside the view frustum 40.
If the destination is a position within the view frustum 40, the AR system 5 proceeds to step S165 and determines the size and shape of the display area as the cross section of the view frustum 40 at the set position. Then, in step S167, the AR system 5 sets the size and shape of the captured image V1 so that it matches the cross section at the determined display position.
As a result, when the process proceeds to step S105 in Fig. 19, the size of the captured image V1 is adjusted, and an overhead image V3 is generated in which the captured image V1 is composited at a position within the view frustum 40 that differs from the previous frame.
If the position selected as the destination is outside the view frustum 40, the AR system 5 proceeds to step S166 in Fig. 24 and sets the display size and shape of the captured image V1 at the newly set position (similarly to step S155 in Fig. 23).
As a result, when the process proceeds to step S105 in Fig. 19, the size and shape of the captured image V1 are adjusted, and an overhead image V3 is generated in which the captured image V1 is composited at a position outside the view frustum 40 that differs from the previous frame.
In the processing example of Fig. 24 above, an example in which the destination of the display position change is limited to positions within the view frustum 40 is also conceivable; in that case, steps S164 and S166 are unnecessary.
The destination of the display position change may also be limited to positions outside the view frustum 40. In that case, steps S164 and S165 are unnecessary, and the process may proceed from step S163 to step S166.
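Taken together, the automatic repositioning of Fig. 24 amounts to a per-frame decision that maps the outcomes of the (P1) to (P3) checks, plus the camera type, to a destination. The sketch below only illustrates that mapping; the destination names and their assignment to the individual causes are assumptions based on the examples above.

```
from enum import Enum, auto

class Destination(Enum):
    KEEP = auto()            # step S162: keep the previous display position setting
    NEAR_ORIGIN = auto()     # near-origin surface 47 (inside the view frustum 40)
    SCREEN_CORNER = auto()   # a corner of the screen (outside the view frustum 40)

def decide_display_position(collided: bool, poor_visibility: bool, camera_is_moving: bool) -> Destination:
    """One pass of Fig. 24: steps S160/S161 (determination) and S163 (destination selection)."""
    if collided:               # (P1): frustum far end embedded in the ground or a structure
        return Destination.NEAR_ORIGIN
    if poor_visibility:        # (P2)/(P3): skewed or too-small on-screen shape of V1
        return Destination.SCREEN_CORNER
    if camera_is_moving:       # mobile camera 2M: park V1 in a screen corner while moving
        return Destination.SCREEN_CORNER
    return Destination.KEEP    # step S162

# Example: a collision forces the image back toward the frustum origin.
assert decide_display_position(True, False, False) is Destination.NEAR_ORIGIN
```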
Examples of displaying the captured image V1 together with the view frustum 40 have been described above with reference to Figs. 8 to 24. The view frustum 40 and the captured image V1 may be displayed together at all times, or only temporarily.
For example, it is also conceivable to normally display the view frustum 40 but not the captured image V1. In that case, when the cameraman or director performs an operation to select a view frustum 40, the captured image V1 corresponding to the selected view frustum 40 may be displayed.
Alternatively, the cameraman or director may be able to switch between a mode in which only the view frustum 40 is displayed and a mode in which the view frustum 40 and the captured image V1 are displayed simultaneously.
<4. Example screens for the cameraman and the director>
In the system of this embodiment, an overhead image V3-1 is displayed for the director on the GUI device 11, and an overhead image V3-2 is displayed for the cameraman on a display unit such as the viewfinder of the camera 2.
In this case, the overhead images V3-1 and V3-2 are both images showing the view frustums 40 in the CG space 30 that simulates the shooting target space 8, but they are rendered in different display modes. This makes it possible to provide information suited to each role, such as director or cameraman.
[4-1: Highlighted display]
Various examples are conceivable in which the overhead images V3-1 and V3-2 are rendered in different modes.
First, referring to Figs. 28 to 32, an example will be described in which the AR system 5 displays, in the director's overhead image V3-1, the view frustum 40 of a specific camera whose captured image V1 contains the subject of interest in a display mode different from that of the other view frustums 40. In particular, an example in which a certain view frustum 40 is highlighted will be described. In the cameraman's overhead image V3-2, on the other hand, no such highlighting is performed.
Fig. 28 shows an example in which the overhead image V3-1 is displayed as the device display image 51 on the GUI device 11.
This overhead image V3-1 includes the CG space 30 overlooking the shooting target space 8, for example a stadium, and displays the view frustums 40 of the plurality of cameras 2 shooting in the stadium. View frustums 40a, 40b, and 40c for three cameras 2 are displayed.
In this example, the view frustum 40a is displayed in a mode different from that of the other view frustums 40b and 40c. In this particular case, the view frustum 40a is highlighted so that it stands out more than the other view frustums 40b and 40c.
As described above, the shape and direction of the view frustum 40 and the display positions of the focus plane 41, the depth of field range 42, and so on are determined by the angle of view, shooting direction, focal length, depth of field, and other conditions of the camera 2 at that time, so differences in these are not included in the differences in display mode referred to here. A difference in the display mode of the view frustum 40 does not mean a difference determined by the state of the camera 2, such as its angle of view or shooting direction, but a difference in how the view frustum 40 itself is drawn: for example, differences in color, luminance, density, type or thickness of the outline, rendering of the faces of the pyramid, normal versus blinking display, or blinking period.
In the example of Fig. 28, when the view frustums 40 are normally displayed as semi-transparent white, the view frustum 40a is highlighted, for example, as semi-transparent red. The view frustum 40a is thereby emphasized for the director and others.
One condition for such highlighting is that the subject of interest is currently being shot.
Various settings are possible for the subject of interest; in a sports broadcast, examples include a specific player, a player involved with a game object such as the ball, or the game object itself.
For example, the AR system 5 with the configuration of Fig. 4 determines, by image recognition processing on the captured image V1 of each camera 2, whether a subject of interest such as a specific player is being shot.
For example, it determines whether the captured image V1 of a camera 2 shows the subject of interest as in Fig. 29. The AR system 5 then generates the overhead image V3-1 so that the view frustum 40 of the camera 2 showing the subject of interest is displayed in a highlighted mode.
However, if highlighting is performed simply on the condition that the subject of interest is shown, many view frustums 40 may end up highlighted, which reduces the value of the highlighting. A processing example is therefore described below in which the camera 2 whose captured image V1 is most suitable as an image of the subject of interest is selected.
The following processing examples in Figs. 30, 31, 32, 34, 36, 38, 41, 43, 45, 48, and 52 are easiest to understand for a system in which the AR system 5 handles all the cameras 2 in an integrated manner, as in Fig. 4. They can, however, also be implemented with the configuration of Fig. 3 by providing a plurality of camera systems 1 and having the AR systems 5 of the camera systems 1 cooperate.
Fig. 30 shows a processing example of the AR system 5 that generates the video data of the overhead images V3-1 and V3-2. The video data of the overhead images V3-1 and V3-2 here means video data in which the view frustums 40 are composited into the CG space 30 corresponding to the shooting target space 8.
As described above, the overhead images V3-1 and V3-2 may further have the captured images V1 composited into them.
The AR system 5 performs the processing of steps S101 to S107 in Fig. 30 for each frame of the video data of the overhead images V3-1 and V3-2, for example. This processing can be regarded as control processing by the CPU 71 (video processing unit 71a) of the information processing device 70 of Fig. 7 serving as the AR system 5.
In step S101, the AR system 5 sets up the CG space 30. For example, it sets the viewpoint position for the CG space 30 corresponding to the shooting target space 8 and renders the image of the CG space 30 from that viewpoint position. If there is no change in the viewpoint position or image content of the CG space 30 from the previous frame, the CG space image of the previous frame may be used for the current frame as well.
In step S102, the AR system 5 inputs the captured image V1 and the metadata MT from the camera 2. That is, it acquires the captured image V1 of the current frame and the attitude information, focal length, angle of view, aperture value, and so on of the camera 2 at that frame timing.
When the view frustums 40 and captured images V1 are displayed for a plurality of cameras 2, the AR system 5 inputs the captured image V1 and metadata MT of each camera 2.
In step S201, the AR system 5 generates the cameraman's view frustums 40 for the current frame. The cameraman's view frustum 40 is the view frustum 40 to be composited into the overhead image V3-2 that is transmitted to and displayed on the camera 2.
In the case of the AR system 5 with the configuration of Fig. 4, a cameraman's view frustum 40 is generated separately for each camera 2.
In the case of the AR system 5 with the configuration of Fig. 3, the AR system 5 in a camera system 1 generates the view frustum 40 to be displayed on the camera 2 of that camera system 1.
From the metadata MT acquired in step S102, the AR system 5 sets the direction of the view frustum 40 in the CG space 30 according to the attitude of the camera 2, the pyramid shape according to the angle of view, the positions of the focus plane 41 and the depth of field range 42 based on the focal length and aperture value, and so on, and generates the image of the view frustum 40 according to these settings.
When the view frustums 40 are displayed for a plurality of cameras 2, the AR system 5 generates the image of each view frustum 40 according to the metadata MT of the corresponding camera 2.
In step S202, the AR system 5 generates the director's view frustums 40 for the current frame. The director's view frustum 40 is the view frustum 40 to be composited into the overhead image V3-1 that is transmitted to and displayed on the GUI device 11.
Basically, as in step S201, the image of each view frustum 40 is generated based on the attitude (shooting direction), angle of view, focal length, and aperture value of each camera 2.
However, the cameraman's view frustums 40 generated in step S201 and the director's view frustums 40 generated in step S202 may be rendered in different display modes; specific examples are described later.
In step S203, the AR system 5 composites the view frustums 40 generated for the cameraman into the CG space 30 serving as the overhead image V3-2, and generates one frame of video data of the overhead image V3-2. The captured image V1 may also be composited in correspondence with each view frustum 40.
In step S204, the AR system 5 composites the view frustums 40 generated for the director into the CG space 30 serving as the overhead image V3-1, and generates one frame of video data of the overhead image V3-1. The captured image V1 may also be composited in correspondence with each view frustum 40.
Then, in step S205, the AR system 5 outputs the one frame of video data of each of the overhead images V3-1 and V3-2.
The above processing is repeated until the display of the view frustums 40 ends.
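Structurally, the per-frame flow of Fig. 30 builds two renderings of the same CG space from the same camera metadata, differing only in how each view frustum is styled. The following sketch models just that structure; all class and function names are placeholders, not the actual implementation.

```
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class CameraMeta:
    camera_id: str
    attitude: tuple          # shooting direction (e.g. yaw, pitch, roll)
    angle_of_view: float
    focal_length: float
    aperture: float

@dataclass
class FrustumDrawable:
    camera_id: str
    style: str               # e.g. "white_translucent" or "red_translucent"
    meta: CameraMeta

def build_frustum(meta: CameraMeta, style: str) -> FrustumDrawable:
    # Steps S201/S202: the geometry (direction, pyramid shape, focus plane, depth of field)
    # would be derived from meta; only the style distinction is modeled here.
    return FrustumDrawable(meta.camera_id, style, meta)

def generate_overhead_frame(metas: List[CameraMeta],
                            director_style: Callable[[CameraMeta], str],
                            cameraman_style: Callable[[CameraMeta], str]) -> Dict[str, List[FrustumDrawable]]:
    """One frame of Fig. 30: build the frustum lists for V3-2 (cameramen) and V3-1 (director)."""
    v3_2 = [build_frustum(m, cameraman_style(m)) for m in metas]   # steps S201/S203
    v3_1 = [build_frustum(m, director_style(m)) for m in metas]    # steps S202/S204
    return {"V3-1": v3_1, "V3-2": v3_2}                            # step S205 (output)
```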
Using the processing of Fig. 30, a case in which one view frustum 40, for example the view frustum 40a, is highlighted as in Fig. 28 will now be described.
Fig. 28 is an example of the overhead image V3-1 viewed by the director. In the overhead image V3-2 viewed by the cameramen at this time, no highlighting is performed; that is, in the overhead image V3-2 the view frustums 40a, 40b, and 40c are all displayed in the same mode, semi-transparent white.
Fig. 31 shows a specific example of the processing of steps S201 and S202 of Fig. 30.
In step S201, the AR system 5, as step S210, generates the view frustums 40 for each camera 2. That is, for the cameramen, the view frustums 40a, 40b, and 40c are generated, for example, as identical semi-transparent white images.
In the subsequent step S202, the AR system 5, as step S210, acquires the value of the screen occupancy of the subject of interest for the captured image V1 of each camera 2.
For example, the AR system 5 continuously executes image recognition processing on the captured image V1 of each camera 2, determines whether the set subject of interest is being shot, and determines the screen occupancy in each frame. For example, it determines that the subject of interest appears, as in Fig. 29, and the area it occupies within the screen, and obtains the screen occupancy from these. In step S210, the AR system 5 acquires the current screen occupancy of the subject of interest in each captured image V1 calculated in this way.
In step S211, the AR system 5 determines the most suitable captured image V1; for example, the captured image V1 with the highest screen occupancy is taken as the most suitable.
In step S212, the AR system 5 generates, as the director's view frustums 40, the images of the view frustums 40 including the highlighting of the view frustum 40 corresponding to the camera 2 of the most suitable captured image V1. For example, the view frustum 40a is rendered as a semi-transparent red image as the highlighted mode, and the view frustums 40b and 40c as semi-transparent white images.
After performing the processing of steps S201 and S202 of Fig. 30 as in Fig. 31, the AR system 5 performs the processing of steps S203, S204, and S205. The overhead image V3-1 displayed on the GUI device 11 thereby becomes as shown in Fig. 28, while in the overhead image V3-2 displayed on each camera 2 no view frustum 40 is highlighted.
This allows the director to recognize the camera 2 that is currently showing the subject of interest at the largest size.
In the above, the view frustum 40 to be highlighted is selected based on the screen occupancy of the subject of interest, but it may instead be selected based on the continuous shooting time.
Fig. 32 shows another example of step S202; step S201 is the same as in Fig. 31.
In step S202 of Fig. 30, the AR system 5, as step S215 of Fig. 32, acquires the value of the continuous shooting time of the subject of interest for the captured image V1 of each camera 2.
As described above, the AR system 5 continuously executes image recognition processing on the captured image V1 of each camera 2 and determines whether the set subject of interest is being shot. In this case, it obtains, for each captured image V1, the duration (number of consecutive frames) over which the subject of interest has been recognized. In step S215, the AR system 5 then acquires the continuous shooting time calculated in this way.
In step S211, the AR system 5 determines the most suitable captured image V1; in this case, the captured image V1 with the longest continuous shooting time is taken as the most suitable.
In step S212, the AR system 5 generates, as the director's view frustums 40, the images of the view frustums 40 including the highlighting of the view frustum 40 corresponding to the camera 2 of the most suitable captured image V1.
Thereafter, the AR system 5 performs the processing of steps S203, S204, and S205 of Fig. 30, and the overhead image V3-1 displayed on the GUI device 11 becomes as shown in Fig. 28.
This allows the director to recognize the camera 2 that has been showing the subject of interest continuously for a long time.
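The selection of the most suitable captured image in Figs. 31 and 32 reduces to ranking the cameras by a per-camera score from image recognition, either the screen occupancy or the continuous recognition time of the subject of interest. A minimal sketch under that reading follows; the field names and tie handling are assumptions.

```
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class SubjectStats:
    camera_id: str
    occupancy: float        # fraction of the frame occupied by the subject of interest (0.0-1.0)
    duration_frames: int    # number of consecutive frames in which the subject has been recognized

def select_highlight_camera(stats: List[SubjectStats], by: str = "occupancy") -> Optional[str]:
    """Return the camera whose frustum should be highlighted, or None if the subject is not seen."""
    visible = [s for s in stats if s.occupancy > 0.0]
    if not visible:
        return None
    if by == "occupancy":          # Fig. 31: highest screen occupancy
        best = max(visible, key=lambda s: s.occupancy)
    else:                          # Fig. 32: longest continuous shooting time
        best = max(visible, key=lambda s: s.duration_frames)
    return best.camera_id

# Example: camera "2b" wins on occupancy, camera "2a" on duration.
stats = [SubjectStats("2a", 0.10, 300), SubjectStats("2b", 0.25, 40), SubjectStats("2c", 0.0, 0)]
assert select_highlight_camera(stats, by="occupancy") == "2b"
assert select_highlight_camera(stats, by="duration") == "2a"
```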
When the view frustum 40 is highlighted in the overhead image V3-1 according to the screen occupancy or continuous shooting time of the subject of interest as described above, a process of displaying the captured image V1 only for the highlighted view frustum 40 is also conceivable. This allows the director to also check how the subject of interest is being shot.
Next, an example will be described in which the display mode of the overhead image V3-1 viewed by the director is changed by feedback from a cameraman.
Fig. 33A shows the overhead image V3-1 as the device display image 51 of the GUI device 11. In this example, the view frustums 40a, 40b, and 40c are displayed in the same mode, for example semi-transparent white.
Suppose now that, among the plurality of cameras 2, a specific operation is performed by the cameraman (or a remote operator) of the camera 2 corresponding to the view frustum 40a.
In that case, the overhead image V3-1 becomes as shown in Fig. 33B. That is, the view frustum 40a is highlighted in a mode different from that of the view frustums 40b and 40c so that it is clearly indicated to the director.
The specific operation by the cameraman is, for example, an operation by which the cameraman notifies the director that good footage is currently being captured. Such an operation is made available on the camera 2 side, and when it is performed, the AR system 5 makes the display mode of the view frustum 40 of the camera 2 on which the operation was performed different from the others in the overhead image V3-1.
A processing example is shown in Fig. 34, which shows a specific example of steps S201 and S202 of Fig. 30.
In step S201 of Fig. 30, the AR system 5, as step S210 of Fig. 34, generates the images of the cameraman's view frustums 40, for example generating identical semi-transparent white images as the view frustums 40a, 40b, and 40c.
In step S202 of Fig. 30, the AR system 5 first checks, as step S220 of Fig. 34, whether there has been feedback from any camera, that is, whether a specific operation by a cameraman has been performed, and branches the process in step S221.
If there has been no specific operation, the AR system 5 proceeds from step S221 to step S223 and generates the images of the director's view frustums 40, for example generating identical semi-transparent white images as the view frustums 40a, 40b, and 40c.
On the other hand, if a specific operation has been detected, the AR system 5 proceeds to step S222 and generates the images of the director's view frustums 40 including highlighting, for example generating the view frustum 40a as a semi-transparent red image and the view frustums 40b and 40c as semi-transparent white images.
Thereafter, the AR system 5 performs the processing of steps S203, S204, and S205 of Fig. 30. The overhead image V3-1 displayed on the GUI device 11 thereby becomes as shown in Fig. 33A or Fig. 33B: when there is no specific operation from a cameraman, the image is as in Fig. 33A, and from the moment a cameraman performs the specific operation, it becomes as in Fig. 33B. This allows the director to recognize the cameraman's signal that good footage is currently being captured.
In the overhead image V3-2 displayed on each camera 2, on the other hand, the view frustums 40a, 40b, and 40c are displayed in the same mode.
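The feedback-driven highlighting of Fig. 34 only changes the style chosen when the director's frustums are generated. A small self-contained sketch, with assumed style names and camera IDs:

```
from typing import Set

def director_style(camera_id: str, flagged_cameras: Set[str]) -> str:
    """Style for the director's overhead image V3-1 (Fig. 34, steps S220-S223)."""
    # Cameras whose operator performed the "good footage" operation are highlighted.
    return "red_translucent" if camera_id in flagged_cameras else "white_translucent"

def cameraman_style(camera_id: str) -> str:
    """Style for the cameramen's overhead image V3-2: never highlighted (step S210)."""
    return "white_translucent"

# Example: the operator of camera "2a" has signalled good footage.
flagged = {"2a"}
assert director_style("2a", flagged) == "red_translucent"
assert director_style("2b", flagged) == "white_translucent"
assert cameraman_style("2a") == "white_translucent"
```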
Next, an example will be described in which the display mode is changed when view frustums 40 overlap in the image.
Fig. 35A shows the overhead image V3-1 as the device display image 51 of the GUI device 11. In this example, the view frustums 40a, 40b, and 40c are displayed in the same mode.
Suppose now that the view frustums 40a and 40b overlap in the image as in Fig. 35B. In that case, the view frustums 40a and 40b are highlighted in a mode different from the normal one so that the director can recognize them easily.
A processing example is shown in Fig. 36, which shows a specific example of steps S201 and S202 of Fig. 30.
In step S201 of Fig. 30, the AR system 5, as step S210 of Fig. 36, generates the images of the cameraman's view frustums 40, for example generating identical semi-transparent white images as the view frustums 40a, 40b, and 40c.
In step S202 of Fig. 30, the AR system 5 first sets, in step S230 of Fig. 36, the size, shape, and direction of the view frustum 40 of each camera 2 based on the metadata MT of each camera 2.
In step S231, the AR system 5 checks the arrangement of each view frustum 40 within the three-dimensional coordinates of the CG space 30 of the current frame, which makes it possible to check whether any view frustums 40 overlap.
In step S232, the AR system 5 branches the process depending on whether there is an overlap.
If there is no overlap of view frustums 40, the AR system 5 proceeds to step S234 and generates the images of the director's view frustums 40, for example generating identical semi-transparent white images as the view frustums 40a, 40b, and 40c.
On the other hand, if there is an overlap, the AR system 5 proceeds to step S233 and generates the images of the director's view frustums 40 including highlighting. In this case, the overlapping view frustums 40, for example the view frustums 40a and 40b, are generated as semi-transparent red images, and the non-overlapping view frustum 40c as a semi-transparent white image.
Thereafter, the AR system 5 performs the processing of steps S203, S204, and S205 of Fig. 30. The overhead image V3-1 displayed on the GUI device 11 thereby becomes as shown in Fig. 35A or Fig. 35B: when there is no overlap of view frustums 40, the image is as in Fig. 35A, and when there is an overlap, it is as in Fig. 35B. This allows the director and others to easily recognize a situation in which the same subject is being shot from different viewpoints by a plurality of cameras 2, which makes it easier to give clear instructions to each cameraman and is also convenient for switching the main line video, for example when it is desired to switch between images of the same subject.
In the overhead image V3-2 displayed on each camera 2, on the other hand, the view frustums 40a, 40b, and 40c are displayed in the same mode.
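The overlap check of step S231 can be implemented with varying precision; an exact test would intersect the convex frustum volumes, while a coarse, conservative variant simply intersects their axis-aligned bounding boxes in the CG space. The sketch below shows only the coarse variant, as an illustration of the branching in steps S231 and S232.

```
from itertools import combinations
from typing import Dict, Sequence, Set, Tuple

Vec3 = Tuple[float, float, float]

def aabb(corners: Sequence[Vec3]) -> Tuple[Vec3, Vec3]:
    """Axis-aligned bounding box of a set of frustum corner points."""
    lo = tuple(min(c[i] for c in corners) for i in range(3))
    hi = tuple(max(c[i] for c in corners) for i in range(3))
    return lo, hi

def aabbs_intersect(a: Tuple[Vec3, Vec3], b: Tuple[Vec3, Vec3]) -> bool:
    return all(a[0][i] <= b[1][i] and b[0][i] <= a[1][i] for i in range(3))

def overlapping_frustums(frustum_corners: Dict[str, Sequence[Vec3]]) -> Set[str]:
    """Steps S231/S232 (coarse): return the IDs of frustums whose bounding boxes intersect."""
    boxes = {cid: aabb(pts) for cid, pts in frustum_corners.items()}
    overlapping: Set[str] = set()
    for (id_a, box_a), (id_b, box_b) in combinations(boxes.items(), 2):
        if aabbs_intersect(box_a, box_b):
            overlapping.update((id_a, id_b))
    return overlapping
```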
[4-2: Priority display]
Next, an example will be described in which, when view frustums 40 overlap in the image, a certain view frustum 40 is displayed preferentially.
Consider the case, shown earlier in Fig. 17, in which the view frustums 40a, 40b, 40c, and 40d overlap; the overlap can reduce visibility. In particular, overlapping semi-transparent view frustums 40 makes it difficult to see the focus plane 41, the depth of field range 42, and so on of each of them.
Therefore, one view frustum 40 is displayed preferentially as in Fig. 37.
Fig. 37 shows the overhead image V3-1 as the device display image 51 of the GUI device 11. In this example, the view frustums 40a, 40b, 40c, and 40d overlap, but the view frustum 40a is given priority, and in the overlapping portions the focus plane 41 and depth of field range 42 of the view frustum 40a are displayed.
A processing example is shown in Fig. 38, which shows a specific example of steps S201 and S202 of Fig. 30.
In step S201 of Fig. 30, the AR system 5, as step S210 of Fig. 38, generates the images of the cameraman's view frustums 40, for example the images of the view frustums 40a, 40b, 40c, and 40d. No particular priority setting is applied to the images of the cameraman's view frustums 40.
In step S202 of Fig. 30, the AR system 5 first sets, in step S240 of Fig. 38, the size, shape, and direction of the view frustum 40 of each camera 2 based on the metadata MT of each camera 2.
In step S241, the AR system 5 checks the arrangement of each view frustum 40 within the three-dimensional coordinates of the CG space 30 of the current frame, which makes it possible to check whether any view frustums 40 overlap.
In step S242, the AR system 5 branches the process depending on whether there is an overlap.
If there is no overlap of view frustums 40, the AR system 5 proceeds to step S244 and generates the images of the director's view frustums 40, for example the images of the view frustums 40a, 40b, 40c, and 40d.
On the other hand, if there is an overlap, the AR system 5 proceeds to step S245 and determines which of the overlapping view frustums 40 is to be given priority. Alternatively, the prioritized view frustum 40 may be determined from among all the view frustums 40, including those that do not overlap.
Several methods of determination are conceivable.
For example, priority may be given to the view frustum 40 of the camera 2 currently providing the main line video.
Alternatively, the director or others may be allowed to arbitrarily select the view frustum 40 to be prioritized.
Furthermore, the view frustum 40 selected for highlighting because it is shooting the subject of interest or because of a cameraman's specific operation, as described above, may be given priority.
In step S246, the AR system 5 generates the images of the director's view frustums 40. In this case, the prioritized view frustum 40 is rendered as an image in which the focus plane 41 and depth of field range 42 are displayed as usual. The other view frustums 40 are rendered as images in which the focus plane 41 and depth of field range 42 are not displayed in the portions that overlap the prioritized view frustum 40; alternatively, the other view frustums 40 may all be rendered without the focus plane 41 and depth of field range 42.
Thereafter, the AR system 5 performs the processing of steps S203, S204, and S205 of Fig. 30. As a result, even when view frustums 40 overlap, the overhead image V3-1 displayed on the GUI device 11 becomes an image in which the focus plane 41 and depth of field range 42 of the prioritized view frustum 40 can be clearly recognized, as in Fig. 37.
In the overhead image V3-2 displayed on each camera 2, on the other hand, the view frustums 40a, 40b, 40c, and 40d are displayed as in Fig. 17.
In Figs. 37 and 38, the priority setting is applied to the director's overhead image V3-1, but the priority setting may instead be applied to the cameraman's overhead image V3-2. Considering that it is viewed by a cameraman, it is preferable that the view frustum 40 of the camera 2 that the cameraman is operating be given priority.
Therefore, in step S201 of Fig. 30, where the cameraman's view frustums are generated, processing similar to steps S240 to S246 of Fig. 38 may be performed, except that the view frustum given priority in step S245 is the view frustum 40 of the cameraman's own camera 2.
This allows the cameraman to clearly see the focus plane 41 and depth of field range 42 of the camera 2 he or she is operating even if its view frustum 40 overlaps the view frustums 40 of other cameras 2.
When the priority setting is applied to the overhead image V3-2 in this way, the overhead image V3-1 viewed by the director may be given a priority setting as described above, or may be given none.
Even when priority settings are applied to both the overhead images V3-1 and V3-2, the conditions for determining the prioritized view frustum 40 differ, so the overhead image V3-1 and all of the overhead images V3-2 displayed on the cameras 2 do not end up in the same display mode.
It is also conceivable that the overhead image V3-2 viewed by a cameraman displays only the view frustum 40 of the cameraman's own camera 2 and does not display the view frustums 40 of the other cameras 2.
[4-3: Instruction display]
Next, an example will be described in which instructions from the director can be conveyed visually to the cameramen.
Figs. 39A and 39B show the overhead image V3-1 as the device display image 51 of the GUI device 11. In this example, the view frustums 40a, 40b, and 40c are displayed.
Fig. 40A shows the overhead image V3-2 as the viewfinder display image 50 of a camera 2. In this example, the overhead image V3-2 is composited into a corner of the screen of the captured image V1. Fig. 40B shows the overhead image V3-2 enlarged.
Fig. 39A is an example of a case in which the director performs an instruction operation on the camera 2 of the view frustum 40b. For example, in response to an operation such as the director dragging the view frustum 40b on the GUI device 11, an instruction frustum 40DR is displayed. This serves as an instruction from the director to the cameraman of the camera 2 of the view frustum 40b to change the shooting direction to the direction of the instruction frustum 40DR.
Therefore, in this case, the AR system 5 also displays the instruction frustum 40DR for the view frustum 40b in the overhead image V3-2 viewed by the cameraman, as shown in Figs. 40A and 40B.
The cameraman operating the camera 2 of the view frustum 40b can follow the director's instruction by changing the shooting direction so that the view frustum 40b coincides with the instruction frustum 40DR.
The instruction frustum 40DR may be able to indicate not only the shooting direction but also the angle of view, the focus plane 41, and so on. For example, the director may be able to move the focus plane 41 forward or backward, widen the angle of view (change the slope of the pyramid), and so on by operating the instruction frustum 40DR.
The cameraman can then also adjust the focus so that the focus plane 41 of the view frustum 40b coincides with that of the instruction frustum 40DR, or adjust the angle of view so that the slopes of the pyramids coincide.
Note that the overhead image V3-1 of Fig. 39A and the overhead image V3-2 of Figs. 40A and 40B show an example in which the viewpoint positions with respect to the CG space 30 differ. The director and the cameramen can each change the viewpoint position of the overhead image V3-1 or V3-2 by their own operations. The illustrated example shows that the overhead image V3-1 and the overhead image V3-2 do not necessarily display the CG space 30 as seen from the same viewpoint position.
Fig. 39B shows a state in which the director has further performed an instruction operation also on the view frustum 40a to display an instruction frustum 40DR. In this way, in the overhead image V3-1, instructions can be given to each view frustum 40 individually.
As illustrated, even when a new instruction is given, it is desirable to keep displaying the instruction frustum 40DR of the previous instruction (the instruction to the view frustum 40b) so that the director can confirm the currently valid instructions.
An instruction frustum 40DR may be erased from the overhead images V3-1 and V3-2 when the view frustum 40 of the instructed camera 2 substantially coincides with that instruction frustum 40DR.
Alternatively, an instruction frustum 40DR may also be erased from the overhead images V3-1 and V3-2 by a cancellation operation by the director, so that, for example, cancellation or modification of an instruction can also be handled.
In the overhead image V3-2, the instruction frustums 40DR for all the cameras 2 may be displayed, or only the instruction frustum 40DR for the cameraman's own camera 2 may be displayed, and the cameraman may be allowed to choose between these.
By displaying the instruction frustums 40DR for all the cameras 2 on each camera 2, each cameraman can grasp what instructions are being issued overall.
On the other hand, by displaying only the instruction frustum 40DR for the cameraman's own camera 2, the cameraman can easily recognize the instructions directed at him or her from the director.
A processing example is shown in Fig. 41, which shows a specific example of steps S201, S202, S203, and S204 of Fig. 30.
In step S201 of Fig. 30, the AR system 5 performs the processing of steps S250 to S254 of Fig. 41.
First, as step S250, the AR system 5 generates the images of the cameraman's view frustums 40, for example the images of the view frustums 40a, 40b, and 40c.
In step S251, the AR system 5 checks whether an instruction operation by the director has been performed. If there is no instruction operation, the process proceeds to step S202 of Fig. 30.
If an instruction operation has been performed, the AR system 5 proceeds from step S251 to step S252 of Fig. 41 and branches the process depending on the display mode of the instruction frustums 40DR.
The display mode here is assumed to be selectable by the cameraman between a mode in which only the instruction frustum 40DR directed at the cameraman's own camera is displayed and a mode in which all instruction frustums 40DR are displayed.
Note that such mode selection may not be provided; instead, only the instruction frustum 40DR for the cameraman's own camera may always be displayed, or all the instruction frustums 40DR may always be displayed.
In the mode in which the instruction frustum 40DR directed at the cameraman's own camera is displayed, the AR system 5 proceeds to step S253 and generates an image of the instruction frustum 40DR. However, if the instruction from the director is not directed at the camera 2 that is the subject of the overhead image V3-2 generation process, the image of the instruction frustum 40DR need not be generated in step S253.
In this case, the video data transmitted to each camera 2 as the overhead image V3-2 has different display contents. In other words, for each camera 2 there can be video data that contains the instruction frustum 40DR and video data that does not contain it.
In the mode in which all the instruction frustums 40DR are displayed, the AR system 5 proceeds to step S254 and generates images of the instruction frustums 40DR that are valid at that point in time.
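As a reference only, the branching of steps S251 to S254 can be sketched as follows. This is a minimal illustration assuming a per-camera generation loop; the names such as InstructionFrustum and display_mode are hypothetical and do not appear in the embodiment.

from dataclasses import dataclass
from typing import List

@dataclass
class InstructionFrustum:
    target_camera_id: int   # camera 2 that the director's instruction is aimed at
    pan: float              # instructed shooting direction (degrees)
    tilt: float
    zoom_angle: float       # instructed angle of view (degrees)

def select_instruction_frustums(target_camera_id: int,
                                display_mode: str,
                                active_instructions: List[InstructionFrustum]) -> List[InstructionFrustum]:
    # Decide which instruction frustums 40DR are drawn in the overhead image V3-2
    # sent to one specific camera 2.
    if not active_instructions:        # step S251: no instruction operation by the director
        return []
    if display_mode == "own_only":     # step S253: only the frustum aimed at this camera
        return [f for f in active_instructions if f.target_camera_id == target_camera_id]
    return list(active_instructions)   # step S254: all currently valid instruction frustums

# Example: camera 1 in "own_only" mode sees only the instruction aimed at it.
instructions = [InstructionFrustum(1, pan=-30.0, tilt=0.0, zoom_angle=20.0),
                InstructionFrustum(2, pan=10.0, tilt=5.0, zoom_angle=30.0)]
print(select_instruction_frustums(1, "own_only", instructions))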
Following the above processing of steps S250 to S254, the AR system 5 performs the processing of step S202 in FIG. 30 as shown in steps S260 to S262 in FIG. 41.
In step S260, the AR system 5 generates an image of the view frustum 40 for the director. For example, images are generated for the view frustums 40a, 40b, and 40c.
In step S261, the AR system 5 checks whether or not an instruction operation has been performed by the director. If there is no instruction operation, the process proceeds to step S203 in FIG. 30.
If an instruction operation has been performed, the AR system 5 proceeds from step S261 to step S262 in FIG. 41 and generates an image of the instruction frustum 40DR that is valid at that point in time.
As step S203 in FIG. 30, the AR system 5 performs the processes of steps S255 and S256 in FIG. 41.
In step S255, the AR system 5 composites the view frustum 40 and the instruction frustum 40DR into the overhead image V3-2, thereby generating video data of the overhead image V3-2 as shown in FIG. 40B.
In step S256, the AR system 5 composites the overhead image V3-2 and the captured image V1 to generate video data of a composite image as shown in FIG. 40A.
The overhead view image V3-2 and the photographed image V1 may be combined on the camera 2 side.
As step S204 in FIG. 30, the AR system 5 performs the process of step S265 in FIG. 41.
In step S265, the AR system 5 composites the view frustum 40 and the instruction frustum 40DR into the overhead image V3-1, thereby generating video data of the overhead image V3-1 as shown in FIGS. 39A and 39B.
Thereafter, in step S205 of FIG. 30, the overhead view V3-1 is transmitted to the GUI device 11, and the overhead view V3-2 corresponding to each camera 2 is transmitted to each camera 2.
This allows the director to check his/her own instructions on the instruction frustum 40DR in the overhead view V3-1, and each cameraman can visually check the instructions from the director through the instruction frustum 40DR.
Incidentally, the instruction frustum 40DR visible to the cameraman appears in the overhead image V3-2, and it is preferable to control the viewpoint position of the overhead image V3-2 so that the instruction becomes easier for the cameraman to understand.
For example, FIGS. 42A and 42B show overhead images V3-2 as the viewfinder display image 50 of a camera 2. These are overhead images V3-2 whose viewpoint position is the position of the camera 2 corresponding to the view frustum 40c, and they are the images viewed by the cameraman of that camera 2.
In addition, in the overhead view image V3-2 of FIG. 42A, an instruction frustum 40DR for the view frustum 40c is displayed, and an instruction frustum 40DR for the view frustum 40a of the other camera 2 is also displayed.
Also, in the overhead view image V3-2 of FIG. 42B, an instruction frustum 40DR for the view frustum 40c is displayed, but an instruction frustum 40DR for the view frustum 40a of the other camera 2 is not displayed.
As shown in FIG. 42A or 42B, if the cameraman viewing the overhead image V3-2 can see it in a state close to his own viewpoint, the direction of the instruction by the instruction frustum 40DR becomes easy to understand.
That is, in FIGS. 42A and 42B, it can be intuitively understood that the instruction frustum 40DR directed at the cameraman's own camera is an instruction to turn the shooting direction to the left.
Therefore, when the instruction frustum 40DR is displayed in the overhead image V3-2, the overhead image V3-2 is rendered as a 3D image whose viewpoint position is set to the camera position, and the view frustum 40 and the instruction frustum 40DR are displayed therein.
An example of the process will be described. First, the AR system 5 performs steps S201 and S202 in FIG. 30 as shown in FIG. 41. Then, it performs step S203 in FIG. 30 as shown in FIG. 43.
In step S280, the AR system 5 branches the process depending on whether or not the instruction frustum 40DR is to be displayed in the current frame.
If the instruction frustum 40DR is not to be displayed in the overhead image V3-2 for the camera 2 to be processed, the AR system 5 proceeds to step S281 and generates video data in which the image of the view frustum 40 is synthesized with the overhead image V3-2.
When the instruction frustum 40DR is to be displayed in the current frame, the AR system 5 proceeds to step S282, and sets the arrangement of the view frustum 40 and the instruction frustum 40DR within the 3D space coordinates for generating the overhead image V3-2.
Then, in step S283, the AR system 5 sets the viewpoint position within the 3D space coordinates. That is, the coordinates of the position of a specific camera 2 among the multiple cameras to which the overhead video V3-2 is to be transmitted are set as the viewpoint position.
In step S284, the AR system 5 generates video data of the overhead image V3-2 as a CG image rendered from the set viewpoint position, with the view frustum 40 and the instruction frustum 40DR composited into it.
When this processing is performed and an overhead image V3-2 including the instruction frustum 40DR is displayed as the viewfinder display image 50, the cameraman can see an image such as that shown in FIG. 42A or FIG. 42B from the viewpoint of the camera 2. This makes it easier to understand the director's instructions.
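Step S283 essentially replaces the viewpoint of the 3D rendering with the position of the destination camera before the frustums are drawn. The following is a minimal sketch of that idea, assuming a standard look-at view matrix; the helper name look_at and the use of numpy are illustrative and not part of the embodiment.

import numpy as np

def look_at(eye, target, up=(0.0, 1.0, 0.0)):
    # Right-handed look-at view matrix; eye is the position of the destination camera 2
    # in the CG space 30 of the overhead image V3-2 (step S283).
    eye, target, up = (np.asarray(v, dtype=float) for v in (eye, target, up))
    f = target - eye
    f = f / np.linalg.norm(f)                       # forward
    s = np.cross(f, up); s = s / np.linalg.norm(s)  # right
    u = np.cross(s, f)                              # corrected up
    view = np.identity(4)
    view[0, :3], view[1, :3], view[2, :3] = s, u, -f
    view[:3, 3] = -view[:3, :3] @ eye
    return view

# The CG scene containing the view frustum 40 and the instruction frustum 40DR would
# then be rendered with this matrix to obtain the overhead image V3-2 (step S284).
print(look_at(eye=(5.0, 2.0, -8.0), target=(0.0, 0.0, 0.0)))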
Incidentally, it would be convenient if the cameraman could arbitrarily switch the viewfinder display image 50 between the overhead image V3-2 and the shot image V1.
For example, the viewfinder display image 50 can be switched by the cameraman between an overhead image V3-2 as shown in FIG. 42A and a shot image V1 as shown in FIG. 44.
In particular, since the cameraman needs to constantly check the captured image V1 (i.e., live view) of the camera 2 that he is operating while shooting, it is necessary for the captured image V1 to be displayed in the viewfinder.
For this reason, it is conceivable to composite the overhead image V3-2 with the shot image V1 and display it as previously shown in FIG. 40A, but the overhead image V3-2 may be small and the instruction frustum 40DR may be difficult to see.
Therefore, it is advisable to switch between the overhead view image V3-2 as shown in FIG. 42A and the photographed image V1 as shown in FIG. 44 at any timing and display each image in full screen.
However, the cameraman also needs to know that an instruction has been issued while the captured image V1 is being displayed. To this end, as shown in FIG. 44, an instruction direction 54 and a match rate 53 are displayed as instruction information on the captured image V1.
The instruction direction 54 is the shooting direction indicated by the instruction frustum 40DR. The match rate 53 indicates the degree of coincidence between the current view frustum 40 and the instruction frustum 40DR. When the match rate reaches 100%, the current view frustum 40 matches the instruction frustum 40DR.
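The embodiment does not specify how the match rate 53 is calculated. As one plausible illustration only, it could be derived from how closely the current pan, tilt, and zoom angle approach the instructed values; the weighting and tolerances in the sketch below are assumptions.

def match_rate(current, instructed, pan_tol=45.0, tilt_tol=30.0, zoom_tol=20.0):
    # current / instructed: dicts with "pan", "tilt" and "zoom_angle" in degrees.
    # Returns 0..100; 100 means the current view frustum 40 coincides with the
    # instruction frustum 40DR.  The tolerances are illustrative assumptions.
    def closeness(cur, ins, tol):
        return max(0.0, 1.0 - abs(cur - ins) / tol)
    score = (closeness(current["pan"], instructed["pan"], pan_tol)
             + closeness(current["tilt"], instructed["tilt"], tilt_tol)
             + closeness(current["zoom_angle"], instructed["zoom_angle"], zoom_tol)) / 3.0
    return round(score * 100.0)

print(match_rate({"pan": -20.0, "tilt": 0.0, "zoom_angle": 25.0},
                 {"pan": -30.0, "tilt": 0.0, "zoom_angle": 20.0}))   # about 84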
By displaying the image in this way, the cameraman can confirm that the director has given an instruction even while normally viewing the shot image V1, and can respond to the instruction by relying on the instruction direction 54 and the match rate 53. If necessary, the screen can also be switched to the overhead image V3-2 to check the instruction frustum 40DR.
An example of the processing is shown in FIG. 45.
The AR system 5 performs the processes of steps S270 to S273 in FIG. 45 in step S201 in FIG. 30.
Furthermore, the AR system 5 performs the processes of steps S275 to S278 in FIG. 45 in step S203 in FIG. 30.
In step S270, the AR system 5 checks whether the display of the view frustum 40 is OFF in the current frame. In other words, it checks whether the current frame is displaying the captured image V1 instead of the overhead image V3-2.
If the captured image V1 has been selected as the viewfinder display image 50, the AR system 5 ends the processing of step S201. In other words, there is no need to generate images of the view frustum 40 and the instruction frustum 40DR.
If the overhead image V3-2 is selected as the viewfinder display image 50, the AR system 5 generates image data for the view frustum 40 based on the metadata MT in step S271.
In step S272, the AR system 5 determines whether or not to display the instruction frustum 40DR.
The instruction frustum 40DR is to be displayed when the director has performed an instruction operation. The selection between the above-described mode for displaying all the instruction frustums 40DR and the mode for displaying only the instruction frustum 40DR for the cameraman's own camera is also checked here.
If the instruction frustum 40DR is not to be displayed, the process of step S201 is ended.
If the instruction frustum 40DR is to be displayed in the overhead image V3-2, the AR system 5 proceeds to step S273 and generates image data of the instruction frustum 40DR.
In step S203 of FIG. 30, the AR system 5 likewise checks in step S275 of FIG. 45 whether the display of the view frustum 40 is OFF, that is, whether the captured image V1 is currently being displayed.
If the camera 2 being processed is currently displaying the overhead image V3-2, the AR system 5 proceeds to step S278, where it composites the image data of the view frustum 40 into the video data of the overhead image V3-2 and, if image data of the instruction frustum 40DR has been generated, also composites the instruction frustum 40DR into that video data.
If the camera 2 being processed is currently displaying the captured image V1, the AR system 5 proceeds to step S276, where the process branches depending on whether or not there is an instruction from the director. If there is no instruction, the process of step S203 ends. If there is an instruction from the director, in step S277 the captured image V1 is set to display the instruction direction 54 and the match rate 53.
Then, in step S205 of FIG. 30, video data is output to camera 2. That is, video data of the shot video V1 as shown in FIG. 44 or video data of the overhead video V3-2 as shown in FIG. 42A is output to camera 2.
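The branching of steps S270 to S277 can be summarized in a short sketch, assuming one viewfinder frame is assembled per camera per frame; the function and field names are hypothetical.

from typing import Optional

def build_viewfinder_frame(show_overhead: bool,
                           instruction: Optional[dict],
                           match_rate_53: Optional[int]) -> dict:
    # Decide what the viewfinder display image 50 contains for one frame.
    if not show_overhead:
        # Steps S270/S275: the cameraman is viewing the captured image V1 (live view).
        frame = {"base": "captured image V1 (live view)"}
        if instruction is not None:
            # Step S277: overlay the instruction direction 54 and the match rate 53 on V1.
            frame["instruction_direction_54"] = instruction["direction"]
            frame["match_rate_53"] = match_rate_53
        return frame
    # Steps S271/S273/S278: overhead image V3-2 with the view frustum 40 (and 40DR if any).
    frame = {"base": "overhead image V3-2", "view_frustum_40": True}
    if instruction is not None:
        frame["instruction_frustum_40DR"] = True
    return frame

print(build_viewfinder_frame(False, {"direction": "left"}, 62))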
For example, the viewfinder display image 50 may be switched between the shot image V1, the overhead image V3-2, and a composite image as shown in FIG. 40A by operation of the cameraman.
[4-4: Marker display]
Next, an example of executing marker display in the overhead view image V3-2 as the viewfinder display image 50 visually recognized by the cameraman will be described.
FIG. 46A shows a state in which a photographed image V1 and an overhead image V3-2 are displayed as the viewfinder display image 50 of a camera 2. In this example, the overhead image V3-2 is composited into a corner of the screen of the photographed image V1. FIG. 46B shows an enlarged view of the overhead image V3-2.
Also, as shown in FIG. 46B, in the overhead view image V3-2 displayed by camera 2, only the view frustum 40 of that camera itself is displayed.
In the overhead view video V3-2 displayed on the GUI device 11 on the director's side, it is assumed that the view frustums 40 of all the cameras 2 are displayed as described with reference to FIG. 28 and the like.
In the overhead image V3-2 shown in FIGS. 46A and 46B, marker frustums 40M1 and 40M2 are displayed in addition to the view frustum 40.
The marker frustums 40M1 and 40M2 are displayed in response to the cameraman registering subject positions and directions to be photographed. That is, the cameraman marks in advance the directions in which he or she frequently wants to shoot.
The marker frustums 40M1 and 40M2 are displayed, for example, in a manner different from that of the view frustum 40. The marker frustum 40M1 and the marker frustum 40M2 may also be displayed in manners different from each other.
For example, when the view frustum 40 is white and semi-transparent, the marker frustum 40M1 is yellow and semi-transparent, and the marker frustum 40M2 is light blue and semi-transparent.
Also, as shown in FIG. 47, the positions of the marker frustums 40M1 and 40M2 may be indicated by markers 55M1 and 55M2 on the captured image V1.
In this case, the correspondence may be clearly indicated by making the marker 55M1 yellow like the marker frustum 40M1 and making the marker 55M2 light blue like the marker frustum 40M2.
A processing example will be described. For the sake of explanation, the marker frustums 40M1, 40M2, etc. will be collectively referred to as "marker frustum 40M." Also, the markers 55M1, 55M2, etc. will be collectively referred to as "marker 55M."
FIG. 48 shows a specific example of steps S201, S202, S203, and S204 in FIG. 30.
As step S201 in FIG. 30, the AR system 5 performs the processes of steps S300 to S303 in FIG. 48.
First, in step S300, the AR system 5 generates image data of the view frustum 40 based on the metadata MT. For example, the view frustum 40 corresponding to the camera 2 to be processed is generated. In some cases, the view frustums 40 corresponding to all of the cameras 2 are generated.
In step S301, the AR system 5 determines whether or not a marking operation has been performed on the camera 2 to be processed. A marking operation is an operation for adding or deleting a marking. If no marking operation has been performed, the process of step S201 ends.
When a marking operation has been performed, in step S302 the AR system 5 performs a process of adding a registered marking point or deleting a registered marking for the camera 2 to be processed.
Then, in step S303, the AR system 5 generates image data of the marker frustums 40M as necessary. That is, if markings are registered at that time, image data of the corresponding marker frustums 40M is generated.
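Steps S301 to S303 amount to maintaining a per-camera list of registered marking points and regenerating the marker frustum images from it. The sketch below is a minimal illustration under that assumption; all names are hypothetical and a marking point is reduced to a pan/tilt/zoom tuple.

from typing import Dict, List, Tuple

MarkingPoint = Tuple[float, float, float]        # (pan, tilt, zoom angle) in degrees
markings: Dict[int, List[MarkingPoint]] = {}     # registered marking points per camera 2

def handle_marking_operation(camera_id: int, op: str, point: MarkingPoint) -> None:
    # Step S302: add or delete a registered marking point for the camera being processed.
    points = markings.setdefault(camera_id, [])
    if op == "add":
        points.append(point)
    elif op == "delete" and point in points:
        points.remove(point)

def marker_frustums_for(camera_id: int) -> List[MarkingPoint]:
    # Step S303: marker frustums 40M are (re)generated from the currently registered points.
    return list(markings.get(camera_id, []))

handle_marking_operation(3, "add", (-25.0, 5.0, 30.0))   # e.g. register marker frustum 40M1
print(marker_frustums_for(3))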
In step S202 of FIG. 30, the AR system 5 generates the view frustums 40 for the director in step S310 of FIG. 48. In this case, image data of the view frustums 40 corresponding to all the cameras 2 is generated.
In step S203 in FIG. 30, the AR system 5 performs the processes of steps S320 and S321 in FIG. 48.
In step S320, the AR system 5 synthesizes the view frustum 40 with the CG data as the overhead view image V3-2. If there is a marking registration, the AR system 5 also synthesizes image data of the marker frustum 40M.
In step S321, the AR system 5 combines the marker 55M with the captured image V1 in accordance with the marking registration.
As described above, the video data of the overhead video V3-2 and the captured video V1 to be transmitted to the camera 2 is generated.
In step S204 in FIG. 30, the AR system 5 performs the process of step S330 in FIG. 48.
In step S330, the AR system 5 synthesizes the view frustum 40 with the CG data as the overhead image V3-1.
As a result, video data for the overhead view video V3-1 is generated.
Then, in step S205 of FIG. 30, the video data of the overhead view V3-2 and the shot video V1 are transmitted to the camera 2, and the video data of the overhead view V3-1 is transmitted to the GUI device 11.
This allows the cameraman to visually recognize the marker frustum 40M and the marker 55M in accordance with the marking registration operation.
From the director's perspective, by not displaying the marker frustum 40M and marker 55M, the overhead image V3-1 does not become unnecessarily cluttered.
[4-5: Examples of various displays]
As yet another example, a display example of appropriate overhead views V3-1 and V3-2 on the director's side and cameraman's side, respectively, will be described.
FIG. 49A shows an example in which an overhead image V3-1 is displayed as the device display image 51 of the GUI device 11, and FIG. 49B shows an example in which an overhead image V3-2 is simultaneously displayed as the viewfinder display image 50 of the camera 2.
In the overhead image V3-1 of FIG. 49A, the view frustums 40a, 40b, and 40c of the cameras 2 are displayed in a similar manner, for example, in semi-transparent white.
In the overhead image V3-2 of FIG. 49B, on the camera 2 corresponding to the view frustum 40b, that view frustum 40b is highlighted in, for example, semi-transparent red, while the view frustums 40a and 40c of the other cameras 2 are each displayed in the usual semi-transparent white.
Although not shown, in the camera 2 corresponding to the view frustum 40a, that view frustum 40a is highlighted in, for example, a semi-transparent red, and the view frustums 40b and 40c of the other cameras 2 are each displayed in the normal semi-transparent white.
In addition, in the camera 2 corresponding to the view frustum 40c, that view frustum 40c is highlighted in, for example, a semi-transparent red, and the view frustums 40a, 40b of the other cameras 2 are each displayed in a normal semi-transparent white.
By doing this, the director can check the view frustum 40 of each camera 2 evenly, and the cameraman can easily check the view frustum 40 of the camera 2 he is operating.
FIG. 50A shows an example in which an overhead image V3-1 is displayed as the device display image 51 of the GUI device 11, and FIG. 50B shows an example in which an overhead image V3-2 is simultaneously displayed as the viewfinder display image 50 of the camera 2.
In the overhead image V3-1 of FIG. 50A, the view frustums 40a, 40b, and 40c of the cameras 2 are displayed in the same manner, for example, in semi-transparent white. By setting the viewpoint position at a relatively high position in the CG space 30 corresponding to the target space 8, the image is easy to see overall.
In the overhead image V3-2 of FIG. 50B, on the camera 2 corresponding to the view frustum 40b, the view frustum 40b is highlighted in, for example, semi-transparent red, and the view frustums 40a and 40c of the other cameras 2 are each displayed in the normal semi-transparent white. Furthermore, the viewpoint position is set to the position of the camera 2 corresponding to the view frustum 40b.
Although not shown, in the overhead image V3-2 displayed by the camera 2 corresponding to the view frustum 40a, that view frustum 40a is highlighted, for example, in a semi-transparent red color, and the view frustums 40b, 40c of the other cameras 2 are each displayed in a normal semi-transparent white color, and the viewpoint position is set to the position of the camera 2 of the view frustum 40a.
Similarly, the overhead view image V3-2 of the camera 2 corresponding to the view frustum 40c also has its own view frustum 40 highlighted, and the viewpoint position is the position of the camera 2 of the view frustum 40c.
In this way, the director can check the view frustum 40 of each camera 2 evenly, and the cameraman can check the view frustum 40 of the camera 2 he is operating from a viewpoint similar to his own.
FIG. 51 shows an example in which an overhead image V3-1 is displayed as the device display image 51 of the GUI device 11. In this case, two overhead images are composited and displayed as overhead images V3-1a and V3-1b. The overhead image V3-1a is an image from a viewpoint diagonally above the match venue, and the overhead image V3-1b is an image from a viewpoint directly above.
The director needs to grasp the situation of all the cameras as a whole, so it is suitable to display multiple overhead images V3-1 from different viewpoints.
An example of processing for displaying each of the above examples will be described.
FIG. 52 shows a specific example of steps S201, S202, S203, and S204 in FIG. 30.
As step S201 in FIG. 30, the AR system 5 performs the process of step S410 in FIG. 52. In step S410, the AR system 5 generates image data of the view frustum 40 for the cameraman based on the metadata MT. In this case, the image data is generated in a state in which the view frustum 40 corresponding to the camera 2 to be processed is highlighted.
In step S202 of FIG. 30, the AR system 5 generates the view frustums 40 for the director in step S420 of FIG. 52. In this case, image data of the view frustums 40 corresponding to all the cameras 2 is generated in the same display manner.
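Steps S410 and S420 differ only in how each view frustum is styled for the intended viewer. A minimal sketch of that selection is shown below; the color strings are illustrative assumptions.

def frustum_styles(all_camera_ids, viewer):
    # viewer: "director" or the id of the camera 2 whose cameraman will view V3-2.
    styles = {}
    for cam_id in all_camera_ids:
        if viewer != "director" and cam_id == viewer:
            styles[cam_id] = "red, semi-transparent (own frustum highlighted)"
        else:
            styles[cam_id] = "white, semi-transparent (normal)"
    return styles

print(frustum_styles([1, 2, 3], viewer=2))            # cameraman of camera 2 (step S410)
print(frustum_styles([1, 2, 3], viewer="director"))   # all frustums uniform (step S420)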
In step S203 in FIG. 30, the AR system 5 performs the processes of steps S430 and S431 in FIG. 52.
In step S430, the AR system 5 sets the arrangement of the image data of the view frustum 40 within the 3D coordinate space of the overhead image V3-2.
In step S431, the AR system 5 generates video data as an overhead image V3-2, with the position of the target camera 2 in the 3D coordinate space set as the viewpoint position.
In this manner, the video data of the overhead video V3-2 to be transmitted to the camera 2 is generated.
In step S204 in FIG. 30, the AR system 5 performs the processes of steps S440, S441, and S442 in FIG. 52.
In step S440, the AR system 5 synthesizes the view frustum 40 with the CG data as the overhead view image V3-1a.
In step S441, the AR system 5 synthesizes the view frustum 40 with the CG data as the overhead view image V3-1b.
In step S442, the AR system 5 generates video data that combines the overhead view image V3-1a and the overhead view image V3-1b on one screen. This generates the video data of the overhead view image V3-1 to be sent to the GUI device 11.
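Steps S440 to S442 simply render the same CG space from two viewpoints and place the results in one frame for the GUI device 11. The following minimal sketch uses numpy arrays as stand-ins for the rendered images; the side-by-side layout is an assumption.

import numpy as np

def compose_director_view(v3_1a: np.ndarray, v3_1b: np.ndarray) -> np.ndarray:
    # Step S442: combine the diagonal-viewpoint and top-down-viewpoint overhead images
    # into one frame for the GUI device 11 (simple side-by-side layout, assumed here).
    h = min(v3_1a.shape[0], v3_1b.shape[0])
    return np.hstack([v3_1a[:h], v3_1b[:h]])

frame_a = np.zeros((720, 640, 3), dtype=np.uint8)     # overhead image V3-1a (step S440)
frame_b = np.zeros((720, 640, 3), dtype=np.uint8)     # overhead image V3-1b (step S441)
print(compose_director_view(frame_a, frame_b).shape)  # (720, 1280, 3)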
Thereafter, in step S205 of FIG. 30, the video data of the overhead view V3-2 is transmitted to the camera 2, and the video data of the overhead view V3-1 is transmitted to the GUI device 11.
This allows the cameraman to view, for example, the overhead image V3-2 as shown in FIG. 50B, and the director to view, for example, the overhead images V3-1a and V3-1b as shown in FIG. 51.
In each of the examples described above in Fig. 28 to Fig. 52, the captured image V1 may be displayed together with the view frustum 40 as described in Fig. 9 to Fig. 27. In other words, the examples described in the embodiments can be implemented in a composite manner.

<5. Summary and Modifications>
According to the above embodiment, the following effects can be obtained.
In one embodiment, for example, an information processing device 70 as an AR system 5 is equipped with an image processing unit 71a that generates image data for simultaneously displaying an overhead image V3 of the target space 8, a view frustum 40 (shooting range presentation image) that presents the shooting range of the camera 2 within the overhead image V3, and the captured image V1 of the camera 2 on one screen (see Figures 7 and 19).
By displaying the view frustum 40 of the camera 2 in the overhead image V3 as the CG space 30 and simultaneously displaying the captured image V1, the viewer can easily grasp the correspondence between the image of the camera 2 and the position in space.
In the embodiment, an example has been given in which the video processing unit 71a generates video data that causes the captured video V1 to be displayed within the view frustum 40 (see FIGS. 9 to 14).
In other words, the video processing unit 71a generates video data in which the captured image V1 is arranged within the range of the shooting range presentation image (the view frustum 40). To put it still another way, it generates video data in which the captured image V1 is displayed in a state of being arranged within the range of the shooting range presentation image (the view frustum 40).
By displaying the captured image V1 within the view frustum 40, the relationship between the view frustum 40 and the image captured by the camera 2 corresponding to the view frustum 40 becomes extremely easy for the viewer to understand.
In the embodiment, an example has been given in which the image processing unit 71a generates image data in which the captured image V1 is displayed at a position within the depth of field range shown on the view frustum 40 (see Figures 9 and 10).
The depth of field range 42 is displayed within the view frustum 40, and the captured image V1 is displayed inside the display of the depth of field range 42. This causes the captured image V1 to be displayed at a position close to the actual position of the subject within the overhead image V3. Therefore, the viewer can easily grasp the relationship between the shooting range of the view frustum 40, the actual captured image V1, and the position of the captured subject.
In the embodiment, an example has been given in which the video processing unit 71a generates video data in which the captured video V1 is displayed on the focus plane 41 shown on the view frustum 40 (see FIG. 9).
A focus plane 41 is displayed within the view frustum 40, and the captured image V1 is displayed on the focus plane 41. This allows the viewer to easily confirm the focus position of the camera 2 and the image of the subject at that position.
In addition, in the embodiment, an example was given in which the image processing unit 71a generates image data in which the captured image V1 is displayed farther away than the depth of field range 42 when viewed from the frustum starting point 46 (see Figures 12 to 14).
The view frustum 40 is an image that spreads in a quadrangular pyramid shape, and the area of the cross section increases as it goes farther. Therefore, by displaying the captured image V1 on or near the frustum far end surface 45, it is possible to display the captured image V1 relatively large within the view frustum 40. This is suitable, for example, when the contents of the captured image V1 are to be confirmed.
In addition, in the embodiment, an example is given in which the image processing unit 71a generates image data in which the captured image V1 is displayed at a position closer to the frustum starting point 46 (the surface 47 near the frustum starting point) than the depth of field range 42 shown on the view frustum 40 (see FIG. 11).
For example, when it is desired to check the depth of field range 42 or the focus plane 41 in the view frustum 40, or when it is difficult to display the image on the far end surface 45 of the frustum, it is preferable to display the captured image V1 at a position close to the frustum starting point 46.
In the embodiment, an example has been given in which an image generation control unit 71b is provided that controls the generation of image data by variably setting the display position of the captured image V1, which is simultaneously displayed on one screen together with the overhead image V3 and the view frustum 40 (see Figures 7, 23, and 24).
For example, the display position of the captured image V1 is set as any position inside the view frustum 40 or any position outside the view frustum 40. By setting an appropriate position, it is possible to make it easier for the viewer to grasp the captured image V1, and to prevent the view frustum 40 and the captured image V1 from interfering with each other.
In the embodiment, an example has been given in which the image generation control unit 71b determines whether to change the display position of the captured video V1, and changes the setting of the display position of the captured video V1 in accordance with the determination result (see FIG. 24).
For example, a change determination is performed so that the display position of the captured image V1 is automatically changed to an appropriate position, whereby the view frustum 40 and the captured image V1 are displayed in an appropriate positional relationship for the viewer, for example, a positional relationship that provides good visibility or a positional relationship that makes it easy to understand the correspondence relationship.
In the embodiment, an example has been given in which the image generation control unit 71b determines whether or not it is necessary to change the display position of the captured image V1 based on the positional relationship between the view frustum 40 and the object represented in the overhead image V3 (see steps S160 and P1 in Figure 24).
For example, when the far end side of the view frustum 40 is embedded in the ground GR or a structure CN in the overhead image V3, the image may become unnatural or may not be displayed at all when displayed on the frustum far end surface 45. In such a case, the image generation control unit 71b determines that the position setting needs to be changed and changes the position setting of the captured image V1. This makes it possible to automatically provide an easily viewable captured image V1.
In the embodiment, the image generation control unit 71b judges whether or not the display position of the captured image V1 needs to be changed based on the angle determined by the direction from the viewpoint of the entire overhead image V3 and the axial direction of the view frustum 40 (see steps S160 and P2 in FIG. 24). That is, it is the angle between the normal direction on the display screen when viewed from the line of sight direction from the viewpoint set for the overhead image V3 at a certain point in time, and the axial direction of the displayed view frustum 40. As described above, the axial direction of the view frustum 40 is the direction of a perpendicular line drawn from the frustum starting point 46 to the frustum far end surface 45.
The size and direction of the rendered view frustum 40 change according to the angle of view and shooting direction of the camera 2. Depending on the angle of the view frustum 40 in the overhead image V3, it may not be possible to secure a sufficient surface area within the view frustum 40 for displaying the captured image V1. In that case, even if the captured image V1 is displayed, it is difficult for the viewer to confirm the content. Therefore, the image generation control unit 71b determines that the position setting needs to be changed according to the angle of the view frustum 40, and changes the position setting of the captured image V1. This makes it possible to automatically provide the captured image V1 in an easy-to-view state.
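As one possible realization of the angle-based determination described above, the angle between the viewing direction of the overhead image V3 and the axial direction of the view frustum 40 can be computed, and the captured image V1 relocated when the internal planes of the frustum would be seen almost edge-on. The threshold in the sketch below is an assumption.

import math

def needs_reposition(view_dir, frustum_axis, threshold_deg=70.0):
    # view_dir: viewing direction of the overhead image V3; frustum_axis: direction from
    # the frustum starting point 46 toward the frustum far end surface 45.
    dot = sum(a * b for a, b in zip(view_dir, frustum_axis))
    norm = math.sqrt(sum(a * a for a in view_dir)) * math.sqrt(sum(b * b for b in frustum_axis))
    angle = math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
    angle = min(angle, 180.0 - angle)      # treat opposite directions the same
    return angle > threshold_deg           # True: move the captured image V1 elsewhere

print(needs_reposition((0.0, -1.0, 0.0), (1.0, 0.0, 0.0)))   # frustum seen almost side-on -> True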
In the embodiment, an example has been given in which the image generation control unit 71b determines whether or not the display position of the captured video V1 needs to be changed based on a change of viewpoint within the overhead video V3 (see steps S160 and P3 in FIG. 24).
For example, changing the viewpoint of the overhead image V3 changes the direction, size, angle, etc. of the view frustum 40. When the viewpoint of the overhead image V3 is changed, the image generation control unit 71b judges whether the display of the captured image V1 up to that point is appropriate, and changes the settings if necessary. This makes it possible to provide the captured image V1 in a state that is always easy to view, even if the viewer arbitrarily changes the overhead image V3.
In the embodiment, an example has been given in which the image generation control unit 71b uses type information of the camera 2 capturing the captured video V1 to set the change destination of the display position of the captured video (see step S163 in FIG. 24).
For example, the change destination of the display position of the captured image V1 is set depending on whether the camera 2 is a fixed type using a tripod 6 or a mobile type. This makes it possible to set a position according to the fixed type camera 2F and the mobile type camera 2M. In particular, in the case of the mobile camera 2M, the view frustum 40 frequently changes, so that an easy-to-view display can be provided by displaying the captured image V1 at a position that is less affected by the change in the view frustum 40.
In the embodiment, an example has been given in which the image generation control unit 71b changes the setting of the display position of the captured video V1 in response to a user operation (see FIG. 23).
The user, who is the viewer, can arbitrarily switch the display position of the captured image V1, thereby allowing the captured image V1 to be displayed at a position that suits the viewer's ease of viewing and purpose.
In the embodiment, an example has been given in which the image generation control unit 71b changes the display position of the captured image V1 within the view frustum 40 (see FIGS. 23 and 24).
For example, within the view frustum 40, switching is performed among the focus plane 41, the frustum far end plane 45, the plane on the frustum starting point 46 side, the plane within the depth of field, etc. This allows the captured image V1 to be displayed at an appropriate position while clarifying the correspondence between the view frustum 40 and the captured image V1.
In the embodiment, an example has been given in which the image generation control unit 71b changes the display position of the captured image V1 between positions inside and outside the view frustum 40 (see FIGS. 23 and 24).
For example, the display position of the captured image V1 is changed within the view frustum 40, such as the focus plane 41, the frustum far end plane 45, the plane on the frustum starting point 46 side, and the plane within the depth of field range, or further, at a position outside the view frustum 40, such as near the camera, in the corner of the screen, or near the focus plane 41. This makes it possible to widely select the display position of the captured image V1 according to the state of the overhead image V3 and the view frustum 40.
In the embodiment, an example is given in which the video processing unit 71a generates video data that simultaneously displays an overhead image V3, each view frustum 40 for each of the multiple cameras 2, and each captured image V1 for each of the multiple cameras 2 on a single screen (see Figures 16, 17, and 27).
The view frustum 40 and the captured images V1 of the multiple cameras 2 are displayed in the CG space 30 represented by the overhead image V3. This allows the viewer to easily understand the relationship between the shooting ranges of the cameras 2. This is convenient for a director, for example, to check the contents of the images captured by each camera 2.
The view frustum 40 is given as an example of a shooting range presentation image, and its shape is a quadrangular pyramid, but it is not limited to this. For example, it may be an image in which multiple rectangular outlines of a quadrangular pyramid cross section are arranged, or an image in which the outline of a quadrangular pyramid is expressed by a dashed line. It is also not necessarily limited to a quadrangular pyramid, and it may be a cone shape, etc.
Alternatively, the shooting range presentation image may display only the focus plane 41 or only the depth of field range 42 .
In addition, the information processing device 70 as, for example, the AR system 5 in the embodiment is equipped with a video processing unit 71a that performs in parallel a process of generating first video data that displays the view frustum 40 (shooting range presentation image) of the camera 2 within the shooting target space 8, and a process of generating second video data that displays an image showing the view frustum 40 within the shooting target space 8 in a display mode different from that of the first video data.
In particular, the first video data and the second video data are the video data of the overhead video V3-1 transmitted to the GUI device 11 and the video data of the overhead video V3-2 transmitted to the camera 2 in the embodiment.
By displaying the view frustum 40 of the camera 2 within the overhead image V3 as the CG space 30, the viewer can easily grasp the correspondence between the image of the camera 2 and the position in the space. By generating video data with different display modes according to the role of each viewer for the overhead image V3 including the view frustum 40, it is possible to present information suited to each viewer through the video display.
In the embodiment, of the video data of the overhead images V3-1 and V3-2, one is video data of an image viewed by a video production instructor, and the other is video data of an image viewed by an operator who performs the shooting operation of the camera 2 with respect to the target space 8.
For example, the overhead image V3-1 has content intended for viewing by a video production instructor such as a director on the GUI device 11, and the overhead image V3-2 has video content intended for viewing by a shooting operator such as a cameraman. By displaying the overhead images V3-1 and V3-2 with different video content for the director and the cameraman in this way, it becomes possible to present information suitable for video production instructions and shooting operations, respectively.
In this case, the video production instructor refers to staff involved in video production, such as a director or a switcher engineer, other than the shooting operator. The shooting operator refers to a cameraman who directly operates the camera 2 or a staff member who remotely operates the camera 2.
In the embodiment, at least one of the video data of the overhead images V3-1 and V3-2 is video data that displays an image including a plurality of view frustums 40 corresponding to a plurality of cameras 2, respectively.
For example, one or both of the overhead images V3-1, V3-2 display view frustums 40 for multiple cameras 2. By displaying multiple view frustums 40, the director, cameraman, etc. can easily grasp the positional relationship of each camera 2 and the subject.
For the overhead image V3-1 viewed by a director or the like, a view frustum 40 is displayed for multiple cameras 2, allowing the director or the like to give various instructions and select main line images while recognizing the position and direction of the subject of each camera 2.
Regarding the overhead view image V3-2 viewed by the cameraman, the view frustum 40 is displayed for the plurality of cameras 2, so that the cameraman can perform shooting operations while taking into consideration the relationship with the other cameras 2.
For the overhead view image V3-2 viewed by the cameraman, only the view frustum 40 may be displayed for his/her own camera 2. In this way, the cameraman can easily grasp the position of the subject in the image V1 captured by his/her own camera operation within the whole image.
Furthermore, in the overhead image V3-2 viewed by the cameraman, only the view frustum 40 of the camera 2 of the other cameraman may be displayed. In this way, the cameraman can operate his own camera while recognizing the shooting locations and subjects of the other camera 2.
In the embodiment, an example is given in which the video processing unit 71a generates video data as at least one of the video data for the overhead images V3-1, V3-2, which displays an image in which a portion of a plurality of view frustums 40 corresponding to a plurality of cameras 2 is displayed in a different manner from the other view frustums 40.
That is, when a plurality of view frustums 40 are displayed, some of them are displayed in a different manner from the other view frustums 40. This makes it possible to realize a display in which a specific view frustum 40 has meaning when displaying a plurality of view frustums 40.
In the embodiment, an example is given in which the video processing unit 71a generates video data that displays an image in which a portion of a plurality of view frustums 40 corresponding to a plurality of cameras 2 is highlighted as at least one of the video data for the overhead images V3-1, V3-2.
When a plurality of view frustums 40 are displayed, a particular view frustum 40 can be clearly identified by displaying some of the view frustums 40 in a more emphasized manner than the other view frustums 40 .
Examples of highlighting include a display with increased brightness, a display using a conspicuous color, a display with emphasized contours, a blinking display, and the like.
In the embodiment, an example is given in which the video processing unit 71a generates video data that displays, as an overhead image V3-1, an image in which the view frustum 40 of a specific camera, which is a camera 2 among multiple cameras 2 that contains a subject of interest in the captured image V1, is displayed in a different manner from the other view frustums 40 (see Figures 28 to 32).
By clearly indicating the view frustum 40 of the camera 2 selected from among the cameras 2 capturing the target subject, it is easy for the director to know which camera is appropriate when he wants to use the image of the target subject as the main line image. It is also easy for the director to understand the positional relationship between the camera 2 capturing the target subject and the shooting direction of the other cameras 2.
An example has also been given in which the specific camera whose view frustum 40 is highlighted is the camera 2 for which the screen occupancy rate of the target subject in the captured image V1 is the highest (see FIGS. 29, 30, and 31).
By clearly indicating the camera 2 showing the target subject most largely within the screen, the director can give instructions while grasping the status of the camera 2 mainly showing the target subject and the other cameras 2.
An example was also given in which the specific camera whose view frustum 40 is highlighted is the camera 2 with the longest continuous shooting time of the subject of interest in the captured image V1 (see FIG. 32).
By clearly indicating the camera 2 that is continuously filming the subject of interest, the director can grasp the status of the camera 2 that mainly films the subject of interest and other cameras 2 and give instructions accordingly.
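For the continuous-shooting-time criterion, one possible bookkeeping sketch is the following; the class and method names are assumptions for illustration:

```python
import time

class ContinuityTracker:
    """Tracks how long each camera 2 has continuously kept the subject of
    interest in frame; the longest run decides which frustum is highlighted."""

    def __init__(self):
        self._since = {}   # camera_id -> time the current run started

    def update(self, camera_id, subject_visible, now=None):
        now = time.monotonic() if now is None else now
        if subject_visible:
            self._since.setdefault(camera_id, now)   # start or keep the run
        else:
            self._since.pop(camera_id, None)         # the run is broken
        return self.longest(now)

    def longest(self, now):
        if not self._since:
            return None
        return max(self._since, key=lambda cid: now - self._since[cid])
```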
In the embodiment, an example was given in which the video processing unit 71a generates, as the video data for the overhead image V3-1, video data in which the view frustum 40 of the camera 2 among the multiple cameras 2 for which a specific operation by the camera operator has been detected is displayed in a mode different from the other view frustums 40 (see Figures 33 and 34).
By allowing the camera operator to send feedback to the director when a good shot is being captured, the director can more easily pick up on the camera operator's situation. In particular, it becomes easier to notice when a good scene is unexpectedly being captured.
In the embodiment, an example is given in which the video processing unit 71a generates video data as video data for the overhead video V3-1 in which, when the view frustums 40 of multiple cameras 2 overlap within the displayed image, the overlapping view frustums 40 are displayed in a different manner from the non-overlapping view frustums 40 (see Figures 35 and 36).
When multiple view frustums 40 overlap, multiple cameras 2 are pointed toward a common subject. Clearly indicating this to the director makes it easier to give instructions regarding the common subject. For example, it is suitable for instructing the cameras 2 to change their focus positions or angles of view, and it also presents information useful for switching the main line image.
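One rough way to decide that two view frustums 40 overlap for display purposes is a corner-containment test against the other frustum's bounding planes. This is only an approximate sketch (it misses edge-through-edge intersections, for which a full convex-polytope test would be needed), and the data representation is an assumption:

```python
import numpy as np

def point_in_frustum(point, planes):
    """planes: iterable of (normal, offset) with inward-pointing normals,
    so a point p is inside when dot(n, p) + offset >= 0 for every plane."""
    return all(np.dot(n, point) + d >= 0.0 for n, d in planes)

def frustums_overlap(corners_a, planes_a, corners_b, planes_b):
    """Approximate overlap test between two frustums given their eight
    corner points and six bounding planes each."""
    return (any(point_in_frustum(c, planes_b) for c in corners_a)
            or any(point_in_frustum(c, planes_a) for c in corners_b))
```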
In the embodiment, an example is given in which the video processing unit 71a generates video data that preferentially displays one of the overlapping view frustums 40 as at least one of the overhead images V3-1, V3-2 when the view frustums 40 of multiple cameras 2 overlap on the displayed image (see Figures 37 and 38).
When multiple view frustums 40 overlap, one view frustum 40 is preferentially displayed in the overlapping portion. For example, in the overlapping portion, the focus plane 41 and depth of field range 42 of only one view frustum 40 that has been set as the priority are displayed. By preventing the display of the focus plane 41 and depth of field range 42 from overlapping, the overhead view video V3 can be made easy to view without being cluttered.
In addition, in the overlapping portion, it is possible to increase the brightness of only one view frustum 40 that has been set as a priority, or to give it a conspicuous color. Furthermore, the above-mentioned highlighted display may be performed. In the overlapping portion, only the view frustum 40 that has been set as a priority may be displayed. These also make it easier to view the overhead image V3 including multiple view frustum 40.
As a specific example, in the overhead image V3-1 viewed by the director, the view frustum 40 of the camera 2 whose image is used as the main line image is displayed with priority, while in the overhead image V3-2 viewed by the camera operator, no particular priority is set.
In addition, there is an example in which no particular priority setting is made in the overhead image V3-1 viewed by the director, but in the overhead image V3-2 viewed by the cameraman, the view frustum 40 of the camera 2 that he operates is displayed with priority.
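A sketch of how such role-dependent priority could be resolved when frustums overlap; the role names and arguments are assumptions for illustration:

```python
def priority_frustum(view_role, own_camera_id, main_line_camera_id,
                     overlapping_camera_ids):
    """Decide which overlapping view frustum 40 keeps its focus plane 41 and
    depth of field range 42 in the overlap region.
    view_role is "director" for V3-1 and "operator" for V3-2."""
    if view_role == "director" and main_line_camera_id in overlapping_camera_ids:
        return main_line_camera_id   # director's view: main line camera wins
    if view_role == "operator" and own_camera_id in overlapping_camera_ids:
        return own_camera_id         # operator's view: own camera wins
    return None                      # no particular priority is applied
```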
In the embodiment, an example has been given in which video processing unit 71a generates video data for displaying, as overhead images V3-1 and V3-2, images including instruction images in different display modes (see FIGS. 39 to 45).
For example, when a director gives instructions by operating the view frustum 40 on the screen, the instruction contents can be confirmed by the instruction frustum 40DR. On the cameraman side, the instruction frustum 40DR is displayed on the screen, so that the cameraman can visually understand the instruction contents. In this case, the overhead images V3-1 and V3-2 are displayed in a way that is appropriate for each role, so that the shooting can proceed smoothly.
In the embodiment, an example is given in which the video processing unit 71a sets the video data of the overhead image V3-1 as video data that displays instruction images for multiple cameras 2, and sets the video data of the overhead image V3-2 as video data that displays instruction images for a specific camera 2 among the multiple cameras (see Figures 39, 41, and 42).
This allows the director to grasp the instructions for each camera, while each camera operator can easily recognize the instructions because only those directed to them are displayed.
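A minimal sketch of this per-viewer filtering, assuming each instruction records its target camera (the dictionary keys are illustrative, not from the embodiment):

```python
def instructions_for_viewer(all_instructions, view_role, own_camera_id=None):
    """all_instructions: list of dicts such as
    {"target_camera": 3, "frustum": ...} issued by the director.
    The director's overhead image V3-1 shows every instruction, while a
    camera operator's V3-2 shows only the instructions addressed to them."""
    if view_role == "director":
        return list(all_instructions)
    return [ins for ins in all_instructions
            if ins["target_camera"] == own_camera_id]
```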
In the embodiment, an example was given in which the video processing unit 71a sets the video data of the overhead image V3-2 as video data that displays the instruction image within an image from a viewpoint corresponding to the position of a specific camera 2 among the multiple cameras (see Figures 42 and 43).
For the camera operator, the instruction frustum 40DR is displayed within the overhead image V3-2 rendered from their own viewpoint, so the indicated direction is easy to understand from what they are already seeing.
In the embodiment, an example has been given in which the video processing unit 71a generates video data for the overhead video V3-2 that displays the current view frustum 40 and a marker image in the shooting direction based on the marking operation (see Figures 46 to 48).
In response to the cameraman performing the marking operation, the bird's-eye view image V3-2 including the marker images of the marker frustum 40M, the marker 55M, etc. is displayed. This allows the cameraman to mark the shooting position or subject that he or she has set, which is convenient for taking pictures of that position at the appropriate time.
Furthermore, by not displaying such a marker image on the director's overhead image V3-1, it is possible to prevent the overhead image V3-1 from becoming unnecessarily cluttered.
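As an illustrative sketch of this marking behavior (the classes and fields are assumptions), the marker could simply store the camera pose and angle of view at the moment of the marking operation and be returned only for that operator's overhead image:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class MarkerFrustum:
    camera_id: int
    position: Tuple[float, float, float]   # camera position when marked
    direction: Tuple[float, float, float]  # shooting direction when marked
    zoom_angle_deg: float                  # angle of view when marked

@dataclass
class MarkerStore:
    markers: List[MarkerFrustum] = field(default_factory=list)

    def mark(self, camera_id, position, direction, zoom_angle_deg):
        """Called when the operator performs the marking operation."""
        self.markers.append(
            MarkerFrustum(camera_id, position, direction, zoom_angle_deg))

    def markers_for_view(self, view_role: str,
                         own_camera_id: Optional[int] = None):
        """Marker images appear only in the operator's own V3-2; the
        director's V3-1 is kept uncluttered."""
        if view_role == "director":
            return []
        return [m for m in self.markers if m.camera_id == own_camera_id]
```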
In the embodiment, an example is given in which the video processing unit 71a generates video data as the video data for the overhead video V3-2, which displays an overhead video from a viewpoint corresponding to the position of a specific camera 2 among multiple cameras, and generates video data as the video data for the overhead video V3-1, which displays an overhead video from a different viewpoint (see Figures 49 to 52).
For the cameraman, the bird's-eye view V3-2 is displayed from the same viewpoint as his/her own viewpoint, making it easy to recognize the overall situation and his/her own shooting direction. For the director, the bird's-eye view V3-1 is displayed from a viewpoint that makes it easy to grasp the whole picture, rather than from the viewpoint of a specific cameraman, making it ideal for directing the entire shoot.
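A sketch of how the rendering viewpoint of the CG overhead image could be chosen per viewer; the pose representation and default viewpoint are assumptions for illustration:

```python
def overhead_viewpoint(view_role, camera_poses, own_camera_id,
                       global_overview_pose):
    """camera_poses: {camera_id: (position, orientation)} of the real cameras.
    The operator's V3-2 is rendered from a viewpoint equivalent to their own
    camera, while the director's V3-1 uses a viewpoint overlooking the whole
    shooting target space."""
    if view_role == "operator" and own_camera_id in camera_poses:
        return camera_poses[own_camera_id]
    return global_overview_pose
```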
In the embodiment, an example has been given in which the video processing unit 71a generates video data for displaying a plurality of overhead views V3-1a, V3-1b from a plurality of viewpoints as the video data for the overhead view V3-1 (see FIGS. 51 and 52).
Since the director needs to understand the shooting conditions of each camera 2, an overhead image V3-1 that provides an overall bird's-eye view from a plurality of viewpoints as shown in FIG. 51 is extremely useful.
In the embodiment, an example has been given in which the video processing unit 71a generates the overhead view video V3 as a virtual video using CG.
This makes it possible to generate an overhead image V3 from any viewpoint, and to display the view frustum 40 and the captured image V1 from a variety of viewpoints.
In the embodiment, the view frustum 40 presents the shooting direction and angle of view at the time of shooting in real time, but a past view frustum 40, for example one from a prior simulation of camera work, may also be displayed.
For example, the current view frustum 40 at the time of shooting and the past view frustum 40 may be displayed at the same time for comparison.
In such a case, it is advisable to make the past view frustum 40 different from the current view frustum 40 by increasing its transparency, for example, so that the cameraman or the like can distinguish between them.
The program of the embodiment is a program that causes a processor such as a CPU or DSP, or a device including these, to execute the processes shown in Figures 20, 21, 22, 23, and 24 described above. That is, the program of the embodiment is a program that causes the information processing device 70 to execute a process of generating video data that simultaneously displays, on one screen, an overhead image V3 of the space to be photographed, a view frustum 40 (shooting range presentation image) that presents the shooting range of the camera 2 within the overhead image V3, and the captured image V1 of the camera 2.
The program of the embodiment is also a program that causes a processor such as a CPU or DSP, or a device including these, to execute the processes shown in Figures 30, 31, 32, 34, 36, 38, 41, 43, 45, 48, and 52 described above. That is, the program of the embodiment is a program that causes the information processing device 70 to execute in parallel a process of generating first video data that displays a view frustum 40 (shooting range presentation image) that presents the shooting range of the camera 2 within the shooting target space, and a process of generating second video data that displays the view frustum 40 within the shooting target space in a display mode different from that of the image generated by the first video data.
These programs allow an information processing device 70 that operates like the AR system 5 described above to be realized using various computer devices.
Such a program can be recorded in advance in an HDD as a recording medium built into a device such as a computer, or in a ROM in a microcomputer having a CPU. Alternatively, such a program can be temporarily or permanently stored (recorded) in a removable recording medium such as a flexible disk, a CD-ROM (Compact Disc Read Only Memory), an MO (Magneto Optical) disk, a DVD (Digital Versatile Disc), a Blu-ray Disc (registered trademark), a magnetic disk, a semiconductor memory, or a memory card. Such a removable recording medium can be provided as so-called packaged software.
Such a program can be installed in a personal computer or the like from a removable recording medium, or can be downloaded from a download site via a network such as a LAN (Local Area Network) or the Internet.
Furthermore, such a program is suitable for the widespread provision of the information processing device 70 of the embodiment. For example, by downloading the program to personal computers, communication devices, mobile terminal devices such as smartphones and tablets, mobile phones, game devices, video devices, PDAs (Personal Digital Assistants), etc., these devices can function as the information processing device 70 of the present disclosure.
Note that the effects described in this specification are merely examples and are not limiting, and other effects may also be present.
The present technology can also be configured as follows.
(1)
An information processing device comprising: an image processing unit that generates image data for simultaneously displaying an overhead image of a space to be photographed, a shooting range presentation image that presents the shooting range of a camera within the overhead image, and the image photographed by the camera on a single screen.
(2)
The information processing device according to (1), wherein the image processing unit generates image data in which the captured image is displayed within the shooting range presentation image.
(3)
The information processing device according to (1) or (2), wherein the image processing unit generates image data in which the captured image is displayed at a position within a depth of field range shown in the shooting range presentation image.
(4)
The information processing device according to any one of (1) to (3) above, wherein the image processing unit generates image data in which the captured image is displayed on a focus plane shown in the shooting range presentation image.
(5)
The information processing device according to (2) above, wherein the image processing unit generates image data in which the captured image is displayed farther away than a depth of field range as viewed from a starting point of the shooting range presentation image.
(6)
The information processing device described in (2) above, wherein the image processing unit generates image data in which the captured image is displayed at a position closer to an origin of the shooting range presentation image than a depth of field range shown in the shooting range presentation image.
(7)
The information processing device according to any one of (1) to (6) above, further comprising an image generation control unit that controls generation of image data by variably setting a display position of the captured image that is simultaneously displayed on one screen together with the overhead image and the shooting range presentation image.
(8)
The information processing device according to (7) above, wherein the image generation control unit determines whether to change a display position of the shot image, and changes a setting of the display position of the shot image according to a result of the determination.
(9)
The information processing device according to (8) above, wherein the image generation control unit, in the change determination, determines whether or not it is necessary to change the display position of the captured image based on a positional relationship between the shooting range presentation image and an object represented in the overhead image.
(10)
The information processing device described in (8) or (9) above, wherein, in the change determination, the image generation control unit determines whether or not it is necessary to change the display position of the captured image based on the angle between the direction from the viewpoint of the entire overhead image and the axial direction of the shooting range presentation image.
(11)
The information processing device according to (8) or (9), wherein the image generation control unit, in the change determination, determines whether or not a change is required for a display position of the captured image in accordance with a change in a viewpoint within the overhead image.
(12)
The information processing device according to any one of (7) to (10) above, wherein the image generation control unit uses type information of a camera that captures the captured image to set a destination of the captured image.
(13)
The information processing device according to any one of (7) to (12) above, wherein the image generation control unit changes a setting of a display position of the captured image in response to a user operation.
(14)
The information processing device according to any one of (7) to (13) above, wherein the image generation control unit changes a display position of the captured image within the shooting range presentation image.
(15)
The information processing device according to any one of (7) to (13), wherein the image generation control unit changes a display position of the captured image within the shooting range presentation image and outside the shooting range presentation image.
(16)
The information processing device described in any one of (1) to (15) above, wherein the image processing unit generates image data that simultaneously displays the overhead image, each of the shooting range presentation images for the multiple cameras, and each of the shot images for the multiple cameras on one screen.
(17)
The information processing device according to any one of (1) to (16) above, wherein the overhead image is generated by a virtual image.
(18)
An information processing method in which an information processing device executes a process of generating video data that simultaneously displays an overhead image of a space to be photographed, a shooting range presentation image that presents the camera's shooting range within the overhead image, and the image captured by the camera on a single screen.
(19)
A program that causes an information processing device to execute a process of generating video data that simultaneously displays an overhead image of a space to be photographed, a shooting range presentation image that presents the camera's shooting range within the overhead image, and the image captured by the camera on a single screen.
1, 1A Camera system
2 Camera
3 CCU
4 AI board
5 AR system
6 Tripod
8 Space to be photographed
10 Control panel
11 GUI device
12 Network hub
13 Switcher
14 Master monitor
30 CG space
35 Environment map
40, 40a, 40b, 40c View frustum
40DR Instruction frustum
40M1, 40M2, 40M Marker frustum
41 Focus plane
42 Depth of field range
43 Depth near end plane
44 Depth far end plane
45 Frustum far end plane
46 Frustum origin
47 Frustum origin vicinity plane
V1 Captured image
V2 AR superimposed image
V3 Overhead image
70 Information processing device
71 CPU
71a Image processing unit
71b Image generation control unit

Claims (19)

1. An information processing device comprising: an image processing unit that generates image data for simultaneously displaying an overhead image of a space to be photographed, a shooting range presentation image that presents the shooting range of a camera within the overhead image, and the image photographed by the camera on a single screen.

2. The information processing device according to claim 1, wherein the image processing unit generates image data in which the captured image is displayed within the shooting range presentation image.

3. The information processing device according to claim 1, wherein the image processing unit generates image data in which the captured image is displayed at a position within a depth of field range shown in the shooting range presentation image.

4. The information processing device according to claim 1, wherein the image processing unit generates image data in which the captured image is displayed on a focus plane indicated in the shooting range presentation image.

5. The information processing device according to claim 2, wherein the image processing unit generates image data in which the captured image is displayed farther away than a depth of field range when viewed from a starting point of the shooting range presentation image.

6. The information processing device according to claim 2, wherein the image processing unit generates image data in which the captured image is displayed at a position closer to an origin of the shooting range presentation image than a depth of field range shown in the shooting range presentation image.

7. The information processing device according to claim 1, further comprising an image generation control unit that controls generation of image data by variably setting a display position of the captured image that is simultaneously displayed on one screen together with the overhead image and the shooting range presentation image.

8. The information processing device according to claim 7, wherein the image generation control unit determines whether to change a display position of the captured image, and changes a setting of the display position of the captured image in accordance with a result of the determination.

9. The information processing device according to claim 8, wherein the image generation control unit, in the change determination, determines whether or not it is necessary to change the display position of the captured image based on a positional relationship between the shooting range presentation image and an object represented in the overhead image.

10. The information processing device according to claim 8, wherein the image generation control unit, in the change determination, determines whether or not it is necessary to change the display position of the captured image based on an angle between a direction from a viewpoint of the entire overhead image and an axial direction of the shooting range presentation image.

11. The information processing device according to claim 8, wherein the image generation control unit, in the change determination, determines whether or not it is necessary to change the display position of the captured image in response to a change in a viewpoint within the overhead image.

12. The information processing device according to claim 7, wherein the image generation control unit uses type information of a camera that captures the captured image to set a destination of the captured image.

13. The information processing device according to claim 7, wherein the image generation control unit changes a setting of a display position of the captured image in response to a user operation.

14. The information processing device according to claim 7, wherein the image generation control unit changes a display position of the captured image within the shooting range presentation image.

15. The information processing device according to claim 7, wherein the image generation control unit changes a display position of the captured image within the shooting range presentation image and outside the shooting range presentation image.

16. The information processing device according to claim 1, wherein the image processing unit generates image data for simultaneously displaying the overhead image, each of the shooting range presentation images for a plurality of cameras, and each of the shot images for the plurality of cameras on one screen.

17. The information processing device according to claim 1, wherein the overhead image is generated from a virtual image.

18. An information processing method in which an information processing device executes a process of generating video data that simultaneously displays, on a single screen, an overhead image of a space to be photographed, a shooting range presentation image that presents the camera's shooting range within the overhead image, and the image captured by the camera.

19. A program that causes an information processing device to execute a process of generating video data that simultaneously displays an overhead image of a space to be photographed, a shooting range presentation image that presents the camera's shooting range within the overhead image, and the image captured by the camera on a single screen.
PCT/JP2023/033687 2022-09-29 2023-09-15 Information processing device, information processing method, and program WO2024070761A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022157054 2022-09-29
JP2022-157054 2022-09-29

Publications (1)

Publication Number Publication Date
WO2024070761A1 true WO2024070761A1 (en) 2024-04-04

Family

ID=90477483

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/033687 WO2024070761A1 (en) 2022-09-29 2023-09-15 Information processing device, information processing method, and program

Country Status (1)

Country Link
WO (1) WO2024070761A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0548964A (en) * 1991-08-19 1993-02-26 Nippon Telegr & Teleph Corp <Ntt> Display method of video and its photographing information
JPH08251467A (en) * 1995-03-09 1996-09-27 Canon Inc Display device for camera information
JP2008005450A (en) * 2006-06-20 2008-01-10 Kubo Tex Corp Method of grasping and controlling real-time status of video camera utilizing three-dimensional virtual space
JP2008011433A (en) * 2006-06-30 2008-01-17 Canon Marketing Japan Inc Imaging system, imaging method thereof, image server, and image processing method thereof
JP2013030924A (en) * 2011-07-27 2013-02-07 Jvc Kenwood Corp Camera control device, camera control method, and camera control program


Similar Documents

Publication Publication Date Title
US9858643B2 (en) Image generating device, image generating method, and program
JP7042644B2 (en) Information processing equipment, image generation method and computer program
KR100990416B1 (en) Display apparatus, image processing apparatus and image processing method, imaging apparatus, and recording medium
JP7017175B2 (en) Information processing equipment, information processing method, program
US20110085017A1 (en) Video Conference
JP5861499B2 (en) Movie presentation device
US10681276B2 (en) Virtual reality video processing to compensate for movement of a camera during capture
US11627251B2 (en) Image processing apparatus and control method thereof, computer-readable storage medium
JP2019083402A (en) Image processing apparatus, image processing system, image processing method, and program
JP7378243B2 (en) Image generation device, image display device, and image processing method
JP2019121224A (en) Program, information processing device, and information processing method
US11847735B2 (en) Information processing apparatus, information processing method, and recording medium
WO2020166376A1 (en) Image processing device, image processing method, and program
US20230353717A1 (en) Image processing system, image processing method, and storage medium
KR102200115B1 (en) System for providing multi-view 360 angle vr contents
WO2020017600A1 (en) Display control device, display control method and program
WO2024070761A1 (en) Information processing device, information processing method, and program
WO2024070762A1 (en) Information processing device, information processing method, and program
CN111466113A (en) Apparatus and method for image capture
KR101263881B1 (en) System for controlling unmanned broadcasting
WO2024070763A1 (en) Information processing device, imaging system, information processing method, program
WO2023248832A1 (en) Remote viewing system and on-site imaging system
US20220086413A1 (en) Processing system, processing method and non-transitory computer-readable storage medium
JPH0937140A (en) Photographing simulator and photographing device
JP2023003765A (en) Image generation device and control method thereof, image generation system, and program