WO2024070761A1 - Information processing device, information processing method and program - Google Patents

Information processing device, information processing method and program

Info

Publication number
WO2024070761A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
camera
overhead
frustum
view
Prior art date
Application number
PCT/JP2023/033687
Other languages
English (en)
Japanese (ja)
Inventor
滉太 今枝
和平 岡田
大資 田原
慧 柿谷
Original Assignee
ソニーグループ株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ソニーグループ株式会社
Publication of WO2024070761A1

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/63Control of cameras or camera modules by using electronic viewfinders

Definitions

  • This technology relates to an information processing device, an information processing method, and a program, and in particular to the display of images of a target space and virtual images.
  • Japanese Patent Application Laid-Open No. 2003-233693 discloses a technique for displaying the depth of field and the angle of view based on shooting information.
  • Japanese Patent Application Laid-Open No. 2003-233633 discloses expressing the shooting range in a captured image using a trapezoidal figure.
  • Japanese Patent Laid-Open No. 2003-233633 discloses generating and displaying a map image for indicating the depth position and focus position of an object to be imaged.
  • This disclosure therefore proposes technology that displays images that make it easier to understand the correspondence between camera images and positions in space.
  • An information processing device related to the present technology includes an image processing unit that generates image data that simultaneously displays an overhead image of a space to be photographed, a shooting range presentation image that presents the camera's shooting range within the overhead image, and the image photographed by the camera on a single screen.
  • The shooting range presentation image is, for example, an image showing the shooting range determined by the shooting direction and zoom angle of view of the camera. When this image showing the shooting range of the camera is displayed in the overhead image, the image captured by the camera is also displayed on the same screen.
  • FIG. 1 is an explanatory diagram of photography by a photography system according to an embodiment of the present technology.
  • An explanatory diagram of AR (Augmented Reality) superimposed images.
  • FIG. 1 is an explanatory diagram of a system configuration according to an embodiment.
  • FIG. 11 is an explanatory diagram of another example of a system configuration according to the embodiment;
  • FIG. 2 is an explanatory diagram of an environment map according to the embodiment; 11A and 11B are diagrams illustrating drift correction of an environment map according to an embodiment.
  • FIG. 1 is a block diagram of an information processing apparatus according to an embodiment.
  • FIG. 2 is an explanatory diagram of a view frustum according to an embodiment.
  • FIG. 1 is an explanatory diagram of a display example of a captured image on a focus plane of a view frustum according to an embodiment
  • 1 is an explanatory diagram of a display example of a captured image within the depth of field of a view frustum according to an embodiment.
  • FIG. 11 is an explanatory diagram of a display example of a captured image at a position close to the starting point of a view frustum in the embodiment;
  • FIG. 1 is an explanatory diagram of an example of a display of a captured image on a far end surface of a view frustum according to an embodiment;
  • FIG. 13 is an explanatory diagram of a case where a view frustum according to an embodiment is set at infinity.
  • 11A to 11C are explanatory diagrams illustrating a change in the display state of a captured image on the far end side of a view frustum according to an embodiment.
  • 11A and 11B are explanatory diagrams of a display example of a captured image outside a view frustum according to an embodiment.
  • 1A to 1C are explanatory diagrams illustrating an example of display of captured images inside and outside a plurality of view frustums according to an embodiment.
  • 11A and 11B are explanatory diagrams of a display example of a captured image outside a view frustum according to an embodiment.
  • 11A and 11B are explanatory diagrams of a display example of a captured image outside a view frustum according to an embodiment.
  • 11 is a flowchart of a processing example of the information processing apparatus according to the embodiment.
  • 11 is a flowchart of an example of a process for setting a display position of a captured image according to an embodiment;
  • 11 is a flowchart of an example of a process for setting a display position of a captured image according to an embodiment;
  • 11 is a flowchart of an example of a process for setting a display position of a captured image according to an embodiment;
  • 11 is a flowchart of an example of a process for setting a display position of a captured image according to an embodiment;
  • 11 is a flowchart of an example of a process for setting a display position of a captured image according to an embodiment;
  • FIG. 4 is an explanatory diagram of a collision determination according to an embodiment.
  • FIG. 4 is an explanatory diagram of a collision determination according to an embodiment.
  • 11A and 11B are explanatory diagrams of changes in an overhead view image in the embodiment.
  • FIG. 13 is an explanatory diagram of an overhead view from the director's side in the embodiment.
  • 11A and 11B are diagrams illustrating a determination of an image to be highlighted according to an embodiment.
  • 11 is a flowchart of a processing example of the information processing apparatus according to the embodiment.
  • 11 is a flowchart of an example of a process for highlighting according to an embodiment.
  • 11 is a flowchart of an example of a process for highlighting according to an embodiment.
  • FIG. 11 is an explanatory diagram of a display example based on feedback according to the embodiment.
  • 11 is a flowchart of an example of a display process based on feedback according to an embodiment.
  • 11A and 11B are explanatory diagrams of a display example of overlapping view frustums according to an embodiment;
  • 11 is a flowchart of a processing example of displaying overlapped view frustum according to an embodiment.
  • FIG. 13 is an explanatory diagram of a preferred display of one view frustum according to an embodiment. 13 is a flowchart of a processing example when performing priority display according to the embodiment.
  • FIG. 13 is an explanatory diagram of an example of a display of the instruction frustum on the director side in the embodiment.
  • 11 is an explanatory diagram of an example of a display on the cameraman's side of an instruction frustum according to an embodiment.
  • FIG. 13 is a flowchart of a process for generating an overhead view video according to another embodiment.
  • 11 is an explanatory diagram of an example of a display on the cameraman's side of an instruction frustum according to an embodiment.
  • FIG. 11 is a flowchart of a process for generating an overhead video from a cameraman's side according to an embodiment.
  • 11 is an explanatory diagram of an example of instruction information displayed on the cameraman's side according to the embodiment;
  • FIG. 11 is a flowchart of a process for generating an overhead video from a cameraman's side according to an embodiment.
  • 11 is an explanatory diagram of a display example of a marker frustum according to an embodiment.
  • 11 is an explanatory diagram of a display example of a marker according to an embodiment.
  • 13 is a flowchart of a process example of displaying marker information according to an embodiment.
  • 11A and 11B are explanatory diagrams of a display example of a different overhead view image according to an embodiment.
  • 11A and 11B are explanatory diagrams of a display example of a different overhead view image according to an embodiment.
  • FIG. 13 is an explanatory diagram of a display example on the director side of the embodiment. 13 is a flowchart of a process for generating an overhead view video according to another embodiment.
  • 1. System configuration 2. Configuration of information processing device 3. Display of view frustum 4. Examples of cameraman and director screens [4-1: Highlighted display] [4-2: Priority display] [4-3: Instruction display] [4-4: Marker display] [4-5: Examples of various displays] 5. Summary and modifications
  • In this disclosure, "video" or "image" includes both moving images and still images, but the embodiments will be described taking moving images as an example.
  • FIG. 1 is a schematic diagram showing how an image is captured by the image capturing system.
  • FIG. 1 shows an example in which three cameras 2 are arranged to capture images of a real target space 8.
  • the number of cameras 2 is just an example, and one or more cameras 2 may be used.
  • the subject space 8 may be any location, but one example is a stadium for soccer, rugby, or the like.
  • Shown as one camera 2 is a mobile camera 2M that is suspended by a wire 9 and can move above the target space 8. Images and metadata captured by this mobile camera 2M are sent to a render node 7. Also shown as the camera 2 is a fixed camera 2F that is fixedly disposed on, for example, a tripod 6. Images and metadata captured by this fixed camera 2F are sent to a render node 7 via a CCU (Camera Control Unit) 3. In addition, the captured images and metadata from the mobile camera 2M may be sent to the render node 7 via the CCU 3.
  • Hereinafter, "camera 2" collectively refers to the fixed camera 2F and the mobile camera 2M.
  • the render node 7 referred to here refers to a CG engine or image processor that generates CG (Computer Graphics) and synthesizes it with live-action video, and is, for example, a device that generates AR video.
  • FIGS. 2A and 2B show examples of AR images.
  • In FIG. 2A, a line that does not actually exist is composited as a CG image 38 into live-action footage of a game being played in a stadium.
  • In FIG. 2B, an advertising logo that does not actually exist is composited as an image 38 into the live-action footage in the stadium.
  • These CG images 38 can be rendered to look like they exist in reality by appropriately setting the shape, size and synthesis position depending on the position of the camera 2 at the time of shooting, the shooting direction, the angle of view, the structural object photographed, etc.
  • the process of generating AR overlay images by combining CG with such live-action footage is already known.
  • the filming system of this embodiment also enables the cameraman and director involved in the video production to perform production tasks such as shooting and giving instructions while visually viewing the AR overlay image. This allows filming to be performed while checking the fusion state of the real scene and the virtual image, making it possible to produce videos that are in line with the creative intent.
  • a shooting range presentation image that is suitable for the viewer of the monitor image, such as the cameraman or director, is displayed.
  • Two configuration examples of the imaging system are shown in FIG. 3 and FIG. 4.
  • a camera system 1, 1A a control panel 10, a GUI (Graphical User Interface) device 11, a network hub 12, a switcher 13, and a master monitor 14 are shown.
  • the dashed arrows indicate the flow of various control signals CS, while the solid arrows indicate the flow of each of the image data of the shot image V1, the AR superimposed image V2, and the overhead image V3.
  • Camera system 1 is configured to perform AR linkage, while camera system 1A is configured not to perform AR linkage.
  • camera system 1A is configured not to perform AR linkage.
  • a mobile camera 2M may also be used as the camera system 1, 1A.
  • the camera system 1 includes a camera 2, a CCU 3, for example an AI (artificial intelligence) board 4 built into the CCU 3, and an AR system 5.
  • the camera 2 sends video data of the shot video V1 and metadata MT to the CCU 3.
  • the CCU 3 sends the video data of the shot video V1 to the switcher 13.
  • the CCU 3 also sends the video data of the shot video V1 and metadata MT to the AR system 5.
  • the metadata MT includes lens information including the zoom angle of view and focal length when the captured image V1 was captured, and sensor information such as the IMU (Inertial Measurement Unit) mounted on the camera 2. Specifically, this information includes the 3doF (Degree of Freedom) attitude information of the camera 2, acceleration information, lens focal length, aperture value, zoom angle of view, lens distortion, etc.
  • This metadata MT is output from the camera 2, for example, as frame-synchronized or asynchronous information.
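  • As a rough illustration of the kind of per-frame metadata described above, the metadata MT could be modeled as follows; the field names and types are hypothetical and not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class CameraMetadata:
    """Hypothetical container for the per-frame metadata MT of one camera 2."""
    yaw: float                                # 3DoF attitude of the camera, in radians
    pitch: float
    roll: float
    acceleration: Tuple[float, float, float]  # IMU acceleration (m/s^2)
    focal_length_mm: float
    aperture_f_number: float
    zoom_angle_of_view_deg: float             # horizontal angle of view at the current zoom
    lens_distortion: Tuple[float, ...]        # e.g. radial distortion coefficients
    frame_synchronized: bool = True           # MT may also be delivered asynchronously
```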
  • If the camera 2 is a fixed camera 2F, its position does not change, so the camera position information only needs to be stored as a known value by the CCU 3 and the AR system 5.
  • In the case of the mobile camera 2M, the position information is also included in the metadata MT transmitted successively from the camera 2M.
  • the AR system 5 is an information processing apparatus including a rendering engine that performs CG rendering.
  • The information processing apparatus as the AR system 5 is an example of the render node 7 shown in FIG. 1.
  • the AR system 5 generates video data of an AR superimposed video V2 by superimposing an image 38 generated by CG on a video V1 captured by the camera 2.
  • the AR system 5 sets the size and shape of the image 38 by referring to the metadata MT, and also sets the synthesis position within the captured video V1, thereby generating video data of an AR superimposed video V2 in which the image 38 is naturally synthesized with the actual scenery.
  • the AR system 5 also generates video data of a CG overhead image V3, as described later.
  • This video data is for the overhead image V3, which reproduces the target space 8 by CG.
  • the AR system 5 displays a view frustum 40 as shown in FIG. 8, which will be described later, in the overhead image V3 as a shooting range presentation image that visually presents the shooting range of the camera 2.
  • the AR system 5 calculates the shooting range in the shooting target space 8 from the metadata MT and position information of the camera 2.
  • the shooting range of the camera 2 can be obtained by acquiring the position information of the camera 2, the angle of view, and the attitude information (corresponding to the shooting direction) of the camera 2 in the three axial directions (yaw, pitch, roll) on the tripod 6.
  • the AR system 5 generates an image as a view frustum 40 in accordance with the calculation of the shooting range of the camera 2.
  • the AR system 5 generates image data of the overhead image V3 so that the view frustum 40 is presented from the position of the camera 2 in the overhead image V3 corresponding to the target space 8.
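  • A minimal sketch (not the actual implementation) of how the pyramid of the shooting range could be derived from the camera position, the yaw/pitch/roll attitude, and the zoom angle of view; the rotation convention and all names are assumptions.

```python
import numpy as np

def rotation_from_ypr(yaw: float, pitch: float, roll: float) -> np.ndarray:
    """Rotation matrix from yaw (about Z), pitch (about Y), roll (about X), in radians."""
    cz, sz = np.cos(yaw), np.sin(yaw)
    cy, sy = np.cos(pitch), np.sin(pitch)
    cx, sx = np.cos(roll), np.sin(roll)
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return Rz @ Ry @ Rx

def frustum_edge_rays(yaw, pitch, roll, h_fov_deg, aspect=16 / 9):
    """World-space directions of the four edge rays of the quadrangular-pyramid shooting range."""
    R = rotation_from_ypr(yaw, pitch, roll)
    half_w = np.tan(np.radians(h_fov_deg) / 2.0)   # horizontal half-extent at unit depth
    half_h = half_w / aspect                       # vertical half-extent (sensor aspect ratio)
    rays = []
    for sx in (-1.0, 1.0):
        for sy in (-1.0, 1.0):
            d = np.array([sx * half_w, sy * half_h, 1.0])  # camera looks along +Z here
            rays.append(R @ (d / np.linalg.norm(d)))
    return rays  # rays emanate from the camera position (the frustum origin)
```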
  • the term "bird's-eye view image” refers to an image from a bird's-eye view of the target space 8, but does not necessarily have to display the entire target space 8 within the image.
  • An image that includes at least a portion of the view frustum 40 of the camera 2 and the surrounding space is referred to as a bird's-eye view image.
  • the overhead image V3 is generated as an image expressing the shooting target space 8 such as a stadium by CG, but the overhead image V3 may be generated by real-life images.
  • a camera 2 may be provided with a viewpoint for the overhead image, and the image V1 shot by the camera 2 may be used to generate the overhead image V3.
  • the image V1 shot by a camera 2M moving in the sky on a wire 9 may be used as the overhead image V3. Furthermore, a 3D (three dimensions)-CG model of the shooting target space 8 may be generated using the images V1 shot by multiple cameras 2, and the viewpoint position may be set for the 3D-CG model and rendered to generate an overhead image V3 with a variable viewpoint position.
  • the video data of the AR superimposed image V2 and the overhead image V3 by the AR system 5 is supplied to a switcher 13. Furthermore, the image data of the AR superimposed image V2 and the overhead image V3 by the AR system 5 is supplied to the camera 2 via the CCU 3. This allows the cameraman of the camera 2 to visually recognize the AR superimposed image V2 and the overhead image V3 on a display unit such as a viewfinder.
  • the image data of the AR superimposed image V2 and the overhead image V3 by the AR system 5 may be supplied to the camera 2 without going through the CCU 3. Furthermore, there are also examples in which the CCU 3 is not used in the camera systems 1 and 1A.
  • the AI board 4 in the CCU 3 performs processing to calculate the amount of drift of the camera 2 from the captured image V1 and metadata MT.
  • The positional displacement of the camera 2 is obtained by integrating the acceleration information from the IMU mounted on the camera 2 twice.
  • By accumulating the amount of displacement at each time point from a certain reference origin attitude (the attitude that serves as the reference for each of the three axes of yaw, pitch, and roll), the position on the three axes of yaw, pitch, and roll at each time point, that is, attitude information corresponding to the shooting direction of the camera 2, is obtained.
  • repeated accumulation increases the deviation (accumulated error) between the actual attitude position and the calculated attitude position.
  • the amount of deviation is called the drift amount.
  • the AI board 4 calculates the amount of drift using the captured image V1 and the metadata MT. Then, the calculated amount of drift is sent to the camera 2 side.
  • the camera 2 receives the drift amount from the CCU 3 (AI board 4) and corrects the attitude information of the camera 2. Then, the camera 2 outputs metadata MT including the corrected attitude information.
  • The above drift correction will be explained with reference to FIGS. 5 and 6. FIG. 5 shows the environment map 35.
  • the environment map 35 stores feature points and feature amounts in the coordinates of a virtual dome, and is generated for each camera 2.
  • the camera 2 is rotated 360 degrees, and an environment map 35 is generated in which feature points and feature quantities are registered in global position coordinates on the celestial sphere. This makes it possible to restore the orientation even if it is lost during feature point matching.
  • FIG. 6A shows a schematic diagram of a state in which a drift amount DA occurs between the imaging direction Pc in the correct attitude of the camera 2 and the imaging direction Pj calculated from the IMU data.
  • Information on the three-axis motion, angle, and field of view of the camera 2 is sent from the camera 2 to the AI board 4 as a guide for feature point matching.
  • the AI board 4 detects the accumulated drift amount DA by feature point matching of image recognition, as shown in FIG. 6B.
  • the "+" in the figure indicates a feature point of a certain feature amount registered in the environment map 35 and a feature point of the corresponding feature amount in the frame of the current captured image V1, and the arrow between them is the drift amount vector. In this way, by detecting a coordinate error by feature point matching and correcting the coordinate error, the drift amount can be corrected.
  • the AI board 4 determines the amount of drift by this type of feature point matching, and the camera 2 transmits corrected metadata MT based on this, thereby improving the accuracy of the attitude information of the camera 2 detected in the AR system 5 based on the metadata MT.
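  • A schematic sketch of the drift-correction idea described above: attitude obtained by accumulating IMU data drifts over time, and the offset found by feature-point matching against the environment map 35 is subtracted. All names and the two-dimensional angular representation are assumptions for illustration only.

```python
import numpy as np

class AttitudeTracker:
    """Toy model of attitude accumulation and drift correction for one camera 2."""

    def __init__(self):
        # Yaw, pitch, roll relative to the reference origin attitude.
        self.attitude = np.zeros(3)

    def integrate_imu(self, angular_rate, dt):
        # Accumulating rate * dt yields the attitude, but errors accumulate too (drift).
        self.attitude += np.asarray(angular_rate, dtype=float) * dt

    def apply_drift_correction(self, drift_yaw_pitch):
        # Offset estimated on the AI board 4 from feature-point matching between the
        # current frame and the environment map (the drift-amount vectors of FIG. 6B).
        self.attitude[:2] -= np.asarray(drift_yaw_pitch, dtype=float)

def estimate_drift(map_points, frame_points):
    """Mean angular offset between matched feature points (environment map vs. current frame)."""
    map_points = np.asarray(map_points, dtype=float)      # (N, 2) angular coords on the dome
    frame_points = np.asarray(frame_points, dtype=float)  # (N, 2) matched coords in this frame
    return (frame_points - map_points).mean(axis=0)       # (d_yaw, d_pitch)
```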
  • Camera system 1A in FIG. 3 is an example having a camera 2 and a CCU 3, but not an AR system 5.
  • Video data and metadata MT of the shot video V1 are transmitted from the camera 2 of camera system 1A to the CCU 3.
  • the CCU 3 transmits the video data of the shot video V1 to the switcher 13.
  • the video data of the captured image V1, AR superimposed image V2, and overhead image V3 output from the camera system 1, 1A is supplied to the GUI device 11 via the switcher 13 and network hub 12.
  • the switcher 13 selects the so-called main line video from among the images V1 captured by the multiple cameras 2, the AR superimposed video V2, and the overhead video V3.
  • the main line video is the video output for broadcasting or distribution.
  • the switcher 13 outputs the selected video data to a transmitting device or recording device (not shown) as the main line video for broadcasting or distribution.
  • the video data of the video selected as the main line video is sent to the master monitor 14 and displayed thereon, so that the video production staff can check the main line video.
  • the master monitor 14 may display an AR superimposed image V2, an overhead image V3, etc. in addition to the main line image.
  • the control panel 10 is a device that allows video production staff to operate the switcher 13 to give switching instructions, video processing instructions, and various other instructions.
  • the control panel 10 outputs a control signal CS in response to operations by the video production staff.
  • This control signal CS is sent via the network hub 12 to the switcher 13 and the camera systems 1 and 1A.
  • the GUI device 11 is, for example, a PC or a tablet device, and is a device that enables video production staff, such as a director, to check the video and give various instructions.
  • the captured image V1, the AR superimposed image V2, and the overhead image V3 are displayed on the display screen of the GUI device 11.
  • For example, the captured images V1 from the multiple cameras 2 are displayed as a list on a split screen, and the AR superimposed image V2 and the overhead image V3 are also displayed.
  • an image selected by the switcher 13 as a main line image is displayed.
  • the GUI device 11 is also provided with an interface for a director or the like to perform various instruction operations.
  • the GUI device 11 outputs a control signal CS in response to an operation by the director or the like.
  • This control signal CS is transmitted via a network hub 12 to a switcher 13 and the camera systems 1 and 1A.
  • a control signal CS corresponding to the instruction is transmitted to the AR system 5, and the AR system 5 generates video data of an overhead video V3 including a view frustum 40 in a display format corresponding to an instruction from a director or the like.
  • The system of FIG. 3 includes camera systems 1 and 1A. Camera system 1 is a set of a camera 2, a CCU 3, and an AR system 5; by having the AR system 5, video data of the AR superimposed video V2 and the overhead video V3 corresponding to the video V1 captured by that camera 2 is generated. The AR superimposed video V2 and the overhead video V3 are then displayed on a display unit such as the viewfinder of the camera 2, displayed on the GUI device 11, or selected as the main line video by the switcher 13. On the camera system 1A side, on the other hand, image data of the AR superimposed image V2 and the overhead image V3 corresponding to the captured image V1 of its camera 2 is not generated. FIG. 3 therefore shows a system in which cameras 2 that perform AR linkage and cameras 2 that perform normal shooting are mixed.
  • FIG. 4 is an example of a system in which one AR system 5 corresponds to each camera 2. In FIG. 4, a plurality of camera systems 1A are provided, and the AR system 5 is provided independently of each of the camera systems 1A.
  • the CCU 3 of each camera system 1A sends the video data and metadata MT of the shot video V1 from the camera 2 to the switcher 13.
  • the video data and metadata MT of the shot video V1 are then supplied from the switcher 13 to the AR system 5.
  • This allows the AR system 5 to acquire the video data and metadata MT of the captured video V1 for each camera system 1A, and generate video data of the AR superimposed video V2 corresponding to the captured video V1 of each camera system 1A, and video data of the overhead video V3 including the view frustum 40 corresponding to each camera system 1A.
  • the AR system 5 can generate video data of the overhead video V3 in which the view frustums 40 of the cameras 2 of the multiple camera systems 1A are collectively displayed.
  • the video data of the AR superimposed image V2 and the overhead image V3 generated by the AR system 5 is sent to the CCU 3 of the camera system 1A via the switcher 13, and then sent to the camera 2. This allows the cameraman to view the AR superimposed image V2 and the overhead image V3 on a display such as the viewfinder of the camera 2.
  • the video data of the AR overlay image V2 and the overhead image V3 generated by the AR system 5 is transmitted to the GUI device 11 via the switcher 13 and the network hub 12 and displayed. This allows the director and others to visually confirm the AR overlay image V2 and the overhead image V3.
  • In the figures, the overhead image V3 is denoted as "V3-1" and "V3-2".
  • the video data of the overhead image V3-1 is the video data of the overhead image V3 to be displayed on the GUI device 11 or the master monitor 14, with a director or the like assumed as the viewer.
  • the video data of the overhead image V3-2 is the video data of the overhead image V3 to be displayed on the viewfinder of the camera 2, with a cameraman or the like assumed as the viewer.
  • the video data for these overhead images V3-1 and V3-2 may be video data that displays images of the same content. Both of these are video data that display an overhead image V3 of the target space 8 that includes at least the view frustum 40. However, in the embodiment, a case will also be described in which these are video data that include different display contents.
  • the AR system 5 may generate video data that will become an overhead image V3 with the same video content regardless of the transmission destination, or may generate, for example, video data of a first overhead image V3-1 to be transmitted to the GUI device 11 and video data of a second overhead image V3-2 to be transmitted to the camera 2 in parallel. Furthermore, in the case of the system of FIG. 4, it is also assumed that the AR system 5 generates multiple second overhead images V3-2 in parallel so that the content differs for each camera 2.
  • the information processing device 70 is a device capable of information processing, particularly video processing, such as a computer device.
  • Specific examples of the information processing device 70 include personal computers, workstations, mobile terminal devices such as smartphones and tablets, video editing devices, etc.
  • the information processing device 70 may also be a computer device configured as a server device or a computing device in cloud computing.
  • the CPU 71 of the information processing device 70 executes various processes according to programs stored in the ROM 72 or a non-volatile memory unit 74, such as an EEPROM (Electrically Erasable Programmable Read-Only Memory), or programs loaded from the storage unit 79 to the RAM 73.
  • the RAM 73 also stores data necessary for the CPU 71 to execute various processes, as appropriate.
  • the CPU 71 is configured as a processor that performs various types of processing.
  • the CPU 71 performs overall control processing and various types of calculation processing, but in this embodiment, it also has the functions of an image processing unit 71a and an image generation control unit 71b in order to execute image processing as the AR system 5 based on a program.
  • the video processing unit 71a has a processing function for performing various types of video processing. For example, it performs one or more of the following: 3D model generation processing, rendering, video processing including color and brightness adjustment processing, video editing processing, video analysis and detection processing, etc.
  • the video processing unit 71a also performs processing to generate an overhead image V3 as video data that simultaneously displays an overhead image V3 of the target space 8, a view frustum 40 that shows the shooting range of camera 2 within the overhead image V3, and the captured image V1 of camera 2 on a single screen.
  • the image generation control unit 71b in the CPU 71 performs processing to variably set the display position of the captured image V1 to be simultaneously displayed on one screen in the overhead image V3 including the view frustum 40 generated by the image processing unit 71a, and to control the generation of image data by the image processing unit 71a.
  • the image processing unit 71a generates the overhead image V3 including the view frustum 40 according to the settings of the image generation control unit 71b.
  • the image processing unit 71a may also perform in parallel a process of generating first image data that displays the view frustum 40 of the camera 2 within the target space 8, and a process of generating second image data that displays an image of the view frustum 40 within the target space 8, the image having a different display mode from the image generated by the first image data.
  • the first video data is, for example, video data of the overhead view V3-1
  • the second video data is, for example, video data of the overhead view V3-2.
  • the functions of the image processing unit 71a and the image generation control unit 71b may be realized by a CPU separate from the CPU 71, a GPU (Graphics Processing Unit), a GPGPU (General-purpose computing on graphics processing units), an AI (artificial intelligence) processor, etc.
  • the functions of the video processing unit 71a and the video production control unit 71b may be realized by a plurality of processors.
  • the CPU 71, ROM 72, RAM 73, and non-volatile memory unit 74 are interconnected via a bus 83.
  • the input/output interface 75 is also connected to this bus 83.
  • An input unit 76 consisting of operators and operation devices is connected to the input/output interface 75.
  • the input unit 76 may be various operators and operation devices such as a keyboard, a mouse, a key, a trackball, a dial, a touch panel, a touch pad, a remote controller, or the like.
  • An operation by the user is detected by the input unit 76 , and a signal corresponding to the input operation is interpreted by the CPU 71 .
  • a microphone may also be used as the input unit 76. Voice uttered by the user may also be input as operation information.
  • the input/output interface 75 is connected, either integrally or separately, to a display unit 77 formed of an LCD (Liquid Crystal Display) or an organic EL (electro-luminescence) panel, or the like, and an audio output unit 78 formed of a speaker, or the like.
  • the display unit 77 is a display unit that performs various displays, and is configured, for example, by a display device provided in the housing of the information processing device 70, or a separate display device connected to the information processing device 70, or the like.
  • the display unit 77 displays various images, operation menus, icons, messages, etc., on the display screen based on instructions from the CPU 71, that is, displays them as a GUI (Graphical User Interface).
  • The input/output interface 75 may also be connected to a storage unit 79, which may be configured using a hard disk drive (HDD) or solid-state memory, and a communication unit 80.
  • the storage unit 79 can store various data and programs.
  • a database can also be configured in the storage unit 79.
  • the communication unit 80 performs communication processing via a transmission path such as the Internet, and communication with various devices such as external databases, editing devices, and information processing devices via wired/wireless communication, bus communication, and the like. For example, assuming an information processing device 70 as the AR system 5 , communication with the CCU 3 and the switcher 13 is performed via a communication unit 80 .
  • a drive 81 is also connected to the input/output interface 75 as required, and a removable recording medium 82 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is appropriately mounted thereon.
  • the drive 81 allows video data, various computer programs, and the like to be read from the removable recording medium 82.
  • the read data is stored in the storage unit 79, and the video and audio contained in the data are output on the display unit 77 and the audio output unit 78.
  • the computer programs, etc. read from the removable recording medium 82 are installed in the storage unit 79 as necessary.
  • software for the processing of this embodiment can be installed via network communication by the communication unit 80 or via a removable recording medium 82.
  • the software may be stored in advance in the ROM 72, the storage unit 79, etc.
  • the AR system 5 generates the overhead image V3 and can transmit it to the viewfinder of the camera 2, the GUI device 11, or the like for display.
  • the AR system 5 generates video data for the overhead image V3 so as to display the view frustum 40 of the camera 2 within the overhead image V3.
  • Fig. 8 shows an example of a view frustum 40 displayed in the overhead image V3.
  • Fig. 8 shows an example of a CG image of the subject space 8 in Fig. 1 as viewed from above, but for the sake of explanation, it is shown in a simplified form.
  • the overhead image V3 in Fig. 8 includes an image showing a background 31, such as a stadium, and a person 32, such as a player.
  • the overhead image V3 may or may not include an image of the camera 2 itself.
  • the view frustum 40 visually presents the shooting range of the camera 2 within the overhead image V3, and has a pyramid shape that spreads in the direction of the shooting optical axis with the position of the camera 2 within the overhead image V3 as the frustum origin 46.
  • it is a pyramid shape extending from the frustum origin 46 to the frustum far end surface 45.
  • the reason why it is a quadrangular pyramid is because the image sensor of the camera 2 is quadrangular.
  • the extent of the spread of the pyramid changes depending on the angle of view of the camera 2 at that time. Therefore, the range of the pyramid indicated by the view frustum 40 is the shooting range of the camera 2.
  • the view frustum 40 may be represented as a pyramid with a semi-transparent colored image.
  • the view frustum 40 displays a focus plane 41 and a depth of field range 42 at that time inside a quadrangular pyramid.
  • As the depth of field range 42, for example, a range from the near depth end surface 43 to the far depth end surface 44 is expressed in a translucent color different from the rest.
  • the focus plane 41 is also expressed in a semi-transparent color that is different from the others.
  • the focus plane 41 indicates the depth position at which the camera 2 is focused at that point in time.
  • a subject at a depth (distance in the depth direction as seen from the camera 2) equivalent to the focus plane 41 is in focus.
  • the depth of field range 42 makes it possible to confirm the range in the depth direction in which the subject is not blurred.
  • the in-focus depth and the depth of field vary depending on the focus operation and aperture operation of the camera 2. Therefore, the focus plane 41 and the depth of field range 42 in the view frustum 40 vary each time.
  • the AR system 5 can set the pyramidal shape of the view frustum 40, the display position of the focus plane 41, the display position of the depth of field range 42, and the like, by acquiring metadata MT from the camera 2, which includes information such as focal length, aperture value, and angle of view. Furthermore, since the metadata MT includes attitude information of the camera 2, the AR system 5 can set the direction of the view frustum 40 from the camera position (frustum origin 46) in the overhead image V3.
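  • As a sketch of how the positions of the focus plane 41 and the depth of field range 42 could be derived from the metadata MT, the standard thin-lens depth-of-field approximation can be used; the formula and parameter values below are common photographic conventions, not taken from the disclosure.

```python
def depth_of_field(focus_distance_m, focal_length_mm, f_number, coc_mm=0.03):
    """Near/far limits of acceptable sharpness around a given focus distance.

    Uses the common hyperfocal-distance approximation with circle of confusion
    coc_mm. Returns distances in metres; the far limit may be infinite.
    """
    f = focal_length_mm / 1000.0           # focal length in metres
    c = coc_mm / 1000.0
    s = focus_distance_m
    hyperfocal = f * f / (f_number * c) + f
    near = s * (hyperfocal - f) / (hyperfocal + s - 2.0 * f)
    far = float("inf") if s >= hyperfocal else s * (hyperfocal - f) / (hyperfocal - s)
    return near, far

# The focus plane 41 sits at focus_distance_m along the frustum's optical axis,
# and the depth of field range 42 spans [near, far] inside the pyramid.
near, far = depth_of_field(focus_distance_m=10.0, focal_length_mm=85.0, f_number=2.8)
```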
  • the AR system 5 displays, together with the view frustum 40, the image V1 captured by the camera 2 in which the view frustum 40 is shown in the overhead image V3. That is, the AR system 5 generates an image of the CG space 30 to be used as the overhead image V3, synthesizes the image of the CG space 30 with the view frustum 40 generated based on the metadata MT supplied from the camera 2, and further synthesizes the image V1 captured by the camera 2. The image data of such a synthesized image is output as the overhead image V3.
  • An example will now be described in which the view frustum 40 in the image of the CG space 30 and the photographed image V1 are simultaneously displayed on one screen.
  • First, the AR system 5 generates video data of an overhead video V3 in which the captured video V1 is displayed within the view frustum 40.
  • That is, this is an example of generating video data in which the captured video V1 is displayed in a state in which it is arranged within the range of the view frustum 40.
  • Figure 9 shows an example in which the captured image V1 is displayed on the focus plane 41 in the view frustum 40. This makes it possible to view the image captured at the focus position.
  • the example in Figure 9 is also one example in which the captured image V1 is displayed within the depth of field range 42.
  • FIG. 10 shows an example in which a captured image V1 is displayed on a surface other than the focus surface 41 within the depth of field range 42 in the view frustum 40.
  • In this example, the captured image V1 is displayed on the depth far end surface 44 of the depth of field range 42.
  • examples are also conceivable in which the captured image V1 is displayed on the near depth end surface 43, or at a depth position midway within the depth of field range 42.
  • FIG. 11 shows an example in which the captured image V1 is displayed within the view frustum 40 at a position (surface 47 near the frustum origin) closer to the frustum origin 46 than the near-depth end surface 43 of the depth-of-field range 42.
  • the size of the captured image V1 becomes smaller the closer it is to the frustum origin 46, but by displaying it on the surface 47 near the frustum origin in this way, the focus plane 41, depth-of-field range 42, etc. become easier to see.
  • FIG. 12 shows an example in which a captured image V1 is displayed on the far side of a far end surface 44 of a depth of field range 42 within a view frustum 40.
  • far means far from the viewpoint of the camera 2 (frustum starting point 46).
  • the captured image V1 is displayed on the frustum far end surface 45, which is located at the far side.
  • the photographed image V1 is displayed on the far side of the depth of field range 42 within the view frustum 40, the area of the photographed image V1 can be made large. This is therefore suitable for checking the position of the focus plane 41 and the depth of field range 42 while carefully checking the contents of the photographed image V1.
  • the distance of the rendered view frustum 40 may be finite or infinite.
  • For example, the view frustum 40 may be rendered at a finite distance, such as the rendering distance d1 in FIG. 12.
  • The rendering distance d1 may be set to, for example, twice the distance from the frustum starting point 46 to the focus plane 41. By doing so, the frustum far end surface 45 is determined, so that the photographed image V1 can be displayed in the widest area within the view frustum 40, as shown in FIG. 12.
  • the view frustum 40 may be rendered at infinity as shown in FIG. 13 without any particular rendering distance.
  • In that case, the frustum far end surface 45 is not fixed at a constant position.
  • the captured image V1 may be displayed at an indefinite position farther away than the depth of field range 42.
  • the far end of the rendering range is set as the frustum far end surface 45.
  • FIGS. 14A and 14B show that when the view frustum 40 is rendered up to the position of the wall W, the position at which it collides with the wall W becomes the frustum far end surface 45. In other words, the frustum far end surface 45 changes depending on the positional relationship with objects created by the CG.
  • the far end of the range that can be drawn in the overhead image V3 is the frustum far end surface 45, and the captured image V1 is displayed on that frustum far end surface 45.
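  • A minimal sketch of determining the frustum far end surface 45 by clipping the drawing range against scene objects, as in the ground and wall cases above; the planar ground/wall test and the default render limit of twice the focus distance are illustrative assumptions.

```python
import numpy as np

def frustum_far_distance(origin, direction, focus_distance,
                         ground_z=0.0, wall_y=None, render_limit=None):
    """Distance from the frustum origin at which the far end surface 45 is drawn.

    The default drawing range here is twice the focus distance (cf. FIG. 12); it is
    shortened if the central ray hits the ground plane z = ground_z or a wall plane
    y = wall_y first (cf. the collision cases of FIG. 14).
    """
    d = render_limit if render_limit is not None else 2.0 * focus_distance
    o = np.asarray(origin, dtype=float)
    v = np.asarray(direction, dtype=float)
    if v[2] < 0.0:                           # ray pointing downwards: test the ground plane
        t = (ground_z - o[2]) / v[2]
        if 0.0 < t < d:
            d = t
    if wall_y is not None and v[1] != 0.0:   # optional vertical wall in the CG scene
        t = (wall_y - o[1]) / v[1]
        if 0.0 < t < d:
            d = t
    return d
```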
  • the photographed image V1 is displayed within the view frustum 40, but the photographed image V1 may be displayed at a position outside the view frustum 40 within the same screen as the overhead image V3.
  • FIG. 15 shows four examples (captured images V1w, V1x, V1y, and V1z) of display positions outside the view frustum 40. In particular, these four examples are cases in which the captured image V1 is displayed near the view frustum 40.
  • the captured image V1 may be displayed near the far end surface 45 of the frustum as captured image V1w.
  • the captured image V1 can also be displayed near the focus plane 41 (or depth of field range 42) as in the captured image V1y in FIG. 15. In this case, it becomes easier to view the captured image V1 together with the focus plane 41 or depth of field range 42, which are areas of the view frustum 40 that are likely to be noticed by the viewer.
  • the captured image V1 can also be displayed near the camera 2 (or the frustum starting point 46) as the captured image V1z. In this case, the relationship between the camera 2 and the captured image V1 by that camera 2 becomes easier to understand.
  • the color of the frame of the captured image V1 may be matched with the semi-transparent color or the color of the contour of the corresponding view frustum 40 to indicate the correspondence.
  • view frustums 40a, 40b, and 40c corresponding to the three cameras 2 are displayed within the overhead image V3.
  • captured images V1a, V1b, and V1c corresponding to these view frustums 40a, 40b, and 40c are also displayed.
  • the photographed image V1a is displayed on a frustum far end surface 45 of the view frustum 40a.
  • the photographed image V1b is displayed in the vicinity of a frustum starting point 46 of the view frustum 40b (in the vicinity of the camera position).
  • the captured image V1c is displayed in a corner of the screen, but is displayed in the upper left corner, which is closest to the view frustum 40c, among the four corners of the overhead image V3.
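  • The corner placement described here can be implemented with a simple nearest-corner selection; the sketch below (hypothetical names) picks the screen corner closest to the frustum's on-screen anchor point.

```python
def nearest_corner(frustum_anchor_xy, screen_w, screen_h):
    """Choose the corner of the overhead image V3 closest to the given frustum anchor."""
    corners = {
        "top_left": (0.0, 0.0),
        "top_right": (float(screen_w), 0.0),
        "bottom_left": (0.0, float(screen_h)),
        "bottom_right": (float(screen_w), float(screen_h)),
    }
    fx, fy = frustum_anchor_xy
    return min(corners, key=lambda k: (corners[k][0] - fx) ** 2 + (corners[k][1] - fy) ** 2)

# Example: a frustum anchored near the upper left of a 1920x1080 overhead image.
corner = nearest_corner((300.0, 200.0), 1920, 1080)   # -> "top_left"
```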
  • the image V1 captured by the mobile camera 2 may be displayed fixedly in a corner of the screen, for example.
  • the above Figure 16 is an example of an overhead image V3 in which the target space 8 is viewed from diagonally above, but the AR system 5 may also display a planar overhead image V3 viewed from directly above, as shown in Figure 17.
  • cameras 2a, 2b, 2c, and 2d, their corresponding view frustums 40a, 40b, 40c, and 40d, and the captured images V1a, V1b, V1c, and V1d are displayed as an overhead image V3.
  • the captured images V1a, V1b, V1c, and V1d are displayed near the corresponding cameras 2a, 2b, 2c, and 2d, respectively.
  • the AR system 5 may be configured so that the viewpoint direction of the overhead image V3 shown in Figures 16 and 17 can be continuously changed by the viewer operating the GUI device 11, etc.
  • the view frustums 40a and 40b are displayed, and the images V1a and V1b captured by the cameras 2 of the view frustums 40a and 40b are displayed in the corners of the screen or near the camera positions.
  • the shooting conditions can be easily understood by displaying each of the view frustums 40 and shot images V1, as in the example shown in the figure.
  • the AR system 5 displays the view frustum 40 of the camera 2 in the CG space 30, and generates video data for an overhead image V3 so that the captured image V1 of the camera 2 is also displayed at the same time.
  • By displaying this overhead image V3 on the camera 2 or the GUI device 11, viewers such as the cameraman or director can easily understand the shooting situation.
  • the viewer can easily understand what each camera 2 is capturing, where the camera is focused, and so on.
  • the director can very easily grasp the relative positions of the cameras, the relationship between the shooting directions, the subject being shot, etc. This allows the director to give appropriate instructions. From the director's point of view, it is enough to know the general content of each shot image V1. Therefore, there is no problem even if the shot image V1 is relatively small in the overhead image V3.
  • the director can check and simulate the composition, standing position, and camera position while taking into consideration the overall situation of each camera 2.
  • the cameraman can perform the focusing operation by looking at the depth of field range 42 of the view frustum 40.
  • the user can easily check the location and direction being photographed within the overhead image V3 of the subject space 8 represented by CG.
  • the user can see the view frustum 40 and the captured image V1 of the other camera 2 and reflect them in the operation of his/her own camera.
  • the user can also grasp the relationship between the contents of the images captured by the other camera 2, the direction of the subject, etc., and therefore can perform preferable shooting in relation to the other camera 2.
  • the user can check the position and angle of view of the other camera 2 and shoot from a different position and angle of view with his/her own camera 2.
  • the overhead image V3 increases the amount of information (captured image V1, position, etc.), making it easier to grasp the situation on-site.
  • FIG. 19 shows an example of processing by the AR system 5 that generates video data for the overhead view video V3.
  • the video data for the overhead view video V3 is video data in which the view frustum 40 and the captured video V1 are synthesized into the CG space 30, which corresponds to the subject space 8.
  • it is video data for displaying the images shown in FIGS. 9 to 18.
  • the AR system 5 performs the processes from step S101 to step S107 in FIG. 19 for each frame of the video data of the overhead video V3, for example. These processes can be considered as control processes of the CPU 71 (video processing unit 71a, video generation control unit 71b) in the information processing device 70 in FIG. 7 as the AR system 5.
  • In step S101, the AR system 5 sets the CG space 30. For example, it sets the viewpoint position of the CG space 30 corresponding to the shooting target space 8, and renders an image as the CG space 30 from that viewpoint position. If there is no change in the viewpoint position or image content of the CG space 30 from the previous frame, the CG space image of the previous frame can be reused in the current frame.
  • In step S102, the AR system 5 inputs the captured image V1 and metadata MT from the camera 2. That is, it acquires the captured image V1 of the current frame, and the attitude information, focal length, angle of view, aperture value, and the like of the camera 2 at that frame timing. When one AR system 5 displays the view frustums 40 and captured images V1 for a plurality of cameras 2 as shown in FIG. 4, the AR system 5 inputs the captured image V1 and metadata MT of each camera 2.
  • When there are multiple camera systems 1 in which the cameras 2 and AR systems 5 correspond 1:1 as shown in FIG. 3, and each generates an overhead image V3 including multiple view frustums 40 and captured images V1, it is preferable for these AR systems 5 to work together so as to share the metadata MT and captured images V1 of their corresponding cameras 2.
  • In step S103, the AR system 5 generates a view frustum 40 for the current frame.
  • the AR system 5 sets the direction of the view frustum 40 in the CG space 30 according to the attitude of the camera 2, the quadrangular pyramid shape according to the angle of view, the positions of the focus plane 41 and the depth of field range 42 based on the focal length and aperture value, and the like, and generates an image of the view frustum 40 according to the settings.
  • When displaying the view frustums 40 for a plurality of cameras 2, the AR system 5 generates an image of each view frustum 40 according to the metadata MT of each camera 2.
  • In step S104, the AR system 5 sets the display position for the captured image V1 acquired in step S102.
  • In step S105, the AR system 5 synthesizes the view frustum 40 corresponding to one or more cameras 2 and the captured image V1 into the CG space 30 that becomes the overhead image V3, generating image data for one frame of the overhead image V3.
  • In step S106, the AR system 5 outputs one frame of video data of the overhead view video V3.
  • the above process is repeated until the display of the view frustum 40 and the captured image V1 is completed.
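  • Restated as a control-flow sketch, the per-frame processing of FIG. 19 could look like the following; every callable here is a placeholder for the corresponding step, not an actual API of the AR system 5.

```python
def overhead_frame_loop(cameras, render_cg_space, build_view_frustum,
                        set_display_position, composite, output_frame, done):
    """One iteration per output frame of the overhead video V3 (steps S101 to S106)."""
    while not done():
        bird_view = render_cg_space()                     # S101: set (or reuse) the CG space image
        frustums, shots = [], []
        for cam in cameras:
            v1, mt = cam.read_frame_and_metadata()        # S102: captured image V1 and metadata MT
            frustums.append(build_view_frustum(mt))       # S103: frustum from attitude/angle of view
            shots.append((v1, set_display_position(mt)))  # S104: display position of V1
        frame = composite(bird_view, frustums, shots)     # S105: synthesize one frame of V3
        output_frame(frame)                               # S106: output the video data
```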
  • As a result, an overhead image V3 such as those shown in FIGS. 9 to 18 is displayed.
  • FIGS. 23 and 24 show examples in which the display position of the photographed video V1 is set variably.
  • FIGS. 20, 21, 22, 23, and 24 are examples of display position settings of the captured image V1 corresponding to one camera 2.
  • The processes shown in FIGS. 20 to 24 may be performed for each camera 2.
  • the same display position setting process may be performed for each camera 2, or different display position setting processes may be performed.
  • FIG. 20 shows the display position setting process when the photographed image V1 is displayed on the focus plane 41 as in FIG. 9.
  • the AR system 5 determines the size and shape of the focus plane 41 in the view frustum 40 generated in step S103 in Fig. 19 for the current frame.
  • the AR system 5 sets the size and shape of the captured image V1 so as to match the focus plane 41.
  • the shape of the captured image V1 to be synthesized within the view frustum 40 may be the cross-sectional shape of that view frustum 40.
  • the shape of the focus plane 41 differs depending on the viewpoint of the overhead image V3 and the position and direction of the view frustum 40 to be displayed, but may be the shape of a cross section cut perpendicular to the optical axis of the camera 2 at the focus plane 41 of the view frustum 40 in that frame. Therefore, when the photographed image V1 is displayed within the view frustum 40, the photographed image V1 is transformed into a cross-sectional shape perpendicular to the optical axis and then synthesized.
  • As a result, in step S105 in FIG. 19, the size and shape of the captured image V1 are adjusted, and an overhead image V3 is generated in which the captured image V1 is composited onto the focus plane 41 of the view frustum 40.
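  • A sketch of the size calculation implied by this step: the width and height of the frustum cross section at the focus distance follow from the angle of view, and the captured image V1 is scaled to cover it (all names are illustrative).

```python
import math

def cross_section_size(distance_m, h_fov_deg, aspect=16 / 9):
    """Width and height of the view frustum's cross section at a given distance."""
    width = 2.0 * distance_m * math.tan(math.radians(h_fov_deg) / 2.0)
    return width, width / aspect

def scale_to_cover(image_w_px, image_h_px, plane_w, plane_h):
    """Scale factor so that the captured image V1 exactly covers the focus plane 41."""
    # The image and the cross section share the sensor's aspect ratio, so either
    # ratio gives the same result; max() keeps this safe if they differ slightly.
    return max(plane_w / image_w_px, plane_h / image_h_px)
```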
  • FIG. 21 shows the display position setting process in the case where the photographed image V1 is displayed on the depth far end surface 44 as in FIG. 10.
  • the AR system 5 determines the size and shape of the depth far end surface 44 in the view frustum 40 generated in step S103 in the current frame.
  • the AR system 5 sets the size and shape of the captured image V1 so as to match the size of the depth far end surface 44.
  • As a result, in step S105 in FIG. 19, the size and shape of the captured image V1 are adjusted, and an overhead image V3 is generated in which the captured image V1 is composited onto the depth far end surface 44 of the view frustum 40.
  • FIG. 22 shows the display position setting process in the case where the photographed image V1 is displayed in the vicinity of the frustum starting point 46 as in FIG. 11.
  • First, the AR system 5 sets the display position of the captured image V1 within the view frustum 40 generated in step S103 for the current frame. That is, a position is set on the frustum origin 46 side of the depth of field range 42. This position may be set at a fixed distance from the frustum origin 46, or at a position where a minimum area is obtained as a cross section of the quadrangular pyramid shape corresponding to the angle of view.
  • In step S141, the AR system 5 determines the cross section at the set display position, that is, the size and shape of the display area.
  • In step S142, the AR system 5 sets the size and shape of the captured image V1 so as to match the cross section of the determined display position.
  • As a result, when the process proceeds to step S105 in FIG. 19, the size and shape of the captured image V1 are adjusted, and an overhead image V3 is generated in which the captured image V1 is composited at a position near the frustum origin 46 of the view frustum 40.
  • FIG. 23 shows a display position setting process in which the display position of the captured image V1 is changed according to the operation of a user such as a cameraman or director.
  • In step S150, the AR system 5 checks whether or not a display position change operation has been performed for the captured image V1.
  • the GUI device 11 and the camera 2 are configured so that a director, cameraman, etc. can change the display position by performing a specified operation.
  • the AR system 5 checks the operation information for the display position change operation from the control signal CS that it receives.
  • an operation interface may be provided that allows each plane to be switched by a toggle operation, or an operation interface may be provided that allows each plane to be directly specified.
  • the display position setting may be switched not only to positions within the view frustum 40 but also to positions outside the view frustum 40 .
  • For example, the display position can be switched among the focus plane 41, the frustum far end surface 45, a corner of the screen, the vicinity of the camera, and so on.
  • When the display position is set outside the view frustum 40, for example, it can be changed to "near the focus plane 41," "near the frustum far end surface 45," "a corner of the screen," or "near camera 2."
  • If no operation to change the display position is confirmed at the time of processing the current frame, the AR system 5 proceeds to step S151, maintains the same display position setting as in the previous frame, and ends the processing of FIG. 23. As a result, when the process proceeds to step S105 in FIG. 19, a frame of the current overhead image V3 is generated in which the shot image V1 is displayed in the same position as in the previous frame.
  • If a display position change operation has been performed, the AR system 5 proceeds from step S150 to step S152 in FIG. 23 and changes the display position setting in response to the operation. For example, a setting that had been the focus plane 41 until then may be switched to the frustum far end surface 45.
  • In step S153, the AR system 5 branches the process depending on whether or not the changed position setting is outside the view frustum 40. If the changed position is within the view frustum 40, the AR system 5 proceeds to step S154 and determines the size and shape of the display area as a cross section of the view frustum 40 at the set position. Then, in step S156, the AR system 5 sets the size and shape of the captured image V1 so as to match the cross section of the determined display position.
  • As a result, in step S105 in FIG. 19, the size of the captured image V1 is adjusted, and an overhead image V3 is generated in which the captured image V1 is composited at a position within the view frustum 40 that is different from the previous frame.
  • the AR system 5 proceeds from step S153 to step S155 in FIG. 23, and sets the display size and shape of the captured image V1 at the new set position.
  • the shape of the captured image V1 to be synthesized is not limited to the cross-sectional shape of the view frustum 40, and may be, for example, a rectangle, or if it is near the view frustum 40, a parallelogram according to the angle of the view frustum 40.
  • the size of the captured image V1 can also be set relatively freely, but it is desirable to set it appropriately according to other displays on the screen.
• As a result, in step S105 in FIG. 19, the size and shape of the captured image V1 are adjusted, and an overhead image V3 is generated in which the captured image V1 is composited at a position outside the view frustum 40 that differs from that in the previous frame.
• Note that the display position may be changed only to positions outside the view frustum 40. In that case, steps S153 and S154 are unnecessary, and the process may proceed from step S152 to step S155.
  • FIG. 24 shows an example of processing in which the AR system 5 automatically changes the display position of the captured image V1.
• In step S160, the AR system 5 performs a display position change determination.
• The display position change determination is a process of determining whether or not to change the display position setting of the photographed video V1 in the current frame from that in the previous frame. Examples of this determination process include the following processes (P1), (P2), and (P3).
• (P1) Determination based on the positional relationship between the view frustum 40 and an object in the overhead image V3.
• (P2) Determination based on the angle of the view frustum 40 in the overhead image V3.
• (P3) Determination based on the viewpoint position of the overhead image V3.
  • Fig. 25 shows a state in which the frustum far end surface 45 of the finitely distant view frustum 40 collides with the ground GR and is partially embedded therein.
  • Fig. 26 shows a state in which the far end side of the finitely distant or infinitely distant view frustum 40 collides with a structure CN and it becomes impossible to display anything beyond that.
• When the pyramidal shape of the view frustum 40 widens or its direction changes due to a change in the angle of view or shooting direction of the camera 2, and the display position of the captured image V1 used up to that point is judged to be inappropriate based on the positional relationship between a specific position of the view frustum 40 (such as the frustum far end surface 45 or the focus plane 41) and other displayed objects, it may be determined that the display position needs to be changed.
• Other view frustums 40 may also be treated as objects in the overhead image V3; if the display position of the captured image V1 is judged to be inappropriate due to its positional relationship with those other view frustums 40, it may likewise be determined that the display position needs to be changed.
• The example of (P2) takes into consideration the visibility of the captured image V1 that is adapted to the cross-sectional shape of the view frustum 40.
• Depending on the angle at which the view frustum 40 is seen, the cross-sectional shape may not be appropriate as a display surface.
• The shape and direction of the view frustum 40 change according to the angle of view and the shooting direction of the camera 2.
• Accordingly, the angle of the view frustum 40 displayed in the overhead image V3 also changes. That is, the angle between the direction of the line of sight of the overhead image V3 and the axial direction of the view frustum 40 changes.
• This angle is the angle between the normal direction of the display screen, as seen along the line of sight from the viewpoint set for the overhead image V3 at a certain time, and the axial direction of the displayed view frustum 40.
• The axial direction of the view frustum 40 is the direction of the perpendicular line drawn from the frustum starting point 46 to the frustum far end surface 45.
• Fig. 27 shows the captured images V1a, V1b, and V1c corresponding to the view frustums 40a, 40b, and 40c.
• The captured image V1a, displayed according to the cross-sectional shape, becomes a parallelogram with a large difference between its acute and obtuse angles because of the angle of the view frustum 40a in the overhead image V3. If this is left as it is, the visibility of the captured image V1a will be poor. In such a case, it is advisable to change the display position as shown by the dashed arrow and display it at the position of the captured image V1a'. In this way, it is conceivable to determine that the display position needs to be changed when the difference between the acute and obtuse angles of the photographed image V1 becomes equal to or greater than a predetermined value.
• The example of (P3) is based on the same idea as (P2).
• The viewpoint position of the overhead view video V3 can be changed according to an operation by a director, etc.
• For example, the viewpoint position of the overhead view video V3 may be changed by an operation from the state shown in Fig. 16 to that shown in Fig. 27.
• In that case, the visibility of the captured image V1a becomes poor, as described above.
• That is, the shapes of the rendered view frustum 40 and the captured image V1 change due to a change in the viewpoint of the overhead image V3, which may reduce visibility.
• Therefore, when the difference between the acute and obtuse angles of the captured image V1 becomes equal to or larger than a predetermined value, it is determined that the display position needs to be changed.
• In addition, changing the viewpoint of the overhead image V3 may cause the size of the captured image V1 to become smaller. If the viewpoint position for rendering the overhead image V3 is changed to a distant position and the size of the captured image V1 becomes equal to or smaller than a predetermined size, it may also be determined that the display position needs to be changed.
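• The determinations of (P2) and (P3) can both be reduced to simple geometric tests on the rendered result: the angle between the line of sight of the overhead image V3 and the frustum axis, and the on-screen size of the captured image V1. The following is a minimal sketch of such a test; the threshold values and the input layout are assumptions for illustration only.

```python
import numpy as np

def needs_reposition(view_dir, frustum_axis, projected_quad_px,
                     max_skew_deg=55.0, min_size_px=120.0):
    """Return True when the captured image V1 should be moved (cases P2/P3).

    view_dir          : unit vector along the line of sight of the overhead image V3
    frustum_axis      : unit vector from frustum origin 46 to far end surface 45
    projected_quad_px : 4x2 array with the on-screen corners of the displayed V1
    """
    view_dir = np.asarray(view_dir, dtype=float)
    frustum_axis = np.asarray(frustum_axis, dtype=float)
    quad = np.asarray(projected_quad_px, dtype=float)

    # (P2): the closer the frustum axis is to being perpendicular to the line of
    # sight, the more the cross-section display degenerates into a thin parallelogram.
    cos_a = abs(np.dot(view_dir, frustum_axis))
    skew_deg = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
    if skew_deg > max_skew_deg:
        return True

    # (P3): after a viewpoint change, the displayed V1 may simply be too small.
    width = np.linalg.norm(quad[1] - quad[0])
    height = np.linalg.norm(quad[3] - quad[0])
    return min(width, height) < min_size_px
```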
• In step S160 of FIG. 24, the AR system 5 performs a display position change determination as described above, and in step S161, the process branches depending on whether a change is required.
• If it is determined that no change is necessary, the AR system 5 proceeds to step S162, where it maintains the same display position setting as in the previous frame, and ends the processing of FIG. 24. As a result, when the process proceeds to step S105 in FIG. 19, a frame of the current overhead image V3 is generated in which the shot image V1 is displayed in the same position as in the previous frame.
• If it is determined that a change is necessary, the AR system 5 proceeds from step S161 to step S163 in FIG. 24 and selects a destination to which the display position setting is to be changed.
  • the destination of this change may be determined depending on the reason why the display position change is required. For example, in the above (P1), if a collision with an object in the overhead image V3 occurs, it is possible to change the position to a position that is not affected by the collision point, such as the surface 47 near the frustum origin or a corner of the screen. If the visibility of the captured image V1 decreases in the above (P2) and (P3), it may be possible to select a location outside the view frustum 40 where a shape allows for good visibility, such as a corner of the screen or near the focus plane 41.
• The type information of the camera 2 can also be used to set the destination of the captured image V1.
• For example, in the case of the mobile camera 2M, the destination can be a corner of the screen.
• Alternatively, the captured image V1 of the mobile camera 2M can be displayed within the view frustum 40 while the camera is not moving, and changed to the corner of the screen while the camera is moving. This is because the movement of the view frustum 40 within the overhead image V3 becomes larger while the camera is moving, and the visibility of the captured image V1 within the view frustum 40 decreases.
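• A destination selection of this kind, combining the reason for the change with the camera type and its motion state, could be sketched as follows. The labels and branching rules are assumptions chosen only to illustrate step S163, not the embodiment's actual rules.

```python
def select_destination(camera_type, is_moving, reason):
    """Pick a new display position for the captured image V1 (step S163).

    camera_type : "fixed" (e.g. tripod-mounted camera 2F) or "mobile" (camera 2M)
    is_moving   : True while a mobile camera 2M is being moved
    reason      : why a change is needed, e.g. "collision" or "low_visibility"
    """
    if camera_type == "mobile":
        # While the camera moves, the view frustum 40 shifts constantly,
        # so a screen corner keeps V1 readable.
        return "screen_corner" if is_moving else "inside_frustum"
    if reason == "collision":
        # (P1): keep V1 away from the colliding far end of the frustum.
        return "near_frustum_origin"
    # (P2)/(P3): move V1 to a place where its shape stays legible.
    return "screen_corner"

print(select_destination("mobile", True, "low_visibility"))   # screen_corner
print(select_destination("fixed", False, "collision"))        # near_frustum_origin
```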
• In step S164, the AR system 5 branches the process depending on whether or not the selected destination is outside the view frustum 40.
• If the destination is within the view frustum 40, the AR system 5 proceeds to step S165 and determines the size and shape of the display area as a cross section of the view frustum 40 at the set position. Then, in step S167, the AR system 5 sets the size and shape of the captured image V1 so that it matches the cross section at the determined display position.
• As a result, in step S105 in FIG. 19, the size of the captured image V1 is adjusted, and an overhead image V3 is generated in which the captured image V1 is composited at a position within the view frustum 40 that differs from that in the previous frame.
• If the destination is outside the view frustum 40, the AR system 5 proceeds to step S166 in FIG. 24 and sets the display size and shape of the captured image V1 at the new set position (similar to step S155 in FIG. 23).
• As a result, in step S105 in FIG. 19, the size and shape of the captured image V1 are adjusted, and an overhead image V3 is generated in which the captured image V1 is composited at a position outside the view frustum 40 that differs from that in the previous frame.
• Note that the display position may be changed only to positions within the view frustum 40. In that case, steps S164 and S166 are unnecessary.
• Alternatively, the display position may be changed only to positions outside the view frustum 40. In this case, steps S164 and S165 are unnecessary, and the process may proceed from step S163 to step S166.
  • the view frustum 40 and the captured image V1 may be displayed together all the time, or may be displayed only temporarily.
  • the cameraman or director may perform an operation to select the view frustum 40, so that the shot image V1 corresponding to the selected view frustum 40 is displayed.
  • a cameraman or director may be able to switch between a mode in which only the view frustum 40 is displayed and a mode in which the view frustum 40 and the shot video V1 are displayed simultaneously.
• <Example of cameraman and director screens> In the system of this embodiment, an overhead image V3-1 is displayed on the GUI device 11 for the director, and an overhead image V3-2 is displayed on a display unit such as a viewfinder of the camera 2 for the cameraman.
  • the overhead images V3-1 and V3-2 are both images showing the view frustum 40 in the CG space 30 simulating the shooting target space 8, but are images with different display modes. This makes it possible to provide information appropriate for the role of the director or cameraman.
  • FIG. 28 shows an example in which an overhead view image V3-1 is displayed as the device display image 51 on the GUI device 11.
  • This overhead image V3-1 is an image that includes a CG space 30 overlooking the target shooting space 8, for example a stadium, and displays the view frustums 40 of a plurality of cameras 2 taking pictures at the stadium. View frustums 40a, 40b, and 40c for the three cameras 2 are displayed.
  • the view frustum 40a is displayed in a different manner from the other view frustums 40b and 40c.
  • the view frustum 40a is highlighted and made to stand out more than the other view frustums 40b and 40c.
  • the shape and direction of the view frustum 40 and the display positions of the focus plane 41 and depth of field range 42 are determined by the angle of view, shooting direction, focal length, depth of field, etc. of the camera 2 at that time, and therefore these differences are not included in the difference in display mode referred to here.
  • Different display modes of the view frustum 40 do not refer to differences determined by the state of the angle of view or shooting direction of the camera 2, but to differences in the display of the view frustum 40 itself. For example, differences in color, brightness, darkness, type and thickness of the outline, differences in the display of the pyramid faces, differences between normal display and flashing display, differences in the flashing cycle, etc.
• For example, when the view frustums 40 are normally displayed in a semi-transparent white color, the view frustum 40a is highlighted in, for example, a semi-transparent red color. This allows the view frustum 40a to be emphasized and shown to the director, etc.
• The AR system 5 configured as shown in FIG. 4 determines whether or not a particular subject of interest, such as a specific player, is being photographed by performing image recognition processing on the captured image V1 of each camera 2. For example, it is determined whether or not the video V1 captured by a camera 2 shows the target subject as shown in Fig. 29. Then, the AR system 5 generates an overhead video V3-1 in which the view frustum 40 of the camera 2 capturing the target subject is displayed in a highlighted manner.
  • the video data for the overhead images V3-1 and V3-2 in this case refers to video data in which a view frustum 40 is synthesized with a CG space 30 that corresponds to the subject space 8.
  • the overhead views V3-1 and V3-2 may be further synthesized with the shot image V1.
  • the AR system 5 performs the processes from step S101 to step S107 in FIG. 30 for each frame of the video data of the overhead images V3-1 and V3-2, for example. These processes can be considered as control processes of the CPU 71 (video processing unit 71a) in the information processing device 70 in FIG. 7 as the AR system 5.
• In step S101, the AR system 5 sets the CG space 30. For example, it sets the viewpoint position for the CG space 30 corresponding to the shooting target space 8, and renders an image of the CG space 30 from that viewpoint position. In particular, if there is no change in the viewpoint position or image content of the CG space 30 from the previous frame, the image of the CG space from the previous frame can be used for the current frame as well.
• In step S102, the AR system 5 inputs the captured image V1 and metadata MT from the camera 2. That is, the captured image V1 of the current frame and the attitude information, focal length, angle of view, aperture value, and the like of the camera 2 at that frame timing are acquired.
• When a plurality of cameras 2 are used, the AR system 5 inputs the captured video V1 and metadata MT of each camera 2.
• In step S201, the AR system 5 generates a view frustum 40 for the cameraman for the current frame.
• The view frustum 40 for the cameraman is a view frustum 40 to be synthesized with the overhead image V3-2 that is to be transmitted to the camera 2 and displayed there.
• A view frustum 40 for the cameraman is generated separately in correspondence with each of the cameras 2.
• The AR system 5 in each camera system 1 generates the view frustum 40 to be displayed on the camera 2 of that camera system 1.
  • the AR system 5 sets the direction of the view frustum 40 within the CG space 30 according to the attitude of the camera 2, the pyramid shape according to the angle of view, the position of the focus plane 41 and depth of field range 42 based on the focal length and aperture value, and so on, and generates an image of the view frustum 40 according to the settings.
• When displaying the view frustums 40 for a plurality of cameras 2, the AR system 5 generates an image of each view frustum 40 according to the metadata MT of the corresponding camera 2.
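• The geometry of each view frustum 40 can be derived from the metadata MT in a straightforward way: the attitude gives the axis, the angle of view gives the spread of the pyramid, and the focal length and aperture value give the depth of field around the focus distance. The sketch below is one such derivation under stated assumptions (a z-up CG space, a fixed circle of confusion, and the standard thin-lens depth-of-field formulas); it is illustrative, not the embodiment's implementation.

```python
import numpy as np

def dof_limits_m(focus_dist_m, focal_len_mm, f_number, coc_mm=0.03):
    """Near/far depth-of-field limits from the thin-lens formulas,
    assuming a fixed circle of confusion."""
    s = focus_dist_m * 1000.0                       # focus distance [mm]
    f = focal_len_mm
    hyperfocal = f * f / (f_number * coc_mm) + f    # [mm]
    near = s * (hyperfocal - f) / (hyperfocal + s - 2.0 * f)
    far = float("inf") if s >= hyperfocal else s * (hyperfocal - f) / (hyperfocal - s)
    return near / 1000.0, far / 1000.0              # [m]

def frustum_rays(yaw_deg, pitch_deg, h_fov_deg, v_fov_deg):
    """Axis and four corner direction vectors of the view frustum 40 in a
    z-up CG space, from the camera attitude and angles of view."""
    yaw, pitch = np.radians([yaw_deg, pitch_deg])
    axis = np.array([np.cos(pitch) * np.cos(yaw),
                     np.cos(pitch) * np.sin(yaw),
                     np.sin(pitch)])
    right = np.cross(axis, [0.0, 0.0, 1.0])   # undefined when looking straight up/down
    right /= np.linalg.norm(right)
    up = np.cross(right, axis)
    th = np.tan(np.radians(h_fov_deg) / 2.0)
    tv = np.tan(np.radians(v_fov_deg) / 2.0)
    corners = [axis + sx * th * right + sy * tv * up
               for sx in (-1, 1) for sy in (-1, 1)]
    return axis, [c / np.linalg.norm(c) for c in corners]

print(dof_limits_m(focus_dist_m=8.0, focal_len_mm=50.0, f_number=2.8))
```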
• In step S202, the AR system 5 generates a director's view frustum 40 for the current frame.
• The director's view frustum 40 is a view frustum 40 to be transmitted to the GUI device 11 and synthesized with the overhead view video V3-1 to be displayed there.
• In this case as well, an image of the view frustum 40 is generated based on the attitude (shooting direction), angle of view, focal length, and aperture value of each camera 2.
  • the view frustum 40 for the cameraman generated in step S201 and the view frustum 40 for the director generated in step S202 may be displayed in different ways. A specific example will be described later.
• In step S203, the AR system 5 synthesizes the view frustum 40 generated for the cameraman into the CG space 30 that will become the overhead image V3-2, to generate one frame of video data for the overhead image V3-2.
• At this time, the captured image V1 may also be synthesized in correspondence with each view frustum 40.
• In step S204, the AR system 5 synthesizes the view frustum 40 generated for the director into the CG space 30 that will become the overhead image V3-1, to generate one frame of video data for the overhead image V3-1.
• Here as well, the shot image V1 may also be synthesized in correspondence with each view frustum 40.
• In step S205, the AR system 5 outputs one frame of video data of the overhead videos V3-1 and V3-2. The above process is repeated until the display of the view frustum 40 is completed.
• Here, a process of emphasizing one view frustum 40, for example the view frustum 40a as shown in FIG. 28, using the process shown in FIG. 30 will be described.
• FIG. 28 is an example of the overhead view V3-1 viewed by the director.
• In the overhead image V3-2 viewed by the cameraman, this highlighting is not performed.
• That is, in the overhead image V3-2, the view frustums 40a, 40b, and 40c are all displayed in the same display mode, for example semi-transparent white.
• FIG. 31 shows a specific example of the processes in steps S201 and S202 in FIG. 30.
• In step S201 of FIG. 30, the AR system 5 generates a view frustum 40 for each camera 2 as step S210. That is, for example, the view frustums 40a, 40b, and 40c are all generated as the same semi-transparent white images for the cameraman.
• As step S210, the AR system 5 acquires the value of the screen occupancy rate of the target subject for the captured image V1 of each camera 2.
• For this purpose, the AR system 5 constantly performs image recognition processing on the captured image V1 of each camera 2, determines whether or not the set target subject is captured, and calculates the screen occupancy rate in each frame.
• The screen occupancy rate is calculated from whether the target subject is captured and the area that the target subject occupies in the screen.
• The AR system 5 obtains the screen occupancy rate of the target subject in each captured image V1 at the current time, calculated in this manner.
• In step S211, the AR system 5 determines the optimal captured image V1. For example, the captured image V1 with the highest screen occupancy rate is determined to be optimal.
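• Steps S210 and S211 can be pictured as the following small sketch: the occupancy rate is the area of the recognized subject relative to the frame, and the camera with the largest rate is chosen. The detection format and frame size here are assumptions made only for illustration.

```python
def screen_occupancy(subject_boxes_px, frame_w, frame_h):
    """Screen occupancy rate of the target subject: total area of its
    detected bounding boxes divided by the frame area."""
    frame_area = float(frame_w * frame_h)
    area = sum((x2 - x1) * (y2 - y1) for x1, y1, x2, y2 in subject_boxes_px)
    return area / frame_area

def select_optimal_camera(detections, frame_w=1920, frame_h=1080):
    """Step S211: pick the camera whose captured image V1 currently has the
    highest screen occupancy of the target subject (None if it is not seen)."""
    best_cam, best_rate = None, 0.0
    for cam_id, boxes in detections.items():
        rate = screen_occupancy(boxes, frame_w, frame_h)
        if rate > best_rate:
            best_cam, best_rate = cam_id, rate
    return best_cam, best_rate

detections = {"cam_a": [(800, 300, 1200, 900)], "cam_b": [(0, 0, 200, 150)], "cam_c": []}
print(select_optimal_camera(detections))   # ('cam_a', ...)
```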
• In step S212, the AR system 5 generates an image of each view frustum 40, including highlighting of the view frustum 40 corresponding to the camera 2 of the optimal shot image V1, as the view frustums 40 for the director.
• For example, the view frustum 40a is generated as a highlighted red semi-transparent image, while the view frustums 40b and 40c are generated as normal white semi-transparent images.
• After performing the processes of steps S201 and S202 in FIG. 30 as shown in FIG. 31, the AR system 5 performs the processes of steps S203, S204, and S205. As a result, the overhead image V3-1 displayed on the GUI device 11 becomes as shown in FIG. 28. On the other hand, in the overhead image V3-2 displayed by each camera 2, the view frustum 40 is not highlighted.
• In the above example, the view frustum 40 to be highlighted is selected based on the screen occupancy rate of the target subject, but the selection may instead be based on the continuous shooting time.
• A processing example of step S202 in that case is shown in Fig. 32.
• Step S201 is the same as in Fig. 31.
• In step S202 of FIG. 30, the AR system 5 acquires the value of the continuous shooting time of the target subject for the captured image V1 of each camera 2 as step S215 of FIG. 32.
  • the AR system 5 constantly performs image recognition processing on the captured images V1 of each camera 2, and determines whether or not the set target subject is captured. In this case, the AR system 5 calculates the duration (number of continuous frames) during which the target subject is recognized for each captured image V1. Then, in step S215, the AR system 5 obtains the continuous shooting time calculated in this manner.
• In step S211, the AR system 5 determines the optimal captured image V1.
• In this case, the captured image V1 with the longest continuous shooting time is determined to be optimal.
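• The continuous shooting time can be kept simply as a per-camera counter of consecutive frames in which the target subject is recognized, as in the following sketch (class and field names are illustrative assumptions).

```python
from collections import defaultdict

class ContinuousCaptureTracker:
    """Counts, per camera, how many consecutive frames the target subject has
    been recognized in the captured image V1 (used in place of screen occupancy)."""
    def __init__(self):
        self.consecutive_frames = defaultdict(int)

    def update(self, cam_id, subject_detected):
        """Call once per frame with the image-recognition result for that camera."""
        if subject_detected:
            self.consecutive_frames[cam_id] += 1
        else:
            self.consecutive_frames[cam_id] = 0
        return self.consecutive_frames[cam_id]

    def optimal_camera(self):
        """Step S211 variant: the camera with the longest continuous capture."""
        if not self.consecutive_frames:
            return None
        return max(self.consecutive_frames, key=self.consecutive_frames.get)

tracker = ContinuousCaptureTracker()
for detected in (True, True, True):
    tracker.update("cam_a", detected)
tracker.update("cam_b", True)
print(tracker.optimal_camera())   # cam_a
```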
• In step S212, the AR system 5 generates an image of each view frustum 40, including a highlight of the view frustum 40 corresponding to the camera 2 of the optimal shot image V1, as the view frustums 40 for the director.
• After that, the AR system 5 performs the processes of steps S203, S204, and S205 in Fig. 30.
• As a result, the overhead image V3-1 displayed on the GUI device 11 becomes as shown in Fig. 28. This allows the director to recognize the camera 2 that has been continuously capturing the subject of interest for a long time.
  • FIG. 33A shows an overhead image V3-1 as the device display image 51 of the GUI device 11.
  • the view frustums 40a, 40b, and 40c are each displayed in the same display mode, for example, semi-transparent white.
• When a specific operation is performed on the camera 2 corresponding to the view frustum 40a, the overhead view video V3-1 becomes as shown in Fig. 33B. That is, the view frustum 40a is highlighted in a manner different from the view frustums 40b and 40c, so that this is clearly indicated to the director.
• The specific operation by the cameraman is, for example, an operation by which the cameraman notifies the director that "good footage is now being taken." If such an operation is made possible on the camera 2 side, then when the operation is performed, the AR system 5 makes the display mode of the view frustum 40 of the camera 2 on which the operation was performed different from the others in the overhead image V3-1.
  • Fig. 34 shows a concrete example of steps S201 and S202 in Fig. 30.
• In step S201 of Fig. 30, the AR system 5 generates an image of the view frustum 40 for the cameraman as step S210 of Fig. 34.
• For example, the view frustums 40a, 40b, and 40c are all generated as the same semi-transparent white images.
• In step S202 in FIG. 30, the AR system 5 first checks whether or not there has been feedback from each camera, that is, whether or not there has been a specific operation by the cameraman, in step S220 in FIG. 34, and branches the process in step S221. If no specific operation has been performed, the AR system 5 proceeds from step S221 to step S223 and generates the images of the director's view frustums 40. For example, the view frustums 40a, 40b, and 40c are all generated as the same semi-transparent white images.
• If a specific operation has been performed, the AR system 5 proceeds to step S222 and generates the images of the view frustums 40 for the director, including highlighting.
• For example, if the specific operation was performed on the camera 2 of the view frustum 40a, the view frustum 40a is generated as a red semi-transparent image, while the view frustums 40b and 40c are generated as white semi-transparent images.
• After that, the AR system 5 performs the processes of steps S203, S204, and S205 in Fig. 30.
• As a result, the overhead image V3-1 displayed on the GUI device 11 becomes as shown in Fig. 33A or Fig. 33B.
• That is, when no specific operation has been performed, the image becomes as shown in Fig. 33A, in which the view frustums 40a, 40b, and 40c are displayed in the same display mode; when a specific operation has been performed, it becomes as shown in Fig. 33B.
• FIG. 35A shows an overhead view image V3-1 as the device display image 51 of the GUI device 11.
• In this case, the view frustums 40a, 40b, and 40c are displayed in the same display mode.
• Here, suppose that the view frustums 40a and 40b come to overlap on the image as shown in FIG. 35B.
• In that case, the view frustums 40a and 40b are highlighted in a manner different from usual so that the director can easily recognize the overlap.
  • Fig. 36 shows a concrete example of steps S201 and S202 in Fig. 30.
• In step S201 of Fig. 30, the AR system 5 generates an image of the view frustum 40 for the cameraman as step S210 of Fig. 36.
• For example, the view frustums 40a, 40b, and 40c are all generated as the same semi-transparent white images.
• In step S202 of FIG. 30, the AR system 5 first sets the size, shape, and orientation of the view frustum 40 of each camera 2 based on the metadata MT of each camera 2 in step S230 of FIG. 36.
• In step S231, the AR system 5 checks the arrangement of each view frustum 40 within the three-dimensional coordinates of the CG space 30 of the current frame. This makes it possible to check whether the view frustums 40 overlap.
• In step S232, the AR system 5 branches the process depending on whether or not there is an overlap. If there is no overlap of the view frustums 40, the AR system 5 proceeds to step S234 and generates the images of the view frustums 40 for the director. For example, the view frustums 40a, 40b, and 40c are all generated as the same semi-transparent white images.
• If there is an overlap, the AR system 5 proceeds to step S233 and generates the images of the view frustums 40 for the director, including highlighting.
• For example, the overlapping view frustums, such as the view frustums 40a and 40b, are generated as red semi-transparent images, and the non-overlapping view frustum 40c is generated as a white semi-transparent image.
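• The overlap check of steps S231/S232 can be approximated by sampling points inside one frustum and testing whether any of them lies inside another, as in the sketch below. The frustum representation (numpy vectors) and sampling density are assumptions for illustration; a production implementation might instead intersect the frustum planes exactly.

```python
import numpy as np

def point_in_frustum(p, origin, axis, right, up, half_h_rad, half_v_rad, far_m):
    """True if point p lies inside the finite view frustum defined by its
    origin, orthonormal axis/right/up vectors, half angles and far distance."""
    d = np.asarray(p, dtype=float) - origin
    depth = np.dot(d, axis)
    if depth <= 0.0 or depth > far_m:
        return False
    return (abs(np.dot(d, right)) <= depth * np.tan(half_h_rad) and
            abs(np.dot(d, up)) <= depth * np.tan(half_v_rad))

def frustums_overlap(fr_a, fr_b, depth_samples=8, grid=3):
    """Approximate overlap test: sample a grid of points inside frustum A and
    report whether any of them falls inside frustum B (one-way approximation;
    for robustness the test can also be run with A and B swapped).
    Each frustum is a dict with origin/axis/right/up/half_h/half_v/far."""
    for t in np.linspace(0.1, 1.0, depth_samples):
        depth = t * fr_a["far"]
        for sx in np.linspace(-1.0, 1.0, grid):
            for sy in np.linspace(-1.0, 1.0, grid):
                p = (fr_a["origin"] + depth * fr_a["axis"]
                     + sx * depth * np.tan(fr_a["half_h"]) * fr_a["right"]
                     + sy * depth * np.tan(fr_a["half_v"]) * fr_a["up"])
                if point_in_frustum(p, fr_b["origin"], fr_b["axis"], fr_b["right"],
                                    fr_b["up"], fr_b["half_h"], fr_b["half_v"],
                                    fr_b["far"]):
                    return True
    return False
```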
• After that, the AR system 5 performs the processes of steps S203, S204, and S205 in Fig. 30.
• As a result, the overhead image V3-1 displayed on the GUI device 11 becomes as shown in Fig. 35A or Fig. 35B.
• That is, if there is no overlap of the view frustums 40, the image becomes as shown in Fig. 35A, and if there is overlap, the image becomes as shown in Fig. 35B.
  • This allows the director, etc. to easily recognize the situation in which the same subject is being shot from different viewpoints by multiple cameras 2. This makes it possible to clarify instructions to each cameraman. It is also convenient for switching main line images when it is desired to switch images of the same subject.
• In the overhead image V3-2 on the camera 2 side, the view frustums 40a, 40b, and 40c are displayed in the same display mode regardless of the overlap.
• Next, an example will be described in which, when view frustums 40 overlap, one view frustum 40 is preferentially displayed.
• FIG. 37 shows an overhead image V3-1 as the device display image 51 of the GUI device 11.
• In this example, the view frustums 40a, 40b, 40c, and 40d overlap, but the view frustum 40a is set as the priority, and the focus plane 41 and depth of field range 42 of the view frustum 40a are displayed in the overlapping portion.
  • FIG. 38 shows a concrete example of steps S201 and S202 in Fig. 30.
• In step S201 of Fig. 30, the AR system 5 generates an image of the view frustum 40 for the cameraman as step S210 of Fig. 38.
• For example, images are generated for the view frustums 40a, 40b, 40c, and 40d. No particular priority setting is made for the images of the view frustums 40 for the cameraman.
• In step S202 of FIG. 30, the AR system 5 first sets the size, shape, and orientation of the view frustum 40 of each camera 2 based on the metadata MT of each camera 2 in step S240 of FIG. 38.
• In step S241, the AR system 5 checks the arrangement of each view frustum 40 within the three-dimensional coordinates of the CG space 30 of the current frame. This makes it possible to check whether the view frustums 40 overlap.
• In step S242, the AR system 5 branches the process depending on whether or not there is an overlap. If there is no overlap of the view frustums 40, the AR system 5 proceeds to step S244 and generates the images of the view frustums 40 for the director. For example, images are generated for the view frustums 40a, 40b, 40c, and 40d.
• If there is an overlap, the AR system 5 proceeds to step S245 to determine the view frustum 40 that has priority among the overlapping view frustums 40.
• Alternatively, the AR system 5 may determine a view frustum 40 that has priority among all the view frustums 40, including those that do not overlap.
• For example, the view frustum 40 selected for highlighting because it is photographing a subject of interest, or because of a specific operation by the cameraman, may be set as the priority.
• In step S246, the AR system 5 generates the images of the view frustums 40 for the director.
• For the view frustum 40 set as the priority, an image is generated in which the focus plane 41 and depth of field range 42 are displayed as usual.
• For the other view frustums 40, images are generated in which the focus plane 41 and depth of field range 42 are not displayed in the areas that overlap with the view frustum 40 set as the priority.
• Alternatively, for the view frustums 40 that are not set as the priority, images may be generated in which the focus plane 41 and depth of field range 42 are not displayed at all.
• After that, the AR system 5 performs the processes of steps S203, S204, and S205 in Fig. 30.
• As a result, as shown in Fig. 37, the overhead image V3-1 displayed on the GUI device 11 becomes an image in which the focus plane 41 and the depth of field range 42 of the prioritized view frustum 40 can be clearly recognized even when view frustums 40 overlap.
  • the view frustums 40a, 40b, 40c, and 40d are displayed as shown in FIG.
• In the above, priority is set in the director's overhead view V3-1, but priority may also be set in the cameraman's overhead view V3-2.
• For the cameraman, it is desirable that the view frustum 40 of the camera 2 he is operating be set as the priority. Therefore, in step S201 in Fig. 30, where the view frustums for the cameraman are generated, the same processing as steps S240 to S246 in Fig. 38 may be performed.
• In that case, the view frustum to be given priority in step S245 is the view frustum 40 of the camera 2 itself. This allows the cameraman to clearly view the focus plane 41 and depth of field range 42 of the camera 2 he is operating even if its view frustum 40 overlaps with the view frustum 40 of another camera 2.
• When priority is set in the overhead view video V3-2 in this way, priority may also be set in the overhead view video V3-1 viewed by the director as described above, or priority may not be set there. Even if priority is set for both the overhead images V3-1 and V3-2, the conditions for determining the prioritized view frustum 40 differ, so the overhead image V3-1 and the overhead images V3-2 displayed by the respective cameras 2 will not necessarily be displayed in the same manner.
  • FIG. 40A shows an overhead view image V3-1 as the device display image 51 of the GUI device 11.
  • view frustums 40a, 40b, and 40c are displayed.
  • Fig. 40A also shows an overhead image V3-2 as the viewfinder display image 50 of camera 2.
  • the overhead image V3-2 is synthesized in a corner of the screen of the shot image V1.
  • Fig. 40B shows an enlarged view of the overhead image V3-2.
• FIG. 39A shows an example of a case where the director has performed an instruction operation on the camera 2 of the view frustum 40b.
  • the director may perform an operation such as dragging view frustum 40b on the GUI device 11 to cause instruction frustum 40DR to be displayed.
  • This is an instruction from the director to the cameraman of camera 2 of view frustum 40b to change the shooting direction to the direction of instruction frustum 40DR.
  • the AR system 5 displays the instruction frustum 40DR for the view frustum 40b also in the overhead image V3-2 viewed by the cameraman, as shown in Figures 40A and 40B.
  • the cameraman operating the camera 2 of the view frustum 40b can comply with the director's instructions by changing the shooting direction so that the view frustum 40b coincides with the instruction frustum 40DR.
  • the instruction frustum 40DR may be configured to be able to specify not only the shooting direction but also the angle of view and the focus plane 41.
  • the director may be able to move the focus plane 41 forward or backward, widen the angle of view (change the inclination of the pyramid), etc., by operating the instruction frustum 40DR.
  • the cameraman can adjust the focus so that the focus plane 41 of the view frustum 40b coincides with the pointing frustum 40DR, and can adjust the angle of view so that the inclinations of the pyramids coincide with each other.
  • overhead image V3-1 in FIG. 39A and overhead image V3-2 in FIG. 40A and FIG. 40B show examples in which the viewpoint position relative to CG space 30 is different.
• The director or cameraman can change the viewpoint position of the overhead images V3-1 and V3-2 by an operation.
  • the example in the figure shows that the CG space 30 is not necessarily displayed as seen from the same viewpoint position in overhead image V3-1 and overhead image V3-2.
  • FIG. 39B shows the state in which the director has also given instructions to the view frustum 40a, causing the instruction frustum 40DR to be displayed. In this way, in the overhead image V3-1, instructions can be given to each view frustum 40.
• In this case, the instruction frustum 40DR of the previous instruction (the instruction to the view frustum 40b) may be left displayed as it is, so that the director can confirm the currently valid instructions. It is conceivable that an instruction frustum 40DR is erased from the overhead images V3-1 and V3-2 when the view frustum 40 of the designated camera 2 substantially coincides with it. Alternatively, the instruction frustum 40DR may be erased from the overhead images V3-1 and V3-2 by a cancellation operation by the director, for example, to accommodate cancellation or change of instructions.
• In the overhead image V3-2 viewed by each cameraman, the instruction frustums 40DR for all the cameras 2 may be displayed, or only the instruction frustum 40DR for that cameraman's own camera 2 may be displayed.
• In the former case, each cameraman can grasp the overall instructions being issued.
• In the latter case, the cameraman can easily recognize the instructions given to him by the director.
  • FIG. 41 shows specific examples of steps S201, S202, S203, and S204 in FIG. 30.
• In step S201 in FIG. 30, the AR system 5 performs the processes of steps S250 to S254 in FIG. 41.
• In step S250, the AR system 5 generates the images of the view frustums 40 for the cameraman.
• For example, images are generated as the view frustums 40a, 40b, and 40c.
• In step S251, the AR system 5 checks whether or not there has been an instruction operation by the director. If there has been no instruction operation, the process proceeds to step S202 in FIG. 30. If an instruction operation has been performed, the AR system 5 proceeds from step S251 to step S252 in FIG. 41 and branches the process depending on the display mode of the instruction frustum 40DR.
• The display mode in this case can be selected by the cameraman: either a mode in which only the instruction frustum 40DR directed to the cameraman himself is displayed, or a mode in which all instruction frustums 40DR are displayed.
• If the mode is one in which only the instruction frustum 40DR directed to the cameraman is displayed, the AR system 5 proceeds to step S253 and generates an image of that instruction frustum 40DR. However, if the instruction from the director is not directed to the camera 2 that is the subject of the overhead image V3-2 generation process, it is not necessary to generate an image of the instruction frustum 40DR in step S253.
• In this case, the video data transmitted to each camera 2 as the overhead video V3-2 will have different display contents.
• If the mode is one in which all instruction frustums 40DR are displayed, the AR system 5 proceeds to step S254 and generates images of the instruction frustums 40DR that are valid at that time.
  • the AR system 5 performs the processing of step S202 in FIG. 30 as shown in steps S260 to S262 in FIG. 41.
• In step S260, the AR system 5 generates the images of the view frustums 40 for the director.
• For example, images are generated as the view frustums 40a, 40b, and 40c.
• In step S261, the AR system 5 checks whether or not there has been an instruction operation by the director. If there has been no instruction operation, the process proceeds to step S203 in FIG. 30. If an instruction operation has been performed, the AR system 5 proceeds from step S261 to step S262 in FIG. 41 and generates an image of the instruction frustum 40DR that is valid at that time.
• In step S203 in FIG. 30, the AR system 5 performs the processes of steps S255 and S256 in FIG. 41.
• In step S255, the AR system 5 synthesizes the view frustum 40 and the instruction frustum 40DR with the overhead view image V3-2, thereby generating image data of the overhead view image V3-2 as shown in FIG. 40B.
• In step S256, the AR system 5 synthesizes the overhead image V3-2 and the captured image V1 to generate image data of a composite image as shown in FIG. 40A.
  • the overhead view image V3-2 and the photographed image V1 may be combined on the camera 2 side.
• In step S204 in FIG. 30, the AR system 5 performs the process of step S265 in FIG. 41.
• In step S265, the AR system 5 synthesizes the view frustum 40 and the instruction frustum 40DR with the overhead view image V3-1, thereby generating image data of the overhead view image V3-1 as shown in Figures 39A and 39B.
• In step S205 of FIG. 30, the overhead view V3-1 is transmitted to the GUI device 11, and the overhead view V3-2 corresponding to each camera 2 is transmitted to that camera 2.
  • This allows the director to check his/her own instructions on the instruction frustum 40DR in the overhead view V3-1, and each cameraman can visually check the instructions from the director through the instruction frustum 40DR.
• The instruction frustum 40DR visible to the cameraman is displayed in the overhead image V3-2, and it is advisable to control the viewpoint position of the overhead image V3-2 so that the instructions are easier for the cameraman to understand.
• Figures 42A and 42B show overhead images V3-2 as the viewfinder display image 50 of the camera 2. These are overhead images V3-2 with the position of the camera 2 of the view frustum 40c as the viewpoint position, and are images viewed by the cameraman of that camera 2.
• In Fig. 42A, an instruction frustum 40DR for the view frustum 40c is displayed, and an instruction frustum 40DR for the view frustum 40a of another camera 2 is also displayed.
• In Fig. 42B, an instruction frustum 40DR for the view frustum 40c is displayed, but an instruction frustum 40DR for the view frustum 40a of another camera 2 is not displayed.
  • the AR system 5 performs steps S201 and S202 in FIG. 30 as shown in FIG. 41. Then, it performs step S203 in FIG. 30 as shown in FIG. 43.
• In step S280, the AR system 5 branches the process depending on whether or not the instruction frustum 40DR is to be displayed in the current frame. If the instruction frustum 40DR is not to be displayed in the overhead image V3-2 for the camera 2 being processed, the AR system 5 proceeds to step S281 and generates video data in which the images of the view frustums 40 are synthesized with the overhead image V3-2.
• If the instruction frustum 40DR is to be displayed, the AR system 5 proceeds to step S282 and sets the arrangement of the view frustums 40 and the instruction frustum 40DR within the 3D space coordinates for generating the overhead image V3-2. Then, in step S283, the AR system 5 sets the viewpoint position within the 3D space coordinates. That is, the coordinates of the position of the specific camera 2 to which this overhead video V3-2 is to be transmitted, among the multiple cameras 2, are set as the viewpoint position.
• In step S284, the AR system 5 generates video data for the overhead image V3-2 as CG in which the view frustums 40 and the instruction frustum 40DR are combined, rendered from the set viewpoint position.
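• Rendering the overhead image V3-2 from the position of the transmitting camera 2 amounts to building a view matrix whose eye point is that camera's position in the CG space 30. The following is a minimal look-at sketch; the coordinate values and the z-up convention are assumptions for illustration.

```python
import numpy as np

def look_at(eye, target, up=(0.0, 0.0, 1.0)):
    """Right-handed look-at view matrix: renders the CG space 30 from the
    position of the camera 2 (step S283) toward a chosen target point."""
    eye, target, up = (np.asarray(v, dtype=float) for v in (eye, target, up))
    forward = target - eye
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, up)
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)
    view = np.eye(4)
    view[0, :3], view[1, :3], view[2, :3] = right, true_up, -forward
    view[:3, 3] = -view[:3, :3] @ eye
    return view

# Viewpoint of this overhead image V3-2: the position of the transmitting camera 2.
camera_pos = (12.0, -30.0, 4.0)     # illustrative coordinates in the CG space 30
target_point = (0.0, 0.0, 0.0)      # e.g. the centre of the stadium pitch
print(look_at(camera_pos, target_point))
```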
• The viewfinder display image 50 can be switched by the cameraman between the overhead image V3-2 as shown in FIG. 42A and the shot image V1 as shown in FIG. 44.
• Since the cameraman needs to constantly check the captured image V1 (i.e., the live view) of the camera 2 he is operating while shooting, the captured image V1 must be displayable in the viewfinder. It is conceivable to composite the overhead image V3-2 onto the shot image V1 as previously shown in FIG. 40A, but then the overhead image V3-2 may be small and the instruction frustum 40DR may be difficult to see. Therefore, it is advisable to allow switching between the overhead image V3-2 as shown in FIG. 42A and the photographed image V1 as shown in FIG. 44 at any time, with each displayed in full screen.
• In FIG. 44, an instruction direction 54 and a coincidence rate 53 are displayed as instruction information on the photographed image V1.
• The instruction direction 54 is the shooting direction designated by the instruction frustum 40DR.
• The coincidence rate 53 indicates the degree of coincidence between the current view frustum 40 and the instruction frustum 40DR. When the coincidence rate reaches 100%, the current view frustum 40 matches the instruction frustum 40DR.
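• One way to compute such a coincidence rate is to combine how closely the current shooting direction matches the instructed direction with how closely the angles of view match. The weighting and thresholds in the sketch below are arbitrary assumptions for illustration; the embodiment does not specify a formula.

```python
import numpy as np

def coincidence_rate(current_dir, target_dir, current_fov_deg, target_fov_deg,
                     max_angle_deg=90.0):
    """Rough coincidence rate 53 between the current view frustum 40 and the
    instruction frustum 40DR: 100 when both the shooting direction and the
    angle of view match."""
    a, b = (np.asarray(v, dtype=float) for v in (current_dir, target_dir))
    a, b = a / np.linalg.norm(a), b / np.linalg.norm(b)
    angle = np.degrees(np.arccos(np.clip(np.dot(a, b), -1.0, 1.0)))
    dir_score = max(0.0, 1.0 - angle / max_angle_deg)
    fov_score = max(0.0, 1.0 - abs(current_fov_deg - target_fov_deg) / target_fov_deg)
    return 100.0 * (0.7 * dir_score + 0.3 * fov_score)   # arbitrary weighting

print(round(coincidence_rate((1.0, 0.05, 0.0), (1.0, 0.0, 0.0), 42.0, 40.0), 1))
```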
  • the cameraman can confirm that the director has given instructions even when he is normally viewing the shot image V1, and can follow the instructions by relying on the instruction direction 54 and the coincidence rate 53. If necessary, the screen can also be switched to the overhead image V3-2 to check the instruction frustum 40DR.
• An example of this processing is shown in FIG. 45.
• The AR system 5 performs the processes of steps S270 to S273 in FIG. 45 in step S201 in FIG. 30. Furthermore, the AR system 5 performs the processes of steps S275 to S278 in FIG. 45 in step S203 in FIG. 30.
• In step S270, the AR system 5 checks whether the display of the view frustum 40 is OFF in the current frame. In other words, it checks whether the current frame is displaying the captured image V1 instead of the overhead image V3-2.
• If the display of the view frustum 40 is OFF, the AR system 5 ends the processing of step S201. In other words, there is no need to generate images of the view frustum 40 and the instruction frustum 40DR.
• If the overhead image V3-2 is selected as the viewfinder display image 50, the AR system 5 generates image data of the view frustum 40 based on the metadata MT in step S271.
• In step S272, the AR system 5 determines whether or not to display the instruction frustum 40DR.
• The instruction frustum 40DR is displayed when there is an instruction from the director.
• At this time, the selection between the mode for displaying all instruction frustums 40DR and the mode for displaying only the instruction frustum 40DR for the cameraman's own camera 2 is also confirmed.
• If the instruction frustum 40DR is not to be displayed, the process of step S201 is ended. If the instruction frustum 40DR is to be displayed in the overhead image V3-2, the AR system 5 proceeds to step S273 and generates image data of the instruction frustum 40DR.
• In step S203 in FIG. 30, the AR system 5 checks whether the display of the view frustum 40 is OFF, as in step S275 in FIG. 45. This is to check whether the captured image V1 is currently being displayed.
• If the camera 2 being processed is currently displaying the overhead image V3-2, the AR system 5 proceeds to step S278, where it synthesizes the image data of the view frustum 40 with the image data of the overhead image V3-2, and, if image data of the instruction frustum 40DR has been generated, it also synthesizes the instruction frustum 40DR.
• If the captured image V1 is currently being displayed, the process branches in step S276 depending on whether or not there is an instruction from the director. If there is no instruction, the process of step S203 ends. If there is an instruction from the director, in step S277, the captured image V1 is set so as to display the instruction direction 54 and the coincidence rate 53.
• Then, in step S205 of FIG. 30, video data is output to the camera 2. That is, video data of the shot video V1 as shown in FIG. 44 or video data of the overhead video V3-2 as shown in FIG. 42A is output to the camera 2.
  • the viewfinder display image 50 may be switched between the shot image V1, the overhead image V3-2, and a composite image as shown in FIG. 40A by operation of the cameraman.
  • Fig. 46A shows a state in which a photographed image V1 and an overhead image V3-2 are displayed as a viewfinder display image 50 of camera 2.
  • the overhead image V3-2 is composited into a corner of the screen of the photographed image V1.
  • Fig. 46B shows an enlarged view of the overhead image V3-2.
• In this overhead image V3-2, for example, only the view frustum 40 of that camera 2 itself is displayed, whereas in the overhead image V3-1 displayed on the GUI device 11 on the director's side, the view frustums 40 of all the cameras 2 are displayed as described with reference to FIG. 28 and the like.
• Marker frustums 40M1 and 40M2 are displayed in response to the cameraman registering subject positions and shooting directions, that is, marking directions in which he or she frequently wishes to shoot.
• The marker frustums 40M1 and 40M2 may be displayed in a manner different from, for example, the view frustum 40.
• The marker frustum 40M1 and the marker frustum 40M2 may also be displayed in manners different from each other. For example, when the view frustum 40 is semi-transparent white, the marker frustum 40M1 may be semi-transparent yellow and the marker frustum 40M2 semi-transparent light blue.
  • the positions of the marker frustums 40M1 and 40M2 may be indicated by markers 55M1 and 55M2 on the captured image V1.
  • the correspondence may be clearly indicated by making the marker 55M1 yellow like the marker frustum 40M1 and making the marker 55M2 light blue like the marker frustum 40M2.
• FIG. 48 shows a specific example of steps S201, S202, S203, and S204 in FIG. 30.
• In step S201 of FIG. 30, the AR system 5 performs the processes of steps S300 to S303 in FIG. 48.
• In step S300, the AR system 5 generates image data of the view frustum 40 based on the metadata MT.
• In this case, only the view frustum 40 corresponding to the camera 2 being processed may be generated, or the view frustums 40 corresponding to all of the cameras 2 may be generated.
• In step S301, the AR system 5 determines whether or not a marking operation has been performed on the camera 2 being processed.
• A marking operation is an operation for adding or deleting a marking. If no marking operation has been performed, the process of step S201 ends.
• If a marking operation has been performed, the AR system 5 performs, in step S302, a process of adding a marking point to the registered markings or deleting a marking from the registered markings for the camera 2 being processed. Then, in step S303, the AR system 5 generates image data of the marker frustums 40M as necessary. That is, if there are markings registered at that time, image data of the marker frustums 40M is generated.
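• The marking registration handled in steps S302 and S303 can be thought of as a small per-camera registry of marked directions, from which a marker frustum 40M is generated for each entry. The data layout below is an illustrative assumption.

```python
class MarkingRegistry:
    """Per-camera registry of marked shooting directions (steps S302/S303).
    Each marking stores the direction and angle of view from which a marker
    frustum 40M can be generated."""
    def __init__(self):
        self._markings = {}   # cam_id -> {marking_id: (yaw_deg, pitch_deg, fov_deg)}

    def add(self, cam_id, marking_id, yaw_deg, pitch_deg, fov_deg):
        self._markings.setdefault(cam_id, {})[marking_id] = (yaw_deg, pitch_deg, fov_deg)

    def delete(self, cam_id, marking_id):
        self._markings.get(cam_id, {}).pop(marking_id, None)

    def marker_frustums(self, cam_id):
        """Markings currently registered for this camera; one marker frustum
        40M image would be generated for each entry."""
        return dict(self._markings.get(cam_id, {}))

registry = MarkingRegistry()
registry.add("cam_c", "40M1", yaw_deg=15.0, pitch_deg=-3.0, fov_deg=30.0)
registry.add("cam_c", "40M2", yaw_deg=-40.0, pitch_deg=-2.0, fov_deg=25.0)
print(registry.marker_frustums("cam_c"))
```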
• In step S202 of FIG. 30, the AR system 5 generates the view frustums 40 for the director in step S310 of FIG. 48.
• In this case, image data of the view frustums 40 corresponding to all cameras 2 is generated.
• In step S203 in FIG. 30, the AR system 5 performs the processes of steps S320 and S321 in FIG. 48.
• In step S320, the AR system 5 synthesizes the view frustum 40 with the CG data as the overhead view image V3-2. If there are marking registrations, the AR system 5 also synthesizes the image data of the marker frustums 40M.
• In step S321, the AR system 5 combines the markers 55M with the captured image V1 in accordance with the marking registrations. In this way, the video data of the overhead video V3-2 and the captured video V1 to be transmitted to the camera 2 are generated.
• In step S204 in FIG. 30, the AR system 5 performs the process of step S330 in FIG. 48.
• In step S330, the AR system 5 synthesizes the view frustums 40 with the CG data as the overhead image V3-1. As a result, video data for the overhead view video V3-1 is generated.
• In step S205 of FIG. 30, the video data of the overhead view V3-2 and the shot video V1 are transmitted to the camera 2, and the video data of the overhead view V3-1 is transmitted to the GUI device 11.
  • This allows the cameraman to visually recognize the marker frustum 40M and the marker 55M in accordance with the marking registration operation. From the director's perspective, by not displaying the marker frustum 40M and marker 55M, the overhead image V3-1 does not become unnecessarily cluttered.
• FIG. 49A shows an example in which an overhead image V3-1 is displayed as the device display image 51 of the GUI device 11, and FIG. 49B shows an example in which an overhead image V3-2 is simultaneously displayed as the viewfinder display image 50 of the camera 2.
• In the overhead image V3-1, the view frustums 40a, 40b, and 40c of the cameras 2 are displayed in the same manner, for example in semi-transparent white.
• On the other hand, in the overhead image V3-2 displayed by the camera 2 corresponding to the view frustum 40b, that view frustum 40b is highlighted in, for example, a semi-transparent red, while the view frustums 40a and 40c of the other cameras 2 are each displayed in the usual semi-transparent white.
• Similarly, in the overhead image V3-2 displayed by the camera 2 corresponding to the view frustum 40a, that view frustum 40a is highlighted in, for example, a semi-transparent red, and the view frustums 40b and 40c of the other cameras 2 are each displayed in the normal semi-transparent white.
• In the overhead image V3-2 displayed by the camera 2 corresponding to the view frustum 40c, that view frustum 40c is highlighted in, for example, a semi-transparent red, and the view frustums 40a and 40b of the other cameras 2 are each displayed in the normal semi-transparent white.
  • the director can check the view frustum 40 of each camera 2 evenly, and the cameraman can easily check the view frustum 40 of the camera 2 he is operating.
• FIG. 50A shows an example in which an overhead image V3-1 is displayed as the device display image 51 of the GUI device 11, and FIG. 50B shows an example in which an overhead image V3-2 is simultaneously displayed as the viewfinder display image 50 of the camera 2.
• In the overhead image V3-1, the view frustums 40a, 40b, and 40c of the cameras 2 are displayed in the same manner, for example in semi-transparent white.
• In the overhead image V3-2 in FIG. 50B, the viewpoint position is set to the position of the camera 2 corresponding to the view frustum 40b.
• Likewise, in the overhead image V3-2 displayed by the camera 2 corresponding to the view frustum 40a, that view frustum 40a is highlighted in, for example, a semi-transparent red color, the view frustums 40b and 40c of the other cameras 2 are each displayed in a normal semi-transparent white color, and the viewpoint position is set to the position of the camera 2 of the view frustum 40a.
  • the overhead view image V3-2 of the camera 2 corresponding to the view frustum 40c also has its own view frustum 40 highlighted, and the viewpoint position is the position of the camera 2 of the view frustum 40c.
  • the director can check the view frustum 40 of each camera 2 evenly, and the cameraman can check the view frustum 40 of the camera 2 he is operating from a viewpoint similar to his own.
  • FIG. 51 shows an example in which an overhead image V3-1 is displayed as the device display image 51 of the GUI device 11.
  • an example is shown in which two overhead images are synthesized and displayed as overhead images V3-1a and V3-1b.
• For example, the overhead image V3-1a is an image from a viewpoint diagonally above the stadium, and the overhead image V3-1b is an image from a viewpoint directly above.
• Since the director needs to grasp the situation of the cameras 2 as a whole, it is desirable to display multiple overhead images V3-1 from different viewpoints in this way.
  • FIG. 52 shows a specific example of steps S201, S202, S203, and S204 in FIG.
• In step S201 of FIG. 30, the AR system 5 generates image data of the view frustum 40 for the cameraman based on the metadata MT in step S410 of FIG. 52.
• In this case, the image data is generated in a state in which the view frustum 40 corresponding to the camera 2 being processed is highlighted.
• In step S202 of FIG. 30, the AR system 5 generates the view frustums 40 for the director in step S420 of FIG. 52.
• In this case, image data in the same display mode is generated for the view frustums 40 corresponding to all cameras 2.
• In step S203 in FIG. 30, the AR system 5 performs the processes of steps S430 and S431 in FIG. 52.
• In step S430, the AR system 5 sets the layout of the image data of the view frustums 40 within the 3D coordinate space for the overhead image V3-2.
• In step S431, the AR system 5 generates video data as the overhead image V3-2, with the position of the target camera 2 in the 3D coordinate space set as the viewpoint position. In this manner, the video data of the overhead video V3-2 to be transmitted to the camera 2 is generated.
• In step S204 in FIG. 30, the AR system 5 performs the processes of steps S440, S441, and S442 in FIG. 52.
• In step S440, the AR system 5 synthesizes the view frustums 40 with the CG data as the overhead view image V3-1a.
• In step S441, the AR system 5 synthesizes the view frustums 40 with the CG data as the overhead view image V3-1b.
• In step S442, the AR system 5 generates video data that combines the overhead view image V3-1a and the overhead view image V3-1b on one screen. This generates the video data of the overhead view image V3-1 to be sent to the GUI device 11.
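• The combination in step S442 can be as simple as placing the two rendered frames side by side on one output frame, as in the following sketch (frame sizes and the gap width are illustrative assumptions).

```python
import numpy as np

def combine_side_by_side(frame_a, frame_b, gap_px=8):
    """Step S442 idea: place the two rendered overhead images V3-1a and V3-1b
    side by side on a single output frame (both HxWx3 uint8 arrays of the
    same height)."""
    h = frame_a.shape[0]
    gap = np.zeros((h, gap_px, 3), dtype=frame_a.dtype)
    return np.concatenate([frame_a, gap, frame_b], axis=1)

v3_1a = np.zeros((540, 960, 3), dtype=np.uint8)   # viewpoint diagonally above
v3_1b = np.zeros((540, 960, 3), dtype=np.uint8)   # viewpoint directly above
print(combine_side_by_side(v3_1a, v3_1b).shape)   # (540, 1928, 3)
```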
  • step S205 of FIG. 30 the video data of the overhead view V3-2 is transmitted to the camera 2, and the video data of the overhead view V3-1 is transmitted to the GUI device 11.
  • the cameraman to view, for example, the overhead image V3-2 as shown in FIG. 50B
  • the director to view, for example, the overhead images V3-1a and V3-1b as shown in FIG.
  • the captured image V1 may be displayed together with the view frustum 40 as described in Fig. 9 to Fig. 27.
• The examples described in the embodiments can be implemented in combination with one another.
<5. Summary and Modifications>
• According to the above embodiment, the following effects can be obtained.
• An information processing device 70 as the AR system 5 is equipped with an image processing unit 71a that generates image data for simultaneously displaying, on one screen, an overhead image V3 of the target space 8, a view frustum 40 (shooting range presentation image) that presents the shooting range of the camera 2 within the overhead image V3, and the captured image V1 of the camera 2 (see Figures 7 and 19).
• This allows the viewer to easily grasp the correspondence between the image of the camera 2 and the position in space.
  • the video processor 71a generates video data that causes the captured video V1 to be displayed within the view frustum 40 (see FIGS. 9 to 14).
• That is, the image processing unit 71a generates image data in which the captured image V1 is displayed in a state in which it is arranged within the range of the shooting range presentation image (view frustum 40).
  • the image processing unit 71a generates image data in which the captured image V1 is displayed at a position within the depth of field range shown on the view frustum 40 (see Figures 9 and 10).
  • the depth of field range 42 is displayed within the view frustum 40, and the captured image V1 is displayed inside the display of the depth of field range 42.
  • This causes the captured image V1 to be displayed at a position close to the actual position of the subject within the overhead image V3. Therefore, the viewer can easily grasp the relationship between the shooting range of the view frustum 40, the actual captured image V1, and the position of the captured subject.
  • the video processor 71a generates video data in which the captured video V1 is displayed on the focus plane 41 shown on the view frustum 40 (see FIG. 9).
  • a focus plane 41 is displayed within the view frustum 40, and the captured image V1 is displayed on the focus plane 41. This allows the viewer to easily confirm the focus position of the camera 2 and the image of the subject at that position.
  • the image processing unit 71a generates image data in which the captured image V1 is displayed farther away than the depth of field range 42 when viewed from the frustum starting point 46 (see Figures 12 to 14).
  • the view frustum 40 is an image that spreads in a quadrangular pyramid shape, and the area of the cross section increases as it goes farther. Therefore, by displaying the captured image V1 on or near the frustum far end surface 45, it is possible to display the captured image V1 relatively large within the view frustum 40. This is suitable, for example, when the contents of the captured image V1 are to be confirmed.
  • the image processing unit 71a generates image data in which the captured image V1 is displayed at a position closer to the frustum origin 46 (a surface 47 near the frustum origin) than the depth of field range 42 shown on the view frustum 40 (see Figure 11).
• For example, when it is desired to check the depth of field range 42 or the focus plane 41 in the view frustum 40, or when it is difficult to display the image on the frustum far end surface 45, it is preferable to display the captured image V1 at a position close to the frustum starting point 46.
  • an image generation control unit 71b controls the generation of image data by variably setting the display position of the captured image V1, which is simultaneously displayed on one screen together with the overhead image V3 and the view frustum 40 (see Figures 7, 23, and 24).
  • the display position of the captured image V1 is set as any position inside the view frustum 40 or any position outside the view frustum 40. By setting an appropriate position, it is possible to make it easier for the viewer to grasp the captured image V1, and to prevent the view frustum 40 and the captured image V1 from interfering with each other.
• The image generation control unit 71b determines whether to change the display position of the photographed video V1, and changes the setting of the display position of the photographed video V1 in accordance with the determination result (see FIG. 24). For example, a change determination is performed so that the display position of the captured image V1 is automatically changed to an appropriate position, whereby the view frustum 40 and the captured image V1 are displayed in a positional relationship that is appropriate for the viewer, for example one that provides good visibility or makes the correspondence easy to understand.
  • the image generation control unit 71b determines whether or not it is necessary to change the display position of the captured image V1 based on the positional relationship between the view frustum 40 and the object represented in the overhead image V3 (see steps S160 and P1 in Figure 24). For example, when the far end side of the view frustum 40 is embedded in the ground GR or a structure CN in the overhead image V3, the image may become unnatural or may not be displayed at all when displayed on the frustum far end surface 45. In such a case, the image generation control unit 71b determines that the position setting needs to be changed and changes the position setting of the captured image V1. This makes it possible to automatically provide an easily viewable captured image V1.
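  • A minimal sketch of such a judgment, under the simplifying assumptions that the ground GR is the plane z = 0 and that structures CN are approximated by axis-aligned bounding boxes (all names below are hypothetical):

      import numpy as np

      def far_end_obstructed(far_corners, ground_z=0.0, boxes=()):
          """far_corners: corners of the frustum far end surface 45 in world coordinates."""
          for p in far_corners:
              if p[2] < ground_z:                    # buried in the ground GR
                  return True
              for bmin, bmax in boxes:               # inside a structure CN
                  if np.all(p >= bmin) and np.all(p <= bmax):
                      return True
          return False

      # If True, the captured image V1 is moved, e.g. to the surface 47 near the
      # frustum starting point 46 or to a corner of the screen.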
  • the image generation control unit 71b judges whether or not the display position of the captured image V1 needs to be changed based on the angle between the viewing direction of the overhead image V3 as a whole and the axial direction of the view frustum 40 (see steps S160 and P2 in FIG. 24). That is, this is the angle between the line-of-sight direction from the viewpoint set for the overhead image V3 at a certain point in time (i.e., the normal direction of the display screen) and the axial direction of the displayed view frustum 40.
  • the axial direction of the view frustum 40 is the direction of a perpendicular line drawn from the frustum starting point 46 to the frustum far end surface 45.
  • the size and direction of the rendered view frustum 40 change according to the angle of view and shooting direction of the camera 2.
  • the image generation control unit 71b determines that the position setting needs to be changed according to the angle of the view frustum 40, and changes the position setting of the captured image V1. This makes it possible to automatically provide the captured image V1 in an easy-to-view state.
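  • A possible form of this criterion, sketched under assumptions: when the angle between the overhead-view line of sight and the frustum axis approaches 90 degrees, any image plane inside the frustum is seen almost edge-on, so the captured image V1 would be repositioned. The 75-degree threshold below is only an example value, not taken from the embodiment.

      import numpy as np

      def needs_reposition(view_dir, frustum_axis, limit_deg=75.0):
          view_dir = view_dir / np.linalg.norm(view_dir)
          frustum_axis = frustum_axis / np.linalg.norm(frustum_axis)
          cos_a = abs(float(np.dot(view_dir, frustum_axis)))
          angle_deg = float(np.degrees(np.arccos(np.clip(cos_a, 0.0, 1.0))))
          return angle_deg > limit_deg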
  • video production control unit 71b determines whether or not the display position of captured video V1 needs to be changed based on a change in viewpoint within overhead video V3 (see steps S160 and P3 in Figure 24). For example, changing the viewpoint of the overhead image V3 changes the direction, size, angle, etc. of the view frustum 40.
  • In such a case, the image generation control unit 71b judges whether the display of the captured image V1 up to that point is still appropriate, and changes the setting if necessary. This makes it possible to provide the captured image V1 in a state that is always easy to view, even if the viewer arbitrarily changes the viewpoint of the overhead image V3.
  • the video production control section 71b uses the type information of the camera 2 capturing the captured video V1 to set the change destination of the display position of the captured video V1 (see step S163 in FIG. 24).
  • the change destination of the display position of the captured image V1 is set depending on whether the camera 2 is a fixed type using a tripod 6 or a mobile type. This makes it possible to set a position according to the fixed type camera 2F and the mobile type camera 2M.
  • For the mobile camera 2M in particular, the view frustum 40 changes frequently, so an easy-to-view display can be provided by displaying the captured image V1 at a position that is less affected by changes in the view frustum 40.
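  • A trivial illustration of such a rule (the labels and the mapping itself are assumptions, not the embodiment's actual settings):

      def change_destination(camera_type):
          if camera_type == "fixed":        # e.g. camera 2F mounted on a tripod 6
              return "frustum_far_end"      # frustum is stable, keep V1 inside it
          if camera_type == "mobile":       # e.g. handheld or gimbal camera 2M
              return "screen_corner"        # position less affected by frustum changes
          return "focus_plane"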
  • video production control section 71b changes the setting of the display position of captured video V1 in response to a user operation (see FIG. 23).
  • The user, who is the viewer, can arbitrarily switch the display position of the captured image V1, so that the captured image V1 can be displayed at a position suited to the viewer's ease of viewing and purpose.
  • image production control section 71b changes the display position of captured image V1 within view frustum 40 (see Figs. 23 and 24). For example, within the view frustum 40, switching is performed among the focus plane 41, the frustum far end plane 45, the plane on the frustum starting point 46 side, the plane within the depth of field, etc. This allows the captured image V1 to be displayed at an appropriate position while clarifying the correspondence between the view frustum 40 and the captured image V1.
  • the image production control section 71b changes the display position of the captured image V1 between inside and outside the view frustum 40 (see Figs. 23 and 24).
  • For example, the display position of the captured image V1 is switched among positions within the view frustum 40, such as the focus plane 41, the frustum far end plane 45, the plane on the frustum starting point 46 side, and a plane within the depth of field range, and positions outside the view frustum 40, such as near the camera, in a corner of the screen, or alongside the focus plane 41.
  • the video processing unit 71a generates video data that simultaneously displays an overhead image V3, each view frustum 40 for each of the multiple cameras 2, and each captured image V1 for each of the multiple cameras 2 on a single screen (see Figures 16, 17, and 27).
  • the view frustum 40 and the captured images V1 of the multiple cameras 2 are displayed in the CG space 30 represented by the overhead image V3. This allows the viewer to easily understand the relationship between the shooting ranges of the cameras 2. This is convenient for a director, for example, to check the contents of the images captured by each camera 2.
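  • One way to picture the per-frame composition is sketched below; the renderer object, its methods, and the CameraState fields are all assumptions standing in for whatever CG back end actually draws the overhead image V3, the view frustums 40, and the captured images V1.

      from dataclasses import dataclass
      from typing import Any

      @dataclass
      class CameraState:
          cam_id: str
          pose: dict        # position, shooting direction, zoom angle of view, focus
          frame: Any        # latest captured image V1 from this camera 2

      def compose_overhead(renderer, scene, cameras, viewpoint):
          renderer.draw_scene(scene, viewpoint)                  # overhead image V3
          for cam in cameras:
              frustum = renderer.build_frustum(cam.pose)         # view frustum 40
              renderer.draw_frustum(frustum, viewpoint)
              renderer.draw_image_on_plane(cam.frame,            # captured image V1
                                           frustum.focus_plane, viewpoint)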
  • the view frustum 40 is given as an example of a shooting range presentation image, and its shape is a quadrangular pyramid, but it is not limited to this.
  • it may be an image in which multiple rectangular outlines of a quadrangular pyramid cross section are arranged, or an image in which the outline of a quadrangular pyramid is expressed by a dashed line.
  • the shooting range presentation image may display only the focus plane 41 or only the depth of field range 42 .
  • the information processing device 70 as, for example, the AR system 5 in the embodiment is equipped with an image processing unit 71a that performs in parallel a process of generating first image data that displays the view frustum 40 (image presenting the shooting range) of the camera 2 within the target shooting space 8, and a process of generating second image data that displays an image that displays the view frustum 40 within the target shooting space 8 and has a display mode different from that of the first image data.
  • the first video data and the second video data are the video data of the overhead video V3-1 transmitted to the GUI device 11 and the video data of the overhead video V3-2 transmitted to the camera 2 in the embodiment.
  • the viewer can easily grasp the correspondence between the image of the camera 2 and the position in the space.
  • By generating video data with different display modes according to the role of each viewer of the overhead image V3 including the view frustum 40, it is possible to present information suited to each viewer through the video display.
  • For example, one of the video data of the overhead images V3-1 and V3-2 is video data of an image viewed by a video production director, and the other is video data of an image viewed by a shooting operator of the camera 2 with respect to the shooting target space 8.
  • the overhead image V3-1 has content intended for viewing on the GUI device 11 by a video production director such as a director.
  • the overhead image V3-2 has video content intended for viewing by a shooting operator such as a cameraman.
  • the video production director refers to staff involved in video production, such as a director and a switcher engineer, other than the camera operator.
  • the camera operator refers to a cameraman who directly operates the camera 2 and a staff member who remotely operates the camera 2.
  • At least one of the video data of the overhead images V3-1 and V3-2 is video data that displays an image including a plurality of view frustums 40 corresponding to a plurality of cameras 2, respectively.
  • one or both of the overhead images V3-1, V3-2 display view frustums 40 for multiple cameras 2.
  • the director, cameraman, etc. can easily grasp the positional relationship of each camera 2 and the subject.
  • a view frustum 40 is displayed for multiple cameras 2, allowing the director or the like to give various instructions and select main line images while recognizing the position and direction of the subject of each camera 2.
  • the view frustum 40 is displayed for the plurality of cameras 2, so that the cameraman can perform shooting operations while taking into consideration the relationship with the other cameras 2.
  • In the overhead image V3-2 viewed by the cameraman, the view frustum 40 may be displayed only for his or her own camera 2. In this way, the cameraman can easily grasp, within the whole image, the position of the subject in the image V1 captured by his or her own camera operation. Conversely, only the view frustums 40 of the other cameramen's cameras 2 may be displayed in the overhead image V3-2. In this way, the cameraman can operate his or her own camera while recognizing the shooting locations and subjects of the other cameras 2.
  • the video processing unit 71a generates video data as at least one of the video data for the overhead images V3-1, V3-2, which displays an image in which a portion of a plurality of view frustums 40 corresponding to a plurality of cameras 2 is displayed in a different manner from the other view frustums 40. That is, when a plurality of view frustums 40 are displayed, some of them are displayed in a different manner from the other view frustums 40. This makes it possible to realize a display in which a specific view frustum 40 has meaning when displaying a plurality of view frustums 40.
  • the video processing unit 71a generates video data that displays an image in which a portion of a plurality of view frustums 40 corresponding to a plurality of cameras 2 is highlighted as at least one of the video data for the overhead images V3-1, V3-2.
  • a particular view frustum 40 can be clearly identified by displaying some of the view frustums 40 in a more emphasized manner than the other view frustums 40 .
  • Examples of highlighting include a display with increased brightness, a display using a conspicuous color, a display with emphasized contours, a blinking display, and the like.
  • the video processing unit 71a generates video data that displays, as an overhead image V3-1, an image in which the view frustum 40 of a specific camera, which is a camera 2 among multiple cameras 2 that contains a subject of interest in the captured image V1, is displayed in a different manner from the other view frustums 40 (see Figures 28 to 32).
  • By highlighting the view frustum 40 of the camera 2 selected from among the cameras 2 capturing the target subject, it becomes easy for the director to know which camera is appropriate when he or she wants to use the image of the target subject as the main line image. It is also easy for the director to understand the positional relationship between the camera 2 capturing the target subject and the shooting directions of the other cameras 2.
  • For example, the specific camera whose view frustum 40 is highlighted is the camera 2 whose captured image V1 has the highest screen occupancy rate of the target subject (see FIGS. 29, 30, and 31).
  • the director can give instructions while grasping the status of the camera 2 mainly showing the target subject and the other cameras 2.
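  • A simple reading of "screen occupancy rate" is the area of the target subject's bounding box divided by the frame area, computed per camera; the bounding boxes would come from some detector or tracker, and the identifiers and numbers below are placeholders.

      def occupancy(bbox, frame_w, frame_h):
          x0, y0, x1, y1 = bbox
          return max(0, x1 - x0) * max(0, y1 - y0) / float(frame_w * frame_h)

      rates = {"cam1": occupancy((600, 200, 1300, 900), 1920, 1080),
               "cam2": occupancy((900, 400, 1100, 700), 1920, 1080)}
      specific_camera = max(rates, key=rates.get)   # its view frustum 40 is highlighted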
  • Alternatively, the specific camera whose view frustum 40 is highlighted is the camera 2 that has continuously captured the target subject in the shot video V1 for the longest time (see FIG. 32).
  • the director can grasp the status of the camera 2 that mainly films the subject of interest and other cameras 2 and give instructions accordingly.
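  • The "longest continuous shooting time" criterion can be pictured as per-camera bookkeeping that grows while the subject of interest stays in frame and resets when it leaves; the names below are hypothetical.

      from collections import defaultdict

      continuous_time = defaultdict(float)      # seconds of continuous capture per camera

      def update(camera_id, subject_in_frame, dt):
          """Call once per video frame; dt is the frame interval in seconds."""
          if subject_in_frame:
              continuous_time[camera_id] += dt
          else:
              continuous_time[camera_id] = 0.0

      # specific_camera = max(continuous_time, key=continuous_time.get)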
  • the video processing unit 71a generates video data as the video data for the overhead video V3-1, in which the view frustum 40 of a camera 2 among multiple cameras 2 that has detected a specific operation by the shooting operator is displayed in a different manner from the other view frustum 40 (see Figures 33 and 34).
  • the video processing unit 71a generates video data as video data for the overhead video V3-1 in which, when the view frustums 40 of multiple cameras 2 overlap within the displayed image, the overlapping view frustums 40 are displayed in a different manner from the non-overlapping view frustums 40 (see Figures 35 and 36).
  • When multiple view frustums 40 overlap, it means that multiple cameras 2 are shooting in the direction of a common subject.
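  • A coarse overlap test, sketched under assumptions: each frustum is treated as a convex solid bounded by inward-facing planes, and two frustums are considered overlapping if any corner of one lies inside the other. (This misses some edge-to-edge intersections; an exact test would need a full separating-axis check.)

      import numpy as np

      def inside(point, planes):
          """planes: list of (inward_normal, point_on_plane) bounding one frustum."""
          return all(float(np.dot(n, point - p0)) >= 0.0 for n, p0 in planes)

      def frustums_overlap(corners_a, planes_a, corners_b, planes_b):
          return (any(inside(c, planes_b) for c in corners_a) or
                  any(inside(c, planes_a) for c in corners_b))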
  • the video processing unit 71a generates video data that preferentially displays one of the overlapping view frustums 40 as at least one of the overhead images V3-1, V3-2 when the view frustums 40 of multiple cameras 2 overlap on the displayed image (see Figures 37 and 38).
  • That is, when the view frustums 40 of multiple cameras 2 overlap, the video processing unit 71a preferentially displays one view frustum 40 in the overlapping portion.
  • For example, in the overlapping portion, only the focus plane 41 and the depth of field range 42 of the one view frustum 40 set as the priority are displayed.
  • Alternatively, in the overlapping portion, it is possible to increase the brightness of only the view frustum 40 set as the priority, or to give it a conspicuous color; the above-mentioned highlighted display may also be used. In the overlapping portion, only the view frustum 40 set as the priority may be displayed. These measures also make the overhead image V3 including multiple view frustums 40 easier to view.
  • For example, in the overhead image V3-1 viewed by the director, the view frustum 40 of the camera 2 providing the main line image is displayed with priority, while in the overhead image V3-2 viewed by the cameraman, no particular priority is set.
  • Alternatively, in the overhead image V3-2 viewed by each cameraman, the view frustum 40 of the camera 2 that the cameraman himself or herself operates is displayed with priority.
  • video processing unit 71a generates video data for displaying, as overhead images V3-1 and V3-2, images including instruction images in different display modes (see FIGS. 39 to 45).
  • For example, on the cameraman side, the instruction frustum 40DR is displayed on the screen, so that the cameraman can visually understand the instruction contents.
  • the overhead images V3-1 and V3-2 are displayed in a way that is appropriate for each role, so that the shooting can proceed smoothly.
  • the video processing unit 71a sets the video data of the overhead image V3-1 as video data that displays instruction images for multiple cameras 2, and sets the video data of the overhead image V3-2 as video data that displays instruction images for a specific camera 2 among the multiple cameras (see Figures 39, 41, and 42). This allows the director to understand the instructions for each camera, while camera operators can easily understand the instructions by only seeing the instructions that are directed to them.
  • the video processing unit 71a sets the video data of the overhead video V3-2 as video data that displays an instruction image within an image from a viewpoint corresponding to the position of a specific camera 2 among the multiple cameras (see Figures 42 and 43).
  • For the cameraman, the instruction frustum 40DR is displayed in the overhead image V3-2 from his or her own viewpoint, so that the instructed direction can be easily grasped from that viewpoint.
  • the video processing unit 71a generates video data for the overhead video V3-2 that displays the current view frustum 40 and a marker image in the shooting direction based on the marking operation (see Figures 46 to 48).
  • For the cameraman, the overhead image V3-2 including marker images such as the marker frustum 40M and the marker 55M is displayed. This allows the cameraman to mark a shooting position or subject that he or she has set, which is convenient for shooting that position at the appropriate time.
  • the video processing unit 71a generates video data as the video data for the overhead video V3-2, which displays an overhead video from a viewpoint corresponding to the position of a specific camera 2 among multiple cameras, and generates video data as the video data for the overhead video V3-1, which displays an overhead video from a different viewpoint (see Figures 49 to 52).
  • For the cameraman, the overhead image V3-2 is displayed from a viewpoint corresponding to his or her own position, making it easy to recognize the overall situation and his or her own shooting direction.
  • For the director, the overhead image V3-1 is displayed from a viewpoint that makes it easy to grasp the whole picture, rather than from the viewpoint of a specific cameraman, which is suitable for directing the entire shoot.
  • the video processing unit 71a generates video data for displaying a plurality of overhead views V3-1a, V3-1b from a plurality of viewpoints as the video data for the overhead view V3-1 (see FIGS. 51 and 52). Since the director needs to understand the shooting conditions of each camera 2, an overhead image V3-1 that provides an overall bird's-eye view from a plurality of viewpoints as shown in FIG. 51 is extremely useful.
  • the video processor 71a generates the overhead view video V3 as a virtual video using CG. This makes it possible to generate an overhead image V3 from any viewpoint, and to display the view frustum 40 and the captured image V1 from a variety of viewpoints.
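  • Rendering the CG space from an arbitrary viewpoint ultimately comes down to a standard look-at transform; the sketch below (function name and example coordinates are assumptions) is the kind of matrix that would be regenerated whenever the viewpoint of the overhead image V3 changes.

      import numpy as np

      def look_at(eye, target, up=np.array([0.0, 0.0, 1.0])):
          f = target - eye
          f = f / np.linalg.norm(f)
          r = np.cross(f, up)
          r = r / np.linalg.norm(r)
          u = np.cross(r, f)
          view = np.eye(4)
          view[0, :3], view[1, :3], view[2, :3] = r, u, -f
          view[:3, 3] = -view[:3, :3] @ eye
          return view    # world -> view transform for the chosen overhead viewpoint

      # Example: a viewpoint above and behind the venue, looking at its centre.
      V = look_at(np.array([0.0, -80.0, 40.0]), np.array([0.0, 0.0, 0.0]))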
  • the view frustum 40 is configured to display the shooting direction and angle of view at the time of shooting in real time, but it may also be configured to display a past view frustum 40, for example, during a prior simulation of camera work.
  • the current view frustum 40 at the time of shooting and the past view frustum 40 may be displayed at the same time for comparison. In such a case, it is advisable to make the past view frustum 40 different from the current view frustum 40 by increasing its transparency, for example, so that the cameraman or the like can distinguish between them.
  • the program of the embodiment is a program that causes a processor such as a CPU or DSP, or a device including these, to execute the processes shown in Figures 20, 21, 22, 23, and 24 described above. That is, the program of the embodiment is a program that causes the information processing device 70 to execute a process of generating video data that simultaneously displays, on one screen, an overhead image V3 of the space to be photographed, a view frustum 40 (shooting range presentation image) that presents the shooting range of the camera 2 within the overhead image V3, and the captured image V1 of the camera 2.
  • the program of the embodiment is a program that causes a processor such as a CPU or DSP, or a device including these, to execute the processes shown in Figures 30, 31, 32, 34, 36, 38, 41, 43, 45, 48, and 52 described above. That is, the program of the embodiment is a program that causes the information processing device 70 to execute in parallel a process of generating first video data that displays a view frustum 40 (shooting range display image) that presents the shooting range of the camera 2 within the shooting target space, and a process of generating second video data that displays an image that displays the view frustum 40 within the shooting target space and has a display mode different from that of the image generated by the first video data.
  • Such a program can be pre-recorded in an HDD as a recording medium built into a device such as a computer device, or in a ROM in a microcomputer having a CPU. Also, such a program can be temporarily or permanently stored (recorded) in a removable recording medium such as a flexible disk, a CD-ROM (Compact Disc Read Only Memory), an MO (Magneto Optical) disk, a DVD (Digital Versatile Disc), a Blu-ray Disc (registered trademark), a magnetic disk, a semiconductor memory, or a memory card.
  • Such a removable recording medium can be provided as so-called package software.
  • Such a program can be installed in a personal computer or the like from a removable recording medium, or can be downloaded from a download site via a network such as a LAN (Local Area Network) or the Internet.
  • Such a program is suitable for the widespread provision of the information processing device 70 of the embodiment.
  • For example, by downloading such a program to personal computers, communication devices, mobile terminal devices such as smartphones and tablets, mobile phones, game devices, video devices, PDAs (Personal Digital Assistants), and the like, these devices can function as the information processing device 70 of the present disclosure.
  • An information processing device comprising: an image processing unit that generates image data for simultaneously displaying an overhead image of a space to be photographed, a shooting range presentation image that presents the shooting range of a camera within the overhead image, and the image photographed by the camera on a single screen.
  • the image processing unit generates image data in which the captured image is displayed within the shooting range presentation image.
  • the image processing unit generates image data in which the captured image is displayed at a position within a depth of field range shown in the shooting range presentation image.
  • the information processing device further comprising an image generation control unit that controls generation of image data by variably setting a display position of the captured image that is simultaneously displayed on one screen together with the overhead image and the shooting range presentation image.
  • the image generation control unit determines whether to change a display position of the shot image, and changes a setting of the display position of the shot image according to a result of the determination.
  • the image generation control unit determines whether or not it is necessary to change the display position of the captured image based on a positional relationship between the shooting range presentation image and an object represented in the overhead image.
  • a program that causes an information processing device to execute a process of generating video data that simultaneously displays an overhead image of a space to be photographed, a shooting range presentation image that presents the camera's shooting range within the overhead image, and the image captured by the camera on a single screen.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)

Abstract

The invention relates to an information processing device that includes a video processing unit which generates video data causing the simultaneous display, within the same screen, of a bird's-eye view of a space to be imaged; an imaging-range presentation video presenting the imaging range of a camera within the bird's-eye view; and a video captured by the camera.
PCT/JP2023/033687 2022-09-29 2023-09-15 Dispositif de traitement d'informations, procédé de traitement d'informations et programme WO2024070761A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022-157054 2022-09-29
JP2022157054 2022-09-29

Publications (1)

Publication Number Publication Date
WO2024070761A1 true WO2024070761A1 (fr) 2024-04-04

Family

ID=90477483

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/033687 WO2024070761A1 (fr) 2022-09-29 2023-09-15 Dispositif de traitement d'informations, procédé de traitement d'informations et programme

Country Status (1)

Country Link
WO (1) WO2024070761A1 (fr)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0548964A (ja) * 1991-08-19 1993-02-26 Nippon Telegr & Teleph Corp <Ntt> 映像とその撮影情報の提示方法
JPH08251467A (ja) * 1995-03-09 1996-09-27 Canon Inc カメラ情報の表示装置
JP2008005450A (ja) * 2006-06-20 2008-01-10 Kubo Tex Corp 3次元仮想空間を利用したビデオカメラのリアルタイム状態把握、制御の方法
JP2008011433A (ja) * 2006-06-30 2008-01-17 Canon Marketing Japan Inc 撮像システム及びその撮像方法、並びに画像サーバ及びその画像処理方法
JP2013030924A (ja) * 2011-07-27 2013-02-07 Jvc Kenwood Corp カメラ制御装置、カメラ制御方法及びカメラ制御プログラム

Similar Documents

Publication Publication Date Title
JP7498209B2 (ja) 情報処理装置、情報処理方法およびコンピュータプログラム
US9858643B2 (en) Image generating device, image generating method, and program
KR100990416B1 (ko) 표시 장치, 화상 처리 장치 및 화상 처리 방법, 촬상 장치, 및 기록매체
JP7017175B2 (ja) 情報処理装置、情報処理方法、プログラム
US20110085017A1 (en) Video Conference
JP5861499B2 (ja) 動画提示装置
US10681276B2 (en) Virtual reality video processing to compensate for movement of a camera during capture
US11627251B2 (en) Image processing apparatus and control method thereof, computer-readable storage medium
JP2019083402A (ja) 画像処理装置、画像処理システム、画像処理方法、及びプログラム
JP7378243B2 (ja) 画像生成装置、画像表示装置および画像処理方法
JP2019121224A (ja) プログラム、情報処理装置、及び情報処理方法
US11847735B2 (en) Information processing apparatus, information processing method, and recording medium
WO2020166376A1 (fr) Dispositif de traitement d'image, procédé de traitement d'image et programme
US20230353717A1 (en) Image processing system, image processing method, and storage medium
KR102200115B1 (ko) 다시점 360도 vr 컨텐츠 제공 시스템
WO2020017600A1 (fr) Dispositif de commande d'affichage, procédé de commande d'affichage et programme
WO2024070761A1 (fr) Dispositif de traitement d'informations, procédé de traitement d'informations et programme
CN111466113B (zh) 图像捕获的装置和方法
WO2024070762A1 (fr) Dispositif de traitement d'informations, procédé de traitement d'informations, et programme
JP2022012900A (ja) 情報処理装置、表示方法、及び、プログラム
WO2024084943A1 (fr) Dispositif de traitement d'informations, procédé de traitement d'informations et programme
WO2024070763A1 (fr) Dispositif de traitement d'informations, système d'imagerie, procédé de traitement d'informations et programme
WO2023248832A1 (fr) Système de visualisation à distance et système d'imagerie sur site
US20240037843A1 (en) Image processing apparatus, image processing system, image processing method, and storage medium
US20220086413A1 (en) Processing system, processing method and non-transitory computer-readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23871992

Country of ref document: EP

Kind code of ref document: A1