WO2022230715A1 - Information processing device, information processing method, and program - Google Patents

Information processing device, information processing method, and program

Info

Publication number
WO2022230715A1
Authority
WO
WIPO (PCT)
Prior art keywords
virtual viewpoint
data
virtual camera
time information
information
Prior art date
Application number
PCT/JP2022/018122
Other languages
English (en)
Japanese (ja)
Inventor
秀 藤田
Original Assignee
キヤノン株式会社 (Canon Inc.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc. (キヤノン株式会社)
Publication of WO2022230715A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast

Definitions

  • the present disclosure relates to technology for generating virtual viewpoint video.
  • Patent Literature 1 discloses a technique for determining a virtual viewpoint by operating a device or UI screen.
  • In general, the movement of the subject in a virtual viewpoint video matches the actual movement of the subject at the time of shooting. Therefore, when generating a virtual viewpoint video including a state different from the state of the subject that was actually shot, for example by stopping the movement of the subject or playing it back in reverse, the viewpoint data generated by the method described in Patent Literature 1 may not be able to produce the desired virtual viewpoint video.
  • the present disclosure has been made in view of the above problems, and aims to enable the output of information for easily generating a virtual viewpoint video including the state of a desired subject.
  • An information processing device according to the present disclosure comprises: a first acquisition means for acquiring, in correspondence with each frame of the moving image constituting a virtual viewpoint video generated based on a plurality of captured images obtained by capturing a subject with a plurality of imaging devices, a parameter representing the position of the virtual viewpoint and the line-of-sight direction from the virtual viewpoint;
  • a second acquisition means for acquiring time information for specifying the time when the subject was photographed;
  • and an output means for outputting virtual viewpoint data in which the plurality of parameters acquired by the first acquisition means are associated with the time information acquired by the second acquisition means.
  • FIG. 4 is a diagram showing an example of the format of sequence data;
  • FIG. 4 is a diagram showing an example of a format of virtual camera path data;
  • FIG. 4 is a diagram showing an example of a format of virtual camera path data;
  • FIG. 4 is a flowchart for explaining the operation of the virtual camera path data processing device;
  • FIG. 1 is a diagram for explaining the configuration of a virtual viewpoint video generation device;
  • FIG. 4 is a flowchart for explaining the operation of the virtual viewpoint video generation device;
  • FIG. 2 is a diagram for explaining an example of communication status between devices;
  • FIG. 4 is a diagram showing an example of virtual camera path data;
  • FIG. 4 is a diagram showing an example of virtual camera path data;
  • FIG. 4 is a diagram showing an example of virtual camera path data;
  • FIG. 4 is a diagram showing an example of virtual camera path data;
  • FIG. 4 is a diagram showing an example of virtual camera path data;
  • FIG. 3 is a diagram for explaining the hardware configuration of a virtual camera path data processing device;
  • FIG. 4 is a diagram for explaining the concept of virtual camera path data;
  • FIG. 4 is a diagram for explaining the concept of virtual camera path data;
  • a virtual viewpoint used when generating a virtual viewpoint video is called a virtual camera. That is, the virtual camera is a camera that is virtually arranged at the position of the virtual viewpoint, and the position of the virtual viewpoint and the line-of-sight direction from the virtual viewpoint correspond to the position and orientation of the virtual camera, respectively.
  • the virtual camera path data can be said to be virtual viewpoint data including parameters of the position of the virtual viewpoint and the line-of-sight direction from the virtual viewpoint.
  • the virtual camera path data in this embodiment includes parameters representing the three-dimensional position of the virtual camera and parameters representing the orientation of the virtual camera in pan, tilt, and roll directions.
  • the content of the virtual camera path data is not limited to the above.
  • the virtual camera path data may include a parameter corresponding to the size of the field of view (angle of view) of the virtual viewpoint.
  • The virtual camera path data in the present embodiment has parameters for a plurality of video frames that form the virtual viewpoint video. That is, the virtual camera path data has a configuration in which a virtual camera parameter is associated with each of the plurality of frames constituting the moving image of the virtual viewpoint video, and thus represents a change of the virtual viewpoint over the course of the video.
  • The virtual viewpoint video in the present embodiment is also called a free viewpoint image, but it is not limited to an image corresponding to a viewpoint freely (arbitrarily) specified by the user; an image corresponding to a viewpoint selected by the user from a plurality of candidates, for example, is also included in the virtual viewpoint video. The designation of the virtual viewpoint may be performed by a user operation, or may be performed automatically based on the result of image analysis or the like. In this embodiment, the case where the virtual viewpoint video is a moving image will be mainly described. A moving image is assumed to be composed of a plurality of images (frames). Therefore, a video in which the virtual camera parameters change for each frame while the subject is stationary is also a moving image. Furthermore, a video in which the subject is stationary and the virtual camera parameters do not change for each frame may appear to be a still image, but it can be treated as a moving image because it has multiple frames.
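  • To make the notion of per-frame virtual camera parameters concrete, the following is a minimal Python sketch modeling one parameter set per video frame. All names and values are illustrative assumptions; the present embodiment does not prescribe any particular data structure.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class VirtualCameraParams:
    """Virtual camera parameters for one frame of the virtual viewpoint video."""
    x: float              # three-dimensional position of the virtual viewpoint
    y: float
    z: float
    pan: float            # orientation of the virtual camera (degrees)
    tilt: float
    roll: float
    fov_deg: Optional[float] = None   # optional angle-of-view parameter

# A 100-frame virtual camera path: the camera orbits the scene while the
# subject's motion is an independent question (see the shooting time
# discussion later in this description).
camera_path: List[VirtualCameraParams] = [
    VirtualCameraParams(x=10.0, y=0.0, z=2.0, pan=3.6 * i, tilt=-5.0, roll=0.0)
    for i in range(100)
]
```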
  • FIG. 10 is a block diagram showing a configuration example of computer hardware applicable to a virtual camera path data processing device to be described later.
  • The virtual camera path data processing device 1 has a CPU 1001, a RAM 1002, a ROM 1003, an operation unit 1004, an output unit 1005, an external storage device 1006, an I/F 1007, and a bus 1008.
  • the hardware configuration described below can also be applied to other devices in the information processing system described later.
  • The CPU 1001 controls the entire computer using the computer programs and data stored in the RAM 1002 and the ROM 1003, and executes each process performed by the virtual camera path data processing device described later. That is, the CPU 1001 functions as each processing unit of the virtual camera path data processing device described later.
  • the RAM 1002 has an area for temporarily storing computer programs and data loaded from the external storage device 1006, data externally acquired via the I/F (interface) 1007, and the like. Furthermore, the RAM 1002 has a work area used when the CPU 1001 executes various processes. That is, the RAM 1002 can be allocated, for example, as a frame memory, or can provide various other areas as appropriate.
  • the ROM 1003 stores setting data for this computer, a boot program, and the like.
  • An operation unit 1004 includes a keyboard, a mouse, and the like, and can be operated by the user of the computer to input various instructions to the CPU 1001 .
  • The output unit 1005 displays the results of processing by the CPU 1001, and is configured by, for example, a liquid crystal display.
  • the external storage device 1006 is a large-capacity information storage device typified by a hard disk drive.
  • the external storage device 1006 stores an OS (operating system) and a computer program for causing the CPU 1001 to implement the functions of each section of the virtual camera path data processing device, which will be described later. Furthermore, each image data to be processed may be stored in the external storage device 1006 .
  • the computer programs and data stored in the external storage device 1006 are appropriately loaded into the RAM 1002 under the control of the CPU 1001 and are processed by the CPU 1001 .
  • The I/F 1007 can be connected to a network such as a LAN or the Internet, and to other equipment such as a projection device or a display device.
  • a bus 1008 connects the above units.
  • the hardware configuration is not limited to this.
  • the operation unit 1004, the output unit 1005, and the external storage device 1006 may be configured to be externally connected as a device different from the virtual camera path data processing device 1.
  • The CPU 1001 functions as a reception control unit that receives inputs from the operation device and the external storage device, and as an output control unit that outputs data to an output device such as a display device and to the external storage device.
  • The functions described above can also be realized as follows: computer program code read from a storage medium is written to a memory provided in a function expansion card inserted into the computer or in a function expansion unit connected to the computer, and a CPU or the like provided in the function expansion card or function expansion unit performs part or all of the actual processing based on instructions from the program code.
  • In that case, the storage medium stores the computer program code corresponding to the processing described above.
  • FIG. 1 is a diagram showing a configuration example of an information processing system including a virtual camera path data processing device according to this embodiment.
  • the information processing system 10 has a virtual camera path data processing device 1 , an imaging device 2 , a shape estimation device 3 , a storage device 4 , an image generation device 5 , a virtual camera operation device 6 and a sequence data processing device 7 .
  • the virtual camera path data processing device 1 receives input for determining parameters such as the position and orientation of the virtual camera from the virtual camera operation device 6, which will be described later, and generates virtual camera path data. Details of the virtual camera path data processing device 1 will be described later.
  • The imaging device 2 acquires captured images used to generate the virtual viewpoint video.
  • a virtual viewpoint video is generated based on a plurality of shot images obtained using a plurality of shooting devices.
  • FIG. 1 shows only the imaging device 2, it is assumed that the information processing system 10 includes a plurality of imaging devices.
  • Hereinafter, the imaging device 2 will be referred to unless the plurality of imaging devices need to be distinguished from each other.
  • a plurality of photographing devices photograph the photographing area from a plurality of different directions.
  • the shooting area is, for example, a stadium where rugby or soccer is played, a hall or stage where a concert or the like is held, a shooting studio, or the like.
  • a plurality of photographing devices are installed in different positions and directions so as to surround such a photographing area, and photograph in synchronism. It should be noted that the plurality of photographing devices may not be installed over the entire periphery of the photographing area, and may be installed only in a part of the photographing area depending on restrictions on the installation location or the like. Also, the number of the plurality of imaging devices is not limited. For example, if the imaging area is a rugby stadium, several tens to hundreds of imaging devices may be installed around the stadium.
  • A plurality of imaging devices with different angles of view, such as telephoto cameras and wide-angle cameras, may be installed.
  • With a telephoto camera, the subject can be photographed at high resolution, so the resolution of the generated virtual viewpoint video is also improved.
  • With a wide-angle camera, the range that can be captured by one camera is wide, so the number of cameras can be reduced.
  • the photographing device 2 is synchronized with one piece of time information in the real world, and time information representing the photographing time is attached to each frame of the photographed video.
  • the imaging device 2 may be composed of one camera, or may be composed of a plurality of cameras. Furthermore, the photographing device 2 may include a device other than a camera. For example, a distance measuring device using a laser beam or the like may be included.
  • the state of the imaging device 2 may be controllable.
  • The state of the imaging device refers to its position, orientation, focal length, optical center, distortion, and the like.
  • the position and orientation of the imaging device may be controlled by the imaging device itself, or may be controlled by a platform that controls the position and orientation of the imaging device.
  • the imaging device 2 has a unique identification number for distinguishing it from other imaging devices.
  • the photographing device 2 may have other functions such as a function of extracting a foreground image from an image acquired by photographing, and may also include hardware (circuits, devices, etc.) that implements the functions.
  • the shape estimation device 3 generates shape data representing the shape of the subject based on the image data acquired from the imaging device 2.
  • the shape estimation device 3 is assumed to generate three-dimensional shape data representing the three-dimensional shape of the subject. A method for generating three-dimensional shape data in this embodiment will be described below.
  • a plurality of photographed images are acquired by photographing respective photographing areas from different directions with a plurality of imaging devices.
  • a foreground image obtained by extracting a foreground area corresponding to an object such as a person or a ball, and a background image obtained by extracting a background area other than the foreground area are acquired from a plurality of photographed images.
  • the foreground image is an image obtained by extracting a region of an object (foreground region) from a photographed image acquired by a photographing device.
  • An object extracted as a foreground area refers to a dynamic object (moving object) that moves (its position or shape can change) when photographed from the same direction in time series.
  • In the case of a game, the objects include persons such as players and referees on the field where the game is played; in the case of a ball game, a ball may be included in addition to the persons.
  • Singers, players, performers, moderators, and the like are also examples of objects. Note that if an object does not exist in the background registered in advance using a background image or the like, it is extracted as part of the foreground area even if it is stationary.
  • the background image is an image of an area (background area) different from at least the foreground object.
  • the background image is an image obtained by removing the foreground object from the captured image.
  • the background refers to an object to be photographed that is stationary or continues to be nearly stationary when photographed from the same direction in chronological order. Such objects to be photographed are, for example, stages such as concerts, stadiums where events such as competitions are held, structures such as goals used in ball games, or fields.
  • the background is at least a region different from the foreground object, and the object to be photographed may include other objects in addition to the object and the background.
  • the process of generating the foreground image and the background image from the captured image may be performed by the shape estimation device 3 or may be performed by the imaging device 2 .
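  • The extraction method itself is not prescribed here; as one common approach, the foreground mask can be obtained by differencing the captured image against a pre-registered background image. Below is a minimal OpenCV sketch under that assumption (the function name and threshold value are illustrative, not part of this disclosure):

```python
import cv2
import numpy as np

def extract_foreground(captured_bgr: np.ndarray, background_bgr: np.ndarray,
                       thresh: int = 30) -> np.ndarray:
    """Return a binary foreground mask by differencing the captured image
    against a pre-registered background image of the same size."""
    diff = cv2.absdiff(captured_bgr, background_bgr)   # per-pixel difference
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)      # collapse to one channel
    _, mask = cv2.threshold(gray, thresh, 255, cv2.THRESH_BINARY)
    return mask                                        # nonzero = foreground
```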
  • When the imaging device 2 performs this process, the imaging device 2 outputs the foreground image and the background image to the shape estimation device 3.
  • the shape estimation device 3 uses the foreground image to generate three-dimensional shape data of the foreground subject by a shape estimation method such as the shape-from-silhouette method.
  • Three-dimensional shape data is, for example, point cloud model data, billboard data, mesh model data, and the like.
  • The shape estimation device 3 also uses the foreground image to generate texture data representing the color of the foreground subject, which is used for coloring the three-dimensional shape data of the foreground.
  • the three-dimensional shape data of the background is generated by three-dimensionally measuring an imaging area such as a stadium or venue in advance. Based on the background image, the shape estimation device 3 expresses the color of the subject, which is the background, and generates texture data for coloring the three-dimensional shape data of the background.
  • the shape estimation device 3 transmits the generated three-dimensional shape data and texture data to the storage device 4 .
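  • As an illustration of the shape-from-silhouette step, the sketch below carves a grid of candidate 3D points against the foreground silhouettes of all calibrated cameras, keeping only points that project inside every silhouette. This is a simplified stand-in for the estimation performed by the shape estimation device 3, not the normative algorithm; the 3x4 projection matrices are assumed to come from camera calibration.

```python
import numpy as np

def carve_visual_hull(silhouettes, projections, grid_points):
    """silhouettes : list of HxW binary masks (nonzero = foreground)
    projections : list of 3x4 camera projection matrices (calibrated)
    grid_points : Nx3 array of candidate 3D points (e.g. a voxel grid)
    Returns the subset of points inside every silhouette (point-cloud data)."""
    homog = np.hstack([grid_points, np.ones((len(grid_points), 1))])  # Nx4
    inside = np.ones(len(grid_points), dtype=bool)
    for mask, P in zip(silhouettes, projections):
        uvw = homog @ P.T                        # Nx3 homogeneous pixel coords
        u = (uvw[:, 0] / uvw[:, 2]).astype(int)
        v = (uvw[:, 1] / uvw[:, 2]).astype(int)
        h, w = mask.shape
        ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)  # inside the image
        hit = np.zeros(len(grid_points), dtype=bool)
        hit[ok] = mask[v[ok], u[ok]] > 0
        inside &= hit                            # must be foreground in all views
    return grid_points[inside]
```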
  • the virtual viewpoint video in this embodiment is generated by, for example, the following method. That is, the virtual viewpoint video is generated by mapping the texture data to the foreground three-dimensional shape data according to the parameters of the position and orientation of the virtual camera and performing rendering. Rendering for the background three-dimensional shape data is performed in the same manner.
  • the method of generating the virtual viewpoint video is not limited to this, and various methods such as a method of generating a virtual viewpoint video by projective transformation of a captured image without using three-dimensional shape data can be used.
  • data used for generating virtual viewpoint video such as three-dimensional shape data and texture data, are collectively referred to as material data.
  • Data used for generating the three-dimensional shape data, such as the captured images, may also be included in the material data.
  • a configuration for generating three-dimensional shape data has been described, but the present embodiment can also be applied when image-based rendering is performed as a method for generating a virtual viewpoint video.
  • the storage device 4 is a device that stores the material data generated by the shape estimation device 3. For example, it is composed of a semiconductor memory, a magnetic recording device, or the like. Each piece of material data stored in the storage device 4 is associated with corresponding shooting time information. Specifically, the material data and the photographing time information are linked by associating the photographing time information associated with the photographed image used to generate the material data with the material data. The association of the shooting time information with the material data is performed, for example, by adding the shooting time information to the metadata of the material data. Note that the device that provides the photographing time information is not particularly limited, and the shape estimation device 3 may provide the information, or the storage device 4 may provide the information. The storage device 4 outputs material data in response to requests from other devices.
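  • Conceptually, the storage device 4 behaves like a map from shooting time information to material data. A minimal sketch follows; the interface and timestamp format are assumptions, since the text only requires that material data be retrievable by its associated shooting time.

```python
from typing import Dict, NamedTuple

class MaterialData(NamedTuple):
    shape_data: bytes     # e.g. serialized point cloud or mesh
    texture_data: bytes

# Shooting time information (here an illustrative ISO-8601 string) is
# attached to each material-data entry when it is stored.
material_store: Dict[str, MaterialData] = {}

def put_material(shooting_time: str, data: MaterialData) -> None:
    material_store[shooting_time] = data

def get_material(shooting_time: str) -> MaterialData:
    # The storage device returns the material data matching the requested time.
    return material_store[shooting_time]
```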
  • the video generation device 5 is a device that generates a virtual viewpoint video.
  • The video generation device 5 also generates video that assists the user's operation when the user performs an operation for specifying the parameters of the virtual camera.
  • For this purpose, the video generation device 5 acquires, for example, the three-dimensional shape data from the storage device 4 and generates video using the three-dimensional shape data.
  • the generated video is output to the virtual camera operation device 6.
  • the virtual camera operation device 6 is an operation device for specifying the parameters of the position and orientation of the virtual camera.
  • the virtual camera operation device 6 is composed of, for example, a joystick, a jog dial, a touch panel, a keyboard, and a mouse.
  • the parameters of the virtual camera that can be specified include information such as the position, orientation, and angle of view of the virtual camera, but are not limited to these, and other information may be specified.
  • the virtual camera operation device 6 in this embodiment transmits the virtual camera parameters specified by the user to the video generation device 5 .
  • the video generation device 5 generates a virtual viewpoint image based on the received virtual camera parameters and the material data acquired from the storage device 4 .
  • the virtual camera operation device 6 acquires the virtual viewpoint video generated by the video generation device 5 and displays it on the display unit.
  • the user designates virtual camera parameters while referring to the displayed virtual viewpoint video, and examines the virtual camera path.
  • The displayed video may be any video that assists the user in specifying the virtual camera parameters; it does not have to be the final virtual viewpoint video itself.
  • Alternatively, the virtual camera parameters may be specified without generating and displaying a video at all.
  • the virtual camera parameters are not limited to being specified by the user, and may be automatically specified by recognizing the subject.
  • the virtual camera parameters specified by the virtual camera operating device 6 are transmitted to the virtual camera path data processing device 1 .
  • the virtual camera operating device 6 can specify the photographing time of the subject on the image according to the user's operation. Specifically, the user can pause the motion of the subject, reverse the motion of the subject, or fast-forward the motion of the subject in the video displayed on the virtual camera operating device 6 . This corresponds to pausing, reverse playback, and fast-forwarding of the shooting time when the shooting device 2 shoots. Since the virtual camera can be operated even if the shooting time is changed, for example, it is possible to generate a virtual viewpoint image in which the virtual camera is moved while the movement of the subject is paused. In such a video, the shooting time and the playback time on the virtual viewpoint video are treated as independent parameters.
  • the virtual camera operating device 6 transmits to the virtual camera path data processing device 1 shooting time information representing the shooting time corresponding to the image to be displayed.
  • the sequence data processing device 7 acquires virtual camera path data from the virtual camera path data processing device 1. Also, the sequence data processing device 7 acquires material data corresponding to the virtual camera path data from the storage device 4 . Then, the sequence data processing device 7 generates sequence data for storing or outputting the acquired data.
  • the format of sequence data will be described later. Note that the material data is not necessarily required, and the sequence data may be generated only with the virtual camera path data. Furthermore, the virtual camera path may include not only one pattern but also multiple patterns of virtual camera path data.
  • the movement of the subject in the virtual viewpoint video matches the movement of the subject during shooting.
  • the virtual viewpoint video is generated by changing the virtual viewpoint while stopping the movement of the subject or playing back in reverse, the movement of the subject is different from the movement at the time of shooting.
  • data for simply designating the position of the virtual viewpoint and the line-of-sight direction from the virtual viewpoint may not be able to generate a virtual viewpoint video including the movement of the subject desired by the user. Therefore, the system according to the present embodiment generates virtual camera path data capable of generating a virtual viewpoint video including the movement of the subject desired by the user.
  • the configuration of the virtual camera path data processing device 1 will be described using FIG.
  • the virtual camera path data processing device 1 has a virtual camera information acquisition section 101 , a shooting time information acquisition section 102 , a virtual camera path data generation section 103 and a virtual camera path data output section 104 .
  • the virtual camera information acquisition unit 101 acquires virtual camera parameters specified by the virtual camera operation device 6 .
  • The virtual camera information acquisition unit 101 may collectively acquire the virtual camera parameters for all frames determined by the virtual camera operation device 6, or may continue acquiring them one frame at a time each time the virtual camera operation device 6 outputs the parameters for one frame.
  • the shooting time information acquisition unit 102 acquires shooting time information representing the shooting time of the subject specified by the virtual camera operating device 6 .
  • the shooting time information is associated with the frames forming the virtual camera path (virtual viewpoint video).
  • the virtual camera parameters are associated with the frames forming the virtual camera path (virtual viewpoint video).
  • the photographing time when the subject was photographed and the virtual camera parameters corresponding to the photographing time are associated.
  • a virtual viewpoint image is generated as if the subject photographed at a specific photographing time was viewed from a virtual camera corresponding to the photographing time.
  • the shooting time information in this embodiment is assumed to be an absolute time in a certain standard time, but it is not limited to this.
  • The shooting time information may be represented, for example, by a relative time with respect to a certain reference time, or by a frame number relative to the frame of a captured image corresponding to a certain reference time.
  • For example, the game start time may be set to 0, and the shooting time information may be specified by the elapsed time from the game start.
  • the shooting time information is information used to specify the material data corresponding to a certain frame, so information other than the shooting time information may be acquired as long as the material data can be specified.
  • the shooting time information acquisition unit 102 may acquire the identification number, ID, etc. of the material data instead of the shooting time information.
  • The shooting time information acquisition unit 102 may collectively acquire the shooting time information for all frames determined by the virtual camera operation device 6, or may continue acquiring it frame by frame.
  • the virtual camera path data generation unit 103 generates virtual camera path data in which the virtual camera parameters acquired by the virtual camera information acquisition unit 101 and the shooting time information acquired by the shooting time information acquisition unit 102 are associated.
  • FIG. 11A shows virtual camera path data in which a time code of absolute time is used as the shooting time information.
  • a camera path frame is information representing the number of frames forming a virtual camera path.
  • the shooting time information is information representing the shooting time of the subject.
  • the virtual camera position parameter represents the position of the virtual camera, and is represented by (x, y, z) in a three-dimensional space, for example.
  • the posture parameter of the virtual camera represents the posture of the virtual camera. For example, pan, tilt, and roll in a three-dimensional space are expressed as (pan, tilt, roll).
  • the virtual camera path data may include parameters corresponding to the angle of view and distortion of the virtual camera.
  • FIG. 11B is an example in which a frame number for specifying material data is used as shooting time information.
  • Like the format shown in FIG. 11A, the format shown in FIG. 11B can be used in the generation of the virtual viewpoint video. The detailed format of the virtual camera path data described above will be described later.
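  • The rows of FIGS. 11A and 11B can be modeled as records that associate a camera path frame with shooting time information and camera parameters, as in the following sketch (the time-code format and all values are assumptions used for illustration):

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass
class CameraPathRow:
    """One row of virtual camera path data, in the spirit of FIGS. 11A/11B."""
    camera_path_frame: int             # frame index within the camera path
    shooting_time: Union[str, int]     # absolute time code (11A) or material frame number (11B)
    position: Tuple[float, float, float]     # (x, y, z)
    orientation: Tuple[float, float, float]  # (pan, tilt, roll)

# FIG. 11A style: the shooting time is frozen over three camera-path frames,
# so the subject stands still while the virtual camera keeps moving.
rows = [
    CameraPathRow(50, "2021-04-27T10:00:01.00", (0.0, 1.5, 10.0), (180.0, -5.0, 0.0)),
    CameraPathRow(51, "2021-04-27T10:00:01.00", (0.5, 1.5, 9.9), (182.0, -5.0, 0.0)),
    CameraPathRow(52, "2021-04-27T10:00:01.00", (1.0, 1.5, 9.8), (184.0, -5.0, 0.0)),
]
```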
  • the virtual camera path data output unit 104 performs processing for adding header information and the like to the virtual camera path data generated by the virtual camera path data generation unit 103, and outputs the data.
  • the virtual camera path data may be output as a data file or as packet data.
  • the virtual camera path data may be output in units of frames, or may be output for the entire virtual camera path or every predetermined number of frames.
  • FIG. 3 shows an example of the format of virtual camera path data generated by the virtual camera path data processing device 1.
  • The virtual camera path data shown in FIG. 3 is an example of a data set having virtual camera parameters for a predetermined number of frames.
  • A virtual camera path data header is stored at the top of the data set; the header stores information indicating that this data set is virtual camera path data, together with the data size of the data set.
  • the number of frames M of the stored virtual camera path data is described.
  • Information on the format of the virtual camera path data is also described. This information indicates how the virtual camera path data is stored, that is, whether the data related to the virtual camera path is stored in units of data type or in units of frames.
  • the virtual camera path time information is information representing a frame in the virtual camera path, and corresponds to the camera path frame in FIGS. 11A and 11B described above. That is, the virtual camera path time information is represented by arbitrary information that can distinguish frames in the virtual camera path.
  • the data type code is first saved.
  • the data set includes virtual camera path time information, photographing time information, camera parameter information, and virtual advertisement display instruction information.
  • Each of them is expressed as a virtual camera path data type code.
  • The virtual camera path data type code is represented, for example, as a 1-byte code.
  • The data types and codes are not limited to these; for example, longer or shorter codes may be used depending on the information to be described, and other data used when generating a virtual viewpoint video may also be included.
  • the virtual advertisement display instruction information is instruction information for displaying a virtual advertisement as additional information in the virtual viewpoint video generated using the virtual camera path data.
  • a virtual advertisement is an example of additional information, and instruction information for adding optional additional information such as subject information and effects may be described in the virtual camera path data in addition to the virtual advertisement.
  • the format of the virtual camera path time information indicates whether it is a relative time or a relative frame number with respect to the beginning.
  • the format of the shooting time information uses absolute time based on the standard time (for example, Greenwich Mean Time or Japan Standard Time) when the subject was actually shot, relative time to a certain reference time, relative frame number to the reference frame, etc. format.
  • the format may use information other than the shooting time information, such as a file path or pointer to the material data.
  • the reference time information of the virtual camera path is described in the virtual camera path sequence description in the sequence header.
  • This reference time information is stored as an absolute time based on a certain standard time, an absolute time based on a management time within a certain content (for example, the start time of a predetermined game), or the like.
  • In these time formats, the year, month, day, hour, minute, and second may be represented by integers, and the fractional part of the second may be represented by a floating-point number, a fixed-point number, an integer, or the like.
  • As the format of the virtual camera parameters, for example, values representing the position and orientation of the virtual camera are expressed as quaternions. Note that the notation of the virtual camera parameters is not limited to this. Following the header, the actual data of the virtual camera path is described according to the format specified in the virtual camera path data header. At the head of each piece of data, a start code indicating its beginning is written.
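  • One possible serialization of such a typed block — a start code, a 1-byte type code, a size, then per-frame position plus orientation quaternion — is sketched below. The byte layout, code values, and start code are assumptions for illustration, not the normative format defined in the figures.

```python
import struct

# Illustrative 1-byte data type codes; the normative values are defined in
# the patent's figure and are not reproduced here.
CAMERA_PATH_TIME = 0x01
SHOOTING_TIME = 0x02
CAMERA_PARAMS = 0x03

START_CODE = b"\x00\x00\x01"  # hypothetical start-of-data marker

def pack_camera_params_block(frames):
    """Pack per-frame camera parameters as position (3 floats) plus an
    orientation quaternion (4 floats), little-endian, preceded by the start
    code, the 1-byte type code, and the body size in bytes."""
    body = b"".join(struct.pack("<3f4f", *pos, *quat) for pos, quat in frames)
    return START_CODE + struct.pack("<BI", CAMERA_PARAMS, len(body)) + body

# Example: two frames of (position, quaternion) pairs.
block = pack_camera_params_block([
    ((0.0, 1.5, 10.0), (0.0, 0.0, 0.0, 1.0)),
    ((0.5, 1.5, 9.9), (0.0, 0.1, 0.0, 0.995)),
])
```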
  • FIG. 4 is a diagram for explaining another example of the configuration of virtual camera path data.
  • In this example, the various data included in the virtual camera path data are stored in units of frames.
  • A frame data header is stored at the beginning of each frame data; it contains a code indicating the start of the frame data and information indicating what kind of data is stored in what order within the frame data.
  • FIG. 2 is a diagram showing an example of the format of sequence data output by the sequence data processing device 7.
  • the sequence data processing device 7 outputs sequence data including a plurality of virtual camera path data for generating one virtual viewpoint video.
  • the sequence data also includes material data used to generate the virtual viewpoint video. For example, a sequence is generated for each video clip or shot cut. In this way, with the configuration including the virtual camera path data and the material data, it is possible to generate the virtual viewpoint video on the side of the device that receives the sequence data.
  • Each sequence contains a sequence header, which stores information that can identify the corresponding material sequence data.
  • In the sequence header, information that can uniquely identify the material sequence data is saved, such as a sequence header start code, information on the location and date when the subject was photographed, and path information representing the location of the material sequence data.
  • the sequence header includes information indicating that the virtual camera path data described above is included. For example, information indicating the data set included in the sequence header, or information indicating the presence or absence of virtual camera path data may be used.
  • the virtual camera path sequence description information can include information about the creator of the virtual camera path, right holder information, the name of the sequence, the name of the event in which the subject was shot, the camera frame rate at the time of shooting, and the like.
  • the virtual camera path sequence description information may include time information on which the virtual camera camera path is based, image size assumed at the time of rendering the virtual viewpoint video, background data information, and the like. However, it is not necessary to include all of these pieces of information, and arbitrary information may be included.
  • Each virtual camera path data is saved in a sequence in units of data sets.
  • the number N of data sets is described in the sequence header.
  • Information for each data set is stored below.
  • Here, two data sets are included: a data set of virtual camera path data and a data set of material data.
  • a data set identification ID is given first.
  • a unique identification ID is assigned to all data sets.
  • the dataset type code is then saved.
  • the data sets include a data set representing virtual camera path data and data representing material data. Each of them is expressed as a data set type code.
  • The data set type code is expressed, for example, as a 2-byte code.
  • data types and codes are not limited to these. Other data used when generating a virtual viewpoint video may also be used.
  • a pointer to the dataset is then saved. However, it is not limited to pointers as long as it is information for accessing each data set. For example, it may be a file name in a file system built on the storage device 4 .
  • the sequence data processing device 7 may generate sequence data containing only virtual camera path data or only material data. By generating sequence data, data used for generating one virtual viewpoint image such as a video clip can be collectively managed as one piece of data.
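  • A sequence container along these lines — the number of data sets N in the header, then per data set an identification ID, a 2-byte type code, and access information standing in for the pointer — might be sketched as follows. All values and the byte layout are assumptions; a file path plays the role of the pointer.

```python
import struct

DATASET_CAMERA_PATH = 0x0001  # 2-byte data set type codes (values illustrative)
DATASET_MATERIAL = 0x0002

def pack_sequence(datasets):
    """datasets: list of (dataset_id, type_code, locator) tuples.
    The output starts with the number of data sets N; each entry then
    carries an identification ID, a 2-byte type code, and a UTF-8 locator
    standing in for the pointer to the data set."""
    out = [struct.pack("<I", len(datasets))]
    for ds_id, type_code, locator in datasets:
        loc = locator.encode("utf-8")
        out.append(struct.pack("<IHH", ds_id, type_code, len(loc)) + loc)
    return b"".join(out)

seq = pack_sequence([
    (1, DATASET_CAMERA_PATH, "paths/camera_path_0001.bin"),
    (2, DATASET_MATERIAL, "material/material_0001.bin"),
])
```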
  • In this manner, data is generated that includes time information for specifying the time when the subject was shot and a plurality of virtual camera parameters corresponding to the frames forming the virtual viewpoint video.
  • the format described above is an example, and the configuration and included information are not limited to the above.
  • In step S501, the virtual camera information acquisition unit 101 starts processing that is repeated for each frame of the virtual camera path designated by the virtual camera operation device 6.
  • the virtual camera information acquisition unit 101 acquires virtual camera parameters from the virtual camera operation device 6 .
  • the shooting time information acquisition unit 102 acquires shooting time information of the subject from the virtual camera operation device 6 .
  • steps S501 to S503 are repeated until the virtual camera path ends or the input in frame units ends.
  • the user operates the virtual camera while looking at the subject displayed on the image generation device 5 .
  • the virtual camera parameters are determined for each video frame and sent to the virtual camera information acquisition unit 101 .
  • The user operates the virtual camera while specifying, for example, that the video of the subject displayed on the video generation device 5 be paused, played in reverse, or fast-forwarded. By doing so, the user can specify the photographing time of the subject.
  • the shooting time information specified at this time is transmitted to the shooting time information acquisition unit 102 .
  • the virtual camera information acquisition unit 101 acquires virtual camera parameters on a frame-by-frame basis, but is not limited to this.
  • the virtual camera information acquisition unit 101 may acquire the virtual camera parameters collectively after the virtual camera parameters for a predetermined number of frames are determined.
  • Similarly, the shooting time information acquisition unit 102 may acquire the shooting time information collectively for a predetermined number of frames instead of frame by frame.
  • In step S505, the virtual camera path data generation unit 103 generates virtual camera path data in which the virtual camera parameters acquired by the virtual camera information acquisition unit 101 are associated with the shooting time information acquired by the shooting time information acquisition unit 102.
  • Finally, the virtual camera path data output unit 104 adds header information and the like to the virtual camera path data generated by the virtual camera path data generation unit 103, and outputs the data.
  • the virtual camera path data can be output to the storage device 4 and stored, or output to the sequence data processing device 7 and used to generate sequence data.
  • the sequence data may include the material data acquired from the storage device 4 based on the shooting time information in the virtual camera path data.
  • the virtual camera path data is output to a virtual viewpoint video generation device in Embodiment 2, which will be described later, and used to generate a virtual viewpoint video.
  • the shooting time information is acquired from the virtual camera operation device 6, but the information is not limited to this, and information that can specify the shooting time, such as the number of frames from the beginning of the video, may be acquired.
  • As for the virtual camera parameters and the shooting time information, a data structure in which both are included in one data set has been described, but the method is not limited to this.
  • For example, the virtual camera parameters and the shooting time information may be stored in separate files and linked with each other by predetermined information.
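  • The per-frame flow described above can be summarized as in the sketch below; every object and method name here is a hypothetical stand-in for the units described above, not an API defined by this disclosure.

```python
def generate_virtual_camera_path(operation_device, path_generator, output_unit):
    """Per-frame loop (steps S501/S503): acquire virtual camera parameters
    and shooting time information from the operation device and associate
    them; then generate (S505) and output the virtual camera path data."""
    frames = []
    while operation_device.has_next_frame():
        params = operation_device.get_camera_params()       # virtual camera information
        shooting_time = operation_device.get_shooting_time()  # shooting time information
        frames.append((shooting_time, params))              # associate per frame
    path_data = path_generator.build(frames)                # S505: generate path data
    output_unit.emit_with_header(path_data)                 # add header information, output
```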
  • FIG. 6 is a diagram showing a configuration example of a system including a virtual viewpoint video generation device according to this embodiment.
  • the virtual viewpoint video generation device 600 is connected to the virtual camera path data processing device 1 and the storage device 4 .
  • the configurations of the virtual camera path data processing device 1 and the storage device 4 are the same as those of the first embodiment. Descriptions of the same configurations as those of the first embodiment will be omitted below.
  • The hardware configuration of the virtual viewpoint video generation device 600 is the same as the hardware configuration shown in FIG. 10. Note that the virtual viewpoint video generation device 600 may be included in the information processing system 10 according to the first embodiment.
  • the virtual viewpoint video generation device 600 has a virtual camera path data acquisition unit 601, a shooting time information acquisition unit 602, a material data management unit 603, a virtual viewpoint video generation unit 604, and a virtual viewpoint video output unit 605.
  • the virtual viewpoint video generation device 600 in this embodiment acquires virtual camera path data from the virtual camera path data processing device 1 and generates a virtual viewpoint video based on the acquired virtual camera path data.
  • Each processing unit will be described below.
  • a virtual camera path data acquisition unit 601 acquires virtual camera path data output from the virtual camera path data processing device 1 .
  • the virtual camera path data acquisition unit 601 may acquire the virtual camera path data output by the virtual camera path data processing device 1 as a data file or as packet data.
  • The virtual camera path data acquisition unit 601 may acquire the virtual camera path data in units of frames, in units of a certain number of frames, or in units of one or more virtual camera path data sets.
  • The virtual viewpoint video output unit 605, which will be described later, can distinguish and output the virtual viewpoint video corresponding to each virtual camera path data set.
  • Each virtual camera path data set can be distinguished by the identification ID described in its header.
  • In addition, the virtual viewpoint video output unit 605 may add the identification ID described in the virtual camera path data set to the metadata of the virtual viewpoint video to be output.
  • a shooting time information acquisition unit 602 acquires shooting time information corresponding to the virtual camera path time information included in the virtual camera path data acquired by the virtual camera path data acquisition unit 601 .
  • the material data management unit 603 acquires material data corresponding to the shooting time information acquired by the shooting time information acquisition unit 602 from the storage device 4 .
  • As described above, the material data is associated with the shooting time information, so the material data management unit 603 can acquire the material data by referring to the shooting time information associated with it. Note that the material data management unit 603 holds the correspondence between the acquired material data, the virtual camera path data set, and the virtual camera path time information.
  • necessary material data is acquired based on the method of generating the virtual viewpoint video in the virtual viewpoint video generation unit 604.
  • When the generation method is based on foreground and background models, the point cloud model data or mesh model data of the foreground and background, the corresponding texture images, or the captured images and camera calibration data for generating the textures are acquired.
  • When the generation method does not use such models, as in image-based rendering, the captured images, camera calibration data, and the like are acquired.
  • The virtual viewpoint video generation unit 604 acquires the virtual camera parameters included in the virtual camera path data acquired by the virtual camera path data acquisition unit 601, and generates a virtual viewpoint video using the acquired virtual camera parameters and the material data acquired by the material data management unit 603. For example, the frame of the virtual viewpoint video corresponding to the first virtual camera path time information of a virtual camera path data set is generated using the shooting time information and virtual camera parameters associated with that time information. In this way, each frame of the virtual viewpoint video is generated using the shooting time information and the virtual camera parameters corresponding to the same virtual camera path time information.
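  • In outline, the per-frame generation can be sketched as follows; the store and renderer objects are hypothetical stand-ins for the material data management unit 603 and the rendering inside unit 604, and the rows follow the CameraPathRow sketch given earlier.

```python
def generate_virtual_viewpoint_video(camera_path_rows, material_store, renderer):
    """For each camera-path entry, fetch the material data for its shooting
    time, then render it from the associated virtual camera."""
    video_frames = []
    for row in camera_path_rows:            # e.g. CameraPathRow instances
        material = material_store.get_material(row.shooting_time)
        frame = renderer.render(material, row.position, row.orientation)
        video_frames.append(frame)          # handed on to the output unit 605
    return video_frames
```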
  • a virtual viewpoint video output unit 605 acquires the virtual viewpoint video from the virtual viewpoint video generation unit 604 and outputs the virtual viewpoint video using a display device such as a display. Note that the virtual viewpoint video output unit 605 may output the virtual viewpoint video acquired from the virtual viewpoint video generation unit 604 as an image data file or packet data.
  • In step S701, the virtual camera path data acquisition unit 601 repeats data acquisition for each frame in the virtual camera path data.
  • In step S702, the virtual camera path data acquisition unit 601 acquires virtual camera path data from the virtual camera path data processing device 1.
  • In step S703, the shooting time information acquisition unit 602 acquires shooting time information from the virtual camera path data acquisition unit 601.
  • In step S704, the material data management unit 603 acquires material data corresponding to the shooting time information acquired by the shooting time information acquisition unit 602 from the storage device 4.
  • In step S705, the virtual viewpoint video generation unit 604 acquires virtual camera parameters from the virtual camera path data acquired by the virtual camera path data acquisition unit 601.
  • In step S706, the virtual viewpoint video generation unit 604 generates a virtual viewpoint video based on the material data acquired by the material data management unit 603 and the virtual camera parameters acquired in step S705.
  • In step S707, the virtual viewpoint video output unit 605 outputs the virtual viewpoint video generated by the virtual viewpoint video generation unit 604 using a display device such as a display, or outputs it as a data file or packet data.
  • Steps S701 to S707 are repeated until the processing of the final frame included in the virtual camera path data ends or until the input in frame units ends.
  • FIG. 8 is a diagram showing the communication status of each part.
  • the virtual camera path data processing device 1 is activated, and notifies the virtual viewpoint video generation device 600 of the start of generation of the virtual viewpoint video.
  • A control unit, configured by a CPU or the like, that controls the virtual viewpoint video generation device 600 notifies the virtual camera path data acquisition unit 601 and the other units of the start of virtual viewpoint video generation, and each unit prepares accordingly.
  • the virtual camera path data processing device 1 transmits virtual camera path data to the virtual camera path data acquisition unit 601 .
  • The virtual camera path data acquisition unit 601 sends the virtual camera parameters of the received virtual camera path data to the virtual viewpoint video generation unit 604, and sends the shooting time information to the shooting time information acquisition unit 602.
  • the shooting time information acquisition unit 602 sends the shooting time information to the material data management unit 603 .
  • the material data management unit 603 acquires material data corresponding to the input shooting time information from the storage device 4 and sends the acquired material data to the virtual viewpoint video generation unit 604 .
  • a virtual viewpoint video generation unit 604 renders and generates a virtual viewpoint video based on the acquired virtual camera information and material data.
  • The virtual viewpoint video generation unit 604 sends the generated virtual viewpoint video to the virtual viewpoint video output unit 605 as soon as the rendering is completed.
  • the virtual viewpoint video generation unit 604 requests information regarding the frame of the virtual viewpoint video to be generated next. Thereafter, acquisition of virtual camera path data, acquisition of material data and virtual camera information corresponding to shooting time information of the virtual camera path data, and generation and output of virtual viewpoint video are repeated in order to process the next frame.
  • When the end of transmission is notified from the virtual camera path data processing device 1 to the virtual camera path data acquisition unit 601, all processing ends.
  • processing is shown in the flowchart as a sequential flow, but it is not limited to this.
  • a plurality of virtual camera path data may be output in parallel.
  • the data exchanged at once may be in units of frames, or may be collectively exchanged in units of multiple frames.
  • the virtual camera path data acquisition unit 601 may receive virtual camera path data in units of multiple frames.
  • In that case, the virtual camera path data for the received frames may be stored in the virtual camera path data acquisition unit 601, and the information of the stored virtual camera path data may then be transmitted sequentially in frame units.
  • the transmission order of the virtual camera information and the material data sent to the virtual viewpoint video generation unit 604 is not limited to this, and the order may be reversed or they may be sent simultaneously.
  • With the above configuration, it is possible to generate a virtual viewpoint video in which the motion of the subject is stopped, fast-forwarded, reversed, or played in slow motion while the virtual camera moves in a predetermined manner.
  • FIG. 9A shows an example of virtual camera path data when the movement of the subject is stopped.
  • a table 901 in FIG. 9A is an example of virtual camera path data for generating a virtual viewpoint video including a state in which motion of a subject is stopped.
  • The virtual camera path has 100 frames, of which camera path frames 50 to 52 represent a state in which the time of the subject corresponding to the virtual camera is stopped.
  • A virtual camera 902 in FIG. 9B represents the position and orientation of the virtual camera in camera path frame 0, and a virtual camera path 903 represents the change in the position and orientation of the virtual camera from camera path frame 0 to 99. In this example, the virtual camera keeps moving while changing its orientation.
  • a subject 904 in FIG. 9C is a subject corresponding to camera path frame 0, and 905 represents the movement of the subject.
  • the subject 904 is making a jumping motion.
  • If the photographing time corresponding to the virtual camera path is not manipulated, the resulting video simply shows the subject 904 jumping and landing.
  • By repeating the same shooting time information over consecutive camera path frames, however, the movement of the subject can be stopped.
  • 906 is a period corresponding to camera path frames 49 to 52. During the period 906, the subject 904 appears as if it were floating in the air, and a video is generated in which the position and orientation of the virtual camera change in this state.
  • If the relative frame numbers in FIG. 9A are set so as to go back in time, such as 10, 9, 8, ..., a virtual viewpoint video in which the subject moves in reverse is generated.
  • If the relative frame numbers in FIG. 9A skip every other frame, such as 1, 3, 5, 7, ..., a fast-forwarded virtual viewpoint video is generated. In this way, by including arbitrary times in the shooting time information, it is possible to generate a virtual viewpoint video including an arbitrary movement of the subject.
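  • For illustration, the following shooting-time sequences (relative frame numbers, one entry per camera path frame) would realize the manipulations above; the concrete numbers are assumptions matching the examples in the text, and the virtual camera parameters vary independently of these sequences.

```python
# Shooting-time information (relative frame numbers), one per camera-path frame.
normal = list(range(100))                 # subject moves as captured
# Time stops at material frame 50 during camera-path frames 50-52, then resumes.
paused = [i if i < 50 else (50 if i <= 52 else i - 2) for i in range(100)]
reverse = list(range(10, -1, -1))         # 10, 9, 8, ...: reverse playback
fast_forward = list(range(1, 100, 2))     # 1, 3, 5, 7, ...: fast-forward
```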
  • the virtual viewpoint video generation device 600 can acquire data including virtual camera parameters and information about the shooting time of the subject corresponding to the virtual camera parameters, and generate a virtual viewpoint video.
  • the virtual viewpoint video generation device 600 according to the present embodiment is configured to acquire the virtual camera path data directly from the virtual camera path data processing device 1, but the configuration is not limited to this.
  • the virtual viewpoint video generation device 600 may, for example, acquire the sequence data generated by the sequence data processing device 7 in the first embodiment and generate a virtual viewpoint video. At this time, if material data is included in the sequence data, the virtual viewpoint video generation device 600 can generate the virtual viewpoint video without acquiring the material data from the storage device 4 .
  • the virtual viewpoint video generation device 600 may be configured to acquire virtual camera path data stored in the storage device 4 or another storage device and generate a virtual viewpoint video.
  • As described above, according to the present embodiment, generating virtual camera path data including shooting time information makes it possible to output information for generating a virtual viewpoint video including a state different from the state of the subject that was actually shot. However, the configuration is not limited to this.
  • flags corresponding to actions such as stopping the movement of the subject, playing in reverse, and fast-forwarding may be set in advance and associated with each frame.
  • In that case, the virtual camera path data may describe only the shooting time of the subject corresponding to the first frame of the virtual camera path, and the subsequent frames are played back in accordance with the flags associated with them.
  • The present disclosure can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or device via a network or a storage medium, and one or more processors in a computer of the system or device read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Processing Or Creating Images (AREA)
  • Closed-Circuit Television Systems (AREA)
  • Studio Devices (AREA)

Abstract

A virtual camera path data processing device 1 comprises: a virtual camera information acquisition unit 101 that acquires, in correspondence with the moving-image frames constituting a virtual viewpoint video generated on the basis of a plurality of captured images obtained by capturing a subject with a plurality of image capture devices, a parameter representing the position of the virtual viewpoint and the line-of-sight direction from the virtual viewpoint; a shooting time information acquisition unit 102 for acquiring time information for identifying the time at which the subject was captured; and a virtual camera path data output unit for outputting virtual viewpoint data in which the plurality of acquired parameters and the acquired time information are associated with each other.
PCT/JP2022/018122 2021-04-27 2022-04-19 Information processing device, information processing method, and program WO2022230715A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021-075039 2021-04-27
JP2021075039A JP2022169176A (ja) 2021-04-27 2021-04-27 情報処理装置、情報処理方法、及び、プログラム

Publications (1)

Publication Number Publication Date
WO2022230715A1 (fr) 2022-11-03

Family

ID=83847095

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/018122 WO2022230715A1 (fr) 2021-04-27 2022-04-19 Information processing device, information processing method, and program

Country Status (2)

Country Link
JP (1) JP2022169176A (fr)
WO (1) WO2022230715A1 (fr)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019159593A (ja) * 2018-03-09 2019-09-19 キヤノン株式会社 (Canon Inc.) Image retrieval system, image retrieval device, image retrieval method, and program
WO2020213426A1 (fr) * 2019-04-18 2020-10-22 ソニー株式会社 (Sony Corporation) Image processing device, image processing method, and program

Also Published As

Publication number Publication date
JP2022169176A (ja) 2022-11-09


Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22795628; Country of ref document: EP; Kind code of ref document: A1)

NENP Non-entry into the national phase (Ref country code: DE)

122 EP: PCT application non-entry in European phase (Ref document number: 22795628; Country of ref document: EP; Kind code of ref document: A1)