CN114399553A - Virtual viewpoint generation method and device based on camera posture - Google Patents


Info

Publication number
CN114399553A
Authority
CN
China
Prior art keywords
camera
virtual viewpoint
adjacent
images
matching point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111467673.4A
Other languages
Chinese (zh)
Inventor
桑新柱
叶晓倩
王华春
陈铎
王鹏
刘博阳
齐帅
万华明
李宁驰
王葵如
颜玢玢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN202111467673.4A
Publication of CN114399553A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T7/85: Stereo camera calibration
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/60: Analysis of geometric attributes
    • G06T7/62: Analysis of geometric attributes of area, perimeter, diameter or volume
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10004: Still image; Photographic image
    • G06T2207/10012: Stereo images
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20081: Training; Learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/30: Subject of image; Context of image processing
    • G06T2207/30244: Camera pose

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a virtual viewpoint generation method and device based on camera pose. The method comprises the following steps: constructing a three-dimensional point corresponding to a matching point pair according to the matching point pair between two adjacent viewpoint images and the internal parameters and external parameters of the two cameras in a camera array corresponding to the adjacent viewpoint images; and re-projecting each three-dimensional point according to the target camera pose and internal parameters of each target camera corresponding to a selected camera pose, to determine the virtual viewpoint image corresponding to that camera pose. The method can improve the generation efficiency of virtual viewpoints.

Description

Virtual viewpoint generation method and device based on camera posture
Technical Field
The application relates to the technical field of image processing, in particular to a virtual viewpoint generation method and device based on camera pose.
Background
The real world is three-dimensional, but currently mainstream display devices are still two-dimensional. Three-dimensional displays, particularly naked-eye three-dimensional displays, are therefore receiving increasing attention. Naked-eye three-dimensional display requires dense viewpoint images, but acquiring dense viewpoints with a camera array raises many difficulties, such as synchronization among the cameras in the array, camera calibration and pose solving, and data storage and transmission. In practical applications, therefore, a few real binocular cameras are usually used to acquire sparse viewpoints, and virtual viewpoints are generated by a virtual viewpoint generation method.
In the related art, DIBR (Depth-Image-Based Rendering) may be employed to generate virtual viewpoints. However, DIBR requires the left and right images to be rectified so that the epipolar lines of the two images lie on the same horizontal line; it can therefore only generate virtual viewpoints located on that epipolar line, not virtual viewpoint images at arbitrary positions, which results in low generation efficiency of virtual viewpoints.
Disclosure of Invention
The embodiments of the application provide a virtual viewpoint generation method and device based on camera pose, which improve the generation efficiency of virtual viewpoints.
In a first aspect, an embodiment of the present application provides a method for generating a virtual viewpoint based on a camera pose, including:
constructing a three-dimensional point corresponding to a matching point pair according to the matching point pair between two adjacent viewpoint images and internal parameters and external parameters of two cameras in a camera array corresponding to the adjacent viewpoint images;
and re-projecting each three-dimensional point according to the target camera pose and the internal parameters of each target camera corresponding to the selected camera pose, to determine a virtual viewpoint image corresponding to the camera pose.
In one embodiment, before constructing a three-dimensional point corresponding to a matching point pair between adjacent viewpoint images according to the matching point pair and the internal and external parameters of two cameras corresponding to the adjacent viewpoint images, the method further includes:
extracting a first matching point from a first viewpoint image of two adjacent viewpoint images;
extracting a second matching point matched with the first matching point from a second viewpoint image of two adjacent viewpoint images according to the trained optical flow estimation model;
and forming a matching point pair according to the first matching point and the second matching point.
In one embodiment, the constructing a three-dimensional point corresponding to a matching point pair according to the matching point pair between two adjacent viewpoint images and the internal parameters and the external parameters of two cameras corresponding to the adjacent viewpoint images includes:
determining two light ray directions corresponding to two cameras according to a matching point pair between two adjacent viewpoint images and internal parameters and external parameters of the two cameras corresponding to the two adjacent viewpoint images;
and processing the two light ray directions by direct linear transformation to construct a three-dimensional point corresponding to the matching point pair.
In one embodiment, the re-projecting each three-dimensional point according to the target camera pose and the internal parameters of each target camera corresponding to the selected camera pose to determine a virtual viewpoint image corresponding to the camera pose includes:
when the selected camera pose is a horizontal position, leveling the camera position of each target camera according to the three-dimensional coordinates of each target camera in the camera array, and acquiring the leveled coordinates of each target camera;
determining the target camera pose of each target camera according to the leveled coordinates of each target camera and the leveling rotation angle obtained by averaging the rotation angles of the target cameras;
and re-projecting each three-dimensional point according to the target camera pose and the internal parameters of each target camera, and determining a virtual viewpoint image corresponding to the camera pose.
In an embodiment, the re-projecting each three-dimensional point and determining a virtual viewpoint image corresponding to the camera pose includes:
re-projecting each three-dimensional point by taking the two adjacent viewpoint images as reference images, and determining two adjacent initial virtual viewpoint images corresponding to the camera pose;
and hole-filling the left image of the adjacent initial virtual viewpoint images to generate the virtual viewpoint image.
In an embodiment, the hole-filling of the left image of the adjacent initial virtual viewpoint images includes:
acquiring a first hole position at which the hole area in the left image of the adjacent initial virtual viewpoint images is larger than a preset area;
and acquiring the pixels corresponding to the first hole position from the right image and filling them into the first hole position.
In one embodiment, the method further comprises:
acquiring a second hole position at which the hole area in the left image is smaller than or equal to the preset area;
and filling the hole at the second hole position through a closing operation.
In a second aspect, an embodiment of the present application provides a virtual viewpoint generating apparatus based on camera pose, including:
the three-dimensional point construction module is used for constructing three-dimensional points corresponding to matching point pairs according to the matching point pairs between two adjacent viewpoint images and internal parameters and external parameters of two cameras in a camera array corresponding to the adjacent viewpoint images;
and the virtual viewpoint generating module is used for re-projecting each three-dimensional point according to the target camera pose and the internal parameters of each target camera corresponding to the selected camera pose, and determining a virtual viewpoint image corresponding to the camera pose.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a processor and a memory storing a computer program, where the processor implements the steps of the method for generating a virtual viewpoint based on a camera pose according to the first aspect when executing the program.
In a fourth aspect, the present application provides a computer program product, which includes a computer program that, when being executed by a processor, implements the steps of the camera pose-based virtual viewpoint generation method according to the first aspect.
According to the camera pose-based virtual viewpoint generation method and device, three-dimensional points are first reconstructed from the matching point pairs between two adjacent viewpoint images and the internal and external parameters of the two corresponding cameras. The three-dimensional points are then re-projected according to the target camera pose and internal parameters of each target camera corresponding to a selected camera pose, to determine the virtual viewpoint image corresponding to that camera pose. The reconstructed three-dimensional points can therefore be re-projected under any camera pose to generate the virtual viewpoint image at the corresponding position, and virtual viewpoint images at different positions are obtained simply by selecting different camera poses, without rectifying the left and right images, which improves the generation efficiency of virtual viewpoints.
Drawings
In order to more clearly illustrate the technical solutions in the present application or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic flowchart of a method for generating a virtual viewpoint based on a camera pose provided in an embodiment of the present application;
FIG. 2 is a schematic view of triangulation provided by embodiments of the present application;
fig. 3 is a schematic diagram of a camera position and an optical axis direction in real horizontal shooting according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a position and an optical axis direction of a camera after position and optical axis leveling provided by an embodiment of the present application;
fig. 5 is a schematic structural diagram of a virtual viewpoint generating apparatus based on camera pose provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
To make the purpose, technical solutions and advantages of the present application clearer, the technical solutions in the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
For a better understanding of the solution, the technical terms involved in the embodiments of the present application are explained first.
Camera internal parameters refer to parameters related to the characteristics of the camera itself, such as the focal length and pixel size of the camera;
the camera pose refers to camera external parameters, namely parameters of the camera under a world coordinate system, such as the position and the rotation direction of the camera;
Re-projection refers to a second projection: the first projection occurs when the camera takes a picture, projecting 3D points onto the image. 3D points can be reconstructed from the matching points of an image pair, and projecting the reconstructed 3D points onto an image plane again through a camera pose is called re-projection.
The optical flow refers to the instantaneous speed and direction of pixel motion of a space moving object on an observation imaging plane;
convolutional Neural Networks (CNN), a class of feed forward neural networks that includes convolutional computations and has a deep structure, is one of the representative algorithms for deep learning.
The embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic flowchart of a method for generating a virtual viewpoint based on camera pose according to an embodiment of the present application; the method is applied to a server or a terminal device to generate virtual viewpoints. As shown in fig. 1, the method for generating a virtual viewpoint based on camera pose provided by this embodiment includes:
step 101, constructing a three-dimensional point corresponding to a matching point pair according to the matching point pair between two adjacent viewpoint images and internal parameters and external parameters of two cameras in a camera array corresponding to the adjacent viewpoint images;
and 102, re-projecting each three-dimensional point according to the target camera pose and the internal parameters of each target camera corresponding to the selected camera pose, and determining a virtual viewpoint image corresponding to the camera pose.
After the three-dimensional points are reconstructed from the matching point pairs between two adjacent viewpoint images and the internal and external parameters of the two corresponding cameras, they are re-projected according to the target camera pose and internal parameters of each target camera corresponding to the selected camera pose to determine the virtual viewpoint image corresponding to that pose. The reconstructed three-dimensional points can thus be re-projected under any camera pose to generate the virtual viewpoint image at the corresponding position, and virtual viewpoint images at different positions are generated by selecting different camera poses, without rectifying the left and right images, which improves the generation efficiency of virtual viewpoints.
In step 101, the internal parameters and the external parameters of each camera may be obtained by calibrating the camera with a checkerboard image captured by it. Specifically, a real-world coordinate system is defined in advance from a checkerboard image shot by the camera, and the 3D coordinate of each checkerboard corner in the world coordinate system is obtained. Meanwhile, corner detection is carried out on the checkerboard image to obtain the 2D pixel coordinate of each checkerboard corner. After the 3D world coordinate and the 2D pixel coordinate of every checkerboard corner are obtained, the camera shooting the checkerboard image is calibrated according to the projection relation between the 3D coordinate and the 2D pixel coordinate of any checkerboard corner, x = K[R|T]X (Equation 1), so as to obtain the internal parameters and the external parameters of the camera.
Wherein x is a homogeneous 2D pixel coordinate, X is a homogeneous 3D coordinate, K is the camera intrinsic matrix, and R and T are the rotation and translation in the camera extrinsic parameters, respectively.
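Illustratively, this calibration step can be sketched in Python with OpenCV's checkerboard tools. This is a minimal sketch under stated assumptions, not the patent's implementation: the board dimensions, square size, and image file names are hypothetical.

    import cv2
    import numpy as np

    pattern = (9, 6)      # inner corners per row and column (assumed)
    square = 25.0         # checkerboard square size in mm (assumed)

    # 3D coordinates of the checkerboard corners in the world coordinate
    # system defined by the board plane (Z = 0)
    obj = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    obj[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

    obj_pts, img_pts = [], []
    for path in ["board_0.png", "board_1.png"]:   # hypothetical image files
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        found, corners = cv2.findChessboardCorners(gray, pattern)
        if found:
            # corner detection refined to sub-pixel 2D pixel coordinates
            corners = cv2.cornerSubPix(
                gray, corners, (11, 11), (-1, -1),
                (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
            obj_pts.append(obj)
            img_pts.append(corners)

    # K is the intrinsic matrix of Equation 1; each rvec/tvec pair gives the
    # rotation R (as a Rodrigues vector) and translation T of one view
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_pts, img_pts, gray.shape[::-1], None, None)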
In an embodiment, for two adjacent viewpoint images, the two images may be matched by image matching to obtain a first matching point in the first viewpoint image and a second matching point in the second viewpoint image that matches the first matching point, and the first matching point and the second matching point are taken as a matching point pair.
For adjacent viewpoint images captured by the camera array, it is difficult to place the multiple cameras exactly horizontally, so the optical axes are not necessarily parallel and both horizontal parallax and vertical parallax exist between the captured viewpoint images. Therefore, in order to obtain accurate matching point pairs, in an embodiment, before constructing the three-dimensional point corresponding to a matching point pair according to the matching point pair between adjacent viewpoint images and the internal and external parameters of the two cameras corresponding to the adjacent viewpoint images, the method further includes:
extracting a first matching point from a first viewpoint image of two adjacent viewpoint images;
extracting a second matching point matched with the first matching point from a second viewpoint image of two adjacent viewpoint images according to the trained optical flow estimation model;
and forming a matching point pair according to the first matching point and the second matching point.
In one embodiment, a first matching point is extracted from the first viewpoint image, an optical flow estimation model then performs matching estimation between the viewpoints, and a second matching point matched with the first matching point is extracted from the second viewpoint image. Considering that deep convolutional networks achieve high accuracy and efficiency in optical flow estimation, the optical flow estimation model may be obtained by training a convolutional neural network.
Illustratively, the optical flow estimation model may be a RAFT model. Because RAFT uses a recurrent unit to perform multiple iterations, the balance between the computation speed of the matching point pairs and the prediction accuracy can be tuned by increasing or decreasing the number of iterations.
Extracting the two matched points of each matching point pair from the two adjacent viewpoint images through the optical flow estimation model avoids inaccurate matching point pairs caused by the horizontal and vertical parallax between adjacent viewpoint images, improves the accuracy and efficiency of the obtained matching point pairs, and thereby improves the accuracy of the virtual viewpoint images subsequently determined from them.
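By way of illustration only, such a matching step could be sketched with the pretrained RAFT model shipped in torchvision; this specific model and API are assumptions for concreteness, since the embodiment only requires some CNN-based optical flow model. left_u8 and right_u8 are assumed uint8 tensors of shape (3, H, W) holding the two adjacent viewpoint images.

    import torch
    from torchvision.models.optical_flow import raft_large, Raft_Large_Weights

    weights = Raft_Large_Weights.DEFAULT
    model = raft_large(weights=weights).eval()

    # normalize and batch the two viewpoint images as the weights expect;
    # H and W must be divisible by 8 for RAFT
    img1, img2 = weights.transforms()(left_u8.unsqueeze(0), right_u8.unsqueeze(0))

    with torch.no_grad():
        flow = model(img1, img2)[-1]      # last RAFT iteration, shape (1, 2, H, W)

    # first matching points: every pixel of the first viewpoint image;
    # second matching points: the same pixels displaced by the estimated flow
    h, w = flow.shape[-2:]
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    p1 = torch.stack([xs, ys], dim=-1).float()        # (H, W, 2)
    p2 = p1 + flow[0].permute(1, 2, 0)                # matched points in view 2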
In an embodiment, after obtaining the matching point pairs of the two adjacent viewpoint images and the corresponding internal parameters and external parameters of the two cameras, the positions of the three-dimensional points can be obtained through triangulation.
Illustratively, given the first matching point p1 and the second matching point p2 of a matching point pair and the internal and external parameters of the two cameras, the positions O1 and O2 of the two cameras can be derived from the internal and external parameters, and together with p1 and p2 they give the ray directions O1p1 and O2p2 corresponding to the two cameras. If all parameters were calculated absolutely accurately, the two rays O1p1 and O2p2 would intersect at a point P in three-dimensional space, as shown in fig. 2. The point P is the three-dimensional point corresponding to the matching point pair.
However, since there may be errors in camera calibration or optical flow estimation, the rays O1p1 and O2p2 may fail to intersect, in which case the corresponding three-dimensional point cannot be obtained directly, affecting the generation of subsequent virtual viewpoint images. To this end, in an embodiment, the constructing a three-dimensional point corresponding to a matching point pair according to the matching point pair between two adjacent viewpoint images and the internal parameters and the external parameters of the two cameras corresponding to the adjacent viewpoint images includes:
determining two light ray directions corresponding to two cameras according to a matching point pair between two adjacent viewpoint images and internal parameters and external parameters of the two cameras corresponding to the two adjacent viewpoint images;
and processing the two light ray directions by direct linear transformation to construct a three-dimensional point corresponding to the matching point pair.
In one embodiment, after the two rays O1p1 and O2p2 are determined, whether they intersect is checked. If they intersect, the intersection point is directly taken as the three-dimensional point. If they do not intersect, the nearest 3D point can be solved by the direct linear transformation method and taken as the reconstructed 3D point; in a specific implementation, this can be solved with the triangulatePoints function in the opencv library, thereby constructing the three-dimensional point corresponding to the matching point pair.
When the two rays of the two cameras corresponding to two adjacent viewpoint images do not intersect, processing the two light ray directions by direct linear transformation to construct the three-dimensional point avoids the situation in which no three-dimensional point can be constructed because of camera calibration errors or matching point errors, further improving the generation efficiency of subsequent virtual viewpoint images.
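Illustratively, triangulating a batch of matching point pairs with the triangulatePoints function mentioned above might look as follows; K1, R1, T1, K2, R2, T2 stand for the calibrated parameters of the two cameras and pts1, pts2 for the matched pixel coordinates (names assumed).

    import cv2
    import numpy as np

    # 3x4 projection matrices P = K [R | T] of the two cameras (Equation 1)
    P1 = K1 @ np.hstack([R1, T1.reshape(3, 1)])
    P2 = K2 @ np.hstack([R2, T2.reshape(3, 1)])

    # pts1, pts2: 2xN float arrays of matched pixel coordinates
    X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)   # 4xN homogeneous points

    # de-homogenize to obtain the Nx3 three-dimensional points; this is the
    # least-squares (direct linear transformation) solution, so it also covers
    # the case where the two rays do not exactly intersect
    X = (X_h[:3] / X_h[3]).T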
In step 102, after the three-dimensional points are obtained, the target camera pose and the internal parameters of each corresponding target camera may be determined from the camera array for any selected camera pose, and each three-dimensional point is then re-projected with the target camera pose (that is, the external parameters) and the internal parameters according to Equation 1 above, so as to obtain the virtual viewpoint image at the position corresponding to the given camera pose.
The arbitrarily selected camera pose may place the target cameras in a horizontal position, that is, target cameras for horizontal shooting, or in a ring arrangement, and so on.
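A simple sketch of the re-projection itself under Equation 1: each reconstructed 3D point is transformed by a chosen virtual pose (R_v, T_v), projected through the intrinsics K_v, and splatted to its nearest pixel with a depth buffer so the closest point wins. The nearest-pixel splatting and depth buffering are assumptions for illustration; unwritten pixels remain as the holes discussed below.

    import numpy as np

    def reproject(X, colors, K_v, R_v, T_v, h, w):
        """X: Nx3 reconstructed 3D points; colors: Nx3 colors sampled from a
        reference viewpoint image."""
        x_cam = R_v @ X.T + T_v.reshape(3, 1)            # camera coordinates, 3xN
        uvw = K_v @ x_cam                                # projection by intrinsics
        uv = np.round(uvw[:2] / uvw[2]).astype(int).T    # Nx2 pixel coordinates
        z = x_cam[2]
        img = np.zeros((h, w, 3), np.uint8)              # unwritten pixels = holes
        zbuf = np.full((h, w), np.inf)
        for (u, v), zi, c in zip(uv, z, colors):
            if 0 <= v < h and 0 <= u < w and 0 < zi < zbuf[v, u]:
                img[v, u] = c                            # keep the nearest point
                zbuf[v, u] = zi
        return img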
In particular, when the selected camera pose is a horizontal position, the positions of the target cameras used for horizontal shooting may not lie on the same horizontal line and their optical axes may not be parallel, as shown in fig. 3, so the finally formed virtual viewpoint image may not be accurate enough. To this end, in an embodiment, the re-projecting each three-dimensional point according to the target camera pose and the internal parameters of each target camera corresponding to the selected camera pose, and determining the virtual viewpoint image corresponding to the camera pose includes:
when the selected camera pose is a horizontal position, leveling the camera position of each target camera according to the three-dimensional coordinates of each target camera in the camera array, and acquiring the leveled coordinates of each target camera;
determining the target camera pose of each target camera according to the leveled coordinates of each target camera and the leveling rotation angle obtained by averaging the rotation angles of the target cameras;
and re-projecting each three-dimensional point according to the target camera pose and the internal parameters of each target camera, and determining a virtual viewpoint image corresponding to the camera pose.
In one embodiment, when the selected camera pose is a horizontal position, the three-dimensional coordinates of each target camera can be determined from the internal and external parameters of each target camera in the horizontal position. The vertical coordinates of the target cameras are averaged to obtain the average vertical coordinate, and a straight line is fitted to the X and Y coordinates of the target cameras by the least squares method. Since the target cameras are to remain in a horizontal row, the X coordinate of each camera is unchanged, and a new Y coordinate of each target camera is obtained from the fitted line. The leveled coordinate of each target camera is then formed by its original X coordinate, its new Y coordinate, and the average vertical coordinate.
For example, assume there are 5 target cameras. For the vertical coordinate Z, let the Z coordinates of the 5 camera positions be Z1 to Z5; the average vertical coordinate of the 5 target cameras is then

Z_avg = (Z1 + Z2 + Z3 + Z4 + Z5) / 5

For the X and Y directions, a straight line is fitted by the least squares method, with the error function

E(a, b) = Σi (Yi − (a·Xi + b))²

where a and b are the slope and intercept of the straight line; solving for them gives the line equation. The original X coordinate of each target camera is unchanged, and its new Y coordinate is obtained through the line equation. The leveled coordinate of target camera i after position leveling is therefore

(Xi, a·Xi + b, Z_avg)
Because different cameras have different rotations, the optical axes of the target cameras may not be parallel; to level the optical axes, the rotation matrix in the external parameters of each target camera can be converted into rotation angles. Specifically, assume the rotation matrix R of a target camera is:

R = | r11  r12  r13 |
    | r21  r22  r23 |
    | r31  r32  r33 |

The three rotation angles of the target camera are then:

θx = atan2(r32, r33)
θy = atan2(−r31, √(r32² + r33²))
θz = atan2(r21, r11)

After the rotation angles (θx_i, θy_i, θz_i) of each target camera i are obtained, the rotation angles of all target cameras are averaged to obtain the leveling rotation angle shared by every target camera:

(θx_avg, θy_avg, θz_avg) = (1/N) Σi (θx_i, θy_i, θz_i)

where i represents the serial number of the target camera and N the number of target cameras.
After the leveled coordinates and the leveling rotation angle of the target cameras are obtained, the target cameras can be leveled as a whole, and the target camera pose of each target camera can be determined, as shown in fig. 4.
After the target camera pose of each target camera is obtained, the three-dimensional points can be re-projected according to the target camera pose and the internal parameters of each target camera, and the virtual viewpoint image is determined.
By leveling the camera positions and optical axes of the target cameras before re-projecting the three-dimensional points, the inaccuracy that arises when re-projection is performed through horizontally shooting cameras whose positions are not on the same horizontal line and/or whose optical axes are not parallel is avoided, thereby improving the quality of the generated virtual viewpoint image.
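Putting the leveling formulas above together, a NumPy sketch of the position and optical-axis leveling could read as follows; the function name and array layout are illustrative assumptions.

    import numpy as np

    def level_cameras(C, R):
        """C: Nx3 camera positions (X, Y, Z); R: list of N 3x3 rotation matrices."""
        X, Y, Z = C[:, 0], C[:, 1], C[:, 2]
        z_avg = Z.mean()                        # average vertical coordinate
        a, b = np.polyfit(X, Y, 1)              # least-squares line Y = a*X + b
        C_level = np.stack([X, a * X + b, np.full_like(X, z_avg)], axis=1)

        def to_angles(r):
            # rotation matrix -> three rotation angles, as in the formulas above
            theta_x = np.arctan2(r[2, 1], r[2, 2])
            theta_y = np.arctan2(-r[2, 0], np.hypot(r[2, 1], r[2, 2]))
            theta_z = np.arctan2(r[1, 0], r[0, 0])
            return theta_x, theta_y, theta_z

        angles = np.array([to_angles(r) for r in R])
        theta_avg = angles.mean(axis=0)         # leveling rotation angle, shared
        return C_level, theta_avg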
In an embodiment, the re-projecting each three-dimensional point and determining a virtual viewpoint image corresponding to the camera pose includes:
re-projecting each three-dimensional point by taking the two adjacent viewpoint images as reference images, and determining two adjacent initial virtual viewpoint images corresponding to the camera pose;
and hole-filling the left image of the adjacent initial virtual viewpoint images to generate the virtual viewpoint image.
In specific practice, it is found that the virtual viewpoint image obtained after re-projection may contain holes; filling these holes improves the quality of the virtual viewpoint image.
In one embodiment, holes with a larger area can be filled by bidirectional fusion. Specifically, the hole-filling of the left image of the adjacent initial virtual viewpoint images includes:
acquiring a first hole position at which the hole area in the left image of the adjacent initial virtual viewpoint image is larger than a preset area;
and acquiring a pixel corresponding to the first hole position from the right image and filling the pixel to the first hole position.
Because the reference images for re-projection are the two adjacent viewpoint images, re-projection yields two adjacent initial virtual viewpoint images, a left image Vl and a right image Vr. Since the left and right reference images are translated relative to each other, the hole directions of Vl and Vr are not consistent; therefore, when a hole position with a larger area exists, the corresponding pixels of Vr can be used to fill the hole positions of Vl, and the filled Vl is taken as the virtual viewpoint image.
In one embodiment, the method further comprises:
acquiring a second hole position at which the hole area in the left image is smaller than or equal to the preset area;
and filling the hole at the second hole position through a closing operation.
In one embodiment, fine holes can be filled directly through a closing operation, so as to obtain the virtual viewpoint image.
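Combining the two branches, the hole filling described above might be sketched as follows; holes are assumed to be zero-valued pixels of the re-projected image, and min_area stands in for the preset area threshold (both assumptions for illustration).

    import cv2
    import numpy as np

    def fill_holes(v_l, v_r, min_area=50):
        # unwritten (hole) pixels of the left re-projected image Vl
        hole = np.all(v_l == 0, axis=2).astype(np.uint8)
        n, labels, stats, _ = cv2.connectedComponentsWithStats(hole)
        for i in range(1, n):                             # label 0 = background
            if stats[i, cv2.CC_STAT_AREA] > min_area:
                mask = labels == i
                v_l[mask] = v_r[mask]                     # bidirectional fusion
        # remaining fine holes are removed by a morphological closing
        kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
        return cv2.morphologyEx(v_l, cv2.MORPH_CLOSE, kernel)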
The following describes a camera pose-based virtual viewpoint generating apparatus provided in an embodiment of the present application, and the camera pose-based virtual viewpoint generating apparatus described below and the camera pose-based virtual viewpoint generating method described above may be referred to in correspondence with each other.
In an embodiment, as shown in fig. 5, there is provided a virtual viewpoint generating apparatus based on camera pose, including:
a three-dimensional point constructing module 210, configured to construct a three-dimensional point corresponding to a matching point pair between two adjacent viewpoint images according to the matching point pair and internal parameters and external parameters of two cameras in a camera array corresponding to the adjacent viewpoint images;
the virtual viewpoint generating module 220 is configured to re-project each three-dimensional point according to the target camera pose and the internal parameter of each target camera corresponding to the selected camera pose, and determine a virtual viewpoint image corresponding to the camera pose.
In an embodiment, the three-dimensional point construction module 210 is further configured to:
extracting a first matching point from a first viewpoint image of two adjacent viewpoint images;
extracting a second matching point matched with the first matching point from a second viewpoint image of two adjacent viewpoint images according to the trained optical flow estimation model;
and forming a matching point pair according to the first matching point and the second matching point.
In an embodiment, the three-dimensional point construction module 210 is specifically configured to:
determining two light ray directions corresponding to two cameras according to a matching point pair between two adjacent viewpoint images and internal parameters and external parameters of the two cameras corresponding to the two adjacent viewpoint images;
and processing the two light ray directions by direct linear transformation to construct a three-dimensional point corresponding to the matching point pair.
In an embodiment, the virtual viewpoint generating module 220 is specifically configured to:
when the selected camera pose is a horizontal position, leveling the camera position of each target camera according to the three-dimensional coordinates of each target camera in the camera array, and acquiring the leveled coordinates of each target camera;
determining the target camera pose of each target camera according to the leveled coordinates of each target camera and the leveling rotation angle obtained by averaging the rotation angles of the target cameras;
and re-projecting each three-dimensional point according to the target camera pose and the internal parameters of each target camera, and determining a virtual viewpoint image corresponding to the camera pose.
In an embodiment, the virtual viewpoint generating module 220 is specifically configured to:
re-projecting each three-dimensional point by taking the two adjacent viewpoint images as reference images, and determining two adjacent initial virtual viewpoint images corresponding to the camera pose;
and hole-filling the left image of the adjacent initial virtual viewpoint images to generate the virtual viewpoint image.
In an embodiment, the virtual viewpoint generating module 220 is specifically configured to:
acquiring a first hole position at which the hole area in the left image of the adjacent initial virtual viewpoint image is larger than a preset area;
and acquiring a pixel corresponding to the first hole position from the right image and filling the pixel to the first hole position.
In an embodiment, the virtual viewpoint generating module 220 is further configured to:
acquiring a second hole position at which the hole area in the left image is smaller than or equal to the preset area;
and filling the hole at the second hole position through a closing operation.
Fig. 6 illustrates a physical structure diagram of an electronic device. As shown in fig. 6, the electronic device may include: a processor 810, a communication interface 820, a memory 830 and a communication bus 840, wherein the processor 810, the communication interface 820 and the memory 830 communicate with each other via the communication bus 840. The processor 810 may invoke a computer program in the memory 830 to perform the steps of the camera pose-based virtual viewpoint generation method, for example comprising:
constructing a three-dimensional point corresponding to a matching point pair according to the matching point pair between two adjacent viewpoint images and internal parameters and external parameters of two cameras in a camera array corresponding to the adjacent viewpoint images;
and re-projecting each three-dimensional point according to the target camera pose and the internal parameters of each target camera corresponding to the selected camera pose, to determine a virtual viewpoint image corresponding to the camera pose.
In addition, the logic instructions in the memory 830 may be implemented in the form of software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portion thereof that substantially contributes over the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In another aspect, the present application further provides a computer program product, where the computer program product includes a computer program, the computer program may be stored on a non-transitory computer readable storage medium, and when the computer program is executed by a processor, a computer can perform the steps of the camera pose based virtual viewpoint generation method provided in the foregoing embodiments, for example, the steps include:
constructing a three-dimensional point corresponding to a matching point pair according to the matching point pair between two adjacent viewpoint images and internal parameters and external parameters of two cameras in a camera array corresponding to the adjacent viewpoint images;
and re-projecting each three-dimensional point according to the target camera pose and the internal parameters of each target camera corresponding to the selected camera pose, to determine a virtual viewpoint image corresponding to the camera pose.
On the other hand, embodiments of the present application further provide a processor-readable storage medium, where the processor-readable storage medium stores a computer program, where the computer program is configured to cause a processor to perform the steps of the method provided in each of the above embodiments, for example, including:
constructing a three-dimensional point corresponding to a matching point pair according to the matching point pair between two adjacent viewpoint images and internal parameters and external parameters of two cameras in a camera array corresponding to the adjacent viewpoint images;
and re-projecting each three-dimensional point according to the target camera pose and the internal parameters of each target camera corresponding to the selected camera pose, to determine a virtual viewpoint image corresponding to the camera pose.
The processor-readable storage medium can be any available medium or data storage device that can be accessed by a processor, including, but not limited to, magnetic memory (e.g., floppy disks, hard disks, magnetic tape, magneto-optical disks (MOs), etc.), optical memory (e.g., CDs, DVDs, BDs, HVDs, etc.), and semiconductor memory (e.g., ROMs, EPROMs, EEPROMs, non-volatile memory (NAND FLASH), Solid State Disks (SSDs)), etc.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A virtual viewpoint generation method based on camera pose is characterized by comprising the following steps:
constructing a three-dimensional point corresponding to a matching point pair according to the matching point pair between two adjacent viewpoint images and internal parameters and external parameters of two cameras in a camera array corresponding to the adjacent viewpoint images;
and re-projecting each three-dimensional point according to the target camera pose and the internal parameters of each target camera corresponding to the selected camera pose, to determine a virtual viewpoint image corresponding to the camera pose.
2. The camera pose-based virtual viewpoint generation method according to claim 1, wherein before constructing the three-dimensional points corresponding to the matching point pairs from the matching point pairs between the adjacent viewpoint images and the internal parameters and the external parameters of the two cameras corresponding to the adjacent viewpoint images, the method further comprises:
extracting a first matching point from a first viewpoint image of two adjacent viewpoint images;
extracting a second matching point matched with the first matching point from a second viewpoint image of two adjacent viewpoint images according to the trained optical flow estimation model;
and forming a matching point pair according to the first matching point and the second matching point.
3. The method of claim 1, wherein constructing three-dimensional points corresponding to the pairs of matching points according to the pairs of matching points between two adjacent viewpoint images and the intrinsic parameters and the extrinsic parameters of two cameras corresponding to the two adjacent viewpoint images comprises:
determining two light ray directions corresponding to two cameras according to a matching point pair between two adjacent viewpoint images and internal parameters and external parameters of the two cameras corresponding to the two adjacent viewpoint images;
and processing the two light ray directions by direct linear transformation to construct a three-dimensional point corresponding to the matching point pair.
4. The method of claim 1, wherein the re-projecting each three-dimensional point according to the target camera pose of each target camera corresponding to the selected camera pose and the internal parameters to determine the virtual viewpoint image corresponding to the camera pose comprises:
when the selected camera pose is a horizontal position, leveling the camera position of each target camera according to the three-dimensional coordinates of each target camera in the camera array, and acquiring the leveled coordinates of each target camera;
determining the target camera pose of each target camera according to the leveled coordinates of each target camera and the leveling rotation angle obtained by averaging the rotation angles of the target cameras;
and re-projecting each three-dimensional point according to the target camera pose and the internal parameters of each target camera, and determining a virtual viewpoint image corresponding to the camera pose.
5. The method of claim 1 or 4, wherein the re-projecting each three-dimensional point to determine a virtual viewpoint image corresponding to the camera pose comprises:
re-projecting each three-dimensional point by taking the two adjacent viewpoint images as reference images, and determining two adjacent initial virtual viewpoint images corresponding to the camera pose;
and hole-filling the left image of the adjacent initial virtual viewpoint images to generate the virtual viewpoint image.
6. The method of claim 5, wherein the hole filling of the left image in the adjacent initial virtual viewpoint images comprises:
acquiring a first hole position at which the hole area in the left image of the adjacent initial virtual viewpoint image is larger than a preset area;
and acquiring a pixel corresponding to the first hole position from the right image and filling the pixel to the first hole position.
7. The camera pose-based virtual viewpoint generation method according to claim 6, further comprising:
acquiring a second hole position at which the hole area in the left image is smaller than or equal to the preset area;
and filling the hole at the second hole position through a closing operation.
8. A virtual viewpoint generating apparatus based on a camera pose, comprising:
the three-dimensional point construction module is used for constructing three-dimensional points corresponding to matching point pairs according to the matching point pairs between two adjacent viewpoint images and internal parameters and external parameters of two cameras in a camera array corresponding to the adjacent viewpoint images;
and the virtual viewpoint generating module is used for re-projecting each three-dimensional point according to the target camera pose and the internal parameters of each target camera corresponding to the selected camera pose, and determining a virtual viewpoint image corresponding to the camera pose.
9. An electronic device comprising a processor and a memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the camera pose based virtual viewpoint generation method according to any one of claims 1 to 7.
10. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, realizes the steps of the camera pose based virtual viewpoint generation method of any one of claims 1 to 7.
CN202111467673.4A 2021-12-03 2021-12-03 Virtual viewpoint generation method and device based on camera posture Pending CN114399553A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111467673.4A CN114399553A (en) 2021-12-03 2021-12-03 Virtual viewpoint generation method and device based on camera posture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111467673.4A CN114399553A (en) 2021-12-03 2021-12-03 Virtual viewpoint generation method and device based on camera posture

Publications (1)

Publication Number Publication Date
CN114399553A (en) 2022-04-26

Family

ID=81225713

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111467673.4A Pending CN114399553A (en) 2021-12-03 2021-12-03 Virtual viewpoint generation method and device based on camera posture

Country Status (1)

Country Link
CN (1) CN114399553A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116320358A (en) * 2023-05-19 2023-06-23 成都工业学院 Parallax image prediction device and method
CN116320358B (en) * 2023-05-19 2023-12-01 成都工业学院 Parallax image prediction device and method

Similar Documents

Publication Publication Date Title
US11010924B2 (en) Method and device for determining external parameter of stereoscopic camera
CN106529495B (en) Obstacle detection method and device for aircraft
CN109313814B (en) Camera calibration system
WO2019127445A1 (en) Three-dimensional mapping method, apparatus and system, cloud platform, electronic device, and computer program product
CN110176032B (en) Three-dimensional reconstruction method and device
Kang et al. Two-view underwater structure and motion for cameras under flat refractive interfaces
CN105654547B (en) Three-dimensional rebuilding method
EP2847741A1 (en) Camera scene fitting of real world scenes for camera pose determination
CN109840922B (en) Depth acquisition method and system based on binocular light field camera
WO2020119467A1 (en) High-precision dense depth image generation method and device
CN114401391B (en) Virtual viewpoint generation method and device
TW202103106A (en) Method and electronic device for image depth estimation and storage medium thereof
CN109902675B (en) Object pose acquisition method and scene reconstruction method and device
EP3229209B1 (en) Camera calibration system
CN112907727A (en) Calibration method, device and system of relative transformation matrix
CN111080784A (en) Ground three-dimensional reconstruction method and device based on ground image texture
Gadasin et al. Reconstruction of a Three-Dimensional Scene from its Projections in Computer Vision Systems
CN109714587A (en) A kind of multi-view image production method, device, electronic equipment and storage medium
Jang et al. Egocentric scene reconstruction from an omnidirectional video
CN114399553A (en) Virtual viewpoint generation method and device based on camera posture
Sergiyenko et al. Multi-view 3D data fusion and patching to reduce Shannon entropy in Robotic Vision
CN109859313B (en) 3D point cloud data acquisition method and device, and 3D data generation method and system
JP7195785B2 (en) Apparatus, method and program for generating 3D shape data
CN112258635B (en) Three-dimensional reconstruction method and device based on improved binocular matching SAD algorithm
CN110148086B (en) Depth filling method and device for sparse depth map and three-dimensional reconstruction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination