CN117764954A - New view synthesis and detection method, system, equipment and storage medium - Google Patents

New view synthesis and detection method, system, equipment and storage medium

Info

Publication number
CN117764954A
Authority
CN
China
Prior art keywords
projection
new view
image set
new
camera
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311794373.6A
Other languages
Chinese (zh)
Inventor
陈赟
张英
李端姣
刘建明
梁永超
颜大涵
林佳润
杨可
甘焯坤
郑煜辉
陈家贤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Power Grid Co Ltd
Original Assignee
Guangdong Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Power Grid Co Ltd filed Critical Guangdong Power Grid Co Ltd
Priority to CN202311794373.6A
Publication of CN117764954A
Pending legal-status Critical Current

Landscapes

  • Image Processing (AREA)

Abstract

The invention discloses a new view synthesis and detection method, system, equipment and storage medium. A three-dimensional model and a first image set with known view angles are respectively acquired; a plurality of camera projection points adapted to the first image set are selected in a scattered manner in the three-dimensional model; the target pose to be synthesized and the conditional images are determined according to the camera projection points, the three-dimensional model and the first image set; a new view corresponding to a missing camera view angle is obtained according to the target pose to be synthesized and the conditional images, and the new view and the first image set form a second image set; the error in the re-projection of each new view in the second image set is calculated, and whether a new view can be matched with the first image set for application is predicted according to that error.

Description

New view synthesis and detection method, system, equipment and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method, a system, an apparatus, and a storage medium for synthesizing and detecting a new view.
Background
For complex real scenes such as substations and factories, large-scale image capture with cameras mounted on unmanned aerial vehicles or carts often encounters view angles that are difficult to photograph because some equipment is deployed compactly; these may be defined as missing view angles. In subsequent applications of such complex scene images, for example the individual (monomer) reconstruction of equipment, too many missing view angles mean that too much real image information is lost, which is unfavorable for later scene reconstruction applications with strict geometric constraint requirements.
In the existing new view synthesis task, multiple new views are synthesized from a single real image and several given poses, so the synthesized image information greatly outweighs the real image information, which easily degrades the quality of scene reconstruction. Existing view angle selection methods mainly serve texture reconstruction for a Mesh model and consider, for example, the visibility of each triangular patch relative to a camera and whether the patch is occluded by other patches; however, they are limited to selecting the optimal view angle among the known view angles of several source images and do not consider missing view angles, so they are difficult to adapt to the existing new view synthesis task.
Disclosure of Invention
In order to solve the above technical problems, embodiments of the present invention provide a new view synthesis and detection method and system, which select and synthesize new views based on missing view angles, improving the realism of the image synthesis effect.
A first aspect of an embodiment of the present invention provides a new view synthesis and detection method, including:
respectively acquiring a three-dimensional model and a first image set with a known view angle;
selecting, in a scattered manner in the three-dimensional model, a plurality of camera projection points adapted to the first image set, and determining the target pose to be synthesized and the conditional images according to the plurality of camera projection points, the three-dimensional model and the first image set;
obtaining a new view corresponding to the camera missing view angle according to the target pose to be synthesized and the conditional image, and forming a second image set from the new view and the first image set;
and calculating errors in the re-projection of each new view in the second image set, and pre-judging whether the new view can be matched with the first image set for application according to the errors in the re-projection.
In a possible implementation manner of the first aspect, the selecting, in a scattered manner in the three-dimensional model, a plurality of camera projection points adapted to the first image set includes:
detecting a plurality of virtual camera viewpoints corresponding to a plurality of images in the first image set one by one;
connecting each virtual camera viewpoint with a geometric center in the three-dimensional model to obtain a corresponding first virtual sight;
and determining the intersection of each first virtual sight line and the surface in the three-dimensional model as a corresponding camera projection point, wherein the surface in the three-dimensional model is arranged in a reticular spherical surface.
In a possible implementation manner of the first aspect, determining the target pose and the conditional image to be synthesized according to the plurality of camera projection points, the three-dimensional stereo model and the first image set includes:
in the reticular sphere, selecting a grid cell without a camera projection point as a reference sphere area corresponding to the camera missing view angle;
selecting at least one position in the reference spherical area as an intermediate spherical point respectively, and taking the direction pointing from each intermediate spherical point to the geometric center as a corresponding new viewing direction;
and taking the camera projection points which are not more than a preset distance from the reference spherical area as selected projection points, selecting the images corresponding to the selected projection points in the first image set as conditional images, and obtaining the target pose according to the conditional images and the new viewing direction.
In a possible implementation manner of the first aspect, obtaining the target pose according to the conditional images and the new viewing direction includes:
respectively calculating the distances between a plurality of conditional images and the geometric center to obtain a plurality of geometric distances, and setting the average distance between the geometric distances as the new viewing distance;
based on the new viewing direction, determining the position whose distance from the geometric center equals the new viewing distance as the new viewpoint position, and representing the new viewpoint position and the new viewing direction as the target pose.
In a possible implementation manner of the first aspect, calculating an error in the re-projection of each new view in the second image set includes:
respectively taking a plurality of new views in the second image set as new views to be calculated, and determining image feature points in each new view to be calculated;
calculating the position of each image characteristic point of the new view to be calculated projected into the three-dimensional model to obtain a corresponding projection position;
connecting each projection position with the new viewpoint corresponding to it to obtain corresponding second virtual sight lines, and determining the positions where the second virtual sight lines intersect as target object points;
and back-projecting each target object point into each new view to obtain back-projection positions, calculating the distances between the back-projection positions and the corresponding image feature points to obtain the re-projection errors of each new view to be calculated, and calculating the error in the re-projection of each new view to be calculated by using those re-projection errors.
A second aspect of an embodiment of the present invention provides a new view synthesis and detection system, the system comprising:
the acquisition module is used for respectively acquiring the three-dimensional model and a first image set with a known view angle;
the computing module is used for dispersedly selecting a plurality of camera projection points which are matched with the first image set in the three-dimensional model, and determining the target pose and the conditional image to be synthesized according to the plurality of camera projection points, the three-dimensional model and the first image set;
the synthesizing module is used for obtaining a new view corresponding to the camera missing view angle according to the target pose to be synthesized and the conditional image, and forming a second image set from the new view and the first image set;
and the evaluation module is used for calculating errors in the re-projection of each new view in the second image set, and predicting whether the new view can be matched with the first image set for application according to the errors in the re-projection.
In a possible implementation manner of the second aspect, the calculation module includes a detection unit, a connection unit, and a camera projection point determination unit;
the detection unit is used for detecting a plurality of virtual camera viewpoints which are in one-to-one correspondence with a plurality of images in the first image set;
the connecting unit is used for connecting each virtual camera viewpoint with the geometric center in the three-dimensional model to obtain the corresponding first virtual sight lines;
and the camera projection point determining unit is used for determining the intersection of each first virtual sight line and the surface in the three-dimensional stereo model as a corresponding camera projection point, wherein the surface in the three-dimensional stereo model is arranged in a reticular spherical surface.
In a possible implementation manner of the second aspect, the computing module further includes a traversing unit, a first selecting unit, and a second selecting unit;
the traversing unit is used for selecting a grid unit without a camera projection point as a reference spherical surface area of a camera missing view angle in the reticular spherical surface;
the first selecting unit is used for selecting at least one position in the reference spherical surface area as an intermediate spherical point respectively, and taking the direction pointing from each intermediate spherical point to the geometric center as a corresponding new viewing direction;
the second selecting unit is used for taking the camera projection points which are not more than a preset distance from the reference spherical area as selected projection points, selecting the images corresponding to the selected projection points in the first image set as conditional images, and obtaining the target pose according to the conditional images and the new viewing direction.
A third aspect of an embodiment of the present invention provides a computing device, comprising:
a memory for storing a computer program;
a processor for implementing the new view synthesis and detection method as in the first aspect when executing the computer program.
A fourth aspect of embodiments of the present invention provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the new view synthesis and detection method as in the first aspect.
Drawings
Fig. 1: a flow diagram of one embodiment of the new view synthesis and detection method provided by the invention;
fig. 2: a virtual sphere schematic diagram of one embodiment of the new view synthesis and detection method provided by the invention;
fig. 3: an example schematic diagram of the virtual sphere of one embodiment of the new view synthesis and detection method provided by the invention;
fig. 4: a camera projection point determination schematic diagram of one embodiment of the new view synthesis and detection method provided by the invention;
fig. 5: the reference sphere region schematic diagram of one embodiment of the new view synthesis and detection method provided by the invention;
fig. 6: a projection position determination schematic diagram of one embodiment of the new view synthesis and detection method provided by the invention;
fig. 7: a system structure block diagram of one embodiment of the new view synthesis and detection system provided by the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, a flow chart of an embodiment of a new view synthesis and detection method according to an embodiment of the present invention includes steps S101 to S104.
S101, respectively acquiring a three-dimensional model and a first image set of known camera view angles.
For example, in a scene with compact equipment deployment such as a substation or a production plant, a certain device may be taken as the target object, and geographic coordinates may be measured for any position on the target object; the geographic coordinates can be converted into a three-dimensional virtual space through a preset coordinate conversion relation, and a three-dimensional model is built with the converted position as the center. The three-dimensional model may be a regular grid, a sphere whose surface is arranged as a longitude-latitude grid (see fig. 2), or the like.
It should be noted that, in the case where the three-dimensional model is configured as a sphere, the cell size of the longitude-latitude grid may be set as needed; it is generally related to the shooting angle of the camera and the distance between the camera and positions on the target object. For example, each grid cell may be configured with a specification of 30° longitude by 30° latitude, or another specification, see fig. 3.
For example, live-action images can be captured by a camera mounted on an unmanned aerial vehicle, a robot dog, or a cart at a plurality of poses in spatial sequence, and some of the captured live-action images are combined into the first image set. Alternatively, object recognition is performed on each live-action image through an image recognition model, and the live-action images corresponding to the target object form the first image set, where the image recognition model can be obtained by training an instance segmentation, semantic segmentation, or panoptic segmentation algorithm in advance.
for example, a sparse point cloud and a plurality of camera poses can be obtained by an SFM algorithm for the first image set, and a position in the sparse point cloud is selected as a geometric center of the three-dimensional stereo model according to the plurality of camera poses and a plurality of image areas capable of representing the target object in the first image set, wherein the geometric center can represent an actual position on the target object, and the actual position can be a geometric center point or a surface point or other points.
S102, selecting, in a scattered manner in the three-dimensional model, a plurality of camera projection points adapted to the first image set, and determining the target pose to be synthesized and the conditional images according to the plurality of camera projection points, the three-dimensional model and the first image set.
After the background or obstacles of each image in the first image set are deleted, the camera projection point of each image in the first image set is determined in combination with the three-dimensional model, and the target pose and conditional images suitable as condition inputs of the new view synthesis task are determined according to the longitude-latitude grid on the spherical surface.
And S103, obtaining a new view corresponding to the camera missing view angle according to the target pose to be synthesized and the conditional image, and forming a second image set from the new view and the first image set.
In the 3DiM new view synthesis task, at least two conditional images corresponding to a single target pose are used to generate at least one corresponding new view D, and the new view D is added into the first image set to obtain the second image set.
S104, calculating errors in the re-projection of each new view in the second image set, and pre-judging whether the new view can be matched with the first image set for application according to the errors in the re-projection.
The new view D is added to the image set obtained after deleting the background and/or obstacles from each image in the first image set, yielding a new image set, namely the second image set. The new image set is processed through the SFM algorithm, and the error M in the re-projection of each new view is calculated. When the error M is greater than or equal to a threshold M_i, the new view D is considered to introduce more error than constraint information, and it cannot be used together with the source images in applications with strong geometric constraints, for example three-dimensional reconstruction or panorama making; otherwise, the new view D may be used in the geometric reconstruction of the missing part.
In this embodiment, M_i is set to 3, and it can be adjusted according to actual needs in practical applications. The SFM algorithm refers to the structure-from-motion technique, which can recover the structural information of a scene and the poses of the shooting cameras from multiple images of different perspectives.
According to the method, a three-dimensional model and a first image set with known view angles are respectively acquired; a plurality of camera projection points adapted to the first image set are selected in a scattered manner in the three-dimensional model; the target pose to be synthesized and the conditional images are determined according to the camera projection points, the three-dimensional model and the first image set; a new view corresponding to the missing camera view angle is obtained according to the target pose and the conditional images, and the new view is placed into the first image set to form a second image set; finally, the error in the re-projection of each new view in the second image set is calculated, and whether a new view can be matched with the first image set for application is predicted according to that error. New views are thus selected and synthesized based on the missing view angles, which improves the realism of the image synthesis effect.
In one embodiment, a number of virtual camera viewpoints are detected that are in one-to-one correspondence with a number of images in a first set of images; connecting each virtual camera viewpoint with a geometric center in the three-dimensional model to obtain a corresponding first virtual sight; and determining the intersection of each first virtual sight line and the surface in the three-dimensional model as a corresponding camera projection point, wherein the surface in the three-dimensional model is arranged in a reticular spherical surface.
In this embodiment, SAM (Segment Anything Model) or a similar technique is used to assist in deleting the background and occluders of the target T from each source image in the image set I, obtaining an image set I'. Pose estimation is then performed again on all images in the image set I' with the SFM algorithm; the sphere center is connected with the virtual camera viewpoint corresponding to each image in the image set I', and the intersection position of each connecting line with the spherical surface is marked as a camera projection point, as shown in fig. 4.
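A minimal sketch of this camera projection point construction: since each first virtual sight line passes through the sphere center, its camera-side intersection with the spherical surface lies exactly one radius from the center along the direction toward the viewpoint. The numpy-based representation is an assumption for illustration.

```python
import numpy as np

def camera_projection_point(viewpoint: np.ndarray,
                            center: np.ndarray,
                            radius: float) -> np.ndarray:
    """Intersect the first virtual sight line (viewpoint -> sphere center)
    with the spherical surface and return the camera-side intersection."""
    direction = viewpoint - center
    distance = np.linalg.norm(direction)
    if distance <= radius:
        raise ValueError("camera viewpoint must lie outside the sphere")
    return center + radius * direction / distance
```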
The geometric center, i.e., the center of the constructed three-dimensional sphere, can be obtained in two ways: one is by measuring the coordinates of the target object in the real scene; the other is by shooting images from each view angle and simulating the scene with the SFM technique, thereby estimating the position of the target object and obtaining the sphere center.
Note that the image set I refers to the first image set. SAM is a general model for image segmentation tasks: it can automatically identify which image pixels belong to an object and process each object in the image accordingly, so it is widely used for analyzing scientific images, editing photos, and so on; as it is prior art, it is not described in detail here.
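As an illustrative sketch only, the background deletion step could be assisted with Meta's open-source segment-anything package roughly as follows; the checkpoint path, the model type, the single foreground click prompt, and the masking strategy are all assumptions, and the disclosed method does not depend on this particular API.

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Load a SAM checkpoint (model type and path are illustrative assumptions).
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

def mask_out_background(image_rgb: np.ndarray, target_xy) -> np.ndarray:
    """Keep only the target object indicated by one foreground click (u, v);
    background pixels and occluders are zeroed out."""
    predictor.set_image(image_rgb)
    masks, scores, _ = predictor.predict(
        point_coords=np.array([target_xy], dtype=np.float32),
        point_labels=np.array([1]),  # 1 marks a foreground point
        multimask_output=False,
    )
    return np.where(masks[0][..., None], image_rgb, 0)
```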
In one embodiment, in the mesh sphere, a grid cell without a camera projection point is selected as the reference sphere region corresponding to a missing camera view angle; at least one position in the reference sphere region is selected as an intermediate sphere point, and the direction pointing from each intermediate sphere point to the geometric center is taken as a corresponding new viewing direction; the camera projection points not more than a preset distance from the reference sphere region are taken as selected projection points, the images corresponding to the selected projection points are selected in the first image set as conditional images, and the target pose is obtained according to the conditional images and the new viewing direction.
In this embodiment, the target pose and the conditional images suitable as condition inputs of the new view synthesis task are determined according to the longitude-latitude grid on the sphere. The specific steps are as follows: each cell on the sphere is traversed, and each cell not marked with a camera projection point is recorded as a reference sphere region of a missing camera view angle; when a missing view angle is too large and/or the grid specification is too small, several adjacent cells along the longitude or latitude direction of the sphere may be merged into one reference sphere region. As shown in fig. 5, for a given reference sphere region, one or more positions may be randomly selected as intermediate sphere points, and the direction pointing from each intermediate sphere point to the sphere center is taken as a new viewing direction; for example, the center point of a cell may be used as a single intermediate sphere point. At least two camera projection points around the reference sphere region are selected as selected projection points, and the image corresponding to each selected projection point is selected from the image set I' as a conditional image.
It should be noted that, for any camera projection point, the distance between the camera projection point and the reference sphere region is checked: if the distance is smaller than a certain distance threshold, the camera projection point is considered to be around the reference sphere region; otherwise, it is considered to be far from the reference sphere region. The distance may be expressed, for example, as the distance between the camera projection point and the geometric center point of the reference sphere region, or as the closest distance between the camera projection point and the boundary of the reference sphere region.
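A compact sketch of the grid traversal and of the nearby projection point selection, under the assumptions that the camera projection points are given as an (N, 3) numpy array, the grid uses the 30° x 30° specification mentioned above, and the distance test uses the region's center point; all names are illustrative.

```python
import numpy as np

def empty_grid_cells(proj_points: np.ndarray, center: np.ndarray,
                     cell_deg: float = 30.0) -> list:
    """Traverse the longitude-latitude grid and return the cells containing
    no camera projection point (candidate reference sphere regions)."""
    n_lon, n_lat = int(360 // cell_deg), int(180 // cell_deg)
    rel = proj_points - center
    r = np.linalg.norm(rel, axis=1)
    lon = np.degrees(np.arctan2(rel[:, 1], rel[:, 0]))  # in [-180, 180]
    lat = np.degrees(np.arcsin(rel[:, 2] / r))          # in [-90, 90]
    occupied = {(min(int((lo + 180.0) // cell_deg), n_lon - 1),
                 min(int((la + 90.0) // cell_deg), n_lat - 1))
                for lo, la in zip(lon, lat)}
    return [(i, j) for i in range(n_lon) for j in range(n_lat)
            if (i, j) not in occupied]

def select_condition_indices(proj_points: np.ndarray,
                             region_center: np.ndarray,
                             max_dist: float) -> np.ndarray:
    """Indices of the camera projection points within the preset distance of
    the reference sphere region; their images become the conditional images."""
    dists = np.linalg.norm(proj_points - region_center, axis=1)
    return np.flatnonzero(dists <= max_dist)
```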
In one embodiment, the distances between a plurality of conditional images and the geometric center are calculated respectively to obtain a plurality of geometric distances, and the average distance among the geometric distances is set as the new viewing distance; based on the new viewing direction, the position whose distance from the geometric center equals the new viewing distance is determined as the new viewpoint position, and the new viewpoint position and the new viewing direction are represented as the target pose.
In this embodiment, after the conditional images are obtained, the average distance between the camera viewpoints of all selected conditional images and the sphere center is used as the new viewing distance. In each new viewing direction, the position whose distance from the sphere center equals the new viewing distance is determined as the new viewpoint position, and the target pose is represented by the new viewpoint position and the corresponding new viewing direction.
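A sketch of this target pose construction, assuming the conditional cameras' centers and the intermediate sphere point are available as numpy vectors; a complete pose would additionally encode a look-at rotation toward the sphere center, which is omitted here for brevity.

```python
import numpy as np

def target_pose(cond_cam_centers: np.ndarray,
                sphere_center: np.ndarray,
                mid_sphere_point: np.ndarray):
    """New viewpoint position and new viewing direction for a missing view.

    The new viewing distance is the average distance from the conditional
    cameras to the sphere center; the new viewpoint lies that far from the
    center, on the ray through the intermediate sphere point, looking back
    toward the center.
    """
    new_dist = np.linalg.norm(cond_cam_centers - sphere_center, axis=1).mean()
    view_dir = sphere_center - mid_sphere_point  # points toward the center
    view_dir = view_dir / np.linalg.norm(view_dir)
    new_viewpoint = sphere_center - new_dist * view_dir
    return new_viewpoint, view_dir
```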
In one embodiment, a plurality of new views in the second image set are respectively taken as new views to be calculated, and the image feature points in each new view to be calculated are determined; the position at which each image feature point of a new view to be calculated projects into the three-dimensional model is calculated to obtain the corresponding projection position; each projection position is connected with the new viewpoint corresponding to it to obtain the corresponding second virtual sight lines, and the positions where the second virtual sight lines intersect are determined as target object points; each target object point is back-projected into each new view to obtain back-projection positions, the distances between the back-projection positions and the corresponding image feature points are calculated to obtain the re-projection errors of each new view to be calculated, and the error in the re-projection of each new view to be calculated is computed from those re-projection errors.
In this embodiment, after adding the new view D to the image set I', an image set i″ is formed, and after performing motion recovery processing again by the SFM algorithm, an error M in the re-projection of the new view D is calculated.
As shown in fig. 6, taking three images in the image set I″ as an example, the three virtual camera viewpoints are P1, P2 and P3. In each of the three images there exists a pixel representing the same object point, i.e., an image feature point; a geometric method is used to calculate the position at which the image feature point of each image projects into the three-dimensional space, i.e., the projection position, and the connecting line between each virtual camera viewpoint and the corresponding projection position is taken as a projection line of sight.
It should be noted that the position at which an image feature point projects into the three-dimensional space is calculated by the method of spatial front intersection: computing the model point coordinates (or ground point coordinates) of a stereo pair from the interior orientation elements, the homonymous image point coordinates, and the relative orientation elements (or exterior orientation elements) of its two images is called spatial front intersection.
Because of central projection errors, the three projection sight lines toward the same object point from the three viewing directions are mutually skew and do not meet exactly, so they must be optimized; the position where the three sight lines intersect, obtained by front intersection, is finally determined as the target object point. In fig. 6, the six target object points are denoted X1 to X6. This scheme preferably uses the SFM algorithm to optimize the three projection sight lines.
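As a simplified stand-in for spatial front intersection, the classical linear (DLT) triangulation below recovers one object point from its homonymous image feature points in two or more views, given 3x4 projection matrices; the actual embodiment prefers SFM-based optimization, so this sketch is illustrative only.

```python
import numpy as np

def triangulate_point(proj_mats, pixels) -> np.ndarray:
    """Linear (DLT) spatial front intersection: recover one object point
    from its homonymous image feature points in two or more views.

    proj_mats: iterable of 3x4 camera projection matrices P = K [R | t].
    pixels:    iterable of (u, v) image feature point coordinates.
    """
    rows = []
    for P, (u, v) in zip(proj_mats, pixels):
        rows.append(u * P[2] - P[0])  # each view contributes two
        rows.append(v * P[2] - P[1])  # linear constraints on X
    _, _, vt = np.linalg.svd(np.stack(rows))
    X = vt[-1]
    return X[:3] / X[3]               # dehomogenize
```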
Each target object point is then back-projected onto the images corresponding to P1, P2 and P3; the distance between the position back-projected onto an image and the corresponding image feature point is taken as a re-projection error, and all re-projection errors belonging to the same new view D are substituted into the middle error calculation formula to calculate the corresponding middle error M.
The middle error calculation formula is the arithmetic square root of the variance, specifically:

$\sigma = \sqrt{\dfrac{\sum_{i=1}^{n} \delta_i^{2}}{n}}$

where σ represents the middle error, δ_i represents the true errors (here, the re-projection errors), and n represents the number of error values.
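The re-projection error and middle error computation, together with the threshold decision of step S104 (M_i = 3 in this embodiment), can be sketched as follows; the matrix-based camera representation is an assumption for illustration.

```python
import numpy as np

def reprojection_errors(proj_mats, pixels, X) -> np.ndarray:
    """Back-project the target object point X into every image and return
    the pixel distances to the corresponding image feature points."""
    Xh = np.append(X, 1.0)
    errs = []
    for P, (u, v) in zip(proj_mats, pixels):
        x = P @ Xh
        errs.append(np.hypot(x[0] / x[2] - u, x[1] / x[2] - v))
    return np.array(errs)

def middle_error(deltas) -> float:
    """Middle error: sigma = sqrt(sum(delta_i^2) / n)."""
    deltas = np.asarray(deltas, dtype=float)
    return float(np.sqrt(np.mean(deltas ** 2)))

def usable_with_source_images(M: float, M_i: float = 3.0) -> bool:
    """Step S104 decision: a new view whose middle error reaches the
    threshold M_i introduces more error than constraint information."""
    return M < M_i
```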
In one embodiment, as shown in fig. 7, a block diagram of a new view synthesis and detection system 700 provided in an embodiment of the present application includes an acquisition module 701, a calculation module 702, a synthesis module 703, and an evaluation module 704, where:
the acquiring module 701 is configured to acquire a three-dimensional model and a first image set of a known view angle respectively;
the computing module 702 is configured to dispersedly select a plurality of camera projection points that are appropriately matched with the first image set in the three-dimensional model, and determine a target pose and a conditional image to be synthesized according to the plurality of camera projection points, the three-dimensional model and the first image set;
the synthesizing module 703 is configured to obtain a new view corresponding to the camera missing view according to the target pose and the conditional image to be synthesized, and put the new view into the first image set to form a second image set;
the evaluation module 704 is configured to calculate an error in the re-projection of each new view in the second image set, and predict whether the new view can be matched with the first image set according to the error in the re-projection.
In one embodiment, the computing module 702 includes a detection unit, a connection unit, and a camera projection point determination unit, wherein:
the detection unit is used for detecting a plurality of virtual camera viewpoints which are in one-to-one correspondence with a plurality of images in the first image set;
the connection unit is used for connecting each virtual camera viewpoint with the geometric center in the three-dimensional model to obtain a corresponding first virtual sight;
the camera projection point determining unit is used for determining the intersection of each first virtual sight line and the surface in the three-dimensional stereo model as a corresponding camera projection point, wherein the surface in the three-dimensional stereo model is arranged in a net-shaped spherical surface.
In one embodiment, the computing module 702 further includes a traversing unit, a first selecting unit, and a second selecting unit, wherein:
the traversing unit is used for selecting a grid unit without a camera projection point as a reference spherical surface area corresponding to the camera missing visual angle in the reticular spherical surface;
the first selecting unit is used for selecting at least one position in the reference spherical surface area as an intermediate spherical point respectively, and taking the direction pointing to the geometric center from each intermediate spherical point as a corresponding new view;
the second selecting unit is used for selecting the camera projection points which are not more than a preset distance from the reference spherical area as selected projection points, selecting the images corresponding to the selected projection points as conditional images in the first image set, and obtaining the target pose according to the conditional images and the new view.
In one embodiment, the distances between a plurality of conditional images and the geometric center are calculated respectively to obtain a plurality of geometric distances, and the average distance among the geometric distances is set as the new viewing distance; based on the new viewing direction, the position whose distance from the geometric center equals the new viewing distance is determined as the new viewpoint position, and the new viewpoint position and the new viewing direction are represented as the target pose.
In one embodiment, a plurality of new views in the second image set are respectively taken as new views to be calculated, and the image feature points in each new view to be calculated are determined; the position at which each image feature point of a new view to be calculated projects into the three-dimensional model is calculated to obtain the corresponding projection position; each projection position is connected with the new viewpoint corresponding to it to obtain the corresponding second virtual sight lines, and the positions where the second virtual sight lines intersect are determined as target object points; each target object point is back-projected into each new view to obtain back-projection positions, the distances between the back-projection positions and the corresponding image feature points are calculated to obtain the re-projection errors of each new view to be calculated, and the error in the re-projection of each new view to be calculated is computed from those re-projection errors.
In one embodiment of the present application, a computing device is provided that includes a memory having a computer program stored therein and a processor that when executing the computer program performs the above steps; the computing device provided in the embodiments of the present application has similar implementation principles and technical effects to those of the above-described method embodiments, and will not be described herein again.
In one embodiment of the present application, there is provided a non-transitory computer readable storage medium having stored thereon a computer program which when executed by a processor further performs the above steps; the computer readable storage medium provided in this embodiment has similar principles and technical effects to those of the above method embodiment, and will not be described herein.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), memory bus direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The foregoing embodiments have been provided for the purpose of illustrating the general principles of the present invention, and are not to be construed as limiting the scope of the invention. It should be noted that any modifications, equivalent substitutions, improvements, etc. made by those skilled in the art without departing from the spirit and principles of the present invention are intended to be included in the scope of the present invention.

Claims (10)

1. A new view synthesis and detection method, comprising:
respectively acquiring a three-dimensional model and a first image set with a known view angle;
selecting, in a scattered manner in the three-dimensional model, a plurality of camera projection points adapted to the first image set, and determining the target pose to be synthesized and the conditional images according to the plurality of camera projection points, the three-dimensional model and the first image set;
obtaining a new view corresponding to the camera missing view angle according to the target pose to be synthesized and the conditional image, and forming a second image set by the new view and the first image set;
calculating errors in the re-projection of each new view in the second image set, and predicting whether the new view can be matched with the first image set for application according to the errors in the re-projection.
2. The new view synthesis and detection method according to claim 1, wherein the selecting, in a scattered manner in the three-dimensional model, a plurality of camera projection points adapted to the first image set includes:
detecting a plurality of virtual camera viewpoints corresponding to a plurality of images in the first image set one by one;
connecting each virtual camera viewpoint with the geometric center in the three-dimensional model to obtain the corresponding first virtual sight lines;
and determining the intersection of each first virtual sight line and the surface in the three-dimensional model as a corresponding camera projection point, wherein the surface in the three-dimensional model is arranged in a reticular sphere.
3. The method of claim 2, wherein determining the target pose and the conditional image to be synthesized from the plurality of camera projection points, the three-dimensional stereo model, and the first image set comprises:
selecting a grid unit without the camera projection points as a reference spherical surface area corresponding to the camera missing view angle in the reticular spherical surface;
selecting at least one position in the reference spherical area as an intermediate spherical point respectively, and taking the direction pointing from each intermediate spherical point to the geometric center as a corresponding new viewing direction;
and taking the camera projection points which are not more than a preset distance from the reference spherical area as selected projection points, selecting the images corresponding to the selected projection points in the first image set as conditional images, and obtaining the target pose according to the conditional images and the new viewing direction.
4. The new view synthesis and detection method according to any of claims 1-3, wherein the obtaining the target pose according to the conditional images and the new viewing direction comprises:
respectively calculating the distances between a plurality of conditional images and the geometric center to obtain a plurality of geometric distances, and setting the average distance between the geometric distances as a new viewing distance;
and determining, based on the new viewing direction, the position whose distance from the geometric center equals the new viewing distance as a new viewpoint position, and representing the new viewpoint position and the new viewing direction as the target pose.
5. The new view synthesis and detection method according to claim 4, wherein said calculating the error in the re-projection of each of the new views in the second image set comprises:
respectively taking a plurality of new views in the second image set as new views to be calculated, and determining image feature points in each new view to be calculated;
calculating the position of each image characteristic point of the new view to be calculated projected into the three-dimensional model to obtain a corresponding projection position;
connecting each projection position with the new viewpoint corresponding to it to obtain corresponding second virtual sight lines, and determining the positions where the second virtual sight lines intersect as target object points;
and back-projecting each target object point into each new view to obtain back-projection positions, calculating the distances between the back-projection positions and the corresponding image feature points to obtain the re-projection errors of each new view to be calculated, and calculating the error in the re-projection of each new view to be calculated by using those re-projection errors.
6. A new view synthesis and detection system, comprising:
the acquisition module is used for respectively acquiring the three-dimensional model and a first image set with a known view angle;
the computing module is used for dispersedly selecting a plurality of camera projection points which are matched with the first image set in the three-dimensional model, and determining a target pose and a conditional image to be synthesized according to the plurality of camera projection points, the three-dimensional model and the first image set;
the synthesizing module is used for obtaining a new view corresponding to the camera missing view angle according to the target pose to be synthesized and the conditional image, and forming a second image set from the new view and the first image set;
and the evaluation module is used for calculating errors in the re-projection of each new view in the second image set, and predicting whether the new view can be matched with the first image set for application according to the errors in the re-projection.
7. The new view synthesis and detection system according to claim 6, wherein the computing module comprises a detection unit, a connection unit, and a camera proxel determination unit;
the detection unit is used for detecting a plurality of virtual camera viewpoints which are in one-to-one correspondence with a plurality of images in the first image set;
the connecting unit is used for connecting each virtual camera viewpoint with the geometric center in the three-dimensional model to obtain the corresponding first virtual sight lines;
the camera projection point determining unit is used for determining the intersection of each first virtual sight line and the surface in the three-dimensional model as a corresponding camera projection point, wherein the surface in the three-dimensional model is arranged in a reticular sphere.
8. The new view synthesis and detection system of claim 6, wherein the computing module further comprises a traversal unit, a first pick unit, and a second pick unit;
the traversing unit is used for selecting a grid unit without the camera projection points from the reticular sphere as a reference sphere area of a camera missing view angle;
the first selecting unit is used for selecting at least one position in the reference spherical surface area as an intermediate spherical point respectively, and taking the direction pointing from each intermediate spherical point to the geometric center as a corresponding new viewing direction;
the second selecting unit is configured to take the camera projection points which are not more than a preset distance from the reference spherical region as selected projection points, select the images corresponding to the selected projection points in the first image set as conditional images, and obtain the target pose according to the conditional images and the new viewing direction.
9. A computing device, comprising:
a memory for storing a computer program;
a processor for implementing the new view synthesis and detection method according to any of claims 1 to 5 when executing said computer program.
10. A non-transitory computer readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the new view synthesis and detection method according to any of claims 1 to 5.
CN202311794373.6A 2023-12-22 2023-12-22 New view synthesis and detection method, system, equipment and storage medium Pending CN117764954A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311794373.6A CN117764954A (en) 2023-12-22 2023-12-22 New view synthesis and detection method, system, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117764954A (en) 2024-03-26

Family

ID=90317882

Country Status (1)

Country Link
CN (1) CN117764954A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination