WO2023093739A1 - Multi-view three-dimensional reconstruction method - Google Patents

Multi-view three-dimensional reconstruction method

Info

Publication number
WO2023093739A1
Authority
WO
WIPO (PCT)
Prior art keywords
point cloud
target object
mask
image
view
Application number
PCT/CN2022/133598
Other languages
French (fr)
Chinese (zh)
Inventor
付明亮
董智超
周振坤
凌康
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2023093739A1

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
          • G06T 5/00 Image enhancement or restoration
            • G06T 5/80 Geometric correction
          • G06T 7/00 Image analysis
            • G06T 7/10 Segmentation; Edge detection
              • G06T 7/13 Edge detection
          • G06T 2207/00 Indexing scheme for image analysis or image enhancement
            • G06T 2207/20 Special algorithmic details
              • G06T 2207/20092 Interactive image processing based on input by user
                • G06T 2207/20104 Interactive definition of region of interest [ROI]

Definitions

  • the embodiments of the present application relate to the field of computer technology, and in particular to a method and device for multi-view three-dimensional reconstruction.
  • Multi-view stereo (MVS) aims to recover the three-dimensional scene surface from a set of calibrated two-dimensional images and estimated camera parameters, and is widely applied in autonomous driving, augmented reality, digital presentation and preservation of cultural relics, city-scale measurement, and other fields.
  • in the process of 3D reconstruction, it is necessary to outline the region to be processed in various shapes from the processed image, so as to identify and extract the region of interest (ROI) of the main body of the scene.
  • current multi-view stereo reconstruction technology has greatly stimulated research on computation-intensive scenarios; however, on the one hand, current ROI extraction techniques are prone to incomplete-ROI problems such as false detection and missed detection, and on the other hand, methods based on specific markers or user interaction are not suitable for automated or high-volume reconstruction scenarios.
  • in computation-intensive scenarios, a large number of calculations occur in the background area, and the computational overhead of the background area is excessive, which reduces the efficiency of 3D reconstruction.
  • the embodiments of the present application provide a method for multi-view 3D reconstruction, which can realize complete and efficient multi-view 3D reconstruction without user interaction, target object category restriction, or calibration pattern constraint.
  • in a first aspect, a method for multi-view 3D reconstruction is provided, including: determining a second 3D point cloud according to a first image sequence of a target object, a first instance mask, a first 3D point cloud, and pose information of an image acquisition device, where the first image sequence includes a plurality of de-distorted images obtained by shooting around the target object, the first instance mask includes a segmentation mask of the target object and a segmentation mask of the background object in the first image sequence, the first 3D point cloud includes a sparse point cloud of the target object and a sparse point cloud of the background object in the first image sequence, the pose information includes parameter information when the image acquisition device shoots around the target object, and the second 3D point cloud includes a sparse point cloud of the target object; acquiring a 2D view area according to the second 3D point cloud, where the 2D view area includes a region of interest of the target object; and generating a third 3D point cloud according to the 2D view area, where the third 3D point cloud includes a dense 3D point cloud of the target object, and the dense 3D point cloud is used to display the target object.
  • based on the visual-axis prior information of the image acquisition device, the automatic extraction of the 2D ROI is realized by obtaining the 3D ROI of the target object, and the 3D reconstruction of the target object is further realized. In this way, ROI false detection, missed detection, and incompleteness can be avoided, and complete multi-view 3D reconstruction of the target object can be realized efficiently.
  • the determining a second 3D point cloud according to the first image sequence of the target object, the first instance mask, the first 3D point cloud, and the pose information of the image acquisition device includes: determining a 3D spherical model of the target object according to the pose information; acquiring a 2D circular image of the target object according to the 3D spherical model; removing the segmentation mask of the background object according to the 2D circular image and determining the segmentation mask of the target object; and determining the second 3D point cloud according to the first 3D point cloud and the segmentation mask of the target object.
  • the 3D spherical model is determined based on the visual axis information of the image acquisition device, and the background mask is removed according to the projection of the spherical model on the 2D viewpoint image, so as to avoid ROI false detection and missed detection problems.
  • a least square method is used to fit the center and radius of the 3D spherical model according to the pose information.
  • the 3D spherical model is fitted from the visual-axis information of the image acquisition device using the least square method, ensuring the accuracy of the position and outline of the 3D spherical model.
  • determining the segmentation mask of the target object according to the overlap between the 2D circular image and the first instance mask includes: when the 2D circular image overlaps with partial masks included in the first instance mask, determining that the overlapping partial masks are the segmentation mask of the target object, and removing the non-overlapping masks; or, when the 2D circular image does not overlap with partial masks included in the first instance mask, removing the non-overlapping partial masks and determining the remaining segmentation masks as the segmentation mask of the target object, where the non-overlapping partial masks are the segmentation masks of the background object.
  • the background mask is removed based on the 2D projection image of the 3D spherical model, ensuring the accuracy of the target object's mask and thereby the accuracy and integrity of the ROI; this further reduces the computation in the background area and can effectively improve reconstruction efficiency.
  • the first 3D point cloud is projected onto a 2D view image, and the second 3D point cloud is determined according to the overlap between the 2D view image and the segmentation mask of the target object, including: when the 2D view image of a partial point cloud of the first 3D point cloud overlaps with the segmentation mask of the target object, determining the overlapping partial point cloud of the first 3D point cloud as the second 3D point cloud, and removing the remaining point clouds in the first 3D point cloud that do not overlap with the segmentation mask of the target object.
  • the 3D point cloud of the target object is determined based on the 2D projection image of the sparse point cloud, making the 3D point cloud of the target object more accurate, which facilitates complete ROI extraction and thus complete multi-view 3D reconstruction of the target object.
  • a 2D convex hull area is obtained according to the second 3D point cloud and the pose information, where the 2D convex hull area includes a 2D point set within the outer contour of the target object, and the 2D point set includes the 2D projection points of the second 3D point cloud; edge detection is performed on the 2D convex hull area to obtain a 2D extended point set, where the edge detection is used to remove, according to the edge point set, the sparse point cloud of the background object included in the 2D convex hull area, and the 2D extended point set includes the 2D point set and the edge point set; the 2D view area of the target object is acquired according to the 2D extended point set.
  • the expansion of the 2D point set is realized based on the 2D convex hull and edge detection, ensuring the integrity of the 2D ROI and thereby realizing complete multi-view 3D reconstruction of the target object.
  • the parameter information when the image acquisition device shoots around the target object includes a degree of freedom parameter when the image acquisition device moves relative to the target object.
  • 3D reconstruction is performed based on the degree of freedom parameters of the image acquisition device, which can ensure the accuracy of 3D ROI and 2D ROI, thereby realizing a complete multi-view 3D reconstruction of the target object.
  • the image acquisition device includes multiple devices.
  • in a second aspect, a device for multi-view 3D reconstruction is provided, including: a first determination module, configured to determine a second 3D point cloud according to a first image sequence of a target object, a first instance mask, a first 3D point cloud, and pose information of an image acquisition device, where the first image sequence includes a plurality of de-distorted images obtained by shooting around the target object, the first instance mask includes a segmentation mask of the target object and a segmentation mask of the background object in the first image sequence, the first 3D point cloud includes a sparse point cloud of the target object and a sparse point cloud of the background object in the first image sequence, the pose information includes parameter information when the image acquisition device shoots around the target object, and the second 3D point cloud includes a sparse point cloud of the target object; a second determination module, configured to acquire a 2D view area according to the second 3D point cloud, where the 2D view area includes a region of interest of the target object; and a construction module, configured to generate a third 3D point cloud according to the 2D view area.
  • based on the visual-axis prior information of the image acquisition device, the automatic extraction of the 2D ROI is realized by obtaining the 3D ROI of the target object, and the 3D reconstruction of the target object is further realized. In this way, ROI false detection, missed detection, and incompleteness can be avoided, and complete multi-view 3D reconstruction of the target object can be realized efficiently.
  • the first determination module is specifically configured to: determine a 3D spherical model of the target object according to the pose information; acquire a 2D circular image of the target object according to the 3D spherical model; remove the segmentation mask of the background object according to the 2D circular image and determine the segmentation mask of the target object; and determine the second 3D point cloud according to the first 3D point cloud and the segmentation mask of the target object.
  • the 3D spherical model is determined based on the visual axis information of the image acquisition device, and the background mask is removed according to the projection of the spherical model on the 2D viewpoint image, so as to avoid ROI false detection and missed detection problems.
  • the first determination module is specifically configured to use a least square method to fit the center and radius of the 3D spherical model according to the camera pose information.
  • the 3D spherical model is fitted from the visual-axis information of the image acquisition device using the least square method, ensuring the accuracy of the position and outline of the 3D spherical model.
  • the first determination module is specifically configured to determine the segmentation mask of the target object according to the overlap between the 2D circular image and the first instance mask, including: when the 2D circular image overlaps with partial masks included in the first instance mask, determining that the overlapping partial masks are the segmentation mask of the target object, and removing the non-overlapping masks; or, when the 2D circular image does not overlap with partial masks included in the first instance mask, removing the non-overlapping partial masks and determining the remaining segmentation masks as the segmentation mask of the target object, where the non-overlapping partial masks are the segmentation masks of the background object.
  • the background mask is removed based on the 2D projection image of the 3D spherical model, ensuring the accuracy of the target object's mask and thereby the accuracy and integrity of the ROI; this further reduces the computation in the background area and can effectively improve reconstruction efficiency.
  • the first determination module is specifically configured to project the first 3D point cloud onto a 2D view image and determine the second 3D point cloud according to the overlap between the 2D view image and the segmentation mask of the target object, including: when the 2D view image of a partial point cloud of the first 3D point cloud overlaps with the segmentation mask of the target object, determining the overlapping partial point cloud of the first 3D point cloud as the second 3D point cloud, and removing the remaining point clouds in the first 3D point cloud that do not overlap with the segmentation mask of the target object.
  • the 3D point cloud of the target object is determined based on the 2D projection image of the sparse point cloud, making the 3D point cloud of the target object more accurate, which facilitates complete ROI extraction and thus complete multi-view 3D reconstruction of the target object.
  • the second determination module is specifically configured to: obtain a 2D convex hull area according to the second 3D point cloud and the pose information, where the 2D convex hull area includes a 2D point set within the outer contour of the target object, and the 2D point set includes the 2D projection points of the second 3D point cloud; perform edge detection on the 2D convex hull area to obtain a 2D extended point set, where the edge detection is used to remove, according to the edge point set, the sparse point cloud of the background object included in the 2D convex hull area, and the 2D extended point set includes the 2D point set and the edge point set; and acquire the 2D view area of the target object according to the 2D extended point set.
  • the expansion of the 2D point set is realized based on the 2D convex hull and edge detection, ensuring the integrity of the 2D ROI and thereby realizing complete multi-view 3D reconstruction of the target object.
  • the parameter information when the image acquisition device shoots around the target object includes a degree of freedom parameter when the image acquisition device moves relative to the target object.
  • 3D reconstruction is performed based on the degree of freedom parameters of the image acquisition device, which can ensure the accuracy of 3D ROI and 2D ROI, thereby realizing a complete multi-view 3D reconstruction of the target object.
  • the image acquisition device includes multiple devices.
  • a device for multi-view 3D reconstruction is provided, including a processor and a memory, where the memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that the device executes the method in the first aspect and its various possible implementations.
  • there are one or more processors and one or more memories.
  • the memory can be integrated with the processor, or the memory can be set separately from the processor.
  • a computer-readable storage medium is provided, which stores program code for execution by a device, where the program code includes instructions for executing the method in the first aspect or the second aspect.
  • a computer program product containing instructions is provided; when the computer program product runs on a computer, the computer is caused to execute the method in any one of the implementations of the foregoing aspects.
  • a chip is provided, including a processor and a data interface, where the processor reads, through the data interface, instructions stored in a memory, and executes the method in any one of the above aspects.
  • the chip may further include a memory storing instructions, and the processor is configured to execute the instructions stored in the memory; when the instructions are executed, the processor executes the method in any one of the implementations of the above aspects.
  • the aforementioned chip may specifically be a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).
  • FIG. 1 shows a schematic structural diagram of a system architecture provided by an embodiment of the present application
  • FIG. 2 shows a schematic diagram of a scene structure provided by an embodiment of the present application
  • Fig. 3 shows a schematic diagram of a product realization form provided by the embodiment of the present application
  • Fig. 4 shows a flowchart of a method for multi-view 3D reconstruction provided by an embodiment of the present application
  • Fig. 5 shows a structural diagram of an apparatus for multi-view 3D reconstruction provided by an embodiment of the present application.
  • the system architecture 100 includes an image acquisition module 110 and a model reconstruction module 120 .
  • the image acquisition module 110 is one of the bases for model reconstruction, and high-quality large-scale image acquisition is the key to reconstructing high-quality models.
  • the image acquisition device 111 is used to acquire the original image, wherein the image acquisition device 111 can be any device with a shooting function, which is not specifically limited in the embodiment of the present application;
  • the image preprocessing device 112 may be used to perform screening, filtering, and de-distortion processing of the original image, where the method for performing de-distortion processing on the image is not limited in this embodiment of the present application.
  • the image acquisition module 110 also includes an image sequence library 113 for storing image sequences.
  • the image sequence can be used in the model reconstruction module to reconstruct the three-dimensional model.
  • the model reconstruction module 120 includes a view reconstruction device 121, which can perform sparse reconstruction and dense reconstruction based on the image sequences maintained in the image sequence library 113; the target 3D view is further obtained by the view fusion device 122.
  • view reconstruction device 121 and view fusion device 122 can be used as independent devices, or can be coupled as one device to reconstruct the target 3D view.
  • the foregoing is only an exemplary description, and the embodiments of the present application are not limited thereto.
  • the image sequences maintained in the image sequence library 113 are not necessarily all acquired by the image acquisition device 111, and may also be received from other devices.
  • the view reconstruction device 121 does not necessarily reconstruct the 3D view entirely based on the image sequences maintained by the image sequence library 113, and may also obtain image sequences from the cloud or elsewhere to reconstruct the 3D view. The above description should not be regarded as a limitation on the embodiments of this application.
  • execution devices can be terminals, such as mobile terminals, tablet computers, notebook computers, augmented reality (AR)/virtual reality (VR) devices, and vehicle-mounted terminals, and can also be servers, the cloud, and so on.
  • the above-mentioned view reconstruction device 121 can reconstruct and obtain different three-dimensional objects based on different image sequences for different targets or different tasks, so as to provide users with desired results.
  • FIG. 1 is only a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationship among devices, devices, modules, etc. shown in the figure does not constitute any limitation.
  • FIG. 2 shows a schematic diagram of a scene structure provided by an embodiment of the present application. This application scenario is applicable to the above-mentioned system 100 .
  • the application scene input includes two stages: the image sequence 2010 collected around the object, and the construction input 2020.
  • the construction input 2020 stage usually includes the sparsely reconstructed 3D point cloud output by the structure-from-motion (SFM) algorithm, the pose of the image acquisition device, and the de-distorted images.
  • the pose of the image acquisition device can be understood as the degree-of-freedom parameters of the movement of the image acquisition device relative to the object when the image acquisition device shoots around the object.
  • the 2D ROI extraction stage 2030 mainly refers to using related algorithms to complete 2D ROI extraction on the de-distorted image sequence; subsequently, the viewpoint sequence images after 2D ROI extraction are used as the input of dense reconstruction 2040, which outputs depth maps, and the depth maps and normal maps are synthesized into a 3D point cloud through view fusion 2050.
  • the execution subject of the multi-view 3D reconstruction method provided by the embodiments of the present disclosure is generally a computer device with certain computing capability; such computer devices include, for example, terminal devices, servers, or other processing devices, where a terminal device can be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, and so on.
  • the method for multi-view three-dimensional reconstruction may be implemented in a manner in which a processor invokes computer-readable instructions stored in a memory.
  • FIG. 3 shows a schematic flowchart of a method for multi-view 3D reconstruction provided by an embodiment of the present application.
  • the method 300 shown in FIG. 3 may be applied to the system 100 shown in FIG. 1 , and the method 300 may be executed by the above execution device.
  • the method 300 may be processed by a CPU, or other processors suitable for three-dimensional reconstruction, which is not limited in this embodiment of the present application.
  • the image acquisition device captures multiple sets of discrete viewpoint images surrounding the target object, and the discrete viewpoint image sets include multiple two-dimensional images for displaying the target object.
  • the set of discrete viewpoint images constitutes an image sequence as the input, and the reconstruction module outputs a corresponding 3D image of the target object; the 3D image can display a stereoscopic view of the target object.
  • the image acquisition device may be any electronic device with a shooting function, for example, a mobile phone, a camera, a computer, and the like. This embodiment of the present application does not limit it.
  • the target object may be any object in a scene space on which the user wants to perform 3D reconstruction.
  • an image sequence may also be called an image set, a view set, an image collection, or other similar terms.
  • the embodiment of the present application uses the image sequence as an example for description, and the embodiment of the present application does not limit this.
  • the method 300 includes step S310 to step S330. Step S310 to step S330 will be described in detail below.
  • the above-mentioned image sequence may be distorted images obtained by shooting the target object with an image acquisition device; a de-distorted image sequence is obtained after de-distortion processing, and the first image sequence is an example of the de-distorted image sequence, that is, the first image sequence includes a plurality of de-distorted discrete viewpoint images.
  • each object included in the current scene can be treated as an instance; the first image sequence is input frame by frame into the trained instance segmentation model to segment each instance and obtain the segmentation mask of each instance.
  • the segmentation mask includes the contour of the corresponding object and the pixels within the contour.
  • the first instance mask includes a segmentation mask of the background object and a segmentation mask of the target object in the first sequence of images. The method of obtaining the instance mask is not limited in this embodiment of the application.
  • the first 3D point cloud includes a sparse 3D point cloud output according to the first image sequence.
  • the sparse 3D point cloud includes a sparse 3D point cloud of a target object and a sparse 3D point cloud of a background object.
  • the first three-dimensional point cloud may be acquired by using an SFM algorithm.
  • the pose information of the image acquisition device (for clarity and simplicity, hereinafter referred to as the pose information) can be understood as the parameter information when the image acquisition device shoots around the target object.
  • the parameter information may be degree-of-freedom parameters. It should be understood that when the image acquisition device moves around the target object during shooting, each image acquisition device produces a spatial position change relative to the target object; this spatial position change can be converted into a coordinate system according to the above parameter information, so that the movement track of the image acquisition device can be determined.
  • the pose information of the image acquisition device may be acquired by using the SFM algorithm.
  • the degree of freedom parameter may include 3 position vector parameters and 3 Euler angle parameters, that is, the parameter information may include 6 degree of freedom parameters.
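  • as a minimal sketch only (the patent does not prescribe a particular parameterization; the XYZ Euler convention below is an assumption), such a 6-degree-of-freedom pose could be assembled into a 4x4 pose matrix as follows:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def pose_matrix(position, euler_xyz_deg):
    """Assemble a 4x4 pose from the 6 degree-of-freedom parameters:
    3 position vector parameters and 3 Euler angle parameters."""
    T = np.eye(4)
    # XYZ order and degrees are illustrative assumptions.
    T[:3, :3] = Rotation.from_euler("xyz", euler_xyz_deg, degrees=True).as_matrix()
    T[:3, 3] = position
    return T
```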
  • the second 3D point cloud is determined according to the above-mentioned first image sequence, the first instance mask, the first 3D point cloud, and the pose information of the image acquisition device.
  • the second three-dimensional point cloud may be understood as a sparse point cloud of the target object.
  • the 3D spherical model of the target object is determined according to the above pose information, and the 3D spherical model may be understood as a 3D ball of interest including the target object.
  • the visual axes formed by the image acquisition devices approximately intersect at a point on the object; with that intersection point as the center of the sphere and the length of the visual axis as the radius, a 3D ball of interest enclosing the target object is formed.
  • 0.2 times the length of the largest dimension of the first 3D point cloud can be taken as the length of the visual axis, that is, the length of the radius. It should be understood that this value is an empirical value in actual implementation, and this embodiment does not impose any restriction on it.
  • the viewing axis vector corresponding to the pose information can be used to fit the 3D ball of interest.
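  • a minimal sketch of such a fit, assuming each pose contributes a camera center c_j and a unit viewing-axis direction d_j (names chosen here for illustration), solves in the least-squares sense for the point closest to all viewing axes and applies the empirical 0.2-times-largest-extent rule for the radius:

```python
import numpy as np

def fit_ball_of_interest(centers, axes, points_sfm):
    """Fit the 3D ball of interest from the camera poses.

    centers: (N, 3) camera optical centers
    axes:    (N, 3) unit viewing-axis directions (assumed not all parallel)
    points_sfm: (M, 3) sparse SFM point cloud
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for c, d in zip(centers, axes):
        P = np.eye(3) - np.outer(d, d)  # projector orthogonal to axis d
        A += P
        b += P @ c
    center = np.linalg.solve(A, b)      # least-squares nearest point to all axes
    # Empirical radius: 0.2 times the largest extent of the sparse point cloud.
    radius = 0.2 * (points_sfm.max(axis=0) - points_sfm.min(axis=0)).max()
    return center, radius
```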
  • a 2D circular image of the target object is acquired according to the 3D spherical model, and the 2D circular image can be understood as a 2D circle of interest including the target object.
  • the 3D spherical model is back-projected to the 2D viewpoint image according to the pose information to form a 2D circular image.
  • the projection matrix corresponding to the pose information may be used to back-project the 3D spherical model onto the first image sequence to calculate the 2D circular image.
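  • a sketch of this back-projection under a pinhole camera assumption (K is the 3x3 intrinsic matrix and (R, t) the world-to-camera extrinsics of one view; these names are illustrative, not the patent's):

```python
import numpy as np

def project_ball(center, radius, K, R, t):
    """Back-project the 3D ball of interest into one view as a 2D circle."""
    Xc = R @ center + t            # sphere center in camera coordinates
    u, v, w = K @ Xc
    cx, cy = u / w, v / w          # projected circle center, in pixels
    f = 0.5 * (K[0, 0] + K[1, 1])  # mean focal length
    r_px = f * radius / Xc[2]      # approximate projected radius
    return (cx, cy), r_px
```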
  • segmentation mask of the background object is removed according to the 2D circular image, and the segmentation mask of the target object is determined.
  • the partial instance masks that overlap the 2D circular image belong to the target object, and the other instance masks are the instance masks of the background objects, that is, the instance masks of the background objects can be removed.
  • the segmentation mask of the target object is determined according to the overlap between the 2D viewpoint image and the first instance mask.
  • when the 2D circle of interest overlaps with partial masks included in the first instance mask, the overlapping partial masks are determined to be the segmentation mask of the target object, and the non-overlapping masks are removed; when they do not overlap, the non-overlapping partial masks are removed, and the remaining segmentation masks are determined as the segmentation mask of the target object, where the non-overlapping partial masks are the segmentation masks of the background object.
  • the second 3D point cloud is determined according to the first 3D point cloud and the segmentation mask of the target object.
  • the first 3D point cloud is back-projected to the 2D viewpoint image according to the pose information, and the second 3D point cloud is determined according to the overlap between the 2D viewpoint image and the segmentation mask of the target object.
  • the projection matrix corresponding to the pose information is used to project the first 3D point cloud onto the first image sequence, and according to the overlap relationship between the 2D projected points and the segmentation mask of the target object, the overlapping part of the point cloud is extracted as the sparse point cloud of the target object, which is the second 3D point cloud. At this point, the other point clouds in the first 3D point cloud need to be removed, so as to obtain the second 3D point cloud.
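  • a per-view sketch of this extraction, assuming a pinhole projection and a boolean segmentation mask of the target object (the names and the single-view scope are assumptions):

```python
import numpy as np

def extract_target_points(points, K, R, t, target_mask):
    """Keep the sparse points whose 2D projection falls on the target
    object's segmentation mask in this view; the rest are removed.

    points: (M, 3) first 3D point cloud; target_mask: (H, W) boolean mask.
    """
    Xc = R @ points.T + t[:, None]  # points in camera coordinates, (3, M)
    uvw = K @ Xc
    with np.errstate(divide="ignore", invalid="ignore"):
        u = np.round(uvw[0] / uvw[2]).astype(int)
        v = np.round(uvw[1] / uvw[2]).astype(int)
    H, W = target_mask.shape
    valid = (Xc[2] > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    keep = np.zeros(len(points), dtype=bool)
    keep[valid] = target_mask[v[valid], u[valid]]
    return points[keep]             # second 3D point cloud (this view)
```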
  • a 2D convex hull area is obtained according to the second 3D point cloud and the pose information; the 2D convex hull area includes a 2D point set within the outer contour of the target object, and the 2D point set includes the 2D projection points of the second 3D point cloud.
  • the 2D convex hull region can be understood as a set of pixel points within the approximate outermost contour of the target object, and the 2D point set included in the outer contour ensures the integrity of the target object. It can be understood that the outer contour includes all 2D point sets of the target object, and also includes 2D point sets of some background objects.
  • the second 3D point cloud is back-projected to the 2D viewpoint image according to the pose information, and the 2D convex hull area is determined according to the 2D viewpoint image.
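  • a sketch of the convex hull computation for one view, using OpenCV as an illustrative choice (the patent does not fix a particular hull implementation):

```python
import cv2
import numpy as np

def convex_hull_region(points_2d, image_shape):
    """Rasterize the 2D convex hull of the projected target points.

    points_2d: (M, 2) 2D projections of the second 3D point cloud.
    Returns a binary image of the 2D convex hull area.
    """
    hull = cv2.convexHull(points_2d.astype(np.int32))  # hull vertices
    region = np.zeros(image_shape[:2], dtype=np.uint8)
    cv2.fillConvexPoly(region, hull, 255)              # fill the hull interior
    return region
```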
  • edge detection is performed on the 2D convex hull area to obtain a 2D extended point set; the edge detection is used to remove, according to the edge point set, the sparse point cloud of the background object included in the 2D convex hull area, and the 2D extended point set includes the 2D point set and the edge point set. The 2D view area of the target object is acquired according to the 2D extended point set.
  • the 2D convex hull area is not the exact outline of the target object. Therefore, by performing edge detection on the 2D convex hull area, the precise edge point set of the target object can be obtained, thereby obtaining the 2D extended point set. It can be understood that the 2D extended point set includes the 2D point set and the edge point set.
  • the purpose of edge detection is to determine the precise outline of the target object based on the edge point set. Therefore, it is necessary to remove the sparse point cloud of the background object included in the 2D convex hull area, that is, to remove through morphological operations the points inside the 2D convex hull area other than the edge point set. It can be understood that the 2D convex hull area is thereby further reduced, so that it approaches the outline of the target object more closely, and the 2D view area of the target object is obtained.
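  • one plausible realization of this edge-detection-plus-erosion refinement (the Canny thresholds, kernel size, and iteration cap are all assumptions, not values from the patent):

```python
import cv2
import numpy as np

def refine_roi(view_img, hull_region):
    """Shrink the convex hull area toward the object contour: detect edges
    inside the hull, then repeatedly erode the region while re-adding edge
    points, so the region converges onto the edge-anchored outline."""
    gray = cv2.cvtColor(view_img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    edges = cv2.bitwise_and(edges, hull_region)  # keep edges inside the hull
    roi = hull_region.copy()
    kernel = np.ones((5, 5), np.uint8)
    for _ in range(50):                          # iteration cap (assumed)
        shrunk = cv2.bitwise_or(cv2.erode(roi, kernel), edges)
        if np.array_equal(shrunk, roi):          # converged
            break
        roi = shrunk
    return roi
```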
  • dense reconstruction is further performed according to the 2D viewing area to obtain a third 3D point cloud, and the third 3D point cloud is used to display a 3D image of the target object.
  • the 3D ROI of the target object is obtained to realize the automatic extraction of 2D ROI, and the 3D reconstruction of the target object is further realized.
  • ROI misdetection, missed detection, and incompleteness can be avoided, and the complete multi-view 3D reconstruction of the target object can be realized efficiently.
  • Fig. 4 shows a flowchart of a method for multi-view 3D reconstruction provided by an embodiment of the present application.
  • the method 400 shown in FIG. 4 may be applied to the system 100 shown in FIG. 1 , and the method 400 includes specific implementation steps of the above-mentioned method 300 .
  • the method 400 includes six steps S4010 to S4060. The specific implementation process of each step will be described in detail below.
  • Step S4010 acquiring an image sequence.
  • the image sequence may be a set of images obtained by surrounding shooting of the target object by multiple image acquisition devices. It can be understood that the image set includes multiple images showing the target object from various angles.
  • the image sequence may be acquired directly from the image acquisition devices, from other devices, or from the cloud or elsewhere; this embodiment of the present application does not limit it.
  • regardless of how the image sequence is obtained, it is captured by multiple image acquisition devices surrounding the target object.
  • Step S4020 construct input.
  • the input information includes a first image sequence 4021 , a first instance segmentation mask 4022 , a first 3D point cloud 4023 , and pose information 4024 .
  • the first image sequence 4021 consists of the images obtained after de-distortion processing of the image sequence, which makes subsequent calculations more accurate.
  • the first instance segmentation mask 4022 includes the segmentation masks of all objects under the shooting lens of the image acquisition device in a 3D reconstruction scene; specifically, the first instance segmentation mask includes the segmentation masks of the target object and the background objects.
  • the first 3D point cloud 4023 includes sparse point clouds of target objects and background objects.
  • pose information 4024 includes parameter information when the image acquisition device shoots around the target object.
  • for details, refer to step S310 in the method 300, which is not repeated in this embodiment of the present application.
  • the constructed input information can be expressed as:
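  • a form consistent with the symbol definitions below (the exact set notation is an assumption) is: Input = { points_sfm, {pose_j}_{j=1..N}, {view_img_j}_{j=1..N} }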
  • Input represents the input information of the construction.
  • points_sfm represents the first 3D point cloud reconstructed by the SFM algorithm, that is, the sparse point cloud.
  • the pose information may consist of 6 parameters, including 3 representing position vectors and 3 representing attitude vectors.
  • view_img_j represents the viewpoint image j of the first image sequence, that is, the viewpoint image from which distortion has been removed.
  • Step S4030 determining a second 3D point cloud according to the input information.
  • determining the second 3D point cloud according to the input information includes the following specific steps:
  • Step 4031 Determine the 3D ball of interest of the target object according to the above pose information.
  • the viewing axis vector corresponding to the pose information is used to fit the 3D ball of interest.
  • the least square method can be used to fit the focal point where the camera viewing axes converge, so as to represent the 3D ball of interest.
  • the 3D ball of interest fitted by the least square method can be expressed as:
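  • a form consistent with the symbol definitions below (writing axis_j for the viewing axis of view j is an assumed notation) is: S(x, y, z, r) = LS({axis_j}_{j=1..N})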
  • S(x, y, z, r) represents the 3D interest ball with (x, y, z) as the center and r as the radius fitted by the least square method;
  • LS(·) represents the least-squares fitting algorithm.
  • N represents the total number of viewpoint images, that is, the total number of viewpoint images included in the first image sequence.
  • x, y, z, r, N are all positive integers greater than 1.
  • Step 4032 Obtain the 2D circle of interest of the target object according to the above pose information and the 3D circle of interest.
  • the 3D ball of interest is back-projected to the 2D viewpoint image according to the projection matrix corresponding to the pose information to form a 2D circle of interest.
  • Step 4033 Remove the segmentation mask of the background object according to the 2D circle of interest.
  • the segmentation mask of the background object is removed according to the 2D circle of interest, and the segmentation mask of the target object is determined.
  • the segmentation mask of the determined target object can be expressed as:
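  • a form consistent with the symbol definitions below (writing mask_{j,m} for the m-th predicted instance mask on viewpoint image j is an assumed notation) is: refined_mask_j = Refine(S, pose_j, {mask_{j,m}}_{m=1..M_j})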
  • Refine(·) represents the refinement function of the segmentation mask, and it can be understood that according to this function, the 3D ball of interest can be back-projected to the 2D viewpoint image to determine the segmentation mask of the target object.
  • M_j denotes the number of instances in the prediction results of the instance segmentation model on viewpoint image j.
  • refined_mask represents the segmentation mask of the determined target object.
  • whether a mask overlaps with the 2D circle of interest determines whether it is removed as a background object's mask. For example, if the 2D circle of interest obtained by projecting the 3D spherical model onto the 2D viewpoint image overlaps with part of the masks in the first instance mask, the overlapping masks can be determined to be the segmentation mask of the target object, and the other, non-overlapping masks are masks of background objects and can be removed.
  • Step 4034 Determine the second 3D point cloud according to the first 3D point cloud and the segmentation mask of the target object.
  • the first 3D point cloud is back-projected to the 2D viewpoint image
  • the second 3D point cloud is determined according to the overlap between the 2D viewpoint image and the segmentation mask of the target object.
  • the second 3D point cloud can be understood as a 3D bounding box, and the determined 3D bounding box can be expressed as:
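  • a form consistent with the definitions below (PC_roi denoting the second 3D point cloud, following its use in step S4041) is: PC_roi = BB(points_sfm, {pose_j}_{j=1..N}, {refined_mask_j}_{j=1..N})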
  • BB(·) represents the calculation function of the 3D bounding box. This function back-projects the first 3D point cloud onto the 2D viewpoint images and determines the final second 3D point cloud by judging whether each projected 2D point falls on refined_mask.
  • Step S4040 acquiring a 2D view area according to the second 3D point cloud.
  • obtaining the 2D view area according to the second 3D point cloud includes the following specific steps:
  • Step S4041 extracting a 2D convex hull area.
  • the 2D convex hull region is obtained according to the second 3D point cloud and the pose information, and the obtained 2D convex hull region can be expressed as:
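  • a form consistent with the definition of CH(·) below is: convex_hull_j = CH(PC_roi, pose_j)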
  • CH(·) represents the convex hull calculation function, which uses the pose parameters corresponding to view j to back-project the second 3D point cloud PC_roi onto the corresponding view, and then calculates the convex hull convex_hull based on the 2D projected points.
  • the 2D point set can be obtained through the extraction of the 2D convex hull area.
  • the 2D point set can be understood as all 2D projection points of the 3D point cloud that fall within the convex hull area; the point set includes all points of the target object, and also contains some points of background objects.
  • Step S4042 determine a 2D extension point set.
  • edge detection is performed on the 2D convex hull area to obtain an edge point set, and a relatively accurate 2D outline of the target object can be determined according to the edge point set.
  • the 2D extension point set can be expressed as:
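  • a form consistent with the definition of Edge(·) below (writing ext_points_j for the extended interest point set is an assumed name) is: ext_points_j = Edge(convex_hull_j, view_img_j)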
  • Edge(·) represents the 2D interest point set expansion function, which performs edge detection on the area inside the convex hull convex_hull_j on the viewpoint image view_img_j, and combines the edge detection results with the 2D interest point set to obtain the extended interest point set.
  • the above-mentioned determined 2D extension point set includes the above-mentioned 2D point set and edge point set.
  • Step S4043: acquire the 2D view area through a morphological operation.
  • the purpose of edge detection is to determine the precise outline of the target object according to the edge point set. Therefore, it is necessary to remove the sparse point cloud of the background object included in the 2D convex hull area, that is, to remove through a morphological operation the points in the 2D convex hull area other than the edge point set.
  • the morphological operation can be expressed as:
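  • a form consistent with the definition of Erosion(·) below (writing roi_j for the 2D ROI on view j is an assumed name) is: roi_j = Erosion(convex_hull_j, ext_points_j)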
  • Erosion(·) represents the 2D ROI extraction function, which erodes from the boundary of the convex hull convex_hull_j toward the boundary determined by the edge point set; the final result is the 2D ROI on this view.
  • Step S4050 acquiring a third 3D point cloud according to the 2D view area.
  • dense reconstruction is further performed according to the 2D view area to obtain a 3D point cloud.
  • the dense reconstruction may be any dense reconstruction method in 3D reconstruction, which is not limited in this embodiment of the present application.
  • Step S4060 complete the 3D reconstruction of the target object through view fusion.
  • view fusion is performed on the third three-dimensional point cloud to obtain a final reconstructed image, which is used to display the three-dimensional image of the target object.
  • the 3D ROI of the target object is obtained to realize the automatic extraction of 2D ROI, and the 3D reconstruction of the target object is further realized.
  • Fig. 5 shows a structural block diagram of an apparatus for multi-view 3D reconstruction provided by an embodiment of the present application.
  • the device 500 for multi-view three-dimensional reconstruction includes: a first determination module 510 , a second determination module 520 , and a construction module 530 .
  • the first determining module 510 is configured to determine the second 3D point cloud according to the first image sequence of the target object, the first instance mask, the first 3D point cloud, and the pose information of the image acquisition device.
  • the first determination module 510 determines the 3D spherical model of the target object according to the pose information; acquires a 2D circular image of the target object according to the 3D spherical model; removes the segmentation mask of the background object according to the 2D circular image to determine the segmentation mask of the target object; and determines the second 3D point cloud according to the first 3D point cloud and the segmentation mask of the target object.
  • the first determining module 510 uses a least square method to fit the center and radius of the 3D spherical model according to the camera pose information, thereby determining the 3D spherical model.
  • the first determination module 510 determines the segmentation mask of the target object according to the overlap between the 2D circular image and the first instance mask.
  • when the 2D circular image overlaps with partial masks included in the first instance mask, it determines that the overlapping partial masks are the segmentation mask of the target object and removes the non-overlapping masks; or, when the 2D circular image does not overlap with partial masks included in the first instance mask, it removes the non-overlapping partial masks and determines the remaining segmentation masks as the segmentation mask of the target object, where the non-overlapping partial masks are the segmentation masks of the background object.
  • the first determination module 510 is configured to project the first 3D point cloud onto a 2D view image, and to determine the second 3D point cloud according to the overlap between the 2D view image and the segmentation mask of the target object.
  • when the 2D view image of a part of the first 3D point cloud overlaps with the segmentation mask of the target object, the overlapping partial point cloud of the first 3D point cloud is determined as the second 3D point cloud, and the remaining point clouds in the first 3D point cloud that do not overlap with the segmentation mask of the target object are removed.
  • the second determining module 520 is configured to acquire a 2D view area according to the second 3D point cloud.
  • the second determination module 520 obtains a 2D convex hull area according to the second 3D point cloud and the pose information, where the 2D convex hull area includes a 2D point set within the outer contour of the target object, and the 2D point set includes the 2D projection points of the second 3D point cloud; performs edge detection on the 2D convex hull area to obtain a 2D extended point set, where the edge detection is used to remove, according to the edge point set, the sparse point cloud of the background object included in the 2D convex hull area, and the 2D extended point set includes the 2D point set and the edge point set; and obtains the 2D view area of the target object according to the 2D extended point set.
  • the construction module 530 is used for generating a third 3D point cloud according to the 2D view area.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components can be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • if the functions described above are realized in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage media include: USB flash drives, mobile hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical discs, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Processing Or Creating Images (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present application provide a multi-view three-dimensional reconstruction method, comprising: determining a second three-dimensional point cloud according to a first image sequence, a first instance mask and a first three-dimensional point cloud of a target object and pose information of an image acquisition device; obtaining a 2D view region according to the second three-dimensional point cloud, the 2D view region comprising a region of interest of the target object; and generating a third three-dimensional point cloud according to the 2D view region, wherein the third three-dimensional point cloud comprises a dense three-dimensional point cloud of the target object, and the dense three-dimensional point cloud is used for displaying the target object. Therefore, complete and efficient multi-view three-dimensional reconstruction can be realized on the premise of no user interaction, no target object category limitation, and no calibration pattern constraint.

Description

一种多视图三维重建的方法A method for multi-view 3D reconstruction
本申请要求于2021年11月25日提交中国专利局、申请号为202111414740.6、申请名称为“一种多视图三维重建的方法”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application claims the priority of the Chinese patent application with the application number 202111414740.6 and the application title "A Method for Multi-View Three-dimensional Reconstruction" submitted to the China Patent Office on November 25, 2021, the entire contents of which are incorporated by reference in this application middle.
技术领域technical field
本申请实施例涉及计算机技术领域,尤其涉及一种多视图三维重建的方法及装置。The embodiments of the present application relate to the field of computer technology, and in particular to a method and device for multi-view three-dimensional reconstruction.
背景技术Background technique
多视图立体视觉(multi-view stereo,MVS)旨在从一组经过标定的二维图像和估计的相机参数中恢复三维场景表面,广泛应于自动驾驶、增强现实、文物数字化呈现与保护、城市尺度的测量等领域。Multi-view stereo vision (multi-view stereo, MVS) aims to restore the three-dimensional scene surface from a set of calibrated two-dimensional images and estimated camera parameters, and is widely used in autonomous driving, augmented reality, digital presentation and protection of cultural relics, urban scale measurement etc.
现有技术中,三维重建过程中需要从被处理的图像中以各种形状勾勒出需要处理的区域,从而识别并提取出场景主体感兴趣区域(region of interest,ROI)。例如,借助用户交互确定输入图像序列上初始帧的兴趣区域;再例如,借助前景分割算法实现图像序列上前景兴趣区域的分割;再例如,借助场景中预置的标志图案实现兴趣区域的标记和提取。当前的多视图立体重建技术极大的启发了密集计算场景的研究,但一方面,当前的ROI提取技术容易出现错检、漏检等ROI不完整的问题,另一方面,基于特定标志或用户交互的方法不适合自动化或大批量重建的场景。此外,密集计算场景中,大量的计算发生在背景区域,背景区域的计算开销过大,影响三维重建的效率。In the prior art, in the process of 3D reconstruction, it is necessary to outline the region to be processed in various shapes from the processed image, so as to identify and extract the region of interest (region of interest, ROI) of the main body of the scene. For example, use user interaction to determine the ROI of the initial frame on the input image sequence; another example, use the foreground segmentation algorithm to realize the segmentation of the foreground ROI on the image sequence; another example, use the preset logo pattern in the scene to realize the marking and extract. The current multi-view stereo reconstruction technology has greatly inspired the research of intensive computing scenarios, but on the one hand, the current ROI extraction technology is prone to incomplete ROI problems such as false detection and missed detection; on the other hand, based on specific signs or user Interactive methods are not suitable for automation or high-volume reconstruction scenarios. In addition, in intensive computing scenarios, a large number of calculations occur in the background area, and the calculation overhead of the background area is too large, which affects the efficiency of 3D reconstruction.
如何在无用户交互、无目标物体类别限制及无标定图案约束的前提下实现完整并且高效的多视图三维重建,成为业界亟需解决的问题。How to achieve a complete and efficient multi-view 3D reconstruction without user interaction, no restriction on object types, and no constraints on calibration patterns has become an urgent problem to be solved in the industry.
发明内容Contents of the invention
本申请实施例提供一种多视图三维重建的方法,能够在无用户交互、无目标物体类别限制及无标定图案约束的前提下实现完整并且高效的多视图三维重建。The embodiment of the present application provides a method for multi-view 3D reconstruction, which can realize complete and efficient multi-view 3D reconstruction without user interaction, target object type restriction and calibration pattern restriction.
第一方面,提供了一种多视图三维重建的方法,包括:根据目标物体的第一图像序列、第一实例掩码、第一三维点云和图像采集设备的位姿信息确定第二三维点云,其中,所述第一图像序列包括使用对所述目标物体进行环绕拍摄后进行去畸变的多幅图像,所述第一实例掩码包括所述第一图像序列中的目标物体的分割掩码和背景物体的分割掩码,所述第一三维点云包括所述第一图像序列中的目标物体的稀疏点云和背景物体的稀疏点云,所述位姿信息包括所述图像采集设备环绕所述目标物体拍摄时的参数信息,所述第二三维点云包括所述目标物体的稀疏点云;根据所述第二三维点云获取2D视图区域,所述2D视图区域包括所述目标物体的兴趣区域;根据所述2D视图区域生成第三三维点云,所述第三三维点云包括所述目标物体的稠密三维点云,所述稠密三维点云用于展示所述目标物体。In the first aspect, a method for multi-view 3D reconstruction is provided, including: determining the second 3D point according to the first image sequence of the target object, the first instance mask, the first 3D point cloud, and the pose information of the image acquisition device cloud, wherein the first sequence of images includes a plurality of images that are de-distorted after surrounding shooting of the target object, and the first instance mask includes a segmentation mask of the target object in the first sequence of images code and the segmentation mask of the background object, the first three-dimensional point cloud includes the sparse point cloud of the target object and the sparse point cloud of the background object in the first image sequence, and the pose information includes the image acquisition device Parameter information when shooting around the target object, the second 3D point cloud includes a sparse point cloud of the target object; acquire a 2D view area according to the second 3D point cloud, and the 2D view area includes the target A region of interest of the object; generating a third 3D point cloud according to the 2D view region, the third 3D point cloud including a dense 3D point cloud of the target object, and the dense 3D point cloud is used to display the target object.
基于上述技术方案,在本申请的多视图三维重建场景中,基于图像采集设备的视轴先 验信息,通过获取目标物体的3D ROI实现2D ROI的自动化提取,进一步实现目标物体的三维重建。从而能够避免ROI错检,漏检,不完整的问题,高效的实现完整的目标物体的多视图三维重建。Based on the above technical solution, in the multi-view 3D reconstruction scene of the present application, based on the visual axis prior information of the image acquisition device, the automatic extraction of 2D ROI is realized by obtaining the 3D ROI of the target object, and the 3D reconstruction of the target object is further realized. In this way, it is possible to avoid ROI misdetection, missed detection, and incomplete problems, and efficiently realize the complete multi-view 3D reconstruction of the target object.
With reference to the first aspect, in a possible implementation, determining the second 3D point cloud according to the first image sequence of the target object, the first instance mask, the first 3D point cloud, and the pose information of the image acquisition device includes: determining a 3D spherical model of the target object according to the pose information; acquiring a 2D circular image of the target object according to the 3D spherical model; removing the segmentation masks of the background objects according to the 2D circular image to determine the segmentation mask of the target object; and determining the second 3D point cloud according to the first 3D point cloud and the segmentation mask of the target object.

Based on the above technical solution, in the present application, the 3D spherical model is determined from the visual-axis information of the image acquisition device, and the background masks are removed according to the projection of the spherical model onto the 2D viewpoint images, thereby avoiding ROI false detection and missed detection.

With reference to the first aspect, in a possible implementation, the center and radius of the 3D spherical model are fitted from the pose information using the least-squares method.

Based on the above technical solution, fitting the 3D spherical model from the visual-axis information of the image acquisition device by least squares ensures the accuracy of the position and contour of the 3D spherical model.

With reference to the first aspect, in a possible implementation, determining the segmentation mask of the target object according to the overlap between the 2D circular image and the first instance mask includes: when a partial mask included in the first instance mask overlaps the 2D circular image, determining the overlapping partial mask to be the segmentation mask of the target object and removing the non-overlapping masks; or, when a partial mask included in the first instance mask does not overlap the 2D circular image, removing the non-overlapping partial mask and determining the remaining segmentation masks to be the segmentation mask of the target object, where the non-overlapping partial mask is a segmentation mask of a background object.

Based on the above technical solution, removing the background masks based on the 2D projection of the 3D spherical model ensures the accuracy of the target object's mask, and therefore the precision and completeness of the ROI; it further reduces computation in the background region, which effectively improves reconstruction efficiency.

With reference to the first aspect, in a possible implementation, the first 3D point cloud is projected onto a 2D view image, and the second 3D point cloud is determined according to the overlap between the 2D view image and the segmentation mask of the target object, including: when the 2D projections of a part of the first 3D point cloud overlap the segmentation mask of the target object, determining that overlapping part of the first 3D point cloud to be the second 3D point cloud, and removing the remaining points of the first 3D point cloud that do not overlap the segmentation mask of the target object.

Based on the above technical solution, determining the 3D point cloud of the target object from the 2D projection of the sparse point cloud makes the target object's 3D point cloud more accurate and facilitates complete ROI extraction, thereby achieving a complete multi-view 3D reconstruction of the target object.

With reference to the first aspect, in a possible implementation, a 2D convex hull region is acquired according to the second 3D point cloud and the pose information, where the 2D convex hull region includes a 2D point set within the outer contour of the target object, and the 2D point set includes the 2D projection points of the second 3D point cloud; edge detection is performed on the 2D convex hull region to acquire a 2D extended point set, where the edge detection is used to remove, according to an edge point set, the sparse points of the background objects included in the 2D convex hull region, and the 2D extended point set includes the 2D point set and the edge point set; and the 2D view region of the target object is acquired according to the 2D extended point set.

Based on the above technical solution, the 2D point set is expanded based on the 2D convex hull and edge detection, which ensures a complete 2D ROI and thereby a complete multi-view 3D reconstruction of the target object.

With reference to the first aspect, in a possible implementation, the parameter information of the image acquisition device while shooting around the target object includes degree-of-freedom parameters of the motion of the image acquisition device relative to the target object.

Based on the above technical solution, performing 3D reconstruction based on the degree-of-freedom parameters of the image acquisition device ensures the accuracy of the 3D ROI and the 2D ROI, thereby achieving a complete multi-view 3D reconstruction of the target object.

With reference to the first aspect, in a possible implementation, there are multiple image acquisition devices.
According to a second aspect, a multi-view three-dimensional reconstruction apparatus is provided, including: a first determining module, configured to determine a second 3D point cloud according to a first image sequence of a target object, a first instance mask, a first 3D point cloud, and pose information of an image acquisition device, where the first image sequence includes a plurality of images obtained by shooting around the target object and then removing distortion, the first instance mask includes the segmentation mask of the target object and the segmentation masks of background objects in the first image sequence, the first 3D point cloud includes the sparse point cloud of the target object and the sparse point clouds of the background objects in the first image sequence, the pose information includes parameter information of the image acquisition device while it shoots around the target object, and the second 3D point cloud includes the sparse point cloud of the target object; a second determining module, configured to acquire a 2D view region according to the second 3D point cloud, where the 2D view region includes a region of interest of the target object; and a construction module, configured to generate a third 3D point cloud according to the 2D view region, where the third 3D point cloud includes a dense 3D point cloud of the target object, and the dense 3D point cloud is used to display the target object.

Based on the above technical solution, in the multi-view 3D reconstruction scenario of the present application, the 2D ROI is extracted automatically by obtaining the 3D ROI of the target object based on prior information about the visual axes of the image acquisition device, and the 3D reconstruction of the target object is then completed. This avoids ROI false detection, missed detection, and incompleteness, and efficiently achieves a complete multi-view 3D reconstruction of the target object.
With reference to the second aspect, in a possible implementation, the first determining module is specifically configured to: determine a 3D spherical model of the target object according to the pose information; acquire a 2D circular image of the target object according to the 3D spherical model; remove the segmentation masks of the background objects according to the 2D circular image to determine the segmentation mask of the target object; and determine the second 3D point cloud according to the first 3D point cloud and the segmentation mask of the target object.

Based on the above technical solution, in the present application, the 3D spherical model is determined from the visual-axis information of the image acquisition device, and the background masks are removed according to the projection of the spherical model onto the 2D viewpoint images, thereby avoiding ROI false detection and missed detection.

With reference to the second aspect, in a possible implementation, the first determining module is specifically configured to fit the center and radius of the 3D spherical model from the camera pose information using the least-squares method.

Based on the above technical solution, fitting the 3D spherical model from the visual-axis information of the image acquisition device by least squares ensures the accuracy of the position and contour of the 3D spherical model.

With reference to the second aspect, in a possible implementation, the first determining module is specifically configured to determine the segmentation mask of the target object according to the overlap between the 2D circular image and the first instance mask, including: when a partial mask included in the first instance mask overlaps the 2D circular image, determining the overlapping partial mask to be the segmentation mask of the target object and removing the non-overlapping masks; or, when a partial mask included in the first instance mask does not overlap the 2D circular image, removing the non-overlapping partial mask and determining the remaining segmentation masks to be the segmentation mask of the target object, where the non-overlapping partial mask is a segmentation mask of a background object.

Based on the above technical solution, removing the background masks based on the 2D projection of the 3D spherical model ensures the accuracy of the target object's mask, and therefore the precision and completeness of the ROI; it further reduces computation in the background region, which effectively improves reconstruction efficiency.

With reference to the second aspect, in a possible implementation, the first determining module is specifically configured to project the first 3D point cloud onto a 2D view image and determine the second 3D point cloud according to the overlap between the 2D view image and the segmentation mask of the target object, including: when the 2D projections of a part of the first 3D point cloud overlap the segmentation mask of the target object, determining that overlapping part of the first 3D point cloud to be the second 3D point cloud, and removing the remaining points of the first 3D point cloud that do not overlap the segmentation mask of the target object.

Based on the above technical solution, determining the 3D point cloud of the target object from the 2D projection of the sparse point cloud makes the target object's 3D point cloud more accurate and facilitates complete ROI extraction, thereby achieving a complete multi-view 3D reconstruction of the target object.

With reference to the second aspect, in a possible implementation, the second determining module is specifically configured to: acquire a 2D convex hull region according to the second 3D point cloud and the pose information, where the 2D convex hull region includes a 2D point set within the outer contour of the target object, and the 2D point set includes the 2D projection points of the second 3D point cloud; perform edge detection on the 2D convex hull region to acquire a 2D extended point set, where the edge detection is used to remove, according to an edge point set, the sparse points of the background objects included in the 2D convex hull region, and the 2D extended point set includes the 2D point set and the edge point set; and acquire the 2D view region of the target object according to the 2D extended point set.

Based on the above technical solution, the 2D point set is expanded based on the 2D convex hull and edge detection, which ensures a complete 2D ROI and thereby a complete multi-view 3D reconstruction of the target object.

With reference to the second aspect, in a possible implementation, the parameter information of the image acquisition device while shooting around the target object includes degree-of-freedom parameters of the motion of the image acquisition device relative to the target object.

Based on the above technical solution, performing 3D reconstruction based on the degree-of-freedom parameters of the image acquisition device ensures the accuracy of the 3D ROI and the 2D ROI, thereby achieving a complete multi-view 3D reconstruction of the target object.

With reference to the second aspect, in a possible implementation, there are multiple image acquisition devices.
According to a third aspect, a multi-view three-dimensional reconstruction apparatus is provided, including a processor and a memory, where the memory is configured to store a computer program, and the processor is configured to call and run the computer program from the memory, so that the apparatus performs the multi-view three-dimensional reconstruction method in the first aspect and its various possible implementations.
Optionally, there are one or more processors and one or more memories.

Optionally, the memory may be integrated with the processor, or the memory and the processor may be arranged separately.
According to a fourth aspect, a computer-readable storage medium is provided. The computer-readable medium stores program code for execution by a device, and the program code includes instructions for performing the method in the first aspect or any of its possible implementations.
According to a fifth aspect, a computer program product containing instructions is provided; when the computer program product runs on a computer, the computer is caused to perform the method in any one of the implementations of the foregoing aspects.

According to a sixth aspect, a chip is provided. The chip includes a processor and a data interface, and the processor reads, through the data interface, instructions stored in a memory to perform the method in any one of the implementations of the foregoing aspects.

Optionally, as an implementation, the chip may further include a memory in which instructions are stored; the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor is configured to perform the method in any one of the implementations of the foregoing aspects.

The above chip may specifically be a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).
Description of Drawings
FIG. 1 is a schematic structural diagram of a system architecture provided by an embodiment of the present application;

FIG. 2 is a schematic diagram of a scenario provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of a product implementation form provided by an embodiment of the present application;

FIG. 4 is a flowchart of a multi-view three-dimensional reconstruction method provided by an embodiment of the present application;

FIG. 5 is a structural diagram of a multi-view three-dimensional reconstruction apparatus provided by an embodiment of the present application.
Detailed Description of Embodiments
The technical solutions in the embodiments of the present application are described below with reference to the accompanying drawings.

It should be understood that the names of all nodes and devices in this application are merely set for convenience of description and may differ in actual applications; this application should not be understood as limiting the names of the various nodes and devices. On the contrary, any name having the same or a similar function as a node or device used in this application is regarded as the method of this application or an equivalent replacement thereof and falls within the protection scope of this application; details are not repeated below.
To facilitate understanding of the embodiments of the present application, a schematic structural diagram of a system architecture 100 according to an embodiment of the present application is first briefly described with reference to FIG. 1. As shown in FIG. 1, the system architecture 100 includes an image acquisition module 110 and a model reconstruction module 120. The image acquisition module 110 is one of the foundations of model reconstruction; high-quality, large-scale image acquisition is the key to reconstructing a high-quality model.

In the image acquisition module 110, as shown in FIG. 1, an image acquisition device 111 is configured to acquire original images, where the image acquisition device 111 may be any device with a shooting function, which is not specifically limited in the embodiments of the present application; an image preprocessing device 112 may be configured to screen, filter, and de-distort the original images, where the method for de-distorting the images is not limited in the embodiments of the present application.

The image acquisition module 110 further includes an image sequence library 113 for storing image sequences. These image sequences can be used by the model reconstruction module to reconstruct the three-dimensional model.

As shown in FIG. 1, the model reconstruction module 120 includes a view reconstruction device 121, which can perform sparse reconstruction and dense reconstruction based on the image sequences maintained in the image sequence library 113; a view fusion device then obtains the target three-dimensional view.

It should be noted that the view reconstruction device 121 and the view fusion device 122 may be independent devices, or may be coupled into a single device that reconstructs the target three-dimensional view. This is merely an illustrative description and is not limited in the embodiments of the present application.

It should be noted that, in practical applications, the image sequences maintained in the image sequence library 113 are not necessarily all acquired by the image acquisition device 111; they may also be received from other devices. In addition, the view reconstruction device 121 does not necessarily reconstruct the three-dimensional view entirely based on the image sequences maintained in the image sequence library 113; it may also obtain image sequences from the cloud or elsewhere for the reconstruction. The above description should not be regarded as a limitation on the embodiments of the present application.

The embodiments of the present application can be applied to different systems or devices, such as an execution device. The execution device may be a terminal, such as a mobile phone, a tablet computer, a notebook computer, an augmented reality (AR)/virtual reality (VR) device, or a vehicle-mounted terminal, or may be a server, a cloud, or the like.

It is worth noting that the view reconstruction device 121 can reconstruct different three-dimensional objects based on different image sequences for different targets or tasks, thereby providing users with the desired results.

It should be noted that FIG. 1 is merely a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationships among the devices, components, and modules shown in the figure do not constitute any limitation.
FIG. 2 is a schematic diagram of a scenario provided by an embodiment of the present application. This application scenario is applicable to the above system 100.
In this embodiment of the present application, the input of the application scenario involves two stages: an image sequence 2010 acquired around the object, and a construction input 2020. The construction input 2020 stage typically includes the sparsely reconstructed 3D point cloud output by the structure-from-motion (SFM) algorithm, the pose of the image acquisition device, and the de-distorted images. The pose of the image acquisition device can be understood as the degree-of-freedom parameters of its motion relative to the object while shooting around the object.

The 2D ROI extraction stage 2030 mainly refers to using related algorithms to extract the 2D ROIs on the de-distorted image sequence; the viewpoint sequence images after 2D ROI extraction are then used as the input of dense reconstruction 2040, which outputs depth maps, and view fusion 2050 synthesizes the depth maps and normal maps into a 3D point cloud.
It should be noted that the above scenario is merely illustrative; the embodiments of the present application can be used in various multi-view stereo reconstruction scenarios, which is not limited here.

To facilitate understanding of this embodiment, the multi-view three-dimensional reconstruction method disclosed in this embodiment is first described in detail. The execution subject of the multi-view three-dimensional reconstruction method provided by the embodiments of the present disclosure is generally a computer device with a certain computing capability, for example a terminal device, a server, or another processing device. The terminal device may be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like. In some possible implementations, the multi-view three-dimensional reconstruction method may be implemented by a processor invoking computer-readable instructions stored in a memory.
The multi-view three-dimensional reconstruction method of the embodiments of the present application is described below with reference to FIG. 3, which shows a schematic flowchart of a multi-view three-dimensional reconstruction method provided by an embodiment of the present application. The method 300 shown in FIG. 3 can be applied to the system 100 shown in FIG. 1, and the method 300 may be executed by the above execution device. Optionally, the method 300 may be processed by a CPU or by another processor suitable for three-dimensional reconstruction, which is not limited in this embodiment of the present application.
In the multi-view three-dimensional reconstruction scenario of the embodiments of the present application, the image acquisition device captures multiple discrete viewpoint image sets around the target object; each discrete viewpoint image set includes multiple two-dimensional images showing the target object. The discrete viewpoint image sets form an image sequence that serves as the input; after the reconstruction module, the corresponding output is a 3D image of the target object, and this 3D image can display a stereoscopic view of the target object.
The image acquisition device may be any electronic device with a shooting function, for example a mobile phone, a camera, or a computer; this is not limited in the embodiments of the present application.

It should be understood that the target object may be any object in a scene space that the user wants to reconstruct in three dimensions.

It should be understood that an image sequence may also be called an image collection, a view collection, an image set, or a similar term; the embodiments of the present application use the term image sequence as an example for description, which is not limiting.

The method 300 includes steps S310 to S330, which are described in detail below.
S310: Determine a second 3D point cloud according to a first image sequence of the target object, a first instance mask, a first 3D point cloud, and pose information of the image acquisition device.

In this embodiment of the present application, the above image sequence may consist of distorted images obtained by the image acquisition device shooting the target object; after de-distortion processing, a de-distorted image sequence is obtained. The first image sequence is an example of such a de-distorted image sequence, i.e., the first image sequence includes multiple de-distorted discrete viewpoint images.

In this embodiment of the present application, each object included in the current scene can be regarded as an instance. Feeding the first image sequence frame by frame into a trained instance segmentation model segments each instance and yields a segmentation mask for each instance; the segmentation mask includes the contour of the corresponding object and the pixels within that contour. The first instance mask includes the segmentation masks of the background objects and the segmentation mask of the target object in the first image sequence. How the instance masks are obtained is not limited in this embodiment of the present application.

It should be understood that the background objects are all objects in the current shooting scene other than the target object.

In this embodiment of the present application, the first 3D point cloud includes a sparse 3D point cloud output from the first image sequence; this sparse 3D point cloud includes the sparse point cloud of the target object and the sparse point cloud of the background objects.

In a possible implementation, the first 3D point cloud may be obtained using the SFM algorithm.

In this embodiment of the present application, the pose information of the image acquisition device (hereinafter simply the pose information, for clarity and brevity) can be understood as the parameter information of the image acquisition device while it shoots around the target object. The parameter information may be degree-of-freedom parameters. It should be understood that the image acquisition device moves around the target object while shooting, and each acquisition position has a spatial relationship to the target object; according to the above parameter information, this changing spatial relationship can be expressed in a coordinate system, so that the motion trajectory of the image acquisition device can be determined.

In a possible implementation, the pose information of the image acquisition device may be obtained using the SFM algorithm.

As an example and not a limitation, the degree-of-freedom parameters may include 3 position-vector parameters and 3 Euler-angle parameters, i.e., the parameter information may include 6 degree-of-freedom parameters.
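For illustration only, the following minimal sketch assembles such a 6-parameter pose into a 4x4 homogeneous transform; the Euler-angle composition order (here Z*Y*X) and the function name are assumptions, since the embodiments do not fix a particular parameterization.

    import numpy as np

    def pose_to_matrix(t, euler):
        """Assemble a 6-DoF pose (3 position parameters t, 3 Euler angles in
        radians) into a 4x4 homogeneous transform; Z*Y*X order is assumed."""
        ax, ay, az = euler
        Rx = np.array([[1, 0, 0],
                       [0, np.cos(ax), -np.sin(ax)],
                       [0, np.sin(ax), np.cos(ax)]])
        Ry = np.array([[np.cos(ay), 0, np.sin(ay)],
                       [0, 1, 0],
                       [-np.sin(ay), 0, np.cos(ay)]])
        Rz = np.array([[np.cos(az), -np.sin(az), 0],
                       [np.sin(az), np.cos(az), 0],
                       [0, 0, 1]])
        T = np.eye(4)
        T[:3, :3] = Rz @ Ry @ Rx  # rotation from the three Euler angles
        T[:3, 3] = t              # translation from the three position parameters
        return T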
In this embodiment of the present application, the second 3D point cloud is determined according to the above first image sequence, first instance mask, first 3D point cloud, and pose information of the image acquisition device.

In this embodiment of the present application, the second 3D point cloud can be understood as the sparse point cloud of the target object.

In a possible implementation, a 3D spherical model of the target object is determined according to the above pose information; the 3D spherical model can be understood as a 3D ball of interest enclosing the target object.

It can be understood that, when the image acquisition devices shoot around the target object, the visual axes formed by the devices approximately converge at a point on the object; taking this convergence point as the sphere center and the visual-axis length as the radius yields a 3D ball of interest that encloses the target object.

As an example, in practice 0.2 times the longest dimension of the first 3D point cloud can be taken as the visual-axis length, i.e., the radius. It should be understood that this value is an empirical one from practice and imposes no limitation on the embodiments of the present application.

Specifically, the 3D ball of interest can be fitted using the visual-axis vectors corresponding to the pose information.

Further, a 2D circular image of the target object is acquired according to the 3D spherical model; the 2D circular image can be understood as a 2D circle of interest enclosing the target object.

It can be understood that the 3D spherical model is back-projected onto the 2D viewpoint images according to the pose information, forming the 2D circular images.

Specifically, the projection matrices corresponding to the pose information may be used to back-project the 3D spherical model onto the first image sequence to compute the 2D circular images.
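For illustration, a minimal sketch of this back-projection is given below; the pinhole model with intrinsic matrix K, the world-to-camera rotation R and translation t, and the small-angle radius approximation are all assumptions for the sketch, since the text only refers to the projection matrix derived from the pose:

    import numpy as np

    def project_sphere(center_w, radius, K, R, t):
        """Project a 3D sphere (center_w, radius) into a view as a 2D circle.
        Returns the circle center in pixels and an approximate pixel radius."""
        c_cam = R @ center_w + t      # sphere center in camera frame
        uvw = K @ (c_cam / c_cam[2])  # pinhole projection
        f = 0.5 * (K[0, 0] + K[1, 1])  # mean focal length in pixels
        r_px = f * radius / c_cam[2]   # small-angle radius approximation
        return uvw[:2], r_px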
Further, the segmentation masks of the background objects are removed according to the 2D circular image, and the segmentation mask of the target object is determined.

In one implementation, when the 3D spherical model is projected onto a 2D viewpoint image, if the projection of its sphere center falls on one of the partial masks in the first instance mask, that partial mask can be determined to be the segmentation mask of the target object; the remaining instance masks are instance masks of background objects and can be removed.

In another implementation, the segmentation mask of the target object is determined according to the overlap between the 2D viewpoint image and the first instance mask.

As an example, when a partial mask included in the first instance mask overlaps the 2D circular image, the overlapping partial mask is determined to be the segmentation mask of the target object, and the non-overlapping masks are removed.

As another example, when a partial mask included in the first instance mask does not overlap the 2D circular image, the non-overlapping partial mask is removed, and the remaining segmentation masks are determined to be the segmentation mask of the target object; the non-overlapping partial mask is a segmentation mask of a background object.

Finally, the second 3D point cloud is determined according to the above first 3D point cloud and the segmentation mask of the target object.

It can be understood that the first 3D point cloud is back-projected onto the 2D viewpoint images according to the pose information, and the second 3D point cloud is determined by the overlap between the projected points and the segmentation mask of the target object.

Specifically, the projection matrices corresponding to the pose information are used to project the first 3D point cloud onto the first image sequence, and the second 3D point cloud is extracted according to the overlap between the 2D projected points and the segmentation mask of the target object.

As an example, when the projected 2D points of a subset of the first 3D point cloud fall on the segmentation mask of the target object, i.e., the projection overlaps the segmentation mask of the target object, that subset is judged to be the sparse point cloud of the target object, i.e., the second 3D point cloud. The other points of the first 3D point cloud then need to be removed to obtain the second 3D point cloud.
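As a single-view illustration of this test (a minimal sketch; the embodiments apply the test across the viewpoint images, and the pinhole intrinsics K are an assumption):

    import numpy as np

    def filter_points_by_mask(points_w, mask, K, R, t):
        """Keep the 3D points whose 2D projections land on the target mask.
        points_w: (N, 3) world points; mask: (H, W) 0/1 segmentation mask."""
        cam = R @ points_w.T + t[:, None]  # world -> camera, shape (3, N)
        in_front = cam[2] > 0              # discard points behind the camera
        uv = K @ (cam / cam[2])            # pinhole projection
        u = np.round(uv[0]).astype(int)
        v = np.round(uv[1]).astype(int)
        h, w = mask.shape
        valid = (u >= 0) & (u < w) & (v >= 0) & (v < h) & in_front
        keep = np.zeros(len(points_w), dtype=bool)
        keep[valid] = mask[v[valid], u[valid]] == 1  # mask uses 1 for object
        return points_w[keep]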
S320: Acquire a 2D view region according to the second 3D point cloud.

In a possible implementation, a 2D convex hull region is acquired according to the second 3D point cloud and the pose information; the 2D convex hull region includes a 2D point set within the outer contour of the target object, and the 2D point set includes the 2D projection points of the second 3D point cloud.

The 2D convex hull region can be understood as the set of pixels within an approximate outermost contour of the target object; the 2D point set enclosed by this outer contour guarantees the completeness of the target object. It can be understood that the outer contour encloses all 2D points of the target object and also some 2D points of background objects.

Specifically, the second 3D point cloud is back-projected onto the 2D viewpoint images according to the pose information, and the 2D convex hull region is determined from the projected points.

Further, edge detection is performed on the 2D convex hull region to acquire a 2D extended point set; the edge detection is used to remove, according to an edge point set, the sparse points of background objects included in the 2D convex hull region, where the 2D extended point set includes the 2D point set and the edge point set. The 2D view region of the target object is acquired according to the 2D extended point set.

It can be understood that the 2D convex hull region is not the exact contour of the target object; therefore, edge detection is performed on the 2D convex hull region to obtain a precise edge point set of the target object and hence the 2D extended point set. It can be understood that the 2D extended point set includes the 2D point set and the edge point set.

It should be noted that the purpose of the edge detection is to determine the precise contour of the target object from the edge point set; therefore, the sparse points of background objects included in the 2D convex hull region need to be removed, i.e., a morphological operation removes the points within the 2D convex hull region that lie outside the edge point set. It can be understood that the 2D convex hull region is thereby further shrunk toward a more precise contour of the target object, yielding the 2D view region of the target object.

S330: Generate a third 3D point cloud according to the 2D view region.

In this embodiment of the present application, dense reconstruction is further performed according to the 2D view region to obtain the third 3D point cloud, which is used to display the three-dimensional image of the target object.

According to the technical solution provided by the embodiments of the present application, in a multi-view 3D reconstruction scenario, the 2D ROI is extracted automatically by obtaining the 3D ROI of the target object based on prior information about the visual axes of the image acquisition device, and the 3D reconstruction of the target object is then completed. This avoids ROI false detection, missed detection, and incompleteness, and efficiently achieves a complete multi-view 3D reconstruction of the target object.
FIG. 4 shows a flowchart of a multi-view three-dimensional reconstruction method provided by an embodiment of the present application. The method 400 shown in FIG. 4 can be applied to the system 100 shown in FIG. 1, and the method 400 includes specific implementation steps of the above method 300.

The method 400 includes six steps, S4010 to S4060. The specific implementation of each step is described in detail below.

Step S4010: Acquire an image sequence.

In this embodiment of the present application, the image sequence may be a set of images obtained by multiple image acquisition devices shooting around the target object; it can be understood that the image set includes multiple images showing the target object from various angles.

It should be understood that the image sequence may be obtained directly from the image acquisition devices, obtained from other devices, or obtained from the cloud or elsewhere; this is not limited in the embodiments of the present application.

It should be noted that, regardless of how the image sequence is obtained, it is captured by multiple image acquisition devices shooting around the object.
Step S4020: Construct the input.

The input information includes a first image sequence 4021, a first instance segmentation mask 4022, a first 3D point cloud 4023, and pose information 4024.

The first image sequence 4021 is the image sequence after de-distortion processing, so that subsequent computations are more accurate.

The first instance segmentation mask 4022 includes the segmentation masks of all objects captured by the image acquisition devices in the 3D reconstruction scene; specifically, it includes the segmentation masks of the target object and of the background objects.

The first 3D point cloud 4023 includes the sparse point clouds of the target object and the background objects.

The pose information 4024 includes the parameter information of the image acquisition devices while shooting around the target object.

For a detailed explanation of the above input information, refer to step S310 of the method 300; details are not repeated here.

In a possible implementation, the constructed input information can be expressed as:
$$\mathrm{Input} = \left\{\, mask_i^j,\ points_{sfm},\ pose_j,\ view\_img_j \,\right\}, \quad j = 1, \ldots, N, \; i = 1, \ldots, M_j$$
where $\mathrm{Input}$ denotes the constructed input information;
$mask_i^j$ denotes the i-th instance segmentation mask in the prediction result of the instance segmentation model on viewpoint image j, where a segmentation mask is represented as a 0/1 matrix, 0 denoting the non-object region and 1 denoting the object region, and the matrix dimensions equal the image resolution; j and i are positive integers;
$points_{sfm}$ denotes the first 3D point cloud reconstructed by the SFM algorithm, i.e., the sparse point cloud;
$pose_j$ denotes the pose information corresponding to viewpoint image j computed by the SFM algorithm; the pose information may consist of 6 parameters, 3 representing the position vector and 3 representing the orientation vector;
$view\_img_j$ denotes viewpoint image j of the first image sequence, i.e., the de-distorted viewpoint image.
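As a purely illustrative sketch of this input structure (all field names are hypothetical, chosen to mirror the symbols above):

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class ReconstructionInput:
        """Constructed input for one capture session (illustrative only)."""
        view_imgs: list          # N de-distorted viewpoint images view_img_j
        masks: list              # per view j, a list of M_j 0/1 instance masks
        points_sfm: np.ndarray   # (P, 3) sparse point cloud from SFM
        poses: np.ndarray        # (N, 6) per-view pose: 3 position + 3 orientation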
Step S4030: Determine the second 3D point cloud according to the input information.

In this embodiment of the present application, determining the second 3D point cloud according to the input information includes the following specific steps.

Step 4031: Determine the 3D ball of interest of the target object according to the above pose information.
In a possible implementation, the 3D ball of interest is fitted using the visual-axis vectors corresponding to the pose information. Specifically, the least-squares method can be used to fit the focal point where the camera visual axes converge, thereby characterizing the 3D ball of interest. The 3D ball of interest fitted by least squares can be expressed as:
$$S(x, y, z, r) = LS\left(\{pose_j\}_{j=1}^{N}\right)$$
where S(x, y, z, r) denotes the 3D ball of interest fitted by the least-squares method, with (x, y, z) as the sphere center and r as the radius; LS(·) denotes the least-squares fitting algorithm; and N denotes the total number of viewpoint images, i.e., the number of viewpoint images included in the first image sequence, where N is a positive integer.
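A minimal sketch of this fit follows; it assumes each pose yields a camera center c_j and a unit viewing direction d_j (an assumption, since the embodiments only specify the 6 pose parameters), finds the least-squares intersection point of the viewing axes by solving a small linear system, and uses the empirical radius mentioned earlier (0.2 times the largest extent of the sparse cloud):

    import numpy as np

    def fit_interest_sphere(centers, directions, points_sfm):
        """Least-squares intersection of the camera viewing axes.
        centers: (N, 3) camera centers; directions: (N, 3) unit view axes."""
        A = np.zeros((3, 3))
        b = np.zeros(3)
        for c, d in zip(centers, directions):
            P = np.eye(3) - np.outer(d, d)  # projector orthogonal to the axis
            A += P
            b += P @ c
        center = np.linalg.solve(A, b)      # point closest to all axes
        # Empirical radius: 0.2 x the largest extent of the sparse cloud.
        extent = points_sfm.max(axis=0) - points_sfm.min(axis=0)
        radius = 0.2 * float(extent.max())
        return center, radius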
Step 4032: Acquire the 2D circle of interest of the target object according to the above pose information and the 3D ball of interest.

In a possible implementation, the 3D ball of interest is back-projected onto the 2D viewpoint images according to the projection matrices corresponding to the pose information, forming the 2D circles of interest.

Step 4033: Remove the segmentation masks of the background objects according to the 2D circle of interest.

In a possible implementation, the segmentation masks of the background objects are removed according to the 2D circle of interest, and the segmentation mask of the target object is determined. The determined segmentation mask of the target object can be expressed as:
$$refined\_mask_j = Refine\left(S,\ \{mask_i^j\}_{i=1}^{M_j}\right)$$
where Refine(·) denotes the refinement function for the segmentation masks, which can be understood as back-projecting the 3D ball of interest onto the 2D viewpoint images to determine the segmentation mask of the target object; $M_j$ denotes the number of instances in the prediction result of the instance segmentation model on viewpoint image j; and $refined\_mask_j$ denotes the determined segmentation mask of the target object on viewpoint image j.
It can be understood that, according to the above function, the masks of the background objects are removed by checking whether the 2D circle of interest and $mask_i^j$ overlap. For example, if the 2D circle of interest obtained by projecting the 3D spherical model onto a 2D viewpoint image overlaps a partial mask in the first instance mask, the mask of that overlapping part can be determined to be the segmentation mask of the target object, and the masks of the remaining, non-overlapping parts are background masks that can be removed.
Step 4034: Determine the second 3D point cloud according to the first 3D point cloud and the segmentation mask of the target object.

In a possible implementation, the first 3D point cloud is back-projected onto the 2D viewpoint images, and the second 3D point cloud is determined by the overlap between the projected points and the segmentation mask of the target object. The second 3D point cloud can be understood as a 3D bounding box, whose determination can be expressed as:
$$PC_{roi} = BB\left(points_{sfm},\ refined\_mask,\ \{pose_j\}_{j=1}^{N}\right)$$
where BB(·) denotes the 3D bounding box computation function; this function back-projects the first 3D point cloud onto the 2D viewpoint images and determines the final second 3D point cloud by judging whether each projected 2D point falls on $refined\_mask$.
Step S4040: Acquire the 2D view region according to the second 3D point cloud.

In this embodiment of the present application, acquiring the 2D view region according to the second 3D point cloud includes the following specific steps.

Step S4041: Extract the 2D convex hull region.

In a possible implementation, the 2D convex hull region is acquired according to the second 3D point cloud and the pose information, which can be expressed as:
$$convex\_hull_j = CH\left(PC_{roi},\ pose_j\right)$$
where CH(·) denotes the convex hull computation function; this function uses the pose parameters $pose_j$ corresponding to view j to back-project the second 3D point cloud $PC_{roi}$ onto the corresponding view, and then computes the convex hull $convex\_hull_j$ based on the 2D projected points.
It can be understood that a 2D point set is obtained through the extraction of the 2D convex hull region; this 2D point set can be understood as all the 2D points (and the corresponding 3D points) within the convex hull region. Specifically, the point set includes all points of the target object and also some points of background objects.
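For illustration, a minimal sketch of this step is given below; it assumes the pixel coordinates of $PC_{roi}$ projected into view j are already available (for example from a projection helper like the one sketched earlier), and uses OpenCV's convexHull as one possible hull routine rather than a prescribed one:

    import cv2
    import numpy as np

    def convex_hull_region(proj_pts, img_shape):
        """Rasterize the convex hull of the projected ROI points as a 0/1 mask.
        proj_pts: (P, 2) pixel coordinates of the second point cloud in view j."""
        hull = cv2.convexHull(proj_pts.astype(np.int32))  # hull vertices
        mask = np.zeros(img_shape[:2], dtype=np.uint8)
        cv2.fillConvexPoly(mask, hull, 1)                 # fill hull interior
        return mask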
Step S4042: Determine the 2D extended point set.

In this embodiment of the present application, edge detection is performed on the 2D convex hull region to obtain an edge point set, from which a relatively precise 2D contour of the target object can be determined.

In a possible implementation, the 2D extended point set can be expressed as:
$$pts\_ext_j = Edge\left(view\_img_j,\ convex\_hull_j,\ pts_j\right)$$
where Edge(·) denotes the 2D interest-point-set expansion function; this function performs edge detection on the region inside the convex hull $convex\_hull_j$ on viewpoint image $view\_img_j$, and takes the union of the edge-detection result and the 2D interest point set $pts_j$ to obtain the extended interest point set $pts\_ext_j$.
It can be understood that the 2D extended point set determined above includes the above 2D point set and the edge point set.
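One way to realize Edge(·) is sketched below; the Canny detector and its thresholds are assumptions for illustration, since the embodiments do not name a specific edge detector:

    import cv2
    import numpy as np

    def extend_point_set(view_img, hull_mask, proj_pts):
        """Union of the projected ROI points and the edges inside the hull."""
        gray = cv2.cvtColor(view_img, cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(gray, 50, 150)       # thresholds are illustrative
        edges[hull_mask == 0] = 0              # keep edges inside the hull only
        ev, eu = np.nonzero(edges)             # edge pixels as (row, col)
        edge_pts = np.stack([eu, ev], axis=1)  # convert to (x, y) convention
        return np.unique(np.vstack([proj_pts.astype(int), edge_pts]), axis=0)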
Step S4043: Acquire the 2D view region by a morphological operation.

In this embodiment of the present application, the purpose of the edge detection is to determine the precise contour of the target object from the edge point set; therefore, the sparse points of background objects included in the 2D convex hull region need to be removed, i.e., a morphological operation removes the points within the 2D convex hull region that lie outside the edge point set.

In a possible implementation, the morphological operation can be expressed as:
$$roi_j = Erosion\left(convex\_hull_j,\ pts\_ext_j\right)$$
where Erosion(·) denotes the 2D ROI extraction function; this function performs an erosion operation from the boundary of the convex hull $convex\_hull_j$ inward to the boundary determined by $pts\_ext_j$, and the resulting $roi_j$ is the 2D ROI on that view.
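A rough sketch of this shrinking step is given below; it assumes the extended point set has been rasterized into a 0/1 image of the same size, and the iterative erode-and-clamp scheme is one illustrative realization of the erosion toward the detected boundary, not a prescribed operator:

    import cv2
    import numpy as np

    def erode_to_roi(hull_mask, ext_mask, max_iter=100):
        """Erode the hull mask inward until it rests on the (thickened)
        boundary formed by the extended point set; masks are 0/1 uint8."""
        kernel = np.ones((3, 3), np.uint8)
        barrier = cv2.dilate(ext_mask, kernel)  # thicken the contour barrier
        roi = hull_mask.copy()
        for _ in range(max_iter):
            eroded = np.maximum(cv2.erode(roi, kernel), barrier)
            if np.array_equal(eroded, roi):     # converged onto the barrier
                break
            roi = eroded
        return roi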
步骤S4050,根据2D视图区域获取第三3D点云。Step S4050, acquiring a third 3D point cloud according to the 2D view area.
一种可能的实施方式中,根据2D视图区域进一步进行稠密重建,获得3D点云,该稠密重建可以采用任意三维重建中稠密重建的方式,本申请实施例对此不作限定。In a possible implementation manner, dense reconstruction is further performed according to the 2D view area to obtain a 3D point cloud. The dense reconstruction may be any dense reconstruction method in 3D reconstruction, which is not limited in this embodiment of the present application.
Step S4060, complete the 3D reconstruction of the target object through view fusion.
In a possible implementation, view fusion is performed on the third 3D point cloud to obtain the final reconstruction result, which is used to display the 3D image of the target object.
According to the technical solution provided by the embodiments of the present application, in a multi-view 3D reconstruction scene, the 2D ROI is extracted automatically by obtaining the 3D ROI of the target object based on the visual-axis prior information of the image acquisition device, and the 3D reconstruction of the target object is then carried out. This avoids false detection, missed detection, and incompleteness of the ROI, and efficiently achieves complete multi-view 3D reconstruction of the target object.
Fig. 5 shows a structural block diagram of an apparatus for multi-view 3D reconstruction provided by an embodiment of the present application. The apparatus 500 for multi-view 3D reconstruction includes: a first determination module 510, a second determination module 520, and a construction module 530.
The first determination module 510 is configured to determine the second 3D point cloud according to the first image sequence of the target object, the first instance mask, the first 3D point cloud, and the pose information of the image acquisition device.
In a possible implementation, the first determination module 510 determines a 3D spherical model of the target object according to the pose information; obtains a 2D circular image of the target object according to the 3D spherical model; removes the segmentation mask of the background object according to the 2D circular image to determine the segmentation mask of the target object; and determines the second 3D point cloud according to the first 3D point cloud and the segmentation mask of the target object.
In a possible implementation, the first determination module 510 fits the center and radius of the 3D spherical model from the camera pose information using the least squares method, thereby determining the 3D spherical model.
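The least-squares sphere fit can be linearized, since |x - c|² = r² is linear in (c, r² - |c|²). A minimal sketch, assuming the camera optical centers have already been recovered from the pose information:

```python
import numpy as np

def fit_sphere(centers):
    """Least-squares sphere fit to the camera optical centers.

    Uses the linearization |x|^2 = 2 c . x + k with k = r^2 - |c|^2,
    solved as A p = b for p = (c_x, c_y, c_z, k).

    centers: (N, 3) array of camera centers, N >= 4.
    """
    A = np.hstack([2.0 * centers, np.ones((len(centers), 1))])
    b = (centers ** 2).sum(axis=1)
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = p[:3]
    radius = float(np.sqrt(p[3] + center @ center))
    return center, radius
```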
In a possible implementation, the first determination module 510 determines the segmentation mask of the target object according to the overlap between the 2D circular image and the first instance mask.
As an optional example, when the 2D circular image overlaps with a partial mask included in the first instance mask, the overlapping partial mask is determined to be the segmentation mask of the target object and the non-overlapping masks are removed; alternatively, when a partial mask included in the first instance mask does not overlap with the 2D circular image, the non-overlapping partial mask is removed and the remaining segmentation mask is determined to be the segmentation mask of the target object, the non-overlapping partial mask being the segmentation mask of the background object.
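A minimal sketch of this overlap test, assuming binary instance masks and a rasterized circle mask; the simple pixel-intersection criterion is an illustrative choice:

```python
import numpy as np

def select_target_masks(circle_mask, instance_masks):
    """Split instance masks into target and background by testing pixel
    overlap with the 2D circular image of the 3D spherical model.

    circle_mask:    (H, W) binary mask of the projected circle.
    instance_masks: list of (H, W) binary masks from instance segmentation.
    """
    target, background = [], []
    for m in instance_masks:
        if np.any((circle_mask > 0) & (m > 0)):
            target.append(m)        # overlapping -> part of the target object
        else:
            background.append(m)    # non-overlapping -> background, removed
    return target, background
```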
In a possible implementation, the first determination module 510 is configured to project the first 3D point cloud onto a 2D view image, and determine the second 3D point cloud according to the overlap between the 2D view image and the segmentation mask of the target object.
As an optional example, when the 2D view image of a part of the first 3D point cloud overlaps with the segmentation mask of the target object, the overlapping part of the first 3D point cloud is determined to be the second 3D point cloud, and the remaining points of the first 3D point cloud that do not overlap with the segmentation mask of the target object are removed.
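This projection-and-filter step can be sketched as follows, under the same assumed pose and intrinsics conventions as above:

```python
import numpy as np

def filter_points_by_mask(points, R, t, K, target_mask):
    """Project the first (sparse) 3D point cloud into a view and keep only
    the points whose projections fall inside the target object's
    segmentation mask; all other points are treated as background.
    """
    cam = points @ R.T + t                     # world -> camera frame
    valid = cam[:, 2] > 1e-6                   # points in front of the camera
    uv = cam[valid] @ K.T
    uv = (uv[:, :2] / uv[:, 2:3]).astype(int)  # pixel coordinates (x, y)
    h, w = target_mask.shape
    inb = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    keep = np.zeros(len(points), dtype=bool)
    idx = np.flatnonzero(valid)[inb]
    keep[idx] = target_mask[uv[inb, 1], uv[inb, 0]] > 0
    return points[keep]                        # second 3D point cloud
```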
The second determination module 520 is configured to obtain the 2D view region according to the second 3D point cloud.
In a possible implementation, the second determination module 520 obtains a 2D convex hull region according to the second 3D point cloud and the pose information, where the 2D convex hull region includes a 2D point set within the outer contour of the target object and the 2D point set includes the 2D projected points of the second 3D point cloud; performs edge detection on the 2D convex hull region to obtain a 2D extended point set, where the edge detection is used to remove, according to an edge point set, the sparse point cloud of background objects included in the 2D convex hull region, and the 2D extended point set includes the 2D point set and the edge point set; and obtains the 2D view region of the target object according to the 2D extended point set.
The construction module 530 is configured to generate the third 3D point cloud according to the 2D view region.
Those of ordinary skill in the art may appreciate that the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementations should not be regarded as exceeding the scope of the present application.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division into units is only a division by logical function, and there may be other divisions in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are merely specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any changes or substitutions that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (19)

  1. A method for multi-view three-dimensional reconstruction, characterized in that the method comprises:
    determining a second three-dimensional point cloud according to a first image sequence of a target object, a first instance mask, a first three-dimensional point cloud, and pose information of an image acquisition device, wherein the first image sequence comprises a plurality of de-distorted images obtained by shooting around the target object, the first instance mask comprises a segmentation mask of the target object and a segmentation mask of a background object in the first image sequence, the first three-dimensional point cloud comprises a sparse point cloud of the target object and a sparse point cloud of the background object in the first image sequence, the pose information comprises parameter information of the image acquisition device when shooting around the target object, and the second three-dimensional point cloud comprises the sparse point cloud of the target object;
    acquiring a 2D view region according to the second three-dimensional point cloud, wherein the 2D view region comprises a region of interest of the target object;
    generating a third three-dimensional point cloud according to the 2D view region, wherein the third three-dimensional point cloud comprises a dense three-dimensional point cloud of the target object, and the dense three-dimensional point cloud is used to display the target object.
  2. The method according to claim 1, characterized in that determining the second three-dimensional point cloud according to the first image sequence of the target object, the first instance mask, the first three-dimensional point cloud, and the pose information of the image acquisition device comprises:
    determining a 3D spherical model of the target object according to the pose information;
    acquiring a 2D circular image of the target object according to the 3D spherical model;
    removing the segmentation mask of the background object according to the 2D circular image, and determining the segmentation mask of the target object;
    determining the second three-dimensional point cloud according to the first three-dimensional point cloud and the segmentation mask of the target object.
  3. The method according to claim 2, characterized in that determining the 3D spherical model of the target object according to the pose information comprises:
    fitting a center and a radius of the 3D spherical model according to the camera pose information by using a least squares method.
  4. The method according to claim 2, characterized in that removing the segmentation mask of the background object according to the 2D circular image and determining the segmentation mask of the target object comprises:
    determining the segmentation mask of the target object according to the overlap between the 2D circular image and the first instance mask, comprising:
    when the 2D circular image overlaps with a partial mask included in the first instance mask, determining the overlapping partial mask to be the segmentation mask of the target object, and removing the non-overlapping masks; or
    when a partial mask included in the first instance mask does not overlap with the 2D circular image, removing the non-overlapping partial mask, and determining the remaining segmentation mask to be the segmentation mask of the target object, wherein the non-overlapping partial mask is the segmentation mask of the background object.
  5. The method according to claim 2, characterized in that determining the second three-dimensional point cloud according to the first three-dimensional point cloud and the segmentation mask of the target object comprises:
    projecting the first three-dimensional point cloud onto a 2D view image, and determining the second three-dimensional point cloud according to the overlap between the 2D view image and the segmentation mask of the target object, comprising:
    when the 2D view image of a part of the first three-dimensional point cloud overlaps with the segmentation mask of the target object, determining the overlapping part of the first three-dimensional point cloud to be the second three-dimensional point cloud, and removing the remaining points of the first three-dimensional point cloud that do not overlap with the segmentation mask of the target object.
  6. The method according to claim 1, characterized in that acquiring the 2D view region according to the second three-dimensional point cloud comprises:
    acquiring a 2D convex hull region according to the second three-dimensional point cloud and the pose information, wherein the 2D convex hull region comprises a 2D point set within an outer contour of the target object, and the 2D point set comprises 2D projected points of the second three-dimensional point cloud;
    performing edge detection on the 2D convex hull region to acquire a 2D extended point set, wherein the edge detection is used to remove, according to an edge point set, the sparse point cloud of the background object included in the 2D convex hull region, and the 2D extended point set comprises the 2D point set and the edge point set;
    acquiring the 2D view region of the target object according to the 2D extended point set.
  7. The method according to any one of claims 1 to 6, characterized in that the parameter information of the image acquisition device when shooting around the target object comprises degree-of-freedom parameters of the image acquisition device when moving relative to the target object.
  8. The method according to any one of claims 1 to 7, characterized in that there are a plurality of the image acquisition devices.
  9. An apparatus for multi-view three-dimensional reconstruction, characterized in that the apparatus comprises:
    a first determination module, configured to determine a second three-dimensional point cloud according to a first image sequence of a target object, a first instance mask, a first three-dimensional point cloud, and pose information of an image acquisition device, wherein the first image sequence comprises a plurality of de-distorted images obtained by shooting around the target object, the first instance mask comprises a segmentation mask of the target object and a segmentation mask of a background object in the first image sequence, the first three-dimensional point cloud comprises a sparse point cloud of the target object and a sparse point cloud of the background object in the first image sequence, the pose information comprises parameter information of the image acquisition device when shooting around the target object, and the second three-dimensional point cloud comprises the sparse point cloud of the target object;
    a second determination module, configured to acquire a 2D view region according to the second three-dimensional point cloud, wherein the 2D view region comprises a region of interest of the target object;
    a construction module, configured to generate a third three-dimensional point cloud according to the 2D view region, wherein the third three-dimensional point cloud comprises a dense three-dimensional point cloud of the target object, and the dense three-dimensional point cloud is used to display the target object.
  10. The apparatus for multi-view three-dimensional reconstruction according to claim 9, characterized in that the first determination module is specifically configured to: determine a 3D spherical model of the target object according to the pose information; acquire a 2D circular image of the target object according to the 3D spherical model; remove the segmentation mask of the background object according to the 2D circular image, and determine the segmentation mask of the target object; and determine the second three-dimensional point cloud according to the first three-dimensional point cloud and the segmentation mask of the target object.
  11. The apparatus for multi-view three-dimensional reconstruction according to claim 10, characterized in that the first determination module is specifically configured to fit a center and a radius of the 3D spherical model according to the camera pose information by using a least squares method.
  12. The apparatus for multi-view three-dimensional reconstruction according to claim 10, characterized in that the first determination module is specifically configured to determine the segmentation mask of the target object according to the overlap between the 2D circular image and the first instance mask, comprising: when the 2D circular image overlaps with a partial mask included in the first instance mask, determining the overlapping partial mask to be the segmentation mask of the target object, and removing the non-overlapping masks; or when a partial mask included in the first instance mask does not overlap with the 2D circular image, removing the non-overlapping partial mask, and determining the remaining segmentation mask to be the segmentation mask of the target object, wherein the non-overlapping partial mask is the segmentation mask of the background object.
  13. The apparatus for multi-view three-dimensional reconstruction according to claim 10, characterized in that the first determination module is specifically configured to project the first three-dimensional point cloud onto a 2D view image, and determine the second three-dimensional point cloud according to the overlap between the 2D view image and the segmentation mask of the target object, comprising: when the 2D view image of a part of the first three-dimensional point cloud overlaps with the segmentation mask of the target object, determining the overlapping part of the first three-dimensional point cloud to be the second three-dimensional point cloud, and removing the remaining points of the first three-dimensional point cloud that do not overlap with the segmentation mask of the target object.
  14. The apparatus for multi-view three-dimensional reconstruction according to claim 9, characterized in that the second determination module is specifically configured to: acquire a 2D convex hull region according to the second three-dimensional point cloud and the pose information, wherein the 2D convex hull region comprises a 2D point set within an outer contour of the target object, and the 2D point set comprises 2D projected points of the second three-dimensional point cloud; perform edge detection on the 2D convex hull region to acquire a 2D extended point set, wherein the edge detection is used to remove, according to an edge point set, the sparse point cloud of the background object included in the 2D convex hull region, and the 2D extended point set comprises the 2D point set and the edge point set; and acquire the 2D view region of the target object according to the 2D extended point set.
  15. The apparatus for multi-view three-dimensional reconstruction according to any one of claims 9 to 14, characterized in that the parameter information of the image acquisition device when shooting around the target object comprises degree-of-freedom parameters of the image acquisition device when moving relative to the target object.
  16. The apparatus for multi-view three-dimensional reconstruction according to any one of claims 9 to 15, characterized in that there are a plurality of the image acquisition devices.
  17. An electronic device, characterized in that the electronic device comprises:
    a processor and a memory, wherein the memory is configured to store program instructions, and the processor is configured to invoke the program instructions to perform the method according to any one of claims 1 to 8.
  18. A computer-readable storage medium, characterized in that the computer-readable storage medium stores program code for execution by a device, and the program code comprises instructions for performing the method according to any one of claims 1 to 8.
  19. A chip, characterized in that the chip comprises a processor and a data interface, wherein the processor reads, through the data interface, instructions stored in a memory to perform the method according to any one of claims 1 to 8.
PCT/CN2022/133598 2021-11-25 2022-11-23 Multi-view three-dimensional reconstruction method WO2023093739A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111414740.6A CN116168143A (en) 2021-11-25 2021-11-25 Multi-view three-dimensional reconstruction method
CN202111414740.6 2021-11-25

Publications (1)

Publication Number Publication Date
WO2023093739A1 true WO2023093739A1 (en) 2023-06-01

Family

ID=86420653

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/133598 WO2023093739A1 (en) 2021-11-25 2022-11-23 Multi-view three-dimensional reconstruction method

Country Status (2)

Country Link
CN (1) CN116168143A (en)
WO (1) WO2023093739A1 (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111145238A (en) * 2019-12-12 2020-05-12 中国科学院深圳先进技术研究院 Three-dimensional reconstruction method and device of monocular endoscope image and terminal equipment
CN113129329A (en) * 2019-12-31 2021-07-16 中移智行网络科技有限公司 Method and device for constructing dense point cloud based on base station target segmentation
US20210350616A1 (en) * 2020-05-07 2021-11-11 Toyota Research Institute, Inc. System and method for estimating depth uncertainty for self-supervised 3d reconstruction
CN113192206A (en) * 2021-04-28 2021-07-30 华南理工大学 Three-dimensional model real-time reconstruction method and device based on target detection and background removal

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117173463A (en) * 2023-08-30 2023-12-05 北京长木谷医疗科技股份有限公司 Bone joint model reconstruction method and device based on multi-classification sparse point cloud
CN116993923A (en) * 2023-09-22 2023-11-03 长沙能川信息科技有限公司 Three-dimensional model making method, system, computer equipment and storage medium for converter station
CN116993923B (en) * 2023-09-22 2023-12-26 长沙能川信息科技有限公司 Three-dimensional model making method, system, computer equipment and storage medium for converter station
CN117274512A (en) * 2023-11-23 2023-12-22 岭南现代农业科学与技术广东省实验室河源分中心 Plant multi-view image processing method and system
CN117274512B (en) * 2023-11-23 2024-04-26 岭南现代农业科学与技术广东省实验室河源分中心 Plant multi-view image processing method and system

Also Published As

Publication number Publication date
CN116168143A (en) 2023-05-26

Similar Documents

Publication Publication Date Title
CN111243093B (en) Three-dimensional face grid generation method, device, equipment and storage medium
CN108509848B (en) The real-time detection method and system of three-dimension object
WO2023093739A1 (en) Multi-view three-dimensional reconstruction method
CN107223269B (en) Three-dimensional scene positioning method and device
CN109003325B (en) Three-dimensional reconstruction method, medium, device and computing equipment
JP7453470B2 (en) 3D reconstruction and related interactions, measurement methods and related devices and equipment
WO2020001168A1 (en) Three-dimensional reconstruction method, apparatus, and device, and storage medium
US20200058153A1 (en) Methods and Devices for Acquiring 3D Face, and Computer Readable Storage Media
CN111710036B (en) Method, device, equipment and storage medium for constructing three-dimensional face model
WO2020034785A1 (en) Method and device for processing three-dimensional model
WO2019196745A1 (en) Face modelling method and related product
CN113012293A (en) Stone carving model construction method, device, equipment and storage medium
CN110276774B (en) Object drawing method, device, terminal and computer-readable storage medium
CN112651881B (en) Image synthesizing method, apparatus, device, storage medium, and program product
CN115439607A (en) Three-dimensional reconstruction method and device, electronic equipment and storage medium
CN113361365B (en) Positioning method, positioning device, positioning equipment and storage medium
CN111161398A (en) Image generation method, device, equipment and storage medium
WO2023116430A1 (en) Video and city information model three-dimensional scene fusion method and system, and storage medium
CN113643414A (en) Three-dimensional image generation method and device, electronic equipment and storage medium
JP6347610B2 (en) Image processing apparatus and three-dimensional spatial information acquisition method
WO2019042028A1 (en) All-around spherical light field rendering method
JP6086491B2 (en) Image processing apparatus and database construction apparatus thereof
CN112562067A (en) Method for generating large-batch point cloud data sets
CN111652807B (en) Eye adjusting and live broadcasting method and device, electronic equipment and storage medium
KR20220026423A (en) Method and apparatus for three dimesiontal reconstruction of planes perpendicular to ground

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22897822

Country of ref document: EP

Kind code of ref document: A1