WO2022147976A1 - Three-dimensional reconstruction and related interaction and measurement methods, and related apparatus and devices


Info

Publication number: WO2022147976A1
Application number: PCT/CN2021/102882
Authority: WIPO (PCT)
Prior art keywords: data set, image, dimensional, preset, data
Other languages: English (en), French (fr)
Inventors: 项骁骏, 齐勇, 章国锋, 鲍虎军, 余亦豪, 姜翰青
Original Assignee: 浙江商汤科技开发有限公司
Application filed by 浙江商汤科技开发有限公司
Priority to KR1020237025998A (KR20230127313A)
Priority to JP2023513719A (JP7453470B2)
Publication of WO2022147976A1

Classifications

    • G06T 17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 7/55: Depth or shape recovery from multiple images
    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06T 2200/08: Indexing scheme involving all processing steps from image acquisition to 3D model generation
    • G06T 2207/10016: Video; image sequence

Definitions

  • The present disclosure relates to the technical field of computer vision, and in particular to a three-dimensional reconstruction method, related interaction and measurement methods, and related devices and equipment.
  • With the popularization of mobile terminals such as mobile phones and tablet computers with integrated camera devices, it has become increasingly common to perform 3D reconstruction of objects in real scenes, so that the 3D model obtained by reconstruction can be used to implement applications such as Augmented Reality (AR) and games on mobile terminals.
  • the present disclosure provides a three-dimensional reconstruction method and related devices and equipment.
  • A first aspect of the present disclosure provides a three-dimensional reconstruction method, including: acquiring multiple frames of images to be processed obtained by scanning a target to be reconstructed with a camera device; using each frame of the image to be processed and the calibration parameters of the camera device to determine the target pixels belonging to the target to be reconstructed in each frame and the corresponding camera pose parameters; dividing the image data of each frame of the image to be processed into corresponding data sets in turn according to a preset division strategy, where the image data at least includes the target pixels; sequentially using the image data of each data set, together with the image data and pose optimization parameters of the data sets preceding it in time sequence, to determine the pose optimization parameters of each data set; using the pose optimization parameters of each data set to adjust the camera pose parameters of the images to be processed to which the image data contained in the data set belongs; and reconstructing the image data of the images to be processed with a preset three-dimensional reconstruction method and the adjusted camera pose parameters, to obtain a three-dimensional model of the target to be reconstructed.
  • In this way, the images to be processed obtained by scanning the target to be reconstructed and the calibration parameters of the camera device are used to determine the target pixels and camera pose parameters of each frame, and the image data of each frame is divided into corresponding data sets according to the preset division strategy. The pose optimization parameters of each data set are determined in turn from its own image data together with the image data and pose optimization parameters of the data sets preceding it in time sequence, so the pose optimization parameters of each data set build on those of the previous data sets. Using the pose optimization parameters of each data set to adjust the camera pose parameters of the images to which its image data belongs, and then reconstructing the image data with the preset three-dimensional reconstruction method and the adjusted camera pose parameters, effectively improves the quality of the resulting 3D model of the target to be reconstructed; moreover, eliminating camera pose errors at the granularity of data sets reduces the amount of calculation, which helps reduce the computational load.
  • A second aspect of the present disclosure provides an interaction method based on three-dimensional reconstruction, including: acquiring a three-dimensional model of a target to be reconstructed, where the three-dimensional model is obtained by the three-dimensional reconstruction method of the first aspect; constructing, with a preset visual-inertial navigation method, a three-dimensional map of the scene where the camera device is located, and obtaining the current pose information of the camera device in the three-dimensional map; and displaying, based on the pose information, the three-dimensional model in the scene image currently captured by the camera device.
  • In this way, displaying the 3D model of the target to be reconstructed in the currently captured scene image achieves geometrically consistent fusion of the virtual object and the real scene; and because the 3D model is obtained by the 3D reconstruction method of the first aspect, the reconstruction quality is improved, which in turn improves the effect of the geometrically consistent virtual-real fusion and benefits the user experience.
  • A third aspect of the present disclosure provides a measurement method based on three-dimensional reconstruction, including: acquiring a three-dimensional model of a target to be reconstructed, where the three-dimensional model is obtained by the three-dimensional reconstruction method of the first aspect; receiving multiple measurement points set by the user on the three-dimensional model; and acquiring the distances between the multiple measurement points, thereby obtaining the distances between the corresponding positions on the target to be reconstructed.
  • In this way, the distances between the measurement points are obtained, and with them the distances between the corresponding positions on the target to be reconstructed, which satisfies the need to measure objects in real scenes; and since the 3D model is obtained by the 3D reconstruction method of the first aspect, the reconstruction quality, and hence the measurement accuracy, is improved.
  • A fourth aspect of the present disclosure provides a three-dimensional reconstruction device, including an image acquisition module, a first determination module, a data division module, a second determination module, a parameter adjustment module, and a model reconstruction module. The image acquisition module is used to acquire multiple frames of images to be processed obtained by scanning the target to be reconstructed with a camera device; the first determination module is used to determine, using each frame of the image to be processed and the calibration parameters of the camera device, the target pixels belonging to the target to be reconstructed in each frame and the corresponding camera pose parameters; the data division module is used to divide the image data of each frame into corresponding data sets in turn according to the preset division strategy, where the image data at least includes the target pixels; the second determination module is used to determine the pose optimization parameters of each data set by sequentially using the image data of each data set together with the image data and pose optimization parameters of the data sets preceding it in time sequence; the parameter adjustment module is used to adjust, with the pose optimization parameters of each data set, the camera pose parameters of the images to be processed to which the image data contained in the data set belongs; and the model reconstruction module is used to reconstruct the image data of the images to be processed with the preset three-dimensional reconstruction method and the adjusted camera pose parameters, to obtain the three-dimensional model of the target to be reconstructed.
  • A fifth aspect of the present disclosure provides an interaction device based on three-dimensional reconstruction, including a model acquisition module, a mapping and positioning module, and a display interaction module. The model acquisition module is used to acquire a three-dimensional model of a target to be reconstructed, where the three-dimensional model is obtained by the three-dimensional reconstruction device of the fourth aspect; the mapping and positioning module is used to construct a three-dimensional map of the scene where the camera device is located with a preset visual-inertial navigation method, and to obtain the current pose information of the camera device in the three-dimensional map; and the display interaction module is used to display, based on the pose information, the 3D model in the scene image currently captured by the camera device.
  • A sixth aspect of the present disclosure provides a measurement device based on three-dimensional reconstruction, including a model acquisition module, a display interaction module, and a distance acquisition module. The model acquisition module is used to acquire a three-dimensional model of a target to be reconstructed, where the three-dimensional model is obtained by the three-dimensional reconstruction device of the fourth aspect; the display interaction module is used to receive multiple measurement points set by the user on the three-dimensional model; and the distance acquisition module is used to acquire the distances between the multiple measurement points, thereby obtaining the distances between the corresponding positions on the target to be reconstructed.
  • A seventh aspect of the present disclosure provides an electronic device, including a memory and a processor coupled to each other, where the processor is configured to execute program instructions stored in the memory to implement the three-dimensional reconstruction method of the first aspect, the interaction method based on three-dimensional reconstruction of the second aspect, or the measurement method based on three-dimensional reconstruction of the third aspect.
  • An eighth aspect of the present disclosure provides a computer-readable storage medium on which program instructions are stored; when executed by a processor, the program instructions implement the three-dimensional reconstruction method of the first aspect, the interaction method based on three-dimensional reconstruction of the second aspect, or the measurement method based on three-dimensional reconstruction of the third aspect.
  • A ninth aspect of the present disclosure provides a computer program, including computer-readable codes which, when run in an electronic device and executed by a processor of the electronic device, implement the method of the first aspect above.
  • A tenth aspect of the present disclosure provides a computer program product which, when run on a computer, causes the computer to execute the three-dimensional reconstruction method of the first aspect, the interaction method based on three-dimensional reconstruction of the second aspect, or the measurement method based on three-dimensional reconstruction of the third aspect.
  • In the above solutions, the pose optimization parameters of each data set are determined on the basis of those of the previous data sets and are used to adjust the camera pose parameters of the images to be processed to which the data set's image data belongs; reconstructing the image data with the preset three-dimensional reconstruction method and the adjusted camera pose parameters effectively improves the resulting 3D model of the target to be reconstructed, and eliminating camera pose errors at the granularity of data sets reduces the amount of calculation, helping to reduce the computational load.
  • FIG. 1 is a schematic flowchart of an embodiment of the three-dimensional reconstruction method of the present disclosure;
  • FIG. 2 is a schematic state diagram of an embodiment of the three-dimensional reconstruction method of the present disclosure;
  • FIG. 3 is a schematic flowchart of an embodiment of step S12 in FIG. 1;
  • FIG. 4 is a schematic flowchart of an embodiment of step S13 in FIG. 1;
  • FIG. 5 is a schematic flowchart of an embodiment of step S14 in FIG. 1;
  • FIG. 6 is a schematic flowchart of an embodiment of step S141 in FIG. 5;
  • FIG. 7 is a schematic flowchart of an embodiment of step S142 in FIG. 5;
  • FIG. 8 is a schematic flowchart of an embodiment of step S143 in FIG. 5;
  • FIG. 9 is a schematic flowchart of an embodiment of the interaction method based on three-dimensional reconstruction of the present disclosure;
  • FIG. 10 is a schematic flowchart of an embodiment of the measurement method based on three-dimensional reconstruction of the present disclosure;
  • FIG. 11 is a schematic framework diagram of an embodiment of the three-dimensional reconstruction apparatus of the present disclosure;
  • FIG. 12 is a schematic framework diagram of an embodiment of the interaction device based on three-dimensional reconstruction of the present disclosure;
  • FIG. 13 is a schematic framework diagram of an embodiment of the measurement device based on three-dimensional reconstruction of the present disclosure;
  • FIG. 14 is a schematic framework diagram of an embodiment of the electronic device of the present disclosure;
  • FIG. 15 is a schematic framework diagram of an embodiment of the computer-readable storage medium of the present disclosure.
  • The terms "system" and "network" are often used interchangeably herein. The term "at least one of" merely describes an association relationship among the related objects and covers three cases: for example, "at least one of A and B" can mean A alone, both A and B, or B alone. The character "/" herein generally indicates that the related objects are in an "or" relationship. "Multiple" herein means two or more.
  • 3D reconstruction is an important problem in the field of computer vision and augmented reality, and it plays an important role in applications such as augmented reality on mobile platforms, games, and 3D printing.
  • If AR effects for real objects, such as skeleton-driven animation, are to be realized on mobile platforms, users usually need to reconstruct the real objects in 3D quickly. Therefore, 3D object scanning and reconstruction technology is widely needed in the field of mobile augmented reality.
  • To this end, the present disclosure proposes a three-dimensional reconstruction method, related interaction and measurement methods, and related devices and equipment: acquiring multiple frames of images to be processed obtained by scanning a target to be reconstructed with a camera device; determining the target pixels belonging to the target to be reconstructed in each frame and the corresponding camera pose parameters; dividing the image data of each frame into corresponding data sets in turn; determining the pose optimization parameters of each data set from its image data together with the image data and pose optimization parameters of the data sets preceding it in time sequence; adjusting, with the pose optimization parameters of each data set, the camera pose parameters of the images to be processed to which its image data belongs; and reconstructing the image data of the images to be processed to obtain a three-dimensional model of the target to be reconstructed.
  • Since the pose optimization parameters of each data set are determined on the basis of those of the previous data sets, using them to adjust the camera pose parameters of the images to which the data set's image data belongs, and then reconstructing the image data with the preset three-dimensional reconstruction method and the adjusted camera pose parameters, effectively improves the resulting 3D model; eliminating camera pose errors at the granularity of data sets also reduces the amount of calculation and thus the computational load.
  • The execution subject of the three-dimensional reconstruction method and of the interaction and measurement methods based on three-dimensional reconstruction may be an electronic device, such as a smart phone, a desktop computer, a tablet computer, a notebook computer, a smart speaker, a digital assistant, an augmented reality (AR) or virtual reality (VR) device, a smart wearable device, or another type of physical device; it may also be software running on a physical device, such as an application or a browser. The operating system running on the physical device may include, but is not limited to, Android, iOS, Linux, Windows, and the like.
  • FIG. 1 is a schematic flowchart of an embodiment of the three-dimensional reconstruction method of the present disclosure, which may include the following steps:
  • Step S11: Acquire multiple frames of images to be processed obtained by scanning the target to be reconstructed with the camera device.
  • the camera device may include, but is not limited to, mobile terminals such as mobile phones and tablet computers.
  • the steps in the method embodiments of the present disclosure may be performed by a mobile terminal, or may be performed by a processing device such as a microcomputer connected to a camera device with a scanning and shooting function.
  • The imaging device may include a color camera capable of sensing visible light and a depth camera capable of sensing the depth of the target to be reconstructed, such as a structured-light depth camera.
  • Targets to be reconstructed may include, but are not limited to: people, animals, and objects (such as statues, furniture, etc.). For example, a 3D model of a statue can be obtained by scanning the statue, and the 3D model can be further rendered and skeleton-bound.
  • the target to be reconstructed may be determined according to actual application requirements, and is not limited here.
  • Step S12: Using each frame of the image to be processed and the calibration parameters of the imaging device, determine the target pixels belonging to the target to be reconstructed in each frame and the corresponding camera pose parameters.
  • the calibration parameters may include internal parameters of the imaging device.
  • When the imaging device includes a color camera, the calibration parameters may include the internal parameters of the color camera; when the imaging device includes a depth camera, or both a color camera and a depth camera, the rest can be deduced by analogy, and no more examples are given here.
  • The internal parameters may include, but are not limited to, the camera focal length and the camera principal point coordinates, and may be represented in matrix form. For example, the internal parameter matrix K of the color camera may be represented as:

    K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}

    where f_x and f_y represent the focal lengths of the color camera, and c_x and c_y represent its principal point coordinates. The internal parameters of the depth camera can be expressed in the same way, and no further examples are given here.
  • In an implementation scenario, the calibration parameters may also include external parameters between the depth camera and the color camera of the imaging device, which represent the transformation from the world coordinate system to the camera coordinate system. The external parameters may include a 3×3 rotation matrix R and a 3×1 translation matrix T; left-multiplying a coordinate point P_world in the world coordinate system by the rotation matrix R and adding the translation matrix T yields the corresponding coordinate point P_camera in the camera coordinate system:

    P_{camera} = R \cdot P_{world} + T
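  • As an illustrative aid (not part of the original disclosure), the following Python sketch applies the intrinsic matrix K and the external parameters R and T described above to transform a world point into camera coordinates and project it to a pixel; all numeric values are assumptions:

```python
import numpy as np

# Hypothetical intrinsic matrix K (f_x, f_y: focal lengths; c_x, c_y: principal point).
K = np.array([[520.0,   0.0, 320.0],
              [  0.0, 520.0, 240.0],
              [  0.0,   0.0,   1.0]])

R = np.eye(3)                  # 3x3 rotation matrix (world -> camera)
T = np.array([0.0, 0.0, 1.5])  # 3x1 translation

P_world = np.array([0.1, -0.2, 2.0])

# P_camera = R . P_world + T, as in the formula above.
P_camera = R @ P_world + T

# Pinhole projection with K, then normalization by the depth component.
p = K @ P_camera
u, v = p[0] / p[2], p[1] / p[2]
print(u, v)
```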
  • In an implementation scenario, a pre-trained image segmentation model (for example, a UNet model) can be used to segment the image to be processed and obtain the target pixels belonging to the target to be reconstructed. In another implementation scenario, the target to be reconstructed can instead be placed in an environment with a large color difference from it; for example, when the target to be reconstructed is a milky-white gypsum statue, it can be placed in a black environment for scanning. Pixels belonging to the environment color in the image to be processed are then marked as invalid, pixels belonging to the color of the target to be reconstructed are marked as valid, the sizes of the connected domains formed by the valid pixels are compared, and the pixels in the largest connected domain are taken as the pixels of the target to be reconstructed.
  • In order to obtain a complete 3D model of the target to be reconstructed, the camera device needs to scan the target in different poses, so the camera pose parameters of different images to be processed may differ. To eliminate camera pose errors and improve subsequent 3D reconstruction, the camera pose parameters of each frame must first be determined. To this end, the target pixels of each frame of the image to be processed and those of its preceding frame, together with the internal parameter K of the camera device, can be used to construct an objective function of the relative pose parameter ΔT, and the ICP (Iterative Closest Point) algorithm can be used to minimize it, yielding the relative pose parameter ΔT of the camera pose parameter T_t of the current frame relative to the camera pose parameter T_{t-1} of the preceding frame.
  • The objective function of the relative pose parameter ΔT can be written as:

    E_{icp} = \sum_i \omega_i \left( D_t(p_i) - w(\xi, p_i)_z \right)^2

    where ω_i is the weight, D_t(p_i) is the depth value of pixel p_i after projecting the depth data onto the color data, and w(ξ, p_i) denotes the theoretical corresponding point in three-dimensional space after transforming the pixel p_i of the current frame to its previous frame using the relative pose parameter ΔT and the internal parameter K, with w(ξ, p_i)_z being its z coordinate. When the squared error E_geo between the measured depth and the z coordinate w(ξ, p_i)_z is small, minimizing the objective function E_icp accurately recovers the relative pose parameter ΔT, which improves the accuracy of the camera pose parameters.
  • the relative pose parameter ⁇ T After obtaining the relative pose parameter ⁇ T between the camera pose parameter T t of each frame of the image to be processed relative to the camera pose parameter T t -1 of the previous frame of the image to be processed, the relative pose parameter ⁇ The inverse of T (ie ⁇ T -1 ) is left-multiplied by the camera pose parameter T t-1 of the image to be processed in the previous frame to obtain the camera pose parameter T t of the image to be processed in the current frame.
  • its camera pose parameters can be initialized as a unit matrix.
  • the unit matrix is the main pair of A square matrix in which all elements on the corner are 1 and all other elements are 0.
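  • A minimal Python sketch (an assumption for illustration) of the pose chaining just described, with the first frame initialized to the identity matrix:

```python
import numpy as np

def chain_pose(T_prev: np.ndarray, delta_T: np.ndarray) -> np.ndarray:
    # T_t = delta_T^{-1} @ T_{t-1}, as described above.
    return np.linalg.inv(delta_T) @ T_prev

poses = [np.eye(4)]                       # first frame: 4x4 identity matrix
relative_poses = [np.eye(4), np.eye(4)]   # placeholder for per-frame ICP results
for delta_T in relative_poses:
    poses.append(chain_pose(poses[-1], delta_T))
```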
  • In an implementation scenario, the scanning of images to be processed and the determination of target pixels and camera pose parameters can be performed concurrently: once a frame has been scanned, it is used to determine the target pixels and camera pose parameters while the next frame is being scanned, so that the 3D reconstruction of the target can proceed online and in real time.
  • Step S13: According to a preset division strategy, divide the image data of each frame of the image to be processed into a corresponding data set in turn, where the image data at least includes the target pixels.
  • In an implementation scenario, a maximum number of frames (for example, 8, 9, or 10 frames) of images to be processed whose image data each data set can accommodate may be set; when the number of frames in the current data set reaches the maximum, a new data set is created and the undivided image data of subsequent images to be processed is divided into it, and this cycle continues until the scan is complete.
  • In another implementation scenario, the image data of images to be processed with similar poses can also be divided into the same data set, which is not limited here.
  • In yet another implementation scenario, the pose difference between the image to be processed to which the image data belongs and the previous frame (for example, whether the camera orientation angle difference or the camera position distance exceeds a threshold) can be determined; if the difference is negligible, the image to be processed can be skipped and the division operation proceeds with the image data of the next frame.
  • In addition, adjacent data sets may share image data belonging to the same images to be processed; for example, adjacent data sets may share the image data of two, or of three, identical frames, which is not limited here.
  • In an implementation scenario, the image data of each frame may include only the target pixels belonging to the target to be reconstructed (e.g., the target pixels in the depth data and in the color data); in another implementation scenario, it may also include pixels that do not belong to the target to be reconstructed, or even the image data of the entire image to be processed, in which case the image data can also record the position coordinates of the target pixels so that they can be located later.
  • Referring to FIG. 2, which is a schematic state diagram of an embodiment of the three-dimensional reconstruction method of the present disclosure, the target to be reconstructed is a portrait plaster sculpture. Each frame of the image to be processed 21 may include color data 22 and depth data 23, from which the target pixels belonging to the target to be reconstructed are obtained, and the image data 24 are divided into corresponding data sets 25 in turn.
  • Step S14: Determine the pose optimization parameters of each data set by sequentially using the image data of each data set together with the image data and pose optimization parameters of the data sets preceding it in time sequence.
  • The image data of each data set and the image data of the data sets preceding it in time sequence can be used to determine the spatial transformation parameter T_icp between them; T_icp and the respective pose optimization parameters T_frag are then used to construct an objective function about T_frag, which is solved to obtain the pose optimization parameter of the current data set and to update the pose optimization parameters of the preceding data sets. Because the pose optimization parameter of each data set is tied to those of the data sets before it, the pose optimization parameters of earlier data sets are continuously updated as new data sets are generated, looping through to the last data set; the final pose optimization parameters of each data set thus obtained can effectively eliminate the accumulated error.
  • the pose optimization parameters of the first data set may be initialized as an identity matrix.
  • Each time a new data set is generated, the pose optimization parameters of the preceding data sets can be calculated and the pose optimization parameters of the related data sets updated, and so on until the end of the scan, when the final pose optimization parameters of each data set are obtained; this helps balance the amount of calculation and thus reduce the computational load.
  • When the camera device is a mobile terminal such as a mobile phone or a tablet computer, the time sequence may represent the overall shooting order of the images to be processed in the data set. Other situations can be deduced by analogy, and no further examples are given here.
  • In an implementation scenario, the image data in each data set 25 can also be mapped to three-dimensional space in turn to obtain the 3D point cloud corresponding to each data set. The camera pose parameter T_t of the image to be processed to which the image data belongs and the internal parameter K of the imaging device can be used for this mapping: the pixel coordinates are first converted to three-dimensional homogeneous form, and then the inverse T_t^{-1} of the camera pose parameter is multiplied with the inverse K^{-1} of the internal parameter and the homogeneous pixel coordinates to obtain the 3D point cloud in three-dimensional space. The inverse of the data set's pose optimization parameter, T_frag^{-1}, can then be left-multiplied onto the 3D point cloud for dynamic adjustment; the adjusted camera pose parameters of the data set can likewise be used to adjust the corresponding 3D point cloud. In addition, the three-dimensional point cloud may be marked with a preset color (e.g., green), which is not limited here.
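  • The back-projection just described can be sketched in Python as follows (an illustrative assumption, not the disclosed implementation): each valid depth pixel is scaled into a homogeneous coordinate, multiplied by K^{-1}, and then by T_t^{-1} to land in three-dimensional space:

```python
import numpy as np

def backproject(depth: np.ndarray, K: np.ndarray, T_t: np.ndarray) -> np.ndarray:
    """Map a depth image into 3D space: P = T_t^{-1} @ K^{-1} @ (d * [u, v, 1]^T)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    pix = np.stack([u[valid], v[valid], np.ones(valid.sum())]) * depth[valid]
    P_cam = np.linalg.inv(K) @ pix                        # camera coordinates, 3xN
    P_hom = np.vstack([P_cam, np.ones(P_cam.shape[1])])   # homogeneous, 4xN
    return (np.linalg.inv(T_t) @ P_hom)[:3].T             # Nx3 point cloud
```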
  • Step S15: Using the pose optimization parameters of each data set, adjust the camera pose parameters of the images to be processed to which the image data contained in the data set belongs. The inverse of the pose optimization parameter of each data set, T_frag^{-1}, can be left-multiplied onto the camera pose parameter T_t of the images to be processed to which its image data belongs, thereby adjusting the camera pose parameters.
  • For example, suppose data set A contains image data 01 (belonging to image to be processed 01), image data 02 (belonging to image to be processed 02), and image data 03 (belonging to image to be processed 03); the inverse of the pose optimization parameter T_frag of data set A is left-multiplied onto the camera pose parameters T_t of images 01, 02, and 03, thereby adjusting the camera pose parameters of the images to which the image data in data set A belongs. Suppose the adjacent data set B contains image data 03 (belonging to image 03) and image data 04 (belonging to image 04); since the camera pose parameters of images 01 to 03 have already been adjusted with data set A's T_frag^{-1}, when adjusting the camera pose parameters of the images to which data set B's image data belongs, the inverse of data set B's own pose optimization parameter T_frag is used.
  • With continued reference to FIG. 2, the pose optimization parameters of each data set are used to adjust the camera pose parameters 26 of the images to be processed to which the data set's image data belongs, obtaining the adjusted camera pose parameters 27. The adjusted camera pose parameters 27 can also be used to adjust the corresponding three-dimensional point cloud 28, so that the user can perceive the dynamic adjustment of the point cloud.
  • Step S16: Using the preset three-dimensional reconstruction method and the adjusted camera pose parameters of the images to be processed, reconstruct the image data of the images to be processed to obtain a three-dimensional model of the target to be reconstructed.
  • The preset three-dimensional reconstruction method may include, but is not limited to, the TSDF (Truncated Signed Distance Function) reconstruction method and the Poisson reconstruction method.
  • The TSDF reconstruction method computes an implicit surface for 3D reconstruction, and details are not repeated here.
  • The core idea of Poisson reconstruction is that the three-dimensional point cloud represents positions on the surface of the target to be reconstructed and its normal vectors represent the inside/outside direction; by implicitly fitting an indicator function derived from the object, a smooth estimate of the object surface can be obtained. Details are not repeated here.
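  • For orientation only, a minimal TSDF voxel-update sketch in Python is given below; the patent names the TSDF method but does not give this formulation, so the update rule, names, and truncation value are assumptions:

```python
import numpy as np

def tsdf_update(tsdf, weight, voxel_pts, depth, K, T_t, trunc=0.04):
    """Fuse one depth frame into a TSDF volume (flat arrays, one entry per voxel)."""
    # Transform voxel centers (Nx3, world frame) into the camera frame and project.
    P = T_t[:3, :3] @ voxel_pts.T + T_t[:3, 3:4]
    pix = K @ P
    u = np.round(pix[0] / pix[2]).astype(int)
    v = np.round(pix[1] / pix[2]).astype(int)
    h, w = depth.shape
    ok = (pix[2] > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Truncated signed distance between measured depth and voxel depth.
    meas = depth[v[ok], u[ok]]
    sdf = meas - P[2, ok]
    near = (meas > 0) & (sdf > -trunc)
    d = np.minimum(1.0, sdf[near] / trunc)

    # Weighted running average of the TSDF value per voxel.
    idx = np.flatnonzero(ok)[near]
    tsdf[idx] = (tsdf[idx] * weight[idx] + d) / (weight[idx] + 1.0)
    weight[idx] += 1.0
```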
  • In an implementation scenario, the above steps can be used to reconstruct the 3D model of the target in real time and to superimpose and render it at the same position and angle as the currently captured image frame, so that the model under reconstruction can be displayed to the user. In another implementation scenario, the reconstructed 3D model of the target may also be printed by a three-dimensional printer to obtain a physical model corresponding to the target to be reconstructed.
  • In the above solution, the pose optimization parameters of each data set are determined on the basis of those of the previous data sets and are used to adjust the camera pose parameters of the images to be processed to which the data set's image data belongs; reconstructing the image data with the preset three-dimensional reconstruction method and the adjusted camera pose parameters effectively improves the resulting 3D model of the target to be reconstructed, and eliminating camera pose errors at the granularity of data sets reduces the amount of calculation, helping to reduce the computational load.
  • Referring to FIG. 3, which is a schematic flowchart of an embodiment of step S12 in FIG. 1, i.e., of the process of determining the target pixels, the following steps may be included:
  • Step S121: Obtain the angle between the normal vector of each pixel included in the depth data after alignment with the color data, and the gravity direction of the image to be processed.
  • Each frame of the image to be processed includes color data I_t and depth data; projecting the depth data onto the color data I_t yields the aligned depth data D_t. Specifically, a pixel of the depth data with 2D image coordinates (x, y) and depth value d_t is first converted to a three-dimensional homogeneous coordinate P by formula (6):

    P = d_t \cdot (x, y, 1)^\top \quad (6)

    Then the internal parameters K_{depth} of the depth camera in the imaging device are used to back-project P into three-dimensional space, the rotation matrix R and translation matrix t between the depth camera and the color camera are used to perform a rigid transformation, and the internal parameter K of the color camera is used to project the point onto the two-dimensional plane, obtaining the coordinate P' of the same object point in the color data, which is a three-dimensional (homogeneous) coordinate:

    P' = K \left( R \, K_{depth}^{-1} P + t \right) \quad (7)

    Finally, based on formula (8), the first and second components of P' are each divided by its third component P'[2] to obtain the two-dimensional coordinate x_t of the point in the color data:

    x_t = \left( P'[0] / P'[2], \; P'[1] / P'[2] \right) \quad (8)

    In an implementation scenario, a preset floating-point number (for example, 0.5) can also be added to the results of the above division for rounding, which is not repeated here.
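  • A compact Python sketch of formulas (6) to (8) above (illustrative; K_depth, K_color, R, and t are assumed given by the calibration):

```python
import numpy as np

def align_depth_pixel(x, y, d_t, K_depth, K_color, R, t):
    """Project one depth pixel onto the color image, per formulas (6)-(8)."""
    P = d_t * np.array([x, y, 1.0])                 # (6): homogeneous coord * depth
    P_cam = R @ (np.linalg.inv(K_depth) @ P) + t    # back-project + rigid transform
    P_prime = K_color @ P_cam                       # (7): project to color plane
    # (8): divide first/second components by the third; +0.5 rounds to a pixel.
    return P_prime[0] / P_prime[2] + 0.5, P_prime[1] / P_prime[2] + 0.5
```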
  • In three-dimensional space, a plane can be determined by any three points not on the same line, and a vector perpendicular to that plane can then be obtained. The normal vector of each pixel can therefore be determined from planes formed with adjacent pixels: several adjacent pixels (for example, the eight neighboring pixels) each determine a plane in three-dimensional space with the pixel, a vector perpendicular to each plane is solved, and the average of these vectors is taken as the normal vector of the pixel.
  • Taking the pixel point x_t as an example: from its depth value d_t, its three-dimensional homogeneous coordinates are obtained, and left-multiplying by the inverse K^{-1} of the internal parameter K back-projects x_t into three-dimensional space as the 3D point P_x. The 8 neighboring pixels of x_t within a 3×3 window are arranged in counterclockwise order and back-projected into three-dimensional space in the same way to obtain the corresponding 3D points, denoted {P_0, P_1, ..., P_7}. The three-dimensional normal vector N_x of the pixel x_t can then be expressed as

    N_x = \frac{1}{8} \sum_{i=0}^{7} (P_i - P_x) \times (P_{(i+1)\%8} - P_x)

    where × represents the cross product and % represents the remainder; for example, 1 % 8 is the remainder of 1 divided by 8, i.e., 1, and other cases can be deduced by analogy.
  • the angle between the normal vector and the direction of gravity can be calculated by using the cosine formula, which is not repeated here.
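  • The normal computation and the angle to gravity can be sketched as follows (illustrative Python, under the assumption that the 8 neighbors have already been back-projected):

```python
import numpy as np

def pixel_normal(P_x, neighbors):
    """Average of cross products over the 8 counterclockwise neighbors, as above."""
    N = np.zeros(3)
    for i in range(8):
        N += np.cross(neighbors[i] - P_x, neighbors[(i + 1) % 8] - P_x)
    N /= 8.0
    n = np.linalg.norm(N)
    return N / n if n > 0 else N

def angle_to_gravity(normal, gravity):
    """Angle (degrees) between a normal and the gravity direction (cosine formula)."""
    cos_a = normal @ gravity / (np.linalg.norm(normal) * np.linalg.norm(gravity))
    return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
```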
  • Step S122: Project each pixel in three-dimensional space onto the gravity direction to obtain the height value of each pixel in three-dimensional space.
  • The step of obtaining the angle between each pixel's normal vector and the gravity direction in step S121 and the step of obtaining each pixel's height value in step S122 can be performed sequentially or simultaneously, which is not limited here.
  • Step S123: Analyze the height values of the pixels whose included angles satisfy the preset angle condition to obtain the plane height of the target to be reconstructed.
  • The preset angle condition may include that the angle between the pixel's normal vector and the gravity direction of the image to be processed is less than or equal to a preset angle threshold (for example, 15 degrees or 10 degrees). Pixels are screened according to this condition, and the height values of the qualifying pixels, obtained in step S122, are collected into a height set; cluster analysis is then performed on the height values in the height set to obtain the plane height, so that only height values are needed to obtain the plane height, which reduces the computational load.
  • Specifically, a random sample consensus algorithm (RANSAC) can be used to cluster the height set: in each iteration, a height value is randomly selected as the current candidate plane height, and the heights whose deviation from it falls within a preset range (for example, 2 cm) are counted as inliers. Among the candidate heights whose inlier counts are greater than a preset threshold, the minimum candidate height is selected as the final plane height.
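  • A minimal sketch of the RANSAC-style height clustering just described (Python; the band width, inlier threshold, and iteration count are illustrative assumptions):

```python
import numpy as np

def plane_height_ransac(heights, band=0.02, min_inliers=100, iters=200, seed=0):
    """heights: 1D array of height values of pixels meeting the angle condition."""
    rng = np.random.default_rng(seed)
    candidates = []
    for _ in range(iters):
        h = rng.choice(heights)                     # random candidate plane height
        inliers = np.count_nonzero(np.abs(heights - h) <= band)
        if inliers >= min_inliers:                  # enough support for this height
            candidates.append(h)
    return min(candidates) if candidates else None  # lowest supported candidate
```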
  • Step S124: Use the plane height to screen out the target pixels belonging to the target to be reconstructed in the color data.
  • The pixels whose height value is greater than the plane height can be screened out, the corresponding pixels in the color data taken as candidate pixels, and the largest connected domain formed by the candidate pixels in the color data determined; the candidate pixels in that largest connected domain are taken as the target pixels belonging to the target to be reconstructed. In this way, the target pixels in each frame can be identified automatically with the help of the gravity direction, reducing the computational load of 3D reconstruction and avoiding user intervention, thereby improving the user experience.
  • Referring to FIG. 4, which is a schematic flowchart of an embodiment of step S13 in FIG. 1, i.e., of dividing the image data of each frame of the images to be processed into corresponding data sets, the following steps may be included:
  • Step S131: Take each frame of the image to be processed as the current image to be processed in turn. That is, when the image data of a certain frame is to be divided, that frame is taken as the current image to be processed.
  • Step S132: When dividing the image data of the current image to be processed, judge whether the last data set among the existing data sets meets the preset overflow condition; if yes, go to step S133, otherwise go to step S134. For example, if the existing data sets are data set A, data set B, and data set C, and data set C was created most recently, data set C is the last data set.
  • The preset overflow condition may include any of the following (see the sketch after the camera pose formulas below): the number of frames of images to be processed whose image data is contained in the last data set is greater than or equal to a preset frame-number threshold (for example, 8, 9, or 10 frames); the distance between the camera position of the image to which any image data in the last data set belongs and the camera position of the current image to be processed is greater than a preset distance threshold (for example, 20 cm, 25 cm, or 30 cm); or the difference between the camera orientation angle of the image to which any image data in the last data set belongs and the camera orientation angle of the current image to be processed is greater than a preset angle threshold (for example, 25, 30, or 35 degrees).
  • The camera orientation angle and camera position can be calculated from the camera pose parameters of the image to be processed. The camera pose parameter T_t can be represented by the matrix

    T_t = \begin{bmatrix} R & t \\ 0^\top & 1 \end{bmatrix}

    that is, the camera pose parameters include a rotation matrix R and a translation matrix t, and the camera position can be expressed as

    \text{position} = -R^\top t

    where ^\top represents the transpose of the matrix; the third row vector of R can be taken as the camera orientation (facing) direction.
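  • The following Python sketch (an illustration, not the disclosed code) extracts the camera position and facing direction from T_t as above and checks the three overflow conditions with the example thresholds:

```python
import numpy as np

def camera_position(T_t):
    """position = -R^T t, with T_t = [R t; 0 1]."""
    R, t = T_t[:3, :3], T_t[:3, 3]
    return -R.T @ t

def camera_facing(T_t):
    """Facing direction: the third row vector of R."""
    return T_t[:3, :3][2]

def overflows(set_poses, T_cur, max_frames=9, max_dist=0.25, max_angle=30.0):
    """set_poses: camera poses of the images whose data the last set contains."""
    if len(set_poses) >= max_frames:
        return True
    pos, face = camera_position(T_cur), camera_facing(T_cur)
    for T in set_poses:
        if np.linalg.norm(camera_position(T) - pos) > max_dist:
            return True
        cos_a = np.clip(camera_facing(T) @ face, -1.0, 1.0)
        if np.degrees(np.arccos(cos_a)) > max_angle:
            return True
    return False
```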
  • Step S133: Take the image data of the latest several frames of images to be processed in the last data set, store it in a newly created data set which becomes the new last data set, and divide the image data of the current image to be processed into this new last data set. For example, if the last data set C contains image data 05 to 09 (belonging to images to be processed 05 to 09), the image data of images 07 to 09, or of images 08 to 09, can be taken (this is not limited here) and stored in a newly created data set D. Data set D then contains image data 07 (belonging to image 07), image data 08 (belonging to image 08), and image data 09 (belonging to image 09); it becomes the new last data set, and the image data 10 (belonging to image to be processed 10) is divided into data set D.
  • Step S134: If the last data set does not meet the preset overflow condition, divide the image data of the current image to be processed into the last data set.
  • In the above manner, when the image data of the current image to be processed is divided and the last existing data set satisfies the preset overflow condition, the image data of the latest several frames in that data set is stored in a newly created data set which becomes the new last data set; adjacent data sets therefore share the image data of several identical frames, which helps improve the alignment between adjacent data sets and thus the effect of 3D reconstruction.
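  • Steps S131 to S134 can be sketched as the following loop (illustrative Python; `overflows` is a predicate such as the one sketched above, and the overlap of 3 frames is an example value):

```python
def divide_into_sets(frames, overflows, overlap=3):
    """frames: time-ordered per-frame image data; returns the list of data sets."""
    data_sets = [[]]
    for frame in frames:
        last = data_sets[-1]
        if last and overflows(last, frame):
            # Seed a new end data set with the latest frames, so that
            # adjacent data sets share image data of the same frames.
            data_sets.append(list(last[-overlap:]))
            last = data_sets[-1]
        last.append(frame)
    return data_sets
```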
  • Referring to FIG. 5, which is a schematic flowchart of an embodiment of step S14 in FIG. 1, i.e., of determining the pose optimization parameters of the data sets, the following steps may be included:
  • Step S141 Take each data set as the current data set in turn, and select at least one data set whose time sequence is located before the current data set as a candidate data set.
  • For example, when determining the pose optimization parameters of data set B, data set B is taken as the current data set, and when determining those of data set C, data set C is taken as the current data set. Whenever a new data set is created, the pose optimization parameters of the data set preceding it can be determined; for example, when a new data set D is created, data set C can be taken as the current data set and its pose optimization parameters determined.
  • FIG. 6 is a schematic flowchart of an embodiment of step S141 in FIG. 5 , which may include the following steps:
  • Step S61: Construct a bag-of-words model using the preset image features of the image data in the current data set and in the data sets preceding it in time sequence.
  • The preset image features can include ORB (Oriented FAST and Rotated BRIEF) image features, which quickly create feature vectors for key points in the image data; the feature vectors can be used to identify the target to be reconstructed in the image data. FAST and BRIEF are the feature detection algorithm and the descriptor creation algorithm, respectively, and details are not repeated here. The bag-of-words model is a simplified representation model used in natural language processing and information retrieval; each preset image feature in the bag-of-words model is independent, and details are not repeated here.
  • Whenever a new data set is created, its preceding data set can be taken as the current data set, and the preset image features of the image data in the current data set extracted and added to the bag-of-words model; in this way, the bag-of-words model is expanded incrementally. Since duplicate image data exists between the current data set and its preceding data set, feature extraction is performed only on the image data that is not duplicated with the preceding data set.
  • Step S62: Select the image data of the images to be processed at preset time-sequence positions in the current data set as the image data to be matched.
  • The preset time-sequence positions may include the first, middle, and last positions. For example, if data set C contains image data 05 to 09 (belonging to images to be processed 05 to 09), the image data 05 of the first image 05, the image data 07 of the middle image 07, and the image data 09 of the last image 09 can be selected as the image data to be matched; other implementation scenarios can be deduced by analogy and are not exemplified here. In another implementation scenario, the preset positions can also be set as the first position, the 1/4, 1/2, and 3/4 time-sequence positions, and the last position, which is not limited here.
  • Step S63: From a preset range of the bag-of-words model, query the preset image features whose similarity scores with the preset image features of the image data to be matched are greater than a preset similarity threshold.
  • The preset range may include the preset image features of image data whose data sets are neither adjacent to nor included in the current data set; taking data sets A, B, and C of the foregoing embodiment as an example, when the current data set is data set C, the preset range may be the preset image features belonging to data sets A and B. In an implementation scenario, the preset similarity threshold may be a preset score value, for example 0.018, 0.019, or 0.020, which is not limited here.
  • In another implementation scenario, the maximum score value score_adj among the similarity scores between each image data in the data set adjacent to the current data set and the image data to be matched may also be obtained, and a preset multiple (for example, 1.5, 2, or 2.5 times) of score_adj taken as the preset similarity threshold. In yet another implementation scenario, both the preset multiple of score_adj and any of the above preset score values can serve as the preset similarity threshold; that is, the query from the preset range of the bag-of-words model returns the preset image features whose similarity score score_loop with the image data to be matched is greater than both the preset multiple of score_adj and the preset score value, which is not limited here.
  • Step S64: Take the data sets of the image data to which the queried preset image features belong, together with the data set adjacent to the current data set, as the candidate data sets.
  • For example, suppose the image data to be matched at the first position queries data sets C and D, the image data at the middle position queries data sets D and E, and the image data at the last position queries data sets E and F; then data sets C to F, together with the adjacent data set G, can be taken as the candidate data sets of the current data set H. In an implementation scenario, a preset number (for example, 2 or 3) of data sets with the largest similarity scores score_loop, together with the data sets adjacent to the current data set, are taken as candidate data sets; for example, the three data sets with the largest score_loop can be selected from data sets C to F, and the adjacent data set G added, as the candidate data sets.
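  • The candidate selection can be condensed into a small Python helper (illustrative; the threshold values follow the examples in the text, and `scores` is assumed to map data-set ids to bag-of-words similarity scores):

```python
def select_candidates(scores, score_adj, preset_score=0.019, multiple=2.0, top_n=3):
    """Keep sets whose score_loop exceeds both thresholds, then take the top N."""
    threshold = max(preset_score, multiple * score_adj)
    hits = {k: s for k, s in scores.items() if s > threshold}
    return sorted(hits, key=hits.get, reverse=True)[:top_n]

# The data sets adjacent to the current data set are then appended to the result.
```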
  • Step S142: Using the image data of the current data set and the image data of the candidate data sets, determine the spatial transformation parameters between the current data set and the candidate data sets.
  • Referring to FIG. 7, which is a schematic flowchart of an embodiment of step S142 in FIG. 5, the following steps may be included:
  • Step S71: Search the candidate data set and the current data set for a group of image data to be matched that satisfies a preset matching condition.
  • The preset matching condition may include that the difference between the camera orientation angles of the images to be processed to which the image data belong is the smallest; that is, for each candidate data set, a group of image data satisfying the preset matching condition is searched from the current data set and the candidate data set. For ease of description, the image data to be matched belonging to the current data set may be denoted I_cur, and that belonging to the candidate data set I_similar.
  • Step S72: Based on the preset image features (for example, ORB image features) extracted from each group of image data to be matched, obtain the matching pixel pairs between them. The preset image features can be matched and screened, for example with the RANSAC algorithm (see the relevant steps in the foregoing embodiments, which are not repeated here), to obtain the matching pixels between I_cur and I_similar, denoted p_cur and p_similar respectively for ease of description.
  • Step S73: Map the pixels of the matching pixel pairs belonging to the current data set into three-dimensional space to obtain the first three-dimensional matching points, and map the pixels belonging to the candidate data set into three-dimensional space to obtain the second three-dimensional matching points. Specifically, p_cur and p_similar can each be converted into three-dimensional homogeneous coordinates, which are then left-multiplied by the inverse K^{-1} of the internal parameter K to obtain the first three-dimensional matching points P_cur and the second three-dimensional matching points P_similar.
  • Step S74: Align the first three-dimensional matching points and the second three-dimensional matching points to obtain the spatial transformation parameter. The first and second three-dimensional matching points can be aligned in three-dimensional space so that their degree of coincidence is as large as possible, thereby obtaining the spatial transformation parameter between the two data sets.
  • a first pose transformation parameter between the first three-dimensional matching point and the second three-dimensional matching point may be obtained, wherein the first pose may be constructed by using the first three-dimensional matching point and the second three-dimensional matching point. Transform the objective function of the parameters, and then use SVD (Singular Value Decomposition, singular value decomposition) or non-offline optimization to solve the objective function, and obtain the first pose transformation parameter T pcd :
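When the SVD route is taken, the minimizer of this least-squares objective has the classical closed form (often called the Kabsch solution); the sketch below assumes the matched points come as N×3 arrays and is not tied to any specific implementation in the disclosure:

```python
import numpy as np

def rigid_align_svd(P_cur, P_sim):
    """Closed-form rigid transform T_pcd (4x4) minimising
    sum_i || T_pcd * P_cur[i] - P_sim[i] ||^2, via SVD (Kabsch).
    P_cur, P_sim: matched (N, 3) point arrays, N >= 3 and not collinear."""
    mu_c, mu_s = P_cur.mean(0), P_sim.mean(0)
    H = (P_cur - mu_c).T @ (P_sim - mu_s)        # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                     # guard against reflections
        Vt[2] *= -1
        R = Vt.T @ U.T
    t = mu_s - R @ mu_c
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T
```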
After the first pose transformation parameter $T_{pcd}$ has been obtained by solving the above objective function, the first three-dimensional matching points can further be pose-optimized using $T_{pcd}$ and a preset pose transformation parameter (for example, the identity matrix): $T_{pcd}$ and the preset pose transformation parameter are each left-multiplied with the first three-dimensional matching points $P_{cur}$, giving the first optimized matching points and the second optimized matching points, which for ease of description may be denoted $P'_{cur}$ and $P''_{cur}$ respectively. The degrees of coincidence between the second three-dimensional matching points $P_{similar}$ and each of $P'_{cur}$ and $P''_{cur}$ are then calculated, and the pose transformation parameter adopted by the optimized matching points with the higher degree of coincidence is selected as the second pose transformation parameter, denoted $T_{select}$ for convenience of description.
When calculating the degree of coincidence between $P_{similar}$ and $P'_{cur}$, a first optimized matching point can be searched for within a preset range (for example, a 5 cm radius) of each second three-dimensional matching point $P_{similar}$; if one can be found, that second three-dimensional matching point is marked as valid, and otherwise as invalid. After all second three-dimensional matching points have been searched, the ratio of the number of points marked valid to the total number of second three-dimensional matching points is taken as the degree of coincidence between $P_{similar}$ and $P'_{cur}$; the degree of coincidence between $P_{similar}$ and $P''_{cur}$ is obtained by analogy and is not repeated here.
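This degree of coincidence is essentially an inlier ratio under a fixed search radius. A sketch with a k-d tree follows; the 5 cm radius matches the example above, while the function name and the use of scipy are illustrative:

```python
import numpy as np
from scipy.spatial import cKDTree

def coincidence_degree(P_similar, P_optimized, radius=0.05):
    """Fraction of second 3D matching points with an optimized matching point
    within `radius` metres (the points marked 'valid' in the text above)."""
    dist, _ = cKDTree(P_optimized).query(P_similar, k=1,
                                         distance_upper_bound=radius)
    return np.isfinite(dist).mean()     # unmatched queries come back as inf
```

$T_{select}$ is then whichever transform ($T_{pcd}$ or the identity preset) yields the higher ratio when applied to $P_{cur}$.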
After the second pose transformation parameter $T_{select}$ has been obtained, it can be used as an initial value, and a preset alignment method (for example, the point-to-normal ICP method) is used to align the first three-dimensional matching points $P_{cur}$ with the second three-dimensional matching points $P_{similar}$, yielding the spatial transformation parameter between the current data set and the candidate data set, denoted $T_{icp}$ for convenience of description. Repeating the above steps gives the spatial transformation parameter $T_{icp}$ between the current data set and each candidate data set.
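For reference, one common realization of such a point-to-normal alignment is point-to-plane ICP with a small-angle Gauss-Newton update. The sketch below makes several simplifying assumptions not stated in the disclosure: target normals are available, correspondences are plain nearest neighbours, and no robust weighting or outlier rejection is applied:

```python
import numpy as np
from scipy.spatial import cKDTree

def point_to_plane_icp(src, dst, dst_normals, T_init, iters=10):
    """Refine T_init (4x4) so the transformed src points lie on the target planes."""
    T = T_init.copy()
    tree = cKDTree(dst)
    for _ in range(iters):
        p = (T[:3, :3] @ src.T).T + T[:3, 3]        # src under current estimate
        _, idx = tree.query(p)                       # nearest-neighbour matches
        q, n = dst[idx], dst_normals[idx]
        r = np.einsum('ij,ij->i', n, p - q)          # signed point-to-plane errors
        J = np.hstack([np.cross(p, n), n])           # rows: [(p x n)^T, n^T]
        x, *_ = np.linalg.lstsq(J, -r, rcond=None)   # solve for (w, t) update
        W = np.array([[0, -x[2], x[1]],
                      [x[2], 0, -x[0]],
                      [-x[1], x[0], 0]])
        dT = np.eye(4)
        dT[:3, :3] = np.eye(3) + W                   # small-angle rotation; a real
        dT[:3, 3] = x[3:]                            # implementation re-orthonormalises
        T = dT @ T
    return T
```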
Step S143: Obtain the pose optimization parameters of the current data set using at least the pose optimization parameters of the candidate data sets and the spatial transformation parameters between the current data set and the candidate data sets, and update at least the pose optimization parameters of the candidate data sets.
In one implementation scenario, to improve the accuracy of the pose optimization parameters, the above spatial transformation parameters $T_{icp}$ may first be screened: from the spatial transformation parameters $T_{icp}$ between the current data set and each candidate data set, those meeting preset screening conditions are selected for use in solving the pose optimization parameters of the current data set. The preset screening conditions may include: the candidate data set related to the spatial transformation parameter $T_{icp}$ is adjacent to the current data set; or the degree of coincidence between the optimized matching points, obtained by pose-optimizing the first three-dimensional matching points $P_{cur}$ with $T_{icp}$, and the second three-dimensional matching points $P_{similar}$ is greater than a preset coincidence threshold (for example, 60%, 65%, or 70%).
Specifically, the pose optimization parameters of the candidate data sets and the spatial transformations between the current data set and the candidate data sets can be used to construct an objective function of the pose optimization parameters of the current data set; solving this objective function yields the pose optimization parameters of the current data set and updates at least the pose optimization parameters of the candidate data sets. Cycling in this way, with the data set preceding each newly created one taken in turn as the current data set, the pose optimization parameters can be obtained while the target to be reconstructed is still being scanned and data sets are still being created, which helps balance the computation, reduce the computational load, and realize real-time, online three-dimensional reconstruction of the target to be reconstructed.
Please refer to FIG. 8, which is a schematic flowchart of an embodiment of step S143 in FIG. 5 and may include the following steps:
Step S81: Take the two data sets corresponding to each spatial transformation parameter related to the current data set and to the candidate data sets preceding it in time sequence as a data set pair.
Still taking data set H as the current data set, with data sets C to F and data set G as its candidate data sets: the candidate data set C and the current data set H corresponding to their spatial transformation parameter are regarded as one data set pair, and likewise candidate data sets D, E, F, and G are each paired with the current data set H according to their respective spatial transformation parameters.
In addition, each data set preceding the current data set H also has corresponding spatial transformation parameters. For example, a spatial transformation parameter may exist between data set B and data set A, so data sets B and A can be regarded as a data set pair; spatial transformation parameters may exist between data set C and each of data sets A and B, so data sets C and A are regarded as one data set pair and data sets C and B as another; and so on, without further examples here.
Step S82: Construct an objective function of the pose optimization parameters using the spatial transformation parameters of each data set pair and the respective pose optimization parameters of the data sets they contain.
The objective function can be expressed as $E = \sum_{(i,j)} f\big(T_{ij}^{icp}, T_i^{frag}, T_j^{frag}\big)$, where $i$ and $j$ denote the numbers of the data sets contained in each data set pair (letters such as C, D, and E, or equivalently Arabic numerals such as 1, 2, and 3), $T_{ij}^{icp}$ denotes the spatial transformation parameter between the pair, $T_i^{frag}$ and $T_j^{frag}$ denote the respective pose optimization parameters of the data sets in the pair, and $f(\cdot)$ denotes the optimization term, which is expressed in terms of the inverses $(T_i^{frag})^{-1}$ and $(T_j^{frag})^{-1}$.
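Because $f(\cdot)$ is only characterized above through the inverses of the pose optimization parameters, the sketch below substitutes one common pose-graph choice — penalizing the discrepancy between each measured pair transform and the relative transform implied by the per-set parameters — and should be read as an assumption-laden illustration rather than the disclosure's exact objective:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def to_T(x):
    """6-vector (rotation vector, translation) -> 4x4 pose optimization parameter."""
    T = np.eye(4)
    T[:3, :3] = Rotation.from_rotvec(x[:3]).as_matrix()
    T[:3, 3] = x[3:]
    return T

def residuals(x, pairs, n_sets):
    """One 6-D residual per data-set pair (i, j, T_ij): the measured transform
    should agree with the relative transform implied by the per-set parameters."""
    Ts = [to_T(x[6 * k: 6 * k + 6]) for k in range(n_sets)]
    res = []
    for i, j, T_ij in pairs:
        E = np.linalg.inv(T_ij) @ np.linalg.inv(Ts[i]) @ Ts[j]   # ~identity at optimum
        res.append(Rotation.from_matrix(E[:3, :3]).as_rotvec())
        res.append(E[:3, 3])
    return np.concatenate(res)

def optimize_pose_graph(pairs, n_sets):
    # Identity initialisation, as for the first data set in the text; a real
    # implementation would also fix one set's pose to remove the gauge freedom.
    sol = least_squares(residuals, np.zeros(6 * n_sets), args=(pairs, n_sets))
    return [to_T(sol.x[6 * k: 6 * k + 6]) for k in range(n_sets)]
```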
Step S83: Solve the objective function using a preset solving method to obtain the pose optimization parameters of the data sets contained in the data set pairs corresponding to the current data set and to the candidate data sets preceding it in time sequence.
By minimizing the above objective function, the pose optimization parameters of the data sets contained in each data set pair can be obtained. Still taking data set H as the current data set, solving the objective function yields the pose optimization parameters of the current data set H, the further-optimized pose optimization parameters of data sets C to G, and the further-optimized pose optimization parameters of the data sets preceding H. When a new data set I is introduced and the spatial transformation parameters related to it are obtained, constructing the objective function again yields the pose optimization parameters of data set I together with the further-optimized parameters of the preceding data sets; cycling in this way further helps eliminate the accumulated pose error.
Different from the foregoing embodiments, each data set is taken in turn as the current data set and at least one earlier data set is selected as a candidate data set; the image data of the current data set and of the candidate data sets is used to determine the spatial transformation parameters between them, and then at least the pose optimization parameters of the candidate data sets and those spatial transformation parameters are used to obtain the pose optimization parameters of the current data set while updating at least those of the candidate data sets. This helps eliminate the error in the camera pose parameters accumulated during scanning and reduces the amount of data used to compute the pose optimization parameters, thereby reducing the computational load.
Please refer to FIG. 9, which is a schematic flowchart of an embodiment of an interaction method based on three-dimensional reconstruction of the present disclosure and may include the following steps:
Step S91: Obtain a three-dimensional model of the target to be reconstructed.
The three-dimensional model may be obtained through the steps in any of the foregoing three-dimensional reconstruction method embodiments; reference may be made to those embodiments, which are not repeated here.
Step S92: Construct a three-dimensional map of the scene where the camera device is located using a preset visual-inertial navigation method, and acquire the current pose information of the camera device in the three-dimensional map.
The preset visual-inertial navigation method may include SLAM (Simultaneous Localization and Mapping). Through SLAM, a three-dimensional map of the scene where the camera device (for example, a mobile phone or tablet computer) is located can be constructed, and the current pose information of the camera device in that map can be obtained.
In one implementation scenario, in order to realize dynamic interaction with the three-dimensional model, the model can also be bound with a skeleton. Skeleton binding refers to setting up a skeletal system for the three-dimensional model so that it can move at its skeletal joints according to established rules; for example, if the three-dimensional model is a quadruped such as a cow or a sheep, then after skeleton binding its skeletal joints can move according to the established rules of quadruped motion.
Step S93: Based on the pose information, display the three-dimensional model in the scene image currently captured by the camera device.
The pose information may include the position and orientation of the camera device. For example, when the pose information indicates that the camera device faces the ground, the top of the three-dimensional model can be displayed in the currently captured scene image; or, when the pose information indicates that the camera device faces the ground at an acute angle, the side of the three-dimensional model can be displayed in the currently captured scene image.
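Rendering aside, the geometric core of this display step is projecting the model's vertices with the tracked pose; a minimal sketch, assuming T_wc is the 4x4 world-to-camera pose reported by the visual-inertial tracker and K the color-camera intrinsics:

```python
import numpy as np

def project_model(vertices, T_wc, K):
    """Project model vertices (world frame) into the current camera view.
    Returns (N, 2) pixel coordinates and a mask of points in front of the camera."""
    homo = np.column_stack([vertices, np.ones(len(vertices))])
    cam = (T_wc @ homo.T).T[:, :3]               # world -> camera coordinates
    in_front = cam[:, 2] > 1e-6
    pix = (K @ cam.T).T
    return pix[:, :2] / pix[:, 2:3], in_front    # perspective division
```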
In one implementation scenario, after the skeleton has been bound to the three-dimensional model, the model can also accept driving instructions input by the user, so that it moves according to those instructions; for example, if the three-dimensional model is a sheep, the user can drive it to lower its head, to walk, and so on, which is not limited here. When the three-dimensional model is a person or another object, the same reasoning applies, and examples are not given one by one here.
With the above solution, based on the pose information of the camera device in the three-dimensional map of its scene, the three-dimensional model of the target to be reconstructed is displayed in the currently captured scene image, which realizes geometrically consistent fusion of the virtual object with the real scene. Since the three-dimensional model is obtained by the three-dimensional reconstruction method of the first aspect, the effect of three-dimensional reconstruction, and hence the effect of geometrically consistent virtual-real fusion, can be improved, which helps improve the user experience.
Please refer to FIG. 10, which is a schematic flowchart of an embodiment of a measurement method based on three-dimensional reconstruction of the present disclosure and may include the following steps:
Step S1010: Obtain a three-dimensional model of the target to be reconstructed.
The three-dimensional model may be obtained through the steps in any of the foregoing three-dimensional reconstruction method embodiments; reference may be made to those embodiments, which are not repeated here.
Step S1020: Receive multiple measurement points set by the user on the three-dimensional model.
The number of measurement points can be two, three, four, and so on, which is not limited here.
Taking a plaster portrait as an example of the target to be reconstructed, the user can set measurement points at the centers of the two eyes of the three-dimensional model 29, or at the nasal root and the philtrum of the three-dimensional model 29, or at the centers of the two eyes and at the philtrum; the possibilities are not listed one by one here.
Step S1030: Acquire the distances between the multiple measurement points, and obtain the distances between the positions on the target to be reconstructed corresponding to the multiple measurement points.
For example, by obtaining the distance between the centers of the two eyes of the three-dimensional model 29, the distance between the corresponding positions on the plaster portrait can be obtained; or, by obtaining the distance between the nasal root and the philtrum of the three-dimensional model 29, the corresponding distance on the plaster portrait can be obtained.
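Since depth-camera reconstructions are metric, the measurement itself reduces to a Euclidean distance between the selected model points; a toy sketch with invented coordinates:

```python
import numpy as np

def measure(model_points, idx_a, idx_b, scale=1.0):
    """Distance between two user-selected measurement points on the model.
    `scale` maps model units to real-world units (1.0 when the reconstruction
    is already metric, as with a depth camera)."""
    return scale * np.linalg.norm(model_points[idx_a] - model_points[idx_b])

eyes = np.array([[-0.031, 0.020, 0.110],        # invented eye-centre vertices
                 [ 0.031, 0.020, 0.110]])
print(f"interpupillary distance: {measure(eyes, 0, 1):.3f} m")   # 0.062 m
```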
With the above solution, by receiving multiple measurement points set by the user on the three-dimensional model and obtaining the distances between them, the distances between the corresponding positions on the target to be reconstructed are obtained, which satisfies the need to measure objects in the real scene. Since the three-dimensional model is obtained with the three-dimensional reconstruction method of the first aspect, the effect of three-dimensional reconstruction, and hence the measurement accuracy, can be improved.
Please refer to FIG. 11, which is a schematic frame diagram of an embodiment of a three-dimensional reconstruction apparatus 1100 of the present disclosure.
The three-dimensional reconstruction apparatus 1100 includes an image acquisition part 1110, a first determination part 1120, a data division part 1130, a second determination part 1140, a parameter adjustment part 1150, and a model reconstruction part 1160. The image acquisition part 1110 is configured to acquire multiple frames of images to be processed obtained by the camera device scanning the target to be reconstructed; the first determination part 1120 is configured to use each frame of the image to be processed and the calibration parameters of the camera device to determine the target pixels of each frame belonging to the target to be reconstructed and its camera pose parameters; the data division part 1130 is configured to divide the image data of each frame of the image to be processed in turn into corresponding data sets according to a preset division strategy, wherein the image data at least includes the target pixels; the second determination part 1140 is configured to sequentially use the image data of each data set, together with the image data and pose optimization parameters of the data sets preceding it in time sequence, to determine the pose optimization parameters of each data set; the parameter adjustment part 1150 is configured to use the pose optimization parameters of each data set to adjust the camera pose parameters of the images to be processed to which the image data contained in the data set belong; and the model reconstruction part 1160 is configured to use a preset three-dimensional reconstruction method and the adjusted camera pose parameters of the images to be processed to reconstruct the image data of the images to be processed and obtain the three-dimensional model of the target to be reconstructed.
In some disclosed embodiments, the second determination part 1140 includes a data set selection subsection configured to take each data set in turn as the current data set and select at least one data set preceding the current data set as a candidate data set; a spatial transformation parameter subsection configured to use the image data of the current data set and of the candidate data set to determine the spatial transformation parameters between them; and a pose optimization parameter subsection configured to obtain the pose optimization parameters of the current data set using at least the pose optimization parameters of the candidate data set and the spatial transformation parameters between the current data set and the candidate data set, and to update at least the pose optimization parameters of the candidate data set.
In some disclosed embodiments, the pose optimization parameter subsection includes a data set pair section configured to take the two data sets corresponding to each spatial transformation parameter related to the current data set and to the data sets preceding it in time sequence as a data set pair; an objective function construction section configured to construct an objective function of the pose optimization parameters using the spatial transformation parameters of each data set pair and the respective pose optimization parameters; and an objective function solving section configured to solve the objective function with a preset solving method to obtain the pose optimization parameters of the data sets contained in the data set pairs corresponding to the current data set and to the data sets preceding it in time sequence.
In some disclosed embodiments, the spatial transformation parameter subsection includes an image data search section configured to search the candidate data set and the current data set for a group of image data to be matched satisfying a preset matching condition; a matching pixel point selection section configured to obtain the matching pixel pairs between each group of image data to be matched based on the preset image features extracted from each group; a three-dimensional space mapping section configured to map the pixels of the matching pixel pairs belonging to the current data set into three-dimensional space to obtain the first three-dimensional matching points and to map the pixels belonging to the candidate data set into three-dimensional space to obtain the second three-dimensional matching points; and a three-dimensional matching point alignment section configured to align the first and second three-dimensional matching points to obtain the spatial transformation parameters.
In some disclosed embodiments, the three-dimensional matching point alignment section includes a first pose transformation parameter subsection configured to obtain the first pose transformation parameter between the first and second three-dimensional matching points; a three-dimensional matching point optimization subsection configured to pose-optimize the first three-dimensional matching points using the first pose transformation parameter and a preset pose transformation parameter to obtain the first and second optimized matching points respectively; a second pose transformation parameter subsection configured to calculate the degrees of coincidence between the second three-dimensional matching points and each of the first and second optimized matching points and to select the pose transformation parameter adopted by the optimized matching points with the higher degree of coincidence as the second pose transformation parameter; and an alignment processing subsection configured to take the second pose transformation parameter as an initial value and align the first and second three-dimensional matching points to obtain the spatial transformation parameters between the current data set and the candidate data set.
In some disclosed embodiments, the spatial transformation parameter subsection further includes a transformation parameter screening section configured to select, from the spatial transformation parameters between the current data set and each candidate data set, those meeting preset parameter screening conditions, wherein the preset parameter screening conditions include any one of the following: the candidate data set related to the spatial transformation parameter is adjacent to the current data set; or the degree of coincidence between the optimized matching points obtained by pose-optimizing the first three-dimensional matching points with the spatial transformation parameter and the second three-dimensional matching points is greater than a preset coincidence threshold.
In some disclosed embodiments, the data set selection subsection includes a bag-of-words model construction section configured to construct a bag-of-words model using the preset image features of the image data in the current data set and in the data sets preceding it in time sequence; an image-data-to-be-matched section configured to select, as the image data to be matched, the image data whose images to be processed lie at preset time-sequence positions within the current data set; an image feature query section configured to query, from a preset range of the bag-of-words model, the preset image features whose similarity score with the preset image features of the image data to be matched is greater than a preset similarity threshold; and a candidate data set section configured to take the data sets containing the image data to which the queried preset image features belong, together with the data set adjacent to the current data set, as the candidate data sets, wherein the preset range includes the preset image features of image data whose data sets are neither adjacent to the current data set nor contained in the current data set.
In some disclosed embodiments, the data set selection subsection further includes a maximum similarity score acquisition section configured to acquire the maximum score value among the similarity scores between each image data in the data set adjacent to the current data set and the image data to be matched, and a preset similarity threshold determination section configured to take either a preset multiple of the maximum score value or a preset score value as the preset similarity threshold.
In some disclosed embodiments, the data division part 1130 includes a current-image subsection configured to take each frame of the image to be processed in turn as the current image to be processed, and a data processing subsection configured such that, when the image data of the current image to be processed is divided and the last data set among the existing data sets satisfies a preset overflow condition, the image data of the latest several frames of images to be processed in that last data set is obtained and stored in a newly created data set, which serves as the new last data set, and the image data of the current image to be processed is divided into this new last data set.
In some disclosed embodiments, the preset overflow condition includes any one of the following: the number of frames of images to be processed corresponding to the image data contained in the last data set is greater than or equal to a preset frame number threshold; the distance between the camera position of the image to be processed to which any image data in the last data set belongs and the camera position of the current image to be processed is greater than a preset distance threshold; or the difference between the camera orientation angle of the image to be processed to which any image data in the last data set belongs and the camera orientation angle of the current image to be processed is greater than a preset angle threshold, wherein the camera position and the camera orientation angle are calculated from the camera pose parameters of the image to be processed.
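A sketch of this overflow test, using the standard camera-center relation position = −Rᵀt for a world-to-camera pose; the thresholds shown are illustrative values and the function name is hypothetical:

```python
import numpy as np

def camera_position(T):
    R, t = T[:3, :3], T[:3, 3]
    return -R.T @ t                              # camera centre in world frame

def camera_direction(T):
    return T[:3, :3][2]                          # third row of R as orientation

def should_spawn_new_set(frag_poses, cur_pose, max_frames=10,
                         max_dist=0.25, max_angle_deg=30.0):
    """Preset overflow condition: spill into a new end data set when the set is
    full, or the camera moved/rotated too far from any frame already in it."""
    if len(frag_poses) >= max_frames:
        return True
    p_cur, d_cur = camera_position(cur_pose), camera_direction(cur_pose)
    for T in frag_poses:
        if np.linalg.norm(camera_position(T) - p_cur) > max_dist:
            return True
        c = np.clip(np.dot(camera_direction(T), d_cur), -1.0, 1.0)
        if np.degrees(np.arccos(c)) > max_angle_deg:
            return True
    return False
```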
In some disclosed embodiments, each frame of the image to be processed includes color data and depth data, and the first determination part 1120 includes an included angle acquisition subsection configured to obtain, for each pixel contained in the depth data after alignment with the color data, the angle between the pixel's normal vector and the gravity direction of the image to be processed; a height acquisition subsection configured to project each pixel in three-dimensional space onto the gravity direction to obtain the height value of each pixel in three-dimensional space; a height analysis subsection configured to analyze the height values of the pixels whose included angle satisfies a preset angle condition to obtain the plane height of the target to be reconstructed; and a pixel screening subsection configured to use the plane height to screen the target pixels belonging to the target to be reconstructed in the color data.
In some disclosed embodiments, the height analysis subsection includes a height set acquisition section configured to take the height values of the pixels whose included angles satisfy the preset angle condition as a height set, and a height cluster analysis section configured to perform cluster analysis on the height values in the height set to obtain the plane height of the target to be reconstructed.
In some disclosed embodiments, the three-dimensional reconstruction apparatus 1100 further includes a three-dimensional mapping part configured to map the image data in each data set in turn into three-dimensional space to obtain the three-dimensional point cloud corresponding to each data set, and a point cloud adjustment part configured to adjust the corresponding three-dimensional point cloud using the pose optimization parameters of each data set.
Please refer to FIG. 12, which is a schematic frame diagram of an embodiment of an interaction apparatus 1200 based on three-dimensional reconstruction of the present disclosure.
The interaction apparatus 1200 based on three-dimensional reconstruction includes a model acquisition part 1210, a mapping and positioning part 1220, and a display interaction part 1230. The model acquisition part 1210 is configured to acquire the three-dimensional model of the target to be reconstructed, the three-dimensional model being obtained by the three-dimensional reconstruction apparatus in any of the above embodiments; the mapping and positioning part 1220 is configured to construct a three-dimensional map of the scene where the camera device is located using a preset visual-inertial navigation method and to obtain the current pose information of the camera device in the three-dimensional map; and the display interaction part 1230 is configured to display the three-dimensional model in the scene image currently captured by the camera device based on the pose information.
Please refer to FIG. 13, which is a schematic frame diagram of an embodiment of a measurement apparatus 1300 based on three-dimensional reconstruction of the present disclosure.
The measurement apparatus 1300 based on three-dimensional reconstruction includes a model acquisition part 1310, a display interaction part 1320, and a distance acquisition part 1330. The model acquisition part 1310 is configured to acquire the three-dimensional model of the target to be reconstructed, the three-dimensional model being obtained by the three-dimensional reconstruction apparatus in any of the above embodiments; the display interaction part 1320 is configured to receive multiple measurement points set by the user on the three-dimensional model; and the distance acquisition part 1330 is configured to obtain the distances between the multiple measurement points and thereby the distances between the positions on the target to be reconstructed corresponding to the multiple measurement points.
Please refer to FIG. 14, which is a schematic frame diagram of an embodiment of an electronic device 1400 of the present disclosure.
The electronic device 1400 includes a memory 1410 and a processor 1420 coupled to each other, and the processor 1420 is configured to execute the program instructions stored in the memory 1410 to implement the steps in any of the foregoing three-dimensional reconstruction method embodiments, or the steps in any of the foregoing interaction method embodiments based on three-dimensional reconstruction, or the steps in any of the foregoing measurement method embodiments based on three-dimensional reconstruction. The electronic device may be a mobile terminal such as a mobile phone or tablet computer, or a data processing device (such as a microcomputer) connected to a camera device, which is not limited here.
The processor 1420 may also be referred to as a CPU (Central Processing Unit). The processor 1420 may be an integrated circuit chip with signal processing capability, and may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. A general-purpose processor may be a microprocessor, or any conventional processor. In addition, the processor 1420 may be implemented jointly by multiple integrated circuit chips.
The above solution can improve the effect of three-dimensional reconstruction and reduce the computational load of three-dimensional reconstruction.
Please refer to FIG. 15, which is a schematic frame diagram of an embodiment of a computer-readable storage medium 1500 of the present disclosure.
The computer-readable storage medium 1500 stores program instructions 1501 executable by a processor, the program instructions 1501 being used to implement the steps in any of the foregoing three-dimensional reconstruction method embodiments, or the steps in any of the foregoing interaction method embodiments based on three-dimensional reconstruction, or the steps in any of the foregoing measurement method embodiments based on three-dimensional reconstruction.
The above solution can improve the effect of three-dimensional reconstruction and reduce the computational load of three-dimensional reconstruction.
In the several embodiments provided in the present disclosure, the disclosed methods and apparatuses may be implemented in other manners. The apparatus implementations described above are only illustrative; for example, the division into modules or units is only a division by logical function, and other divisions are possible in actual implementation: units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be realized through certain interfaces, and the indirect couplings or communication connections between apparatuses or units may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this implementation.
In addition, each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above integrated units may be implemented in the form of hardware or in the form of software functional units.
If the integrated unit is implemented as a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the present disclosure, in essence or in the part contributing to the prior art, or in whole or in part, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or some of the steps of the methods of the various embodiments of the present disclosure. The aforementioned storage medium includes media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
In the present disclosure, multiple frames of images to be processed obtained by the camera device scanning the target to be reconstructed are acquired; each frame of the image to be processed and the calibration parameters of the camera device are used to determine the target pixels of each frame belonging to the target to be reconstructed and its camera pose parameters; the image data of each frame of the image to be processed is divided in turn into corresponding data sets; the image data of each data set, together with the image data and pose optimization parameters of the data sets preceding it in time sequence, is used to determine the pose optimization parameters of the data set; the pose optimization parameters of the data set are used to adjust the camera pose parameters of the images to be processed to which the image data contained in the data set belong; and the image data of the images to be processed is reconstructed to obtain the three-dimensional model of the target to be reconstructed. The above solution can improve the effect of three-dimensional reconstruction and reduce the computational load of three-dimensional reconstruction.


Abstract

A three-dimensional reconstruction method, related interaction and measurement methods, and related apparatuses and devices. The three-dimensional reconstruction method includes: acquiring multiple frames of images to be processed obtained by a camera device scanning a target to be reconstructed; using each frame of the image to be processed and the calibration parameters of the camera device to determine the target pixels of each frame belonging to the target to be reconstructed and its camera pose parameters; dividing the image data of each frame of the image to be processed in turn into corresponding data sets; using the image data of a data set, together with the image data and pose optimization parameters of the data sets preceding it in time sequence, to determine the pose optimization parameters of the data set; using the pose optimization parameters of the data set to adjust the camera pose parameters of the images to be processed to which the image data contained in the data set belong; and reconstructing the image data of the images to be processed to obtain a three-dimensional model of the target to be reconstructed. The above solution can improve the effect of three-dimensional reconstruction and reduce the computational load of three-dimensional reconstruction.

Description

Three-Dimensional Reconstruction and Related Interaction and Measurement Methods, and Related Apparatuses and Devices
Cross-Reference to Related Applications
The present disclosure is based on the Chinese patent application No. 202110031502.0, filed on January 11, 2021 and entitled "三维重建及相关交互、测量方法和相关装置、设备" (Three-dimensional reconstruction and related interaction and measurement methods, and related apparatuses and devices), and claims priority to that Chinese patent application, the entire contents of which are incorporated herein by reference.
请参阅图5,图5是图1中步骤S14一实施例的流程示意图。其中,图5是确定数据集合的位姿优化参数一实施例的流程示意图,可以包括如下步骤:
步骤S141:依次将每一数据集合作为当前数据集合,并选取至少一个时序位于当前数据集合之前的数据集合,作为候选数据集合。
仍以上述已有的数据集合A、数据集合B、数据集合C为例,在确定数据集合B的位姿优化参数时,可以将数据集合B作为当前数据集合,在确定数据集合C的位姿优化参数时,可以将数据集合C作为当前数据集合。此外,还可以在创建一新的数据集合时,即确定新创建的数据集合的前一个数据集合的位姿优化参数,如前述实施例中,在对图像数据10(属于待处理图像10)进行划分时,当前的末尾数据集合C满足预设溢出条件,则新创建一新的数据集合D,此时即可将数据集合C作为当前数据集合,并确定其位姿优化参数。
在一个实施场景中,为了提高位姿优化参数的准确性,从而提高三维重建效果,可以从位于当前数据集合之前的数据集合中,选择图像数据较为相似的作为候选数据集合,其中,请结合参阅图6,图6是图5中步骤S141一实施例的流程示意图,可以包括如下步骤:
步骤S61:利用当前数据集合及时序位于其之前的数据集合中的图像数据的预设图像特征,构建词袋模型。
预设图像特征可以包括ORB(Oriented FAST and Rotated BRIEF)图像特征,可以对图像数据中的关键点快速创建特征向量,且特征向量可以用于识别图像数据中的待重建目标,其中,Fast和Brief分别是特征检测算法和向量创建算法,具体在此不再赘述。
词袋模型(Bag of Words)是在自然语言处理和信息检索下被简化的表达模型,词袋模型中各个预设图像特征都是独立的,具体在此不再赘述。在一个实施场景中,在开始创建一个新的数据集合时,即可将其前一个数据集合作为当前数据集合,并提取当前数据集合中图像数据的预设图像特征,加入到词袋模型中,如此循环,可以使得词袋模型实现递增式扩充。在一个实施场景中,当前数据集合与其前一数据集合之间存在重复的图像数据,故在对当前数据集合中图像数据的预设图像特征进行提取时,与前一数据集合重复的图像数据不再进行特征提取。
步骤S62:选取所属的待处理图像位于当前数据集合中的预设时序处的图像数据,作为待匹配图像数据。
在一个实施场景中,预设时序可以包括首位、中位和末位,仍以前述实施例中的数据集合C为例,数据集合C包含图像数据05(属于待处理图像05)、图像数据06(属于待处理图像06)、图像数据07(属于待处理图像07)、图像数据08(属于待处理图像08)、图像数据09(属于待处理图像09),则可以选取首位待处理图像05的图像数据05、中位待处理图像07的图像数据07和末位待处理图像09的图像数据09,作为待匹配图像数据,其他实施场景可以以此类推,在此不再一一举例。此外,预设时序也可以实际情况,设置为首位、1/4时序位置、1/2时序位置、3/4时序位置、末位,在此不做限定。
步骤S63:从词袋模型的预设范围中,查询与待匹配图像数据的预设图像特征之间的相似度评分大于一预设相似度阈值的预设图像特征。
预设范围可以包括所属的数据集合与当前数据集合不相邻,且不包含于当前数据集合中的图像数据的预设图像特征。仍以前述实施例中的数集合A、数据集合B和数据集合C为例,在当前数据集合为数据集合C时,预设范围可以是属于数据集合A和属于数据集合B的预设图像特征。在一个实施场景中,预设相似度阈值可以是一个预设评分值,例如,0.018、0.019、0.020等等,在此不做限定。在另一个实施场景中,还可以获取与当前数据集合相邻的数据集合中各个图像数据与待匹配图像数据之间的相似度评分中最大评分值score adj,并将最大评分值score adj的预设倍数(例如,1.5倍、2倍、2.5倍)作为预设相似度阈值。在又一个实施场景中,还可以将最大评分值score adj的预设倍数、上述预设评分值中的任一者作为预设相似度阈值,即可以从词袋模型的预设范围中,查询与待匹配图像数据的预设图像特征之间的相似度评分score loop大于最大评分值score adj的预设倍数、上述预设评分值中的任一者的预设图像特征,在此不做限定。
步骤S64:将查询到的预设图像特征所属的图像数据所在的数据集合,以及与当前数据集合相邻的数据集合,作为候选数据集合。
以当前数据集合为数据集合H为例,利用位于首位的待匹配图像数据,查询到数据集合C、数据集合D,利用位于中位的待匹配图像数据,查询到数据集合D、数据集合E,利用位于末位的待匹配图像数据,查询到数据集合E、数据集合F,则可以将数据集合C~F和数据集合G作为当前数据集合H的候选数据集合。在一个实施场景中,还可以从查询到的预设图像特征所属的图像数据所在的数据集合中选取相似度评分最大的预设数量(例如,2个、3个等)个数据集合,以及与当前数据集合相邻的数据集合,作为候选数据集合。仍以当前数据集合为数据集合H为例,可以从数据集合C~F中选取相似度评分score loop最大的3个,以及与当前数据集合相邻的数据集合G,作为候选数据集合。
步骤S142:利用当前数据集合的图像数据和候选数据集合的图像数据,确定当前数据集合和候选数据集合之间的空间变换参数。
在一个实施场景中,为了确保当前数据集合和候选数据集合之间的空间变换参数的准确性,以提高位姿优化参数的准确性,从而提高三维重建的效果,可以结合当前数据集合和候选数据集合的图像数据的图像特征以及在三维空间中的位置,确定两者之间的空间变换参数,其中,请结合参阅图7,图7是图5中步骤S142一实施例的流程示意图,可以包括如下步骤:
步骤S71:在候选数据集合和当前数据集合中搜索一组满足预设匹配条件的待匹配图像数据。
预设匹配条件可以包括待匹配图像数据所属的待处理图像的相机朝向角度之间的差异最小,其中,对于每一个候选数据集合,都可以从其和当前数据集合中搜素一组满足预设匹配条件的待匹配图像数据,为了便于描述,可以将属于当前数据集合的待匹配图像数据记为I cur,将属于候选数据集合的待匹配图像数据记为I similar
步骤S72:基于从每组待匹配图像数据中提取得到的预设图像特征,得到每组待匹配图像数据之间的匹配像素点对。
可以结合RANSAC算法,对I cur和I similar的预设图像特征(例如,ORB图像特征)进行匹配对筛选,得到I cur和I similar之间的匹配像素点,为了便于描述,可以分别记为p cur和p similar。关于RANSAC算法可以参阅前述实施例中的相关步骤,在此不再赘述。
步骤S73:将匹配像素点对中属于当前数据集合的像素点映射至三维空间,得到第一三维匹配点,并将匹配像素点对中属于候选数据集合的像素点映射至三维空间,得到第二三维匹配点。
将p cur映射至三维空间,得到第一三维匹配点,为了便于描述,记为P cur,将p similar映射至三维空间,得到第二三维匹配点,为了便于描述,记为P similar。其中,可以分别将p cur和p similar转换为三维齐次坐标,再利用内部参数K的逆K -1分别左乘p cur和p similar的三维齐次坐标,得到第一三维匹配点P cur和第二三维匹配点P similar
步骤S74:将第一三维匹配点和第二三维匹配点进行对齐处理,得到空间变换参数。
其中,可以将第一三维匹配点和第二三维匹配点在三维空间对齐,以使两者之间的重合度尽可能地大,从而得到两者之间的空间变换参数。在一个实施场景中,可以获取第一三维匹配点和第二三维匹配点之间的第一位姿变换参数,其中,可以利用第一三维匹配点和第二三维匹配点构建关于第一位姿变换参数的目标函数,再利用SVD(Singular Value Decomposition,奇异值分解)或非线下优化等方式求解目标函数,得到第一位姿变换参数T pcd
Figure PCTCN2021102882-appb-000032
公式(11)中,
Figure PCTCN2021102882-appb-000033
Figure PCTCN2021102882-appb-000034
分别表示三维空间中第i对匹配三维点。
通过求解上述目标函数,得到第一位姿变换参数T pcd后,还可以利用第一位姿变换参数T pcd和预设位姿变换参数(例如,单位矩阵),对第一三维匹配点进行位姿优化,分别得到第一优化匹配点和第二优化匹配点,其中,可以利用第一位姿变换参数T pcd和预设位姿变换参数分别左乘第一三维匹配点P cur,从而分别得到第一优化匹配点和第二优化匹配点,为了便于描述,可以分别记为
Figure PCTCN2021102882-appb-000035
Figure PCTCN2021102882-appb-000036
再计算第二三维匹配点P similar分别与第一优化匹配点
Figure PCTCN2021102882-appb-000037
第二优化匹配点
Figure PCTCN2021102882-appb-000038
之间的重合度,并选取重合度较 高的优化匹配点所采用的位姿变换参数,作为第二位姿变换参数,为了便于描述,可以记为T select。其中,在计算第二三维匹配点P similar和第一优化匹配点
Figure PCTCN2021102882-appb-000039
之间的重合度时,可以在每一第二三维匹配点P similar预设范围内(例如,5厘米范围)查找第一优化匹配点
Figure PCTCN2021102882-appb-000040
若能够查找到,则将第二三维匹配点P similar标记为有效,否则,可以标记为无效,对所有第二三维匹配点P similar查找完毕后,计算标记为有效的第二三维匹配点P similar的数量占第二三维匹配点P similar总数的比例,即为第二三维匹配点P similar和第一优化匹配点
Figure PCTCN2021102882-appb-000041
之间的重合度,第二三维匹配点P similar和第二优化匹配点
Figure PCTCN2021102882-appb-000042
之间的重合度可以以此类推,在此不再赘述。
在求得第二位姿变换参数T select之后,可以以第二位姿变换参数T select作为初始值,利用预设对齐方式(例如,point-to-normal的ICP方式)将第一三维匹配点P cur和第二三维匹配点P similar进行对齐处理,得到当前数据集合与候选数据集合之间的空间变换参数,为了便于描述,记为T icp。重复上述步骤,即可得到当前数据集合与每一候选数据集合之间的空间变换参数T icp
Step S143: obtain the pose optimization parameters of the current data set by using at least the pose optimization parameters of the candidate data sets and the spatial transformation parameters between the current data set and the candidate data sets, and update at least the pose optimization parameters of the candidate data sets.
In one implementation scenario, in order to improve the accuracy of the pose optimization parameters, the above spatial transformation parameters T_icp can also be screened before the pose optimization parameters of the current data set are solved. Specifically, from the spatial transformation parameters T_icp between the current data set and each candidate data set, the spatial transformation parameters that meet a preset screening condition can be selected for use in solving the pose optimization parameters of the current data set. The preset screening condition may include: the candidate data set related to the spatial transformation parameters T_icp is adjacent to the current data set; or, the degree of coincidence between the optimized match points obtained by performing pose optimization on the first three-dimensional match points P_cur with the spatial transformation parameters T_icp, and the second three-dimensional match points P_similar, is greater than a preset coincidence threshold (for example, 60%, 65%, 70%, etc.). Here, the spatial transformation parameters T_icp can be left-multiplied with the first three-dimensional match points P_cur to perform the pose optimization on them.
Specifically, the pose optimization parameters of the candidate data sets and the spatial transformations between the current data set and the candidate data sets can be used to construct an objective function with respect to the pose optimization parameters of the current data set, the pose optimization parameters of the current data set are obtained by solving the objective function, and at least the pose optimization parameters of the candidate data sets are updated. Furthermore, by repeating this cycle and taking in turn the data set preceding each newly created data set as the current data set, the pose optimization parameters can be solved while the target to be reconstructed is being scanned and the data sets are being created, which helps balance the computation amount, lighten the computational load, and realize real-time, online three-dimensional reconstruction of the target to be reconstructed. In one implementation scenario, referring to FIG. 8, FIG. 8 is a schematic flowchart of an embodiment of step S143 in FIG. 5, which may include the following steps:
Step S81: take, as a data set pair, the two data sets corresponding to each of the spatial transformation parameters related to the current data set and to the candidate data sets preceding it in time sequence.
Still taking the current data set being data set H as an example, with data sets C to F and data set G as the candidate data sets of the current data set H: the candidate data set C and the current data set H corresponding to the spatial transformation parameters $T_{CH}$ are taken as one data set pair; the candidate data set D and the current data set H corresponding to the spatial transformation parameters $T_{DH}$ are taken as one data set pair; the candidate data set E and the current data set H corresponding to the spatial transformation parameters $T_{EH}$ are taken as one data set pair; the candidate data set F and the current data set H corresponding to the spatial transformation parameters $T_{FH}$ are taken as one data set pair; and the candidate data set G and the current data set H corresponding to the spatial transformation parameters $T_{GH}$ are taken as one data set pair. In addition, spatial transformation parameters also exist correspondingly for each of the data sets preceding the current data set H (i.e., data sets A to G); for example, for data set B, spatial transformation parameters with data set A may exist, so data set B and data set A can be taken as one data set pair; for data set C, spatial transformation parameters with data set A and with data set B may exist respectively, so data set C and data set A can be taken as one data set pair and data set C and data set B as another, and so on, which is not enumerated here. For how the spatial transformation parameters are solved, reference may be made to the relevant steps in the foregoing embodiments.
Step S82: construct an objective function with respect to the pose optimization parameters by using the spatial transformation parameters of each data set pair and their respective pose optimization parameters.
The objective function may be expressed as:

$$\{T_{k}\}=\operatorname*{arg\,min}_{\{T_{k}\}}\sum_{(i,j)}\left\|f\!\left(T_{i},T_{j},T_{ij}\right)\right\|^{2}\tag{12}$$

In formula (12), i and j respectively denote the numbers of the data sets contained in each data set pair (e.g., letters such as C, D, E, or alternatively Arabic numerals such as 1, 2, 3), $T_{ij}$ denotes the spatial transformation parameters between each data set pair, and $T_{i}$, $T_{j}$ respectively denote the pose optimization parameters of the data sets contained in each data set pair; f(·) denotes the optimization term, which may be expressed as:

$$f\!\left(T_{i},T_{j},T_{ij}\right)=T_{ij}-T_{j}\cdot T_{i}^{-1}\tag{13}$$

In formula (13), $T_{i}^{-1}$ denotes the inverse of the pose optimization parameters $T_{i}$. Hence, every time the spatial transformation parameters of a data set are determined, a new optimization relation can be brought to the objective function, so that the pose optimization parameters of the data sets preceding it are optimized once again, until the pose optimization parameters of all data sets have been determined. This helps eliminate the accumulated pose error during the scanning process, improve the accuracy of the pose optimization parameters, and enhance the effect of the three-dimensional reconstruction. In one implementation scenario, when the current data set is the first data set, its pose optimization parameters can be initialized to the identity matrix; reference may be made to the relevant steps in the foregoing embodiments, and details are not repeated here.
Step S83: solve the objective function by a preset solving method to obtain the pose optimization parameters of the data sets contained in the data set pairs respectively corresponding to the current data set and to the candidate data sets preceding it in time sequence.
As in the above formula, by minimizing the above objective function, the pose optimization parameters of the data sets contained in each data set pair can be obtained. Still taking the current data set being data set H as an example, by solving the above objective function, the pose optimization parameters of the current data set H can be obtained, together with the further optimized pose optimization parameters of data sets C to G and of the data sets preceding the current data set H. When a new data set I is introduced and the spatial transformation parameters related to it have been solved, the pose optimization parameters of data set I can be obtained by constructing the objective function, together with the further optimized pose optimization parameters of the data sets preceding it; repeating this cycle can further help eliminate the accumulated pose error.
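A toy version of this optimization, assuming the residual flavour f(T_i, T_j, T_ij) = T_ij - T_j T_i^{-1} reconstructed above, gauge-fixing the first data set, and parameterizing each pose as an axis-angle rotation plus translation (all illustrative assumptions, not the disclosed solver):

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def to_matrix(x):
    """6-vector (axis-angle rotation, translation) -> 4x4 pose matrix."""
    T = np.eye(4)
    T[:3, :3] = Rotation.from_rotvec(x[:3]).as_matrix()
    T[:3, 3] = x[3:]
    return T

def optimize_pose_graph(n_sets, constraints):
    """Minimise formula (12) over the per-data-set pose optimization params.

    constraints -- list of (i, j, T_ij) with T_ij the 4x4 spatial transform
                   between data sets i and j.
    """
    def residuals(x):
        poses = [np.eye(4)]                       # data set 0 is gauge-fixed
        poses += [to_matrix(x[6 * k: 6 * k + 6]) for k in range(n_sets - 1)]
        res = []
        for i, j, T_ij in constraints:
            err = T_ij - poses[j] @ np.linalg.inv(poses[i])
            res.append(err[:3, :].ravel())        # top 3 rows carry R and t
        return np.concatenate(res)

    x0 = np.zeros(6 * (n_sets - 1))               # identity initialisation
    sol = least_squares(residuals, x0)
    return [np.eye(4)] + [to_matrix(sol.x[6 * k: 6 * k + 6])
                          for k in range(n_sets - 1)]
```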
Different from the foregoing embodiments, by taking each data set in turn as the current data set and selecting at least one data set preceding the current data set as a candidate data set, the spatial transformation parameters between the current data set and the candidate data set are determined by using the image data of the current data set and the image data of the candidate data set; the pose optimization parameters of the current data set are then obtained by using at least the pose optimization parameters of the candidate data set and the spatial transformation parameters between the current data set and the candidate data set, and at least the pose optimization parameters of the candidate data set are updated. This helps eliminate the error of the camera pose parameters accumulated during the scanning process and reduce the amount of data used for computing the pose optimization parameters, thereby helping lower the computational load.
Referring to FIG. 9, FIG. 9 is a schematic flowchart of an embodiment of the interaction method based on three-dimensional reconstruction of the present disclosure, which may include the following steps:
Step S91: obtain a three-dimensional model of the target to be reconstructed.
The three-dimensional model can be obtained through the steps in any of the above three-dimensional reconstruction method embodiments; reference may be made to the foregoing three-dimensional reconstruction method embodiments, and details are not repeated here.
Step S92: construct a three-dimensional map of the scene where the camera device is located by a preset visual-inertial navigation method, and obtain the current pose information of the camera device in the three-dimensional map.
The preset visual-inertial navigation method may include SLAM (Simultaneous Localization and Mapping). Through SLAM, a three-dimensional map of the scene where the camera device (for example, a mobile phone, a tablet computer, etc.) is located can be constructed, and the current pose information of the camera device in the three-dimensional map can be obtained.
In one implementation scenario, in order to realize dynamic interaction with the three-dimensional model, skeleton binding may also be performed on the three-dimensional model. Skeleton binding refers to setting up a skeletal system for the three-dimensional model so that it can move at the skeletal joints according to established rules; for example, if the three-dimensional model is a quadruped such as a cow or a sheep, after skeleton binding its skeletal joints can move according to the established rules for quadrupeds.
Step S93: display the three-dimensional model in the scene image currently captured by the camera device based on the pose information.
The pose information may include the position and orientation of the camera device. For example, when the pose information of the camera device indicates that it faces the ground, the top of the three-dimensional model may be displayed in the scene image currently captured by the camera device; or, when the pose information of the camera device indicates that its orientation forms an acute angle with the ground, the side of the three-dimensional model may be displayed in the scene image currently captured by the camera device. In one implementation scenario, after skeleton binding has been performed on the three-dimensional model, driving instructions input by the user may also be received, so that the three-dimensional model can move according to the driving instructions input by the user; for example, if the three-dimensional model is a sheep, the user can drive it to lower its head, walk, and so on, which is not limited here. When the three-dimensional model is a person or another object, this can be deduced by analogy and is not enumerated here.
In the above solution, based on the pose information of the camera device in the three-dimensional map of the scene where it is located, the three-dimensional model of the target to be reconstructed is displayed in the currently captured scene image, which can realize geometrically consistent fusion of the virtual object and the real scene; and since the three-dimensional model is obtained by the three-dimensional reconstruction method in the above first aspect, the effect of the three-dimensional reconstruction can be improved, thereby improving the effect of the geometrically consistent virtual-real fusion, which is conducive to improving the user experience.
Referring to FIG. 10, FIG. 10 is a schematic flowchart of an embodiment of the measurement method based on three-dimensional reconstruction of the present disclosure, which may include the following steps:
Step S1010: obtain a three-dimensional model of the target to be reconstructed.
The three-dimensional model can be obtained through the steps in any of the above three-dimensional reconstruction method embodiments; reference may be made to the foregoing three-dimensional reconstruction method embodiments, and details are not repeated here.
Step S1020: receive a plurality of measurement points set by a user on the three-dimensional model.
The user can set a plurality of measurement points on the three-dimensional model by mouse click, keyboard input, or touch on the display. The number of measurement points may be two, three, four, and so on, which is not limited here. Referring to FIG. 2, taking the target to be reconstructed being a plaster portrait as an example, the user may set measurement points at the centers of the two eyes of the three-dimensional model 29, or may set measurement points at the nasion and the philtrum of the three-dimensional model 29, or may set measurement points at the centers of the two eyes and at the philtrum of the three-dimensional model 29, which is not enumerated here.
Step S1030: obtain the distances between the plurality of measurement points, to obtain the distances between the positions on the target to be reconstructed corresponding to the plurality of measurement points.
Referring to FIG. 2, still taking the target to be reconstructed being a plaster portrait as an example, by obtaining the distance between the centers of the two eyes of the three-dimensional model 29, the distance between the centers of the two eyes on the plaster portrait can be obtained; or, by obtaining the distance between the nasion and the philtrum of the three-dimensional model 29, the distance between the nasion and the philtrum on the plaster portrait can be obtained; or, by obtaining the pairwise distances between the centers of the two eyes and the philtrum of the three-dimensional model 29, the pairwise distances between the two eyes and the philtrum on the plaster portrait can be obtained, which helps improve the convenience of measuring objects in a real scene.
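Computing the distances themselves is elementary once the measurement points carry three-dimensional coordinates on the model; a minimal sketch (the helper name is hypothetical):

```python
import numpy as np

def pairwise_distances(points):
    """Distances between every pair of user-placed measurement points.

    points -- (N, 3) array of 3-D coordinates picked on the reconstructed
              model; returns a dict {(i, j): distance}.
    """
    points = np.asarray(points, dtype=float)
    return {(i, j): float(np.linalg.norm(points[i] - points[j]))
            for i in range(len(points)) for j in range(i + 1, len(points))}
```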
In the above solution, by receiving a plurality of measurement points set by the user on the three-dimensional model, the distances between the plurality of measurement points are obtained, and the distances between the positions on the target to be reconstructed corresponding to the plurality of measurement points are thereby obtained, which can satisfy the need to measure objects in a real scene; and since the three-dimensional model is obtained by the three-dimensional reconstruction method in the above first aspect, the effect of the three-dimensional reconstruction can be improved, thereby improving the measurement accuracy.
Referring to FIG. 11, FIG. 11 is a schematic frame diagram of an embodiment of a three-dimensional reconstruction apparatus 1100 of the present disclosure. The three-dimensional reconstruction apparatus 1100 includes an image acquisition part 1110, a first determination part 1120, a data division part 1130, a second determination part 1140, a parameter adjustment part 1150, and a model reconstruction part 1160. The image acquisition part 1110 is configured to acquire multiple frames of images to be processed obtained by a camera device scanning a target to be reconstructed; the first determination part 1120 is configured to determine, by using each frame of image to be processed and the calibration parameters of the camera device, the target pixel points of each frame of image to be processed that belong to the target to be reconstructed, together with its camera pose parameters; the data division part 1130 is configured to divide, according to a preset division strategy, the image data of each frame of image to be processed in turn into corresponding data sets, where the image data at least includes the target pixel points; the second determination part 1140 is configured to determine the pose optimization parameters of each data set by using in turn the image data of each data set together with the image data and pose optimization parameters of the data sets preceding it in time sequence; the parameter adjustment part 1150 is configured to adjust, by using the pose optimization parameters of each data set, the camera pose parameters of the images to be processed to which the image data contained in the data set belongs; the model reconstruction part 1160 is configured to perform reconstruction processing on the image data of the images to be processed by using a preset three-dimensional reconstruction method and the adjusted camera pose parameters of the images to be processed, obtaining the three-dimensional model of the target to be reconstructed.
In some embodiments, the second determination part 1140 includes a data set selection subpart configured to take each data set in turn as the current data set and select at least one data set preceding the current data set as a candidate data set; the second determination part 1140 further includes a spatial transformation parameter subpart configured to determine the spatial transformation parameters between the current data set and the candidate data set by using the image data of the current data set and the image data of the candidate data set; the second determination part 1140 further includes a pose optimization parameter subpart configured to obtain the pose optimization parameters of the current data set by using at least the pose optimization parameters of the candidate data set and the spatial transformation parameters between the current data set and the candidate data set, and to update at least the pose optimization parameters of the candidate data set.
In some embodiments, the pose optimization parameter subpart includes a data set pair part configured to take, as a data set pair, the two data sets corresponding to each of the spatial transformation parameters related to the current data set and to the data sets preceding it in time sequence; the pose optimization parameter subpart further includes an objective function construction part configured to construct an objective function with respect to the pose optimization parameters by using the spatial transformation parameters of each data set pair and their respective pose optimization parameters; the pose optimization parameter subpart further includes an objective function solving part configured to solve the objective function by a preset solving method to obtain the pose optimization parameters of the data sets contained in the data set pairs respectively corresponding to the current data set and to the data sets preceding it in time sequence.
In some embodiments, the spatial transformation parameter subpart includes an image data search part configured to search the candidate data set and the current data set for a group of image data to be matched that satisfies a preset matching condition; the spatial transformation parameter subpart further includes a matched pixel point selection part configured to obtain matched pixel point pairs between each group of image data to be matched based on the preset image features extracted from each group of image data to be matched; the spatial transformation parameter subpart further includes a three-dimensional space mapping part configured to map the pixel points belonging to the current data set in the matched pixel point pairs to three-dimensional space to obtain first three-dimensional match points, and to map the pixel points belonging to the candidate data set in the matched pixel point pairs to three-dimensional space to obtain second three-dimensional match points; the spatial transformation parameter subpart further includes a three-dimensional match point alignment part configured to align the first three-dimensional match points and the second three-dimensional match points to obtain the spatial transformation parameters.
In some embodiments, the three-dimensional match point alignment part includes a first pose transformation parameter subpart configured to obtain first pose transformation parameters between the first three-dimensional match points and the second three-dimensional match points; the three-dimensional match point alignment part further includes a three-dimensional match point optimization subpart configured to perform pose optimization on the first three-dimensional match points by using the first pose transformation parameters and preset pose transformation parameters, obtaining first optimized match points and second optimized match points respectively; the three-dimensional match point alignment part further includes a second pose transformation parameter subpart configured to compute the degrees of coincidence between the second three-dimensional match points and the first optimized match points and the second optimized match points respectively, and to select the pose transformation parameters adopted by the optimized match points with the higher degree of coincidence as second pose transformation parameters; the three-dimensional match point alignment part further includes a spatial transformation parameter subpart configured to align the first three-dimensional match points and the second three-dimensional match points by a preset alignment method with the second pose transformation parameters as the initial value, obtaining the spatial transformation parameters between the current data set and the candidate data set.
In some embodiments, the spatial transformation parameter subpart further includes a transformation parameter screening part configured to select, from the spatial transformation parameters between the current data set and each candidate data set, the spatial transformation parameters that meet a preset parameter screening condition, where the preset parameter screening condition includes either of the following: the candidate data set related to the spatial transformation parameters is adjacent to the current data set; or the degree of coincidence between the optimized match points obtained by performing pose optimization on the first three-dimensional match points with the spatial transformation parameters, and the second three-dimensional match points, is greater than a preset coincidence threshold.
In some embodiments, the data set selection subpart includes a bag-of-words model construction part configured to construct a bag-of-words model by using the preset image features of the image data in the current data set and in the data sets preceding it in time sequence; the data set selection subpart further includes an image-data-to-be-matched part configured to select, as the image data to be matched, the image data whose image to be processed is located at a preset time-sequence position in the current data set; the data set selection subpart further includes an image feature query part configured to query, from a preset range of the bag-of-words model, the preset image features whose similarity score with the preset image features of the image data to be matched is greater than a preset similarity threshold; the data set selection subpart further includes a candidate data set part configured to take the data sets containing the image data to which the queried preset image features belong, as well as the data set adjacent to the current data set, as the candidate data sets, where the preset range includes the preset image features of image data whose data set is not adjacent to the current data set and is not contained in the current data set.
In some embodiments, the data set selection subpart further includes a maximum similarity score acquisition part configured to obtain the maximum score value among the similarity scores between each piece of image data in the data set adjacent to the current data set and the image data to be matched; the data set selection subpart further includes a preset similarity threshold determination part configured to take either a preset multiple of the maximum score value or a preset score value as the preset similarity threshold.
In some embodiments, the data division part 1130 includes a current-image-to-be-processed determination subpart configured to take each frame of image to be processed in turn as the current image to be processed; the data division part 1130 further includes a data processing subpart configured to, when dividing the image data of the current image to be processed, if the last data set among the existing data sets satisfies a preset overflow condition, obtain the image data of the latest multiple frames of images to be processed in the last data set, store it into a newly created data set as the new last data set, and divide the image data of the current image to be processed into the new last data set.
In some embodiments, the preset overflow condition includes any of the following: the number of frames of the images to be processed corresponding to the image data contained in the last data set is greater than or equal to a preset frame number threshold; the distance between the camera position of the image to be processed to which any image data in the last data set belongs and the camera position of the current image to be processed is greater than a preset distance threshold; the difference between the camera orientation angle of the image to be processed to which any image data in the last data set belongs and the camera orientation angle of the current image to be processed is greater than a preset angle threshold; where the camera position and the camera orientation angle are computed by using the camera pose parameters of the image to be processed.
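The overflow test could be sketched as below; the thresholds and the use of the camera optical axis for the orientation angle are illustrative assumptions:

```python
import numpy as np

def should_overflow(last_set_poses, cur_pose, max_frames=50,
                    max_dist=0.5, max_angle_deg=30.0):
    """Check the preset overflow condition for the last data set.

    last_set_poses -- list of 4x4 camera-to-world poses in the last data set
    cur_pose       -- 4x4 pose of the current image to be processed
    """
    if len(last_set_poses) >= max_frames:
        return True
    cur_pos, cur_axis = cur_pose[:3, 3], cur_pose[:3, 2]
    for T in last_set_poses:
        # camera position difference
        if np.linalg.norm(T[:3, 3] - cur_pos) > max_dist:
            return True
        # orientation angle difference via the optical-axis directions
        cosang = np.clip(np.dot(T[:3, 2], cur_axis), -1.0, 1.0)
        if np.degrees(np.arccos(cosang)) > max_angle_deg:
            return True
    return False
```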
In some embodiments, each frame of image to be processed includes color data and depth data; the first determination part 1120 includes an included-angle acquisition subpart configured to obtain the included angle between the normal vector of each pixel point contained in the depth data aligned with the color data and the gravity direction of the image to be processed; the first determination part 1120 further includes a height acquisition subpart configured to project each pixel point in three-dimensional space onto the gravity direction to obtain the height value of each pixel point in three-dimensional space; the first determination part 1120 further includes a height analysis subpart configured to analyze the height values of the pixel points whose included angles satisfy a preset angle condition to obtain the plane height of the target to be reconstructed; the first determination part 1120 further includes a pixel screening subpart configured to screen, by using the plane height, the target pixel points in the color data that belong to the target to be reconstructed.
In some embodiments, the height analysis subpart includes a height set acquisition part configured to take the height values of the pixel points whose included angles satisfy the preset angle condition as a height set; the height analysis subpart further includes a height clustering analysis part configured to perform clustering analysis on the height values in the height set to obtain the plane height of the target to be reconstructed.
In some embodiments, the three-dimensional reconstruction apparatus 1100 further includes a three-dimensional mapping part configured to map the image data in each data set in turn to three-dimensional space to obtain the three-dimensional point cloud corresponding to each data set; the three-dimensional reconstruction apparatus 1100 further includes a point cloud adjustment part configured to adjust the three-dimensional point cloud corresponding to each data set by using its pose optimization parameters.
Referring to FIG. 12, FIG. 12 is a schematic frame diagram of an embodiment of an interaction apparatus 1200 based on three-dimensional reconstruction of the present disclosure. The interaction apparatus 1200 based on three-dimensional reconstruction includes a model acquisition part 1210, a mapping and localization part 1220, and a display interaction part 1230. The model acquisition part 1210 is configured to obtain a three-dimensional model of the target to be reconstructed, where the three-dimensional model is obtained by the three-dimensional reconstruction apparatus in any of the above three-dimensional reconstruction apparatus embodiments; the mapping and localization part 1220 is configured to construct a three-dimensional map of the scene where the camera device is located by a preset visual-inertial navigation method, and to obtain the current pose information of the camera device in the three-dimensional map; the display interaction part 1230 is configured to display the three-dimensional model in the scene image currently captured by the camera device based on the pose information.
Referring to FIG. 13, FIG. 13 is a schematic frame diagram of an embodiment of a measurement apparatus 1300 based on three-dimensional reconstruction of the present disclosure. The measurement apparatus 1300 based on three-dimensional reconstruction includes a model acquisition part 1310, a display interaction part 1320, and a distance acquisition part 1330. The model acquisition part 1310 is configured to obtain a three-dimensional model of the target to be reconstructed, where the three-dimensional model is obtained by the three-dimensional reconstruction apparatus in any of the above three-dimensional reconstruction apparatus embodiments; the display interaction part 1320 is configured to receive a plurality of measurement points set by a user on the three-dimensional model; the distance acquisition part 1330 is configured to obtain the distances between the plurality of measurement points, to obtain the distances between the positions on the target to be reconstructed corresponding to the plurality of measurement points.
Referring to FIG. 14, FIG. 14 is a schematic frame diagram of an embodiment of an electronic device 1400 of the present disclosure. The electronic device 1400 includes a memory 1410 and a processor 1420 coupled to each other, and the processor 1420 is configured to execute the program instructions stored in the memory 1410 to implement the steps in any of the above three-dimensional reconstruction method embodiments, or the steps in any of the above interaction method embodiments based on three-dimensional reconstruction, or the steps in any of the above measurement method embodiments based on three-dimensional reconstruction. In one implementation scenario, the electronic device may include a mobile terminal such as a mobile phone or a tablet computer, or the electronic device may also be a data processing device (such as a microcomputer) connected with a camera device, which is not limited here.
The processor 1420 may also be referred to as a CPU (Central Processing Unit). The processor 1420 may be an integrated circuit chip with signal processing capability. The processor 1420 may also be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. In addition, the processor 1420 may be implemented jointly by multiple integrated circuit chips.
The above solution can improve the effect of three-dimensional reconstruction and lighten the computational load of three-dimensional reconstruction.
Referring to FIG. 15, FIG. 15 is a schematic frame diagram of an embodiment of a computer-readable storage medium 1500 of the present disclosure. The computer-readable storage medium 1500 stores program instructions 1501 executable by a processor, and the program instructions 1501 are used to implement the steps in any of the above three-dimensional reconstruction method embodiments, or the steps in any of the above interaction method embodiments based on three-dimensional reconstruction, or the steps in any of the above measurement method embodiments based on three-dimensional reconstruction.
The above solution can improve the effect of three-dimensional reconstruction and lighten the computational load of three-dimensional reconstruction.
In the several embodiments provided in the present disclosure, it should be understood that the disclosed methods and apparatuses may be implemented in other ways. For example, the apparatus implementations described above are merely illustrative; for example, the division of modules or units is merely a logical functional division, and there may be other division methods in actual implementation; for example, units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, i.e., they may be located in one place or distributed over network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this implementation.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above integrated unit may be implemented either in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present disclosure, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the implementations of the present disclosure. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disc.
Industrial Applicability
In the embodiments of the present disclosure, multiple frames of images to be processed obtained by a camera device scanning a target to be reconstructed are acquired; the target pixel points of each frame of image to be processed that belong to the target to be reconstructed, together with its camera pose parameters, are determined by using each frame of image to be processed and the calibration parameters of the camera device; the image data of each frame of image to be processed is divided in turn into corresponding data sets; the pose optimization parameters of each data set are determined by using the image data of the data set together with the image data and pose optimization parameters of the data sets preceding it in time sequence; the camera pose parameters of the images to be processed to which the image data contained in the data set belongs are adjusted by using the pose optimization parameters of the data set; and reconstruction processing is performed on the image data of the images to be processed, obtaining the three-dimensional model of the target to be reconstructed. The above solution can improve the effect of three-dimensional reconstruction and lower the computational load of three-dimensional reconstruction.

Claims (34)

  1. A three-dimensional reconstruction method, comprising:
    acquiring multiple frames of images to be processed obtained by a camera device scanning a target to be reconstructed;
    determining, by using each frame of the image to be processed and calibration parameters of the camera device, target pixel points of each frame of the image to be processed that belong to the target to be reconstructed, together with its camera pose parameters;
    dividing, according to a preset division strategy, image data of each frame of the image to be processed in turn into corresponding data sets, wherein the image data at least comprises the target pixel points;
    determining pose optimization parameters of each of the data sets by using in turn the image data of each of the data sets together with the image data and pose optimization parameters of the data sets preceding it in time sequence;
    adjusting, by using the pose optimization parameters of each of the data sets, the camera pose parameters of the images to be processed to which the image data contained in the data set belongs;
    performing reconstruction processing on the image data of the images to be processed by using a preset three-dimensional reconstruction method and the adjusted camera pose parameters of the images to be processed, to obtain a three-dimensional model of the target to be reconstructed.
  2. The three-dimensional reconstruction method according to claim 1, wherein the determining pose optimization parameters of each of the data sets by using in turn the image data of each of the data sets together with the image data and pose optimization parameters of the data sets preceding it in time sequence comprises:
    taking each of the data sets in turn as a current data set, and selecting at least one data set preceding the current data set in time sequence as a candidate data set;
    determining spatial transformation parameters between the current data set and the candidate data set by using the image data of the current data set and the image data of the candidate data set;
    obtaining the pose optimization parameters of the current data set by using at least the pose optimization parameters of the candidate data set and the spatial transformation parameters between the current data set and the candidate data set, and updating at least the pose optimization parameters of the candidate data set.
  3. The three-dimensional reconstruction method according to claim 2, wherein the obtaining the pose optimization parameters of the current data set by using at least the pose optimization parameters of the candidate data set and the spatial transformation parameters between the current data set and the candidate data set, and updating at least the pose optimization parameters of the candidate data set comprises:
    taking, as a data set pair, the two data sets corresponding to each of the spatial transformation parameters related to the current data set and to the candidate data sets preceding it in time sequence;
    constructing an objective function with respect to the pose optimization parameters by using the spatial transformation parameters of each of the data set pairs and their respective pose optimization parameters;
    solving the objective function by a preset solving method to obtain the pose optimization parameters of the data sets contained in the data set pairs respectively corresponding to the current data set and to the candidate data sets preceding it in time sequence.
  4. The three-dimensional reconstruction method according to claim 2, wherein the determining spatial transformation parameters between the current data set and the candidate data set by using the image data of the current data set and the image data of the candidate data set comprises:
    searching the candidate data set and the current data set for a group of image data to be matched that satisfies a preset matching condition;
    obtaining matched pixel point pairs between each group of the image data to be matched, based on preset image features extracted from each group of the image data to be matched;
    mapping pixel points belonging to the current data set in the matched pixel point pairs to three-dimensional space to obtain first three-dimensional match points, and mapping pixel points belonging to the candidate data set in the matched pixel point pairs to the three-dimensional space to obtain second three-dimensional match points;
    aligning the first three-dimensional match points and the second three-dimensional match points to obtain the spatial transformation parameters.
  5. The three-dimensional reconstruction method according to claim 4, wherein the aligning the first three-dimensional match points and the second three-dimensional match points to obtain the spatial transformation parameters comprises:
    obtaining first pose transformation parameters between the first three-dimensional match points and the second three-dimensional match points;
    performing pose optimization on the first three-dimensional match points by using the first pose transformation parameters and preset pose transformation parameters, to obtain first optimized match points and second optimized match points respectively;
    computing degrees of coincidence between the second three-dimensional match points and the first optimized match points and the second optimized match points respectively, and selecting the pose transformation parameters adopted by the optimized match points with the higher degree of coincidence as second pose transformation parameters;
    aligning the first three-dimensional match points and the second three-dimensional match points by a preset alignment method with the second pose transformation parameters as an initial value, to obtain the spatial transformation parameters between the current data set and the candidate data set.
  6. The three-dimensional reconstruction method according to claim 4, wherein after the determining spatial transformation parameters between the current data set and the candidate data set by using the image data of the current data set and the image data of the candidate data set, and before the obtaining the pose optimization parameters of the current data set by using at least the pose optimization parameters of the candidate data set and the spatial transformation parameters between the current data set and the candidate data set, the method further comprises:
    selecting, from the spatial transformation parameters between the current data set and each of the candidate data sets, the spatial transformation parameters that meet a preset parameter screening condition;
    wherein the preset parameter screening condition comprises either of the following: the candidate data set related to the spatial transformation parameters is adjacent to the current data set; the degree of coincidence between the optimized match points obtained by performing pose optimization on the first three-dimensional match points with the spatial transformation parameters, and the second three-dimensional match points, is greater than a preset coincidence threshold.
  7. The three-dimensional reconstruction method according to claim 2, wherein the selecting at least one data set preceding the current data set in time sequence as a candidate data set comprises:
    constructing a bag-of-words model by using preset image features of the image data in the current data set and in the data sets preceding it in time sequence;
    selecting, as image data to be matched, the image data whose image to be processed is located at a preset time-sequence position in the current data set;
    querying, from a preset range of the bag-of-words model, preset image features whose similarity score with the preset image features of the image data to be matched is greater than a preset similarity threshold;
    taking the data sets containing the image data to which the queried preset image features belong, as well as the data set adjacent to the current data set, as the candidate data sets;
    wherein the preset range comprises the preset image features of image data whose data set is not adjacent to the current data set and is not contained in the current data set.
  8. The three-dimensional reconstruction method according to claim 7, wherein before the querying, from a preset range of the bag-of-words model, preset image features whose similarity score with the preset image features of the image data to be matched is greater than a preset similarity threshold, the method further comprises:
    obtaining a maximum score value among the similarity scores between each piece of the image data in the data set adjacent to the current data set and the image data to be matched;
    taking either a preset multiple of the maximum score value or a preset score value as the preset similarity threshold.
  9. The three-dimensional reconstruction method according to claim 1, wherein the dividing, according to a preset division strategy, image data of each frame of the image to be processed in turn into corresponding data sets comprises:
    taking each frame of the image to be processed in turn as a current image to be processed;
    when dividing the image data of the current image to be processed, if a last data set among the existing data sets satisfies a preset overflow condition, obtaining the image data of the latest multiple frames of the images to be processed in the last data set and storing it into a newly created data set as a new last data set, and dividing the image data of the current image to be processed into the new last data set.
  10. The three-dimensional reconstruction method according to claim 9, wherein the preset overflow condition comprises any of the following:
    the number of frames of the images to be processed corresponding to the image data contained in the last data set is greater than or equal to a preset frame number threshold; a distance between the camera position of the image to be processed to which any of the image data in the last data set belongs and the camera position of the current image to be processed is greater than a preset distance threshold; a difference between the camera orientation angle of the image to be processed to which any of the image data in the last data set belongs and the camera orientation angle of the current image to be processed is greater than a preset angle threshold;
    wherein the camera position and the camera orientation angle are computed by using the camera pose parameters of the image to be processed.
  11. The three-dimensional reconstruction method according to any one of claims 1 to 10, wherein each frame of the image to be processed comprises color data and depth data, and the determining, by using each frame of the image to be processed and the calibration parameters of the camera device, target pixel points of each frame of the image to be processed that belong to the target to be reconstructed comprises:
    obtaining an included angle between a normal vector of each pixel point contained in the depth data aligned with the color data and a gravity direction of the image to be processed;
    projecting each pixel point in three-dimensional space onto the gravity direction to obtain a height value of each pixel point in the three-dimensional space;
    analyzing the height values of the pixel points whose included angles satisfy a preset angle condition, to obtain a plane height of the target to be reconstructed;
    screening, by using the plane height, the target pixel points in the color data that belong to the target to be reconstructed.
  12. The three-dimensional reconstruction method according to claim 11, wherein the analyzing the height values of the pixel points whose included angles satisfy a preset angle condition, to obtain a plane height of the target to be reconstructed comprises:
    taking the height values of the pixel points whose included angles satisfy the preset angle condition as a height set;
    performing clustering analysis on the height values in the height set to obtain the plane height of the target to be reconstructed.
  13. The three-dimensional reconstruction method according to any one of claims 1 to 12, wherein after the determining pose optimization parameters of each of the data sets by using in turn the image data of each of the data sets together with the image data and pose optimization parameters of the data sets preceding it in time sequence, the method further comprises:
    mapping the image data in each of the data sets in turn to three-dimensional space to obtain a three-dimensional point cloud corresponding to each of the data sets;
    adjusting the three-dimensional point cloud corresponding to each of the data sets by using its pose optimization parameters.
  14. An interaction method based on three-dimensional reconstruction, comprising:
    obtaining a three-dimensional model of a target to be reconstructed, wherein the three-dimensional model is obtained by the three-dimensional reconstruction method according to any one of claims 1 to 13;
    constructing a three-dimensional map of a scene where a camera device is located by a preset visual-inertial navigation method, and obtaining current pose information of the camera device in the three-dimensional map;
    displaying the three-dimensional model in a scene image currently captured by the camera device based on the pose information.
  15. A measurement method based on three-dimensional reconstruction, comprising:
    obtaining a three-dimensional model of a target to be reconstructed, wherein the three-dimensional model is obtained by the three-dimensional reconstruction method according to any one of claims 1 to 13;
    receiving a plurality of measurement points set by a user on the three-dimensional model;
    obtaining distances between the plurality of measurement points, to obtain distances between positions on the target to be reconstructed corresponding to the plurality of measurement points.
  16. A three-dimensional reconstruction apparatus, comprising:
    an image acquisition part configured to acquire multiple frames of images to be processed obtained by a camera device scanning a target to be reconstructed;
    a first determination part configured to determine, by using each frame of the image to be processed and calibration parameters of the camera device, target pixel points of each frame of the image to be processed that belong to the target to be reconstructed, together with its camera pose parameters;
    a data division part configured to divide, according to a preset division strategy, image data of each frame of the image to be processed in turn into corresponding data sets, wherein the image data at least comprises the target pixel points;
    a second determination part configured to determine pose optimization parameters of each of the data sets by using in turn the image data of each of the data sets together with the image data and pose optimization parameters of the data sets preceding it in time sequence;
    a parameter adjustment part configured to adjust, by using the pose optimization parameters of each of the data sets, the camera pose parameters of the images to be processed to which the image data contained in the data set belongs;
    a model reconstruction part configured to perform reconstruction processing on the image data of the images to be processed by using a preset three-dimensional reconstruction method and the adjusted camera pose parameters of the images to be processed, to obtain a three-dimensional model of the target to be reconstructed.
  17. The three-dimensional reconstruction apparatus according to claim 16, wherein the second determination part comprises:
    a data set selection subpart configured to take each of the data sets in turn as a current data set, and to select at least one data set preceding the current data set in time sequence as a candidate data set;
    a spatial transformation parameter subpart configured to determine spatial transformation parameters between the current data set and the candidate data set by using the image data of the current data set and the image data of the candidate data set;
    a pose optimization parameter subpart configured to obtain the pose optimization parameters of the current data set by using at least the pose optimization parameters of the candidate data set and the spatial transformation parameters between the current data set and the candidate data set, and to update at least the pose optimization parameters of the candidate data set.
  18. The three-dimensional reconstruction apparatus according to claim 17, wherein the pose optimization parameter subpart comprises:
    a data set pair part configured to take, as a data set pair, the two data sets corresponding to each of the spatial transformation parameters related to the current data set and to the candidate data sets preceding it in time sequence;
    an objective function construction part configured to construct an objective function with respect to the pose optimization parameters by using the spatial transformation parameters of each of the data set pairs and their respective pose optimization parameters;
    an objective function solving part configured to solve the objective function by a preset solving method to obtain the pose optimization parameters of the data sets contained in the data set pairs respectively corresponding to the current data set and to the candidate data sets preceding it in time sequence.
  19. The three-dimensional reconstruction apparatus according to claim 17, wherein the spatial transformation parameter subpart comprises:
    an image data search part configured to search the candidate data set and the current data set for a group of image data to be matched that satisfies a preset matching condition;
    a matched pixel point selection part configured to obtain matched pixel point pairs between each group of the image data to be matched, based on preset image features extracted from each group of the image data to be matched;
    a three-dimensional space mapping part configured to map pixel points belonging to the current data set in the matched pixel point pairs to three-dimensional space to obtain first three-dimensional match points, and to map pixel points belonging to the candidate data set in the matched pixel point pairs to the three-dimensional space to obtain second three-dimensional match points;
    a three-dimensional match point alignment part configured to align the first three-dimensional match points and the second three-dimensional match points to obtain the spatial transformation parameters.
  20. The three-dimensional reconstruction apparatus according to claim 19, wherein the three-dimensional match point alignment part comprises:
    a first pose transformation parameter subpart configured to obtain first pose transformation parameters between the first three-dimensional match points and the second three-dimensional match points;
    a three-dimensional match point optimization subpart configured to perform pose optimization on the first three-dimensional match points by using the first pose transformation parameters and preset pose transformation parameters, to obtain first optimized match points and second optimized match points respectively;
    a second pose transformation parameter subpart configured to compute degrees of coincidence between the second three-dimensional match points and the first optimized match points and the second optimized match points respectively, and to select the pose transformation parameters adopted by the optimized match points with the higher degree of coincidence as second pose transformation parameters;
    a spatial transformation parameter subpart configured to align the first three-dimensional match points and the second three-dimensional match points by a preset alignment method with the second pose transformation parameters as an initial value, to obtain the spatial transformation parameters between the current data set and the candidate data set.
  21. The three-dimensional reconstruction apparatus according to claim 19, wherein the spatial transformation parameter subpart further comprises:
    a transformation parameter screening unit configured to, after the spatial transformation parameters between the current data set and the candidate data set are determined by using the image data of the current data set and the image data of the candidate data set, and before the pose optimization parameters of the current data set are obtained by using at least the pose optimization parameters of the candidate data set and the spatial transformation parameters between the current data set and the candidate data set, select, from the spatial transformation parameters between the current data set and each of the candidate data sets, the spatial transformation parameters that meet a preset parameter screening condition;
    wherein the preset parameter screening condition comprises either of the following: the candidate data set related to the spatial transformation parameters is adjacent to the current data set; the degree of coincidence between the optimized match points obtained by performing pose optimization on the first three-dimensional match points with the spatial transformation parameters, and the second three-dimensional match points, is greater than a preset coincidence threshold.
  22. The three-dimensional reconstruction apparatus according to claim 17, wherein the data set selection subpart comprises:
    a bag-of-words model construction unit configured to construct a bag-of-words model by using preset image features of the image data in the current data set and in the data sets preceding it in time sequence;
    an image-data-to-be-matched unit configured to select, as image data to be matched, the image data whose image to be processed is located at a preset time-sequence position in the current data set;
    an image feature query unit configured to query, from a preset range of the bag-of-words model, preset image features whose similarity score with the preset image features of the image data to be matched is greater than a preset similarity threshold;
    a candidate data set unit configured to take the data sets containing the image data to which the queried preset image features belong, as well as the data set adjacent to the current data set, as the candidate data sets;
    wherein the preset range comprises the preset image features of image data whose data set is not adjacent to the current data set and is not contained in the current data set.
  23. The three-dimensional reconstruction apparatus according to claim 22, wherein the data set selection subpart further comprises:
    a maximum similarity score acquisition unit configured to, before the preset image features whose similarity score with the preset image features of the image data to be matched is greater than a preset similarity threshold are queried from the preset range of the bag-of-words model, obtain a maximum score value among the similarity scores between each piece of the image data in the data set adjacent to the current data set and the image data to be matched;
    a preset similarity threshold determination unit configured to take either a preset multiple of the maximum score value or a preset score value as the preset similarity threshold.
  24. The three-dimensional reconstruction apparatus according to claim 16, wherein the data division part comprises:
    a current-image-to-be-processed determination subpart configured to take each frame of the image to be processed in turn as a current image to be processed;
    a data processing subpart configured to, when dividing the image data of the current image to be processed, if a last data set among the existing data sets satisfies a preset overflow condition, obtain the image data of the latest multiple frames of the images to be processed in the last data set and store it into a newly created data set as a new last data set, and divide the image data of the current image to be processed into the new last data set.
  25. The three-dimensional reconstruction apparatus according to claim 24, wherein the preset overflow condition comprises any of the following:
    the number of frames of the images to be processed corresponding to the image data contained in the last data set is greater than or equal to a preset frame number threshold; a distance between the camera position of the image to be processed to which any of the image data in the last data set belongs and the camera position of the current image to be processed is greater than a preset distance threshold; a difference between the camera orientation angle of the image to be processed to which any of the image data in the last data set belongs and the camera orientation angle of the current image to be processed is greater than a preset angle threshold;
    wherein the camera position and the camera orientation angle are computed by using the camera pose parameters of the image to be processed.
  26. The three-dimensional reconstruction apparatus according to any one of claims 16 to 25, wherein each frame of the image to be processed comprises color data and depth data, and the first determination part comprises:
    an included-angle acquisition subpart configured to obtain an included angle between a normal vector of each pixel point contained in the depth data aligned with the color data and a gravity direction of the image to be processed;
    a height acquisition subpart configured to project each pixel point in three-dimensional space onto the gravity direction to obtain a height value of each pixel point in the three-dimensional space;
    a height analysis subpart configured to analyze the height values of the pixel points whose included angles satisfy a preset angle condition, to obtain a plane height of the target to be reconstructed;
    a pixel screening subpart configured to screen, by using the plane height, the target pixel points in the color data that belong to the target to be reconstructed.
  27. The three-dimensional reconstruction apparatus according to claim 26, wherein the height analysis subpart comprises:
    a height set acquisition unit configured to take the height values of the pixel points whose included angles satisfy the preset angle condition as a height set;
    a height clustering analysis unit configured to perform clustering analysis on the height values in the height set to obtain the plane height of the target to be reconstructed.
  28. The three-dimensional reconstruction apparatus according to any one of claims 16 to 27, wherein the three-dimensional reconstruction apparatus further comprises:
    a three-dimensional mapping part configured to map the image data in each of the data sets in turn to three-dimensional space to obtain a three-dimensional point cloud corresponding to each of the data sets;
    a point cloud adjustment part configured to adjust the three-dimensional point cloud corresponding to each of the data sets by using its pose optimization parameters.
  29. An interaction apparatus based on three-dimensional reconstruction, comprising:
    a model acquisition part configured to obtain a three-dimensional model of a target to be reconstructed, wherein the three-dimensional model is obtained by the three-dimensional reconstruction apparatus according to claim 16;
    a mapping and localization part configured to construct a three-dimensional map of a scene where a camera device is located by a preset visual-inertial navigation method, and to obtain current pose information of the camera device in the three-dimensional map;
    a display interaction part configured to display the three-dimensional model in a scene image currently captured by the camera device based on the pose information.
  30. A measurement apparatus based on three-dimensional reconstruction, comprising:
    a model acquisition part configured to obtain a three-dimensional model of a target to be reconstructed, wherein the three-dimensional model is obtained by the three-dimensional reconstruction apparatus according to claim 16;
    a display interaction part configured to receive a plurality of measurement points set by a user on the three-dimensional model;
    a distance acquisition part configured to obtain distances between the plurality of measurement points, to obtain distances between positions on the target to be reconstructed corresponding to the plurality of measurement points.
  31. An electronic device, comprising a memory and a processor coupled to each other, wherein the processor is configured to execute program instructions stored in the memory to implement the three-dimensional reconstruction method according to any one of claims 1 to 13, or implement the interaction method based on three-dimensional reconstruction according to claim 14, or implement the measurement method based on three-dimensional reconstruction according to claim 15.
  32. A computer-readable storage medium having program instructions stored thereon, wherein the program instructions, when executed by a processor, implement the three-dimensional reconstruction method according to any one of claims 1 to 13, or implement the interaction method based on three-dimensional reconstruction according to claim 14, or implement the measurement method based on three-dimensional reconstruction according to claim 15.
  33. A computer program, comprising computer-readable code which, when the computer-readable code runs in an electronic device and is executed by a processor in the electronic device, implements the three-dimensional reconstruction method according to any one of claims 1 to 13, or implements the interaction method based on three-dimensional reconstruction according to claim 14, or implements the measurement method based on three-dimensional reconstruction according to claim 15.
  34. A computer program product which, when run on a computer, causes the computer to execute the three-dimensional reconstruction method according to any one of claims 1 to 13, or execute the interaction method based on three-dimensional reconstruction according to claim 14, or execute the measurement method based on three-dimensional reconstruction according to claim 15.
PCT/CN2021/102882 2021-01-11 2021-06-28 三维重建及相关交互、测量方法和相关装置、设备 WO2022147976A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1020237025998A KR20230127313A (ko) 2021-01-11 2021-06-28 3차원 재구성 및 관련 인터랙션, 측정 방법 및 관련장치, 기기
JP2023513719A JP7453470B2 (ja) 2021-01-11 2021-06-28 3次元再構成及び関連インタラクション、測定方法及び関連装置、機器

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110031502.0A CN112767538B (zh) 2021-01-11 2021-01-11 三维重建及相关交互、测量方法和相关装置、设备
CN202110031502.0 2021-01-11

Publications (1)

Publication Number Publication Date
WO2022147976A1 true WO2022147976A1 (zh) 2022-07-14

Family

ID=75701311

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/102882 WO2022147976A1 (zh) 2021-01-11 2021-06-28 三维重建及相关交互、测量方法和相关装置、设备

Country Status (4)

Country Link
JP (1) JP7453470B2 (zh)
KR (1) KR20230127313A (zh)
CN (1) CN112767538B (zh)
WO (1) WO2022147976A1 (zh)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112767538B (zh) * 2021-01-11 2024-06-07 浙江商汤科技开发有限公司 三维重建及相关交互、测量方法和相关装置、设备
CN113450417A (zh) * 2021-05-12 2021-09-28 深圳市格灵精睿视觉有限公司 标定参数优化方法、装置、设备及存储介质
CN113240656B (zh) * 2021-05-24 2023-04-07 浙江商汤科技开发有限公司 视觉定位方法及相关装置、设备
CN115222799B (zh) * 2021-08-12 2023-04-11 达闼机器人股份有限公司 图像重力方向的获取方法、装置、电子设备及存储介质
CN116051723B (zh) * 2022-08-03 2023-10-20 荣耀终端有限公司 集束调整方法及电子设备
CN116704152B (zh) * 2022-12-09 2024-04-19 荣耀终端有限公司 图像处理方法和电子设备
CN116486008B (zh) * 2023-04-12 2023-12-12 荣耀终端有限公司 一种三维重建方法、显示方法及电子设备
CN117152399A (zh) * 2023-10-30 2023-12-01 长沙能川信息科技有限公司 基于变电站的模型制作方法、装置、设备和存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108986037A (zh) * 2018-05-25 2018-12-11 重庆大学 基于半直接法的单目视觉里程计定位方法及定位系统
CN110910493A (zh) * 2019-11-29 2020-03-24 广州极飞科技有限公司 三维重建方法、装置及电子设备
US10733718B1 (en) * 2018-03-27 2020-08-04 Regents Of The University Of Minnesota Corruption detection for digital three-dimensional environment reconstruction
CN112767538A (zh) * 2021-01-11 2021-05-07 浙江商汤科技开发有限公司 三维重建及相关交互、测量方法和相关装置、设备

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9892552B2 (en) * 2015-12-15 2018-02-13 Samsung Electronics Co., Ltd. Method and apparatus for creating 3-dimensional model using volumetric closest point approach
CN108537876B (zh) * 2018-03-05 2020-10-16 清华-伯克利深圳学院筹备办公室 三维重建方法、装置、设备及存储介质
CN109166149B (zh) * 2018-08-13 2021-04-02 武汉大学 一种融合双目相机与imu的定位与三维线框结构重建方法与系统

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115690693A (zh) * 2022-12-13 2023-02-03 山东鲁旺机械设备有限公司 一种建筑吊篮的智能监控系统及监控方法
CN115661371A (zh) * 2022-12-14 2023-01-31 深圳思谋信息科技有限公司 三维对象建模方法、装置、计算机设备及存储介质
CN116863087A (zh) * 2023-06-01 2023-10-10 中国航空油料集团有限公司 基于数字孪生的航油信息显示方法、装置和可读存储介质
CN116863087B (zh) * 2023-06-01 2024-02-02 中国航空油料集团有限公司 基于数字孪生的航油信息显示方法、装置和可读存储介质
CN116758157A (zh) * 2023-06-14 2023-09-15 深圳市华赛睿飞智能科技有限公司 一种无人机室内三维空间测绘方法、系统及存储介质
CN116758157B (zh) * 2023-06-14 2024-01-30 深圳市华赛睿飞智能科技有限公司 一种无人机室内三维空间测绘方法、系统及存储介质
CN117168313A (zh) * 2023-11-03 2023-12-05 武汉工程大学 基于光栅投影三维重建的相位误差模型校正方法及系统
CN117168313B (zh) * 2023-11-03 2024-01-23 武汉工程大学 基于光栅投影三维重建的相位误差模型校正方法及系统
CN117476509A (zh) * 2023-12-27 2024-01-30 联合富士半导体有限公司 一种用于半导体芯片产品的激光雕刻装置及控制方法
CN117476509B (zh) * 2023-12-27 2024-03-19 联合富士半导体有限公司 一种用于半导体芯片产品的激光雕刻装置及控制方法

Also Published As

Publication number Publication date
CN112767538B (zh) 2024-06-07
JP2023540917A (ja) 2023-09-27
CN112767538A (zh) 2021-05-07
JP7453470B2 (ja) 2024-03-19
KR20230127313A (ko) 2023-08-31

Similar Documents

Publication Publication Date Title
WO2022147976A1 (zh) 三维重建及相关交互、测量方法和相关装置、设备
US20210232924A1 (en) Method for training smpl parameter prediction model, computer device, and storage medium
CN111598998B (zh) 三维虚拟模型重建方法、装置、计算机设备和存储介质
WO2021175050A1 (zh) 三维重建方法和三维重建装置
CN108509848B (zh) 三维物体的实时检测方法及系统
US11928800B2 (en) Image coordinate system transformation method and apparatus, device, and storage medium
WO2020001168A1 (zh) 三维重建方法、装置、设备和存储介质
Stoll et al. Fast articulated motion tracking using a sums of gaussians body model
WO2019205852A1 (zh) 确定图像捕捉设备的位姿的方法、装置及其存储介质
WO2021057743A1 (zh) 地图融合方法及装置、设备、存储介质
WO2015139574A1 (zh) 一种静态物体重建方法和系统
WO2015135323A1 (zh) 一种摄像机跟踪方法及装置
US20170330375A1 (en) Data Processing Method and Apparatus
CN111625667A (zh) 一种基于复杂背景图像的三维模型跨域检索方法及系统
JP2019096113A (ja) キーポイントデータに関する加工装置、方法及びプログラム
TWI785588B (zh) 圖像配準方法及其相關的模型訓練方法、設備和電腦可讀儲存媒體
US20240046557A1 (en) Method, device, and non-transitory computer-readable storage medium for reconstructing a three-dimensional model
US7200269B2 (en) Non-rigid image registration using distance functions
US20200057778A1 (en) Depth image pose search with a bootstrapped-created database
WO2022247126A1 (zh) 视觉定位方法、装置、设备、介质及程序
CN114627244A (zh) 三维重建方法及装置、电子设备、计算机可读介质
WO2022142049A1 (zh) 地图构建方法及装置、设备、存储介质、计算机程序产品
WO2023078135A1 (zh) 三维建模方法和装置、计算机可读存储介质及计算机设备
JP2002520969A (ja) 動き画像からの自動化された3次元シーン走査方法
CN114638921A (zh) 动作捕捉方法、终端设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21917022; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2023513719; Country of ref document: JP; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 20237025998; Country of ref document: KR; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21917022; Country of ref document: EP; Kind code of ref document: A1)