WO2021097843A1 - Three-dimensional reconstruction method, device, system and storage medium
- Publication number: WO2021097843A1 (PCT/CN2019/120394)
- Authority: WIPO (PCT)
- Prior art keywords: dimensional, original, supplementary, image, view
Classifications
- G — PHYSICS; G06 — COMPUTING; CALCULATING OR COUNTING; G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL; G06T17/00 — Three dimensional [3D] modelling, e.g. data description of 3D objects
Description
- the present invention relates to the field of computer vision technology, and more specifically to a three-dimensional reconstruction method, device, system and storage medium.
- Three-dimensional reconstruction is a process of restoring corresponding three-dimensional objects based on known two-dimensional images. Since the two-dimensional image only includes the information of the target object collected under a specific camera angle of view, it can only reflect the visible part of the target object under the specific camera angle of view. The more two-dimensional images based on different camera perspectives, the higher the degree of restoration of the reconstructed three-dimensional object relative to the target object, and the better the reconstruction quality.
- three-dimensional reconstruction based on two-dimensional images with limited viewing angles suffers from ambiguity in the reconstruction result because of inevitable occlusion and similar problems. Using two-dimensional images from more viewing angles is expected to give better reconstruction results. However, owing to the geographic location of the target object, occlusion by the surrounding environment, and so on, it may not be possible to obtain a two-dimensional image under a desired viewing angle, so satisfactory three-dimensional reconstruction results are difficult to obtain.
- the present invention was made in consideration of the above-mentioned problems.
- a three-dimensional reconstruction method includes: extracting original image features from an original two-dimensional image of a target object; determining an original three-dimensional object based on the original image features; determining a camera pose of a supplementary angle of view of the target object, wherein the supplementary angle of view is different from the first angle of view used to generate the original two-dimensional image; generating, based on the camera pose of the supplementary angle of view, a supplementary two-dimensional image of the target object under the supplementary angle of view; performing three-dimensional reconstruction on the supplementary two-dimensional image to generate a supplementary three-dimensional object corresponding to the supplementary two-dimensional image; and fusing the original three-dimensional object and the supplementary three-dimensional object to obtain a three-dimensional reconstruction result of the target object.
- the determining the original three-dimensional object based on the original image features includes: decoding the original image features through a deep neural network to obtain a depth map of the target object; decoding the original image features through a voxel neural network to obtain a voxel cube of the target object; and determining the original three-dimensional object based on the depth map and the voxel cube.
- the determining the original three-dimensional object based on the depth map and the voxel cube includes: determining the voxels visible in the original three-dimensional object according to the depth map; and determining the other voxels in the original three-dimensional object according to the voxel cube.
- the depth map of the target object includes the depth map of the main view and the depth map of the back view of the target object.
- when the original two-dimensional image includes multiple images with different viewing angles, the determining the original three-dimensional object based on the original image features includes: determining a corresponding perspective three-dimensional object based on the original image features extracted from the original two-dimensional image of each viewing angle; and fusing all the perspective three-dimensional objects from different viewing angles to obtain the original three-dimensional object.
- the fusing of all the perspective three-dimensional objects from different viewing angles to obtain the original three-dimensional object includes: rotating each perspective three-dimensional object to a unified standard posture to obtain standard-view three-dimensional objects; and determining the voxels of the original three-dimensional object according to the voxels of all the standard-view three-dimensional objects.
- the determining the voxels of the original three-dimensional object according to the voxels of all standard-view three-dimensional objects includes: when the proportion of standard-view three-dimensional objects having a voxel at a certain position exceeds a first ratio, determining that the original three-dimensional object has a voxel at that position.
- the determining the camera pose of the supplementary angle of view of the target object includes: acquiring preset camera poses of at least one candidate viewing angle; rotating the original three-dimensional object to each candidate viewing angle to obtain a candidate-view three-dimensional object; determining, for each candidate viewing angle, the original visible ratio of the visible voxels of the candidate-view three-dimensional object; and when the original visible ratio is within a first range, taking the camera pose of the candidate viewing angle as the camera pose of the supplementary viewing angle.
- the determining the original visible ratio of the visible voxels of the candidate-view three-dimensional object includes: projecting the candidate-view three-dimensional object under the candidate viewing angle to obtain a projection image; counting the number of pixels in the projection image that correspond to voxels visible in the first angle of view; and determining the original visible ratio according to the counted number of pixels and the total number of pixels of the candidate-view three-dimensional object in the projection image.
- the generating a supplementary two-dimensional image of the target object under the supplementary viewing angle based on the camera pose of the supplementary viewing angle includes: calculating the horizontal and vertical rotation angles between the camera pose of the first angle of view and the camera pose of the supplementary angle of view; splicing a vector composed of the two rotation angles onto each vector in the original image features to obtain supplementary image features; and generating the supplementary two-dimensional image based on the supplementary image features.
- alternatively, the generating a supplementary two-dimensional image of the target object under the supplementary viewing angle based on the camera pose of the supplementary viewing angle includes: extracting a target feature based on the projection image of the original three-dimensional object under the supplementary angle of view and the original image features; and generating the supplementary two-dimensional image according to the target feature.
- the extracting of the target feature based on the projection image of the original three-dimensional object under the supplementary angle of view and the original image features includes: for pixels in the projection image corresponding to voxels visible in the first angle of view, determining the corresponding feature vector in the target feature according to the original image features; and for the remaining pixels, determining the corresponding feature vector in the target feature based on random noise.
- when the original two-dimensional image includes a plurality of images with different viewing angles, the original image features include a plurality of features corresponding to the images of the different viewing angles, and the determining the corresponding feature vector in the target feature according to the original image features includes: averaging the corresponding feature vectors in the multiple original image features, and using the average value as the corresponding feature vector in the target feature.
- the extracting the target feature according to the projection image of the original three-dimensional object under the supplementary angle of view and the original image features further includes: splicing the projection image with the determined feature vectors to generate the target feature.
- the method further includes: when the proportion of visible voxels in the three-dimensional reconstruction result is not greater than a second ratio, determining the camera pose of a new supplementary angle of view, using the supplementary two-dimensional image as the original two-dimensional image, and performing three-dimensional reconstruction again based on the camera pose of the new supplementary angle of view, until the proportion of visible voxels in the three-dimensional reconstruction result is greater than the second ratio.
- a three-dimensional reconstruction device including:
- the feature extraction module is used to extract the original image features from the original two-dimensional image of the target object
- the first reconstruction module is configured to determine the original three-dimensional object based on the original image feature
- a supplementary perspective module configured to determine a camera pose of a supplementary perspective of the target object, wherein the supplementary perspective is different from the first perspective used to generate the original two-dimensional image
- a supplementary image module configured to generate a supplementary two-dimensional image of the target object in the supplementary perspective based on the camera pose of the supplementary perspective;
- the second reconstruction module is configured to perform three-dimensional reconstruction on the supplementary two-dimensional image to generate a supplementary three-dimensional object corresponding to the supplementary two-dimensional image;
- the fusion module is used for fusing the original three-dimensional object and the supplementary three-dimensional object to obtain a three-dimensional reconstruction result of the target object.
- a three-dimensional reconstruction system including: a processor and a memory, wherein computer program instructions are stored in the memory, and the computer program instructions are used to execute the above-mentioned three-dimensional reconstruction method when run by the processor.
- a storage medium on which program instructions are stored, and the program instructions are used to execute the above-mentioned three-dimensional reconstruction method when run.
- by adding a two-dimensional image of the target object under a supplementary perspective on the basis of the original two-dimensional image, and performing three-dimensional reconstruction based on both the supplementary two-dimensional image and the original two-dimensional image, more credible information about the target object can be obtained and the quality of the reconstructed three-dimensional object can be improved.
- FIG. 1 shows a schematic flowchart of a three-dimensional reconstruction method according to an embodiment of the present invention;
- FIG. 2 shows a conversion relationship between a world coordinate system and a spherical coordinate system according to an embodiment of the present invention;
- FIG. 3 shows a schematic flowchart of determining an original three-dimensional object according to an embodiment of the present invention;
- FIG. 4A shows a schematic flowchart of determining an original three-dimensional object through multiple original two-dimensional images according to an embodiment of the present invention;
- FIG. 4B shows a schematic diagram of different original two-dimensional images captured by cameras under different viewing angles;
- FIG. 5A shows a schematic flowchart of fusing multiple perspective three-dimensional objects according to an embodiment of the present invention;
- FIG. 5B shows a schematic block diagram of obtaining an original three-dimensional object through multiple original two-dimensional images according to an embodiment of the present invention;
- FIG. 6 shows a schematic flowchart of determining a camera pose of a supplementary angle of view according to an embodiment of the present invention;
- FIG. 7 shows a schematic flowchart of determining the original visible ratio according to an embodiment of the present invention;
- FIG. 8 shows a schematic diagram of determining the original visible ratio according to an embodiment of the present invention;
- FIG. 9 shows a schematic flowchart of generating a supplementary two-dimensional image according to an embodiment of the present invention;
- FIG. 10 shows a schematic flowchart of generating a supplementary two-dimensional image according to another embodiment of the present invention;
- FIG. 11 shows a schematic block diagram of generating a supplementary two-dimensional image according to an embodiment of the present invention;
- FIG. 12 shows a schematic flowchart of iterative reconstruction according to an embodiment of the present invention;
- FIG. 13 shows a schematic block diagram of a three-dimensional reconstruction device according to an embodiment of the present invention;
- FIG. 14 shows a schematic block diagram of a three-dimensional reconstruction system according to an embodiment of the present invention.
- Fig. 1 shows a schematic flowchart of a three-dimensional reconstruction method 100 according to an embodiment of the present invention. As shown in FIG. 1, the method 100 includes the following steps.
- the original two-dimensional image may be an image of the target object directly collected by imaging equipment such as a camera or a video camera.
- the original two-dimensional image can also be an image subjected to preprocessing operations.
- preprocessing operations such as filtering may be performed on the collected image to obtain an original two-dimensional image with better quality.
- the original two-dimensional image can be a single image obtained under a single viewing angle, or multiple images obtained under multiple different viewing angles.
- an encoder composed of a convolutional neural network (CNN) is used to extract original image features from the original two-dimensional image of the target object.
- the original image features may include multiple feature vectors, each corresponding to a pixel in the original two-dimensional image. Taking a single original two-dimensional image as an example, H × W feature vectors can be extracted from it (H represents the height of the original two-dimensional image and W its width), and the dimension of each feature vector is C.
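- As an illustrative aside (not part of the patent text), a minimal sketch of such a per-pixel feature encoder, assuming PyTorch; the layer widths are arbitrary assumptions, not the patent's architecture:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Toy CNN encoder: maps a 3 x H x W image to a C x H x W feature map,
    i.e. one C-dimensional feature vector per pixel (illustrative sizes)."""
    def __init__(self, c_feat=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, c_feat, kernel_size=3, padding=1), nn.ReLU(),
        )

    def forward(self, img):      # img: (N, 3, H, W)
        return self.net(img)     # features: (N, C, H, W)

feats = Encoder()(torch.rand(1, 3, 64, 64))
print(feats.shape)  # torch.Size([1, 32, 64, 64])
```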
- S120 Determine an original three-dimensional object based on the original image feature.
- a decoder composed of a convolutional neural network is used to generate original three-dimensional objects based on original image features.
- the original three-dimensional object has a corresponding relationship with the original two-dimensional image.
- the original three-dimensional object can be represented in the following ways: point cloud (Point Cloud), mesh (Mesh), voxel (Voxel), or depth map (Depth map), etc.
- the original three-dimensional object is represented by voxels.
- the voxel representation method is to regard the space where the target object is located as a voxel cube composed of multiple three-dimensional squares, and the value of each three-dimensional square indicates whether the object has a voxel in the spatial position of the square. For example, a value of 0 means that the object does not have a voxel in the spatial position of the corresponding square, and a value of 1 means that there is a voxel.
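- As a hedged illustration of this representation (added here, not from the patent), a binary occupancy grid in NumPy; the 32³ resolution is an arbitrary choice:

```python
import numpy as np

# 32^3 voxel cube: 1 = the object has a voxel in that cell, 0 = empty
vox = np.zeros((32, 32, 32), dtype=np.uint8)
vox[12:20, 12:20, 12:20] = 1   # mark a small solid block as occupied
print(int(vox.sum()), "occupied voxels")
```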
- the three-dimensional reconstruction based on the original two-dimensional image of the target object is realized.
- the encoder and decoder described in the above step S110 and step S120 are only for example, and do not constitute a limitation to the present invention.
- a person of ordinary skill in the art can use any existing or future-developed algorithm for three-dimensional reconstruction based on known two-dimensional images to implement the above two steps.
- S130 Determine the camera pose of the supplementary angle of view of the target object, wherein the supplementary angle of view is different from the first angle of view for generating the original two-dimensional image.
- each two-dimensional image has a corresponding camera angle of view
- the camera angle of view is the angle of view when the camera collects the two-dimensional image.
- the camera's angle of view is determined by the camera pose, so the camera pose can be used to characterize the camera's angle of view.
- the camera pose is the position and posture of the camera when it collects a two-dimensional image.
- the camera pose can be expressed based on various coordinate systems. The following uses a spherical coordinate system as an example to illustrate the camera pose. Exemplarily, the location of the object can be taken as the origin of the spherical coordinate system, and the camera pose can be represented by vectors R and T.
- R = [θ, φ], where θ represents the azimuth angle of the camera and φ represents its elevation angle; T represents the distance ρ between the camera and the object.
- given the coordinates (x, y, z) of a camera in the world coordinate system, where x, y and z represent the coordinates of the camera on the X, Y and Z axes respectively, the azimuth angle θ, elevation angle φ and distance ρ of the camera in the spherical coordinate system can be determined correspondingly.
- Figure 2 shows the conversion relationship between the world coordinate system and the spherical coordinate system.
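- A minimal sketch of this conversion, added for illustration (the object is taken as the origin and angles are in degrees; the axis conventions are assumptions, not the patent's definitions):

```python
import math

def world_to_spherical(x, y, z):
    """Camera world coordinates -> (azimuth theta, elevation phi, distance rho)."""
    rho = math.sqrt(x * x + y * y + z * z)     # distance between camera and object
    theta = math.degrees(math.atan2(y, x))     # azimuth, measured in the XOY plane
    phi = math.degrees(math.asin(z / rho))     # elevation above the XOY plane
    return theta, phi, rho

print(world_to_spherical(1.0, 1.0, math.sqrt(2.0)))  # (45.0, 45.0, 2.0)
```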
- the angle of view of the camera in this standard pose can be referred to as the standard angle of view.
- the posture of the three-dimensional object corresponding to the standard pose of the camera can be called its standard posture.
- the original three-dimensional object can be transformed to the standard posture. Therefore, different camera poses can be expressed as different azimuth and elevation angles, that is, different vectors [θ, φ].
- the camera pose when the image is generated can be determined according to the camera parameters corresponding to the original two-dimensional image.
- the angle of view corresponding to the camera pose of the original two-dimensional image is called the first angle of view.
- this step is used to determine a new supplementary viewing angle.
- the supplementary perspective is different from the first perspective.
- the camera pose of the supplementary angle of view is different from the camera pose of the first angle of view.
- the camera pose of the supplementary angle of view may be determined from the first angle of view based on a preset rule. For example, starting from the camera pose of the first angle of view, the azimuth angle and/or the elevation angle is changed according to a preset rule; specifically, a preset number of degrees may be added to the azimuth angle of the first viewing angle to obtain a supplementary viewing angle.
- S140 Based on the camera pose of the supplementary angle of view, generate a supplementary two-dimensional image of the target object in the supplementary angle of view.
- a supplementary two-dimensional image of the target object under the supplementary viewing angle can be generated according to the original image information from the original two-dimensional image.
- the original image information comes from, for example, original image features or original three-dimensional objects, or even from the original two-dimensional image itself.
- the supplementary viewing angle for generating the supplementary two-dimensional image is different from the first viewing angle for generating the original two-dimensional image, so the supplementary two-dimensional image differs from the original two-dimensional image. Because the surface of the target object generally changes continuously, predicting the part of the target object invisible in the first view based on the original image information is reliable.
- the supplementary two-dimensional image therefore contains information that does not exist in the original two-dimensional image, and this information is reliable to a certain extent; the supplementary two-dimensional image supplements and enriches the original image information.
- S150 Perform three-dimensional reconstruction on the supplementary two-dimensional image to generate a supplementary three-dimensional object corresponding to the supplementary two-dimensional image.
- step S150 may include: firstly, using an encoder composed of a convolutional neural network to extract supplementary image features from the supplementary two-dimensional image; then, using a decoder composed of a convolutional neural network to determine the corresponding supplementary three-dimensional object based on the supplementary image features.
- the supplementary three-dimensional object is represented in the form of voxels. It can be understood that, since the supplementary two-dimensional image contains information that does not exist in the original image information, the voxels visible under the supplementary perspective in the generated supplementary three-dimensional object necessarily differ from the voxels visible under the first perspective in the original three-dimensional object.
- the final three-dimensional reconstruction result of the target object may be determined by taking a union of the voxels of the original three-dimensional object and the supplementary three-dimensional object. For any position in the space, as long as any one of the original three-dimensional object or the supplementary three-dimensional object has a voxel at that position, it is determined that the three-dimensional reconstruction result has a voxel at that position.
- the final three-dimensional reconstruction result of the target object can also be determined by taking the intersection of the voxels of the original three-dimensional object and the supplementary three-dimensional object. For any position in the space, only if both the original three-dimensional object and the supplementary three-dimensional object have voxels at that location, then it is determined that the three-dimensional reconstruction result has a voxel at that location.
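- Both fusion strategies reduce to element-wise logic on binary voxel grids. A small sketch added for illustration, assuming NumPy arrays of 0/1 values:

```python
import numpy as np

def fuse_voxels(original, supplementary, mode="union"):
    """Fuse two binary voxel grids into the three-dimensional reconstruction result."""
    if mode == "union":  # voxel kept if either object has one at that position
        return np.logical_or(original, supplementary).astype(np.uint8)
    # intersection: voxel kept only if both objects have one at that position
    return np.logical_and(original, supplementary).astype(np.uint8)
```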
- Fig. 3 shows a schematic flowchart of determining an original three-dimensional object in step S120 according to an embodiment of the present invention.
- a decoder composed of a neural network can be used to generate original three-dimensional objects based on original image features.
- the decoder composed of a convolutional neural network can be implemented using a deep neural network and a voxel neural network.
- step S120 includes the following steps.
- S121 Decoding the original image features through the deep neural network to obtain a depth map of the target object.
- the deep neural network may include multiple 2-dimensional (2D) convolutional layers.
- Each pixel in the depth map represents the depth of the corresponding position of the target object.
- the depth may be the distance between the corresponding position of the target object and the camera.
- the depth d of the pixel corresponding to each feature vector in the original two-dimensional image can then be calculated from that feature vector; in the corresponding formula, C represents the maximum depth.
- the angle of view when the camera collects the original two-dimensional image is the main angle of view, that is, the aforementioned first angle of view.
- the depth map of the main perspective can be generated based on the original image features.
- the depth map generated based on the original image features may also include the depth map of the back view.
- the back viewing angle is a viewing angle that is 180 degrees from the main viewing angle.
- it is assumed that the target object is symmetrical about a plane perpendicular to the main viewing direction. Accordingly, although the part of the target object visible from the back viewing angle is actually invisible in the main viewing angle, the depth map of the back viewing angle can still be obtained from the original image features.
- S122 Decode the original image features through a voxel neural network to obtain a voxel cube of the target object.
- the voxel neural network may also include multiple 2D convolutional layers, which are used to output a voxel cube composed of multiple three-dimensional squares according to the original image features.
- in the voxel cube, if the value of a three-dimensional grid is 1, the target object has a voxel at the spatial position of that grid; if the value is 0, the target object has no voxel at that position.
- the depth map may include the depth map of the main view and the depth map of the back view.
- the depth map of the main view includes the three-dimensional information of the front surface of the target object
- the depth map of the back view includes the three-dimensional information of the back surface of the target object.
- the three-dimensional information of the target object can be determined based on the three-dimensional information of the front surface and the three-dimensional information of the back surface. Exemplarily, it can be considered that the part between the front surface and the back surface is the target object reconstructed from the depth map.
- Each point of the front surface obtained from the depth map of the main view can be connected to the corresponding point of the rear surface obtained from the depth map of the rear view.
- the space enclosed by the front surface, the back surface and all the connecting lines is the space occupied by the target object reconstructed from the depth maps.
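- A hedged sketch of this "fill between the two surfaces" idea, added for illustration; it assumes the depth maps are already quantised to voxel indices along the viewing axis:

```python
import numpy as np

def carve_between_depths(front, back, n=32):
    """front/back: (H, W) depth maps of the main and back views, holding voxel
    indices in 0..n-1. Marks as occupied every voxel between the two surfaces."""
    vox = np.zeros((n, front.shape[0], front.shape[1]), dtype=np.uint8)
    depth = np.arange(n)[:, None, None]            # voxel index along the view axis
    vox[(depth >= front[None]) & (depth <= back[None])] = 1
    return vox
```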
- the target object reconstructed from the depth map and the voxel cube obtained from the original image feature can be fused to determine the original three-dimensional object.
- when both of the above representations consider a specific location to be part of the target object, it is determined that the target object exists at that location.
- Determining the original three-dimensional object through the depth map and the voxel cube can effectively use the information in the original two-dimensional image, making the generated original three-dimensional object closer to the target object.
- the above step S123 may include: firstly, determining the voxels visible in the original three-dimensional object according to the depth map; then, determining other voxels in the original three-dimensional object according to the voxel cube.
- the depth map may include a depth map of the main view. Since the depth map of the main view is obtained directly based on the original two-dimensional image, the voxels determined according to the depth map of the main view can be considered as visible voxels. These voxels are more reliable and can better reflect the actual shape of the target object.
- the depth map may also include a depth map of a back view angle. In view of the fact that most objects have a front-to-back symmetrical relationship, it can be considered that the voxels determined according to the depth map of the rear view angle are also visible.
- the voxels that are visible in the main view and the voxels that are visible in the back view of the original three-dimensional object can be determined according to the depth map of the main view and the depth map of the back view. It can be understood that although the voxel cube also contains voxels on the front surface and the back surface, determining these visible voxels based on the depth map is more accurate than determining these voxels based on the voxel cube.
- the depth map of the main view and the depth map of the back view cannot reflect other spatial characteristics of the target object.
- Other voxels in the original three-dimensional object are not visible in the original two-dimensional image. These voxels can be determined based on the voxel cubes generated by the voxel neural network.
- the voxel cube contains voxels on surfaces other than the front surface (visible in the main view) and the back surface (visible in the back view); these voxels can be used to determine the voxels of the original three-dimensional object on those other surfaces.
- the original three-dimensional object with higher reliability and accuracy can be obtained.
- the original two-dimensional image may include multiple images obtained from multiple different viewing angles.
- FIG. 4A shows a schematic flowchart of determining an original three-dimensional object through multiple original two-dimensional images according to an embodiment of the present invention. As shown in FIG. 4A, when the original two-dimensional image contains multiple images with different viewing angles, step S120 determining the original three-dimensional object may include the following steps.
- the corresponding perspective three-dimensional objects are determined based on the corresponding original image features extracted from the original two-dimensional images of each perspective.
- FIG. 4B shows a schematic diagram of different original two-dimensional images captured by cameras at different viewing angles according to an embodiment of the present invention.
- C1, C2, and C3 represent cameras in different poses.
- the original two-dimensional images I1, I2, and I3 corresponding to the respective viewing angles can be obtained.
- a three-dimensional object corresponding to the viewing angle of each original two-dimensional image can be obtained through three-dimensional reconstruction, referred to herein as a perspective three-dimensional object. It can be understood that the original two-dimensional image corresponding to each perspective three-dimensional object is different, and therefore the voxels they contain may also differ.
- the original three-dimensional object is determined according to the voxels contained in the multiple perspective three-dimensional objects. Any existing technology or algorithm developed in the future can be used to fuse various perspective three-dimensional objects, which is not limited in this application.
- the original three-dimensional object is determined based on multiple images with different viewing angles. These images contain more credible target object information. Therefore, the three-dimensional reconstruction result of the present application can be made more accurate.
- FIG. 5A shows a schematic flow chart of fusing all three-dimensional objects with different perspectives according to an embodiment of the present invention. As shown in FIG. 5A, fusing multiple three-dimensional objects with different perspectives includes the following steps.
- Each perspective three-dimensional object is generated based on its corresponding original two-dimensional image, which corresponds to its own perspective.
- each perspective three-dimensional object can be rotated to a unified standard posture.
- the spatial shape of each perspective three-dimensional object under the same standard viewing angle can thereby be obtained, that is, the standard-view three-dimensional object.
- S520 Determine the voxel of the original three-dimensional object according to the voxels of all the three-dimensional objects in the standard viewing angle.
- the voxel of the original three-dimensional object can be determined based on the union or intersection of the voxels of all standard-view three-dimensional objects.
- FIG. 5B shows a schematic block diagram of determining an original three-dimensional object through multiple original two-dimensional images according to an embodiment of the present invention.
- each perspective three-dimensional object is rotated to the standard posture, and then the rotated standard-view three-dimensional objects are fused, which is not only easy to implement but also ensures the accuracy of the result.
- the standard-view three-dimensional object is represented in the form of voxels: according to whether the value of a three-dimensional grid is 1 or 0, it can be determined whether there is a voxel at the corresponding position. When the proportion of standard-view three-dimensional objects having a voxel at a certain position, among all standard-view three-dimensional objects, exceeds the first ratio, it is determined that the original three-dimensional object has a voxel at that position.
- the first ratio is 0.5.
- this rule can be written as O(x, y, z) = 1 if (1/k)·Σᵢ Pᵢ(x, y, z) > first ratio, and O(x, y, z) = 0 otherwise, where (x, y, z) represents the coordinates of a position in space, k represents the number of standard-view three-dimensional objects, Pᵢ(x, y, z) represents the value of the three-dimensional grid of the i-th standard-view three-dimensional object at that position, and O(x, y, z) represents the value of the three-dimensional grid of the original three-dimensional object at that position.
- the original three-dimensional object is determined according to the number of voxels at a certain position in all standard-view three-dimensional objects.
- the original three-dimensional object is closer to the real target object. Therefore, the three-dimensional reconstruction result obtained by this technical solution is more ideal.
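- The voting rule lends itself to a one-line vectorised implementation. An illustrative sketch, assuming binary NumPy grids already rotated to the standard posture:

```python
import numpy as np

def vote_fuse(standard_view_objects, first_ratio=0.5):
    """A voxel is kept when more than `first_ratio` of the k standard-view
    three-dimensional objects contain a voxel at that position."""
    stack = np.stack(standard_view_objects)   # (k, X, Y, Z)
    return (stack.mean(axis=0) > first_ratio).astype(np.uint8)
```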
- step S130 is then required to further determine the camera pose of the supplementary perspective of the target object.
- Fig. 6 shows a schematic flowchart of determining a camera pose of a supplementary angle of view according to an embodiment of the present invention. As shown in FIG. 6, the step S130 to determine the camera pose of the supplementary angle of view includes the following steps.
- S131 Acquire a preset camera pose of at least one candidate angle of view.
- the camera pose of each candidate viewing angle can be expressed as an azimuth angle and an elevation angle in the spherical coordinate system, represented by the vector (θ, φ).
- for example, the azimuth angle θ is selected from the set {0, 45, 90, 135, 180, 225, 270, 315} and the elevation angle φ from the set {-60, -30, 0, 30, 60}, giving 8 × 5 = 40 candidate camera poses.
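- Enumerating these preset candidate camera poses is straightforward; for illustration:

```python
from itertools import product

azimuths = [0, 45, 90, 135, 180, 225, 270, 315]
elevations = [-60, -30, 0, 30, 60]
candidate_poses = list(product(azimuths, elevations))  # 8 x 5 = 40 (theta, phi) pairs
```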
- the original three-dimensional object can be rotated from the current perspective to the candidate perspective.
- the current perspective of the original three-dimensional object can be the first perspective corresponding to the original two-dimensional image.
- the original three-dimensional object can be determined directly based on the first perspective, and the calculation is simpler.
- the current view angle of the original three-dimensional object may also be a standard view angle. According to the foregoing example, for the case where there are multiple images with different viewing angles in the original two-dimensional image, the obtained original three-dimensional object may be in a standard viewing angle.
- assuming the current viewing angle is (θ1, φ1) and the candidate viewing angle is (θ2, φ2), the original three-dimensional object can be rotated by the angle (θ2-θ1, φ2-φ1) to obtain the candidate-view three-dimensional object.
- S133 For the camera pose of each candidate perspective, determine the original visible ratio of the visible voxels of the three-dimensional object in the candidate perspective.
- the visible voxels of the candidate-view three-dimensional object are the voxels of that object visible under the candidate perspective. Under different viewing angles, the visible voxels of a three-dimensional object differ. Taking a car as an example, assuming the first perspective (0, 0) corresponding to the original two-dimensional image faces the front of the car, the voxels that make up the front of the car are visible voxels under the first perspective, such as the voxels that constitute the headlights, the wipers, and the hood. When the car is rotated to a candidate viewing angle, such as the left viewing angle (90, 0), the voxels constituting the left door become visible voxels, but the voxels constituting the wipers are no longer visible.
- the original visible ratio is the proportion, among the visible voxels of the candidate-view three-dimensional object, of voxels that are visible in the first view. It can be understood that if the original two-dimensional image includes multiple images with different viewing angles, the first viewing angle includes multiple viewing angles, and that a voxel visible in the candidate perspective may or may not be visible in the first perspective. In the car example, among the voxels of the car visible in the left view, those close to the front of the car are visible from the front view, while those close to the rear are not. Thus, the original visible ratio of the visible voxels of the car under the left viewing angle is the proportion of voxels visible under the first viewing angle among the voxels visible under the left viewing angle.
- the original visible ratio can reflect the credibility of the candidate perspective three-dimensional object.
- the original three-dimensional object is generated based on the original two-dimensional image.
- the visible pixels in the original two-dimensional image can truly reflect the shape of the target object, so they are credible pixels.
- the voxels that are visible in the first viewing angle in the original three-dimensional object determined based on the pixels in the original two-dimensional image are also credible.
- the credibility of the voxels other than the voxels that are visible in the first view of the original three-dimensional object is lower than the credibility of the voxels that are visible in the first view.
- the purpose of this step is to select a candidate view angle whose original visible ratio is within a suitable range as a supplementary view angle for 3D reconstruction.
- the credibility of the three-dimensional object under the supplementary perspective should not be too low, otherwise three-dimensional reconstruction under this perspective is meaningless; at the same time, it should not be too high, otherwise the supplementary perspective is too close to the first perspective and cannot contribute supplementary information.
- in an example, the first range is 50%-85%.
- the candidate viewing angles whose original visible ratio falls within this range are used as the supplementary viewing angles for three-dimensional reconstruction.
- the camera pose under such a candidate viewing angle is the camera pose of the supplementary viewing angle. This range ensures that the credibility of the three-dimensional object under the supplementary viewing angle is sufficiently high while also guaranteeing an effective amount of supplementary information.
- the camera pose of the supplementary perspective is determined according to the original visible proportion of the visible voxels of the three-dimensional object in the candidate perspective, and the three-dimensional reconstruction result obtained based on the camera pose of the supplementary perspective is more accurate.
- Fig. 7 shows a schematic flowchart of determining the original visible ratio according to a specific embodiment of the present invention. As shown in FIG. 7, determining the original visible ratio of the visible voxels of the candidate perspective three-dimensional object includes the following steps.
- since the candidate-view three-dimensional object has already been rotated to face the candidate perspective, projecting it along the candidate perspective direction yields the projection image of the candidate-view three-dimensional object under the candidate perspective.
- the pixels of the candidate perspective three-dimensional object in the projection image correspond to the voxels that are visible in the candidate perspective.
- the projection image may be determined based on the voxel of the candidate view three-dimensional object that is closest to the projection plane in the candidate view.
- the projection plane can be a plane perpendicular to the candidate viewing angle where the camera is located. Assuming that the candidate perspective is the direction of the X axis, the voxel of the candidate perspective three-dimensional object that is closest to the projection plane in the candidate perspective can be determined by the following formula:
- P(:, y, z) represents all voxels of the candidate-view three-dimensional object on the straight line parallel to the X axis whose Y-axis coordinate is y and whose Z-axis coordinate is z.
- argmin(P(:, y, z)) selects, on that straight line, the voxel of the candidate-view three-dimensional object closest to the projection plane.
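- For illustration, a NumPy sketch of this nearest-voxel projection (assuming the candidate perspective looks along the X axis, as in the formula above):

```python
import numpy as np

def project_nearest(vox):
    """vox: binary grid (X, Y, Z). For every (y, z) line parallel to the X axis,
    keep the occupied voxel closest to the projection plane (smallest x).
    Returns that x index per pixel, or -1 where the line contains no voxel."""
    covered = vox.any(axis=0)            # (Y, Z): pixels hit by the object
    nearest = np.argmax(vox, axis=0)     # index of the first occupied voxel
    return np.where(covered, nearest, -1)
```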
- S720 Count the number of pixels visible in the first view of the candidate view three-dimensional object in the projection image.
- the pixels in the projection image correspond to voxels that are visible in the candidate perspective of the three-dimensional object in the candidate perspective.
- the voxels that are visible in the candidate perspective of the three-dimensional object in the candidate perspective may be visible or invisible in the first perspective of the original two-dimensional image.
- This step S720 is used to determine the number of pixels in the projection image corresponding to voxels that are visible in the first viewing angle and also visible in the candidate viewing angle.
- the voxels visible in the first viewing angle can be marked.
- the voxels visible in the first perspective may be voxels determined by the main perspective depth map in the original three-dimensional object. On the basis of marking the voxels in the original three-dimensional object, these marks are still retained in the candidate perspective three-dimensional object obtained after the rotation. However, voxels marked as visible in the first view may not be visible in the candidate view.
- the statistics to be counted in this step S720 are the marked voxels that are still visible in the candidate view angle.
- voxels that are not visible in the first viewing angle can also be marked.
- the voxels in the original three-dimensional object determined by the back-view depth map and by the voxel cube are marked as voxels invisible in the first view.
- by counting the number of pixels in the projection image corresponding to the marked voxels, the number of pixels of the candidate-view three-dimensional object visible in the first viewing angle in the projection image can be obtained.
- S730 Determine the original visible ratio according to the counted number of pixels and the total number of pixels of the candidate view three-dimensional object in the projection image. By calculating the ratio of the number of pixels counted in step S720 to the total number of pixels of the candidate view three-dimensional object in the projection image, the original visible ratio can be determined.
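- Putting steps S710-S730 together, a hedged sketch of the ratio computation; it reuses the hypothetical project_nearest helper from the sketch above:

```python
import numpy as np

def original_visible_ratio(vox, visible_mask):
    """vox: candidate-view binary grid (X, Y, Z); visible_mask: same shape,
    True where a voxel was marked as visible in the first view (step S720)."""
    idx = project_nearest(vox)                      # S710: nearest-voxel projection
    ys, zs = np.nonzero(idx >= 0)                   # all pixels covered by the object
    hits = visible_mask[idx[ys, zs], ys, zs].sum()  # pixels whose voxel is marked
    return hits / max(len(ys), 1)                   # S730: counted / total pixels
```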
- Fig. 8 shows the above-mentioned schematic diagram of determining the original visible scale.
- V0 is the original three-dimensional object generated based on the three-dimensional reconstruction of step S110 and step S120.
- the original three-dimensional object mainly includes three parts: voxels determined according to the depth map of the main view, voxels determined according to the depth map of the back view, and voxels determined according to the voxel cube. Among them, the voxels determined according to the depth map of the main view are considered visible in the first viewing angle, and the remaining voxels are considered invisible in the first viewing angle.
- V0' is a candidate perspective three-dimensional object obtained after the original three-dimensional object is rotated based on the candidate perspective.
- P0 is the projection image of the candidate-view three-dimensional object under the candidate perspective.
- P0 includes pixels corresponding to voxels of the candidate-view three-dimensional object that are visible in the first view and pixels corresponding to voxels that are not visible in the first view; the two are marked with squares of different gray levels.
- the original visible ratio can then be determined as the number of the former divided by the sum of the former and the latter.
- the projection map is used to determine the original visible scale, which is easy to implement, and the final three-dimensional reconstruction result is more accurate.
- FIG. 9 shows a schematic flowchart of generating a supplementary two-dimensional image in step S140 according to a specific embodiment of the present invention, and this step S140 includes the following steps:
- S141 Calculate the horizontal rotation angle and the vertical rotation angle between the camera pose of the first angle of view and the camera pose of the supplementary angle of view.
- the camera poses of different viewing angles can be expressed as a lateral rotation angle in the spherical coordinate system (the rotation angle relative to the X axis in the XOY plane) and a longitudinal rotation angle (the rotation angle relative to the Z axis in a plane perpendicular to XOY), represented by (θ, φ).
- assuming the first angle of view is (θ1, φ1) and the supplementary angle of view is (θ2, φ2), the horizontal and vertical rotation angles between the two camera poses can be expressed as (θ2-θ1, φ2-φ1).
- S142 Splice a vector composed of the horizontal rotation angle and the vertical rotation angle onto each vector in the original image features, and use all the spliced vectors as the supplementary image features.
- H ⁇ W feature vectors can be extracted from each original two-dimensional image, and these H ⁇ W feature vectors constitute the original image features.
- the horizontal rotation angle and the vertical rotation angle (θ2-θ1, φ2-φ1) calculated in step S141 are spliced onto each feature vector, so that each spliced feature vector contains n+2 elements.
- the spliced feature vector is represented as (P1, P2, ..., Pn, θ2-θ1, φ2-φ1).
- each feature vector in the original image features is spliced in this way, and all the spliced feature vectors together constitute the supplementary image features, as sketched below.
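- A minimal sketch of this splicing step, added for illustration and assuming (C, H, W) NumPy features; the patent itself does not fix the array layout:

```python
import numpy as np

def splice_rotation(feats, d_theta, d_phi):
    """feats: (C, H, W) original image features. Appends the horizontal and
    vertical rotation angles to every per-pixel feature vector -> (C+2, H, W)."""
    h, w = feats.shape[1:]
    angles = np.array([d_theta, d_phi], dtype=feats.dtype).reshape(2, 1, 1)
    return np.concatenate([feats, np.broadcast_to(angles, (2, h, w))], axis=0)
```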
- S143 Generate the supplementary two-dimensional image based on the supplementary image feature.
- a decoder composed of a convolutional neural network can be used to generate a supplementary two-dimensional image corresponding to the supplementary image feature. It can be understood that the decoder can be obtained by training using sample features and corresponding sample images.
- the supplementary image features are obtained by splicing the rotation angles between the camera poses onto the feature vectors of the original image features, and the supplementary two-dimensional image is generated based on these supplementary image features, which is simple to operate and easy to implement.
- Fig. 10 shows a schematic flow chart of generating a supplementary two-dimensional image according to another specific embodiment of the present invention. Specific steps are as follows:
- the projection image of the original three-dimensional object under the supplementary perspective may be obtained similarly to the acquisition of the projection image of the candidate perspective three-dimensional object under the candidate perspective in the foregoing step S710.
- the projection image of the original three-dimensional object under the supplementary perspective can be obtained directly based on the result of step S710.
- the projection image of the original three-dimensional object under the supplementary viewing angle contains pixels corresponding to the voxels that are visible in the first viewing angle of the original three-dimensional object and pixels corresponding to the voxels that are not visible in the first viewing angle.
- this step S141' may include the following steps: a) for pixels in the projection image corresponding to voxels of the original three-dimensional object visible in the first viewing angle, the corresponding feature vector in the target feature can be determined according to the original image features; specifically, the corresponding feature vector in the original image features can be used as the corresponding feature vector in the target feature.
- b) for the remaining pixels, corresponding to voxels not visible in the first viewing angle, the corresponding feature vector in the target feature can be determined based on random noise, for example by using random noise directly as the corresponding feature vector in the target feature.
- the random noise can take any value in the range [0,1].
- if the original two-dimensional image includes multiple images with different viewing angles, the original image features correspondingly contain multiple features, one for each viewing angle.
- in that case, the corresponding feature vectors in all the original image features can be summed and averaged, and the obtained average value is used as the feature vector of that pixel in the target feature.
- S142' Generate the supplementary two-dimensional image according to the target feature.
- a decoder composed of a convolutional neural network can be used to generate a supplementary two-dimensional image corresponding to the target feature based on the target feature extracted in step S141'.
- a person of ordinary skill in the art can understand the specific operation, and for the sake of brevity, it will not be repeated here.
- Fig. 11 shows a schematic block diagram of generating a supplementary two-dimensional image according to a specific embodiment of the present invention.
- V0 is the original three-dimensional object generated by the three-dimensional reconstruction
- V0" is the supplementary perspective three-dimensional object obtained after the original three-dimensional object is rotated based on the supplementary perspective
- P0' is the projection image of the supplementary-view three-dimensional object under the supplementary perspective. P0' may include pixels corresponding to voxels of the original three-dimensional object that are visible in the first viewing angle and pixels corresponding to voxels that are not visible in the first viewing angle.
- feature vectors are determined separately for these two kinds of pixels and are used together to generate the target feature: for the former, the corresponding feature vector can be derived from the original image features extracted from the original two-dimensional image; for the latter, the corresponding feature vector can be determined based on random noise.
- step S141' further includes: concatenating P0' with the feature vector determined in step a) and step b) to generate the target feature.
- P0' is a matrix of 1 ⁇ H ⁇ W (H represents the height of the original two-dimensional image, and W represents the width of the original two-dimensional image).
- the original image feature is a C ⁇ H ⁇ W tensor as described above, and the feature vector determined in step a) and step b) also constitutes a C ⁇ H ⁇ W feature tensor.
- the (C+1) ⁇ H ⁇ W tensor is the generated target feature.
- P0' is used as the mask in the target feature, which will further improve the accuracy of the 3D reconstruction result.
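- A hedged sketch of assembling the target feature, added for illustration: per-pixel selection between original features and random noise, with the projection mask P0' concatenated as the extra channel:

```python
import numpy as np

def build_target_feature(feats, visible_px, rng=None):
    """feats: (C, H, W) original image features; visible_px: (H, W) bool mask,
    True where the projection pixel maps to a voxel visible in the first view."""
    rng = rng or np.random.default_rng()
    noise = rng.random(feats.shape).astype(feats.dtype)   # values in [0, 1)
    body = np.where(visible_px[None], feats, noise)       # pick feature or noise
    mask = visible_px[None].astype(feats.dtype)           # P0' as a 1 x H x W mask
    return np.concatenate([mask, body], axis=0)           # (C+1, H, W) target feature
```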
- the target feature can be decoded by a decoder composed of, for example, a convolutional neural network, so as to obtain a corresponding supplementary two-dimensional image.
- the supplementary two-dimensional image generated in the above technical solution contains more information in the original two-dimensional image, and also contains sufficient supplementary information, so that the three-dimensional reconstruction result obtained based on it has a high degree of credibility.
- step S130 to step S160 can be iterated multiple times, and the final 3D reconstruction result can be determined according to whether the iteration termination condition is satisfied.
- Fig. 12 shows a schematic flowchart of a three-dimensional reconstruction method according to another embodiment of the present invention. As shown in FIG. 12, the three-dimensional reconstruction method includes the following steps:
- S1210 Extract original image features from the original two-dimensional image of the target object.
- S1220 Determine an original three-dimensional object based on the original image feature.
- S1230 Determine the camera pose of the supplementary angle of view of the target object, wherein the supplementary angle of view is different from the first angle of view for generating the original two-dimensional image.
- S1240 Based on the camera pose of the supplementary angle of view, generate a supplementary two-dimensional image of the target object in the supplementary angle of view.
- S1250 Perform three-dimensional reconstruction on the supplementary two-dimensional image to generate a supplementary three-dimensional object corresponding to the supplementary two-dimensional image.
- S1260 Fuse the original three-dimensional object and the supplementary three-dimensional object to obtain a three-dimensional reconstruction result of the target object.
- S1270 Determine whether the proportion of visible voxels in the three-dimensional reconstruction result is greater than the second ratio.
- the proportion of visible voxels in the three-dimensional reconstruction result is the proportion of voxels visible in the first perspective among the voxels of the three-dimensional reconstruction result visible in the supplementary perspective. For example, if the three-dimensional reconstruction result has m voxels visible in the supplementary viewing angle, of which M are also visible in the first viewing angle, the proportion of visible voxels is M/m. It can be understood that this proportion reflects the credibility of the three-dimensional reconstruction result.
- the second ratio can be any value between 70% and 90%. In an example, the above-mentioned second ratio is 85%. This value takes into account the consumption of computing resources and the accuracy of the calculation results.
- if the proportion is not greater than the second ratio, three-dimensional reconstruction is performed again based on the camera pose of a new supplementary angle of view: a proportion of visible voxels not greater than the second ratio indicates that there is still a gap between the current three-dimensional reconstruction result and the real target object.
- if the proportion of visible voxels is greater than the second ratio, step S1280 is executed: the three-dimensional object generated under the current viewing angle is already close to the real three-dimensional object, so the three-dimensional reconstruction result is taken as the final result. A schematic sketch of this loop follows.
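- A schematic sketch of the iteration of FIG. 12, added for illustration; every helper name below is a hypothetical placeholder for the corresponding step, not a real API:

```python
def reconstruct(image, second_ratio=0.85):
    feats = extract_features(image)                         # S1210 (placeholder)
    result = reconstruct_object(feats)                      # S1220 (placeholder)
    while True:
        pose = pick_supplementary_pose(result)              # S1230 (placeholder)
        supp_img = render_supplementary(result, pose)       # S1240 (placeholder)
        supp = reconstruct_object(extract_features(supp_img))  # S1250
        result = fuse_voxels(result, supp)                  # S1260 (placeholder)
        if visible_ratio(result, pose) > second_ratio:      # S1270
            return result                                   # S1280: final result
```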
- a three-dimensional reconstruction device is also provided.
- Fig. 13 shows a schematic block diagram of a three-dimensional reconstruction device according to an embodiment of the present invention.
- the apparatus 1300 includes a feature extraction module 1310, a first reconstruction module 1320, a supplementary perspective module 1330, a supplementary image module 1340, a second reconstruction module 1350, and a fusion module 1360.
- the various modules can respectively execute the various steps/functions of the three-dimensional reconstruction method described above. In the following, only the main functions of the components of the device 1300 are described, and the details that have been described above are omitted.
- the feature extraction module 1310 is used to extract original image features from the original two-dimensional image of the target object
- the first reconstruction module 1320 is configured to determine the original three-dimensional object based on the original image feature
- a supplementary perspective module 1330 configured to determine a camera pose of a supplementary perspective of the target object, wherein the supplementary perspective is different from the first perspective for generating the original two-dimensional image;
- a supplementary image module 1340 configured to generate a supplementary two-dimensional image of the target object in the supplementary perspective based on the camera pose of the supplementary perspective;
- the second reconstruction module 1350 is configured to perform three-dimensional reconstruction on the supplementary two-dimensional image to generate a supplementary three-dimensional object corresponding to the supplementary two-dimensional image;
- the fusion module 1360 is used to fuse the original three-dimensional object and the supplementary three-dimensional object to obtain a three-dimensional reconstruction result of the target object.
- a three-dimensional reconstruction system including: a processor and a memory, wherein computer program instructions are stored in the memory, and the computer program instructions are used to execute the above-mentioned three-dimensional reconstruction method when run by the processor.
- Fig. 14 shows a schematic block diagram of a three-dimensional reconstruction system according to an embodiment of the present invention.
- the system 1400 includes an input device 1410, a storage device 1420, a processor 1430, and an output device 1440.
- the input device 1410 is used for receiving operation instructions input by the user and collecting data.
- the input device 1410 may include one or more of a keyboard, a mouse, a microphone, a touch screen, an image capture device, and the like.
- the storage device 1420 stores computer program instructions for implementing the corresponding steps in the three-dimensional reconstruction method according to the embodiment of the present invention.
- the processor 1430 is used to run the computer program instructions stored in the storage device 1420 to execute the corresponding steps of the three-dimensional reconstruction method according to the embodiment of the present invention, and to implement the corresponding modules of the three-dimensional reconstruction apparatus according to the embodiment of the present invention.
- the output device 1440 is used to output various information (such as images and/or sounds) to the outside (such as a user), and may include one or more of a display, a speaker, and the like.
- In one embodiment, when the computer program instructions are run by the processor 1430, the system 1400 is caused to perform the following steps: extracting original image features from an original two-dimensional image of a target object; determining an original three-dimensional object based on the original image features; determining a camera pose of a supplementary perspective of the target object, wherein the supplementary perspective is different from the first perspective used to generate the original two-dimensional image; generating a supplementary two-dimensional image of the target object under the supplementary perspective based on the camera pose of the supplementary perspective; performing three-dimensional reconstruction on the supplementary two-dimensional image to generate a supplementary three-dimensional object corresponding to the supplementary two-dimensional image; and fusing the original three-dimensional object and the supplementary three-dimensional object to obtain a three-dimensional reconstruction result of the target object.
- According to yet another aspect of the present invention, a storage medium is provided, on which program instructions are stored. When the program instructions are run by a computer or a processor, the computer or the processor is caused to execute the corresponding steps of the three-dimensional reconstruction method of the embodiment of the present invention, and to implement the corresponding modules of the three-dimensional reconstruction device or the three-dimensional reconstruction system according to the embodiment of the present invention.
- the storage medium may include, for example, a memory card of a smart phone, a storage component of a tablet computer, a hard disk of a personal computer, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disk read-only memory (CD-ROM), USB memory, or any combination of the above storage media.
- the computer-readable storage medium may be any combination of one or more computer-readable storage media.
- In one embodiment, when the program instructions are run, the computer or the processor executes the following steps: extracting original image features from an original two-dimensional image of a target object; determining an original three-dimensional object based on the original image features; determining a camera pose of a supplementary perspective of the target object; generating a supplementary two-dimensional image of the target object under the supplementary perspective based on the camera pose of the supplementary perspective; performing three-dimensional reconstruction on the supplementary two-dimensional image to generate a supplementary three-dimensional object; and fusing the original three-dimensional object and the supplementary three-dimensional object to obtain a three-dimensional reconstruction result of the target object.
- the disclosed device and method may be implemented in other ways.
- the device embodiments described above are only illustrative.
- the division of the units is only a logical function division; in actual implementation there may be other divisions, for example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not implemented.
- the various component embodiments of the present invention may be implemented by hardware, or by software modules running on one or more processors, or by a combination of them.
- In practice, a microprocessor or a digital signal processor (DSP) may be used to implement some or all of the functions of some modules in the three-dimensional reconstruction apparatus according to the embodiments of the present invention.
- the present invention can also be implemented as a device program (for example, a computer program and a computer program product) for executing part or all of the methods described herein.
- Such a program for realizing the present invention may be stored on a computer-readable medium, or may have the form of one or more signals.
- Such a signal can be downloaded from an Internet website, or provided on a carrier signal, or provided in any other form.
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Computer Graphics (AREA)
- Geometry (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Claims (18)
- A three-dimensional reconstruction method, characterized by comprising: extracting original image features from an original two-dimensional image of a target object; determining an original three-dimensional object based on the original image features; determining a camera pose of a supplementary perspective of the target object, wherein the supplementary perspective is different from the first perspective used to generate the original two-dimensional image; generating a supplementary two-dimensional image of the target object under the supplementary perspective based on the camera pose of the supplementary perspective; performing three-dimensional reconstruction on the supplementary two-dimensional image to generate a supplementary three-dimensional object corresponding to the supplementary two-dimensional image; and fusing the original three-dimensional object and the supplementary three-dimensional object to obtain a three-dimensional reconstruction result of the target object.
- The three-dimensional reconstruction method according to claim 1, characterized in that determining the original three-dimensional object based on the original image features comprises: decoding the original image features through a depth neural network to obtain a depth map of the target object; decoding the original image features through a voxel neural network to obtain a voxel cube of the target object; and determining the original three-dimensional object based on the depth map and the voxel cube.
- The three-dimensional reconstruction method according to claim 2, characterized in that determining the original three-dimensional object based on the depth map and the voxel cube comprises: determining the visible voxels in the original three-dimensional object according to the depth map; and determining the other voxels in the original three-dimensional object according to the voxel cube.
- The three-dimensional reconstruction method according to claim 3, characterized in that the depth map of the target object comprises a depth map of a main perspective and a depth map of a rear perspective of the target object.
- The three-dimensional reconstruction method according to claim 1, characterized in that the original two-dimensional image comprises multiple images of different perspectives, and determining the original three-dimensional object based on the original image features comprises: determining a corresponding per-perspective three-dimensional object based on the corresponding original image features extracted from the original two-dimensional image of each perspective; and fusing all per-perspective three-dimensional objects to obtain the original three-dimensional object.
- The three-dimensional reconstruction method according to claim 5, characterized in that fusing all per-perspective three-dimensional objects to obtain the original three-dimensional object comprises: rotating each per-perspective three-dimensional object to a standard posture to obtain a corresponding standard-perspective three-dimensional object; and determining the voxels of the original three-dimensional object according to the voxels of all standard-perspective three-dimensional objects.
- The three-dimensional reconstruction method according to claim 6, characterized in that determining the voxels of the original three-dimensional object according to the voxels of all standard-perspective three-dimensional objects comprises: for each position involved in all standard-perspective three-dimensional objects, when the proportion of standard-perspective three-dimensional objects having a voxel at the corresponding position exceeds a first ratio, determining that the original three-dimensional object has a voxel at that position.
- The three-dimensional reconstruction method according to claim 1, characterized in that determining the camera pose of the supplementary perspective of the target object comprises: obtaining a preset camera pose of at least one candidate perspective; for the camera pose of each candidate perspective, rotating the original three-dimensional object to the candidate perspective to obtain a corresponding candidate-perspective three-dimensional object; determining an original visible ratio of visible voxels of the candidate-perspective three-dimensional object; and when the original visible ratio is within a first range, determining the camera pose of the candidate perspective as the camera pose of the supplementary perspective.
- The three-dimensional reconstruction method according to claim 8, characterized in that determining the original visible ratio of visible voxels of the candidate-perspective three-dimensional object comprises: projecting the candidate-perspective three-dimensional object based on the candidate perspective to obtain a projection image; counting the number of pixels of the candidate-perspective three-dimensional object in the projection image that are visible under the first perspective; and determining the original visible ratio according to the counted number of pixels and the total number of pixels of the candidate-perspective three-dimensional object in the projection image.
- The three-dimensional reconstruction method according to claim 1, characterized in that generating the supplementary two-dimensional image of the target object under the supplementary perspective based on the camera pose of the supplementary perspective comprises: calculating a lateral rotation angle and a longitudinal rotation angle between the camera pose of the first perspective and the camera pose of the supplementary perspective; concatenating a vector composed of the lateral rotation angle and the longitudinal rotation angle with each vector in the original image features, so that all concatenated vectors constitute supplementary image features; and generating the supplementary two-dimensional image based on the supplementary image features.
- The three-dimensional reconstruction method according to claim 1, characterized in that generating the supplementary two-dimensional image of the target object under the supplementary perspective based on the camera pose of the supplementary perspective comprises: extracting target features according to a projection image of the original three-dimensional object under the supplementary perspective and the original image features; and generating the supplementary two-dimensional image according to the target features.
- The three-dimensional reconstruction method according to claim 11, characterized in that extracting the target features according to the projection image of the original three-dimensional object under the supplementary perspective and the original image features comprises: for pixels in the projection image corresponding to voxels of the original three-dimensional object that are visible under the first perspective, determining the corresponding feature vectors in the target features according to the original image features; and for other pixels in the projection image, determining the corresponding feature vectors in the target features based on random noise.
- The three-dimensional reconstruction method according to claim 12, characterized in that the original two-dimensional image comprises multiple images of different perspectives, the original image features comprise multiple features each corresponding to one of the images of different perspectives, and determining the corresponding feature vectors in the target features according to the original image features comprises: for pixels in the projection image corresponding to voxels of the original three-dimensional object that are visible under the first perspective, averaging the corresponding feature vectors in the multiple original image features, and taking the average value as the corresponding feature vector in the target features.
- The three-dimensional reconstruction method according to claim 12, characterized in that extracting the target features according to the projection image of the original three-dimensional object under the supplementary perspective and the original image features further comprises: concatenating the projection image with the determined feature vectors to generate the target features.
- The three-dimensional reconstruction method according to claim 1, characterized in that, after fusing the original three-dimensional object and the supplementary three-dimensional object to obtain the three-dimensional reconstruction result of the target object, the method further comprises: judging whether the proportion of visible voxels in the three-dimensional reconstruction result is greater than a second ratio; and when it is not greater than the second ratio, using the supplementary two-dimensional image as the original two-dimensional image and performing three-dimensional reconstruction again based on the camera pose of a new supplementary perspective, until the proportion of visible voxels in the three-dimensional reconstruction result is greater than the second ratio.
- A three-dimensional reconstruction device, characterized by comprising: a feature extraction module for extracting original image features from an original two-dimensional image of a target object; a first reconstruction module for determining an original three-dimensional object based on the original image features; a supplementary perspective module for determining a camera pose of a supplementary perspective of the target object, wherein the supplementary perspective is different from the first perspective used to generate the original two-dimensional image; a supplementary image module for generating a supplementary two-dimensional image of the target object under the supplementary perspective based on the camera pose of the supplementary perspective; a second reconstruction module for performing three-dimensional reconstruction on the supplementary two-dimensional image to generate a supplementary three-dimensional object corresponding to the supplementary two-dimensional image; and a fusion module for fusing the original three-dimensional object and the supplementary three-dimensional object to obtain a three-dimensional reconstruction result of the target object.
- A three-dimensional reconstruction system, comprising a processor and a memory, wherein computer program instructions are stored in the memory, characterized in that the computer program instructions, when run by the processor, are used to execute the three-dimensional reconstruction method according to any one of claims 1 to 15.
- A storage medium on which program instructions are stored, characterized in that the program instructions, when run, are used to execute the three-dimensional reconstruction method according to any one of claims 1 to 15.
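Purely as an illustration forming no part of the claims, the voxel voting recited in claim 7 could be sketched as follows; the `first_ratio` default of 0.5 is an assumed example, since the claim does not fix its value.

```python
# A minimal sketch of the claim-7 voxel vote; `first_ratio` is an
# assumed example value, not taken from the patent.
import numpy as np

def fuse_standard_view_voxels(voxel_grids, first_ratio=0.5):
    """A voxel exists in the fused original 3D object when the fraction of
    standard-posture grids occupied at that position exceeds first_ratio."""
    stack = np.stack([g.astype(bool) for g in voxel_grids])  # (n_views, D, H, W)
    votes = stack.mean(axis=0)        # fraction of views voting "occupied"
    return votes > first_ratio        # boolean occupancy grid of the fused object
```

With three standard-perspective grids and `first_ratio=0.5`, for instance, a position is retained only when at least two of the three grids mark it as occupied.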
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201980002779.1A CN110998671B (zh) | 2019-11-22 | 2019-11-22 | 三维重建方法、装置、系统和存储介质 |
PCT/CN2019/120394 WO2021097843A1 (zh) | 2019-11-22 | 2019-11-22 | 三维重建方法、装置、系统和存储介质 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2019/120394 WO2021097843A1 (zh) | 2019-11-22 | 2019-11-22 | 三维重建方法、装置、系统和存储介质 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021097843A1 (zh) | 2021-05-27 |
Family
ID=70080495
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/120394 WO2021097843A1 (zh) | 2019-11-22 | 2019-11-22 | 三维重建方法、装置、系统和存储介质 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110998671B (zh) |
WO (1) | WO2021097843A1 (zh) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022077190A1 (zh) * | 2020-10-12 | 2022-04-21 | 深圳市大疆创新科技有限公司 | 数据处理方法、控制设备及存储介质 |
CN114697516B (zh) * | 2020-12-25 | 2023-11-10 | 花瓣云科技有限公司 | 三维模型重建方法、设备和存储介质 |
CN113628348B (zh) * | 2021-08-02 | 2024-03-15 | 聚好看科技股份有限公司 | 一种确定三维场景中视点路径的方法及设备 |
CN114119839B (zh) * | 2022-01-24 | 2022-07-01 | 阿里巴巴(中国)有限公司 | 三维模型重建与图像生成方法、设备以及存储介质 |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109003325B (zh) * | 2018-06-01 | 2023-08-04 | 杭州易现先进科技有限公司 | 一种三维重建的方法、介质、装置和计算设备 |
- 2019-11-22: CN application CN201980002779.1A filed; published as CN110998671B (active)
- 2019-11-22: WO application PCT/CN2019/120394 filed; published as WO2021097843A1 (application filing)
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104408762A (zh) * | 2014-10-30 | 2015-03-11 | 福州大学 | 利用单目和二维平台获取物体图像信息及三维模型的方法 |
CN106210700A (zh) * | 2016-07-14 | 2016-12-07 | 上海玮舟微电子科技有限公司 | 三维图像的获取系统、显示系统及所适用的智能终端 |
US20190333269A1 (en) * | 2017-01-19 | 2019-10-31 | Panasonic Intellectual Property Corporation Of America | Three-dimensional reconstruction method, three-dimensional reconstruction apparatus, and generation method for generating three-dimensional model |
CN108269300A (zh) * | 2017-10-31 | 2018-07-10 | 杭州先临三维科技股份有限公司 | 牙齿三维数据重建方法、装置和系统 |
WO2019211970A1 (ja) * | 2018-05-02 | 2019-11-07 | パナソニックIpマネジメント株式会社 | 三次元再構成方法及び三次元再構成装置 |
CN110148084A (zh) * | 2019-05-21 | 2019-08-20 | 智慧芽信息科技(苏州)有限公司 | 由2d图像重建3d模型的方法、装置、设备及存储介质 |
Also Published As
Publication number | Publication date |
---|---|
CN110998671A (zh) | 2020-04-10 |
CN110998671B (zh) | 2024-04-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19953425; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| WWE | Wipo information: entry into national phase | Ref document number: 17915487; Country of ref document: US |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 19953425; Country of ref document: EP; Kind code of ref document: A1 |
| 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 23.01.2023) |