CN110490916B

CN110490916B - Three-dimensional object modeling method and apparatus, image processing device, and medium

Info

Publication number: CN110490916B
Application number: CN201910295558.XA
Authority: CN
Inventors: 田虎; 周连江; 关海波; 段小军; 王鐘烽; 李海洋; 杨毅; 朱辰
Original assignee: Beijing Urban Network Neighbor Information Technology Co Ltd
Current assignee: Beijing Urban Network Neighbor Information Technology Co Ltd
Priority date: 2019-04-12
Filing date: 2019-04-12
Publication date: 2020-07-17
Anticipated expiration: 2039-04-12
Also published as: CN110490916A; CN111862179B; CN111862179A

Abstract

The invention discloses a three-dimensional object modeling method and device, an image processing device and a medium. The method comprises the following steps: extracting a plane contour of each panoramic image in a three-dimensional space; normalizing the scale of the position of the camera and the scale of the plane contour of each panoramic image in the three-dimensional space to obtain the normalized plane contour of each panoramic image in the three-dimensional space; and based on the position of the camera, rotating and translating the three-dimensional point coordinates of the plane contour of each panoramic image subjected to scale normalization in the three-dimensional space so as to unify the three-dimensional point coordinates into the same coordinate system, and splicing the object contours into a multi-object plane contour. By using the high-resolution panoramic image, the resolution of the model constructed for the three-dimensional object can be improved, and the accuracy of object modeling can be improved by performing multi-contour stitching and subsequent multi-contour optimization through coordinate transformation, so that the resolution and the accuracy of the generated model of the three-dimensional object can be effectively improved.

Description

Three-dimensional object modeling method and apparatus, image processing device, and medium

Technical Field

The present invention relates to the field of image processing and three-dimensional object modeling, and more particularly, to a three-dimensional object modeling method and apparatus, an image processing device, and a medium.

Background

In the field of three-dimensional object modeling, how to make the generated three-dimensional model have high resolution and high accuracy is a goal that is strongly pursued in the industry.

There are two main ways to model three-dimensional objects at present.

One is to take multiple images from different angles using ordinary image acquisition equipment and then combine or stitch these images together to construct a three-dimensional model of the three-dimensional object. However, this method requires complicated image stitching, and it is difficult to obtain a three-dimensional model with high accuracy.

The other is to acquire three-dimensional point clouds of the three-dimensional object directly with three-dimensional scanning equipment and then stitch the point clouds to generate a three-dimensional model. However, the image acquisition device of such scanning equipment has limited accuracy, so the captured images have low resolution and the generated three-dimensional model has low resolution as well.

Therefore, how to obtain a high-resolution acquired image and how to accurately splice the images to obtain a vivid three-dimensional model are the technical problems to be solved by the invention.

Disclosure of Invention

In order to solve one of the above problems, the present invention provides a three-dimensional object modeling method and apparatus, an image processing device, and a medium.

According to an embodiment of the present invention, there is provided a three-dimensional object modeling method including: a planar contour extraction step in which, for at least one panoramic image taken for at least one three-dimensional object to be modeled, a planar contour in three-dimensional space of each panoramic image is extracted, wherein each panoramic image is taken for one three-dimensional object, each three-dimensional object corresponding to one or more panoramic images; a scale normalization step, wherein the scale of the plane contour of each panoramic image in the three-dimensional space is normalized based on the camera position, and the normalized plane contour of each panoramic image in the three-dimensional space is obtained; and a multi-object splicing step, wherein, based on the position of the camera, the three-dimensional point coordinates of the plane contour in the three-dimensional space of each panoramic image after scale normalization are rotated and translated so as to unify the three-dimensional point coordinates into the same coordinate system, thereby splicing each object contour into a multi-object plane contour.

Optionally, the camera position is predetermined or estimated by: performing feature point matching between the panoramic images by using the geometric relationship of at least one panoramic image shot for at least one three-dimensional object to be processed, and recording mutually matched feature points in the panoramic images as matching feature points; and reducing the reprojection error of the matching feature points on the panoramic image for each panoramic image to obtain the camera position when each panoramic image is shot and the three-dimensional point coordinates of the matching feature points on each panoramic image.

Optionally, the scale normalization step includes: when the camera position is obtained through estimation, sorting, from small to large, the height values in all three-dimensional point coordinates on the at least one panoramic image obtained when the camera position was estimated, and taking the median or mean of the top-ranked height values as a unified class-specific contour estimated height h_c'; and, using the class-specific contour assumed height h_c and the class-specific contour estimated height h_c', generating the normalized planar contour in three-dimensional space of each panoramic image from the planar contour in three-dimensional space of said each panoramic image, wherein the class-specific contour assumed height h_c is an arbitrarily assumed height.

Optionally, the scale normalization step includes: in the case where the camera position is a predetermined camera determined height h_c', using the camera assumed height h_c assumed in the plane contour extraction step and the camera determined height h_c' to generate the normalized planar contour in three-dimensional space of each panoramic image from the planar contour in three-dimensional space of said each panoramic image, wherein the class-specific contour assumed height h_c is an arbitrarily assumed height.

Optionally, the single-object plane contour generating step includes: for the at least one panoramic image, determining one by one whether a plurality of panoramic images belong to the same three-dimensional object by the following method: if more than a specific proportion of matching feature points exist between two panoramic images, the two panoramic images are determined to belong to the same three-dimensional object; and if the plurality of panoramic images are determined to belong to the same three-dimensional object, taking the union of all plane contours of that three-dimensional object obtained from the plurality of panoramic images as the plane contour of the three-dimensional object in the three-dimensional space.

Optionally, in the multi-object stitching step, a multi-object plane contour in a three-dimensional space can be obtained by stitching based on the plane contour in the three-dimensional space of each single three-dimensional object.

Optionally, the multi-object stitching step includes: assuming that there are N planar contours in three-dimensional space over all the panoramic images, the p-th three-dimensional point of the n-th planar contour is represented as

The camera position at which the panoramic image corresponding to the n-th planar contour was taken is expressed as {R_n, t_n}, where R_n is a rotation matrix representing the rotation parameters of the camera position and t_n is a translation vector representing the translation parameters of the camera position. The camera position at which the panoramic image corresponding to the i-th planar contour was taken is selected as the reference coordinate system, and the three-dimensional points of the other planar contours are unified into the reference coordinate system by the following formula:

All three-dimensional points of the scale-normalized planar contours other than the i-th planar contour are converted through the formula, so that the three-dimensional points of all the planar contours are unified into the same coordinate system, thereby splicing all the object contours into a multi-object plane contour.

Optionally, the three-dimensional object modeling method further includes: a multi-object contour optimization step, wherein the distance between the adjacent edges of two single-object contours in the multi-object contour is calculated, and if the distance is nonzero and smaller than a specific threshold value, the two single-object contours are shifted so that the distance between the adjacent edges becomes zero.

Optionally, the three-dimensional object modeling method further includes: a 3D model generation step, wherein, after the multi-object splicing step, the multi-object plane contour in the three-dimensional space obtained by splicing is converted into a multi-object 3D model.

According to an exemplary embodiment of the present invention, there is provided a three-dimensional object modeling apparatus including: a planar contour extraction means configured to extract a planar contour in three-dimensional space of each panoramic image for at least one panoramic image taken for at least one three-dimensional object to be modeled, wherein each panoramic image is taken for one three-dimensional object, and each three-dimensional object corresponds to one or more panoramic images; the scale normalization device is configured to normalize the scale of the plane contour of each panoramic image in the three-dimensional space based on the camera position to obtain the normalized plane contour of each panoramic image in the three-dimensional space; and the multi-object splicing device is used for performing rotation and translation operations on three-dimensional point coordinates of the plane contour of each panoramic image subjected to scale normalization in a three-dimensional space based on the position of the camera so as to unify the three-dimensional point coordinates into the same coordinate system, thereby splicing each object contour into a multi-object plane contour.

Optionally, the scale normalization means is further configured to: in the case where the camera position is estimated, sort, from small to large, the height values in all three-dimensional point coordinates on the at least one panoramic image obtained when the camera position was estimated, and take the median or mean of the top-ranked height values as a unified class-specific contour estimated height h_c'; and, using the class-specific contour assumed height h_c and the class-specific contour estimated height h_c', generate the normalized planar contour in three-dimensional space of each panoramic image from the planar contour in three-dimensional space of said each panoramic image, wherein the class-specific contour assumed height h_c is an arbitrarily assumed height.

Optionally, the scale normalization means is further configured to: in the case where the camera position is a predetermined camera determined height h_c', use the camera assumed height h_c assumed in the plane contour extraction step and the camera determined height h_c' to generate the normalized planar contour in three-dimensional space of each panoramic image from the planar contour in three-dimensional space of said each panoramic image, wherein the class-specific contour assumed height h_c is an arbitrarily assumed height.

Optionally, the single-object plane contour generating device is further configured to: for the at least one panoramic image, determine one by one whether a plurality of panoramic images belong to the same three-dimensional object by the following method: if more than a specific proportion of matching feature points exist between two panoramic images, the two panoramic images are determined to belong to the same three-dimensional object; and if the plurality of panoramic images are determined to belong to the same three-dimensional object, take the union of all plane contours of that three-dimensional object obtained from the plurality of panoramic images as the plane contour of the three-dimensional object.

Optionally, the multi-object stitching device is further configured to be able to stitch the multi-object plane contour in the three-dimensional space based on the plane contour in the three-dimensional space of each single three-dimensional object.

Optionally, the multi-object stitching device is further configured to: assuming that there are N planar contours in three-dimensional space over all the panoramic images, the p-th three-dimensional point of the n-th planar contour is represented as

Optionally, the three-dimensional object modeling apparatus further includes: a multi-object contour optimization device configured to calculate the distance between the adjacent edges of two single-object contours in the multi-object contour and, if the distance is nonzero and smaller than a specific threshold value, to shift the two single-object contours so that the distance between the adjacent edges becomes zero.

Optionally, the three-dimensional object modeling apparatus further includes: a 3D model generation device configured to convert the spliced multi-object plane contour in the three-dimensional space into a multi-object 3D model.

According to still another embodiment of the present invention, there is provided an image processing apparatus including: a processor; and a memory having executable code stored thereon, which when executed by the processor, causes the processor to perform one of the methods described above.

According to yet another embodiment of the invention, there is provided a non-transitory machine-readable storage medium having stored thereon executable code which, when executed by a processor, causes the processor to perform one of the methods described above.

The invention carries out 2D modeling and 3D modeling of the three-dimensional object based on a plurality of panoramic images of the three-dimensional object to be processed shot by using the panoramic camera, and overcomes the defect of low model resolution caused by using 3D scanning equipment to generate an object model in the prior art.

In the present invention, a high-resolution captured image is provided for multi-object modeling (e.g., modeling of a house or a vehicle, etc.) by taking a panoramic image of at least one object using a panoramic camera.

Further, in the invention, an efficient three-dimensional object modeling method is adopted, high-resolution data required by modeling is provided for multi-object modeling (such as modeling of houses or vehicles), and the provided data required by modeling can simplify the subsequent model generation process.

Still further, by the modeling method of the present invention, the resolution and accuracy of the generated model (e.g., a 2D and/or 3D model of a house or vehicle, etc.) can be effectively improved.

The invention is suitable for 2D modeling and 3D modeling of a single object and also suitable for 2D modeling and 3D modeling of multiple objects, can perform 2D modeling and 3D modeling based on panoramic images of each three-dimensional object, provides an innovative comprehensive image processing scheme, and can be applied to various VR (virtual reality) scenes for performing object modeling based on panoramic images, such as house modeling (VR seeing a house), vehicle modeling (VR seeing a car), shopping place modeling (VR shopping) and the like.

Drawings

The above and other objects, features and advantages of the present disclosure will become more apparent by describing in more detail exemplary embodiments thereof with reference to the attached drawings, in which like reference numerals generally represent like parts throughout.

Fig. 1 presents a schematic flow-chart of a method of modeling a three-dimensional object according to an exemplary embodiment of the present invention.

Fig. 2 presents a schematic flow-chart of an overall process of three-dimensional object modeling according to another exemplary embodiment of the present invention.

Fig. 3 presents a schematic block diagram of a three-dimensional object modeling apparatus in accordance with an exemplary embodiment of the present invention.

Fig. 4 gives a schematic block diagram of an image processing apparatus according to an exemplary embodiment of the present invention.

Detailed Description

Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art. It should be noted that the numbers, serial numbers and reference numbers in the present application are only presented for convenience of description, and no limitation is made to the steps, the sequence and the like of the present invention unless the specific sequence of the steps is explicitly indicated in the specification.

The invention provides a three-dimensional object modeling method, a three-dimensional object modeling apparatus, an image processing device, and a computer medium.

Firstly, in the invention, a common panoramic camera is adopted to shoot each three-dimensional object to obtain a high-resolution panoramic image, thereby overcoming the defect of low resolution of the image captured by the 3D scanning camera described in the background technology.

Then, using the plurality of panoramic images photographed, a planar contour in a three-dimensional space of a single panoramic image (may be referred to as a "single-image planar contour") may be extracted.

Furthermore, through the scale normalization, the unification between the scale of the single image plane outline and the scale of the camera position can be realized, the normalized single image plane outlines are generated, high-resolution and sufficient data preparation is provided for the subsequent three-dimensional object modeling, and the difficulty of the subsequent processing work is reduced.

Still further, the accurate single-object plane contour can be obtained by fusing the single-image plane contours belonging to the same three-dimensional object.

Still further, the plane outlines of the single objects may be stitched in a three-dimensional space to obtain a multi-object model (in this case, a 2D model).

In addition, the multi-object model can be corrected to obtain a more accurate model, so that the model display effect is better.

Finally, a complete, high resolution and accurate 3D model is obtained by 3D model generation.

Hereinafter, for the convenience of understanding and description, the respective processes of the present invention will be described in detail by taking house modeling as an example.

Here, the panoramic camera is first briefly described. The panoramic camera is different from the general camera in that the general camera generally photographs with only one lens, and the panoramic camera photographs with two or more lenses, so that the panoramic camera can realize 360-degree photographing.

In a three-dimensional object modeling method according to an exemplary embodiment of the present invention, at least one panoramic image is taken for each three-dimensional object (for example, in each room). One panoramic image corresponds to only one room (three-dimensional object), but a plurality of panoramic images may be taken in one room, that is, one room may correspond to a plurality of panoramic images. A planar contour in three-dimensional space is extracted from each panoramic image, and the extracted planar contours are normalized to obtain the planar contours required for modeling. The planar contours are then stitched through coordinate transformation to obtain a multi-object planar contour (2D model), from which a 3D model of the three-dimensional object may be generated.

As shown in fig. 1, in step S110, a planar contour in a three-dimensional space of each panoramic image is extracted for at least one panoramic image taken for at least one three-dimensional object to be modeled, wherein each panoramic image is taken for one three-dimensional object, and each three-dimensional object corresponds to one or more panoramic images.

In this step, the extraction of the planar contour may be achieved in various ways. An example will be briefly given below to explain the plane contour extraction method.

Taking at least one three-dimensional object as an indoor house as an example, each room may be regarded as one three-dimensional object, and at least one panoramic image is taken for each room, so one panoramic image corresponds to one room, but each room may correspond to a plurality of panoramic images.

In this case, since the ceiling is always above the camera, the uppermost pixels in the panoramic image are always on the ceiling. Furthermore, most of the pixels belonging to the ceiling have similar features, so all the pixels belonging to the ceiling can be obtained from the feature similarity of the pixels.

For example, all pixels in the first row of the panoramic image are regarded as belonging to the ceiling. For each pixel in the second row, the feature similarity with the pixel in the same column of the first row is calculated. The feature may be color, gray scale, or the like, and the feature similarity of two pixels may be, for example, the absolute value of the difference between their features (e.g., the difference in gray scale or in color). If the feature similarity is within a certain threshold (for gray values of 0-255, the threshold may be set to 10, for example), the pixel also belongs to the ceiling, and the comparison continues down the column, between the third and second rows, then between the fourth and third rows, and so on, until the feature similarity exceeds the threshold; the pixel position at that point is an edge pixel of the ceiling.
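The column-wise scan described above can be sketched as follows. This is a minimal illustration only: it assumes a gray-scale image stored as a list of rows, uses the example threshold of 10, and the function name `ceiling_edge_rows` is hypothetical.

```python
def ceiling_edge_rows(gray, threshold=10):
    """Return, for each column, the row index of the ceiling edge pixel.

    gray: list of rows, each a list of gray values in 0-255 (assumed layout).
    """
    height, width = len(gray), len(gray[0])
    edges = []
    for col in range(width):
        # If no break is found, the whole column is treated as ceiling.
        edge = height - 1
        for row in range(1, height):
            # Compare each pixel with the pixel directly above it in the same column.
            if abs(gray[row][col] - gray[row - 1][col]) > threshold:
                edge = row  # first pixel whose difference exceeds the threshold
                break
        edges.append(edge)
    return edges
```

The returned row indices, one per column, are the edge pixels that would then be projected into three-dimensional space.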

The edge pixels of the ceiling form the edge of the ceiling, and therefore, the plane outline of the ceiling can be formed by projecting the edge pixels to the three-dimensional space.

The projection of the pixel points into three-dimensional space will be described below.

Assume that the panoramic image has width W and height H, and that an obtained edge pixel c of the ceiling has coordinates (p_c, q_c) in the image coordinate system. Since the panoramic image is obtained by spherical projection, the pixel is expressed as (θ_c, φ_c) in a spherical coordinate system, where θ_c ∈ [-π, π] is the longitude and φ_c ∈ [-π/2, π/2] is the latitude.

The relationship between the spherical coordinate system and the image coordinate system can be obtained by the following formula 1:

The following formula 2 projects the coordinates (θ_c, φ_c) of the ceiling edge pixel c in the spherical coordinate system to the three-dimensional point coordinates (x_c, y_c, z_c) on a three-dimensional plane:

In this document, the term "image coordinate system" refers to a coordinate system where image pixels are located, and is mainly used to describe the locations of the pixels in the image. Therefore, the panoramic image coordinate system refers to a coordinate system where the pixel points of the panoramic image are located, and is mainly used for describing the positions where the pixel points are located in the panoramic image.
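The two mappings described above (image coordinates to spherical coordinates, then spherical coordinates to a point on the ceiling plane) can be illustrated with a short sketch. It assumes one common equirectangular convention, with longitude zero at the image center, latitude positive toward the top of the image, and the y axis pointing up toward the ceiling; this convention and the function names are assumptions made for illustration and may differ from the exact form of formulas 1 and 2.

```python
import math

def pixel_to_sphere(p, q, width, height):
    """Map image coordinates (p, q) to (longitude, latitude) in radians.

    Assumed convention: the image center maps to longitude 0 and latitude 0.
    """
    theta = (p / width - 0.5) * 2.0 * math.pi   # longitude in [-pi, pi]
    phi = (0.5 - q / height) * math.pi          # latitude in [-pi/2, pi/2]
    return theta, phi

def sphere_to_ceiling_point(theta, phi, assumed_height=100.0):
    """Project a viewing direction onto the ceiling plane at the assumed height."""
    if phi <= 0.0:
        raise ValueError("ceiling pixels must lie above the horizon (phi > 0)")
    # Horizontal distance at which the viewing ray reaches the ceiling plane.
    horizontal = assumed_height / math.tan(phi)
    return (horizontal * math.cos(theta),
            assumed_height,
            horizontal * math.sin(theta))
```

Applying both functions to every ceiling edge pixel yields a closed set of three-dimensional points, all at the same assumed height, which together form the plane contour.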

Note that the above gives only one example of generating a plane contour in a three-dimensional space of a panoramic image based on the similarity of ceiling feature points on the panoramic image, and the method that can be used by the present invention is not limited to this example.

Since the ceiling can be regarded as a plane, it can be regarded that each pixel point at the edge of the ceiling has a uniform height from the camera, which can be referred to as "height of the camera from the ceiling".

Here, since the panoramic camera is generally supported by a tripod and has a fixed height, it can be considered that the height of the camera from the ceiling and the height of the camera from the floor are fixed.

For the planar contour in three-dimensional space obtained in this step, a height value may be assumed for every three-dimensional point on the contour; for example, the height of the camera from the ceiling is assumed to be h_c, and the assumed height may be an arbitrary value such as 100 (the true height of the camera from the ceiling may be found by subsequent processing). To avoid confusion, the height h_c assumed here is referred to below as the "assumed height of the camera from the ceiling" h_c.

In the above embodiments, the planar profile of the image can be automatically obtained based on the panoramic image without human intervention for production and without using expensive 3D scanning equipment.

In step S120, the scale of the camera position and the scale of the three-dimensional spatial plane profile of the panoramic image obtained in step S110 are normalized (scale normalization).

Since the assumed height h_c of the camera from the ceiling was adopted when the three-dimensional spatial plane profile of the room was obtained in step S110 (the height of the camera had not been determined at that time), the scale of the camera position (the actual height of the camera) and the scale of the three-dimensional spatial plane profile of the three-dimensional object are not uniform, which makes the subsequent splicing of room profiles difficult.

In this step, the scale of the camera position at the time of shooting each panoramic image and the scale of the plane profile of each panoramic image in the three-dimensional space are normalized to enable the subsequent multi-object stitching processing to be performed.

In this normalization step, two cases are possible: in the first case the camera position (the true camera height) has been determined, and in the second case the camera position has not yet been determined.

In the case where the camera position is a predetermined camera determined height h_c', the assumed height h_c of the camera from the ceiling assumed in the plane contour extraction step S110 and the camera determined height h_c' may be used to generate the normalized planar contour in three-dimensional space of each panoramic image from the planar contour in three-dimensional space of each panoramic image. Here, the class-specific contour assumed height h_c may be an arbitrarily assumed height, for example a coordinate value of 100 or any other value.

For the case where the camera position is estimated, first, the camera position may be estimated by: performing feature point matching between the panoramic images by using the geometric relationship of at least one panoramic image shot for at least one three-dimensional object to be processed, and recording mutually matched feature points in the panoramic images as matching feature points; and reducing the re-projection error of the matching characteristic points on each panoramic image to obtain the position of the camera when each panoramic image is shot and the three-dimensional point coordinates of the matching characteristic points on the panoramic image.

Then, in the case where the camera position is estimated, the scale normalization can be achieved as follows: the height values in all three-dimensional point coordinates on the at least one panoramic image obtained in the process of estimating the camera position are sorted from small to large, and the median or mean of the top-ranked height values is taken as a unified class-specific contour estimated height h_c'; and the class-specific contour assumed height h_c and the class-specific contour estimated height h_c' are used to generate the normalized planar contour in three-dimensional space of each panoramic image from the planar contour in three-dimensional space of each panoramic image. Again, the class-specific contour assumed height h_c is an arbitrarily assumed height, such as 100 or any other value.
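As a rough sketch of this normalization: the text does not spell out how h_c and h_c' are combined or how many top-ranked values are used, so the example below assumes that the contour is rescaled by the factor h_c'/h_c (so that its height becomes h_c') and that a fixed number of the smallest height values is used. Both choices, and the function names, are assumptions made for illustration.

```python
from statistics import median

def estimate_height(heights, top_count=3):
    """Estimate h_c' as the median of the top-ranked (smallest) height values.

    top_count is a hypothetical parameter; the text does not fix how many
    of the sorted values are used.
    """
    ordered = sorted(heights)                      # small to large
    return median(ordered[:max(1, top_count)])

def normalize_contour(contour, assumed_height, estimated_height):
    """Rescale a contour built with h_c so that its height becomes h_c'.

    Assumption: normalization is a uniform scaling by h_c' / h_c.
    """
    scale = estimated_height / assumed_height
    return [(x * scale, y * scale, z * scale) for x, y, z in contour]
```

For example, a contour extracted with an assumed height of 100 and an estimated height of 1.0 would be shrunk by a factor of 0.01, bringing it to the same scale as the estimated camera positions.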

In the present invention, a method based on multi-view geometry may be employed to estimate the camera position; it is not described in detail here for reasons of space.

Optionally, in step S125, a planar contour in three-dimensional space of each individual object may be obtained based on the normalized planar contour in three-dimensional space of each panoramic image.

In the present invention, a corresponding planar contour in three-dimensional space is obtained from a panoramic image, which may be referred to as a "single-object planar contour".

For example, taking each room as a three-dimensional object as an example, as described above, since a plurality of panoramic images of the same room may be included in a captured panoramic image, in this case, the same room will correspond to a plurality of plane contours in a three-dimensional space, and therefore, in a multi-room plane contour obtained by a subsequent multi-room stitching process, a phenomenon may occur in which plane contours obtained from different panoramic images of one or more rooms do not coincide, and thus stitched contours overlap or are confused. Therefore, it is considered to perform fusion of the same room contour (may be referred to as "single object fusion") first to avoid such a phenomenon. Moreover, the single-object fusion can also eliminate the incomplete phenomenon of the single-object outline.

For the above-mentioned situation that single object fusion is required, the following exemplary method will be given below by taking a room as a three-dimensional object as an example.

First, it is determined whether two panoramic images belong to the same room.

Here, a feature point matching-based approach may be adopted, and if there are more than a certain proportion (a certain proportion, for example, 50%, etc.) of matching feature points between two panoramic images, it may be determined that the two panoramic images belong to the same room.

Then, if a plurality of panoramic images belong to the same room, the union of the plane contours of that room obtained from the different panoramic images is taken as the single-room plane contour in three-dimensional space (one contour per room, avoiding multiple single-image contours for one room), thereby realizing fusion of the contours of the same room.

The proportion of matching feature points mentioned above may be set in the following manner: suppose that image 1 has n₁ feature points, image 2 has n₂ feature points, and n feature points are matched between the two images. The proportion of matching feature points may then be n / min(n₁, n₂).

Optionally, it may be set that if the proportion is larger than, for example, 50%, the two images are considered to show the same room.

Here, the setting of the proportion of the matching feature points and the actual size of the proportion may be tested or determined empirically according to actual circumstances, and the present invention is not limited thereto.
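The ratio test described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `matching_ratio` and `same_room` are hypothetical, the 50% threshold is the example value from the text, and the handling of an image with zero feature points is an assumption.

```python
def matching_ratio(n1: int, n2: int, n_matched: int) -> float:
    """Proportion of matching feature points, n / min(n1, n2)."""
    smaller = min(n1, n2)
    if smaller == 0:
        # Assumption: an image without feature points matches nothing.
        return 0.0
    return n_matched / smaller


def same_room(n1: int, n2: int, n_matched: int, threshold: float = 0.5) -> bool:
    """Treat two panoramas as the same room if the ratio exceeds the threshold."""
    return matching_ratio(n1, n2, n_matched) > threshold


if __name__ == "__main__":
    # Image 1 has 400 feature points, image 2 has 250, and 150 are matched.
    print(matching_ratio(400, 250, 150))
    print(same_room(400, 250, 150))
```

Dividing by the smaller of the two counts keeps the ratio meaningful when one panorama yields far fewer feature points than the other.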

As described above, in the present invention, for at least one panoramic image described above, it can be determined whether a plurality of panoramic images belong to the same room by means of single-room fusion as follows: if there are more than a certain proportion of matching feature points between two panoramic images, it can be determined that the two panoramic images belong to the same room.

If it is determined that the plurality of panoramic images belong to the same room, for plane profiles of the same room obtained from the plurality of panoramic images, a union of the plane profiles is taken as a plane profile of the room.
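The pairwise decision above can be turned into per-room groups of panoramic images, for example with a union-find structure. The sketch below is illustrative only: `group_by_room` and the `is_same_room` callback are hypothetical names, and treating the same-room relation as transitive (A matches B and B matches C implies one room) is an assumption not stated in the text. The geometric union of the contours in each group is not shown.

```python
from typing import Callable, Dict, List


def group_by_room(num_images: int,
                  is_same_room: Callable[[int, int], bool]) -> List[List[int]]:
    """Group image indices so that images judged to be the same room share a group."""
    parent = list(range(num_images))

    def find(i: int) -> int:
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a in range(num_images):
        for b in range(a + 1, num_images):
            if is_same_room(a, b):
                root_a, root_b = find(a), find(b)
                if root_a != root_b:
                    parent[root_b] = root_a

    groups: Dict[int, List[int]] = {}
    for i in range(num_images):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Hypothetical match results: images 0 and 2 match, images 2 and 3 match.
    matched_pairs = {(0, 2), (2, 3)}
    print(group_by_room(4, lambda a, b: (a, b) in matched_pairs))
```

Each resulting group would then contribute a single fused plane contour to the later stitching step.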

In addition, after the contours of the same room are fused, the obtained contour edges may contain noise; for example, edge lines may not be straight, and adjacent edge lines may not be perpendicular to each other. Therefore, the invention can further perform right-angle polygon fitting on the contour of each room to obtain a more reasonable room plane contour.

Through the optimization processing specially performed for the single object, such as single object fusion and/or right-angle polygon fitting, a more accurate single object plane contour can be obtained, the subsequent generation of 2D and 3D models is facilitated, and the resolution and the accuracy of the models are improved.

Note that this step is not a necessary step for two-dimensional or three-dimensional modeling of three-dimensional objects, but is a preferred way of processing that can improve the accuracy of the model.

In step S130, the three-dimensional point coordinates of the planar contour in the three-dimensional space of each panoramic image subjected to the scale normalization may be rotated and translated by using the camera position to unify the three-dimensional point coordinates into the same coordinate system, so as to splice the object contours into a multi-object planar contour.

In this step, the scale-normalized plane contours of the objects are stitched into a multi-object contour; in the present invention, this multi-object stitching is performed automatically.

An automated multi-object stitching scheme proposed by the inventors of the present invention will be given below.

The specific operation is described in detail below, taking a room as the three-dimensional object.

Assuming the contour of N rooms, the p-th three-dimensional point of the nth room contour is represented as

The camera position of the room is denoted as {R_n, t_n}, where R_n is a rotation matrix representing the rotation parameters of the camera position, and t_n is a translation vector representing the translation parameters of the camera position.

At this point, a reference coordinate system needs to be selected, because the currently obtained room contours are each expressed in their own coordinate system and need to be unified into one coordinate system. Specifically, the coordinate system in which the camera position of the first room is located may be selected as the reference coordinate system. Then, the contour three-dimensional points of the other rooms can be unified into this coordinate system by the following formula 3:

All scale-normalized contour three-dimensional points (for example, three-dimensional points on a ceiling edge, a wall surface edge and a floor edge) except those of the first room are converted by formula 3, so that the three-dimensional points of all rooms are unified into the same coordinate system (namely, the reference coordinate system of the first room), and stitching of the multi-room plane contour can thereby be achieved.

Here, the coordinate system of any one room can be selected as the reference coordinate system (for example, if the coordinate system of the i-th room is selected as the reference coordinate system, R₁ in formula 3 becomes R_i and t₁ becomes t_i). The present invention is not limited in this regard, because only relative positional relationships, not absolute positional relationships, are required in the present invention.
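The change of coordinate system described above can be illustrated with a small rigid-transform sketch. Formula 3 itself is not reproduced in this text, so the code relies on an assumed pose convention: a camera pose {R, t} maps a world point X to camera coordinates as R·X + t. Under that assumption, a point from room n is first mapped back to world coordinates and then into the reference camera's coordinates. The function names are hypothetical.

```python
from typing import List, Sequence

Vec3 = Sequence[float]
Mat3 = Sequence[Sequence[float]]


def mat_vec(m: Mat3, v: Vec3) -> List[float]:
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def transpose(m: Mat3) -> List[List[float]]:
    """Transpose of a 3x3 matrix (equal to the inverse for a rotation matrix)."""
    return [[m[c][r] for c in range(3)] for r in range(3)]


def to_reference(point: Vec3, r_n: Mat3, t_n: Vec3,
                 r_ref: Mat3, t_ref: Vec3) -> List[float]:
    """Express a point given in room n's camera frame in the reference camera frame.

    Assumed convention (not taken from the source): X_cam = R * X_world + t.
    """
    shifted = [point[k] - t_n[k] for k in range(3)]
    world = mat_vec(transpose(r_n), shifted)   # undo room n's pose
    rotated = mat_vec(r_ref, world)            # apply the reference pose
    return [rotated[k] + t_ref[k] for k in range(3)]


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # Room n: rotated 90 degrees about the vertical y axis, shifted along x.
    r_n = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]]
    t_n = [2.0, 0.0, 0.0]
    print(to_reference([1.0, 0.0, 0.0], r_n, t_n, identity, [0.0, 0.0, 0.0]))
```

Applying `to_reference` to every contour point of every room other than the reference room places all contours in one coordinate system; a different pose convention would change the order of the rotation and translation.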

Here, the multi-object contour obtained after the multi-object stitching of this step may be output as a 2D model (e.g., a 2D floor plan) of the at least one (including a plurality of) three-dimensional objects.

According to another exemplary embodiment of the present invention, optionally, the three-dimensional object modeling method of the present invention may further modify the multi-object contour in step S135.

Note that this step is also not a necessary step for two- or three-dimensional modeling of three-dimensional objects, but a preferred way of processing that can improve the accuracy of the model.

In the invention, after the multi-object contour is spliced, the multi-object contour can be further corrected to obtain a more accurate multi-object contour.

Taking a room as an example of the three-dimensional object of interest, because of the limited precision of single-image plane contour extraction and of camera position estimation, the contours of adjacent three-dimensional objects (such as the rooms of a house) may overlap or leave a gap after stitching. The contours can therefore be further corrected for these two cases.

The correction method may be, for example, as follows. First, the distance between adjacent edges of two contours is calculated (these edges should theoretically coincide, that is, they should form a single shared edge of the multi-room contour). If the distance is smaller than a certain threshold, the two edges can be determined to be adjacent, and the contour can be shifted accordingly so that the distance between the adjacent edges becomes 0 (that is, the edges coincide and become a shared edge), thereby correcting the overlap or gap between the adjacent edges.

For the threshold value, for example, the average length L of the adjacent edges that should be an overlapped edge may be calculated, and a certain proportion of the average length may be used as the threshold value, for example, 0.2 × L may be used as the distance threshold value.

Note that the above is merely an exemplary threshold value given for ease of understanding, and in fact, the present invention does not impose additional limitations on the threshold value, which can be determined experimentally and empirically.
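The correction described above can be sketched in the horizontal plane as follows. This is an illustration under several assumptions: contours are lists of 2D (x, z) points, the two candidate edges are roughly parallel and non-degenerate, the distance is measured from the midpoint of one edge to the line through the other, and the 0.2 ratio is the example value from the text. The function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Edge = Tuple[Point, Point]


def edge_length(edge: Edge) -> float:
    (x1, z1), (x2, z2) = edge
    return math.hypot(x2 - x1, z2 - z1)


def signed_gap(edge_a: Edge, edge_b: Edge) -> Tuple[float, Point]:
    """Signed distance from the midpoint of edge_b to the line through edge_a,
    together with the unit normal of edge_a used to measure it."""
    (ax1, az1), (ax2, az2) = edge_a
    length = edge_length(edge_a)  # assumed non-zero
    normal = (-(az2 - az1) / length, (ax2 - ax1) / length)
    (bx1, bz1), (bx2, bz2) = edge_b
    mid_x, mid_z = (bx1 + bx2) / 2, (bz1 + bz2) / 2
    gap = (mid_x - ax1) * normal[0] + (mid_z - az1) * normal[1]
    return gap, normal


def snap_contour(contour_b: List[Point], edge_a: Edge, edge_b: Edge,
                 ratio: float = 0.2) -> List[Point]:
    """Shift contour_b so that edge_b lies on edge_a when the gap is small."""
    gap, (nx, nz) = signed_gap(edge_a, edge_b)
    threshold = ratio * (edge_length(edge_a) + edge_length(edge_b)) / 2
    if gap == 0 or abs(gap) >= threshold:
        return list(contour_b)  # already coincident, or too far apart to be adjacent
    return [(x - gap * nx, z - gap * nz) for x, z in contour_b]


if __name__ == "__main__":
    wall_a = ((4.0, 0.0), (4.0, 3.0))                    # right wall of room A
    room_b = [(4.2, 0.0), (8.2, 0.0), (8.2, 3.0), (4.2, 3.0)]
    wall_b = (room_b[0], room_b[3])                      # left wall of room B
    print(snap_contour(room_b, wall_a, wall_b))
```

Because the gap is signed, the same shift closes a gap and removes an overlap.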

Thus, the multi-room contour after the above single-room contour fusion and multi-room contour modification can be used as a complete and accurate 2D floor plan (2D model of the house) of the set of houses.

Further, in step S140, the generated multi-object planar contour is converted into a 3D model of at least one three-dimensional object.

For ease of understanding and description, the house modeling will be described as an example below.

For the multi-object plane contour (e.g., multi-room plane contour) obtained in the previous step, three-dimensional point interpolation is performed internally, and then all three-dimensional point coordinates are projected into the corresponding panoramic image so as to acquire the ceiling texture (color value).

Here, a method of interpolating three-dimensional points will be exemplified. For example, assuming that the ceiling profile of the obtained multi-room plane profile is a rectangle, assuming that the length is H and the width is W, the length and the width can be divided into N intervals, respectively, so that a total of N × N interpolation points can be obtained. Then, a vertex of the rectangle may be selected (assuming that the three-dimensional point coordinates of the vertex are (x, y, z)) as an origin, and the N × N points may be sequentially represented by (x + H/N, y, z), (x +2 × H/N, y, z) … (x, y + W/N, z) (x, y +2 × W/N, z), … (x + H/N, y + W/N, z) …. Therefore, after the three-dimensional point interpolation, the dense three-dimensional point coordinates inside the contour are obtained.

It should be noted that a specific example of three-dimensional point interpolation is given above for the sake of understanding, and in fact, the three-dimensional point interpolation method applicable to the present invention may be many and is not limited to this example.
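The rectangular interpolation example above can be sketched as follows. The code follows the example in the text, which offsets the first two coordinates of the chosen vertex and keeps the third fixed. Starting the grid indices at 0, so that the vertex itself is the first sample, is an assumption, as is the function name.

```python
from typing import List, Tuple

Point3 = Tuple[float, float, float]


def interpolate_rectangle(origin: Point3, length_h: float, width_w: float,
                          n: int) -> List[Point3]:
    """Return n * n grid points inside a rectangle of size length_h by width_w.

    The rectangle is divided into n intervals along each side, starting from
    the vertex `origin`.
    """
    x, y, z = origin
    step_h = length_h / n
    step_w = width_w / n
    return [(x + i * step_h, y + j * step_w, z)
            for j in range(n)
            for i in range(n)]


if __name__ == "__main__":
    for point in interpolate_rectangle((0.0, 0.0, 1.0), 4.0, 2.0, 2):
        print(point)
```

A larger n gives denser samples and therefore a finer texture when the points are later projected into the panoramic image.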

In addition, for example, a specific projection method may be as follows. Assume that the coordinates of a three-dimensional point after interpolation are (x_i, y_i, z_i) and that the longitude and latitude of its projection on the panoramic image are (θ_i, φ_i); the projection can then be represented by the following formula 4:

After the longitude and latitude are obtained by this formula, the coordinates of the three-dimensional point on the panoramic image plane can be obtained according to formula 1, and the color value at that position can be used as the texture of the three-dimensional point.
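The two-stage lookup described above (three-dimensional point to longitude and latitude, then to image coordinates) can be illustrated as follows. Formulas 1 and 4 are not reproduced in this text, so the sketch uses a common equirectangular convention as an assumption: y is the vertical axis, longitude is measured around y from the +z direction, and the image center corresponds to longitude 0 and latitude 0. The patent's own formulas may use different axes or signs. The function names are hypothetical.

```python
import math
from typing import Tuple


def to_lon_lat(x: float, y: float, z: float) -> Tuple[float, float]:
    """Longitude and latitude (radians) of the direction from the camera to a point.

    Assumed convention: y is up, longitude 0 looks along +z.
    """
    lon = math.atan2(x, z)
    lat = math.atan2(y, math.hypot(x, z))
    return lon, lat


def to_pixel(lon: float, lat: float, width: int, height: int) -> Tuple[float, float]:
    """Map longitude/latitude to equirectangular image coordinates (u, v).

    Assumed layout: u grows with longitude, v grows downward, and
    (lon, lat) = (0, 0) falls on the image center.
    """
    u = (lon / (2 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return u, v


if __name__ == "__main__":
    lon, lat = to_lon_lat(0.0, 0.0, 1.0)   # a point straight ahead of the camera
    print(to_pixel(lon, lat, 2048, 1024))
```

The color sampled at (u, v) would then serve as the texture value of the interpolated point.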

For most scenes, the contour of the ceiling and the contour of the floor may be assumed to be parallel and identical. Thus, using the corrected ceiling plane contour of each room obtained above, together with the estimated height h_f' of the camera from the floor obtained above, the three-dimensional points of the multi-room floor plane contour are generated by formula 2.

Here, the shape of the plane contour of the floor is assumed to be the same as that of the ceiling, i.e., the horizontal three-dimensional coordinates x and z are the same, and only the height differs, i.e., the y value in the vertical direction (e.g., the plane contour of the ceiling is above the camera and the floor is below the camera, so the heights are different). Therefore, it is only necessary to replace the y value (the estimated height h_c' of the camera from the ceiling) in the three-dimensional point coordinates of the ceiling contour obtained above with the estimated height h_f' of the camera from the floor.
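The replacement of the y value can be sketched as follows. The function name is hypothetical, and the sign of the floor height passed in (negative in the example, for a floor below the camera) is an assumption about the coordinate convention.

```python
from typing import List, Tuple

Point3 = Tuple[float, float, float]


def floor_from_ceiling(ceiling_contour: List[Point3], floor_y: float) -> List[Point3]:
    """Copy the ceiling contour, keeping x and z and replacing y with the floor height."""
    return [(x, floor_y, z) for x, _y, z in ceiling_contour]


if __name__ == "__main__":
    ceiling = [(0.0, 1.2, 0.0), (4.0, 1.2, 0.0), (4.0, 1.2, 3.0), (0.0, 1.2, 3.0)]
    # Assumed sign convention: the floor lies below the camera, so y is negative.
    print(floor_from_ceiling(ceiling, -1.5))
```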

Similarly to the three-dimensional point interpolation of the planar contour of the ceiling, for the planar contour of the floor, the three-dimensional point interpolation is internally performed and then projected into the corresponding panoramic image using equation 4 so as to obtain the texture of the floor.

Then, three-dimensional vertices at the same plane position between the ceiling profile and the floor profile are connected to form plane profiles of a plurality of wall surfaces, and similarly, three-dimensional point interpolation is performed on the interiors of the plane profiles, and then the three-dimensional point interpolation is projected into the corresponding panoramic image by using formula 4 so as to obtain the texture of the wall surface.
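Connecting ceiling and floor vertices into wall contours can be sketched as follows. The sketch assumes that the ceiling and floor contours list the same vertices in the same order and that each wall is the quadrilateral spanned by two consecutive vertices; the function name is hypothetical.

```python
from typing import List, Tuple

Point3 = Tuple[float, float, float]
Quad = Tuple[Point3, Point3, Point3, Point3]


def wall_quads(ceiling: List[Point3], floor: List[Point3]) -> List[Quad]:
    """Build one quadrilateral wall contour per pair of consecutive contour vertices."""
    if len(ceiling) != len(floor):
        raise ValueError("ceiling and floor contours must have the same vertex count")
    count = len(ceiling)
    quads = []
    for k in range(count):
        nxt = (k + 1) % count  # wrap around to close the contour
        quads.append((ceiling[k], ceiling[nxt], floor[nxt], floor[k]))
    return quads


if __name__ == "__main__":
    ceiling = [(0.0, 1.2, 0.0), (4.0, 1.2, 0.0), (4.0, 1.2, 3.0), (0.0, 1.2, 3.0)]
    floor = [(x, -1.5, z) for x, _y, z in ceiling]
    for quad in wall_quads(ceiling, floor):
        print(quad)
```

Each quadrilateral can then be interpolated and textured in the same way as the ceiling and floor contours.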

Thus, a 3D texture model of the complete house may be generated.

By the three-dimensional object modeling method, the resolution and the accuracy of the generated model can be effectively improved.

Moreover, it should be noted that, for the sake of understanding and description, the method for modeling based on images of the present invention is described by taking house modeling as an example, and actually, the present invention should not be limited to the application scenario of house modeling, but can be applied to various scenarios for modeling based on images.

As shown in fig. 3, the three-dimensional object modeling apparatus 100 according to an exemplary embodiment of the present invention may include a plane contour extraction device 110, a scale normalization device 120, and a multi-contour stitching device 130.

Wherein, the planar contour extraction means 110 may be configured to extract a planar contour in a three-dimensional space of each panoramic image for at least one panoramic image taken for at least one three-dimensional object to be modeled.

Here, as described above, each panoramic image is photographed for one three-dimensional object, each three-dimensional object corresponding to one or more panoramic images.

The scale normalization means 120 may be configured to normalize the scale of the camera position and the scale of the planar profile of each panoramic image in the three-dimensional space, resulting in a normalized planar profile of each panoramic image in the three-dimensional space.

The multi-object stitching device 130 may be configured to perform rotation and translation operations on three-dimensional point coordinates of the planar contours in the three-dimensional space of the respective panoramic images subjected to scale normalization based on the camera positions to unify the three-dimensional point coordinates into the same coordinate system, thereby stitching the planar contours in the three-dimensional space of the respective panoramic images into a multi-object planar contour.

Wherein, optionally, the camera position may be predetermined.

Alternatively, the camera position may be estimated by:

1) performing feature point matching between the panoramic images by using the geometric relationship of at least one panoramic image shot for at least one three-dimensional object to be processed, and recording mutually matched feature points in the panoramic images as matching feature points; and

2) the camera position when each panoramic image is shot and the three-dimensional point coordinates of the matching feature points on the panoramic image are obtained by reducing the reprojection error of the matching feature points on each panoramic image.

Here, in the case where the camera position is estimated, the height values in all three-dimensional point coordinates on the at least one panoramic image obtained when the camera position is estimated may be sorted from small to large, and a median or a mean of the top-ranked height values may be taken as a unified category-specific contour estimated height h_c'; and, using the category-specific contour assumed height h_c and the category-specific contour estimated height h_c', a normalized planar contour in three-dimensional space of each panoramic image is generated from the planar contour in three-dimensional space of each panoramic image.

Here, the assumed camera height h_c (i.e., the assumed height of the camera from the ceiling) is an arbitrarily assumed height.
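The height estimation and scale normalization described above can be sketched as follows. Several points are assumptions rather than statements from the text: "top-ranked" is read as the first values of the ascending sort, the fraction of values kept (`top_fraction`) is a hypothetical parameter, and normalization is modeled as a uniform scaling of the contour by h_c'/h_c, which would bring a contour extracted under the assumed height h_c to the scale of the estimated camera positions.

```python
from statistics import median
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]


def estimate_contour_height(heights: Sequence[float], top_fraction: float = 0.3) -> float:
    """Median of the first `top_fraction` of the height values after ascending sort."""
    if not heights:
        raise ValueError("at least one height value is required")
    ordered = sorted(heights)
    keep = max(1, int(len(ordered) * top_fraction))
    return median(ordered[:keep])


def normalize_contour(contour: List[Point3], h_assumed: float,
                      h_estimated: float) -> List[Point3]:
    """Scale a contour extracted under h_assumed to the scale implied by h_estimated.

    Assumption: normalization is a uniform scaling by h_estimated / h_assumed.
    """
    scale = h_estimated / h_assumed
    return [(x * scale, y * scale, z * scale) for x, y, z in contour]


if __name__ == "__main__":
    heights = [0.9, 1.0, 1.1, 2.0, 2.2, 2.5, 3.0, 3.1, 3.3, 4.0]
    h_est = estimate_contour_height(heights)
    print(h_est)
    print(normalize_contour([(4.0, 2.0, -6.0)], h_assumed=2.0, h_estimated=h_est))
```

Because h_c is arbitrary, only the ratio between the two heights matters for the result.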

In addition, in the case where the camera position is a predetermined camera determined height h_c', the assumed camera height h_c assumed in the plane contour extraction step S110 and the camera determined height h_c' may be used to generate, from the planar contour in three-dimensional space of each panoramic image, the normalized planar contour in three-dimensional space of each panoramic image. Likewise, the assumed camera height h_c here (i.e., the assumed height of the camera from the ceiling) is also an arbitrarily assumed height.

Optionally, the three-dimensional object modeling apparatus 100 further comprises a single object plane contour generating device 125 configured to:

for the at least one panoramic image, determining one by one whether a plurality of panoramic images belong to the same three-dimensional object by the following method: if more than a specific proportion of matching feature points exist between two panoramic images, the two panoramic images are determined to belong to the same three-dimensional object; and

and if the plurality of panoramic images are determined to belong to the same three-dimensional object, taking the union of the plane outlines of the plurality of panoramic images in the three-dimensional space as the plane outline of the three-dimensional object in the three-dimensional space.

In addition, optionally, the multi-object stitching device 130 is further configured to be able to stitch the multi-object plane contour in the three-dimensional space based on the plane contour in the three-dimensional space of each single three-dimensional object.

In addition, optionally, the multi-object stitching device 130 may implement stitching of the multi-plane contour through the following coordinate transformation:

assuming that the planar profiles in the three-dimensional space of all the panoramic images are N in total, the pth three-dimensional point of the nth planar profile is represented as

The camera position when the panoramic image corresponding to the n-th plane contour was captured is expressed as {R_n, t_n}, where R_n is a rotation matrix representing the rotation parameters of the camera position, and t_n is a translation vector representing the translation parameters of the camera position, where N is an integer greater than 1 and n is an integer greater than or equal to 1. In this case, the camera position at the time when the panoramic image corresponding to the i-th plane contour was captured may be selected as the reference coordinate system, and the three-dimensional points of the other plane contours may be unified into this coordinate system by the following formula:

All scale-normalized three-dimensional points of the plane contours other than the i-th plane contour are converted by this formula, so that the three-dimensional points of all plane contours are unified into the same coordinate system, and the plane contours of the panoramic images in three-dimensional space are thereby stitched into a multi-object plane contour.

In addition, optionally, the three-dimensional object modeling apparatus 100 may further include a multi-object contour optimization device 135 configured to calculate a distance between adjacent edges of two single-object plane contours in the multi-object plane contour, and if the distance is non-zero and less than a certain threshold, shift the two single-object plane contours so that the distance between the adjacent edges becomes zero.

In addition, optionally, the three-dimensional object modeling apparatus 100 may further include a 3D model generation device 140 configured to convert the multi-object plane contour in the three-dimensional space obtained by stitching into a multi-object 3D model.

Here, the respective devices 110, 120, 130, 135, 140, 145, 150, etc. of the three-dimensional object modeling apparatus 100 described above correspond to the steps S110, S120, S130, S135, S140, S145, S150, etc. described in detail above, respectively, and are not described again here.

Therefore, the three-dimensional object modeling device can effectively improve the resolution and the accuracy of the generated model.

Moreover, it should be noted that, for the sake of understanding and description, the technical solution of the present invention for modeling based on images is described by taking house modeling as an example, and actually, the present invention should not be limited to the application scenario of house modeling, but can be applied to various scenarios for modeling three-dimensional objects based on images.

Referring to fig. 4, the image processing apparatus 1 includes a memory 10 and a processor 20.

The processor 20 may be a multi-core processor or may include a plurality of processors. In some embodiments, processor 20 may comprise a general-purpose host processor and one or more special purpose coprocessors such as a Graphics Processor (GPU), Digital Signal Processor (DSP), or the like. In some embodiments, processor 20 may be implemented using custom circuits, such as an Application Specific Integrated Circuit (ASIC) or a Field Programmable Gate Array (FPGA).

The memory 10 has stored thereon executable code which, when executed by the processor 20, causes the processor 20 to perform one of the methods described above. The memory 10 may include various types of storage units, such as a system memory, a read-only memory (ROM), and a persistent storage device. The ROM may store static data or instructions required by the processor 20 or other modules of the computer. The persistent storage device may be a read-write storage device. It may be a non-volatile storage device that does not lose stored instructions and data even after the computer is powered off. In some embodiments, a mass storage device (e.g., a magnetic or optical disk, or flash memory) is employed as the persistent storage device. In other embodiments, the persistent storage device may be a removable storage device (e.g., a floppy disk or optical drive). The system memory may be a read-write memory device or a volatile read-write memory device, such as a dynamic random access memory. The system memory may store instructions and data that some or all of the processors require at runtime. Further, the memory 10 may comprise any combination of computer-readable storage media, including various types of semiconductor memory chips (DRAM, SRAM, SDRAM, flash memory, programmable read-only memory); magnetic and/or optical disks may also be employed. In some embodiments, memory 10 may include a removable storage device that is readable and/or writable, such as a compact disc (CD), a read-only digital versatile disc (e.g., DVD-ROM, dual-layer DVD-ROM), a read-only Blu-ray disc, an ultra-density optical disc, a flash memory card (e.g., SD card, mini SD card, Micro-SD card, etc.), a magnetic floppy disk, or the like. Computer-readable storage media do not contain carrier waves or transitory electronic signals transmitted by wireless or wired means.

Furthermore, the method according to the invention may also be implemented as a computer program or computer program product comprising computer program code instructions for carrying out the above-mentioned steps defined in the above-mentioned method of the invention.

Alternatively, the invention may also be embodied as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) having stored thereon executable code (or a computer program, or computer instruction code) which, when executed by a processor of an electronic device (or computing device, server, etc.), causes the processor to perform the steps of the above-described method according to the invention.

Those of skill in the art would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both.

The flowcharts, block diagrams, etc. in the figures illustrate the architecture, functionality, and operation of possible implementations of systems and methods according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims

1. A three-dimensional object modeling method, characterized in that the three-dimensional object modeling method comprises:

a planar contour extraction step in which, for at least one panoramic image taken for at least one three-dimensional object to be modeled, a planar contour in three-dimensional space of each panoramic image is extracted, wherein each panoramic image is taken for one three-dimensional object, each three-dimensional object corresponding to one or more panoramic images;

a scale normalization step, wherein the scale of the plane contour of each panoramic image in the three-dimensional space is normalized based on the camera position, and the normalized plane contour of each panoramic image in the three-dimensional space is obtained; and

and a multi-object splicing step, wherein three-dimensional point coordinates of the plane contour of each panoramic image subjected to scale normalization in a three-dimensional space are rotated and translated based on the position of the camera, so that the three-dimensional point coordinates are unified into the same coordinate system, and the plane contour of each object is spliced into a multi-object plane contour.

2. A method of modelling a three-dimensional object according to claim 1, wherein the camera position is predetermined or estimated by:

performing feature point matching between the panoramic images by using the geometric relationship of at least two panoramic images shot for at least one three-dimensional object to be processed, and recording mutually matched feature points in the panoramic images as matching feature points; and

and reducing the reprojection error of the matching characteristic points on the panoramic image for each panoramic image to obtain the position of a camera when each panoramic image is shot and the three-dimensional point coordinates of the matching characteristic points on each panoramic image.

3. The method of modeling a three-dimensional object as recited in claim 2, wherein the scale normalization step comprises:

in the case where the camera position is estimated,

sorting the height values in all three-dimensional point coordinates on the at least one panoramic image obtained in the process of estimating the position of the camera from small to large, and taking the median or mean of the top-ranked height values as a unified category-specific contour estimated height h_c'; and

using the category-specific contour assumed height h_c and the category-specific contour estimated height h_c', generating a normalized planar contour in three-dimensional space of each panoramic image from the planar contour in three-dimensional space of each panoramic image,

wherein the category-specific contour assumed height h_c is an arbitrarily assumed height;

wherein the category-specific contour is a planar contour of a top of an object in the panoramic image.

4. The method of modeling a three-dimensional object as recited in claim 2, wherein the scale normalization step comprises:

in the case where the camera position is a predetermined camera determined height h_c',

using the assumed camera height h_c assumed in the plane contour extraction step and the camera determined height h_c', generating a normalized planar contour in three-dimensional space of each panoramic image from the planar contour in three-dimensional space of each panoramic image,

5. The method of modeling a three-dimensional object as recited in claim 1, further comprising, after the scale normalizing step, a single object plane contour generating step comprising:

for the at least two panoramic images, determining whether a plurality of panoramic images belong to the same three-dimensional object one by the following method: if more than specific proportion of matching feature points exist between the two panoramic images, the two panoramic images are determined to belong to the same three-dimensional object; and

and if the plurality of panoramic images belong to the same three-dimensional object, taking a union set of all plane contours of the same three-dimensional object obtained from the plurality of panoramic images as the plane contour of the three-dimensional object in the three-dimensional space.

6. The three-dimensional object modeling method according to claim 5, characterized in that in the multi-object stitching step, a multi-object plane contour in a three-dimensional space can be further stitched based on a plane contour in a three-dimensional space of each individual three-dimensional object.

7. The three-dimensional object modeling method of claim 1, wherein the multi-object stitching step comprises:

The camera position when the panoramic image corresponding to the n-th plane contour was photographed is expressed as {R_n, t_n}, where R_n is a rotation matrix representing the rotation parameters of the camera position, and t_n is a translation vector representing the translation parameters of the camera position, where N is an integer greater than 1 and n is an integer greater than or equal to 1,

the camera position when the panoramic image corresponding to the ith plane contour is shot is selected as a reference coordinate system, and three-dimensional points of other plane contours can be unified under the reference coordinate system through the following formula:

converting all three-dimensional points of the dimension-normalized plane profiles except the ith plane profile through the formula to unify the three-dimensional points of all the plane profiles to the same coordinate system, thereby splicing the plane profiles of all the objects into a multi-object plane profile.

8. The three-dimensional object modeling method of claim 7, further comprising:

and a multi-object contour optimization step, wherein the distance between the adjacent edges of two single-object contours in the multi-object contour is calculated, and if the distance is nonzero and is smaller than a specific threshold value, the two single-object contours are shifted so that the distance between the adjacent edges becomes zero.

9. A three-dimensional object modeling method as defined in any of claims 1-8, further comprising:

and a 3D model generation step, wherein after the multi-object splicing step, the multi-object plane contour in the three-dimensional space obtained by splicing is converted into a multi-object 3D model.

10. A three-dimensional object modeling apparatus, characterized in that the three-dimensional object modeling apparatus comprises:

a planar contour extraction means configured to extract a planar contour in three-dimensional space of each panoramic image for at least one panoramic image taken for at least one three-dimensional object to be modeled, wherein each panoramic image is taken for one three-dimensional object, and each three-dimensional object corresponds to one or more panoramic images;

the scale normalization device is configured to normalize the scale of the plane contour of each panoramic image in the three-dimensional space based on the camera position to obtain the normalized plane contour of each panoramic image in the three-dimensional space; and

and the multi-object splicing device is configured to rotate and translate three-dimensional point coordinates of the plane contour of each panoramic image subjected to scale normalization in a three-dimensional space based on the position of the camera so as to unify the three-dimensional point coordinates into the same coordinate system, so that the plane contour of each object is spliced into the multi-object plane contour.

11. The three-dimensional object modeling apparatus of claim 10, wherein the camera position is predetermined or estimated by:

12. The three-dimensional object modeling apparatus of claim 11, wherein the scale normalization device is further configured to:

in the case where the camera position is estimated,

13. The three-dimensional object modeling apparatus of claim 12, wherein the scale normalization device is further configured to:

using the assumed camera height h_c assumed in the plane contour extraction step and the determined camera height h_c', generating the normalized plane contour of each panoramic image in three-dimensional space from the plane contour of each panoramic image in three-dimensional space,

14. The three-dimensional object modeling apparatus of claim 10, further comprising a single-object plane contour generation device configured to:

if a plurality of panoramic images belong to the same three-dimensional object, take the union of all the plane contours of that three-dimensional object obtained from the plurality of panoramic images as the plane contour of the three-dimensional object.
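Taking the union of several contours of the same object can be sketched with a coarse rasterization. This is only an approximation under stated assumptions: the contours are treated as 2D polygons in the floor plane, and the union is represented as the set of grid cells whose centers fall inside at least one polygon. The names `point_in_polygon` and `union_cells` and the grid step are hypothetical; an exact polygon union would normally be delegated to a geometry library.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def union_cells(polygons, step=1.0):
    """Grid cells (i, j) whose centers are covered by any of the polygons."""
    xs = [x for poly in polygons for x, _ in poly]
    ys = [y for poly in polygons for _, y in poly]
    x_min, y_min = min(xs), min(ys)
    nx = int((max(xs) - x_min) / step)
    ny = int((max(ys) - y_min) / step)
    cells = set()
    for i in range(nx):
        for j in range(ny):
            cx = x_min + (i + 0.5) * step
            cy = y_min + (j + 0.5) * step
            if any(point_in_polygon(cx, cy, p) for p in polygons):
                cells.add((i, j))
    return cells


# Two overlapping contours of the same room seen from two panoramas.
view_1 = [(0, 0), (2, 0), (2, 2), (0, 2)]
view_2 = [(1, 0), (3, 0), (3, 2), (1, 2)]
room_cells = union_cells([view_1, view_2], step=1.0)
```

Each view alone covers four unit cells; their union covers the six cells of the combined 3 by 2 footprint, which is the sense in which the merged contour can be larger than any single view.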

15. The three-dimensional object modeling apparatus of claim 14, wherein the multi-object stitching device is further configured to stitch the plane contours in three-dimensional space of the individual three-dimensional objects to obtain the multi-object plane contour in three-dimensional space.

16. The three-dimensional object modeling apparatus of claim 10, wherein the multi-object stitching device is further configured to:

17. The three-dimensional object modeling apparatus of claim 16, further comprising:

a multi-object contour optimization device configured to calculate the distance between adjacent edges of two single-object contours in the multi-object contour and, if the distance is nonzero and smaller than a specific threshold, to shift the two single-object contours so that the distance between the adjacent edges becomes zero.

18. The three-dimensional object modeling apparatus of any of claims 10-17, further comprising:

a 3D model generation device configured to convert the stitched multi-object plane contour in three-dimensional space into a multi-object 3D model.

19. An image processing apparatus comprising:

a processor; and

a memory having executable code stored thereon, which when executed by the processor, causes the processor to perform the method of any of claims 1-9.

20. A non-transitory machine-readable storage medium having stored thereon executable code, which when executed by a processor, causes the processor to perform the method of any of claims 1-9.