WO2017079918A1 - 室内场景扫描重建的方法及装置 - Google Patents

室内场景扫描重建的方法及装置

Info

Publication number
WO2017079918A1
WO2017079918A1 · PCT/CN2015/094295 · CN2015094295W
Authority
WO
WIPO (PCT)
Prior art keywords
thrust
objects
robot
region
segmentation
Prior art date
Application number
PCT/CN2015/094295
Other languages
English (en)
French (fr)
Inventor
黄惠
徐凯
龙品辛
李昊
陈宝权
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院 filed Critical 中国科学院深圳先进技术研究院
Priority to PCT/CN2015/094295 priority Critical patent/WO2017079918A1/zh
Publication of WO2017079918A1 publication Critical patent/WO2017079918A1/zh


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis

Definitions

  • the present invention relates to the field of three-dimensional scanning reconstruction technology, and in particular, to a method and apparatus for indoor scene scanning reconstruction.
  • 3D reconstruction has long been a focus of research in computer vision and computer graphics. From the initial reconstruction of a single point in three-dimensional space, to the reconstruction of specific objects, and on to the reconstruction of entire indoor scenes and even whole cities, research on 3D reconstruction has made great progress and has been applied to many aspects of daily life, such as 3D printing, digital museums, visual tracking, and terrain navigation.
  • For 3D scene modeling, digital reproduction technology makes it convenient to process and analyze information about the surrounding environment, so that a real scene can be examined in detail from every angle and, ultimately, both humans and robots can interpret their surroundings.
  • With the rapid development of technology, a variety of 3D measuring devices have appeared, providing more ways and means to reconstruct 3D scenes. The scenes handled have grown from simple small objects to complex, large indoor and outdoor environments, so that people increasingly enjoy the benefits of 3D reconstruction in daily life.
  • To extract the individual objects in a scene, the reconstructed scene must be segmented and analyzed. Traditional methods mainly use 3D model data to assist object extraction and recognition.
  • Some work uses the recurrence of objects in indoor scenes as a cue for scene understanding, and other work uses data about human activity in indoor scenes to assist semantic scene analysis.
  • However, all of these approaches take already-scanned scene data as input. Such offline analysis lacks first-hand information about the scene structure and must rely on prior knowledge (provided by a human or an external database) or on additional information recorded during scanning to complete the analysis.
  • There has been considerable prior work on using a robot to scan and reconstruct a single object, but from the perspective of global reconstruction, and especially object-level scene reconstruction, only a small amount of work has studied how to perform fully automatic scene scan reconstruction.
  • Among the techniques that extract objects from a scene through robot interaction, the closest and most recent solution is a robot-push-driven method for segmenting objects in a scene.
  • The core idea of that method is to take RGB images and 3D point clouds as input, compute a push point and a push direction, extract Shi-Tomasi features during the push, track the objects with optical-flow tracking, and finally cluster the motion trajectories of the features to segment the objects (a sketch of this kind of pipeline is given below).
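  • The following minimal sketch, written for this description rather than taken from the cited prior art, illustrates that pipeline: Shi-Tomasi features are extracted from the RGB frames, tracked with pyramidal Lucas-Kanade optical flow while the push is executed, and the resulting trajectories are clustered so that features that moved together are grouped into one object. All names and parameter values are illustrative.

```python
# Sketch of a push-driven segmentation loop: Shi-Tomasi features +
# pyramidal LK optical flow + trajectory clustering. Illustrative only;
# lost/invalid tracks are not filtered here for brevity.
import cv2
import numpy as np
from sklearn.cluster import DBSCAN

def track_and_cluster(frames):
    """frames: list of BGR images captured while the robot pushes."""
    gray0 = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
    pts = cv2.goodFeaturesToTrack(gray0, maxCorners=500,
                                  qualityLevel=0.01, minDistance=7)
    tracks = [pts.reshape(-1, 2)]
    prev = gray0
    for frame in frames[1:]:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev, gray, pts, None)
        pts, prev = nxt, gray
        tracks.append(nxt.reshape(-1, 2))
    # Describe each feature by its total displacement and cluster:
    # features that moved together are assumed to belong to one object.
    disp = tracks[-1] - tracks[0]
    labels = DBSCAN(eps=3.0, min_samples=10).fit_predict(disp)
    return labels  # one motion-cluster id per tracked feature
```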
  • Limited scene scope: the prior art can only analyze several objects on a tabletop and cannot process a large, cluttered entire indoor scene.
  • Low segmentation accuracy: with the same number of interactions (about 10) and the same scene (everyday items on a tabletop), the segmentation accuracy of existing methods is only 70% to 80%.
  • An embodiment of the present invention provides a method for scanning and reconstructing an indoor scene, which is used to process a large and disorderly indoor scene, effectively reconstruct an object in the scene, and improve efficiency and accuracy of indoor scene reconstruction.
  • the method includes:
  • obtaining scanned image information of the indoor space captured by a robot located in the indoor space, and reconstructing a three-dimensional scene model map of the indoor space according to the scanned image information;
  • dividing the three-dimensional scene model map into a plurality of regions of interest and, for each region of interest, performing the following operations: dividing the region of interest into a plurality of small regions; controlling the robot to apply a thrust to the object corresponding to each small region and obtaining images of the plurality of small regions after the thrust is applied; comparing the small-region images after the thrust with those before the thrust, segmenting the objects in the image according to the comparison result, and controlling the robot to scan the incompletely scanned parts of the objects separated by the thrust so as to obtain the objects' complete three-dimensional data;
  • reconstructing the indoor scene according to the segmented objects and the objects' complete three-dimensional data.
  • the present invention also provides an apparatus for scanning and reconstructing an indoor scene, which is used for processing a large and disorderly whole indoor scene, effectively reconstructing objects in the scene, and improving efficiency and precision of indoor scene reconstruction.
  • the device includes:
  • a preliminary scan reconstruction module configured to obtain scanned image information of the indoor space captured by a robot located in an indoor space, and reconstruct a three-dimensional scene model map of the indoor space according to the scanned image information
  • a segmentation scan processing module configured to divide the three-dimensional scene model map into a plurality of regions of interest and, for each region of interest, perform the following operations: dividing the region of interest into a plurality of small regions; controlling the robot to apply a thrust to the object corresponding to each small region and obtaining images of the plurality of small regions after the thrust is applied; comparing the small-region images after the thrust with those before the thrust, segmenting the objects in the image according to the comparison result, and controlling the robot to scan the incompletely scanned parts of the objects separated by the thrust to obtain the objects' complete three-dimensional data;
  • a reconstruction module configured to reconstruct the indoor scene according to the segmented objects and the objects' complete three-dimensional data.
  • Compared with the prior art, the indoor scene scanning reconstruction scheme provided by the embodiment of the present invention first uses a robot to perform a rough global scan and reconstruction of the entire indoor scene: the scanned image information of the indoor space captured by the robot located in the indoor space is obtained, and a three-dimensional scene model map of the indoor space is reconstructed from it; the model map is then divided into regions of interest, each of which is verified through robot pushes and finely scanned; finally, the indoor scene is reconstructed from the segmented objects and their complete three-dimensional data.
  • (1) The prior art can only analyze several objects on a tabletop, whereas the present invention, through the above technical solution, can process a large, cluttered entire indoor scene.
  • (2) With the same number of interactions (about 10) and the same scene (everyday items on a tabletop), the segmentation accuracy of existing methods is only 70% to 80%, whereas the present invention can reach about 90% through the above technical solution, improving segmentation accuracy.
  • (3) The prior art needs 10-12 pushes even for a small scene with only 5-6 objects to obtain a satisfactory segmentation result.
  • In contrast, the present invention can obtain about 90% segmentation accuracy for a large scene containing 20-30 objects with only about 12 interactions, so that objects can be analyzed while the robot scans the scene and the segmentation is verified through physical nudge interactions, which greatly improves the object-analysis capability of traditional 3D reconstruction and the efficiency of the interaction.
  • (4) Existing work does not consider how to reconstruct the segmented objects and therefore cannot effectively reconstruct the objects in the scene.
  • The present invention instead proposes an object-level scene reconstruction and object-level scene analysis framework, so that both reconstruction and analysis are directed at the objects in the scene; the proposed framework can therefore better serve subsequent tasks such as recognition and semantic understanding. At the same time, since a nudge often makes the subsequent scan more complete, the confidence of object reconstruction is improved.
  • It can be seen from the above that the indoor scene scanning reconstruction method provided by the embodiment of the present invention can process a large, cluttered entire indoor scene, effectively reconstruct the objects in the scene, improve segmentation and interaction efficiency, and improve the efficiency and precision of indoor scene reconstruction.
  • With this method, a complete reconstructed 3D indoor scene model with segmentation information, with which the user can interact, is obtained.
  • FIG. 1 is a schematic flow chart of a method for scanning and reconstructing an indoor scene in an embodiment of the present invention
  • FIG. 2 is a schematic diagram of distinguishing some error divisions in the case of over-segmentation and under-segmentation in the embodiment of the present invention
  • FIG. 3 is a schematic diagram of three types of split verification in the embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of an apparatus for scanning and reconstructing an indoor scene in an embodiment of the present invention.
  • The main problem to be solved by the present invention is how to make a robot automatically scan and reconstruct an indoor scene and achieve object-level segmentation through active, interactive verification.
  • The sub-problems to be solved are: real-time 3D reconstruction of the indoor scene, computation of a scene pre-segmentation, determination of the objects that require active interaction, verification of the object segmentation through robot interaction, and fine scanning of the segmented objects.
  • In addition, the design and construction of the entire scan-segment-interact-rescan system is also a challenge addressed by the present invention.
  • Existing automatic 3D scene reconstruction techniques cannot effectively extract and reconstruct the objects in the scene; it is necessary to propose an object-level scene reconstruction and analysis framework in which the object geometry obtained online in real time is used as feedback to guide the scanning and reconstruction process.
  • In view of this, the present invention innovatively proposes a framework that combines object-level scene reconstruction with object-level scene analysis, coupling automatic scan reconstruction and interactive segmentation in a single algorithmic pipeline, and finally obtains a relatively completely reconstructed 3D indoor scene with segmentation information.
  • The method is robust and effective and can be applied to large, relatively complex indoor scenes.
  • the present invention adopts the following scheme:
  • First, a robot performs a rough automatic scan and reconstruction of the indoor scene; the resulting reconstruction is then pre-processed and over-segmented, after which a number of object hypotheses are generated and represented by a patch graph; finally, an entropy-based robot nudge interaction is used to verify the accuracy of the hypotheses while the objects are finely scanned to improve data completeness.
  • a method for automatically performing scan reconstruction and analysis on an object in a scene includes:
  • (1) A scanning method that acquires and reconstructs the three-dimensional scene in real time. In a given indoor scene, the robot first navigates automatically and scans and reconstructs the entire room. To enable effective processing, we divide the reconstructed scene into a number of Regions of Interest (ROIs) and then process them one by one. For each region of interest, the system scans and interacts with mutually occluding objects and uses the scanned object geometry as feedback to guide the subsequent verification process.
  • (2) An active verification method based on robot interaction. To achieve object-level analysis, the current region of interest is first over-segmented into small regions (patches), a patch graph is built from these patches, a graph-cut algorithm generates possible object hypotheses, and an object graph is then built. From these two graphs we estimate the uncertainty of the reconstructed objects to guide the active verification, which has two parts: nudging with the robot arm, and fine scanning of objects that have not been completely scanned. The robot performs horizontal nudges to verify whether the local segmentation is correct, and pushing apart objects that are close together also reduces mutual occlusion and helps data acquisition. The fine scan mainly targets the objects pushed during the interaction, because the originally occluded parts of those objects are revealed after the push and scanning them at this point acquires more data, improving the completeness of their reconstructed models. Finally, the verification results are merged into the segmentation and reconstruction, thereby reducing the corresponding uncertainty of the objects. This iterative process is repeated until the uncertainty stops decreasing.
  • FIG. 1 is a schematic flowchart of a method for scanning and reconstructing an indoor scene in an embodiment of the present invention; as shown in FIG. 1, the method may include the following steps:
  • Step 101: obtain scanned image information of the indoor space captured by a robot located in the indoor space, and reconstruct a three-dimensional scene model map of the indoor space according to the scanned image information;
  • Step 102: divide the three-dimensional scene model map into multiple regions of interest and, for each region of interest, perform the following operations: divide the region of interest into a plurality of small regions; control the robot to apply a thrust to the object corresponding to each small region and obtain images of the plurality of small regions after the thrust is applied; compare the small-region images after the thrust with those before the thrust, segment the objects in the image according to the comparison result, and control the robot to scan the incompletely scanned parts of the objects separated by the thrust to obtain the objects' complete three-dimensional data;
  • Step 103: reconstruct the indoor scene according to the segmented objects and the objects' complete three-dimensional data.
  • Compared with the prior art, the indoor scene scanning reconstruction scheme provided by the embodiment of the present invention first uses a robot to perform a rough global scan and reconstruction of the entire indoor scene: the scanned image information of the indoor space captured by the robot located in the indoor space is obtained, and a three-dimensional scene model map of the indoor space is reconstructed from it; the model map is then divided into regions of interest, each of which is verified through robot pushes and finely scanned; finally, the indoor scene is reconstructed from the segmented objects and their complete three-dimensional data.
  • the embodiment of the invention has at least the following beneficial technical effects:
  • (1) The prior art can only analyze several objects on a tabletop, whereas the present invention, through the above technical solution, can process a large, cluttered entire indoor scene.
  • (2) With the same number of interactions (about 10) and the same scene (everyday items on a tabletop), the segmentation accuracy of existing methods is only 70% to 80%, whereas the present invention can reach about 90% through the above technical solution, improving segmentation accuracy.
  • (3) The prior art needs 10-12 pushes even for a small scene with only 5-6 objects to obtain a satisfactory segmentation result.
  • In contrast, the present invention can obtain about 90% segmentation accuracy for a large scene containing 20-30 objects with only about 12 interactions, so that objects can be analyzed while the robot scans the scene and the segmentation is verified through physical nudge interactions, which greatly improves the object-analysis capability of traditional 3D reconstruction and the efficiency of the interaction.
  • (4) Existing work does not consider how to reconstruct the segmented objects and therefore cannot effectively reconstruct the objects in the scene.
  • The present invention instead proposes an object-level scene reconstruction and object-level scene analysis framework, so that both reconstruction and analysis are directed at the objects in the scene; the proposed framework can therefore better serve subsequent tasks such as recognition and semantic understanding. At the same time, since a nudge often makes the subsequent scan more complete, the confidence of object reconstruction is improved.
  • It can be seen from the above that the indoor scene scanning reconstruction method provided by the embodiment of the present invention can process a large, cluttered entire indoor scene, effectively reconstruct the objects in the scene, improve segmentation and interaction efficiency, and improve the efficiency and precision of indoor scene reconstruction.
  • With this method, a complete reconstructed 3D indoor scene model with segmentation information, with which the user can interact, is obtained.
  • the method proposed by the embodiment of the present invention is an online active analysis method.
  • scene analysis and scanning and reconstruction are closely combined in a framework to form a feedback system that is autonomous (that is, does not require a model database or any manual input).
  • The biggest difference between the method proposed by the embodiment of the present invention and the prior art is that the robot interaction in our scheme is driven by considering both the segmentation confidence and the reconstruction quality; that is, we use the uncertainties of both scene segmentation and object reconstruction to guide the robot interaction.
  • The method proposed in this patent is the first to introduce robot interaction into the complete reconstruction of an indoor scene and to finely reconstruct the objects while performing object segmentation and extraction.
  • In the embodiment of the present invention, the scene is represented in four ways: a global space, a 3D point cloud, a patch graph, and an object graph. The global space contains the entire scene, while the latter three are only for the current region of interest.
  • The geometric information and label information of the region of interest are integrated into the global space.
  • The object analysis in this method is performed on the 3D point cloud; unlike methods based on 2D RGB images, the depth information provided by the 3D point cloud is very useful for robot interaction.
  • The analysis and verification process is centered on the patch graph and the object graph.
  • In particular, the object graph is generated by segmenting and reconstructing over the patch graph, and then both the patch graph and the object graph are used to estimate the best interaction action.
  • In short, our method uses these two graphs to improve object-level reconstruction.
  • In the above step 101, we use the robot's automatic navigation capability to perform a rough scan and reconstruction of the entire indoor scene.
  • We sample the robot's path to obtain a number of sampling points; the robot stops at each sampling point to perform a scan, and the data from the individual scans are finally stitched together to obtain complete data of the indoor scene. However, this reconstruction loses a great deal of detail, so we divide the reconstructed model into several regions of interest and use the robot to scan each of them in finer detail. Each region of interest is a relatively independent sub-scene containing a large number of objects that need to be scanned.
  • the output of the initialization phase is a series of regions of interest connected by the robot scan path.
  • Embodiments of the present invention liberate humans from boring indoor scene scanning work by using robot autonomous scanning reconstruction and pushing interaction to perform object level segmentation and object level reconstruction.
  • The above step 102 includes the analysis of the objects and the entropy-based active interactive verification, which is the core innovation of this solution.
  • the specific implementation details are described step by step below.
  • In the above step 102, a specific embodiment of segmenting the objects in the image is described as follows.
  • For object analysis, we first over-segment the scene point cloud and build an adjacency graph G_p = (V_p, ε_p), in which each node represents a patch and the edges represent the adjacency between patches. Based on this patch graph, we compute a set of candidate object hypotheses with a binary graph-cut algorithm, and then use a multi-class voting mechanism to select the most probable candidates. An object hypothesis here means the hypothesis that a set of small regions belongs to one particular object.
  • To generate an object hypothesis, the objects in the image are segmented by minimizing the following energy: E(X; P_s) = Σ_{u∈V_p} E_d(x_u; P_s) + Σ_{(u,v)∈ε_p} E_s(x_u, x_v).
  • We first select one patch as the foreground, leave the background undefined for the moment, and remove non-foreground parts through a background penalty term. Specifically, we select a patch P_s, label it foreground (x_s = 1), and minimize the energy over the binary patch labels X = [x_1, ..., x_n], x_i ∈ {0, 1}, where n is the number of patches.
  • x_u and x_v denote the labels of the u-th and v-th patches; V_p and ε_p are the nodes representing the patches produced by over-segmentation and the edges representing the adjacency relations between patches.
  • The data term E_d(x_u; P_s) is infinite when x_u = 0 and u = s, equals f_u when x_u = 1 and u ≠ s, and equals 0 otherwise. Here f_u is a background penalty: when the distance between a patch and the patch already marked as foreground is larger than a preset threshold, a penalty is added to indicate that this patch is not part of the desired foreground; when the distance d(P_s, P_u) is greater than a threshold λ, f_u = k(d(P_s, P_u) - λ), otherwise f_u = 0, and k = 2.0 is used as the step to penalize patches that are farther than the threshold yet labeled foreground.
  • The parameter λ controls the range; its value is not fixed but varies from 0 to the diagonal length l_d of the bounding square.
  • The smoothing term, also called the segmentation cost, is defined as the likelihood that two adjacent patches belong to the same object and is learned from the results of the robot's active verification; the learning of the segmentation cost is introduced below. The remaining question is how to choose the initial foreground patch: we run the graph cut multiple times, using each patch in turn as the initial foreground region, which produces many redundant foreground candidates, and we then cluster these foreground candidates with the mean-shift algorithm to obtain the most probable foreground objects (a minimal implementation sketch follows).
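  • A minimal sketch of how one object hypothesis could be extracted with this binary energy, using an s-t minimum cut via networkx rather than a dedicated graph-cut library; `centroids`, `adjacency`, and `split_cost` are illustrative stand-ins for the over-segmentation output and the learned smoothing term, and this is an interpretation of the energy above rather than the patent's reference implementation.

```python
# Sketch: one foreground hypothesis from seed patch P_s via s-t minimum cut.
# Terminal capacities encode the data term E_d, inner edges the smoothing
# term E_s (split cost). Illustrative only.
import networkx as nx
import numpy as np

def object_hypothesis(centroids, adjacency, split_cost, seed, lam, k=2.0):
    G = nx.DiGraph()
    for u, c_u in enumerate(centroids):
        d_su = np.linalg.norm(c_u - centroids[seed])
        f_u = k * max(d_su - lam, 0.0)                # background penalty f_u
        if u == seed:
            G.add_edge("S", u)                        # no capacity attribute =>
        else:                                         # infinite: seed stays foreground
            G.add_edge("S", u, capacity=0.0)          # cost of labeling u background
        G.add_edge(u, "T", capacity=0.0 if u == seed else f_u)  # cost of foreground
    for u, v in adjacency:                            # smoothing term on cut edges
        w = split_cost(u, v)
        G.add_edge(u, v, capacity=w)
        G.add_edge(v, u, capacity=w)
    _, (src_side, _) = nx.minimum_cut(G, "S", "T")
    return sorted(p for p in src_side if p != "S")    # patches labeled foreground
```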
  • In one embodiment, the objects in the image can then be segmented by minimizing the following multi-label energy over the candidate hypotheses: E(L) = Σ_{u∈V_p} E_d(l_u; P_u) + Σ_{(u,v)∈ε_p} E_s(l_u, l_v).
  • When segmenting the patch graph, not every hypothesis in the candidate set is necessary, and some candidates overlap one another, which easily causes ambiguity when assigning labels.
  • Existing methods either filter out part of the candidate set based on rules, or rank candidates with a model learned from training data and select the most likely ones. To avoid relying on a supervised method to select the best candidates, we let the candidate hypotheses compete with one another to produce the region segmentation, characterized by the Markov random field energy above.
  • In this energy, L = [l_1, ..., l_n] with l_u ∈ {1, ..., k} are the labels of all patches; V_p and ε_p are the nodes representing the patches and the edges representing the adjacency between patches produced by over-segmentation.
  • The data term E_d(l_u; P_u) is defined as the similarity of patch P_u to a particular object: for patch P_u and the i-th object hypothesis H_i, it is defined through the proportion of P_u covered by all candidate regions belonging to H_i, whose cluster is denoted C_i: E_d(l_u = i; P_u) = -ln(t(P_u, C_i) / Σ_j t(P_u, C_j)),
  • where t(P_u, C_i) denotes the number of times patch P_u appears in cluster C_i.
  • The smoothing term is the same as that used above in the binary graph segmentation.
  • In essence, the data term selects the label of each patch based on a consistency vote over all foreground clusters.
  • In this voting mechanism, the larger the vote of a foreground cluster, the higher the probability that the corresponding object hypothesis represents an independent object, because an object that has appeared repeatedly across many per-region segmentations becomes more plausible (a small sketch of this data term follows below).
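  • The voting data term can be sketched directly from the definition above; `candidates` below is an illustrative list of foreground clusters (one per object hypothesis H_i), each holding the per-seed foreground sets that voted for that hypothesis.

```python
# Sketch of E_d(l_u = i; P_u) = -ln(t(P_u, C_i) / sum_j t(P_u, C_j)).
# candidates[i] is the cluster C_i: all per-seed foreground sets for hypothesis H_i.
import math

def data_term(u, candidates, eps=1e-6):
    votes = [sum(1 for fg in cluster if u in fg) for cluster in candidates]
    total = sum(votes) + eps * len(votes)
    return [-math.log((t + eps) / total) for t in votes]  # one cost per label i
```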
  • Figure 2 illustrates how our method can distinguish between some erroneous partitions in the case of over-segmentation and under-segmentation.
  • Next we detail the online learning of the segmentation cost mentioned above. 3D scene segmentation is affected by many factors, such as geometry, color, texture, and even high-level structural information, and it is difficult to fold all of these factors into a single hand-crafted segmentation cost; we therefore learn the segmentation cost from the robot's active verification of whether two adjacent patches belong to different objects: E_s(l_u, l_v) = 1 - p(l_u ≠ l_v | x(P_u, P_v)),
  • where x(·) ∈ R^n is the feature vector extracted from a pair of adjacent patches. In particular, we train a prediction function f on this feature vector with a support vector machine (SVM).
  • The prediction function f returns a positive value when the two patch labels differ and a negative value otherwise; g(t) = 1/(1 + e^{-t}) is the logistic function used to convert the prediction into a probability, and the probability of cutting an edge e_uv is denoted p_c(e_uv). To train the SVM we collect examples from the robot's verification: each verified pair of adjacent patches is labeled +1 if the edge between them should be cut and -1 otherwise. An important benefit of learning the segmentation cost is that the learned result can be used to improve the segmentation of the whole patch graph, not just of a local subgraph.
  • A major issue with this learning scheme is that the features extracted from the different factors consist of heterogeneous components, so multiple kernel functions are needed in the support vector machine; we therefore use Multiple Kernel Learning (MKL) to learn over each type of feature.
  • Another important fact is that, during the robot's proactive analysis and verification, the training samples arrive sequentially, so online learning is more efficient and the accuracy of the prediction function can be improved incrementally as samples accumulate; to realize online multiple kernel learning we use a passive-aggressive algorithm so that the MKL is carried out efficiently (a simplified sketch follows).
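  • As a simplification of the online multiple-kernel learning described above, the sketch below uses a single linear passive-aggressive classifier (scikit-learn) updated incrementally with the +1/-1 edge labels collected from robot verification, and maps its score to a cut probability with the logistic function g; the full scheme in the text would combine several kernels, one per feature type, so this is an illustrative reduction, not the patent's exact learner.

```python
# Sketch: online learning of the split cost from verified edges.
# A single passive-aggressive classifier stands in for the multi-kernel
# learner of the text; x_uv is the pairwise feature vector x(P_u, P_v).
import numpy as np
from sklearn.linear_model import PassiveAggressiveClassifier

class SplitCostLearner:
    def __init__(self, n_features):
        self.clf = PassiveAggressiveClassifier()
        # prime the model so partial_fit knows both classes (+1: cut, -1: keep)
        self.clf.partial_fit(np.zeros((2, n_features)), [1, -1], classes=[-1, 1])

    def update(self, x_uv, should_cut):
        self.clf.partial_fit(x_uv.reshape(1, -1), [1 if should_cut else -1])

    def cut_probability(self, x_uv):            # p_c(e_uv) = g(f(x))
        f = self.clf.decision_function(x_uv.reshape(1, -1))[0]
        return 1.0 / (1.0 + np.exp(-f))

    def split_cost(self, x_uv):                 # E_s = 1 - p(l_u != l_v | x)
        return 1.0 - self.cut_probability(x_uv)
```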
  • In the initialization phase, before any training data are available, we define the cut probability from local geometric concavity: p_c(e_uv) = η(1 - cos θ_uv),
  • where θ_uv is the angle between the average normal vectors of patches P_u and P_v, and η is 0.01 when the dihedral angle between the two adjacent patches is convex and 1 otherwise.
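  • Before any verified samples exist, this geometric prior can be computed directly from the patch normals; a minimal sketch with illustrative names:

```python
# Sketch of the initial, geometry-only cut probability p_c(e_uv) = eta * (1 - cos(theta_uv)).
import numpy as np

def initial_cut_probability(n_u, n_v, convex_dihedral):
    cos_theta = float(np.dot(n_u, n_v) /
                      (np.linalg.norm(n_u) * np.linalg.norm(n_v)))
    eta = 0.01 if convex_dihedral else 1.0   # convex joints are rarely object borders
    return eta * (1.0 - cos_theta)
```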
  • the robot will perform a nudge in the current region of interest for verification and fine scanning. After the robot physically interacts with the scene, the corresponding object moves, and the geometric information of the object is increased. The added information can be used to verify or correct the segmentation result.
  • the active nudge method proposed by the present invention is particularly suitable for objects that are close to each other in space, and solves the problem of mutual occlusion between objects that are close to each other, and is advantageous for fine scanning of objects.
  • Robot interaction (a nudge) can lead to three outcomes: 1) an object can be pushed independently, indicating that it is a single independent object; 2) an object breaks apart into several objects, indicating that our segmentation was not complete; 3) an object moves together with surrounding objects, indicating that we over-segmented the object. The latter two cases require correction, while the former is directly confirmed as one object.
  • We use maximal information gain to guide the selection of the next best nudge: based on the segmentation result, the patch graph is assembled into an object graph, and the next-best-nudge selection is based on these two graphs.
  • For object-level segmentation and object-level reconstruction, the next best nudge should have a positive effect on both.
  • On the one hand, a nudge should maximally reduce the uncertainty of the scene segmentation; on the other hand, to increase data completeness, the robot should try to explore unobserved areas. The uncertainty of the reconstruction result is measured by a Poisson-based reconstruction quality.
  • In the above step 103, the indoor scene is reconstructed from the segmented objects and their complete three-dimensional data by minimizing the joint objective min H(S, R) = H(S) + H(S|R), where
  • S and R are the random variables describing, respectively, the segmentation and reconstruction results of the current region of interest, H(S) denotes the segmentation entropy, and H(S|R) denotes the reconstruction entropy conditioned on the segmentation;
  • e is an edge of the patch graph G_p and p_c(e) denotes the probability that this edge is cut;
  • Ω denotes the set of iso-points obtained by uniformly sampling, at the zero crossings, the non-planar iso-surface (which is obtained by computing a Poisson field from the reconstruction result);
  • C_i denotes the cluster of regions belonging to object H_i (used to measure the proportion of a patch P_u covered by regions of H_i), and ε_0(S) is the set of edges of the graph with respect to S.
  • We use the Shannon entropy H = H(S, R) to jointly measure the uncertainty of segmentation and reconstruction,
  • where S and R are the random variables describing the segmentation and reconstruction results of the current region of interest; the joint entropy measures the inaccuracy, or the amount of information, represented by the random variables.
  • According to the posterior probability formula, H(S, R) = H(S) + H(S|R), so the objective above splits into two parts: H(S), the segmentation entropy, and H(S|R), the reconstruction entropy conditioned on the segmentation. These are detailed below.
  • For the segmentation, S is discretized into a series of possible segmentations of the patch graph, denoted S(G_p), so that each segmentation S_i ∈ S(G_p) is represented by the cut state of all edges. Assuming that the edge cuts are independent, the probability of a segmentation of the patch graph can be estimated as p(S_i | G_p) = Π_{e∈ε_c(S_i)} p_c(e) · Π_{e∈ε_0(S_i)\ε_c(S_i)} (1 - p_c(e)), where ε_c(S_i) is the set of edges cut by S_i.
  • To estimate the reconstruction uncertainty, we compute a Poisson field for the objects in each region of interest from the point cloud, sample the non-planar zero-crossing iso-surface into the point set Ω, and measure the data fidelity at an iso-point s as c(s) = Γ(s)·n_s, where Γ(s) is the gradient of the Poisson field and n_s is the normal at s; the reconstruction entropy is then estimated over Ω, where g is the logistic function, consistent with the above (a sketch of these entropy estimates is given below).
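  • One way to read the independence assumptions above is that both entropies reduce to sums of binary entropies, one per graph edge for the segmentation and one per iso-point for the reconstruction; the sketch below shows this simplified estimate as an interpretation of the text, with illustrative names, not necessarily the patent's exact estimator.

```python
# Sketch: entropy estimates from independent Bernoulli variables.
# Segmentation: one "cut / keep" variable per patch-graph edge with p = p_c(e).
# Reconstruction: one "well reconstructed" variable per iso-point with p = g(c(s)).
import numpy as np

def binary_entropy(p):
    p = np.clip(p, 1e-9, 1.0 - 1e-9)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def segmentation_entropy(edge_cut_probs):        # H_S over a set of edges
    return float(np.sum(binary_entropy(np.asarray(edge_cut_probs))))

def reconstruction_entropy(iso_fidelities):      # H_R over iso-points, c(s) = grad . n
    g = 1.0 / (1.0 + np.exp(-np.asarray(iso_fidelities)))   # logistic g
    return float(np.sum(binary_entropy(g)))
```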
  • Self-occlusion and mutual occlusion are the two most important factors affecting scanning. Self-occlusion is mainly caused by an object's own geometry and can be avoided by moving the scanner to scan from multiple viewpoints; mutual occlusion, on the other hand, can be resolved by separating the two parts involved. Mutual occlusion is therefore directly related to the reconstruction uncertainty, and that uncertainty can be reduced by repeatedly computing the next best view.
  • Given a scene whose objects are unknown, only the object-level segmentation information is available for finding potential mutual occlusions, so the reconstruction entropy is computed conditioned on the segmentation, over the mutual-occlusion regions that the segmentation implies.
  • This conditional entropy measures the reconstruction uncertainty caused by the mutual occlusion of adjacent objects after segmentation. Note that we do not consider occlusion between an object and its supporting plane, because our nudges are horizontal.
  • A nudge should maximally reduce the joint uncertainty of segmentation and reconstruction, that is, increase the information gained on both sides. We first sample the reconstructed surface of the region of interest to obtain nudge candidates, and the information gained by one nudge is described by the change in entropy before and after the push: I(S, R | <p_u, d_u>) = H(S, R) - H'(S, R | <p_u, d_u>),
  • where H'(S, R | <p_u, d_u>) is the posterior entropy; we then select the nudge candidate u that maximizes I(S, R | <p_u, d_u>).
  • Estimating the posterior entropy requires knowing the changes caused by pushing the object, which are unknown before the push; based on the current object-extraction result and two simplifying assumptions (the segmentation state of the edges around the pushed object O_u is decided by the nudge, and the conflict regions at the boundary between O_u and its neighbours are exposed and sufficiently scanned), the information gain of a nudge can be written as I(S, R | <p_u, d_u>) = H_S(ε_p(O_u)) + H_{R|S}(ε_0(O_u, d_u)),
  • where H_S(ε_p(O_u)) and H_{R|S}(ε_0(O_u, d_u)) have been defined above and are not repeated here.
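  • Given those two entropy terms, selecting the next best nudge reduces to an argmax over the sampled candidates; a minimal sketch under the stated assumptions, where the candidate records and the helper entropy functions (for example those of the previous sketch) are illustrative:

```python
# Sketch: choose the nudge <p_u, d_u> with the largest expected information gain
# I(S,R | <p_u, d_u>) = H_S(edges of O_u) + H_{R|S}(iso-points exposed by the push).
def next_best_nudge(candidates, segmentation_entropy, reconstruction_entropy):
    """candidates: iterable of dicts with keys
       'point', 'direction', 'object_edges' (cut probabilities of edges around O_u)
       and 'exposed_iso' (Poisson fidelities c(s) in the region revealed by the push)."""
    best, best_gain = None, float("-inf")
    for c in candidates:
        gain = (segmentation_entropy(c["object_edges"])
                + reconstruction_entropy(c["exposed_iso"]))
        if gain > best_gain:
            best, best_gain = c, gain
    return best, best_gain
```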
  • In the above step 102, the operation of applying the thrust excludes:
  • operations in which the thrust point is higher than a preset height on the object;
  • operations in which the thrust direction would bring two objects too close together;
  • operations in which the thrust direction is not parallel to the supporting plane (i.e., not horizontal);
  • operations in which the robot's arm cannot reach the position; and
  • operations in which the angle between the thrust direction and the vertical line through the center of the pushed object is greater than a preset angle threshold.
  • Specifically, the following five rules filter out operations that would be destructive (rules 1 to 5) or that cannot be executed by the robot (rule 4); at the same time, we prefer translational motion when nudging an object, which reduces the difficulty of motion detection. The five basic rules are as follows (a sketch of this filter is given after the list):
  • Rule 1: we filter out operations whose nudge point is higher than 2/3 of the object's height, to avoid knocking the object over.
  • Rule 2: we filter out operations whose nudge direction would bring two objects too close together (less than 5 cm apart).
  • Rule 3: we filter out operations whose nudge direction is not parallel to the supporting plane, ensuring that the nudge direction is horizontal.
  • Rule 4: we filter out operations that the robot's arm cannot reach. We set up a cube-shaped region around each nudge point; if another object enters this region, the operation is considered infeasible.
  • Rule 5: to avoid rotating the pushed object, we filter out an operation if the angle between the nudge direction and the vertical line through the object's center is greater than a threshold.
  • Once the next best nudge has been selected, the robot simply approaches the nudge point slowly and pushes gently along the nudge direction. The push distance depends on two factors: if an object is detected ahead along the nudge direction, we avoid pushing so far that the objects collide; and if there is an object nearby, we tend to push so that the two objects are separated as much as possible. These decisions rely on the cube-region detection around the nudge point described in rule 4. If the above five rules are satisfied, we execute the nudge, generally 5 cm, which is sufficient for motion detection.
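  • The five rules can be expressed as a simple predicate over each nudge candidate; the sketch below assumes illustrative candidate fields and helper queries (object height, neighbour clearance after the push, reachability) that a real system would obtain from the reconstructed point cloud and the robot's kinematics, and it interprets rule 5 as requiring the push direction to point roughly through the object's centroid.

```python
# Sketch of the rule-based filter applied to nudge candidates (rules 1-5).
import numpy as np

def nudge_is_feasible(c, support_normal, min_clearance=0.05, max_angle_deg=30.0):
    """c: dict with 'point', 'direction' (unit 3-vectors, z-up assumed),
    'object_height', 'object_center', 'neighbor_gap_after_push', 'arm_reachable'."""
    p, d = np.asarray(c["point"]), np.asarray(c["direction"])
    if p[2] > (2.0 / 3.0) * c["object_height"]:        # rule 1: too high, may topple
        return False
    if c["neighbor_gap_after_push"] < min_clearance:   # rule 2: pushes objects together
        return False
    if abs(np.dot(d, support_normal)) > 1e-2:          # rule 3: keep the push horizontal
        return False
    if not c["arm_reachable"]:                         # rule 4: outside the arm workspace
        return False
    to_center = np.asarray(c["object_center"]) - p     # rule 5 (interpreted): push roughly
    to_center[2] = 0.0                                 # through the center to avoid spin
    cosang = np.dot(d, to_center) / (np.linalg.norm(to_center) + 1e-9)
    angle = np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))
    return angle <= max_angle_deg
```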
  • During the nudge, we detect the motion of the objects with a combination of texture-based and texture-less algorithms: for texture tracking, texture features are extracted from the RGB data acquired by the Kinect; for the texture-less case, geometric features are extracted from the depth data. Finally, we cluster the trajectories of the feature points and map the features back onto the objects.
  • In the above step 102, comparing the small-region images after the thrust is applied with the small-region images before the thrust, and segmenting the objects in the image according to the comparison result, includes the following three cases (illustrated in Figure 3): if the small regions corresponding to an object move along the same trajectory, the object is determined to be correctly segmented and those small regions belong to the same object; if the small regions corresponding to a hypothesized object are clustered into several different motion groups, the object is determined to be under-segmented; if the small regions corresponding to several hypothesized objects move together, the objects are determined to be over-segmented and should be merged (a decision sketch follows).
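  • The three cases can be decided directly from the clustered feature trajectories; a minimal sketch, where `object_of_feature` maps each tracked feature to the hypothesized object it was back-projected onto and `cluster_of_feature` is the motion-cluster id from the tracking step (both names are illustrative):

```python
# Sketch: decide correct / under- / over-segmentation from motion clusters.
from collections import defaultdict

def verify_hypothesis(pushed_object, object_of_feature, cluster_of_feature):
    clusters_in_obj = {cluster_of_feature[f]
                       for f, o in object_of_feature.items() if o == pushed_object}
    if not clusters_in_obj:
        return "unobserved"            # no features tracked on the pushed hypothesis
    if len(clusters_in_obj) > 1:
        return "under-segmented"       # one hypothesis broke into several motions
    objects_per_cluster = defaultdict(set)
    for f, o in object_of_feature.items():
        objects_per_cluster[cluster_of_feature[f]].add(o)
    (cluster,) = clusters_in_obj
    if len(objects_per_cluster[cluster]) > 1:
        return "over-segmented"        # several hypotheses moved as one rigid body
    return "correct"                   # the hypothesis moved alone, as a unit
```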
  • In one embodiment, controlling the robot to scan the incompletely scanned parts of the objects separated after the thrust is applied, so as to obtain the objects' complete three-dimensional data, includes:
  • computing the robot's next best scanning view, and, according to that best scanning view, controlling the robot to scan the incompletely scanned parts of the objects separated after the thrust is applied.
  • Specifically, to improve the data completeness of the pushed objects, we compute a series of next best views (NBVs) from the Kinect sensor's point cloud data. When computing the next best scanning view we take into account the constraints of the scene, the sensor, and the robot platform, and filter out views that cannot be scanned or that the robot cannot reach; the remaining scanning views can effectively eliminate the accumulated drift error in KinectFusion, because the robot provides the pose transformations between the scanning views.
  • Merging the verification results: through verification, we merge the information gains obtained for segmentation and reconstruction, and the new data are fused into KinectFusion's global voxel space. Based on the motion-tracking result, we update the patch graph locally and the segmentation cost of the graph globally: for correctly segmented objects, we simply merge their patches into one large region and update the patch graph; for incorrectly segmented objects, since the fine scan updates the surface reconstruction, we re-segment the moved objects at the patch level after scanning and insert the new patches into the patch graph in place of the original hypotheses.
  • In one embodiment, segmenting the objects in the image according to the comparison result includes:
  • determining the accuracy of the object segmentation in the image, and segmenting the objects in the image according to that accuracy.
  • The above accuracy can be understood as the uncertainty of segmentation and reconstruction.
  • The information gain mentioned in the implementation of the present invention refers to the reduction in the uncertainty of information (e.g., segmentation information), and the entropy refers to the magnitude of the uncertainty of information (e.g., segmentation and reconstruction information). Both the information gain and the entropy correspond to the accuracy mentioned above.
  • the termination condition is introduced: when the information gain of the next best nudge is less than a preset threshold, the analysis-verification iterative process is terminated.
  • However, our method cannot guarantee convergence within a bounded number of iterations, because verification based on robot interaction cannot guarantee that the uncertainties of segmentation and reconstruction decrease simultaneously; for example, in some complicated situations a nudge may create new mutual occlusions between objects. On the other hand, fine scanning always reduces the reconstruction uncertainty, because it improves data completeness. Our termination conditions can therefore include the following three: (1) the maximum information gain of the next nudge is smaller than the current entropy; (2) no executable nudge operation remains; (3) the number of nudges in the current region of interest has reached the upper limit of 30 (see the sketch below).
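  • The three termination conditions map directly onto a small loop guard; a sketch with illustrative names:

```python
# Sketch: stop the analyze-verify loop for a region of interest when any of the
# three conditions holds (gain below current entropy, no feasible nudge, or cap reached).
def should_terminate(best_gain, current_entropy, feasible_nudges,
                     nudges_done, max_nudges=30):
    if best_gain < current_entropy:          # condition 1
        return True
    if not feasible_nudges:                  # condition 2
        return True
    return nudges_done >= max_nudges         # condition 3
```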
  • Based on the same inventive concept, an apparatus for indoor scene scanning reconstruction is also provided in an embodiment of the present invention, as described in the following embodiments.
  • Since the principle by which the apparatus solves the problem is similar to that of the method for indoor scene scanning reconstruction, the implementation of the apparatus may refer to the implementation of the method, and repeated descriptions are omitted.
  • As used below, the term "unit" or "module" may refer to a combination of software and/or hardware that implements a predetermined function.
  • Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
  • FIG. 4 is a schematic structural diagram of an apparatus for scanning and reconstructing an indoor scene in an embodiment of the present invention. As shown in FIG. 4, the apparatus includes:
  • the preliminary scan reconstruction module 02 is configured to obtain scan image information of the indoor space captured by a robot located in the indoor space, and reconstruct a three-dimensional scene model map of the indoor space according to the scan image information;
  • a segmentation scan processing module 04 configured to divide the three-dimensional scene model map into a plurality of regions of interest and, for each region of interest, perform the following operations: dividing the region of interest into a plurality of small regions; controlling the robot to apply a thrust to the object corresponding to each small region and obtaining images of the plurality of small regions after the thrust is applied; comparing the small-region images after the thrust with those before the thrust, segmenting the objects in the image according to the comparison result, and controlling the robot to scan the incompletely scanned parts of the objects separated by the thrust to obtain the objects' complete three-dimensional data;
  • a reconstruction module 06 configured to reconstruct the indoor scene according to the segmented objects and the objects' complete three-dimensional data.
  • In one embodiment, the segmentation scan processing module is specifically configured to segment the objects in the image according to the following cases: if the small regions corresponding to an object move along the same trajectory, the object is determined to be correctly segmented and those small regions belong to the same object; if the small regions corresponding to an object are clustered into several different motion groups, the object is determined to be under-segmented; if the small regions corresponding to several objects move together, the objects are determined to be over-segmented.
  • The beneficial technical effects of the technical solution provided by the embodiments of the present invention are as follows:
  • (1) The method of the present invention can obtain more complete data of indoor scenes and can handle more complex, larger-scale scenes.
  • (2) The segmentation accuracy is high: with the same number of interactions (about 10) and the same scene (everyday items on a tabletop), the segmentation accuracy of existing methods is only 70% to 80%, whereas the proposed method reaches about 90%.
  • (3) The interaction efficiency is high: the method of the present invention achieves about 90% segmentation accuracy with only about 12 interactions on a large scene containing 20-30 objects.
  • (4) Existing work does not consider how to reconstruct the segmented objects, whereas the method of the present invention both performs object-level segmentation of the scene and reconstructs the objects in the scene with high quality.
  • The feasibility of the invention has been demonstrated through multiple experiments; it can be widely applied to various scenes and, in particular, can handle large-scale, complex scenes, which existing methods cannot. A large number of experimental results show that the proposed method performs object-level reconstruction and segmentation of indoor scenes efficiently and robustly.
  • In particular, the framework proposed by the present invention can be widely applied to a variety of robot platforms and is not affected by the robot hardware platform.
  • Obviously, those skilled in the art should understand that the modules, apparatuses, or steps of the embodiments of the invention described above may be implemented with a general-purpose computing device; they may be concentrated on a single computing device or distributed over a network formed by multiple computing devices.
  • Optionally, they may be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases the steps shown or described may be performed in an order different from that given here, or they may be made into individual integrated circuit modules, or several of the modules or steps may be made into a single integrated circuit module.
  • Thus, embodiments of the invention are not limited to any specific combination of hardware and software.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

本发明公开了一种室内场景扫描重建的方法及装置,该方法包括:获得机器人捕获的室内空间的扫描图像信息,根据扫描图像信息重建室内空间的三维场景模型图;将三维场景模型图分割为多个兴趣区域;对于每个兴趣区域,均执行以下操作:将兴趣区域分割为多个小区域;控制机器人对每个小区域对应物体施加推力,获得施加推力后的多个小区域的图像;将施加推力后的小区域图像与施加推力前的小区域图像进行比较,根据比较结果,分割图像中的物体,以及控制机器人对施加推力后分离的物体中没有扫描完全的部分进行扫描,获取物体完整的三维数据;根据分割出的物体和物体的完整三维数据,对室内场景进行重建。上述技术方案,提高了室内场景重建的效率和精度。

Description

室内场景扫描重建的方法及装置 技术领域
本发明涉及三维扫描重建技术领域,特别涉及一种室内场景扫描重建的方法及装置。
背景技术
目前,三维重建技术一直是计算机视觉和计算机图形学领域研究的重点。从最初的对三维空间中一个点的三维重建到对某个特定物体的三维重建,再到对整个室内场景乃至整个城市的三维重建,三维重建的研究取得了长足的发展和进步,且已经应用到了人类日常生活的方方面面,例如:3D打印、数字博物馆、视觉跟踪、地形导航等。对于三维场景建模来说,数字化再现技术可以方便的处理和分析周围的环境信息,使得现实场景能在每个角度看清细节,最终使得不论是人类还是机器人,都可以解读所在的周边环境。现如今,随着科技的迅速发展各种3D测量设备随之出现,3D各场景的重建有了更多方法和途径。场景从一开始的简单的小型物体到复杂的大型室内外场景,使得人们生活愈加享受到3D重建带来的优越。
近年来,对大范围室内场景进行数字化的研究受到了越来越多的关注。真实场景的数字化,可以让我们可以在不同视角下充分观察欣赏场景的各个部分。近年来,多种三维测量设备的迅速发展也为三维场景重构提供了更多的实现手段。但是由于室内场景中各种物体间的互相遮挡,或数据获取装置自身的物理限制等原因,利用传统的室内场景三维重建及分割方法很难得到一个完整、较高精度的环境模型,且重建得到三维模型往往功能意义不明确,用户也无法与之交互。要想得到有明确意义的室内场景模型,我们需要对扫描重建得到地结果进行分割,但由于室内场景情况复杂、遮挡严重,使得完全使用软件算法的分割方法极具挑战性。如果在后续工作中想将此重建模型用到虚拟漫游、室内设计等应用中,很多时候都需要人为进行一些分割、识别、添加语义、实现动画等一系列工作。
在现有典型的室内场景获取工作中,通常是一个操作员手持深度摄像机在室内场景中移动来扫描捕捉场景数据。但是对于人类来说,精细化的场景扫描是一个枯燥无聊的活,特别是对于大尺度、包含很多物体的室内场景。要解决这一问题,利用移动机器人对室内场景进行全自动扫描就成了一个十分吸引人的方案。
从单个物体的点云数据中进行表面重建的研究已经趋近成熟,现今关于三维扫描和重建的研发重点越来越多地转向了室内场景。特别是随着低成本深度传感器(如微软公司的Kinect深度传感器,华硕公司的Xtion Live深度传感器,Intel公司的Realsense传感器)的快速发展和SLAM相关技术的成熟,实时场景扫描和重建得到了学术界和产业界的一致重视。这些方法的共同之处是,最终都是用一个三维模型来表示整个重建的场景。然而,室内场景是由它其中的物体及物体之间的空间关系所表征的。如果不能有效地表示室内场景中有意义的、具体的物品,那么重建得到的场景三维模型的作用是有很大限制的。如之前所述的工作,因为最后得到的是单个三维模型,所以无法用于场景中的物体检索、编辑与合成。因此,更有意义、更具使用价值的室内场景重建应该要能够提取重建场景中的各种物品并由此能推断各物品之间的相关关系。
要想提取场景中的各个物品,就得对重建场景进行分割分析。传统的方法主要是利用3D模型数据要辅助进行物体提取和识别。有些工作利用室内场景中物体重复出现作为线索进行场景理解,还有一些工作利用人类在室内场景的活动数据来帮助进行场景语义分析。但是,这些工作都是使用扫描好的场景数据作为输入,这种线下的分析方法缺少关于场景结构的第一手信息,必须依靠先验知识(由人或另外的数据库提供)或者扫描时记录的额外信息才完成分析。
利用机器人对单个物体进行扫描重建在之前已经有不少工作,但是从全局重建,特别是物体级别的场景重建的角度来评价,现在只有很少的工作研究了如何进行全自动场景扫描重建。此外,目前也有一些通过机器人交互从场景中提取物体的技术方案,最接近并且最新的解决方案是一种基于机器人推动的场景中物体分割方法,该方法的核心思想是利用RGB图像和三维点云作为输入来计算推动点和推动方向,并在推动过程中提取Shi-Tomasi特征利用光流跟踪法对物体进行跟踪,最后对特征的运动轨迹进行聚类以分割物体。
综上所述,现有技术的缺点主要包括以下几点:
(1)处理场景有限:现有技术只能对桌面上的若干个物体进行分析,不能对大范围杂乱的整个室内场景进行处理。
(2)分割准确率不高:在相同交互次数(10次左右)、相同场景(桌面上的日常物品)的情况下,现有方法的分割准确率只有70%~80%。
(3)交互效率不高:现有技术对只有5-6个物体的小场景就需要推动10-12次才能得到令人满意的分割结果。
(4)不能有效地重建场景中的物体:现有工作都没有考虑如何重建分割出来的物体。
发明内容
本发明实施例提供了一种室内场景扫描重建的方法,用以对大范围杂乱的整个室内场景进行处理,有效地重建场景中的物体,提高室内场景重建的效率和精度,该方法包括:
获得位于室内空间的机器人捕获的所述室内空间的扫描图像信息,根据所述扫描图像信息重建所述室内空间的三维场景模型图;
将所述三维场景模型图分割为多个兴趣区域;对于每个兴趣区域,均执行以下操作:将兴趣区域分割为多个小区域;控制所述机器人对每个小区域对应物体施加推力,获得施加推力后的所述多个小区域的图像;将施加推力后的小区域图像与施加推力前的小区域图像进行比较,根据比较结果,分割图像中的物体,以及控制机器人对施加推力后分离的物体中没有扫描完全的部分进行扫描,获取物体完整的三维数据;
根据分割出的物体和物体的完整三维数据,对室内场景进行重建。
本发明还提供了一种室内场景扫描重建的装置,用以对大范围杂乱的整个室内场景进行处理,有效地重建场景中的物体,提高室内场景重建的效率和精度,该装置包括:
初步扫描重建模块,用于获得位于室内空间的机器人捕获的所述室内空间的扫描图像信息,根据所述扫描图像信息重建所述室内空间的三维场景模型图;
分割扫描处理模块,用于将所述三维场景模型图分割为多个兴趣区域;对于每个兴趣区域,均执行以下操作:将兴趣区域分割为多个小区域;控制所述机器人对每个小区域对应物体施加推力,获得施加推力后的所述多个小区域的图像;将施加推力后的小区域图像与施加推力前的小区域图像进行比较,根据比较结果,分割图像中的物体,以及控制机器人对施加推力后分离的物体中没有扫描完全的部分进行扫描,获取物体完整的三维数据;
重建模块,根据分割出的物体和物体的完整三维数据,对室内场景进行重建。
与现有技术相比较,本发明实施例提供的室内场景扫描重建的方案,首先,利用机器人对整个室内场景进行粗略地全局扫描重建,获得位于室内空间的机器人捕获的所述室内空间的扫描图像信息,根据所述扫描图像信息重建所述室内空间的三维场景模型图;然后,将所述三维场景模型图分割为多个兴趣区域;对于每个兴趣区域,均执行以 下操作:将兴趣区域分割为多个小区域;控制所述机器人对每个小区域对应物体施加推力,获得施加推力后的所述多个小区域的图像;将施加推力后的小区域图像与施加推力前的小区域图像进行比较,根据比较结果,分割图像中的物体,以及控制机器人对施加推力后分离的物体中没有扫描完全的部分进行扫描,获取物体完整的三维数据;最后,根据分割出的物体和物体的完整三维数据,对室内场景进行重建,通过该技术方案,本发明实施例至少具有以下有益技术效果:
(1)现有技术只能对桌面上的若干个物体进行分析,本发明通过上述技术方案,可以对大范围杂乱的整个室内场景进行处理。
(2)在相同交互次数(10次左右)、相同场景(桌面上的日常物品)的情况下,现有方法的分割准确率只有70%~80%,而本发明通过上述技术方案能达到90%左右,提高了分割的准确率。
(3)现有技术对只有5-6个物体的小场景就需要推动10-12次才能得到令人满意的分割结果。而本发明通过上述技术方案,对于含有20-30个物体的大场景进行12左右的交互就能得到90%左右的分割准确率,使得在机器人扫描场景的时候可以进行物体的分析,并通过物理上的轻推交互来验证分割准确性,从而大大提高了传统三维重建对于物体分析的能力,提高了交互的效率。
(4)现有技术的实际工作都没有考虑如何重建分割出来的物体,不能有效地重建场景中的物体。而本发明提出了一个物体层面的场景重建和物体级别的场景分析框架,使得重建和分析都是针对场景中的物体,进而使得本发明提出的框架能更好地为下一步的识别、语义理解等工作所用。同时,由于轻推之后往往可以使得扫描更加充分,提高了物体重建的置信度。
通过上述可知,本发明实施例提供的室内场景扫描重建的方法,可以对大范围杂乱的整个室内场景进行处理,有效地重建场景中的物体,提高了分割效率和交互效率,提高了室内场景重建的效率和精度,通过该方法可以得到一个带有分割信息的、可以与用户交互的、重建完整的三维室内场景模型。
附图说明
为了更清楚地说明本发明实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本发明的一些实施例,对 于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。在附图中:
图1是本发明实施例中室内场景扫描重建的方法的流程示意图;
图2是本发明实施例中在过分割和欠分割的情况下区别一些错误分割的示意图;
图3是本发明实施例中三种分割验证情况的示意图;
图4是本发明实施例中室内场景扫描重建的装置的结构示意图。
具体实施方式
为使本发明实施例的目的、技术方案和优点更加清楚明白,下面结合附图对本发明实施例做进一步详细说明。在此,本发明的示意性实施例及其说明用于解释本发明,但并不作为对本发明的限定。
本发明要解决的主要问题是如何让机器人自动扫描重建室内场景点并通过主动交互式的验证方式实现物体层面的分割。要解决的子问题有,室内场景实时三维重建,计算场景预分割,计算需要主动交互的物体,利用机器人交互验证物体的分割,对分割得到的物体进行精细扫描。另外,整个扫描—分割—交互—扫描系统的设计和搭建也是本发明解决的难题。现有的自动场景三维重建技术不能有效地提取并重建场景中的物体,有必要提出一种物体级别的场景重建和分析框架,将在线实时得到的物体几何信息作为反馈指导扫描和重建的过程。
有鉴于此,本发明创新性地提出了一个结合了物体层面的场景重建和物体级别的场景分析框架,将自动扫描重建和交互式分割耦合在一套算法流程中,最后得到一个带有分割信息的、重建较为完整的三维室内场景。该方法鲁棒有效且能适用于大范围、较复杂的室内场景。
为实现上述目的,本发明采用如下方案:
先使用机器人对室内场景进行粗略地自动扫描重建,再对得到的重建结果进行预处理和过分割,继而生成一些物体的假设并用区域图来表示,最后通过基于熵的机器人轻推交互方式来验证假设的准确性,同时对物体进行精细扫描以提升数据完整性。
具体地,一种对场景中物体自动进行扫描重建和分析的方法,其中,该方法包括:
(1)实时获取和重建三维场景的扫描方法。在一个给定的室内场景中,机器人首先自动导航并对整个房间进行扫描和重建。为了实现有效的处理,我们把重建得到的场景分割为若干个兴趣区域(Regions of Interest,ROIs),接着逐个进行处理。对于每一个兴 趣区域,系统对于互相遮挡的物体进行扫描和交互并将扫描得到的物体几何信息作为反馈指导后续的验证过程。
(2)基于机器人交互的主动式验证方法。为了实现物体层面的分析,首先将当前的兴趣区域过分割为一些很小的区域(patches),然后用这些小区域构建一个区域图模型(patch graph),然后使用图分割(graph-cut)算法生成一些可能的物体假设,然后建立物体图(object graph)。通过生成的这两个图,我们估计重见得到物体的不确定性来指导主动式验证操作,包括两个部分:一个是利用机器人手臂进行轻推,另一个是对于没有扫描完全的物体进行精细扫描。我们利用机器人进行水平轻推来验证我们的局部分割是否正确,同时,通过对相距很近的物体进行推动分离来减少物体间相互遮挡的方法也可以帮助我们获取数据。精细扫描主要针对于交互过程中推动过的物体,因为推动过后这些物体原本被遮挡的部分显露了出来,此时进行扫描能获取该物体更多的数据,提高了这些物体重建模型的完整性。最后,验证结果会被合并到分割和重建中,因此减少了物体相应的不确定性。这样的迭代过程一直重复直到不确定性停止减少。
下面进行详细说明。
图1是本发明实施例中室内场景扫描重建的方法的流程示意图;如图1所示,该方法可以包括如下步骤:
步骤101:获得位于室内空间的机器人捕获的所述室内空间的扫描图像信息,根据所述扫描图像信息重建所述室内空间的三维场景模型图;
步骤102:将所述三维场景模型图分割为多个兴趣区域;对于每个兴趣区域,均执行以下操作:将兴趣区域分割为多个小区域;控制所述机器人对每个小区域对应物体施加推力,获得施加推力后的所述多个小区域的图像;将施加推力后的小区域图像与施加推力前的小区域图像进行比较,根据比较结果,分割图像中的物体,以及控制机器人对施加推力后分离的物体中没有扫描完全的部分进行扫描,获取物体完整的三维数据;
步骤103:根据分割出的物体和物体的完整三维数据,对室内场景进行重建。
与现有技术相比较,本发明实施例提供的室内场景扫描重建的方案,首先,利用机器人对整个室内场景进行粗略地全局扫描重建,获得位于室内空间的机器人捕获的所述室内空间的扫描图像信息,根据所述扫描图像信息重建所述室内空间的三维场景模型图;然后,将所述三维场景模型图分割为多个兴趣区域;对于每个兴趣区域,均执行以下操作:将兴趣区域分割为多个小区域;控制所述机器人对每个小区域对应物体施加推力,获得施加推力后的所述多个小区域的图像;将施加推力后的小区域图像与施加推力 前的小区域图像进行比较,根据比较结果,分割图像中的物体,以及控制机器人对施加推力后分离的物体中没有扫描完全的部分进行扫描,获取物体完整的三维数据;最后,根据分割出的物体和物体的完整三维数据,对室内场景进行重建,通过该技术方案,本发明实施例至少具有以下有益技术效果:
(1)现有技术只能对桌面上的若干个物体进行分析,本发明通过上述技术方案,可以对大范围杂乱的整个室内场景进行处理。
(2)在相同交互次数(10次左右)、相同场景(桌面上的日常物品)的情况下,现有方法的分割准确率只有70%~80%,而本发明通过上述技术方案能达到90%左右,提高了分割的准确率。
(3)现有技术对只有5-6个物体的小场景就需要推动10-12次才能得到令人满意的分割结果。而本发明通过上述技术方案,对于含有20-30个物体的大场景进行12左右的交互就能得到90%左右的分割准确率,使得在机器人扫描场景的时候可以进行物体的分析,并通过物理上的轻推交互来验证分割准确性,从而大大提高了传统三维重建对于物体分析的能力,提高了交互的效率。
(4)现有技术的实际工作都没有考虑如何重建分割出来的物体,不能有效地重建场景中的物体。而本发明提出了一个物体层面的场景重建和物体级别的场景分析框架,使得重建和分析都是针对场景中的物体,进而使得本发明提出的框架能更好地为下一步的识别、语义理解等工作所用。同时,由于轻推之后往往可以使得扫描更加充分,提高了物体重建的置信度。
通过上述可知,本发明实施例提供的室内场景扫描重建的方法,可以对大范围杂乱的整个室内场景进行处理,有效地重建场景中的物体,提高了分割效率和交互效率,提高了室内场景重建的效率和精度,通过该方法可以得到一个带有分割信息的、可以与用户交互的、重建完整的三维室内场景模型。
与之前的方法不同,本发明实施例提出的方法是在线主动式分析方法。其中,场景分析与扫描和重建紧密地结合在一个框架中,形成一个自主(也就是说,不需要模型数据库或者任何的人工输入)的反馈系统。
本发明实施例提出的方法与现有技术最大的不同在于,我们方案中的机器人交互是通过既考虑分割置信度又考虑重建质量来驱动的,也就是说我们同时考虑场景分割和物体重建的不确定性来指导进行机器人交互。本专利提出的方法第一个将机器人交互引入室内场景的完整重建且在进行物体分割提取的同时对物体进行精细重建。
在本发明实施例中,表示场景有四种:全局空间,3D点云,区域图,物体图。全局空间包含了整个场景,后三种则只针对当前的兴趣区域,兴趣区域的几何信息和标签信息会整合到全局空间中。本方法中的物体分析是在3D点云上进行处理的,与基于二维RGB图像的方法不同,3D点云能提供的深度信息对于机器人的交互非常有用。分析和验证的过程是以区域图和物体图为中心的,特别地,通过区域图的分割和重构生成了物体图,然后区域图和物体图都会被用于估计最佳交互动作。总之,我们的方法使用上述两个图来提高物体级别的重建。
具体实施时,在上述步骤101中,我们使用机器人自动导航功能来进行整个室内场景的粗略扫描与重建,在机器人行驶的路径上我们进行采样以得到一些采样点,机器人在每一个采样点停下来做一次扫描,最后将每次采样扫描得到的数据拼接起来得到室内场景的完整数据,但这个重建模型丢失了大量的细节信息。所以,我们把重建得到的模型划分成几个兴趣区域,然后用机器人对它们逐一进行更精细的扫描。每个兴趣区域是一个相对独立的子场景,其中拥有大量需要扫描的物体。初始化阶段的输出是一系列由机器人扫描路径连接的兴趣区域。
本发明实施例通过使用机器人自主扫描重建和推动交互,进行物体级别的分割和物体层面的重建,把人类从无聊的室内场景扫描工作中解放出来。
具体实施时,在上述步骤102中,包括了对物体的分析和基于熵的主动式交互验证,这是本方案的核心创新点。下面将分步介绍具体的实现细节。
在上述步骤102中,介绍分割图像中的物体的具体的实施例如下:
在进行物体分析时,我们首先对场景点云过分割,建立一个邻接图,记为Gp=(υpp)每个节点代表一个区域,图的邻接关系代表区域的邻接关系。基于区域图,我们使用二值图分割算法计算得到一系列物体假设的候选选项,然后我们使用多分类投票机制来选取最有可能的候选选项。本发明实施例中提到的物体假设的含义为:假设小区域属于某个特定物体。
下面介绍物体假设的生成:
具体实施时,利用如下公式,分割图像中的物体:
E(X;Ps)=∑u∈VpEd(xu;Ps)+∑(u,v)∈εpEs(xu,xv)；
具体实施时，当我们用图分割算法生成物体假设的时候，我们先选取一个区域作为前景，且暂时先不定义背景，我们通过一种背景加惩罚项的办法来去除非前景的部分。特别地，我们选取一个区域，记作Ps，标记为前景xs=1，对二值的区域标签X=[x1,...,xn]有如下的能量函数公式，并对其最小化：
E(X;Ps)=∑u∈VpEd(xu;Ps)+∑(u,v)∈εpEs(xu,xv)；
其中，Ps表示一块选定并标记为前景(xs=1)的小区域，X为小区域的标签，X=[x1,...,xn]，xi∈{0,1}，n为小区域的数量；xu表示第u个小区域的标签，xv为第v个小区域的标签；Vp和εp为过分割产生的表示各区域的节点(nodes)和各区域之间邻接关系的边(edges)；第一项Ed(xu;Ps)是数据项：当xu=0，且u=s时，Ed(xu;Ps)为无穷大；当xu=1，且u≠s时，Ed(xu;Ps)等于fu；其余情况下，Ed(xu;Ps)等于0；fu是一个背景的惩罚项，设定当一个区域和一个已标定为前景的区域的距离大于预设阈值的时候，就对该区域加上一个惩罚项，表明这个不是需要的前景；当Ps和Pu的距离大于阈值λ时，fu=k(d(Ps,Pu)-λ)，否则fu=0；使用k=2.0作为步长惩罚距离大于预设阈值，但是却被标记为前景的区域。参数λ控制着范围，值并不是固定不变的，而是从0到整个边界方格的对角线长度ld。
平滑项,或者称为分割成本,定义为两个邻接区域属于同一个物体的可能性,会基于机器人主动验证的结果来学习。分割成本的学习将在后续章节中进行介绍。剩下的问题就是如何选择前景的初始区域。我们采用的是使用给每个区域都分别作为前景的初始区域来进行多次图分割,产生许多带冗余的前景选择。接下来,我们将前景的分割使用均值漂移算法进行聚类,进而得到最有可能的前景物体。
下面介绍物体假设的选择:
具体实施时,可以利用如下公式,分割图像中的物体:
E(L)=∑u∈VpEd(lu;Pu)+∑(u,v)∈εpEs(lu,lv)；
具体实施时,在分割区域图的时候,猜想的候选集并不是必须的,一些候选会相互覆盖,使得在设置标签的时候容易引起歧义。现有的方法要么基于一些规则滤掉了部分候选集,要么将从训练数据学习到的模型进行排序,选取最有可能的。为了不要依赖监督式的方法来选取最优的候选,我们提出了一种让候选集之间相互竞争的方式来产生区域分割的方法。这个方法将有下述的马尔科夫随机场分割的能量函数公式表征:
E(L)=∑u∈VpEd(lu;Pu)+∑(u,v)∈εpEs(lu,lv)；
其中,L为所有小区域的标签,L=[l1,...,ln],lu∈{1,...,k};Vp和εp为过分割产生的表示各区域的节点和各区域之间邻接关系的边;数据项Ed(lu;Pu)被定义为区域Pu属于某个特定的物体的相似度,对于区域Pu和第i个物体Hi,数据项被定义为区域Pu被属于物体Hi的所有区域覆盖的比例,记作Ci
Ed(lu=i;Pu)=-ln(t(Pu,Ci)/∑jt(Pu,Cj));
其中,t(Pu,Ci)代表区域Pu在分类Ci中出现的次数,平滑项和之前在二值图分割中用到的一致。
本质上来说,数据项基于所有前景聚类的一致性投票来选择了每个区域的标签,对于投票机制,某个前景聚类的值越大,对应的物体假设代表一个独立物体的概率也就越大,因为物体已经在许多区域分割过程中多次出现而增加了其可能性。图2例证了我们的方法能够在过分割和欠分割的情况下很好地区别一些错误的分割。
下面再详细说明前面所提到的图分割成本的在线学习。
具体实施时,3D场景分割受到很多因素的影响,比如几何结构,颜色,纹理,甚至一些高层的结构信息,很难将上述所有因素融入到一个分割成本的计算方法中,因此我们想从机器人主动验证两个邻接区域是否属于不同的物体的可能性来学习分割成本:
Es(lu,lv)=1-p(lu≠lv|x(Pu,Pv));
其中,x(·)∈Rn是从一对区域块中提取出的特征向量,特别地,我们在这一特征向量上利用支持向量机(SVM)来训练预测函数:
pc(euv)=p(lu≠lv|x(Pu,Pv))=g(f(x(Pu,Pv)))；
其中,f预测函数,当两个区域标签不同的时候返回正值,否则返回负值。g(t)=1/(1+e-t)是Logistic函数,用来将预测的数据转化为可能性,而对一条边euv的切割的可能性我们用pc(euv)来描述。为了训练支持向量机,我们从机器人验证中收集了一系列例子,每一对区域的边是否需要分割,是则设置标签为+1,否则为-1作为训练样本进行训练。分割成本学习的一个重要的好处就是学习到的结果能够被用于提高整个区域图的分割,而不仅仅是在一个局部的子图中。
这种学习方法的一个主要问题就是从不同因素中提取出的特征是由不同组分构成的,因此我们需要在支持向量机中设计多个核函数,因此我们使用了多核学习(Multiple Kernel Learning)的方法来对每一种特征进行学习。另一个需要注意的事实就是,在机器人主动分析和验证的过程中,训练样本是顺序的,更加高效的实现了在线学习,可以随着大量的样本而递增的提高预测函数的准确值。为了实现在线的多核学习,我们使用了passive-aggressive算法,使得多核学习算法有效的执行。
在初始化阶段,当还没有训练数据的时候,我们基于局部几何凹度定义分割可能性来进行分割:
pc(euv)=η(1-cosθuv);
其中θuv代表区域Pu和Pv的平均法向量的夹角,当两个邻接区域角度是一个凸的二面角时η取0.01,否则取1。
具体地,下面详细介绍基于熵的主动式交互验证:
为了提高物体层面的分割和重建,机器人会在当前的兴趣区域中进行轻推来验证和精细扫描。机器人与场景进行物理交互后,相应的物体发生移动,物体的几何信息由此增加,该增加的信息可以用来验证或更正分割结果。与现有的方法不同,本发明提出的主动轻推的方法特别适合于空间相距较近的物体,并解决了相距很近的物体之间的互相遮挡问题,有利于对物体进行精细扫描。
机器人交互(轻推)可能导致三种可能:
1)一个物体可被独立的推动,那预示着这是一个独立的物体;
2)一个物体可以分散成多个物体,表明我们的分割不够完全;
3)一个物体会在运动过程中和周围的物体一起运动,那预示着我们过分割了物体。后两者还需要进行更正,而前者则直接将其分割为一个物体。
使用信息增益最大来指导下一个最佳的轻推选择:
基于分割结果,区域图(patch graph)被构建成一个物体图(object graph),下一个最佳轻推的选择是基于这两个图。对于进行物体层面的分割和物体级别重建,下一个最好轻推对这两者都有积极作用。首先,轻推应该最大化减小场景分割的不确定性;另一方面,为了增加数据的完备性,机器人应该去尽量探索没有观察到的区域。重建结果的不确定性是由基于泊松的重建质量来度量的。
在上述步骤103中,根据分割出的物体和物体的完整三维数据,利用如下公式,对室内场景进行重建:
min H(S,R)=H(S)+H(S|R);
其中分割熵H(S)由区域图各边的剪边概率pc(e)估计，基于分割的重建熵H(S|R)由iso点集合Ω上的泊松场保真度估计，具体公式在下文详细给出。
其中,S和R分别用来描述当前兴趣区域的分割和重建结果的随机变量,H(S)代表分割熵,H(S|R)代表基于分割的重建熵;我们将S离散化为区域图Gp的一系列可能分割,并用S(Gp)表示,所以一种分割Si∈S(Gp)能用来表示区域图Gp所有边的联合剪边概率;e为区域图Gp中一条边,pc(e)表示该边被剪断的概率;Ω表示用过零点的非平面iso表面(iso-surface,是由对重建得到的结果计算泊松场得到的)均匀采样得到的iso点的集合;区域Pu被属于物体Hi的所有区域覆盖的比例,记作Ci;ε0(S)为图中关于S的边的集合。
下面介绍分割和重建的联合熵以及上述公式的具体应用:
我们使用香农熵来联合测量分割和重建的不确定性,H=H(S,R),其中S和R分别用来描述当前兴趣区域的分割和重建结果的随机变量,这里的联合熵用于度量不准确性或者随机变量代表的信息量。根据后验概率公式:H(S,R)=H(S)+H(S|R),上述公式可以分为以下两步:H(S)代表分割熵,H(S|R)代表基于分割的重建熵。接下来我们从细节上进行详解。
对于分割,将S离散化为一系列可能的分割,用S(Gp)表示。因此所有边的切割就能用Si分割来表示,假设边的切割是独立的,我们可以用下列的公式来估计分割区域图的概率:
p(Si|Gp)=∏e∈εc(Si)pc(e)·∏e∈ε0(Si)\εc(Si)(1-pc(e))，其中εc(Si)表示被分割Si剪断的边的集合；
为了估计不确定性,我们首先使用点云数据和提取非平面的过零点的方法,为每个感兴趣区域中的物体计算泊松场,这些过零点的非平面被采样到一系列的点,用Ω表示,而在一个分离点的数据保真度可以由下面的泊松场梯度公式表示:c(s)=Γ(s)·ns;其中,Γ(s)是泊松场的梯度,ns是其法向量。基于这个测量方法,我们可以估算物体重建的熵:
HR(Ω)=-∑s∈Ω[g(c(s))·ln g(c(s))+(1-g(c(s)))·ln(1-g(c(s)))]；
其中g是Logistic函数,和上文提到的一致。
自我遮挡和相互遮挡是影响扫描的最主要的两个因素。自我遮挡主要和自我的几何形态,这个问题可以通过移动扫描仪进行多个视角的扫描来避免,另一方面,相互遮挡可以通过将两个部分分离来解决。因此,相互遮挡直接与重建的不确定性有关,而可以由多次计算下一次最好的视角来减少不确定性。当给定一个物体未知的场景,我们只有物体级的分割信息来寻找潜在的相互遮挡,因此在分割的条件下的重建熵由下面公式表示:
H(S|R)=∑Si∈S(Gp)p(Si|Gp)·HR(Ω(Si))，其中Ω(Si)表示在分割Si下相邻物体相互遮挡（冲突）区域中采样得到的iso点集合；
条件熵测量了基于分割后相邻物体相互遮挡的重建的不确定性。需要指出的是,我们没有考虑物体与支撑平面之间的遮挡问题,因为我们的轻推都是水平方向。
下面介绍最大化信息增益:一次轻推应该最大化减少分割和重建的联合不确定性,因此增加两方面都要增加信息量。我们首先对感兴趣区域重建后的表面进行采样,得到了以下的轻推候选选项,而一次轻推导致的信息增加则用推动前后的熵的变化量来描述:
I(S,R|<pu,du>)=H(S,R)-H′(S,R|<pu,du>);
其中,H′(S,R|<pu,du>)是后验熵,然后选择使得I(S,R|<pu,du>)最大化的轻推候选u即可。
下面后验熵:为了估计一次轻推的后验熵,我们需要计算分割和重建的后验熵。这需要观察被推物体产生的变化,而这在推动之前是不得而知的。为了简化这个估计,基于当前物体的提取结果,我们使用了两个关键的假设。给定一个推动u,让我们定义Ou为含有推动点pu的物体,ε0(Ou)为物体图中关于Ou的边的集合。首先,我们假设ε0(Ou)中边的分割状态将由轻推u来决定,因此那些边的分割不确定性为0。其次,假设在Ou和它的邻接Oj之间的交界处会暴露出来,假设有足够多的扫描的前提下,那些冲突的区域的重建的不确定性会随着轻推而减少。
这些假设的原理就是当前物体的分析仅仅提供了一个位置场景的查询的线索,在假设物体的提取是可靠的前提下,我们将尽可能的减少熵值。基于这些假设,一个轻推动作后对应的边和的后验熵能被减小到零。因此,轻推后的信息增益有下列的公式表征:
I(S,R|<pu,du>)=HS(εp(Ou))+HR|S(ε0(Ou,du))；
而HS(εp(Ou))和HR|S(ε0(Ou,du))已经在前文提过，这里不再赘述。
下面进行介绍物理上可行的下一个最好的轻推选择:
基于信息增益的方法由于现实物理世界中的种种限制导致某些情况下不可行,因此对于那些没有办法实现的轻推选择我们需要选择一个可行的替代方案。出于效能考虑,我们对点云进行了下采样,用一系列规则滤掉了那些不可行的选择,而这些规则是物体可感知的,然后从剩余的轻推选项中选择信息增益最大的。
在上述步骤102中,施加推力的操作不包括:
施加推力点位高于物体预设高度的操作;
施加推力方向使得两个物体挨得太近的操作;
施加推力方向与支持平面不垂直的操作;
机器人的手臂不能到达位置的操作;
施加推力方向和被施加推力物体中心的垂直线之间夹角角度大于预设角度阈值的操作。
具体地,即下列的5条规则中滤掉了会造成破坏性的(规则一到规则五)或者不能被机器人执行的(规则四),同时我们偏好于平移的运动来轻推物体而降低运动检测的难度。可以包括以下5条基本规则:
规则一:我们滤掉了那些轻推点位高于物体2/3高度的那些操作,以避免掉落。
规则二:我们滤掉了那些轻推方向会使得两个物体挨得太近(小于5厘米)的操作。
规则三:我们滤掉了轻推方向与支持平面不垂直的那些操作,保证轻推方向是水平的。
规则四:我们滤掉了那些机器人的手臂不能到达的操作。我们在每一个轻推点周围设置了一个立方体区域,如果如果其他物体进入到这个区域,那么表明这个操作是不可行的。
规则五:为了避免被推物体旋转,如果轻推的方向和该物体中心的垂直线的角度大于一个阈值,我们则滤掉这个操作。
一旦下一个最好轻推选择已经选择好了,我们就只需要让机器人对轻推点进行缓慢的操作,并沿着轻推方向慢慢发力。推的距离取决于两个因素,一个是如果检测到沿着轻推方向前方有物体会挡道那就避免推太远而碰到那些物体,另一个是如果旁边有物体,我们会倾向于保证把这两个物体尽量的分开。这些分离的操作都依赖于规则四所说 的轻推点周围的立方体区域检测。如果满足以上五条规则,我们就执行轻推操作,一般推5厘米,已经足够用于运动检测了。
在轻推的过程中,我们使用了一种结合纹理和非纹理算法的方法检测物体的运动,对于纹理追踪,通过Kinect采集到的RGB数据提取出纹理的特征,对于非纹理的情况,使用深度数据来提取几何特征,最终,我们将特征点的轨迹聚类,然后再将这些特征返回对应到物体上。
在上述步骤102中,将施加推力后的小区域图像与施加推力前的小区域图像进行比较,根据比较结果,分割图像中的物体,包括:
如果每个小区域对应物体沿着同样的轨迹运动,确定物体被正确地分割,这些小区域属于同一物体;
如果多个小区域对应物体被聚类成几个不同的分类,确定物体处于欠分割状态;
如果多个小区域对应物体是联动的,确定物体被过分割。
具体实施时,基于追踪的结果,我们可以轻易识别到三种分割验证的情况,如图3所示,
情况一:如果追踪到特征点随着同样的轨迹运动,表明被正确分割,如图3(a)所示。
情况二:如果一个猜测物体的特征点被聚类成几个不同的分类,那么表示这个假设物体处于欠分割状态,如图3(b)所示。
情况三:如果多个假想物体的特征点是联动的,那么预示着这多个假想物体被过分割,而应该合并为一个,如图3(c)所示。
下面介绍扫描的补全和验证的合并
首先,介绍物体级的扫描补全:
在一个实施例中,控制机器人对施加推力后分离的物体中没有扫描完全的部分进行扫描,获取物体完整的三维数据,包括:
计算机器人下一个最佳扫描视角;
根据所述最佳扫描视角,控制机器人对施加推力后分离的物体中没有扫描完全的部分进行扫描。
具体地,为了提高被推物体的数据完整性,我们计算了一系列基于Kinect传感器点云数据的下一个最佳视角(next best view,NBV)。在计算下一个最佳扫描视角时,我们会考虑到场景、传感器和机器人平台的限制,滤掉一个无法进行扫描或机器人无法达到 的位置。最终留下的扫描视角可以有效地消除KinectFusion中的累积漂移误差,因为我们由机器人可以得到各个扫描视角之间的位置转换关系。
其次,合并验证结果:通过验证,我们将从分割和重建得到的信息增益进行合并,新的数据被融合到KinectFusion的全局体像素空间中。基于对物体进行运动跟踪的结果,我们既局部更新区域图又全局更新图的分割成本。对于正确分割的物体,我们简单地将其区域合并为一整个大区域并更新区域图;对于错误分割的物体,因为精细扫描会更新表面重建的结果,我们选择在扫描完成后对移动的物体重新进行区域级的分割,然后把新的区域加入到区域图,代替那些原来的假设。
在一个实施例中,根据比较结果,分割图像中的物体,包括:
确定图像中物体分割的准确度;
根据物体分割的准确度,分割图像中的物体。
上述准确度可以理解为:分割和重建的不确定性。本发明实施中提到的信息增益指的是:信息(例如:分割信息等)不确定性的减少量,熵值指的是:信息(例如:分割和重建信息等)不确定性的大小。信息增益和熵值均与上述提到的准确度具有相应的意义。
具体实施时,介绍终止条件:当下一次最佳轻推的信息增益小于一个预先设定的阈值时,分析—验证的迭代过程终止。然而,我们的方法不能保证在有限次数内的收敛,这是因为基于机器人交互的验证方法不能保证同时减少分割和重建的不确定性。比如,对某些复杂的情况,一次轻推有时可能会带来新的物体间相互遮挡;另一方面,精细扫描总是能减少重建的不确定性,因为提高了数据的完整性。因此,我们的终止条件可以包括以下三种:
情况一:下一次轻推的最大信息增益小于当前的熵值。
情况二:已经没有可执行的轻推操作可以被执行。
情况三:每个感兴趣区域达到了30次轻推次数的上限。
基于同一发明构思,本发明实施例中还提供了一种室内场景扫描重建的装置,如下面的实施例所述。由于室内场景扫描重建的装置解决问题的原理与室内场景扫描重建的方法相似,因此室内场景扫描重建的装置的实施可以参见室内场景扫描重建的方法的实施,重复之处不再赘述。以下所使用的,术语“单元”或者“模块”可以实现预定功能的软件和/或硬件的组合。尽管以下实施例所描述的装置较佳地以软件来实现,但是硬件,或者软件和硬件的组合的实现也是可能并被构想的。
图4是本发明实施例中室内场景扫描重建的装置的结构示意图,如图4所示,该装置,包括:
初步扫描重建模块02,用于获得位于室内空间的机器人捕获的所述室内空间的扫描图像信息,根据所述扫描图像信息重建所述室内空间的三维场景模型图;
分割扫描处理模块04,用于将所述三维场景模型图分割为多个兴趣区域;对于每个兴趣区域,均执行以下操作:将兴趣区域分割为多个小区域;控制所述机器人对每个小区域对应物体施加推力,获得施加推力后的所述多个小区域的图像;将施加推力后的小区域图像与施加推力前的小区域图像进行比较,根据比较结果,分割图像中的物体,以及控制机器人对施加推力后分离的物体中没有扫描完全的部分进行扫描,获取物体完整的三维数据;
重建模块06,根据分割出的物体和物体的完整三维数据,对室内场景进行重建。
在一个实施例中,分割扫描处理模块具体用于根据以下情况,分割图像中的物体:
如果每个小区域对应物体沿着同样的轨迹运动,确定物体被正确地分割,这些小区域属于同一物体;
如果多个小区域对应物体被聚类成几个不同的分类,确定物体处于欠分割状态;
如果多个小区域对应物体是联动的,确定物体被过分割。
本发明实施例提供技术方案的有益技术效果为:
(1)本发明的方法能够获得室内场景更加完整的数据,且能够处理更加复杂、尺度更大的场景。
(2)分割准确率高。分在相同交互次数(10次左右)、相同场景(桌面上的日常物品)的情况下,现有方法的分割准确率只有70%~80%,而本发明提出的方法能达到90%左右。
(3)交互效率高。本发明的方法对于含有20-30个物体的大场景进行12左右的交互就能得到90%左右的分割准确率。
(4)现有工作都没有考虑如何重建分割出来的物体。而本发明的方法既对场景进行了物体级别的分割,又对场景中的物体进行高质量的重建。
本发明经过多个实验证明其可行性,能广泛适用于各种场景,特别是能对大尺度、复杂场景进行处理,这点是现有方法所不能实现的。从大量的实验结果可以看出,本发明提出的方法能高效鲁棒地对室内场景进行物体级别的重建和分割。特别要说明的,本发明提出的这一框架,能广泛适用于多种机器人平台,不受机器人硬件平台的影响。
显然,本领域的技术人员应该明白,上述的本发明实施例的各模块、装置或各步骤可以用通用的计算装置来实现,它们可以集中在单个的计算装置上,或者分布在多个计算装置所组成的网络上,可选地,它们可以用计算装置可执行的程序代码来实现,从而,可以将它们存储在存储装置中由计算装置来执行,并且在某些情况下,可以以不同于此处的顺序执行所示出或描述的步骤,或者将它们分别制作成各个集成电路模块,或者将它们中的多个模块或步骤制作成单个集成电路模块来实现。这样,本发明实施例不限制于任何特定的硬件和软件结合。
以上仅为本发明的优选实施例而已,并不用于限制本发明,对于本领域的技术人员来说,本发明实施例可以有各种更改和变化。凡在本发明的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本发明的保护范围之内。

Claims (12)

  1. 一种室内场景扫描重建的方法,其特征在于,包括:
    获得位于室内空间的机器人捕获的所述室内空间的扫描图像信息,根据所述扫描图像信息重建所述室内空间的三维场景模型图;
    将所述三维场景模型图分割为多个兴趣区域;对于每个兴趣区域,均执行以下操作:将兴趣区域分割为多个小区域;控制所述机器人对每个小区域对应物体施加推力,获得施加推力后的所述多个小区域的图像;将施加推力后的小区域图像与施加推力前的小区域图像进行比较,根据比较结果,分割图像中的物体,以及控制机器人对施加推力后分离的物体中没有扫描完全的部分进行扫描,获取物体完整的三维数据;
    根据分割出的物体和物体的完整三维数据,对室内场景进行重建。
  2. 如权利要求1所述的室内场景扫描重建的方法,其特征在于,将施加推力后的小区域图像与施加推力前的小区域图像进行比较,根据比较结果,分割图像中的物体,包括:
    如果每个小区域对应物体沿着同样的轨迹运动,确定物体被正确地分割,这些小区域属于同一物体;
    如果多个小区域对应物体被聚类成几个不同的分类,确定物体处于欠分割状态;
    如果多个小区域对应物体是联动的,确定物体被过分割。
  3. 如权利要求1所述的室内场景扫描重建的方法,其特征在于,控制机器人对施加推力后分离的物体中没有扫描完全的部分进行扫描,获取物体完整的三维数据,包括:
    计算机器人下一个最佳扫描视角;
    根据所述最佳扫描视角,控制机器人对施加推力后分离的物体中没有扫描完全的部分进行扫描。
  4. 如权利要求1所述的室内场景扫描重建的方法,其特征在于,施加推力的操作不包括:
    施加推力点位高于物体预设高度的操作;
    施加推力方向使得两个物体挨得太近的操作;
    施加推力方向与支持平面不垂直的操作;
    机器人的手臂不能到达位置的操作;
    施加推力方向和被施加推力物体中心的垂直线之间夹角角度大于预设角度阈值的操作。
  5. 如权利要求1所述的室内场景扫描重建的方法,其特征在于,施加推力的方向为水平方向。
  6. 如权利要求1所述的室内场景扫描重建的方法,其特征在于,根据比较结果,分割图像中的物体,包括:
    确定图像中物体分割的准确度;
    根据物体分割的准确度,分割图像中的物体。
  7. 如权利要求6所述的室内场景扫描重建的方法,其特征在于,机器人施加推力的操作的终止条件包括以下条件之一:
    机器人下一次施加推力对应的物体分割准确度小于当前施加推力对应的物体分割的准确度;
    机器人已经没有可执行的施加推力操作;
    机器人对每个兴趣区域内对应物品的施加推力次数达到了预设次数。
  8. 如权利要求1所述的室内场景扫描重建的方法,其特征在于,利用如下公式,分割图像中的物体:
    E(X;Ps)=∑u∈VpEd(xu;Ps)+∑(u,v)∈εpEs(xu,xv)；
    其中，Ps表示一块选定并标记为前景的小区域，X为小区域的标签，X=[x1,...,xn]，xi∈{0,1}，n为小区域的数量；xu表示第u个小区域的标签，xv为第v个小区域的标签；Vp和εp为过分割产生的表示各区域的节点和各区域之间邻接关系的边；第一项Ed(xu;Ps)是数据项：当xu=0，且u=s时，Ed(xu;Ps)为无穷大；当xu=1，且u≠s时，Ed(xu;Ps)等于fu；其余情况下，Ed(xu;Ps)等于0；fu是一个背景的惩罚项，设定当一个区域和一个已标定为前景的区域的距离大于预设阈值的时候，就对该区域加上一个惩罚项，表明这个不是需要的前景；当Ps和Pu的距离大于阈值λ时，fu=k(d(Ps,Pu)-λ)，否则fu=0；使用k=2.0作为步长惩罚距离大于预设阈值，但是却被标记为前景的区域。
  9. 如权利要求6所述的室内场景扫描重建的方法,其特征在于,利用如下公式,分割图像中的物体:
    E(L)=∑u∈VpEd(lu;Pu)+∑(u,v)∈εpEs(lu,lv)；
    其中,L为所有小区域的标签,L=[l1,...,ln],lu∈{1,...,k};Vp和εp为过分割产生的表示各区域的节点和各区域之间邻接关系的边;数据项Ed(lu;Pu)被定义为区域Pu属于某个特定的物体的相似度,对于区域Pu和第i个物体Hi,数据项被定义为区域Pu被属于物体Hi的所有区域覆盖的比例,记作Ci
  10. 如权利要求1所述的室内场景扫描重建的方法,其特征在于,根据分割出的物体和物体的完整三维数据,利用如下公式,对室内场景进行重建:
    min H(S,R)=H(S)+H(S|R)；
    其中，S和R分别用来描述当前兴趣区域的分割和重建结果的随机变量，H(S)代表分割熵，H(S|R)代表基于分割的重建熵；我们将S离散化为区域图Gp的一系列可能分割，并用S(Gp)表示，所以一种分割Si∈S(Gp)能用来表示区域图Gp所有边的联合剪边概率；e为区域图Gp中一条边，pc(e)表示该边被剪断的概率；Ω表示用过零点的非平面iso表面均匀采样得到的iso点的集合；区域Pu被属于物体Hi的所有区域覆盖的比例，记作Ci；ε0(S)为图中关于S的边的集合。
  11. 一种室内场景扫描重建的装置,其特征在于,包括:
    初步扫描重建模块,用于获得位于室内空间的机器人捕获的所述室内空间的扫描图像信息,根据所述扫描图像信息重建所述室内空间的三维场景模型图;
    分割扫描处理模块,用于将所述三维场景模型图分割为多个兴趣区域;对于每个兴趣区域,均执行以下操作:将兴趣区域分割为多个小区域;控制所述机器人对每个小区域对应物体施加推力,获得施加推力后的所述多个小区域的图像;将施加推力后的小区域图像与施加推力前的小区域图像进行比较,根据比较结果,分割图像中的物体,以及控制机器人对施加推力后分离的物体中没有扫描完全的部分进行扫描,获取物体完整的三维数据;
    重建模块,根据分割出的物体和物体的完整三维数据,对室内场景进行重建。
  12. 如权利要求11所述的室内场景扫描重建的装置,其特征在于,所述分割扫描处理模块具体用于根据以下情况,分割图像中的物体:
    如果每个小区域对应物体沿着同样的轨迹运动,确定物体被正确地分割,这些小区域属于同一物体;
    如果多个小区域对应物体被聚类成几个不同的分类,确定物体处于欠分割状态;
    如果多个小区域对应物体是联动的,确定物体被过分割。
PCT/CN2015/094295 2015-11-11 2015-11-11 室内场景扫描重建的方法及装置 WO2017079918A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/094295 WO2017079918A1 (zh) 2015-11-11 2015-11-11 室内场景扫描重建的方法及装置

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/094295 WO2017079918A1 (zh) 2015-11-11 2015-11-11 室内场景扫描重建的方法及装置

Publications (1)

Publication Number Publication Date
WO2017079918A1 true WO2017079918A1 (zh) 2017-05-18

Family

ID=58694658

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/094295 WO2017079918A1 (zh) 2015-11-11 2015-11-11 室内场景扫描重建的方法及装置

Country Status (1)

Country Link
WO (1) WO2017079918A1 (zh)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109325998A (zh) * 2018-10-08 2019-02-12 香港理工大学 一种基于点云数据的室内3d建模方法、系统及相关装置
CN111113424A (zh) * 2019-12-31 2020-05-08 芜湖哈特机器人产业技术研究院有限公司 一种基于三维视觉的机器人离线编程系统
CN111340939A (zh) * 2020-02-21 2020-06-26 广东工业大学 一种室内三维语义地图构建方法
CN111738906A (zh) * 2020-05-28 2020-10-02 北京三快在线科技有限公司 室内路网生成方法、装置、存储介质及电子设备
CN111829579A (zh) * 2020-06-02 2020-10-27 深圳全景空间工业有限公司 一种室内空间重建的方法
CN112365580A (zh) * 2020-11-16 2021-02-12 同济大学 一种面向人机技能传授的虚拟作业演示系统
CN112633069A (zh) * 2020-11-26 2021-04-09 贝壳技术有限公司 物体检测方法及装置
CN114092791A (zh) * 2021-11-19 2022-02-25 济南大学 一种基于场景感知的人机协同方法、系统及机器人
CN116402984A (zh) * 2023-02-28 2023-07-07 北京优酷科技有限公司 三维模型处理方法、装置及电子设备

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103278170A (zh) * 2013-05-16 2013-09-04 东南大学 基于显著场景点检测的移动机器人级联地图创建方法
WO2014091307A2 (en) * 2012-12-14 2014-06-19 Faro Technologies, Inc. Method for optically scanning and measuring an environment
CN103914875A (zh) * 2014-04-17 2014-07-09 中国科学院深圳先进技术研究院 室内场景的功能性建模方法
CN104732587A (zh) * 2015-04-14 2015-06-24 中国科学技术大学 一种基于深度传感器的室内3d语义地图构建方法

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014091307A2 (en) * 2012-12-14 2014-06-19 Faro Technologies, Inc. Method for optically scanning and measuring an environment
CN103278170A (zh) * 2013-05-16 2013-09-04 东南大学 基于显著场景点检测的移动机器人级联地图创建方法
CN103914875A (zh) * 2014-04-17 2014-07-09 中国科学院深圳先进技术研究院 室内场景的功能性建模方法
CN104732587A (zh) * 2015-04-14 2015-06-24 中国科学技术大学 一种基于深度传感器的室内3d语义地图构建方法

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109325998A (zh) * 2018-10-08 2019-02-12 香港理工大学 一种基于点云数据的室内3d建模方法、系统及相关装置
CN109325998B (zh) * 2018-10-08 2023-06-30 香港理工大学 一种基于点云数据的室内3d建模方法、系统及相关装置
CN111113424A (zh) * 2019-12-31 2020-05-08 芜湖哈特机器人产业技术研究院有限公司 一种基于三维视觉的机器人离线编程系统
CN111340939A (zh) * 2020-02-21 2020-06-26 广东工业大学 一种室内三维语义地图构建方法
CN111340939B (zh) * 2020-02-21 2023-04-18 广东工业大学 一种室内三维语义地图构建方法
CN111738906A (zh) * 2020-05-28 2020-10-02 北京三快在线科技有限公司 室内路网生成方法、装置、存储介质及电子设备
CN111738906B (zh) * 2020-05-28 2024-04-09 北京三快在线科技有限公司 室内路网生成方法、装置、存储介质及电子设备
CN111829579A (zh) * 2020-06-02 2020-10-27 深圳全景空间工业有限公司 一种室内空间重建的方法
CN111829579B (zh) * 2020-06-02 2022-05-20 深圳全景空间工业有限公司 一种室内空间重建的方法
CN112365580B (zh) * 2020-11-16 2022-10-28 同济大学 一种面向人机技能传授的虚拟作业演示系统
CN112365580A (zh) * 2020-11-16 2021-02-12 同济大学 一种面向人机技能传授的虚拟作业演示系统
CN112633069A (zh) * 2020-11-26 2021-04-09 贝壳技术有限公司 物体检测方法及装置
CN114092791A (zh) * 2021-11-19 2022-02-25 济南大学 一种基于场景感知的人机协同方法、系统及机器人
CN116402984A (zh) * 2023-02-28 2023-07-07 北京优酷科技有限公司 三维模型处理方法、装置及电子设备
CN116402984B (zh) * 2023-02-28 2024-04-16 神力视界(深圳)文化科技有限公司 三维模型处理方法、装置及电子设备

Similar Documents

Publication Publication Date Title
WO2017079918A1 (zh) 室内场景扫描重建的方法及装置
Labbé et al. Cosypose: Consistent multi-view multi-object 6d pose estimation
Li et al. A unified framework for multi-view multi-class object pose estimation
Zeng et al. 3dmatch: Learning local geometric descriptors from rgb-d reconstructions
Camplani et al. Multiple human tracking in RGB‐depth data: a survey
Tan et al. Robust monocular SLAM in dynamic environments
Lim et al. Real-time image-based 6-dof localization in large-scale environments
Xu et al. Autoscanning for coupled scene reconstruction and proactive object analysis
CN112258618A (zh) 基于先验激光点云与深度图融合的语义建图与定位方法
CN110472585B (zh) 一种基于惯导姿态轨迹信息辅助的vi-slam闭环检测方法
Azad et al. 6-DoF model-based tracking of arbitrarily shaped 3D objects
EP2959431A1 (en) Method and device for calculating a camera or object pose
CN105427293A (zh) 室内场景扫描重建的方法及装置
Bergström et al. Generating object hypotheses in natural scenes through human-robot interaction
Li et al. Hierarchical semantic parsing for object pose estimation in densely cluttered scenes
Zhuang et al. Instance segmentation based 6D pose estimation of industrial objects using point clouds for robotic bin-picking
Pu et al. Visual SLAM integration with semantic segmentation and deep learning: A review
Zillich et al. Knowing your limits-self-evaluation and prediction in object recognition
Zhu et al. A review of 6d object pose estimation
Zhang et al. Robust head tracking based on multiple cues fusion in the kernel-bayesian framework
Liang et al. A semi-direct monocular visual SLAM algorithm in complex environments
Cheng et al. A grasp pose detection scheme with an end-to-end CNN regression approach
Kok et al. Obscured tree branches segmentation and 3D reconstruction using deep learning and geometrical constraints
Nagy et al. 3D CNN based phantom object removing from mobile laser scanning data
Turk et al. Computer vision for mobile augmented reality

Legal Events

Code | Title | Details

121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 15908048; Country of ref document: EP; Kind code of ref document: A1
NENP | Non-entry into the national phase | Ref country code: DE
122 | Ep: pct application non-entry in european phase | Ref document number: 15908048; Country of ref document: EP; Kind code of ref document: A1