WO2019127102A1 - Information processing method and apparatus, cloud processing device, and computer program product - Google Patents

Information processing method and apparatus, cloud processing device, and computer program product

Info

Publication number
WO2019127102A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
semantic
dimensional
dimensional reconstruction
key frame
Application number
PCT/CN2017/119008
Other languages
French (fr)
Chinese (zh)
Inventor
王恺
廉士国
Original Assignee
深圳前海达闼云端智能科技有限公司
Application filed by 深圳前海达闼云端智能科技有限公司
Priority to CN201780002737.9A (patent CN108124489B)
Priority to PCT/CN2017/119008 (WO2019127102A1)
Publication of WO2019127102A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects

Definitions

  • The present application relates to the field of data processing technologies, and in particular to an information processing method, an apparatus, a cloud processing device, and a computer program product.
  • Semantic map construction means that a computer or similar device perceives and understands its environment from sensory data, analyses that data comprehensively, and extracts high-level semantic information (such as object names and locations) that the device can use for autonomous decision-making.
  • Sensory data can be acquired through key technologies such as radio frequency identification, auditory sensing, and visual sensing. At present, most research focuses on visual technology.
  • Generating a semantic map can rely on deep learning. An image perceived by the computer in real time is likely to contain multiple objects, so the image is first segmented and the objects in the segmented image are then recognized by machine learning or similar methods. This process involves a large number of image operations and takes a long time.
  • The processing methods in the prior art mainly target two-dimensional data.
  • When three-dimensional data is semantically segmented in this way, geometrically continuous segmentation results cannot be obtained. In addition, because the number of samples is limited, the types of objects that can be segmented are limited and the processing takes a long time.
  • The embodiments of the present application provide an information processing method, an apparatus, a cloud processing device, and a computer program product that can process three-dimensional data in real time and generate a three-dimensional semantic map, improving the accuracy of scene segmentation while shortening the processing time.
  • An embodiment of the present application provides an information processing method, including:
  • mapping the semantic segmentation data to the three-dimensional reconstruction data to obtain a three-dimensional semantic map.
  • Mapping the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map includes:
  • integrating the semantic information of all points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
  • Mapping the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map includes:
  • integrating the semantic information of all faces and of all connection points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
  • Extracting and processing the key frame data in the RGBD data to obtain geometric reconstruction data includes:
  • performing reconstruction according to the pose information and the D data in the key frame data to obtain the geometric reconstruction data.
  • The embodiment of the present application further provides an information processing apparatus, including:
  • an acquiring unit configured to acquire RGBD data collected by the image acquisition device;
  • an extracting unit configured to extract key frame data in the RGBD data and process it to obtain geometric reconstruction data;
  • a processing unit configured to perform mapping processing on the RGB data in the key frame data and the geometric reconstruction data to obtain three-dimensional reconstruction data, and to perform semantic segmentation on the RGB data in the key frame data to obtain semantic segmentation data;
  • a mapping unit configured to perform mapping processing on the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map.
  • The mapping unit is specifically configured to:
  • integrate the semantic information of all points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
  • The mapping unit is specifically configured to:
  • integrate the semantic information of all faces and of all connection points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
  • The extracting unit is specifically configured to:
  • perform reconstruction according to the pose information and the D data in the key frame data to obtain the geometric reconstruction data.
  • The embodiment of the present application further provides a cloud processing device. The device includes a processor and a memory; the memory is configured to store instructions which, when executed by the processor, cause the device to perform the method according to any implementation of the first aspect.
  • The embodiment of the present application further provides a computer program product that can be loaded directly into the internal memory of a computer and contains software code. After being loaded and executed by a computer, the computer program can implement the method according to any implementation of the first aspect.
  • The information processing method, apparatus, cloud processing device, and computer program product provided by the embodiments of the present application extract key frame data from the RGBD data, process the key frame data to obtain geometric reconstruction data, and then run three-dimensional reconstruction and semantic segmentation simultaneously.
  • The two processes yield the three-dimensional reconstruction data and the semantic segmentation data, respectively. Finally, the semantic segmentation data and the three-dimensional reconstruction data are mapped to obtain a three-dimensional semantic map.
  • In this way, three-dimensional reconstruction and semantic segmentation can be performed at the same time.
  • FIG. 1 is a flowchart of an embodiment of an information processing method according to an embodiment of the present application
  • FIG. 2 is a schematic structural diagram of an embodiment of an information processing apparatus according to an embodiment of the present disclosure
  • FIG. 3 is a schematic structural diagram of an embodiment of a cloud processing device according to an embodiment of the present disclosure.
  • The word "if" as used herein may be interpreted as "when", "upon", "in response to determining", or "in response to detecting".
  • Similarly, the phrase "if it is determined" or "if (a stated condition or event) is detected" may be interpreted as "when it is determined", "in response to determining", "when (the stated condition or event) is detected", or "in response to detecting (the stated condition or event)".
  • The three-dimensional semantic map consists of two parts: a three-dimensional reconstruction model obtained by reconstructing the environment, and scene recognition information obtained by precise semantic segmentation.
  • Semantic segmentation is mostly based on two-dimensional data processing. Applying the same approach to three-dimensional data cannot produce geometrically continuous segmentation results, takes a long time, and is difficult to complete in real time.
  • The embodiment of the present application provides an information processing method that reconstructs the collected environmental information in three dimensions while also extracting its semantics, and generates a three-dimensional semantic map in real time.
  • FIG. 1 is a flowchart of an embodiment of the information processing method provided by the embodiments of the present application. As shown in FIG. 1, the method may specifically include the following steps:
  • When a scene needs to be reconstructed in three dimensions in order to obtain a three-dimensional semantic map, the image acquisition device is first used to capture images of the scene. The image acquisition device needs to include an RGB camera and a depth camera, and RGBD data is obtained once acquisition is complete.
  • The computer that generates the three-dimensional semantic map may include a real-time mapping and positioning module, which is used to acquire the RGBD data collected by the image acquisition device. Specifically, the real-time mapping and positioning module may actively fetch the RGBD data, or the image acquisition device may actively send the RGBD data to the module.
  • The geometric reconstruction data may be obtained as follows. First, the pose information of the image acquisition device is calculated according to the key frame data in the RGBD data: the RGBD data corresponding to the key frames is extracted from all the RGBD data, and the pose of the image acquisition device is calculated from it. Then, reconstruction is performed according to the pose information and the D data in the key frame data to obtain the geometric reconstruction data.
  • The geometric reconstruction data can take two formats, a point cloud format and a mesh (grid) format, and the format can be selected according to actual needs.
  • The pose information and the D data in the key frame data are processed by a fast fusion algorithm to reconstruct the data in the point cloud format.
  • The fast fusion algorithm is used to process the pose information and the D data in the key frame data to reconstruct the data in the mesh format.
  • Obtaining the three-dimensional reconstruction data and obtaining the semantic segmentation data are both computationally expensive and occupy a large amount of computing resources, so the two processes are placed in different threads or executed with parallel computing.
  • The process of generating the three-dimensional reconstruction data differs according to the format of the geometric reconstruction data.
  • When the geometric reconstruction data is in the point cloud format, first find the D data corresponding to each point in the point cloud; then, according to the calibration result of the RGB camera and the depth camera, find the RGB data corresponding to each point; finally, assign the value of the corresponding RGB data to that point.
  • When the geometric reconstruction data is in the mesh format, the RGB data corresponding to the key frames is mapped onto the mesh as a texture according to an algorithm.
  • The algorithm may be, for example, nearest-neighbour sampling, bilinear interpolation, or trilinear interpolation.
  • The semantic segmentation data can be obtained by selecting among different prior-art methods according to the scenario.
  • When the geometric reconstruction data is in the point cloud format: first, determine the RGB data corresponding to each point in the three-dimensional reconstruction data; then, according to the first correspondence between the RGB data and the semantic segmentation data, determine the semantic information corresponding to each point in the three-dimensional reconstruction data; finally, integrate the semantic information of all points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
  • Let each point of the three-dimensional geometric reconstruction result be $V_P$ (where $P$ is the number of the point). The RGB value $V_C$ corresponding to each point can be obtained by looking up a table.
  • The table records the correspondence between the point number $P$ and the RGB values.
  • The semantic information of each point is determined by the first correspondence function, which relates the following quantities:
  • $V_S$ is the semantic information
  • $V_P$ is a point
  • $V_C$ is an RGB value
  • When the geometric reconstruction data is in the mesh format: first, determine the RGB data corresponding to each face in the three-dimensional reconstruction data; then, according to the second correspondence between the RGB data and the semantic segmentation data, determine the semantic information of each face in the three-dimensional reconstruction data; next, determine the faces around each connection point in the three-dimensional data and determine the semantic information of each connection point according to the semantic information of those faces; finally, integrate the semantic information of all faces and of all connection points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
  • A mesh consists of points and faces, and faces are connected by points.
  • Each face $F_j$ corresponds to a region of the RGB data, and the RGB value $F_c$ of each region can be obtained by looking up a table.
  • The table records the correspondence between the face number $j$ and the RGB values.
  • The semantic information of each face is determined by the second correspondence function, which relates the following quantities:
  • $F_s$ is the semantic information
  • $F_j$ is a face
  • $F_c$ is an RGB value
  • The semantic information of each connection point $V_i$ is obtained by applying a function $Q$ to the semantic information of the faces around it, where:
  • $V_i^s$ is the semantic information of the connection point $V_i$
  • $F_k^s$ is the semantic information of the faces around $V_i$
  • $p$ is the number of faces around $V_i$.
  • In one form, the function $Q(F_k^s)$ depends only on the semantic information $F_k^s$ of the $p$ faces around $V_i$.
  • In another form, the function $Q(F_k^s)$ additionally uses $F_k^A$, the area of the face $F_k$.
  • The information processing method provided by the embodiments of the present application extracts key frame data from the RGBD data, processes the key frame data to obtain geometric reconstruction data, and then runs three-dimensional reconstruction and semantic segmentation simultaneously to obtain three-dimensional reconstruction data and semantic segmentation data, respectively. Finally, the semantic segmentation data and the three-dimensional reconstruction data are mapped to obtain a three-dimensional semantic map.
  • Because three-dimensional reconstruction and semantic segmentation are performed simultaneously, the method can reconstruct the scene in three dimensions from the RGBD data and obtain semantic information at the same time.
  • This shortens the calculation time, improves the accuracy of scene segmentation, and allows the three-dimensional map to be generated in real time. It solves the prior-art problem that, when three-dimensional data is semantically segmented, the types of objects that can be segmented are limited and the processing takes a long time.
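  • FIG. 2 is a schematic structural diagram of an embodiment of an information processing apparatus according to an embodiment of the present application. As shown in FIG. 2, the apparatus of this embodiment may include an acquiring unit 11, an extracting unit 12, a processing unit 13, and a mapping unit 14.
  • The acquiring unit 11 is configured to acquire RGBD data collected by the image acquisition device.
  • The extracting unit 12 is configured to extract key frame data in the RGBD data and process it to obtain geometric reconstruction data.
  • The processing unit 13 is configured to perform mapping processing on the RGB data in the key frame data and the geometric reconstruction data to obtain three-dimensional reconstruction data, and to perform semantic segmentation on the RGB data in the key frame data to obtain semantic segmentation data.
  • The mapping unit 14 is configured to perform mapping processing on the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map.
  • In one implementation, the mapping unit 14 is specifically configured to: determine the RGB data corresponding to each point in the three-dimensional reconstruction data; determine the semantic information of each point according to the first correspondence between the RGB data and the semantic segmentation data; and integrate the semantic information of all points to obtain the three-dimensional semantic map.
  • In another implementation, the mapping unit 14 is specifically configured to: determine the RGB data corresponding to each face in the three-dimensional reconstruction data; determine the semantic information of each face according to the second correspondence between the RGB data and the semantic segmentation data; determine the faces around each connection point; determine the semantic information of each connection point according to the semantic information of those faces; and integrate the semantic information of all faces and of all connection points to obtain the three-dimensional semantic map.
  • The extracting unit 12 is specifically configured to: calculate the pose information of the image acquisition device according to the key frame data in the RGBD data; and perform reconstruction according to the pose information and the D data in the key frame data to obtain the geometric reconstruction data.
  • The information processing apparatus provided by the embodiment of the present application may be used to implement the technical solution of the method embodiment shown in FIG. 1. Its implementation principle and technical effects are similar and are not described again here.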
  • FIG. 3 is a schematic structural diagram of an embodiment of a cloud processing device according to an embodiment of the present disclosure.
  • the cloud processing device includes a processor 21 and a memory 22; the memory 22 is for storing instructions that, when executed by the processor 21, cause the device to perform any of the methods described above.
  • the cloud processing device provided by the embodiment of the present application may be used to implement the technical solution of the method embodiment shown in FIG. 1 , and the implementation principle and the technical effect are similar, and details are not described herein again.
  • The embodiment of the present application further provides a computer program product that can be loaded directly into the internal memory of a computer and contains software code. After being loaded and executed by a computer, the computer program can implement any of the methods described above.
  • the cloud processing device provided by the embodiment of the present application may be used to implement the technical solution of the method embodiment shown in FIG. 1 , and the implementation principle and the technical effect are similar, and details are not described herein again.
  • The aforementioned program can be stored in a computer-readable storage medium.
  • When executed, the program performs the steps of the foregoing method embodiments. The storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
  • The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across at least two network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. Those of ordinary skill in the art can understand and implement the embodiments without creative effort.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Generation (AREA)
  • Processing Or Creating Images (AREA)

Abstract

An information processing method and apparatus, a cloud processing device, and a computer program product, applied to the technical field of data processing. The information processing method comprises: obtaining RGBD data collected by an image collection device (101); extracting and processing key frame data in the RGBD data to obtain geometric reconstruction data (102); mapping RGB data in the key frame data and the geometric reconstruction data to obtain three-dimensional reconstruction data, and performing semantic segmentation on the RGB data in the key frame data to obtain semantic segmentation data (103); and mapping the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map (104). With this method, three-dimensional reconstruction and semantic segmentation can be performed at the same time: the scene is reconstructed in three dimensions from the RGBD data while semantic information is obtained, which shortens the calculation time and improves the precision of scene segmentation.

Description

Information processing method, apparatus, cloud processing device, and computer program product
Technical field
The present application relates to the field of data processing technologies, and in particular to an information processing method, an apparatus, a cloud processing device, and a computer program product.
Background
Semantic map construction means that a computer or similar device perceives and understands its environment from sensory data, analyses that data comprehensively, and extracts high-level semantic information (such as object names and locations) that the device can use for autonomous decision-making. Sensory data can be acquired through key technologies such as radio frequency identification, auditory sensing, and visual sensing. At present, most research focuses on visual technology.
The concrete process of generating a semantic map can rely on deep learning. An image perceived by the computer in real time is likely to contain multiple objects, so the image is first segmented and the objects in the segmented image are then recognized by machine learning or similar methods. This process involves a large number of image operations and takes a long time.
However, the processing methods in the prior art mainly target two-dimensional data. When three-dimensional data is semantically segmented in this way, geometrically continuous segmentation results cannot be obtained. In addition, limited by the number of samples, the types of objects that can be segmented are limited and the processing takes a long time.
Summary of the invention
The embodiments of the present application provide an information processing method, an apparatus, a cloud processing device, and a computer program product that can process three-dimensional data in real time and generate a three-dimensional semantic map, improving the accuracy of scene segmentation while shortening the processing time.
In a first aspect, an embodiment of the present application provides an information processing method, including:
acquiring RGBD data collected by an image acquisition device;
extracting key frame data from the RGBD data and processing it to obtain geometric reconstruction data;
performing mapping processing on the RGB data in the key frame data and the geometric reconstruction data to obtain three-dimensional reconstruction data, and performing semantic segmentation on the RGB data in the key frame data to obtain semantic segmentation data;
performing mapping processing on the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map.
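The four steps of the first aspect can be sketched as a small pipeline. This is a minimal illustration, not the implementation described in the application: the function `build_semantic_map` and the five callables passed to it are hypothetical names, and each stage is left as a parameter because this section does not prescribe a particular algorithm for it.

```python
from typing import Any, Callable, Iterable, List


def build_semantic_map(
    rgbd_frames: Iterable[Any],
    is_key_frame: Callable[[Any], bool],
    reconstruct_geometry: Callable[[List[Any]], Any],
    map_rgb_to_geometry: Callable[[List[Any], Any], Any],
    segment_rgb: Callable[[List[Any]], Any],
    map_labels_to_reconstruction: Callable[[Any, Any], Any],
) -> Any:
    """Chain the four steps; every stage is a caller-supplied (hypothetical) function."""
    # Step 2: keep only the key frames and reconstruct geometry from them.
    key_frames = [frame for frame in rgbd_frames if is_key_frame(frame)]
    geometry = reconstruct_geometry(key_frames)
    # Step 3: colour the geometry and segment the RGB images.
    reconstruction = map_rgb_to_geometry(key_frames, geometry)
    segmentation = segment_rgb(key_frames)
    # Step 4: attach the semantic labels to the coloured reconstruction.
    return map_labels_to_reconstruction(segmentation, reconstruction)
```

Passing the stages as callables keeps the sketch independent of any specific reconstruction or segmentation library.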
With reference to the foregoing aspect and any possible implementation, a further implementation is provided, in which performing mapping processing on the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map includes:
determining the RGB data corresponding to each point in the three-dimensional reconstruction data;
determining, according to a first correspondence between the RGB data and the semantic segmentation data, the semantic information corresponding to each point in the three-dimensional reconstruction data;
integrating the semantic information of all points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
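A minimal sketch of this point-based mapping follows. It assumes that the semantic segmentation data is a per-pixel label image aligned with each key frame's RGB image, and that each point records which key-frame pixel its colour came from. Both assumptions, and the names `label_points` and `point_to_pixel`, are illustrative and do not come from the application.

```python
from typing import Dict, List, Tuple

# (key frame index, row, column) of the pixel that coloured a point (assumed layout).
PixelRef = Tuple[int, int, int]


def label_points(
    point_to_pixel: Dict[int, PixelRef],
    label_images: List[List[List[str]]],
) -> Dict[int, str]:
    """Look up, for every point number, the label of the pixel it was coloured from."""
    labels: Dict[int, str] = {}
    for point_id, (frame, row, col) in point_to_pixel.items():
        labels[point_id] = label_images[frame][row][col]
    return labels
```

The returned dictionary, taken together with the coloured points, plays the role of the integrated semantic information of all points.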
With reference to the foregoing aspect and any possible implementation, a further implementation is provided,
in which performing mapping processing on the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map includes:
determining the RGB data corresponding to each face in the three-dimensional reconstruction data;
determining, according to a second correspondence between the RGB data and the semantic segmentation data, the semantic information corresponding to each face in the three-dimensional reconstruction data;
determining the faces around each connection point in the three-dimensional data;
determining the semantic information of each connection point according to the semantic information corresponding to each face;
integrating the semantic information of all faces and of all connection points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
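One way to derive the semantic information of a connection point (a mesh vertex) from its surrounding faces is a vote among those faces. The sketch below shows an unweighted majority vote and an area-weighted vote. Both rules, the tie-breaking choice, and the name `vertex_labels` are assumptions made for illustration, since this section does not state the combining rule.

```python
from collections import Counter, defaultdict
from typing import Dict, Optional, Sequence


def vertex_labels(
    faces: Sequence[Sequence[int]],
    face_labels: Sequence[str],
    face_areas: Optional[Sequence[float]] = None,
) -> Dict[int, str]:
    """Give each vertex the label with the largest (optionally area-weighted) vote."""
    votes: Dict[int, Counter] = defaultdict(Counter)
    for k, face in enumerate(faces):
        # Without areas every surrounding face counts equally.
        weight = 1.0 if face_areas is None else face_areas[k]
        for vertex in face:
            votes[vertex][face_labels[k]] += weight
    # Ties are broken alphabetically so the result is deterministic (an arbitrary choice).
    return {v: max(sorted(tally), key=tally.__getitem__) for v, tally in votes.items()}
```

With area weighting, one large face can outvote several small ones, which may be preferable when a mesh mixes very different triangle sizes.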
With reference to the foregoing aspect and any possible implementation, a further implementation is provided,
in which extracting key frame data from the RGBD data and processing it to obtain geometric reconstruction data includes:
calculating pose information of the image acquisition device according to the key frame data in the RGBD data;
performing reconstruction according to the pose information and the D data in the key frame data to obtain the geometric reconstruction data.
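The reconstruction step can be illustrated by back-projecting one depth image into world coordinates with a known camera pose. This sketch assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy` and a camera-to-world pose given as a 3x3 rotation and a translation. It only produces an unfused point cloud for a single key frame and is not the fusion algorithm used in the application.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def back_project(
    depth: Sequence[Sequence[float]],
    fx: float, fy: float, cx: float, cy: float,
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
) -> List[Point]:
    """Convert valid depth pixels to 3D points in the world frame."""
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # skip pixels without a depth measurement
                continue
            # Pinhole model: pixel (u, v) at depth z in camera coordinates.
            cam = ((u - cx) * z / fx, (v - cy) * z / fy, z)
            # Apply the camera-to-world pose: world = R * cam + t.
            world = tuple(
                sum(rotation[i][j] * cam[j] for j in range(3)) + translation[i]
                for i in range(3)
            )
            points.append(world)
    return points
```

Repeating this for every key frame and merging the results would give a raw point cloud; a fusion algorithm would additionally average overlapping measurements.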
In a second aspect, an embodiment of the present application further provides an information processing apparatus, including:
an acquiring unit, configured to acquire RGBD data collected by an image acquisition device;
an extracting unit, configured to extract key frame data from the RGBD data and process it to obtain geometric reconstruction data;
a processing unit, configured to perform mapping processing on the RGB data in the key frame data and the geometric reconstruction data to obtain three-dimensional reconstruction data, and to perform semantic segmentation on the RGB data in the key frame data to obtain semantic segmentation data;
a mapping unit, configured to perform mapping processing on the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map.
With reference to the foregoing aspect and any possible implementation, a further implementation is provided,
in which the mapping unit is specifically configured to:
determine the RGB data corresponding to each point in the three-dimensional reconstruction data;
determine, according to a first correspondence between the RGB data and the semantic segmentation data, the semantic information corresponding to each point in the three-dimensional reconstruction data;
integrate the semantic information of all points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
With reference to the foregoing aspect and any possible implementation, a further implementation is provided,
in which the mapping unit is specifically configured to:
determine the RGB data corresponding to each face in the three-dimensional reconstruction data;
determine, according to a second correspondence between the RGB data and the semantic segmentation data, the semantic information corresponding to each face in the three-dimensional reconstruction data;
determine the faces around each connection point in the three-dimensional data;
determine the semantic information of each connection point according to the semantic information corresponding to each face;
integrate the semantic information of all faces and of all connection points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
With reference to the foregoing aspect and any possible implementation, a further implementation is provided,
in which the extracting unit is specifically configured to:
calculate pose information of the image acquisition device according to the key frame data in the RGBD data;
perform reconstruction according to the pose information and the D data in the key frame data to obtain the geometric reconstruction data.
In a third aspect, an embodiment of the present application further provides a cloud processing device. The device includes a processor and a memory; the memory is configured to store instructions which, when executed by the processor, cause the device to perform the method according to any implementation of the first aspect.
In a fourth aspect, an embodiment of the present application further provides a computer program product that can be loaded directly into the internal memory of a computer and contains software code. After being loaded and executed by the computer, the computer program can implement the method according to any implementation of the first aspect.
The information processing method, apparatus, cloud processing device, and computer program product provided by the embodiments of the present application extract key frame data from the RGBD data, process the key frame data to obtain geometric reconstruction data, and then run two processes, three-dimensional reconstruction and semantic segmentation, simultaneously to obtain three-dimensional reconstruction data and semantic segmentation data, respectively. Finally, the semantic segmentation data and the three-dimensional reconstruction data are mapped to obtain a three-dimensional semantic map. In the technical solutions provided by the embodiments of the present application, three-dimensional reconstruction and semantic segmentation are performed at the same time, so the scene can be reconstructed in three dimensions from the RGBD data while semantic information is obtained. This shortens the calculation time, improves the accuracy of scene segmentation, and allows the three-dimensional map to be generated in real time. It solves the prior-art problem that, when three-dimensional data is semantically segmented, the types of objects that can be segmented are limited and the processing takes a long time.
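Running the two processes at the same time can be sketched with a two-worker thread pool. The function `run_in_parallel` and its `reconstruct` and `segment` parameters are hypothetical placeholders for the three-dimensional reconstruction and semantic segmentation stages; the application does not specify a threading API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Tuple


def run_in_parallel(
    key_frames: Any,
    geometry: Any,
    reconstruct: Callable[[Any, Any], Any],
    segment: Callable[[Any], Any],
) -> Tuple[Any, Any]:
    """Submit both stages to separate threads and wait for both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        recon_future = pool.submit(reconstruct, key_frames, geometry)
        seg_future = pool.submit(segment, key_frames)
        # result() blocks until the corresponding stage has finished.
        return recon_future.result(), seg_future.result()
```

In CPython, threads only overlap when the work releases the global interpreter lock, as most native numerical and inference libraries do; for pure-Python stages a process pool would be the closer analogue.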
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of an embodiment of the information processing method provided by an embodiment of the present application;
FIG. 2 is a schematic structural diagram of an embodiment of the information processing apparatus provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an embodiment of the cloud processing device provided by an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present application.
The terms used in the embodiments of the present application are for the purpose of describing particular embodiments only and are not intended to limit the application. The singular forms "a", "said", and "the" used in the embodiments and the appended claims are also intended to include plural forms, unless the context clearly indicates otherwise.
It should be understood that the term "and/or" as used herein merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", "in response to determining", or "in response to detecting". Similarly, depending on the context, the phrase "if it is determined" or "if (a stated condition or event) is detected" may be interpreted as "when it is determined", "in response to determining", "when (the stated condition or event) is detected", or "in response to detecting (the stated condition or event)".
To strengthen the ability of computers and other devices to perceive and understand their surroundings, they need to be given a high-quality three-dimensional semantic map. A three-dimensional semantic map has two parts: a three-dimensional reconstruction model obtained by reconstructing an environment, and scene recognition information obtained by accurate semantic segmentation of that environment. In the prior art, semantic segmentation mostly operates on two-dimensional data; applying the same approach to three-dimensional data does not yield geometrically continuous segmentation results, takes a long time, and is hard to complete in real time. The embodiments of the present application therefore provide an information processing method that performs semantic segmentation on the collected environmental information while reconstructing it in three dimensions, so that a three-dimensional semantic map is generated in real time. Specifically, FIG. 1 is a flowchart of an embodiment of the information processing method provided by an embodiment of the present application. As shown in FIG. 1, the method may include the following steps:
101. Acquire RGBD data collected by an image acquisition device.
In the embodiments of the present application, when a scene needs to be reconstructed in three dimensions and a three-dimensional semantic map obtained, the image acquisition device first captures images of the scene. The image acquisition device must include an RGB camera and a depth camera, and the RGBD data is obtained once acquisition is complete. In a specific implementation, the computer used to generate the three-dimensional semantic map may include a real-time mapping and localization module that obtains the RGBD data collected by the image acquisition device. Either the module actively fetches the RGBD data, or the image acquisition device actively sends the RGBD data to the module.
102. Extract key frame data from the RGBD data and process it to obtain geometric reconstruction data.
In the embodiments of the present application, the geometric reconstruction data may be obtained as follows. First, the pose information of the image acquisition device is calculated from the key frame data in the RGBD data: the RGBD data corresponding to the key frames is extracted from all of the RGBD data, and the pose of the image acquisition device is calculated from it. Then, reconstruction is performed from the pose information and the D (depth) data in the key frame data to obtain the geometric reconstruction data.
The geometric reconstruction data may take one of two formats, a point cloud format or a mesh format; either may be chosen according to actual needs. For example, in one specific implementation, the fastfusion algorithm processes the pose information and the D data in the key frame data to reconstruct data in point cloud format. In another specific implementation, the fastfusion algorithm processes the pose information and the D data in the key frame data to reconstruct data in mesh format.
In the embodiments of the present application, there are at least two key frames in the reconstruction process, so the data of all key frames must be used together for reconstruction.
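As an illustration of how pose information and depth data from several key frames can be combined into one point cloud, the following is a minimal sketch. It is not the fastfusion algorithm named in the text. It assumes a pinhole depth camera with intrinsics `fx`, `fy`, `cx`, `cy` and a 4x4 camera-to-world pose matrix per key frame; these names and the function names are hypothetical.

```python
def backproject(depth, fx, fy, cx, cy, pose):
    """Turn one depth image into world-space points.

    depth: list of rows of depth values (0 or less means no measurement).
    pose:  4x4 camera-to-world matrix as nested lists (assumed convention).
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # skip pixels without a valid depth
            # Pinhole model: pixel (u, v) at depth z -> camera coordinates.
            cam = ((u - cx) * z / fx, (v - cy) * z / fy, z, 1.0)
            # Apply the key frame pose to move the point into the world frame.
            world = tuple(
                sum(pose[r][c] * cam[c] for c in range(4)) for r in range(3)
            )
            points.append(world)
    return points


def fuse_keyframes(keyframes, fx, fy, cx, cy):
    """Merge the back-projected points of all key frames into one cloud.

    keyframes: iterable of (depth_image, pose) pairs.
    """
    cloud = []
    for depth, pose in keyframes:
        cloud.extend(backproject(depth, fx, fy, cx, cy, pose))
    return cloud
```

A real system would also filter noise and merge overlapping surfaces; this sketch only shows why every key frame needs its own pose before its depth data can be combined with the others.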
103. Perform mapping processing on the RGB data in the key frame data and the geometric reconstruction data to obtain three-dimensional reconstruction data; and perform semantic segmentation on the RGB data in the key frame data to obtain semantic segmentation data.
In the embodiments of the present application, obtaining the three-dimensional reconstruction data and obtaining the semantic segmentation data are both computationally expensive and consume considerable computing resources. The two processes are therefore run in separate threads or carried out by parallel computing.
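Running the two processes side by side can be sketched with Python's standard `concurrent.futures` module. The callables `color_fn` and `segment_fn` are hypothetical placeholders for the texture/color mapping and the semantic segmentation; the source does not specify either implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(keyframes, geometry, color_fn, segment_fn):
    """Start reconstruction and segmentation together and wait for both.

    color_fn(keyframes, geometry) -> three-dimensional reconstruction data
    segment_fn(keyframes)         -> semantic segmentation data
    Both callables are assumptions made for this sketch.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        recon_future = pool.submit(color_fn, keyframes, geometry)
        seg_future = pool.submit(segment_fn, keyframes)
        # result() blocks until the corresponding task has finished.
        return recon_future.result(), seg_future.result()
```

In standard CPython builds, threads only overlap when the work releases the interpreter lock (for example, native or GPU code), so a process pool or a native implementation may be the better fit for CPU-bound pure-Python work.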
Because the geometric reconstruction data may take two formats, the process of generating the three-dimensional reconstruction data differs by format. When the geometric reconstruction data is in point cloud format, the D data corresponding to each point in the point cloud is found first; then, using the calibration result between the RGB camera and the depth camera, the RGB data corresponding to each point is found; finally, the value of that RGB data is assigned to the point. When the geometric reconstruction data is in mesh format, the RGB data corresponding to the key frames is mapped onto the mesh as a texture according to an algorithm. In a specific implementation, the algorithm may be nearest-neighbor sampling, bilinear interpolation, trilinear interpolation, or the like.
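The point cloud case can be sketched as follows, assuming the calibration result is available as a 4x4 depth-to-RGB extrinsic matrix plus pinhole intrinsics of the RGB camera, and using nearest-pixel sampling. All parameter names are hypothetical and the points are assumed to be expressed in the depth camera frame.

```python
def colorize(points, rgb_image, fx, fy, cx, cy, depth_to_rgb):
    """Assign each point the RGB value of the pixel it projects onto.

    points:       (x, y, z) tuples in the depth camera frame (assumption).
    rgb_image:    list of rows of (r, g, b) tuples.
    depth_to_rgb: 4x4 extrinsic matrix from depth-to-RGB calibration.
    Returns a list of (point, color) pairs; color is None when the point
    falls outside the RGB image or behind the RGB camera.
    """
    height, width = len(rgb_image), len(rgb_image[0])
    colored = []
    for x, y, z in points:
        p = (x, y, z, 1.0)
        # Move the point into the RGB camera frame.
        xr, yr, zr = (
            sum(depth_to_rgb[r][c] * p[c] for c in range(4)) for r in range(3)
        )
        color = None
        if zr > 0:
            # Project with the RGB intrinsics and take the nearest pixel.
            u = int(round(fx * xr / zr + cx))
            v = int(round(fy * yr / zr + cy))
            if 0 <= u < width and 0 <= v < height:
                color = rgb_image[v][u]
        colored.append(((x, y, z), color))
    return colored
```

Bilinear interpolation would replace the nearest-pixel lookup with a weighted blend of the four surrounding pixels.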
The semantic segmentation data may be obtained with any suitable prior-art method, chosen according to the scene.
104. Perform mapping processing on the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map.
Because the geometric reconstruction data may take different formats, this step is described separately for each format.
When the geometric reconstruction data is in point cloud format: first, the RGB data corresponding to each point in the three-dimensional reconstruction data is determined; then, the semantic information corresponding to each point is determined according to a first correspondence between the RGB data and the semantic segmentation data; finally, the semantic information of all points in the three-dimensional reconstruction data is integrated to obtain the three-dimensional semantic map.
To describe this procedure in more detail, the embodiments express it with formulas. Let each point of the three-dimensional geometric reconstruction result be $V_P$, where $P$ is the index of the point. The RGB value $V_C$ corresponding to each point can be obtained by looking up a table $\Omega$, which records the correspondence between the index $P$ and the RGB value. The semantic information of each point is determined by a first correspondence function:
$$F(V_P, V_C) = V_S$$
where $V_S$ is the semantic information, $V_P$ is the point, and $V_C$ is its RGB value.
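One way to read the first correspondence function is as a lookup: if the table for each point also remembers which key frame pixel supplied its RGB value, the label at the same pixel of the segmentation result is the point's semantic information. The sketch below rests on that assumption; the layout of `omega` and `segmentation` is hypothetical and not given in the source.

```python
def label_points(omega, segmentation):
    """Look up a semantic label for every point index P.

    omega:        dict P -> (frame_id, u, v), the key frame pixel that
                  supplied the point's RGB value (assumed table layout).
    segmentation: dict frame_id -> 2D list of labels, one per RGB pixel.
    Returns a dict P -> label.
    """
    return {
        p: segmentation[frame_id][v][u]
        for p, (frame_id, u, v) in omega.items()
    }
```

Integrating the semantic information of all points then amounts to storing this dictionary alongside the colored point cloud.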
When the geometric reconstruction data is in mesh format: first, the RGB data corresponding to each face in the three-dimensional reconstruction data is determined; then, the semantic information corresponding to each face is determined according to a second correspondence between the RGB data and the semantic segmentation data; next, the faces around each connection point (mesh vertex) in the three-dimensional data are determined, and the semantic information of each connection point is determined from the semantic information of those faces; finally, the semantic information of all faces and of all connection points in the three-dimensional reconstruction data is integrated to obtain the three-dimensional semantic map.
To describe this procedure in more detail, the embodiments express it with formulas.
A mesh consists of points and faces, and each face is formed by connecting points. Suppose the three-dimensional geometric reconstruction result contains $n$ points, denoted $V_i$ ($i = 1, \dots, n$), and $m$ faces, denoted $F_j$ ($j = 1, \dots, m$), where $n$, $m$, and $j$ are positive integers. Each face $F_j$ corresponds to a region $F_c$ of the RGB data, and the RGB value $F_c$ of each region can be obtained by looking up a table $\sigma$, which records the correspondence between the index $j$ and the RGB value.
First, the semantic information of each face is determined by a second correspondence function:
$$G(F_j, F_c) = F_s$$
where $F_s$ is the semantic information, $F_j$ is the face, and $F_c$ is its RGB value.
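The source does not say how the second correspondence function reduces a whole image region to a single label. One plausible choice, shown here purely as an assumption, is a majority vote over the segmentation labels of the pixels in the face's region; `region_pixels` and `label_image` are hypothetical names.

```python
from collections import Counter


def face_label(region_pixels, label_image):
    """Pick the most frequent segmentation label inside a face's region.

    region_pixels: iterable of (u, v) pixel coordinates covered by the face.
    label_image:   2D list of labels aligned with the RGB key frame.
    """
    counts = Counter(label_image[v][u] for u, v in region_pixels)
    # most_common(1) returns a one-element list of (label, count).
    return counts.most_common(1)[0][0]
```

For example, a face covering one "floor" pixel and two "chair" pixels would be labeled "chair" under this rule.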
Then the semantic information of each connection point $V_i$ is determined. Let the faces around a connection point be $F_k$ ($k = 1, \dots, p$), and let the semantic information of $F_k$ be $F_k^s$. The semantic information of the connection point can be expressed by a function:
$$V_i^s = Q(F_k^s), \quad k = 1, \dots, p$$
where $V_i^s$ is the semantic information of $V_i$, $F_k^s$ is the semantic information of the faces around $V_i$, and $p$ is the number of faces around $V_i$.
In one specific implementation, the function $Q(F_k^s)$ may take the following form:
Figure PCTCN2017119008-appb-000001
where $F_k^s$ is the semantic information of the faces around $V_i$, and $p$ is the number of faces around $V_i$.
In another specific implementation, the function $Q(F_k^s)$ may take the following form:
Figure PCTCN2017119008-appb-000002
where $F_k^s$ is the semantic information of the faces around $V_i$, $p$ is the number of faces around $V_i$, and $F_k^A$ is the area of $F_k$.
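The two formula images for Q are not reproduced in this text, so the following is only a guess at their spirit: an unweighted vote over the labels of the surrounding faces, and a vote in which each face counts in proportion to its area $F_k^A$. The function name and the voting rule are assumptions, not the patent's formulas.

```python
from collections import defaultdict


def vertex_label(face_labels, face_areas=None):
    """Derive a connection point's label from its surrounding faces.

    face_labels: labels F_k^s of the p faces around the connection point.
    face_areas:  optional areas F_k^A; when omitted every face weighs 1.
    Ties are resolved in favor of the label seen first.
    """
    if face_areas is None:
        face_areas = [1.0] * len(face_labels)
    weights = defaultdict(float)
    for label, area in zip(face_labels, face_areas):
        weights[label] += area
    return max(weights, key=weights.get)
```

With labels `["chair", "chair", "floor"]` the unweighted vote gives "chair", while areas `[0.1, 0.1, 0.5]` tip the weighted vote to "floor", which shows how a single large face can outweigh several small ones.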
The information processing method provided by the embodiments of the present application extracts key frame data from the RGBD data and processes it to obtain geometric reconstruction data. Two processes, three-dimensional reconstruction and semantic segmentation, are then executed at the same time to obtain three-dimensional reconstruction data and semantic segmentation data respectively. Finally, the semantic segmentation data and the three-dimensional reconstruction data are mapped to obtain a three-dimensional semantic map. Because three-dimensional reconstruction and semantic segmentation run simultaneously, the technical solution both reconstructs the scene from the RGBD data and obtains semantic information at the same time. It shortens computation time while improving the accuracy of scene segmentation, so that a three-dimensional map can be generated in real time. This addresses the problems in the prior art that semantic segmentation of three-dimensional data can distinguish only a limited number of object types and takes a long time.
To implement the method described above, an embodiment of the present application further provides an information processing apparatus. FIG. 2 is a schematic structural diagram of an embodiment of the information processing apparatus provided by an embodiment of the present application. As shown in FIG. 2, the apparatus of this embodiment may include an acquiring unit 11, an extracting unit 12, a processing unit 13, and a mapping unit 14.
The acquiring unit 11 is configured to acquire RGBD data collected by an image acquisition device.
The extracting unit 12 is configured to extract key frame data from the RGBD data and process it to obtain geometric reconstruction data.
The processing unit 13 is configured to perform mapping processing on the RGB data in the key frame data and the geometric reconstruction data to obtain three-dimensional reconstruction data, and to perform semantic segmentation on the RGB data in the key frame data to obtain semantic segmentation data.
The mapping unit 14 is configured to perform mapping processing on the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map.
In one specific implementation, the mapping unit 14 is specifically configured to:
determine the RGB data corresponding to each point in the three-dimensional reconstruction data;
determine the semantic information corresponding to each point in the three-dimensional reconstruction data according to a first correspondence between the RGB data and the semantic segmentation data;
integrate the semantic information of all points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
In another specific implementation, the mapping unit 14 is specifically configured to:
determine the RGB data corresponding to each face in the three-dimensional reconstruction data;
determine the semantic information corresponding to each face in the three-dimensional reconstruction data according to a second correspondence between the RGB data and the semantic segmentation data;
determine the faces around each connection point in the three-dimensional data;
determine the semantic information of each connection point according to the semantic information corresponding to each face;
integrate the semantic information of all faces and of all connection points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
The extracting unit 12 is specifically configured to:
calculate the pose information of the image acquisition device from the key frame data in the RGBD data;
perform reconstruction from the pose information and the D data in the key frame data to obtain the geometric reconstruction data.
The information processing apparatus provided by this embodiment may be used to carry out the technical solution of the method embodiment shown in FIG. 1. Its implementation principle and technical effects are similar and are not repeated here.
To implement the method described above, an embodiment of the present application further provides a cloud processing device. FIG. 3 is a schematic structural diagram of an embodiment of the cloud processing device provided by an embodiment of the present application. As shown in FIG. 3, the cloud processing device includes a processor 21 and a memory 22; the memory 22 is configured to store instructions which, when executed by the processor 21, cause the device to perform any of the methods described above.
The cloud processing device provided by this embodiment may be used to carry out the technical solution of the method embodiment shown in FIG. 1. Its implementation principle and technical effects are similar and are not repeated here.
To implement the method described above, an embodiment of the present application further provides a computer program product that can be loaded directly into the internal memory of a computer and contains software code; once loaded and executed by the computer, the computer program implements any of the methods described above.
The cloud processing device provided by this embodiment may be used to carry out the technical solution of the method embodiment shown in FIG. 1. Its implementation principle and technical effects are similar and are not repeated here.
A person of ordinary skill in the art will understand that all or some of the steps of the method embodiments above may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments above. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
The apparatus embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over at least two network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.
Finally, it should be noted that the embodiments above are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

  1. An information processing method, characterized by comprising:
    acquiring RGBD data collected by an image acquisition device;
    extracting key frame data from the RGBD data and processing it to obtain geometric reconstruction data;
    performing mapping processing on the RGB data in the key frame data and the geometric reconstruction data to obtain three-dimensional reconstruction data; and performing semantic segmentation on the RGB data in the key frame data to obtain semantic segmentation data;
    performing mapping processing on the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map.
  2. The method according to claim 1, characterized in that performing mapping processing on the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map comprises:
    determining the RGB data corresponding to each point in the three-dimensional reconstruction data;
    determining the semantic information corresponding to each point in the three-dimensional reconstruction data according to a first correspondence between the RGB data and the semantic segmentation data;
    integrating the semantic information of all points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
  3. The method according to claim 1, characterized in that performing mapping processing on the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map comprises:
    determining the RGB data corresponding to each face in the three-dimensional reconstruction data;
    determining the semantic information corresponding to each face in the three-dimensional reconstruction data according to a second correspondence between the RGB data and the semantic segmentation data;
    determining the faces around each connection point in the three-dimensional data;
    determining the semantic information of each connection point according to the semantic information corresponding to each face;
    integrating the semantic information of all faces and of all connection points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
  4. The method according to claim 1, characterized in that extracting key frame data from the RGBD data and processing it to obtain geometric reconstruction data comprises:
    calculating the pose information of the image acquisition device from the key frame data in the RGBD data;
    performing reconstruction from the pose information and the D data in the key frame data to obtain the geometric reconstruction data.
  5. An information processing apparatus, characterized by comprising:
    an acquiring unit, configured to acquire RGBD data collected by an image acquisition device;
    an extracting unit, configured to extract key frame data from the RGBD data and process it to obtain geometric reconstruction data;
    a processing unit, configured to perform mapping processing on the RGB data in the key frame data and the geometric reconstruction data to obtain three-dimensional reconstruction data, and to perform semantic segmentation on the RGB data in the key frame data to obtain semantic segmentation data;
    a mapping unit, configured to perform mapping processing on the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map.
  6. The apparatus according to claim 5, characterized in that the mapping unit is specifically configured to:
    determine the RGB data corresponding to each point in the three-dimensional reconstruction data;
    determine the semantic information corresponding to each point in the three-dimensional reconstruction data according to a first correspondence between the RGB data and the semantic segmentation data;
    integrate the semantic information of all points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
  7. The apparatus according to claim 5, characterized in that the mapping unit is specifically configured to:
    determine the RGB data corresponding to each face in the three-dimensional reconstruction data;
    determine the semantic information corresponding to each face in the three-dimensional reconstruction data according to a second correspondence between the RGB data and the semantic segmentation data;
    determine the faces around each connection point in the three-dimensional data;
    determine the semantic information of each connection point according to the semantic information corresponding to each face;
    integrate the semantic information of all faces and of all connection points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
  8. The apparatus according to claim 5, characterized in that the extracting unit is specifically configured to:
    calculate the pose information of the image acquisition device from the key frame data in the RGBD data;
    perform reconstruction from the pose information and the D data in the key frame data to obtain the geometric reconstruction data.
  9. A cloud processing device, characterized in that the device comprises a processor and a memory; the memory is configured to store instructions which, when executed by the processor, cause the device to perform the method according to any one of claims 1 to 4.
  10. A computer program product, characterized in that it can be loaded directly into the internal memory of a computer and contains software code, and the computer program, once loaded and executed by the computer, implements the method according to any one of claims 1 to 4.
PCT/CN2017/119008 2017-12-27 2017-12-27 Information processing method and apparatus, cloud processing device, and computer program product WO2019127102A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201780002737.9A CN108124489B (en) 2017-12-27 2017-12-27 Information processing method, apparatus, cloud processing device and computer program product
PCT/CN2017/119008 WO2019127102A1 (en) 2017-12-27 2017-12-27 Information processing method and apparatus, cloud processing device, and computer program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/119008 WO2019127102A1 (en) 2017-12-27 2017-12-27 Information processing method and apparatus, cloud processing device, and computer program product

Publications (1)

Publication Number Publication Date
WO2019127102A1 (en) 2019-07-04

Family

ID=62234350

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/119008 WO2019127102A1 (en) 2017-12-27 2017-12-27 Information processing method and apparatus, cloud processing device, and computer program product

Country Status (2)

Country Link
CN (1) CN108124489B (en)
WO (1) WO2019127102A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111292340A (en) * 2020-01-23 2020-06-16 北京市商汤科技开发有限公司 Semantic segmentation method, device, equipment and computer readable storage medium

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117718B (en) * 2018-07-02 2021-11-26 东南大学 Three-dimensional semantic map construction and storage method for road scene
CN109191526B (en) * 2018-09-10 2020-07-07 杭州艾米机器人有限公司 Three-dimensional environment reconstruction method and system based on RGBD camera and optical encoder
CN109461211B (en) * 2018-11-12 2021-01-26 南京人工智能高等研究院有限公司 Semantic vector map construction method and device based on visual point cloud and electronic equipment
CN110245567B (en) * 2019-05-16 2023-04-07 达闼机器人股份有限公司 Obstacle avoidance method and device, storage medium and electronic equipment
CN113313832B (en) * 2021-05-26 2023-07-04 Oppo广东移动通信有限公司 Semantic generation method and device of three-dimensional model, storage medium and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070273696A1 (en) * 2006-04-19 2007-11-29 Sarnoff Corporation Automated Video-To-Text System
CN104732587A (en) * 2015-04-14 2015-06-24 中国科学技术大学 Depth sensor-based method of establishing indoor 3D (three-dimensional) semantic map
CN105551084A (en) * 2016-01-28 2016-05-04 北京航空航天大学 Outdoor three-dimensional scene combined construction method based on image content parsing
CN106067191A (en) * 2016-05-25 2016-11-02 深圳市寒武纪智能科技有限公司 The method and system of semantic map set up by a kind of domestic robot
CN106384383A (en) * 2016-09-08 2017-02-08 哈尔滨工程大学 RGB-D and SLAM scene reconfiguration method based on FAST and FREAK feature matching algorithm

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104424630A (en) * 2013-08-20 2015-03-18 华为技术有限公司 Three-dimension reconstruction method and device, and mobile terminal
CN107292949B (en) * 2017-05-25 2020-06-16 深圳先进技术研究院 Three-dimensional reconstruction method and device of scene and terminal equipment
CN107358189B (en) * 2017-07-07 2020-12-04 北京大学深圳研究生院 Object detection method in indoor environment based on multi-view target extraction

Also Published As

Publication number Publication date
CN108124489A (en) 2018-06-05
CN108124489B (en) 2023-05-12

Similar Documents

Publication Publication Date Title
WO2019127102A1 (en) Information processing method and apparatus, cloud processing device, and computer program product
US11200424B2 (en) Space-time memory network for locating target object in video content
CN109508681B (en) Method and device for generating human body key point detection model
US11928800B2 (en) Image coordinate system transformation method and apparatus, device, and storage medium
EP3279803B1 (en) Picture display method and device
CN108895981B (en) Three-dimensional measurement method, device, server and storage medium
WO2016124103A1 (en) Picture detection method and device
CN110503076B (en) Video classification method, device, equipment and medium based on artificial intelligence
US9047706B1 (en) Aligning digital 3D models using synthetic images
US20210358170A1 (en) Determining camera parameters from a single digital image
US10902053B2 (en) Shape-based graphics search
WO2022227770A1 (en) Method for training target object detection model, target object detection method, and device
CN110765882B (en) Video tag determination method, device, server and storage medium
CN109272543B (en) Method and apparatus for generating a model
CN111275784A (en) Method and device for generating image
CN115797350B (en) Bridge disease detection method, device, computer equipment and storage medium
CN111080670A (en) Image extraction method, device, equipment and storage medium
CN116129129B (en) Character interaction detection model and detection method
Liu et al. A computationally efficient denoising and hole-filling method for depth image enhancement
CN115861462B (en) Training method and device for image generation model, electronic equipment and storage medium
US20160110909A1 (en) Method and apparatus for creating texture map and method of creating database
CN111292333B (en) Method and apparatus for segmenting an image
CN110738702A (en) three-dimensional ultrasonic image processing method, device, equipment and storage medium
WO2019148311A1 (en) Information processing method and system, cloud processing device and computer program product
CN113537187A (en) Text recognition method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application
    Ref document number: 17936201; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
32PN Ep: public notification in the EP bulletin as address of the addressee cannot be established
    Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12.11.2020)
122 Ep: PCT application non-entry in European phase
    Ref document number: 17936201; Country of ref document: EP; Kind code of ref document: A1