WO2019127102A1 - Information processing method and apparatus, cloud processing device, and computer program product - Google Patents

Information processing method and apparatus, cloud processing device, and computer program product

Info

Publication number
WO2019127102A1
WO2019127102A1 (PCT/CN2017/119008, CN2017119008W)
Authority
WO
WIPO (PCT)
Prior art keywords
data
semantic
dimensional
dimensional reconstruction
key frame
Prior art date
Application number
PCT/CN2017/119008
Other languages
English (en)
Chinese (zh)
Inventor
王恺
廉士国
Original Assignee
深圳前海达闼云端智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海达闼云端智能科技有限公司
Priority to PCT/CN2017/119008 priority Critical patent/WO2019127102A1/fr
Priority to CN201780002737.9A priority patent/CN108124489B/zh
Publication of WO2019127102A1 publication Critical patent/WO2019127102A1/fr

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 Three-dimensional [3D] modelling, e.g. data description of 3D objects

Definitions

  • the present application relates to the field of data processing technologies, and in particular, to an information processing method, apparatus, cloud processing device, and computer program product.
  • Semantic map construction refers to the process by which a computer or similar device, working from perceptual data, cognizes and understands its environment and, through comprehensive analysis of that data, produces high-level semantic information (such as object names and locations) that the device can use.
  • The acquisition of perceptual data can be achieved through key technologies such as radio frequency identification, auditory sensing, and visual sensing; at present, most research focuses on visual technology.
  • Deep learning techniques can be relied on for this. Because an image perceived by the computer in real time may contain multiple objects, the image is first segmented, and the objects in the segmented regions are then identified by machine learning or similar means; this process involves a large number of image operations and takes a long time.
  • The processing methods in the prior art are mainly aimed at two-dimensional data. When three-dimensional data is semantically segmented in the same way, geometrically continuous segmentation results cannot be obtained; moreover, the number of available samples is limited, the types of objects that can be segmented are limited, and the processing takes a long time.
  • The embodiments of the present application provide an information processing method, apparatus, cloud processing device, and computer program product that can process three-dimensional data in real time and generate a three-dimensional semantic map, which not only improves the accuracy of scene segmentation but also shortens the processing time.
  • In a first aspect, an embodiment of the present application provides an information processing method, including: acquiring RGBD data collected by an image capture device; extracting key frame data from the RGBD data and processing it to obtain geometric reconstruction data; mapping the RGB data in the key frame data onto the geometric reconstruction data to obtain three-dimensional reconstruction data, and performing semantic segmentation on the RGB data in the key frame data to obtain semantic segmentation data; and mapping the semantic segmentation data onto the three-dimensional reconstruction data to obtain a three-dimensional semantic map.
  • Mapping the semantic segmentation data and the three-dimensional reconstruction data to obtain the three-dimensional semantic map may include: determining the RGB data corresponding to each point in the three-dimensional reconstruction data; determining the semantic information of each point according to a first correspondence between the RGB data and the semantic segmentation data; and integrating the semantic information of all points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
  • Alternatively, mapping the semantic segmentation data and the three-dimensional reconstruction data to obtain the three-dimensional semantic map may include: determining the RGB data corresponding to each face in the three-dimensional reconstruction data; determining the semantic information of each face according to a second correspondence between the RGB data and the semantic segmentation data; determining the faces surrounding each connection point and determining the semantic information of each connection point from the semantic information of those faces; and integrating the semantic information of all faces and all connection points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
  • Extracting and processing the key frame data in the RGBD data to obtain the geometric reconstruction data includes: calculating pose information of the image capture device according to the key frame data in the RGBD data; and performing reconstruction according to the pose information and the D (depth) data in the key frame data to obtain the geometric reconstruction data.
  • the embodiment of the present application further provides an information processing apparatus, including:
  • an acquiring unit, configured to acquire RGBD data collected by the image capture device;
  • an extracting unit, configured to extract key frame data from the RGBD data and process it to obtain geometric reconstruction data;
  • a processing unit, configured to map the RGB data in the key frame data onto the geometric reconstruction data to obtain three-dimensional reconstruction data, and to perform semantic segmentation on the RGB data in the key frame data to obtain semantic segmentation data;
  • a mapping unit, configured to map the semantic segmentation data onto the three-dimensional reconstruction data to obtain a three-dimensional semantic map.
  • The mapping unit is specifically configured to integrate the semantic information of all points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
  • Alternatively, the mapping unit is specifically configured to integrate the semantic information of all faces in the three-dimensional reconstruction data and the semantic information of all connection points to obtain the three-dimensional semantic map.
  • The extracting unit is specifically configured to calculate the pose information of the image capture device according to the key frame data in the RGBD data, and to perform reconstruction according to the pose information and the D data in the key frame data to obtain the geometric reconstruction data.
  • The embodiment of the present application further provides a cloud processing device, where the device includes a processor and a memory; the memory is configured to store instructions that, when executed by the processor, cause the device to perform the method of any one of the first aspect.
  • The embodiment of the present application further provides a computer program product that can be loaded directly into the internal memory of a computer and contains software code; after being loaded and executed by the computer, the computer program can implement any of the methods of the first aspect.
  • The information processing method, apparatus, cloud processing device and computer program product provided by the embodiments of the present application extract the key frame data from the RGBD data and process it to obtain geometric reconstruction data, then carry out three-dimensional reconstruction and semantic segmentation simultaneously to obtain the three-dimensional reconstruction data and the semantic segmentation data respectively, and finally map the semantic segmentation data onto the three-dimensional reconstruction data to obtain a three-dimensional semantic map. In this way the three-dimensional reconstruction and the semantic segmentation are performed at the same time: the reconstruction is carried out from the RGBD data while semantic information is obtained, which shortens the computation time and improves the accuracy of scene segmentation.
  • FIG. 1 is a flowchart of an embodiment of an information processing method according to an embodiment of the present application;
  • FIG. 2 is a schematic structural diagram of an embodiment of an information processing apparatus according to an embodiment of the present application;
  • FIG. 3 is a schematic structural diagram of an embodiment of a cloud processing device according to an embodiment of the present application.
  • The word "if" as used herein may be interpreted as "when", "at the time of", "in response to determining", or "in response to detecting".
  • Similarly, the phrase "if it is determined" or "if (a stated condition or event) is detected" may be interpreted as "when it is determined", "in response to determining", "when (the stated condition or event) is detected", or "in response to detecting (the stated condition or event)".
  • The three-dimensional semantic map consists of two parts: one is the three-dimensional reconstruction model obtained by reconstructing the environment, and the other is the scene recognition information obtained by accurate semantic segmentation.
  • Semantic segmentation is mostly based on two-dimensional data processing; segmenting three-dimensional data in the same manner cannot produce geometrically continuous segmentation results, takes a long time, and is difficult to complete in real time.
  • Therefore, the embodiment of the present application provides an information processing method that performs semantic segmentation on the collected environmental information while carrying out its three-dimensional reconstruction, and generates a three-dimensional semantic map in real time.
  • FIG. 1 is a flowchart of an embodiment of the information processing method provided by the present application. As shown in FIG. 1, the information processing method provided by the embodiment of the present application may specifically include the following steps:
  • When three-dimensional reconstruction of a scene is required and a three-dimensional semantic map is to be obtained, an image capture device is first used to collect images of the scene. The image capture device needs to include an RGB camera and a depth (Depth) camera, and the RGBD data is obtained after the capture is completed.
  • The computer used for generating the three-dimensional semantic map may include a real-time mapping and positioning module that acquires the RGBD data collected by the image capture device. Specifically, the real-time mapping and positioning module may actively fetch the RGBD data, or the image capture device may actively send the RGBD data to the real-time mapping and positioning module.
  • The following steps may be used to obtain the geometric reconstruction data: first, the pose information of the image capture device is calculated according to the key frame data in the RGBD data; specifically, the RGBD data corresponding to the key frames is extracted from all of the RGBD data, and the pose of the image capture device is calculated from that key-frame RGBD data. Then, reconstruction is performed according to the pose information and the D (depth) data in the key frame data to obtain the geometric reconstruction data.
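  • As an illustrative sketch only (the application does not specify how key frames are chosen), the following snippet keeps a frame as a new key frame when the estimated camera pose has translated or rotated beyond a threshold relative to the previous key frame; the 4x4 pose matrices and the threshold values are assumptions.

```python
import numpy as np

def is_new_keyframe(pose, last_kf_pose, t_thresh=0.10, r_thresh_deg=10.0):
    """Decide whether `pose` (a 4x4 camera-to-world matrix) differs enough
    from the last key-frame pose to be kept as a new key frame.
    The thresholds are illustrative, not taken from the application."""
    delta = np.linalg.inv(last_kf_pose) @ pose
    translation = np.linalg.norm(delta[:3, 3])
    # Rotation angle recovered from the trace of the relative rotation matrix.
    cos_angle = np.clip((np.trace(delta[:3, :3]) - 1.0) / 2.0, -1.0, 1.0)
    rotation_deg = np.degrees(np.arccos(cos_angle))
    return translation > t_thresh or rotation_deg > r_thresh_deg

def select_keyframes(poses):
    """Return the indices of frames kept as key frames."""
    keyframes = [0]
    for i, pose in enumerate(poses[1:], start=1):
        if is_new_keyframe(pose, poses[keyframes[-1]]):
            keyframes.append(i)
    return keyframes
```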
  • The geometric reconstruction data can take two formats: one is the point cloud format and the other is the grid (mesh) format; the format can be chosen according to actual needs.
  • To reconstruct data in the point cloud format, the pose information and the D data in the key frame data are processed by a fast fusion algorithm.
  • Likewise, to reconstruct data in the grid format, the fast fusion algorithm is used to process the pose information and the D data in the key frame data.
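  • The application describes the reconstruction itself only as a fast fusion of the pose information and the D data. As a hedged sketch of one building block of such a reconstruction, the snippet below back-projects one key frame's depth image into a world-frame point cloud using its pose; the pinhole intrinsics fx, fy, cx, cy and the depth scale are assumed values that do not appear in the application.

```python
import numpy as np

def backproject_keyframe(depth, pose, fx, fy, cx, cy, depth_scale=1000.0):
    """Turn one key frame's depth image (H x W, raw units) into an N x 3 array
    of world-frame points, using the 4x4 camera-to-world pose.
    Intrinsics and depth_scale are assumptions for this sketch."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64) / depth_scale
    valid = z > 0
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts_cam = np.stack([x[valid], y[valid], z[valid]], axis=1)            # N x 3
    pts_hom = np.concatenate([pts_cam, np.ones((len(pts_cam), 1))], 1)    # N x 4
    return (pose @ pts_hom.T).T[:, :3]

# Fusing the point clouds of successive key frames (e.g. by concatenation or
# TSDF integration) would then yield the geometric reconstruction data.
```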
  • The two processes of obtaining the three-dimensional reconstruction data and obtaining the semantic segmentation data both involve a large amount of computation and occupy considerable computing resources, so they are placed in different threads or computed in parallel.
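  • The application leaves the parallelisation mechanism open; one minimal way to place the two computations in different threads, using only the Python standard library and placeholder worker functions, is sketched below.

```python
from concurrent.futures import ThreadPoolExecutor

def build_3d_reconstruction(keyframes):
    ...  # placeholder: map RGB data onto the geometric reconstruction data

def run_semantic_segmentation(keyframes):
    ...  # placeholder: per-key-frame 2D semantic segmentation

def process_keyframes(keyframes):
    # Run both heavy computations concurrently, then return both results
    # so they can be mapped together into the three-dimensional semantic map.
    with ThreadPoolExecutor(max_workers=2) as pool:
        recon_future = pool.submit(build_3d_reconstruction, keyframes)
        seg_future = pool.submit(run_semantic_segmentation, keyframes)
        return recon_future.result(), seg_future.result()
```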
  • The process of generating the three-dimensional reconstruction data differs depending on the format of the geometric reconstruction data.
  • If the geometric reconstruction data is in the point cloud format, first find the D data corresponding to each point in the point cloud; then, according to the calibration result of the RGB camera and the depth camera, find the RGB data corresponding to each point; finally, assign the value of the corresponding RGB data to each point.
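  • A minimal sketch of this colour assignment is shown below: each reconstructed point is projected into the RGB image using an assumed world-to-RGB-camera transform derived from the calibration result, and the colour at the resulting pixel is copied to the point. The intrinsics and the transform are assumptions for illustration.

```python
import numpy as np

def colorize_points(points_world, rgb_image, pose_rgb, fx, fy, cx, cy):
    """Assign an RGB value to every world-frame point.
    `pose_rgb` is an assumed 4x4 world-to-RGB-camera transform obtained from
    the RGB/depth calibration; the intrinsics are assumptions as well."""
    n = len(points_world)
    pts_hom = np.concatenate([points_world, np.ones((n, 1))], axis=1)
    pts_cam = (pose_rgb @ pts_hom.T).T[:, :3]
    u = np.round(pts_cam[:, 0] / pts_cam[:, 2] * fx + cx).astype(int)
    v = np.round(pts_cam[:, 1] / pts_cam[:, 2] * fy + cy).astype(int)
    h, w, _ = rgb_image.shape
    colors = np.zeros((n, 3), dtype=np.uint8)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h) & (pts_cam[:, 2] > 0)
    colors[inside] = rgb_image[v[inside], u[inside]]
    return colors
```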
  • If the geometric reconstruction data is in the grid format, the RGB data corresponding to the key frame is mapped onto the grid as a texture according to a sampling algorithm.
  • The algorithm may be, for example, a nearest-sampling-point algorithm, a bilinear interpolation algorithm, or a trilinear interpolation algorithm.
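  • Of the listed algorithms, bilinear interpolation is the most common choice for sampling a texture at a non-integer pixel position; the snippet below is an illustration of that sampling step, not the application's exact texture-mapping procedure.

```python
import numpy as np

def bilinear_sample(image, u, v):
    """Sample `image` (H x W or H x W x C) at the floating-point pixel
    position (u, v) by bilinear interpolation of the four surrounding pixels."""
    h, w = image.shape[:2]
    u0 = int(np.clip(np.floor(u), 0, w - 1))
    v0 = int(np.clip(np.floor(v), 0, h - 1))
    u1, v1 = min(u0 + 1, w - 1), min(v0 + 1, h - 1)
    du, dv = u - u0, v - v0
    top = (1.0 - du) * image[v0, u0] + du * image[v0, u1]
    bottom = (1.0 - du) * image[v1, u0] + du * image[v1, u1]
    return (1.0 - dv) * top + dv * bottom
```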
  • The semantic segmentation data can be obtained by selecting different existing methods according to the scenario.
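  • As one hedged example of such an existing method, the snippet below runs a pre-trained DeepLabV3 model from torchvision on a key frame's RGB image to produce a per-pixel label map; the choice of network, weights and preprocessing is an assumption, since the application does not prescribe a particular segmentation method (a recent torchvision version is assumed for the weights argument).

```python
import torch
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet50

# Pre-trained weights and ImageNet normalisation are assumptions for the sketch.
model = deeplabv3_resnet50(weights="DEFAULT").eval()
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def segment_keyframe(rgb_image):
    """Return a per-pixel class-label map (H x W) for one key frame's RGB image."""
    batch = preprocess(rgb_image).unsqueeze(0)        # 1 x 3 x H x W
    with torch.no_grad():
        logits = model(batch)["out"]                  # 1 x classes x H x W
    return logits.argmax(dim=1).squeeze(0).numpy()    # H x W label map
```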
  • If the geometric reconstruction data is in the point cloud format: first, determine the RGB data corresponding to each point in the three-dimensional reconstruction data; then, according to the first correspondence between the RGB data and the semantic segmentation data, determine the semantic information corresponding to each point; finally, integrate the semantic information of all points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
  • Each point of the three-dimensional geometric reconstruction result is denoted V_P (P is the index of the point), and the RGB value V_C corresponding to each point can be obtained from a lookup table that records the correspondence between the point index P and the RGB values.
  • The semantic information of each point is then determined by the first correspondence function: the semantic information V_S of a point V_P is obtained from the semantic segmentation data according to the RGB value V_C associated with that point.
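  • Putting the two lookups together for the point cloud case, the sketch below assigns a semantic label to every point by reading the 2D segmentation result at the pixel from which the point was reconstructed; storing the correspondence as an array of pixel indices is an assumption about how the lookup table would be kept in practice.

```python
import numpy as np

def label_point_cloud(point_pixels, label_map):
    """`point_pixels` is an N x 2 array of (row, column) pixel indices recording,
    for each reconstructed point, the key-frame pixel it came from (the assumed
    form of the point-to-RGB correspondence table). `label_map` is the H x W
    semantic segmentation of that key frame. Returns one label per point."""
    rows, cols = point_pixels[:, 0], point_pixels[:, 1]
    return label_map[rows, cols]

def build_semantic_point_map(points_world, point_pixels, label_map):
    """Integrate geometry and semantics into one N x 4 array: x, y, z, label."""
    labels = label_point_cloud(point_pixels, label_map)
    return np.concatenate([points_world, labels[:, None].astype(float)], axis=1)
```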
  • If the geometric reconstruction data is in the grid format: first, determine the RGB data corresponding to each face in the three-dimensional reconstruction data; then, according to the second correspondence between the RGB data and the semantic segmentation data, determine the semantic information of each face; next, determine the faces surrounding each connection point in the three-dimensional reconstruction data and determine the semantic information of each connection point from the semantic information of those faces; finally, integrate the semantic information of all faces and all connection points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
  • A mesh consists of points and faces; adjacent faces are connected through shared points (connection points).
  • Each face F_j corresponds to a region of RGB data, and the RGB value F_c of that region can be obtained from a lookup table that records the correspondence between the face index j and the RGB values.
  • The semantic information of each face is then determined by the second correspondence function: the semantic information F_s of a face F_j is obtained from the semantic segmentation data according to the RGB value F_c associated with that face.
  • The semantic information V_i^s of a connection point V_i is determined from the semantic information F_k^s of the p faces surrounding V_i by a function Q(F_k^s); in its specific representation, the area F_k^A of each surrounding face F_k also enters the computation of Q.
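  • One plausible instantiation of Q, assuming an area-weighted vote over the faces surrounding each connection point (the application lists the face areas F_k^A among the inputs but its exact formula is not reproduced here), is sketched below.

```python
import numpy as np
from collections import defaultdict

def triangle_area(v0, v1, v2):
    """Area of one triangular face given its three vertex positions."""
    return 0.5 * np.linalg.norm(np.cross(v1 - v0, v2 - v0))

def vertex_label(face_labels, face_areas):
    """Area-weighted vote: pick the label whose surrounding faces cover the
    largest total area (an assumed form of Q, not the application's formula)."""
    scores = defaultdict(float)
    for label, area in zip(face_labels, face_areas):
        scores[label] += area
    return max(scores, key=scores.get)

def label_connection_points(vertices, faces, face_labels):
    """Assign a semantic label to every mesh vertex from its adjacent faces."""
    areas = [triangle_area(*vertices[f]) for f in faces]
    adjacency = defaultdict(list)          # vertex index -> adjacent face indices
    for fi, f in enumerate(faces):
        for vi in f:
            adjacency[vi].append(fi)
    return {vi: vertex_label([face_labels[fi] for fi in adj],
                             [areas[fi] for fi in adj])
            for vi, adj in adjacency.items()}
```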
  • The information processing method provided by the embodiment of the present application extracts the key frame data from the RGBD data and processes it to obtain geometric reconstruction data, then carries out the two processes of three-dimensional reconstruction and semantic segmentation simultaneously to obtain the three-dimensional reconstruction data and the semantic segmentation data, and finally maps the semantic segmentation data onto the three-dimensional reconstruction data to obtain a three-dimensional semantic map. Because the three-dimensional reconstruction and the semantic segmentation are performed at the same time, semantic information is obtained while the reconstruction is carried out from the RGBD data, which shortens the computation time, improves the accuracy of scene segmentation, and achieves the effect of generating the three-dimensional map in real time; this addresses the prior-art problems that the types of objects that can be segmented from three-dimensional data are limited and that the segmentation takes a long time.
  • FIG. 2 is a schematic structural diagram of an embodiment of an information processing apparatus according to an embodiment of the present application. As shown in FIG. 2, the apparatus of this embodiment may include an acquiring unit 11, an extracting unit 12, a processing unit 13, and a mapping unit 14.
  • The acquiring unit 11 is configured to acquire the RGBD data collected by the image capture device.
  • the extracting unit 12 is configured to extract key frame data in the RGBD data and perform processing to obtain geometric reconstruction data.
  • the processing unit 13 is configured to perform mapping processing on the RGB data and the geometric reconstruction data in the key frame data to obtain three-dimensional reconstruction data; and perform semantic segmentation processing on the RGB data in the key frame data to obtain semantic segmentation data.
  • the mapping unit 14 is configured to perform mapping processing on the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map.
  • The mapping unit 14 is specifically configured to integrate the semantic information of all points in the three-dimensional reconstruction data to obtain the three-dimensional semantic map.
  • Alternatively, the mapping unit 14 is specifically configured to integrate the semantic information of all faces in the three-dimensional reconstruction data and the semantic information of all connection points to obtain the three-dimensional semantic map.
  • The extracting unit 12 is specifically configured to calculate the pose information of the image capture device according to the key frame data in the RGBD data, and to perform reconstruction according to the pose information and the D data in the key frame data to obtain the geometric reconstruction data.
  • the information processing apparatus provided by the embodiment of the present application may be used to implement the technical solution of the method embodiment shown in FIG. 1 , and the implementation principle and technical effects thereof are similar, and details are not described herein again.
  • FIG. 3 is a schematic structural diagram of an embodiment of a cloud processing device according to an embodiment of the present disclosure.
  • the cloud processing device includes a processor 21 and a memory 22; the memory 22 is for storing instructions that, when executed by the processor 21, cause the device to perform any of the methods described above.
  • the cloud processing device provided by the embodiment of the present application may be used to implement the technical solution of the method embodiment shown in FIG. 1 , and the implementation principle and the technical effect are similar, and details are not described herein again.
  • The embodiment of the present application further provides a computer program product, which can be loaded directly into the internal memory of a computer and contains software code; after being loaded and executed by the computer, the computer program can implement any of the methods described above.
  • The computer program product provided by the embodiment of the present application may be used to implement the technical solution of the method embodiment shown in FIG. 1; the implementation principle and the technical effect are similar, and details are not described herein again.
  • the aforementioned program can be stored in a computer readable storage medium.
  • When executed, the program performs the steps of the foregoing method embodiments; the foregoing storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
  • The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over at least two network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment, and those of ordinary skill in the art can understand and implement it without creative effort.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Processing Or Creating Images (AREA)
  • Image Analysis (AREA)
  • Image Generation (AREA)

Abstract

Disclosed are an information processing method and apparatus, a cloud processing device and a computer program product, applied to the technical field of data processing. The information processing method comprises the following steps: obtaining RGBD data collected by an image capture device (101); extracting and processing key frame data in the RGBD data to obtain geometric reconstruction data (102); mapping the RGB data in the key frame data and the geometric reconstruction data to obtain three-dimensional reconstruction data, and performing semantic segmentation processing on the RGB data in the key frame data to obtain semantic segmentation data (103); and mapping the semantic segmentation data and the three-dimensional reconstruction data to obtain a three-dimensional semantic map (104). According to the method, three-dimensional reconstruction and semantic segmentation can be carried out at the same time; not only can the three-dimensional reconstruction be performed according to the RGBD data, but semantic information can also be obtained, the computation time is shortened, and the accuracy of scene segmentation can also be improved.
PCT/CN2017/119008 2017-12-27 2017-12-27 Procédé et appareil de traitement d'informations, dispositif de traitement en nuage et produit programme d'ordinateur WO2019127102A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2017/119008 WO2019127102A1 (fr) 2017-12-27 2017-12-27 Procédé et appareil de traitement d'informations, dispositif de traitement en nuage et produit programme d'ordinateur
CN201780002737.9A CN108124489B (zh) 2017-12-27 2017-12-27 信息处理方法、装置、云处理设备以及计算机程序产品

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/119008 WO2019127102A1 (fr) 2017-12-27 2017-12-27 Procédé et appareil de traitement d'informations, dispositif de traitement en nuage et produit programme d'ordinateur

Publications (1)

Publication Number Publication Date
WO2019127102A1 true WO2019127102A1 (fr) 2019-07-04

Family

ID=62234350

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/119008 WO2019127102A1 (fr) 2017-12-27 2017-12-27 Procédé et appareil de traitement d'informations, dispositif de traitement en nuage et produit programme d'ordinateur

Country Status (2)

Country Link
CN (1) CN108124489B (fr)
WO (1) WO2019127102A1 (fr)


Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117718B (zh) * 2018-07-02 2021-11-26 东南大学 一种面向道路场景的三维语义地图构建和存储方法
CN109191526B (zh) * 2018-09-10 2020-07-07 杭州艾米机器人有限公司 基于rgbd摄像头和光编码器的三维环境重建方法及系统
CN109461211B (zh) * 2018-11-12 2021-01-26 南京人工智能高等研究院有限公司 基于视觉点云的语义矢量地图构建方法、装置和电子设备
CN110245567B (zh) * 2019-05-16 2023-04-07 达闼机器人股份有限公司 避障方法、装置、存储介质及电子设备
CN113313832B (zh) * 2021-05-26 2023-07-04 Oppo广东移动通信有限公司 三维模型的语义生成方法、装置、存储介质与电子设备


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104424630A (zh) * 2013-08-20 2015-03-18 华为技术有限公司 三维重建方法及装置、移动终端
CN107292949B (zh) * 2017-05-25 2020-06-16 深圳先进技术研究院 场景的三维重建方法、装置及终端设备
CN107358189B (zh) * 2017-07-07 2020-12-04 北京大学深圳研究生院 一种基于多视目标提取的室内环境下物体检测方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070273696A1 (en) * 2006-04-19 2007-11-29 Sarnoff Corporation Automated Video-To-Text System
CN104732587A (zh) * 2015-04-14 2015-06-24 中国科学技术大学 一种基于深度传感器的室内3d语义地图构建方法
CN105551084A (zh) * 2016-01-28 2016-05-04 北京航空航天大学 一种基于图像内容解析的室外三维场景组合构建方法
CN106067191A (zh) * 2016-05-25 2016-11-02 深圳市寒武纪智能科技有限公司 一种家用机器人建立语义地图的方法及系统
CN106384383A (zh) * 2016-09-08 2017-02-08 哈尔滨工程大学 一种基于fast和freak特征匹配算法的rgb‑d和slam场景重建方法

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111292340A (zh) * 2020-01-23 2020-06-16 北京市商汤科技开发有限公司 语义分割方法、装置、设备及计算机可读存储介质

Also Published As

Publication number Publication date
CN108124489A (zh) 2018-06-05
CN108124489B (zh) 2023-05-12

Similar Documents

Publication Publication Date Title
WO2019127102A1 (fr) Procédé et appareil de traitement d'informations, dispositif de traitement en nuage et produit programme d'ordinateur
US11200424B2 (en) Space-time memory network for locating target object in video content
KR102319177B1 (ko) 이미지 내의 객체 자세를 결정하는 방법 및 장치, 장비, 및 저장 매체
CN109508681B (zh) 生成人体关键点检测模型的方法和装置
US11928800B2 (en) Image coordinate system transformation method and apparatus, device, and storage medium
CN108895981B (zh) 一种三维测量方法、装置、服务器和存储介质
WO2016124103A1 (fr) Procédé et dispositif de détection d'image
CN110503076B (zh) 基于人工智能的视频分类方法、装置、设备和介质
US9047706B1 (en) Aligning digital 3D models using synthetic images
AU2018202767B2 (en) Data structure and algorithm for tag less search and svg retrieval
WO2022227770A1 (fr) Procédé d'apprentissage de modèle de détection d'objet cible, procédé de détection d'objet cible, et dispositif
CN109272543B (zh) 用于生成模型的方法和装置
CN111275784A (zh) 生成图像的方法和装置
CN115797350B (zh) 桥梁病害检测方法、装置、计算机设备和存储介质
CN111080670A (zh) 图像提取方法、装置、设备及存储介质
Liu et al. A computationally efficient denoising and hole-filling method for depth image enhancement
CN110765882A (zh) 一种视频标签确定方法、装置、服务器及存储介质
CN115861462B (zh) 图像生成模型的训练方法、装置、电子设备及存储介质
CN116129129B (zh) 一种人物交互检测模型及检测方法
US20160110909A1 (en) Method and apparatus for creating texture map and method of creating database
CN111292333B (zh) 用于分割图像的方法和装置
CN110738702A (zh) 一种三维超声图像的处理方法、装置、设备及存储介质
WO2019148311A1 (fr) Procédé et système de traitement d'informations, dispositif de traitement en nuage et produit de programme informatique
CN113537187A (zh) 文本识别方法、装置、电子设备及可读存储介质
CN112149528A (zh) 一种全景图目标检测方法、系统、介质及设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17936201

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12.11.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 17936201

Country of ref document: EP

Kind code of ref document: A1