CN113379663B - Space positioning method and device - Google Patents
- Publication number
- CN113379663B (application number CN202110675145.1A)
- Authority
- CN
- China
- Prior art keywords
- point cloud data
- target
- RGB image
- Prior art date: 2021-06-18
- Legal status: Active
Classifications
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06N3/04—Neural networks; architecture, e.g. interconnection topology
- G06N3/08—Neural networks; learning methods
- G06T5/70—Denoising; Smoothing
- G06T7/11—Region-based segmentation
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration, using feature-based methods
- G06T2207/10024—Color image
- G06T2207/10028—Range image; Depth image; 3D point clouds
Abstract
The invention discloses a space positioning method and device in the technical field of three-dimensional computer vision. The method comprises the steps of: synchronously acquiring point cloud data output by a 3D TOF sensor and an RGB image output by an RGB camera; performing coarse registration with a SIFT algorithm to obtain a first matching point pair; performing fine registration on the first matching point pair according to a spatial constraint relationship to obtain a second matching point pair; training an RBF neural network according to the second matching point pair to obtain the correspondence between the point cloud data and the RGB image; and fusing the point cloud data of the target and the RGB image of the target on the basis of that correspondence to obtain the target space positioning data. The method acquires matching point pairs with high precision and, by segmenting, extracting and positioning the target person and target object, greatly enlarges the operating space for virtual background design and layout.
Description
Technical Field
The invention relates to the technical field of three-dimensional computer vision, in particular to a space positioning method and a space positioning device.
Background
Three-dimensional object detection has made rapid progress thanks to advances in deep learning on point clouds. The point cloud is the raw output of most three-dimensional information acquisition devices and, compared with a two-dimensional image, describes a real-world scene more completely. In recent years, with improvements in computer hardware and the spread of point cloud acquisition equipment, point cloud data are increasingly applied in intelligent robotics, autonomous driving, advanced manufacturing, virtual reality, augmented reality and related fields.
However, point cloud data have inherent limitations: they are sparse, lack color information, are often affected by sensor noise, and are difficult to use directly.
RGB images offer high resolution and rich texture, but their use in three-dimensional detection is limited by the absence of target depth and scale information, which a three-dimensional point cloud can provide.
Researchers at home and abroad have therefore done a great deal of work on combining point cloud data and RGB image data for target space positioning, but shortcomings remain in complex environments, and the existing target space positioning technology cannot meet the requirement for real-time, accurate environment sensing under complex conditions.
Disclosure of Invention
Therefore, in order to overcome the above-mentioned drawbacks, embodiments of the present invention provide a high-precision spatial positioning method and apparatus.
Therefore, the space positioning method of the embodiment of the invention comprises the following steps:
synchronously acquiring point cloud data output by a 3D TOF sensor and RGB images output by an RGB camera;
performing coarse registration by adopting an SIFT algorithm according to the point cloud data and the RGB image to obtain a first matching point pair;
performing fine registration on the first matching point pair according to a space constraint relation to obtain a second matching point pair;
training an RBF neural network according to the second matching point pairs to obtain the corresponding relation between the point cloud data and the RGB image;
filtering, segmenting and extracting according to the point cloud data to obtain point cloud data of a target;
denoising, segmenting and extracting according to the RGB image to obtain an RGB image of the target;
and fusing the point cloud data of the target and the RGB image of the target according to the corresponding relation between the point cloud data and the RGB image to obtain the target space positioning data.
Preferably, the step of performing coarse registration by using an SIFT algorithm according to the point cloud data and the RGB image to obtain a first matching point pair includes:
converting the point cloud data into a two-dimensional image;
extracting first keypoints of the two-dimensional image with the SIFT algorithm to obtain the first keypoint feature vectors F_i^1, i = 1, 2, …, n, where n is the total number of first keypoints;
extracting second keypoints of the RGB image with the SIFT algorithm to obtain the second keypoint feature vectors F_j^2, j = 1, 2, …, m, where m is the total number of second keypoints;
respectively calculating a neighbor index G_ij between each first keypoint and each second keypoint, where μ_i is the mean of the first keypoint feature vector F_i^1, μ_j is the mean of the second keypoint feature vector F_j^2, σ_i is the variance of F_i^1, σ_j is the variance of F_j^2, σ_ij is the covariance of F_i^1 and F_j^2, and C_1, C_2 and C_3 are preset constants;
for each first keypoint, selecting the k second keypoints with the largest neighbor index G_ij as the coarse registration points of that first keypoint, obtaining the first matching point pairs (F_i^1, F_j^2), i = 1, 2, …, n, j = 1, 2, …, k.
Preferably, the step of performing fine registration on the first matching point pair according to the spatial constraint relationship to obtain a second matching point pair includes:
selecting the first matching point pair which simultaneously meets the following three spatial constraint relations to obtain a second matching point pair:
the pixel translation distance between a first key point and a second key point in the first matching point pair is smaller than or equal to a first preset value;
for any two of the first matching point pairs, the difference between the angle formed in the two-dimensional image by the line connecting the two first keypoints and the horizontal direction, and the angle formed in the RGB image by the line connecting the two second keypoints and the horizontal direction, is smaller than or equal to a second preset value ε_1;
for the same two first matching point pairs, the difference between the distance between the two first keypoints in the two-dimensional image and the distance between the two second keypoints in the RGB image is smaller than or equal to a third preset value ε_2.
Preferably, the step of training the RBF neural network according to the second matching point pairs to obtain a corresponding relationship between the point cloud data and the RGB image includes:
constructing a conversion matrix model between a first key point feature vector and a second key point feature vector in the second matching point pair, taking the first key point feature vector in the second matching point pair as the input of the RBF neural network, taking the second key point feature vector in the second matching point pair as the expected output of the RBF neural network, training the RBF neural network by utilizing the input and the expected output, and obtaining various parameters of the conversion matrix model;
inputting any pixel point in the RGB image into the trained RBF neural network to obtain the pixel point in the two-dimensional image corresponding to that pixel point;
and according to the inverse process of the process of converting the point cloud data into the two-dimensional image, obtaining a spatial position point in the point cloud data corresponding to the pixel point in the two-dimensional image, and obtaining a corresponding relation between the point cloud data and the RGB image.
Preferably, the step of fusing the point cloud data of the target and the RGB image of the target according to the corresponding relationship between the point cloud data and the RGB image to obtain the target space positioning data includes:
and according to the corresponding relation between the point cloud data and the RGB image, assigning the parameter of each pixel point in the RGB image of the target to a spatial position point in the point cloud data of the target corresponding to the pixel point, and fusing the point cloud data of the target and the RGB image of the target to obtain target spatial positioning data.
The space positioning device of the embodiment of the invention comprises:
the synchronous acquisition module is used for synchronously acquiring point cloud data output by the 3D TOF sensor and RGB images output by the RGB camera;
the rough registration module is used for carrying out rough registration by adopting an SIFT algorithm according to the point cloud data and the RGB image to obtain a first matching point pair;
the fine registration module is used for performing fine registration on the first matching point pair according to the space constraint relation to obtain a second matching point pair;
the RBF neural network module is used for training an RBF neural network according to the second matching point pairs to obtain the corresponding relation between the point cloud data and the RGB image;
the point cloud target extraction module is used for carrying out filtering, segmentation and extraction processing according to the point cloud data to obtain point cloud data of a target;
the image target extraction module is used for carrying out denoising and segmentation extraction processing according to the RGB image to obtain an RGB image of a target;
and the target obtaining module is used for fusing the point cloud data of the target and the RGB image of the target according to the corresponding relation between the point cloud data and the RGB image to obtain target space positioning data.
Preferably, the coarse registration module comprises:
the two-dimensional image conversion module is used for converting the point cloud data into a two-dimensional image;
a first SIFT extraction module, configured to extract first keypoints of the two-dimensional image with the SIFT algorithm to obtain the first keypoint feature vectors F_i^1, i = 1, 2, …, n, where n is the total number of first keypoints;
a second SIFT extraction module, configured to extract second keypoints of the RGB image with the SIFT algorithm to obtain the second keypoint feature vectors F_j^2, j = 1, 2, …, m, where m is the total number of second keypoints;
a calculation module, configured to respectively calculate a neighbor index G_ij between each first keypoint and each second keypoint, where μ_i is the mean of the first keypoint feature vector F_i^1, μ_j is the mean of the second keypoint feature vector F_j^2, σ_i is the variance of F_i^1, σ_j is the variance of F_j^2, σ_ij is the covariance of F_i^1 and F_j^2, and C_1, C_2 and C_3 are preset constants;
a first matching point pair obtaining module, configured to select, for each first keypoint, the k second keypoints with the largest neighbor index G_ij as the coarse registration points of that first keypoint, obtaining the first matching point pairs (F_i^1, F_j^2), i = 1, 2, …, n, j = 1, 2, …, k.
Preferably, the fine registration module includes:
a second matching point pair obtaining module, configured to select the first matching point pair that simultaneously satisfies the following three spatial constraint relationships, and obtain a second matching point pair:
the pixel translation distance between a first key point and a second key point in the first matching point pair is smaller than or equal to a first preset value;
for any two of the first matching point pairs, the difference between the angle formed in the two-dimensional image by the line connecting the two first keypoints and the horizontal direction, and the angle formed in the RGB image by the line connecting the two second keypoints and the horizontal direction, is smaller than or equal to a second preset value ε_1;
for the same two first matching point pairs, the difference between the distance between the two first keypoints in the two-dimensional image and the distance between the two second keypoints in the RGB image is smaller than or equal to a third preset value ε_2.
Preferably, the RBF neural network module includes:
the training module is used for constructing a conversion matrix model between a first key point feature vector and a second key point feature vector in the second matching point pair, taking the first key point feature vector in the second matching point pair as the input of the RBF neural network, taking the second key point feature vector in the second matching point pair as the expected output of the RBF neural network, training the RBF neural network by utilizing the input and the expected output, and obtaining various parameters of the conversion matrix model;
the first correspondence obtaining module is used for inputting any pixel point in the RGB image into the trained RBF neural network to obtain the pixel point in the two-dimensional image corresponding to that pixel point;
and the second corresponding relation obtaining module is used for obtaining a spatial position point in the point cloud data corresponding to a pixel point in the two-dimensional image according to the inverse process of the process of converting the point cloud data into the two-dimensional image, and obtaining the corresponding relation between the point cloud data and the RGB image.
Preferably, the object obtaining module includes:
and the pixel assignment module is used for assigning the parameter of each pixel point in the RGB image of the target to a spatial position point in the point cloud data of the target corresponding to the pixel point according to the corresponding relation between the point cloud data and the RGB image, fusing the point cloud data of the target and the RGB image of the target and obtaining target spatial positioning data.
The space positioning method and the space positioning device of the embodiments of the invention have the following advantages:
1. By combining the SIFT algorithm and the spatial constraint relationship, the acquisition precision of the matching point pairs is improved.
2. The RBF neural network is adopted to obtain the corresponding relation between the point cloud data output by the 3D TOF sensor and the RGB image output by the RGB camera, so that the point cloud data and the RGB image are fused, the spatial target is reconstructed based on the three-dimensional information and the image information, the positioning precision is further improved, and the requirement of real-time and accurate environment sensing under complex conditions can be met.
3. Target space positioning is carried out through a 3D point cloud and image fusion algorithm, technical complementation is formed, and positioning accuracy is improved.
4. By segmenting, extracting and positioning the target person and the target object, the operation space of the virtual background design layout is greatly increased.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a flowchart of a specific example of a spatial localization method in embodiment 1 of the present invention;
fig. 2 is a flowchart of another specific example of the spatial localization method in embodiment 1 of the present invention;
fig. 3 is a schematic block diagram of a specific example of a spatial positioning apparatus in embodiment 2 of the present invention;
fig. 4 is a circuit diagram of a specific example of the spatial positioning device in embodiment 2 of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In describing the present invention, it is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "comprises" and/or "comprising," when used in this specification, are intended to specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The term "and/or" includes any and all combinations of one or more of the associated listed items. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
Furthermore, certain drawings in this specification are flow charts illustrating methods. It will be understood that each block of the flowchart illustrations, and combinations of blocks in the flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be loaded onto a computer or other programmable apparatus to produce a machine, such that the instructions which execute on the computer or other programmable apparatus create means for implementing the functions specified in the flowchart block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart block or blocks. The computer program instructions may also be loaded onto a computer or other programmable apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart block or blocks.
Accordingly, blocks of the flowchart illustrations support combinations of means for performing the specified functions and combinations of steps for performing the specified functions. It will also be understood that each block of the flowchart illustrations, and combinations of blocks in the flowchart illustrations, can be implemented by special purpose hardware-based computer systems which perform the specified functions or steps, or combinations of special purpose hardware and computer instructions.
In addition, the technical features involved in the different embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Example 1
The present embodiment provides a spatial positioning method, as shown in fig. 1, including the following steps:
S1, synchronously acquiring point cloud data output by the 3D TOF sensor and an RGB image output by the RGB camera;
S2, performing coarse registration with a SIFT algorithm according to the point cloud data and the RGB image to obtain a first matching point pair;
S3, performing fine registration on the first matching point pair according to the spatial constraint relationship to obtain a second matching point pair;
S4, training an RBF neural network according to the second matching point pairs to obtain the correspondence between the point cloud data and the RGB image;
S5, performing filtering, segmentation and extraction processing on the point cloud data to obtain point cloud data of the target; the target comprises a human body and an object, and the object comprises an office desk, a chair and the like;
S6, performing denoising, segmentation and extraction processing on the RGB image to obtain an RGB image of the target;
S7, fusing the point cloud data of the target and the RGB image of the target according to the correspondence between the point cloud data and the RGB image to obtain the target space positioning data, wherein the space positioning data comprise spatial position, shape, size and the like.
According to the space positioning method, the SIFT algorithm and the space constraint relation are combined, so that the acquisition precision of the matching point pairs is improved. The RBF neural network is adopted to obtain the corresponding relation between the point cloud data output by the 3D TOF sensor and the RGB image output by the RGB camera, so that the point cloud data and the RGB image are fused, the spatial target is reconstructed based on the three-dimensional information and the image information, the positioning precision is further improved, and the requirement of real-time and accurate environment sensing under complex conditions can be met. Target space positioning is carried out through a 3D point cloud and image fusion algorithm, technical complementation is formed, and positioning accuracy is improved. By segmenting, extracting and positioning the target person and the target object, the operation space of the virtual background design layout is greatly increased.
Preferably, as shown in fig. 2, the step S2 of performing coarse registration with a SIFT algorithm according to the point cloud data and the RGB image to obtain a first matching point pair includes:
S21, converting the point cloud data into a two-dimensional image; preferably, pixel values in the two-dimensional image are represented by height values in the point cloud data;
S22, extracting first keypoints of the two-dimensional image with the SIFT algorithm to obtain the first keypoint feature vectors F_i^1, i = 1, 2, …, n, where n is the total number of first keypoints and each F_i^1 is a 128-dimensional feature vector;
S23, extracting second keypoints of the RGB image with the SIFT algorithm to obtain the second keypoint feature vectors F_j^2, j = 1, 2, …, m, where m is the total number of second keypoints and each F_j^2 is a 128-dimensional feature vector;
S24, respectively calculating a neighbor index G_ij between each first keypoint and each second keypoint, where μ_i is the mean of the first keypoint feature vector F_i^1, μ_j is the mean of the second keypoint feature vector F_j^2, σ_i is the variance of F_i^1, σ_j is the variance of F_j^2, σ_ij is the covariance of F_i^1 and F_j^2, and C_1, C_2 and C_3 are preset constants;
S25, for each first keypoint, selecting the k second keypoints with the largest neighbor index G_ij as the coarse registration points of that first keypoint, obtaining the first matching point pairs (F_i^1, F_j^2), i = 1, 2, …, n, j = 1, 2, …, k. Because the resolution of the RGB image is fixed, while the resolution of the two-dimensional image converted from the point cloud data can be raised or lowered according to actual requirements, the coarse registration precision obtained by searching the second keypoints extracted from the RGB image on the basis of the first keypoints extracted from the two-dimensional image improves as the resolution of the two-dimensional image increases; the coarse registration precision can thus be adjusted as required, which widens the range of application.
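To make steps S21 to S25 concrete, the sketch below uses OpenCV's SIFT and NumPy. The grid resolution of the height image, the values of C1, C2 and C3, and the exact form of the neighbor index are assumptions: the patent's G_ij formula is not reproduced in this text, so an SSIM-style similarity built from the same statistics (means, variances, covariance and the three constants) stands in for it.

```python
import cv2
import numpy as np

C1, C2, C3 = 1e-4, 1e-4, 1e-4          # placeholder values for the preset constants


def point_cloud_to_height_image(points, resolution=0.01):
    """S21: project an (N, 3) point cloud onto a grid; the pixel value is the point height.

    `resolution` (metres per pixel) is an assumed parameter; raising or lowering it
    changes the two-dimensional image resolution, as the embodiment notes.
    """
    xy = points[:, :2] - points[:, :2].min(axis=0)
    cols, rows = (xy / resolution).astype(int).T
    image = np.zeros((rows.max() + 1, cols.max() + 1), dtype=np.float32)
    np.maximum.at(image, (rows, cols), points[:, 2])   # keep the highest point per pixel
    return image


def neighbor_index(f1, f2):
    """Assumed SSIM-style similarity between two 128-D SIFT descriptors (stand-in for G_ij)."""
    mu_i, mu_j = f1.mean(), f2.mean()
    var_i, var_j = f1.var(), f2.var()
    cov_ij = np.mean((f1 - mu_i) * (f2 - mu_j))
    luminance = (2 * mu_i * mu_j + C1) / (mu_i ** 2 + mu_j ** 2 + C1)
    contrast = (2 * np.sqrt(var_i * var_j) + C2) / (var_i + var_j + C2)
    structure = (cov_ij + C3) / (np.sqrt(var_i * var_j) + C3)
    return luminance * contrast * structure


def coarse_register(height_image, rgb_image, k=3):
    """S22-S25: SIFT keypoints on both images, then keep the k second keypoints with the
    largest neighbor index for every first keypoint as its coarse registration points."""
    sift = cv2.SIFT_create()
    img8 = cv2.normalize(height_image, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    gray = cv2.cvtColor(rgb_image, cv2.COLOR_BGR2GRAY)
    kp1, des1 = sift.detectAndCompute(img8, None)      # first keypoints F_i^1
    kp2, des2 = sift.detectAndCompute(gray, None)      # second keypoints F_j^2
    pairs = []
    for i, f1 in enumerate(des1):
        scores = np.array([neighbor_index(f1, f2) for f2 in des2])
        for j in np.argsort(scores)[-k:]:
            pairs.append((kp1[i], kp2[int(j)]))
    return pairs
```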
Preferably, the step S3 of performing fine registration on the first matching point pair according to the spatial constraint relationship to obtain a second matching point pair includes:
S31, selecting the first matching point pairs that simultaneously satisfy the following three spatial constraint relationships to obtain the second matching point pairs:
the pixel translation distance between the first keypoint and the second keypoint in a first matching point pair is smaller than or equal to a first preset value, which can be set according to actual requirements;
for any two of the first matching point pairs, the difference between the angle formed in the two-dimensional image by the line connecting the two first keypoints and the horizontal direction, and the angle formed in the RGB image by the line connecting the two second keypoints and the horizontal direction, is smaller than or equal to a second preset value ε_1, which can be set according to actual requirements;
for the same two first matching point pairs, the difference between the distance between the two first keypoints in the two-dimensional image and the distance between the two second keypoints in the RGB image is smaller than or equal to a third preset value ε_2, which can be set according to actual requirements. Through these three spatial constraint relationships, the first matching point pairs that do not satisfy them are removed, so the first matching point pairs are finely registered and the registration precision is improved.
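Continuing the sketch, step S31 can be applied to the coarse pairs as follows; the thresholds eps0, eps1 and eps2 stand for the first, second and third preset values, and their numeric defaults are illustrative assumptions.

```python
import numpy as np


def fine_register(pairs, eps0=50.0, eps1=np.radians(5.0), eps2=20.0):
    """S31: keep only the first matching point pairs that satisfy all three constraints.

    `pairs` is the list of (kp1, kp2) OpenCV keypoints from coarse_register().
    """
    # Constraint 1: pixel translation distance between the two keypoints of a pair.
    kept = [(a, b) for a, b in pairs
            if np.hypot(a.pt[0] - b.pt[0], a.pt[1] - b.pt[1]) <= eps0]

    def consistent(p, q):
        (a1, b1), (a2, b2) = p, q
        v1 = np.subtract(a2.pt, a1.pt)    # line between the two first keypoints
        v2 = np.subtract(b2.pt, b1.pt)    # line between the two second keypoints
        # Constraint 2: the two angles to the horizontal differ by at most eps1.
        angle_diff = abs(np.arctan2(v1[1], v1[0]) - np.arctan2(v2[1], v2[0]))
        # Constraint 3: the two inter-keypoint distances differ by at most eps2.
        dist_diff = abs(np.linalg.norm(v1) - np.linalg.norm(v2))
        return angle_diff <= eps1 and dist_diff <= eps2

    # A pair survives only if it is consistent with every other retained pair.
    return [p for p in kept if all(consistent(p, q) for q in kept if q is not p)]
```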
Preferably, the step S4 of training the RBF neural network according to the second matching point pairs to obtain the correspondence between the point cloud data and the RGB image includes:
S41, constructing a conversion matrix model between the first keypoint feature vector and the second keypoint feature vector in the second matching point pairs, taking the first keypoint feature vector in the second matching point pairs as the input of the RBF neural network and the second keypoint feature vector as its expected output, and training the RBF neural network with this input and expected output to obtain the parameters of the conversion matrix model;
S42, inputting any pixel point in the RGB image into the trained RBF neural network to obtain the pixel point in the two-dimensional image corresponding to that pixel point;
S43, obtaining, by the inverse of the process that converts the point cloud data into the two-dimensional image, the spatial position point in the point cloud data corresponding to the pixel point in the two-dimensional image, thereby obtaining the correspondence between the point cloud data and the RGB image. The correspondence between the RGB image and the point cloud data is thus finally obtained through the correspondence between the two-dimensional image and the point cloud data, laying the basis for the subsequent reconstruction of the spatial target.
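For steps S41 to S43, the sketch below fits a smooth mapping from RGB pixel coordinates to height-image pixel coordinates on the second matching point pairs. SciPy's RBFInterpolator is used as an assumed stand-in for the patent's RBF neural network, and the inverse projection back to a spatial position point reuses the grid origin and resolution assumed in the earlier conversion sketch.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator


def fit_rgb_to_cloud_mapping(second_pairs):
    """S41: learn the correspondence on the matched keypoint coordinates.

    The RGB-image keypoint coordinate is the input and the two-dimensional
    (height-image) keypoint coordinate the expected output, mirroring the
    training setup described above.
    """
    rgb_pts = np.array([kp2.pt for kp1, kp2 in second_pairs], dtype=float)
    cloud_pts = np.array([kp1.pt for kp1, kp2 in second_pairs], dtype=float)
    return RBFInterpolator(rgb_pts, cloud_pts, kernel="thin_plate_spline")


def rgb_pixel_to_space_point(mapping, pixel_uv, height_image, resolution=0.01, origin=(0.0, 0.0)):
    """S42-S43: map an RGB pixel to a height-image pixel, then invert the projection
    (assumed grid origin and resolution) to recover the spatial position point."""
    col, row = np.rint(mapping(np.asarray([pixel_uv], dtype=float))[0]).astype(int)
    x = origin[0] + col * resolution
    y = origin[1] + row * resolution
    z = float(height_image[row, col])      # height stored during the conversion step
    return np.array([x, y, z])
```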
Preferably, the step S5 of performing filtering, segmentation and extraction processing on the point cloud data to obtain the point cloud data of the target includes:
S51, removing outliers from the point cloud data with a radius filtering method to obtain filtered point cloud data;
S52, establishing a three-dimensional kd-tree spatial index for the filtered point cloud data; for each spatial position point in the filtered point cloud data, using the kd-tree spatial index to obtain its k nearest points; for each spatial position point and its k nearest points, solving the equation of the fitted plane of that spatial position point by the eigenvalue method, and determining the normal vector of each spatial position point;
S53, performing segmentation and region growing according to the normal vectors to obtain the point cloud data of the target, i.e. the three-dimensional point cloud data of the target, and capturing the target from the background.
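Steps S51 and S52 can be sketched with NumPy and SciPy as below; the radius, minimum-neighbour count and k values are illustrative assumptions, and the region-growing segmentation of S53 (grouping points whose normals are similar) would follow as a separate step.

```python
import numpy as np
from scipy.spatial import cKDTree


def radius_filter(points, radius=0.05, min_neighbors=5):
    """S51: radius filtering, dropping points with too few neighbours inside `radius`."""
    tree = cKDTree(points)
    counts = np.array([len(idx) - 1 for idx in tree.query_ball_point(points, r=radius)])
    return points[counts >= min_neighbors]


def estimate_normals(points, k=10):
    """S52: build a kd-tree index, take the k nearest points of every spatial position
    point, fit a plane by the eigenvalue method and use its normal (the eigenvector of
    the smallest eigenvalue of the local covariance matrix)."""
    tree = cKDTree(points)
    _, knn = tree.query(points, k=k + 1)            # the first neighbour is the point itself
    normals = np.empty_like(points)
    for i, idx in enumerate(knn):
        cov = np.cov(points[idx].T)                 # 3x3 covariance of the neighbourhood
        _, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
        normals[i] = eigvecs[:, 0]
    return normals
```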
Preferably, the step S6 of performing denoising, segmentation and extraction processing on the RGB image to obtain the RGB image of the target includes:
S61, obtaining a model of common targets trained on the COCO data set;
S62, denoising the RGB image to obtain a denoised RGB image;
S63, segmenting the denoised RGB image with the common-target model to obtain the RGB image of the target, thereby capturing the target from the background.
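Steps S61 to S63 can be sketched with OpenCV and a COCO-pretrained Mask R-CNN from torchvision, taken here as an assumed stand-in for the "model of a common target on the COCO data set" (the patent does not name a specific model); the denoising parameters and score threshold are illustrative, and a recent torchvision release is assumed for the weights argument.

```python
import cv2
import torch
import torchvision


def extract_target_rgb(rgb_image_bgr, score_thresh=0.7):
    """S61-S63: denoise the RGB image, segment COCO targets, keep only target pixels."""
    # S62: non-local-means denoising of the colour image
    denoised = cv2.fastNlMeansDenoisingColored(rgb_image_bgr, None, 10, 10, 7, 21)

    # S61: instance segmentation model pretrained on the COCO data set
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()

    # S63: segment the denoised image and keep confident target masks
    rgb = cv2.cvtColor(denoised, cv2.COLOR_BGR2RGB)
    tensor = torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        pred = model([tensor])[0]
    keep = pred["scores"] > score_thresh
    if not bool(keep.any()):
        return denoised                               # no target found; return denoised image
    masks = pred["masks"][keep, 0] > 0.5              # one boolean mask per detected target
    target_mask = masks.any(dim=0).numpy().astype("uint8") * 255
    # capture the target from the background: black out everything outside the masks
    return cv2.bitwise_and(denoised, denoised, mask=target_mask)
```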
Preferably, the step S7 of fusing the point cloud data of the target and the RGB image of the target according to the correspondence between the point cloud data and the RGB image to obtain the target space positioning data includes:
S71, according to the correspondence between the point cloud data and the RGB image, assigning the parameter of each pixel point in the RGB image of the target to the spatial position point in the point cloud data of the target corresponding to that pixel point, and fusing the point cloud data of the target and the RGB image of the target to obtain the target space positioning data, thereby realizing the reconstruction of the target.
Preferably, the step of assigning the parameter of each pixel point in the RGB image of the target to a spatial position point in the point cloud data of the target corresponding to the pixel point includes:
adding pixel parameter items to all spatial position points in the point cloud data of the target for reflecting the colors of the spatial position points;
and for each spatial position point, assigning the pixel value of one pixel point or the weighted average value of the pixel values of more than two pixel points in the RGB image of the target corresponding to the spatial position point to the value of the pixel parameter item of the spatial position point.
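A sketch of the pixel assignment of step S71 is given below, reusing the mapping and grid parameters assumed in the earlier sketches; averaging the RGB pixels that land on the same spatial position point corresponds to the weighted-average option described above, here with equal weights, and the mapping is evaluated naively over the whole image for clarity.

```python
import numpy as np


def colorize_target_cloud(target_points, target_rgb, mapping, resolution=0.01, origin=(0.0, 0.0)):
    """S71: add a pixel parameter to every target point and fill it with the (average)
    value of the RGB pixels that the learned correspondence sends to that point."""
    h, w = target_rgb.shape[:2]

    # Map every RGB pixel (u, v) to its height-image pixel via the trained correspondence.
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    rgb_coords = np.stack([us.ravel(), vs.ravel()], axis=1).astype(float)
    cloud_px = np.rint(mapping(rgb_coords)).astype(int)

    # Group RGB pixel values by the height-image pixel they land on.
    colors = {}
    for (col, row), (u, v) in zip(cloud_px, rgb_coords.astype(int)):
        colors.setdefault((col, row), []).append(target_rgb[v, u].astype(float))

    # Assign each spatial position point the mean colour of its corresponding pixels.
    fused = []
    for p in target_points:
        col = int((p[0] - origin[0]) / resolution)
        row = int((p[1] - origin[1]) / resolution)
        rgb = np.mean(colors[(col, row)], axis=0) if (col, row) in colors else np.zeros(3)
        fused.append(np.concatenate([p, rgb]))        # (x, y, z, B, G, R)
    return np.array(fused)
```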
Example 2
The present embodiment provides a spatial positioning apparatus, as shown in fig. 3, including:
the synchronous acquisition module is used for synchronously acquiring point cloud data output by the 3D TOF sensor and RGB images output by the RGB camera;
the rough registration module is used for carrying out rough registration by adopting an SIFT algorithm according to the point cloud data and the RGB image to obtain a first matching point pair;
the fine registration module is used for performing fine registration on the first matching point pair according to the space constraint relation to obtain a second matching point pair;
the RBF neural network module is used for training an RBF neural network according to the second matching point pairs to obtain the corresponding relation between the point cloud data and the RGB image;
the point cloud target extraction module is used for carrying out filtering, segmentation and extraction processing according to the point cloud data to obtain point cloud data of a target; the target comprises a human body and an object, and the object comprises an office table, a chair and the like;
the image target extraction module is used for carrying out denoising and segmentation extraction processing according to the RGB image to obtain an RGB image of a target;
and the target obtaining module is used for fusing the point cloud data of the target and the RGB image of the target according to the corresponding relation between the point cloud data and the RGB image to obtain target space positioning data, and the space positioning data comprises space positions, shapes, sizes and the like.
According to the space positioning device, the SIFT algorithm and the space constraint relation are combined, so that the acquisition precision of the matching point pairs is improved. The RBF neural network is adopted to obtain the corresponding relation between the point cloud data output by the 3D TOF sensor and the RGB image output by the RGB camera, so that the point cloud data and the RGB image are fused, the spatial target is reconstructed based on the three-dimensional information and the image information, the positioning precision is further improved, and the requirement of real-time and accurate environment sensing under complex conditions can be met. Target space positioning is carried out through a 3D point cloud and image fusion algorithm, technical complementation is formed, and positioning accuracy is improved. By segmenting, extracting and positioning the target person and the target object, the operation space of the virtual background design layout is greatly increased.
Preferably, the coarse registration module comprises:
the two-dimensional image conversion module is used for converting the point cloud data into a two-dimensional image; preferably, pixel values in the two-dimensional image are represented by height values in the point cloud data.
A first SIFT extraction module, configured to extract first keypoints of the two-dimensional image with the SIFT algorithm to obtain the first keypoint feature vectors F_i^1, i = 1, 2, …, n, where n is the total number of first keypoints and each F_i^1 is a 128-dimensional feature vector;
a second SIFT extraction module, configured to extract second keypoints of the RGB image with the SIFT algorithm to obtain the second keypoint feature vectors F_j^2, j = 1, 2, …, m, where m is the total number of second keypoints and each F_j^2 is a 128-dimensional feature vector;
a calculation module, configured to respectively calculate a neighbor index G_ij between each first keypoint and each second keypoint, where μ_i is the mean of the first keypoint feature vector F_i^1, μ_j is the mean of the second keypoint feature vector F_j^2, σ_i is the variance of F_i^1, σ_j is the variance of F_j^2, σ_ij is the covariance of F_i^1 and F_j^2, and C_1, C_2 and C_3 are preset constants;
a first matching point pair obtaining module, configured to select, for each first keypoint, the k second keypoints with the largest neighbor index G_ij as the coarse registration points of that first keypoint, obtaining the first matching point pairs (F_i^1, F_j^2), i = 1, 2, …, n, j = 1, 2, …, k. Because the resolution of the RGB image is fixed, while the resolution of the two-dimensional image converted from the point cloud data can be raised or lowered according to actual requirements, the coarse registration precision obtained by searching the second keypoints extracted from the RGB image on the basis of the first keypoints extracted from the two-dimensional image improves as the resolution of the two-dimensional image increases; the coarse registration precision can thus be adjusted as required, which widens the range of application.
Preferably, the fine registration module includes:
a second matching point pair obtaining module, configured to select the first matching point pair that simultaneously satisfies the following three spatial constraint relationships, and obtain a second matching point pair:
the pixel translation distance between a first key point and a second key point in the first matching point pair is smaller than or equal to a first preset value; the first preset value can be set according to actual requirements;
for any two of the first matching point pairs, the difference between the angle formed in the two-dimensional image by the line connecting the two first keypoints and the horizontal direction, and the angle formed in the RGB image by the line connecting the two second keypoints and the horizontal direction, is smaller than or equal to a second preset value ε_1, which can be set according to actual requirements;
for the same two first matching point pairs, the difference between the distance between the two first keypoints in the two-dimensional image and the distance between the two second keypoints in the RGB image is smaller than or equal to a third preset value ε_2, which can be set according to actual requirements. Through these three spatial constraint relationships, the first matching point pairs that do not satisfy them are removed, so the first matching point pairs are finely registered and the registration precision is improved.
Preferably, the RBF neural network module includes:
the training module is used for constructing a conversion matrix model between a first key point feature vector and a second key point feature vector in the second matching point pair, taking the first key point feature vector in the second matching point pair as the input of the RBF neural network, taking the second key point feature vector in the second matching point pair as the expected output of the RBF neural network, training the RBF neural network by utilizing the input and the expected output, and obtaining various parameters of the conversion matrix model;
the first correspondence obtaining module is used for inputting any pixel point in the RGB image into the trained RBF neural network to obtain the pixel point in the two-dimensional image corresponding to that pixel point;
and the second corresponding relation obtaining module is used for obtaining a spatial position point in the point cloud data corresponding to a pixel point in the two-dimensional image according to the inverse process of the process of converting the point cloud data into the two-dimensional image, and obtaining the corresponding relation between the point cloud data and the RGB image. And finally, obtaining the corresponding relation between the RGB image and the point cloud data according to the corresponding relation between the two-dimensional image and the point cloud data, and establishing a basis for the subsequent reconstruction of the space target.
Preferably, the point cloud target extraction module comprises:
the filtering module is used for removing outliers by adopting a radius filtering method based on the point cloud data to obtain filtering point cloud data;
the normal vector estimation module is used for establishing a three-dimensional kd-tree space index for the filtering point cloud data; aiming at each spatial position point in the filtering point cloud data, solving by using the kd-tree spatial index to obtain k nearest points; for each space position point and k nearest points thereof, solving an equation of a fitting plane of the space position point by adopting a characteristic value method, and determining a normal vector of each space position point;
and the point cloud target segmentation module is used for carrying out segmentation and area growth according to the normal vector to obtain point cloud data of the target, namely three-dimensional point cloud data of the target, and capturing the target from the background.
Preferably, the image object extracting module includes:
the target model acquisition module is used for acquiring a model of a common target on the COCO data set;
the image denoising module is used for denoising the RGB image to obtain a denoised RGB image;
and the image target segmentation module is used for segmenting the de-noised RGB image by using the model of the common target to obtain an RGB image of the target, capturing the target from the background to obtain the RGB image of the target.
Preferably, the object obtaining module includes:
and the pixel assignment module is used for assigning the parameter of each pixel point in the RGB image of the target to a spatial position point in the point cloud data of the target corresponding to the pixel point according to the corresponding relation between the point cloud data and the RGB image, fusing the point cloud data of the target and the RGB image of the target to obtain target spatial positioning data and realize the reconstruction of the target.
Preferably, in terms of hardware, the 3D TOF sensor is a Sony IMX556 industrial-grade VGA-resolution TOF sensor module and the RGB camera is a Sony IMX274 high-definition CMOS sensor module; the system further uses a VCSEL surface light source driver module and a Rockchip RV1126 SoC module with an on-chip ISP and a 2 TOPS NPU, and the connection of the parts is shown in fig. 4. Implementing the method on the single-chip RV1126 SoC reduces the complexity of the system circuit design, saves system space and lowers system cost.
It should be understood that the above examples are only for clarity of illustration and are not intended to limit the embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. And obvious variations or modifications therefrom are within the scope of the invention.
Claims (10)
1. A spatial localization method, comprising the steps of:
synchronously acquiring point cloud data output by a 3D TOF sensor and RGB images output by an RGB camera;
performing coarse registration by adopting an SIFT algorithm according to the point cloud data and the RGB image to obtain a first matching point pair;
performing fine registration on the first matching point pair according to a space constraint relation to obtain a second matching point pair;
training an RBF neural network according to the second matching point pairs to obtain the corresponding relation between the point cloud data and the RGB image;
filtering, segmenting and extracting according to the point cloud data to obtain point cloud data of a target;
denoising, segmenting and extracting according to the RGB image to obtain an RGB image of the target;
and fusing the point cloud data of the target and the RGB image of the target according to the corresponding relation between the point cloud data and the RGB image to obtain the target space positioning data.
2. The method of claim 1, wherein the step of performing coarse registration using a SIFT algorithm based on the point cloud data and the RGB image to obtain a first matching point pair comprises:
converting the point cloud data into a two-dimensional image;
extracting first keypoints of the two-dimensional image with a SIFT algorithm to obtain the first keypoint feature vectors F_i^1, i = 1, 2, …, n, where n is the total number of first keypoints;
extracting second keypoints of the RGB image with the SIFT algorithm to obtain the second keypoint feature vectors F_j^2, j = 1, 2, …, m, where m is the total number of second keypoints;
respectively calculating a neighbor index G_ij between each first keypoint and each second keypoint, where μ_i is the mean of the first keypoint feature vector F_i^1, μ_j is the mean of the second keypoint feature vector F_j^2, σ_i is the variance of F_i^1, σ_j is the variance of F_j^2, σ_ij is the covariance of F_i^1 and F_j^2, and C_1, C_2 and C_3 are preset constants;
for each first keypoint, selecting the k second keypoints with the largest neighbor index G_ij as the coarse registration points of that first keypoint, obtaining the first matching point pairs (F_i^1, F_j^2), i = 1, 2, …, n, j = 1, 2, …, k.
3. The method according to claim 1, wherein the step of fine-registering the first matching point pair according to the spatial constraint relationship to obtain a second matching point pair comprises:
selecting the first matching point pair which simultaneously meets the following three spatial constraint relations to obtain a second matching point pair:
the pixel translation distance between a first key point and a second key point in the first matching point pair is smaller than or equal to a first preset value;
for any two of the first matching point pairs, the difference between the angle formed in the two-dimensional image by the line connecting the two first keypoints and the horizontal direction, and the angle formed in the RGB image by the line connecting the two second keypoints and the horizontal direction, is smaller than or equal to a second preset value ε_1;
for the same two first matching point pairs, the difference between the distance between the two first keypoints in the two-dimensional image and the distance between the two second keypoints in the RGB image is smaller than or equal to a third preset value ε_2.
4. The method of claim 1, wherein the step of training the RBF neural network according to the second matching point pair to obtain the corresponding relationship between the point cloud data and the RGB image comprises:
constructing a conversion matrix model between a first key point feature vector and a second key point feature vector in the second matching point pair, taking the first key point feature vector in the second matching point pair as the input of the RBF neural network, taking the second key point feature vector in the second matching point pair as the expected output of the RBF neural network, training the RBF neural network by utilizing the input and the expected output, and obtaining various parameters of the conversion matrix model;
inputting any pixel point in the RGB image into the trained RBF neural network to obtain the pixel point in the two-dimensional image corresponding to that pixel point;
and according to the inverse process of the process of converting the point cloud data into the two-dimensional image, obtaining a spatial position point in the point cloud data corresponding to the pixel point in the two-dimensional image, and obtaining a corresponding relation between the point cloud data and the RGB image.
5. The method according to any one of claims 1 to 4, wherein the step of fusing the point cloud data of the target and the RGB image of the target according to the corresponding relationship between the point cloud data and the RGB image to obtain the object space positioning data comprises:
and according to the corresponding relation between the point cloud data and the RGB image, assigning the parameter of each pixel point in the RGB image of the target to a spatial position point in the point cloud data of the target corresponding to the pixel point, and fusing the point cloud data of the target and the RGB image of the target to obtain target spatial positioning data.
6. A spatial locator device, comprising:
the synchronous acquisition module is used for synchronously acquiring point cloud data output by the 3D TOF sensor and RGB images output by the RGB camera;
the rough registration module is used for carrying out rough registration by adopting an SIFT algorithm according to the point cloud data and the RGB image to obtain a first matching point pair;
the fine registration module is used for performing fine registration on the first matching point pair according to the space constraint relation to obtain a second matching point pair;
the RBF neural network module is used for training an RBF neural network according to the second matching point pairs to obtain the corresponding relation between the point cloud data and the RGB image;
the point cloud target extraction module is used for carrying out filtering, segmentation and extraction processing according to the point cloud data to obtain point cloud data of a target;
the image target extraction module is used for carrying out denoising and segmentation extraction processing according to the RGB image to obtain an RGB image of a target;
and the target obtaining module is used for fusing the point cloud data of the target and the RGB image of the target according to the corresponding relation between the point cloud data and the RGB image to obtain target space positioning data.
7. The apparatus of claim 6, wherein the coarse registration module comprises:
the two-dimensional image conversion module is used for converting the point cloud data into a two-dimensional image;
a first SIFT extraction module, configured to extract first keypoints of the two-dimensional image with a SIFT algorithm to obtain the first keypoint feature vectors F_i^1, i = 1, 2, …, n, where n is the total number of first keypoints;
a second SIFT extraction module, configured to extract second keypoints of the RGB image with the SIFT algorithm to obtain the second keypoint feature vectors F_j^2, j = 1, 2, …, m, where m is the total number of second keypoints;
a calculation module, configured to respectively calculate a neighbor index G_ij between each first keypoint and each second keypoint, where μ_i is the mean of the first keypoint feature vector F_i^1, μ_j is the mean of the second keypoint feature vector F_j^2, σ_i is the variance of F_i^1, σ_j is the variance of F_j^2, σ_ij is the covariance of F_i^1 and F_j^2, and C_1, C_2 and C_3 are preset constants;
a first matching point pair obtaining module, configured to select, for each first keypoint, the k second keypoints with the largest neighbor index G_ij as the coarse registration points of that first keypoint, obtaining the first matching point pairs (F_i^1, F_j^2), i = 1, 2, …, n, j = 1, 2, …, k.
8. The apparatus of claim 6, wherein the fine registration module comprises:
a second matching point pair obtaining module, configured to select the first matching point pair that simultaneously satisfies the following three spatial constraint relationships, and obtain a second matching point pair:
the pixel translation distance between a first key point and a second key point in the first matching point pair is smaller than or equal to a first preset value;
for any two of the first matching point pairs, the difference between the angle formed in the two-dimensional image by the line connecting the two first keypoints and the horizontal direction, and the angle formed in the RGB image by the line connecting the two second keypoints and the horizontal direction, is smaller than or equal to a second preset value ε_1;
for the same two first matching point pairs, the difference between the distance between the two first keypoints in the two-dimensional image and the distance between the two second keypoints in the RGB image is smaller than or equal to a third preset value ε_2.
9. The apparatus of claim 6, wherein the RBF neural network module comprises:
the training module is used for constructing a conversion matrix model between a first key point feature vector and a second key point feature vector in the second matching point pair, taking the first key point feature vector in the second matching point pair as the input of the RBF neural network, taking the second key point feature vector in the second matching point pair as the expected output of the RBF neural network, training the RBF neural network by utilizing the input and the expected output, and obtaining various parameters of the conversion matrix model;
the first correspondence obtaining module is used for inputting any pixel point in the RGB image into the trained RBF neural network to obtain the pixel point in the two-dimensional image corresponding to that pixel point;
and the second corresponding relation obtaining module is used for obtaining a spatial position point in the point cloud data corresponding to a pixel point in the two-dimensional image according to the inverse process of the process of converting the point cloud data into the two-dimensional image, and obtaining the corresponding relation between the point cloud data and the RGB image.
10. The apparatus of any of claims 6-9, wherein the target obtaining module comprises:
and the pixel assignment module is used for assigning the parameter of each pixel point in the RGB image of the target to a spatial position point in the point cloud data of the target corresponding to the pixel point according to the corresponding relation between the point cloud data and the RGB image, fusing the point cloud data of the target and the RGB image of the target and obtaining target spatial positioning data.
Priority Applications (1)
- CN202110675145.1A: Space positioning method and device (priority date 2021-06-18, filing date 2021-06-18)

Publications (2)
- CN113379663A: published 2021-09-10
- CN113379663B: granted 2022-04-12

Family
- ID=77577637

Family Applications (1)
- CN202110675145.1A: Space positioning method and device (filed 2021-06-18, status Active)

Country Status (1)
- CN: CN113379663B

Citations (2)
- CN110415342A (priority 2019-08-02, published 2019-11-05): Three-dimensional point cloud reconstruction device and method based on multiple fused sensors
- CN112802073A (priority 2021-04-08, published 2021-05-14): Fusion registration method based on image data and point cloud data

Family Cites Families (3)
- US11379688B2 (priority 2017-03-16, granted 2022-07-05): Systems and methods for keypoint detection with convolutional neural networks
- CN110705574B (priority 2019-09-27, granted 2023-06-02): Positioning method and device, equipment and storage medium
- CN112836734A (priority 2021-01-27, published 2021-05-25): Heterogeneous data fusion method and device and storage medium
Legal Events
- PB01: Publication
- SE01: Entry into force of request for substantive examination
- GR01: Patent grant