CN114140539A - Method and device for acquiring position of indoor object - Google Patents

Method and device for acquiring position of indoor object

Info

Publication number
CN114140539A
Authority
CN
China
Prior art keywords
camera
coordinate system
image
laser
recognized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111472804.8A
Other languages
Chinese (zh)
Inventor
王旭
何为
白世杰
陈永丰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jianke Public Facilities Operation Management Co ltd
Original Assignee
Jianke Public Facilities Operation Management Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jianke Public Facilities Operation Management Co ltd filed Critical Jianke Public Facilities Operation Management Co ltd
Priority to CN202111472804.8A
Publication of CN114140539A
Legal status: Pending (current)


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T7/85 Stereo camera calibration
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00 Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88 Lidar systems specially adapted for specific applications
    • G01S17/89 Lidar systems specially adapted for specific applications for mapping or imaging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10012 Stereo images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Electromagnetism (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Optical Radar Systems And Details Thereof (AREA)

Abstract

The invention relates to a method and device for acquiring the position of an indoor object, belongs to the technical field of indoor space positioning, and solves the problem that existing indoor positioning has large position errors and cannot effectively reflect the recording and updating of building information. The method performs high-precision spatial calibration of the multi-sensor to obtain the extrinsic parameters from the camera's local coordinate system to the base coordinate center; acquires a laser point set and an image point set of the indoor space through the multi-sensor to position the multi-line lidar; acquires a laser point set and an image point set of the object to be recognized in the indoor space through the multi-sensor and detects the position of the object to be recognized in the camera coordinate system with a target recognition model, which comprises a point cloud feature extractor for extracting lidar features, an image feature extractor for extracting visual semantic features, and a fusion module for effectively fusing the lidar features with the visual semantic features; and acquires the absolute position of the object to be recognized in the world coordinate system through coordinate transformation. The positioning error of indoor objects can thereby be reduced.

Description

Method and device for acquiring position of indoor object
Technical Field
The invention relates to the technical field of indoor space positioning, in particular to a method and a device for acquiring the position of an indoor object.
Background
Lidar is a novel measuring instrument that uses the principle of laser ranging to survey the space to be scanned. The straight-line distance from the laser to the target is obtained by measuring, point by point, the phase difference (time difference) between the emitted laser signal and the signal reflected by the target. The spatial position of the target point is then obtained from the direction in which the laser signal is emitted and the spatial position of the laser. Through dense scanning of the surface of the object to be scanned, a three-dimensional surface model of the object can be obtained.
BIM technology is also one of the most popular topics in science and technology; it attaches information to three-dimensional models so that the whole life cycle of a building can be visualized realistically. This naturally raises the question: is it possible to develop a system that combines the accuracy and scalability of the laser point cloud with the information richness and flexibility of the BIM model? In the indoor positioning technologies currently on the market, whether based on GPS or Bluetooth, the error exceeds the acceptable range, so the recording and updating of building information cannot be effectively reflected.
Disclosure of Invention
In view of the foregoing analysis, embodiments of the present invention provide a method and an apparatus for acquiring the position of an indoor object, so as to solve the problem that the recording and updating of building information cannot be effectively reflected because the positioning error of existing indoor objects is large.
In one aspect, an embodiment of the present invention provides a method for acquiring a position of an indoor object, including: carrying out high-precision spatial calibration on a multi-sensor to obtain an external parameter from a local coordinate system of a camera to a basic coordinate center, wherein the multi-sensor comprises a multi-line laser radar, a fixed laser radar and a camera; acquiring a laser point set and an image point set of an indoor space through the multi-sensor to position the multi-line laser radar; acquiring a laser point set and an image point set of an object to be recognized in the indoor space through the multi-sensor, and detecting the position of the object to be recognized in a camera coordinate system by using a target recognition model, wherein the target recognition model comprises a point cloud feature extractor for extracting laser radar features, an image feature extractor for extracting visual semantic features and a fusion module for effectively fusing the laser radar features and the visual semantic features; and acquiring the absolute position of the object to be recognized in the world coordinate system through coordinate transformation.
The beneficial effects of the above technical scheme are as follows: by performing high-precision spatial calibration of the multi-sensor, positioning the multi-line lidar, and using the target recognition model to detect the position of the object to be recognized in the camera coordinate system, this application can greatly reduce the error in the positioning of indoor objects. In addition, the absolute position of the object to be recognized in the world coordinate system is acquired through coordinate transformation, so that the target to be recognized can be added to the building information model (BIM), and the recording and updating of building information can be effectively reflected.
In a further improvement of the above method, the lidar characteristic comprises geometry and depth information of a point cloud; the visual semantic features comprise color and texture information of the image; and the fusion module is used for fusing the geometrical structure and the depth information of the point cloud with the color and texture information of the image to generate a 3D model of the object to be identified.
Based on further improvement of the method, the point cloud feature extractor comprises a multi-scale point cloud feature extractor for extracting the geometric structure and the depth information of the multi-scale point cloud; the image feature extractor comprises a multi-scale image feature extractor which is used for extracting color and texture information of the multi-scale image; and the fusion module comprises a multi-scale fusion module and is used for fusing the geometrical structures and the depth information of the point clouds with different scales and the color and the texture information of the image with the corresponding scale layer by layer.
Based on a further improvement of the above method, acquiring, by the multi-sensor, a laser point set and an image point set of an object to be recognized located in the indoor space, and detecting a position of the object to be recognized in a camera coordinate system by using a target recognition model further includes: establishing a deep learning neural network, and training the deep learning neural network by using the marked visible light image and the marked laser point cloud image to obtain the target recognition model; and shooting a visible light image of the target to be recognized by using the camera, scanning a laser point cloud picture of the target to be recognized by using the multi-line laser radar and the fixed laser radar, and inputting the visible light image of the target to be recognized and the laser point cloud picture of the target to be recognized into the target recognition model to obtain the type and the position of the target to be recognized in the camera coordinate system.
Based on further improvement of the method, the following loss functions are constructed, and the deep learning neural network is trained by using the marked visible light image and the marked laser point cloud image based on the loss functions:
[Equation image BDA0003383193120000031]
wherein D and G represent the predicted bounding box and the true bounding box, respectively, and c represents the classification confidence of D; the total residual function L_total is defined as follows:
L_total = L_rpn + L_rcnn
wherein L_rpn and L_rcnn represent the residual functions of the RPN and RCNN sub-networks, respectively, the RPN network is used for generating candidate boxes, and the RCNN refinement network is used for optimizing the target detection boxes.
Based on a further improvement of the above method, acquiring a set of laser points and a set of image points of an indoor space by the multi-sensor to locate the multiline lidar further comprises: performing high-precision positioning based on feature matching, wherein performing high-precision positioning based on feature matching further comprises: calculating the inclination angle omega of the laser beam of the laser point compared with the horizontal plane of the laser radar according to the coordinates (x, y, z) of the laser point:
ω = arctan( z / √(x² + y²) )
the relative pose of the (k + 1) th frame and the (k) th frame is as follows:
[Equation image BDA0003383193120000033]
Transforming the points of the (k + 1)th frame into the kth frame coordinate system:
[Equation image BDA0003383193120000041]
where p_i is a line feature and p̃_i (the point transformed into the kth frame) denotes the predicted line feature;
constructing a residual function of the line feature and the surface feature to solve a pose vector Twl under a world coordinate system:
d_ε = |(p̃_i - p_a) × (p̃_i - p_b)| / |p_a - p_b|
d_H = |(p̃_i - p_m)·((p_j - p_m) × (p_l - p_m))| / |(p_j - p_m) × (p_l - p_m)|
where |p_a - p_b| is the length of the line feature; when p_i is a line feature, the nearest line feature point p_a is searched for in the previous frame and a line feature point p_b is found on the adjacent scan line to form a straight line; when p_i is a plane feature, the nearest plane feature point p_m is searched for in the previous frame and two plane feature points p_j and p_l are found on the adjacent scan line to form a plane.
The beneficial effects of the above technical scheme are as follows: the multi-line lidar is positioned using the laser point set and the image point set acquired by the portable laser scanner so as to obtain the pose of the multi-line lidar in the world coordinate system, which avoids the situation in which, owing to insufficient system precision, an object against a wall is localized and added to the adjacent room on the other side of that wall.
Based on the further improvement of the method, the high-precision spatial calibration of the multi-sensor to obtain the external parameter from the local coordinate system of the camera to the base coordinate center further comprises: jointly calibrating the multiline lidar and the camera to obtain rotation and translation of the multiline lidar relative to the camera; and jointly calibrating the multiline lidar and the solid state lidar to calculate extrinsic parameters between the multiline lidar and the stationary lidar, wherein jointly calibrating the multiline lidar and the camera further comprises:
Zc·[u, v, 1]^T = K·[R | t]·[Xw, Yw, Zw, 1]^T
wherein Zc is a scale parameter; (Xw, Yw, Zw) is the world coordinate system; (u, v) are pixel coordinates; the camera coordinate system takes the optical axis of the camera as the z-axis, the center of the rays in the camera's optical system as the origin Oc, and the camera axes Xc and Yc are parallel to the image coordinate axes x and y, respectively; the distance f between the camera coordinate origin and the origin of the image coordinate system is the focal length;
K is the camera intrinsic parameter matrix, which is solved using Zhang Zhengyou's calibration method;
[R | t] is the camera extrinsic parameter matrix; and performing extrinsic parameter joint rough calibration of the multi-line laser radar and the camera; and performing extrinsic parameter joint fine calibration of the multi-line laser radar and the camera.
Based on the further improvement of the above method, jointly calibrating the multi-line lidar and the solid-state lidar to calculate the extrinsic parameters between the two lidars further comprises: collecting point cloud data of the two lidars in a standard indoor space; extracting planar features from the point cloud data; matching the planar features; after the plane feature matching is completed, solving initial values of R and t by using singular value decomposition; and establishing an optimization function with the square of the point-to-plane distance as the objective function.
Based on a further improvement of the above method, obtaining the position Pw of the target to be recognized in the world coordinate system by the following formula further includes:
Pw=Twl*Tlc*Pc
the position of the target to be recognized in the camera coordinate system is Pc; the pose of the multi-line laser radar under the world coordinate system is Twl; and the extrinsic parameter from the camera local coordinate system to the base coordinate center is Tlc.
In another aspect, an embodiment of the present invention provides an apparatus for acquiring a position of an indoor object, including: the calibration module is used for carrying out high-precision spatial calibration on the multi-sensor to obtain external parameters from a local coordinate system of the camera to a basic coordinate center, wherein the multi-sensor comprises a multi-line laser radar, a fixed laser radar and a camera; the positioning module is used for acquiring a laser point set and an image point set of an indoor space through the multi-sensor so as to position the multi-line laser radar; the target recognition model is used for acquiring a laser point set and an image point set of an object to be recognized in the indoor space through the multi-sensor and detecting the position of the object to be recognized in a camera coordinate system by using the target recognition model, wherein the target recognition model comprises a point cloud feature extractor for extracting laser radar features, an image feature extractor for extracting visual semantic features and a fusion module for effectively fusing the laser radar features and the visual semantic features; and the coordinate transformation module is used for acquiring the absolute position of the object to be identified in the world coordinate system through coordinate transformation.
Compared with the prior art, the invention can realize at least one of the following beneficial effects:
1. By performing high-precision spatial calibration of the multi-sensor, positioning the multi-line lidar, and using the target recognition model to detect the position of the object to be recognized in the camera coordinate system, this application can greatly reduce the error in the positioning of indoor objects. In addition, the absolute position of the object to be recognized in the world coordinate system is acquired through coordinate transformation, so that the target to be recognized can be added to the building information model (BIM), and the recording and updating of building information can be effectively reflected.
2. The multi-line lidar is positioned using the laser point set and the image point set acquired by the portable laser scanner so as to obtain the pose of the multi-line lidar in the world coordinate system, which avoids the situation in which, owing to insufficient system precision, an object against a wall is localized and added to the adjacent room on the other side of that wall.
3. Lidar features (geometry) and visual semantic features (texture) are fused at multiple levels through several fusion-module networks connected between the point cloud branch network and the image branch network.
4. The consistency between object classification and localization confidence can be further improved by designing a new loss function.
In the invention, the technical schemes can be combined with each other to realize more preferable combination schemes. Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
The drawings are only for purposes of illustrating particular embodiments and are not to be construed as limiting the invention, wherein like reference numerals are used to designate like parts throughout.
Fig. 1 is a flowchart of a position acquisition method of an indoor object according to an embodiment of the present invention;
Fig. 2 is a schematic illustration of the rotation and translation of the lidar relative to the camera;
Fig. 3 is a diagram of a target image acquired by the camera and a laser point cloud acquired by the lidar;
Fig. 4 is a diagram of extracting the circle centers for a target given in advance;
Fig. 5 is a diagram of the local coordinate systems established for the multi-line lidar and the solid-state lidar;
Fig. 6 is a flowchart of the high-precision positioning method of the multi-line lidar based on direct matching;
Fig. 7 shows the coordinate system of the multi-line lidar;
Fig. 8 is a laser point cloud with the laser-point-to-radar distances and curvatures;
Fig. 9 is a diagram of the fusion of a point cloud branch and an image branch according to an embodiment of the invention;
Fig. 10 is a diagram of a deep network architecture according to an embodiment of the present invention;
Fig. 11 is a structural diagram of the fusion module network architecture according to an embodiment of the present invention;
Fig. 12 is a schematic diagram of the spatial transformation between the lidar and the camera according to an embodiment of the invention; and
Fig. 13 is a block diagram of an automatic identification and positioning device for indoor objects according to an embodiment of the invention.
Detailed Description
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate preferred embodiments of the invention and together with the description, serve to explain the principles of the invention and not to limit the scope of the invention.
To enjoy the advantages of both laser point clouds and BIM models, an indoor space localization technique of extremely high precision must be put into practice.
The invention discloses a method for acquiring the position of an indoor object. Referring to fig. 1, a position acquisition method of an indoor object includes: in step S102, carrying out high-precision space calibration on a multi-sensor to obtain an external parameter from a local coordinate system of a camera to a basic coordinate center, wherein the multi-sensor comprises a multi-line laser radar, a fixed laser radar and a camera; in step S104, a laser point set and an image point set of an indoor space are obtained through a multi-sensor so as to position the multi-line laser radar; in step S106, a laser point set and an image point set of an object to be recognized in an indoor space are obtained through a multi-sensor, and the position of the object to be recognized in a camera coordinate system is detected by using a target recognition model, wherein the target recognition model comprises a point cloud feature extractor for extracting laser radar features, an image feature extractor for extracting visual semantic features and a fusion module for effectively fusing the laser radar features and the visual semantic features; and in step S108, acquiring the absolute position of the object to be recognized in the world coordinate system through coordinate transformation.
Compared with the prior art, in the position acquisition method for an indoor object provided by this embodiment, high-precision spatial calibration is performed on the multi-sensor, the multi-line lidar is positioned, and the position of the object to be recognized in the camera coordinate system is detected with the target recognition model, so that the positioning error of the indoor object can be greatly reduced. In addition, the absolute position of the object to be recognized in the world coordinate system is acquired through coordinate transformation, so that the target to be recognized can be added to the building information model (BIM), and the recording and updating of building information can be effectively reflected.
Hereinafter, the respective steps of the position acquiring method of the indoor object according to the embodiment of the present invention will be described in detail with reference to fig. 1.
In step S102, a multi-sensor is calibrated in high precision space to obtain an external parameter from a local coordinate system of the camera to a base coordinate center, wherein the multi-sensor includes a multi-line lidar, a fixed lidar and a camera. Performing high-precision spatial calibration on the multi-sensor to obtain the external parameters of the local coordinate system of the camera to the base coordinate center further comprises: jointly calibrating the multi-line laser radar and the camera to obtain the rotation and translation of the multi-line laser radar relative to the camera; and jointly calibrating the multi-line laser radar and the solid laser radar to calculate external parameters between the multi-line laser radar and the fixed laser radar.
Specifically, the jointly calibrating the multiline lidar and the camera further comprises:
Zc·[u, v, 1]^T = K·[R | t]·[Xw, Yw, Zw, 1]^T
wherein Zc is a scale parameter; (Xw, Yw, Zw) is the world coordinate system; (u, v) are pixel coordinates; the camera coordinate system takes the optical axis of the camera as the z-axis, the center of the rays in the camera's optical system as the origin Oc, and the camera axes Xc and Yc are parallel to the image coordinate axes x and y, respectively; the distance f between the camera coordinate origin and the origin of the image coordinate system is the focal length;
K is the camera intrinsic parameter matrix, which is solved using Zhang Zhengyou's calibration method;
[R | t] is the camera extrinsic parameter matrix; performing extrinsic parameter joint rough calibration of the multi-line laser radar and the camera; and performing extrinsic parameter joint fine calibration of the multi-line laser radar and the camera.
Specifically, jointly calibrating the multi-line lidar and the solid-state lidar to calculate the extrinsic parameters between the two lidars further comprises: collecting point cloud data of the two lidars in a standard indoor space; extracting planar features from the point cloud data; matching the planar features; after the plane feature matching is completed, solving initial values of R and t by using singular value decomposition; and establishing an optimization function with the square of the point-to-plane distance as the objective function.
In step S104, a set of laser points and a set of image points of the indoor space are acquired by the multi-sensor to locate the multiline lidar. Acquiring a set of laser points and a set of image points of an indoor space with a multi-sensor to locate a multiline lidar further comprises: and carrying out high-precision positioning based on feature matching. Performing high-precision positioning based on feature matching further comprises: calculating the inclination angle omega of the laser beam of the laser point compared with the horizontal plane of the laser radar according to the coordinates (x, y, z) of the laser point:
ω = arctan( z / √(x² + y²) )
the relative pose of the (k + 1) th frame and the (k) th frame is as follows:
[Equation image BDA0003383193120000094]
Transforming the points of the (k + 1)th frame into the kth frame coordinate system:
[Equation image BDA0003383193120000095]
where p_i is a line feature and p̃_i (the point transformed into the kth frame) denotes the predicted line feature;
constructing a residual function of the line feature and the surface feature to solve a pose vector Twl under a world coordinate system:
d_ε = |(p̃_i - p_a) × (p̃_i - p_b)| / |p_a - p_b|
d_H = |(p̃_i - p_m)·((p_j - p_m) × (p_l - p_m))| / |(p_j - p_m) × (p_l - p_m)|
where |p_a - p_b| is the length of the line feature; when p_i is a line feature, the nearest line feature point p_a is searched for in the previous frame and a line feature point p_b is found on the adjacent scan line to form a straight line; when p_i is a plane feature, the nearest plane feature point p_m is searched for in the previous frame and two plane feature points p_j and p_l are found on the adjacent scan line to form a plane.
In step S106, a laser point set and an image point set of an object to be recognized located in an indoor space are obtained through a multi-sensor, and a target recognition model is used to detect a position of the object to be recognized in a camera coordinate system, wherein the target recognition model includes a point cloud feature extractor for extracting laser radar features, an image feature extractor for extracting visual semantic features, and a fusion module for effectively fusing the laser radar features and the visual semantic features.
The lidar features include geometry and depth information of the point cloud. The visual semantic features include color and texture information of the image. The fusion module is used for fusing the geometrical structure and the depth information of the point cloud with the color and texture information of the image to generate a 3D model of the object to be identified. Specifically, the point cloud feature extractor includes a multi-scale point cloud feature extractor for extracting geometric structure and depth information of the multi-scale point cloud. The image feature extractor includes a multi-scale image feature extractor for extracting color and texture information of the multi-scale image. The fusion module comprises a multi-scale fusion module and is used for fusing the geometrical structure and the depth information of the point clouds with different scales and the color and the texture information of the image with the corresponding scales layer by layer.
Specifically, acquiring a laser point set and an image point set of an object to be recognized in an indoor space through a multi-sensor, and detecting a position of the object to be recognized under a camera coordinate system by using a target recognition model further comprises: establishing a deep learning neural network, and training the deep learning neural network by using the marked visible light image and the marked laser point cloud image to obtain a target recognition model; and shooting a visible light image of the target to be recognized by using a camera, scanning a laser point cloud picture of the target to be recognized by using the multi-line laser radar and the fixed laser radar, and inputting the visible light image of the target to be recognized and the laser point cloud picture of the target to be recognized into a target recognition model to obtain the type and the position of the target to be recognized in a camera coordinate system. Constructing the following loss functions, and training the deep learning neural network by using the marked visible light image and the marked laser point cloud image based on the loss functions:
[Equation image BDA0003383193120000111]
where D and G represent the predicted bounding box and the true bounding box, respectively, and c represents the classification confidence of D;
the total residual function L_total is defined as follows:
L_total = L_rpn + L_rcnn
where L_rpn and L_rcnn represent the residual functions of the RPN and RCNN sub-networks, respectively; the RPN network is used for generating the candidate boxes, and the RCNN refinement network is used for optimizing the target detection boxes.
In step S108, the absolute position of the object to be recognized in the world coordinate system is acquired through coordinate transformation. The step of acquiring the position Pw of the target to be recognized in the world coordinate system by the following formula further comprises:
Pw=Twl*Tlc*Pc
the position of a target to be recognized in a camera coordinate system is Pc; the pose of the multiline laser radar under the world coordinate system is Twl; and the extrinsic parameter from the camera local coordinate system to the base coordinate center is Tlc.
In step S110, the target to be recognized is added to the Building Information Model (BIM) according to its position in the world coordinate system. Specifically, BIM creates, through digital means, a virtual building in a computer that provides a single, complete building information base containing logical relationships. Here, "information" means not only visual information describing geometric shapes, but also a great deal of non-geometric information, such as the fire-resistance grade and heat transfer coefficient of materials, the construction cost and procurement information of components, and the like.
The invention discloses a position acquisition device for an indoor object. Referring to fig. 13, the position acquiring apparatus of the indoor object according to the embodiment of the present invention includes: a calibration module 1302, configured to perform high-precision spatial calibration on a multi-sensor to obtain an external parameter from a local coordinate system of a camera to a base coordinate center, where the multi-sensor includes a multi-line lidar, a fixed lidar and a camera; a positioning module 1304, configured to acquire a laser point set and an image point set of an indoor space through multiple sensors to position a multi-line lidar; the target recognition model 1306 is used for acquiring a laser point set and an image point set of an object to be recognized in an indoor space through a multi-sensor, and detecting the position of the object to be recognized in a camera coordinate system by using the target recognition model, wherein the target recognition model comprises a point cloud feature extractor for extracting laser radar features, an image feature extractor for extracting visual semantic features and a fusion module for effectively fusing the laser radar features and the visual semantic features; and a coordinate transformation module 1308, configured to obtain an absolute position of the object to be recognized in the world coordinate system through coordinate transformation.
Hereinafter, a position acquisition method of an indoor object according to an embodiment of the present invention will be described in detail by way of specific examples with reference to fig. 2 to 12.
The technology of the handheld intelligent monitoring equipment involves appearance and structure design, hardware system design, software system design (embedded software, acquisition software and intelligent processing software), core algorithm design, and so on. The core algorithm is the key to this application: the soundness of its design directly affects the overall performance of the equipment, and this part mainly gives a detailed description of the principles of the core algorithm.
The core algorithm of the handheld intelligent monitoring equipment mainly comprises the following four parts: the system comprises a multi-sensor high-precision space calibration algorithm, a high-precision positioning algorithm based on a laser radar, a multi-modal target recognition algorithm based on deep learning and a target position acquisition algorithm based on multi-sensor fusion. The four sections will be described in detail one by one hereinafter.
1. Multi-sensor high-precision space calibration algorithm
The core sensors in the system are 1 multi-line laser radar, 1 solid-state laser radar and 1 visible light camera, so sensor calibration mainly means that, through careful design, a set of rigorous algorithms and operating procedures is developed to finally obtain high-precision extrinsic parameters between each pair of the three sensors, where the extrinsic parameters are represented by a 4 x 4 spatial transformation matrix or by a combination of a position vector and an Euler-angle vector (or quaternion).
Euler angle and rotation matrix transformation relation:
assuming that the rotation is performed in the order of Z-Y-X (2,1,0) and the rotation angles are phi, theta, psi, respectively, the rotation matrix is expressed as follows:
R = Rz(φ)·Ry(θ)·Rx(ψ), where Rz, Ry and Rx are the elementary rotations about the z-, y- and x-axes by φ, θ and ψ, respectively.
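For illustration only, a minimal numpy sketch of this Z-Y-X convention and of packing a position vector and rotation into the 4 x 4 spatial transformation matrix mentioned above (function and variable names are illustrative assumptions):

```python
import numpy as np

def euler_zyx_to_rotation(phi, theta, psi):
    """R = Rz(phi) @ Ry(theta) @ Rx(psi): rotation applied in the order Z-Y-X (angles in radians)."""
    cz, sz = np.cos(phi), np.sin(phi)
    cy, sy = np.cos(theta), np.sin(theta)
    cx, sx = np.cos(psi), np.sin(psi)
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return Rz @ Ry @ Rx

def to_transform(R, position):
    """Pack a rotation matrix and a position vector into a 4 x 4 spatial transformation matrix."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = position
    return T

# Example: extrinsics expressed as a position vector plus Euler angles
T = to_transform(euler_zyx_to_rotation(0.1, -0.05, 0.02), np.array([0.2, 0.0, 0.15]))
print(T)
```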
in the embodiment, the transformation relation from the visible light camera and the solid-state laser to the multi-line laser is respectively calibrated by taking the multi-line laser as a basic coordinate reference.
1.1 Multi-line laser and Camera Joint calibration
As shown in fig. 2, the result of the joint calibration of the lidar and the camera is a rotation and translation of the lidar relative to the camera.
The joint calibration is carried out in two steps: firstly calibrating the internal parameters of the camera and then jointly calibrating the external parameters.
1.1.1 Camera intrinsic parameter calibration
Intrinsic parameter calibration concept: in image measurement and computer vision, in order to determine the relationship between the three-dimensional geometric position of a point on a spatial object and its corresponding point in the image, a geometric model of camera imaging must be established. The parameters of this model are the camera parameters, and the process of solving for these parameters is called camera calibration.
The experimental procedure of intrinsic parameter calibration is as follows (a code sketch is given after the list):
(1) A checkerboard A4 sheet (with known black-and-white square spacing) was printed and attached to a flat plate.
(2) Several pictures (typically 10-20) of the checkerboard were taken.
(3) Feature points (Harris features) are detected in the pictures.
(4) The 5 intrinsic parameters and the 6 extrinsic parameters were calculated using an analytical solution estimation method.
(5) An optimization objective was designed and the parameters were refined according to a maximum likelihood estimation strategy.
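A minimal OpenCV sketch of steps (1)-(5) above (board size, square spacing and file paths are assumed placeholders):

```python
import glob
import cv2
import numpy as np

BOARD = (9, 6)          # inner corners per row/column (assumed)
SQUARE = 0.025          # checkerboard square size in metres (assumed)

# 3D object points of the board corners in the board's own plane (Z = 0)
objp = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2) * SQUARE

obj_points, img_points = [], []
gray = None
for path in glob.glob("calib_images/*.png"):       # 10-20 checkerboard photos
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, BOARD)
    if not found:
        continue
    corners = cv2.cornerSubPix(
        gray, corners, (11, 11), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
    obj_points.append(objp)
    img_points.append(corners)

# calibrateCamera refines the parameters by minimising reprojection error
# (maximum-likelihood style), returning K, distortion and per-view R|t.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("RMS reprojection error:", rms)
print("Intrinsic matrix K:\n", K)
```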
The imaging process of the camera can be expressed as:
Zc·[u, v, 1]^T = K·[R | t]·[Xw, Yw, Zw, 1]^T
Here (Xw, Yw, Zw) is the world coordinate system, (Xc, Yc, Zc) are the camera coordinates, and (u, v) are the pixel coordinates. The camera coordinate system takes the optical axis of the camera as its z-axis; the center of the rays in the camera's optical system is the origin Oc; the camera axes Xc and Yc are parallel to the image coordinate axes x and y, respectively; and the distance f between the origin of the camera coordinates and the origin of the image coordinate system is the focal length.
K is the camera intrinsic parameter matrix, which is solved using Zhang Zhengyou's calibration method;
[R | t] is the camera extrinsic parameter matrix.
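For illustration, a small numpy sketch of this imaging model, projecting points into pixel coordinates for assumed values of K, R and t:

```python
import numpy as np

def project_points(points_3d, K, R, t):
    """Project Nx3 points into pixels via Zc*[u,v,1]^T = K*[R|t]*[X,Y,Z,1]^T."""
    pts_cam = (R @ points_3d.T).T + t        # transform into the camera frame
    z = pts_cam[:, 2:3]                      # Zc, the scale parameter
    uv = (K @ pts_cam.T).T / z               # perspective division
    return uv[:, :2], z.ravel()

# Illustrative numbers only
K = np.array([[800.0, 0.0, 640.0],
              [0.0, 800.0, 360.0],
              [0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 0.0])
pts = np.array([[1.0, 0.2, 5.0], [-0.5, 0.1, 3.0]])
uv, depth = project_points(pts, K, R, t)
print(uv, depth)
```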
1.1.2 external parameter Joint calibration
Rough calibration:
(1) description of the points
Marker points (markers) are used during calibration because their edges are more easily detected in both sensors' data. In the calibration process, the laser and the camera each collect one frame of data as the basic data.
(2) Detection under point clouds
As shown in fig. 3, since the depth information is discontinuous, edge detection can be achieved by detecting the depth difference between adjacent points on the same scan line. In fig. 3, (a) and (c) are the original detection images, and (b) and (d) are the laser point clouds corresponding to (a) and (c), respectively.
The size, number, and location of the targets on the actual calibration plate are known, and detection (detection), verification (verification), and point cloud pruning (point cloud pruning) are then required.
a. The inlier points fitted to the plane are retained and the outlier points are removed; the inliers are processed as shown in fig. 4(a).
b. The circle centers are extracted for a target given in advance (four circles forming a square, with known radius and center spacing) using the Random Sample Consensus (RANSAC) method, as shown in fig. 4(b).
c. The detection result is verified; if verification passes, this stage ends, otherwise the next step is carried out.
d. If the circles cannot be extracted, the point cloud needs to be processed and trimmed, and the points to be extracted are selected according to set rules.
(3) Detection in a camera
The edges of the image are extracted with a Sobel operator, and the circular boundaries are extracted with the Hough transform, as sketched below.
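A short OpenCV sketch of this detection step (blur size and Hough thresholds are assumed values):

```python
import cv2
import numpy as np

img = cv2.imread("calibration_board.png", cv2.IMREAD_GRAYSCALE)  # assumed file name
img = cv2.GaussianBlur(img, (5, 5), 1.5)

# Sobel gradients -> edge magnitude image
gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)
edges = cv2.convertScaleAbs(np.sqrt(gx ** 2 + gy ** 2))

# Hough transform for the circular boundaries of the markers
circles = cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, dp=1, minDist=40,
                           param1=120, param2=30, minRadius=10, maxRadius=80)
if circles is not None:
    for u, v, r in np.round(circles[0]).astype(int):
        print(f"circle centre=({u}, {v}), radius={r}")
```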
(4) Calculating an initial translation vector (t_x, t_y, t_z)
t_z is calculated by the following formula, where r_3D is the radius detected in the point cloud, r_2D is the radius detected in the image, Z is the depth coordinate of the circle center detected in the image, and f is the focal length:
[Equation image BDA0003383193120000151]
X, Y, Z are the point coordinates in the lidar frame; x, y are coordinates in the image coordinate system; o_x, o_y denote the principal point offset.
[Equation image BDA0003383193120000152]
[Equation image BDA0003383193120000153]
Fine calibration:
The fine calibration process searches for the optimal parameters in a small parameter search space. Based on the premise that the edges detected by the camera and the edges detected by the lidar match each other, the loss function is designed as follows:
a: projecting a target point under a laser radar coordinate system onto a camera plane to construct a two-dimensional image plane;
b: after Sobel operator processing, generating an image containing edge information;
c: the Inverse Distance Transform (IDT) method was used to perform L1 regularization on the edge information of the image.
d: an error function S_E is constructed and the extrinsic parameters are optimized with a non-linear optimization method, thereby minimizing the value of the loss function.
[Equation image BDA0003383193120000161]
The above formula represents the loss function, where I_C represents the processed data in the image coordinate system and I_V represents the processed data projected from the radar coordinate system. Optimization is therefore performed on the basis of the rough calibration to improve the calibration accuracy.
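The scoring idea can be sketched as follows, assuming binary camera edge images and projected lidar edge pixels as inputs; the exact weighting of the patent's S_E is not reproduced, only the inverse-distance-transform alignment score:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def inverse_distance_map(camera_edges, alpha=0.3):
    """Smooth map in [0, 1]: how close each pixel is to a camera edge (1 = on an edge)."""
    dist = distance_transform_edt(camera_edges == 0)   # distance to the nearest edge pixel
    return np.exp(-alpha * dist)

def alignment_score(camera_edges, lidar_uv):
    """Sum the inverse-distance map at the pixels where lidar edge points project.

    camera_edges: binary HxW edge image from the camera (Sobel + threshold).
    lidar_uv:     Nx2 integer pixel coordinates of projected lidar edge points.
    """
    idt = inverse_distance_map(camera_edges)
    h, w = camera_edges.shape
    u, v = lidar_uv[:, 0], lidar_uv[:, 1]
    ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    return idt[v[ok], u[ok]].sum()

# A search over small perturbations of the rough extrinsics keeps the (R, t)
# candidate with the highest score (equivalently, the lowest loss).
```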
1.2 calibration of multi-line laser and solid-state laser
The project adopts a calibration method for calculating the external parameters between the laser radars by surface feature matching, and respectively collects point cloud data of two lasers in a standard calibration room as basic data.
(1) Face feature extraction
a. Point cloud preprocessing: because noise produces many points that do not lie on a plane, point cloud preprocessing is carried out first and such points are handled.
b. RANSAC is used to fit planes in the two point clouds; many planes may be fitted.
c. Because the plane normal vector is equivalent to one coordinate axis, the rotation of two coordinate systems can be solved only by matching three coordinate axes. The number of planes respectively fitted in the two point clouds by using RANSAC may be more than 3, and three maximum planes are reserved according to the number of plane points.
d. The parametric coefficients of each plane are solved; the plane equation is usually expressed as:
β_(i,0)·x_n + β_(i,1)·y_n + β_(i,2)·z_n + β_(i,3) = 0
where β_(i,0), β_(i,1), β_(i,2) and β_(i,3) are the parameters. The point-to-plane distance can thus be defined as:
f_i(P_n) = |β_(i,0)·x_n + β_(i,1)·y_n + β_(i,2)·z_n + β_(i,3)|
e. constructing a least square problem, and selecting N points on a plane, wherein the plane coefficient should satisfy the following least square problem:
min over β_i of Σ_(n=1..N) f_i(P_n)²
f. After the plane coefficients are solved, the three planes are regarded as the three coordinate planes of the XYZ axes, and the coordinate origin of the local coordinate system formed by the three planes is their intersection point, i.e. the point O that satisfies β_(i,0)·x + β_(i,1)·y + β_(i,2)·z + β_(i,3) = 0 for i = 1, 2, 3 (a code sketch of steps a-f is given below).
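A sketch of steps a-f using Open3D's RANSAC plane segmentation (thresholds, file name and the availability of segment_plane/select_by_index in the installed Open3D version are assumptions):

```python
import numpy as np
import open3d as o3d

def three_largest_planes(pcd, dist_thresh=0.02, iters=1000):
    """Fit three dominant planes with RANSAC; each plane is (beta0, beta1, beta2, beta3)."""
    planes, rest = [], pcd
    for _ in range(3):
        model, inliers = rest.segment_plane(distance_threshold=dist_thresh,
                                            ransac_n=3, num_iterations=iters)
        planes.append(np.asarray(model))                   # beta0*x + beta1*y + beta2*z + beta3 = 0
        rest = rest.select_by_index(inliers, invert=True)  # remove inliers, fit the next plane
    return planes

def local_origin(planes):
    """Origin of the local frame = intersection point of the three planes."""
    A = np.array([p[:3] for p in planes])   # stacked normals
    b = -np.array([p[3] for p in planes])
    return np.linalg.solve(A, b)

pcd = o3d.io.read_point_cloud("multiline_scan.pcd")   # assumed file name
planes = three_largest_planes(pcd)
print("local origin:", local_origin(planes))
```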
(2) face feature matching
As shown in fig. 5, the ground is usually selected as a feature plane during calibration, and the lidar is generally installed approximately parallel to the ground, so that after setting the normal vector direction, the following steps are used:
n_ground = max_i ([0, 0, 1]·n_i)
is used to determine which plane in the point cloud is the ground; a local right-handed coordinate system is then set, and the three normal vectors n_1, n_2 and n_3 must satisfy the following relationship:
(n_2 × n_1)·n_3 > 0
through traversing combination, each plane can be coded with a corresponding number, and matching of plane features is completed.
(3) Closed solving of external parameter initial value
After the matching of the plane features is completed, the three normal vectors are taken as three points, and the initial value of R can be solved by utilizing SVD (singular value decomposition), wherein the three normal vectors have the following corresponding relation:
R·n_1 = n′_1, R·n_2 = n′_2, R·n_3 = n′_3
R·[n_1, n_2, n_3] = [n′_1, n′_2, n′_3]
P = [n_1, n_2, n_3], Q = [n′_1, n′_2, n′_3]
H = P·Q^T
and performing SVD on the H matrix to obtain intermediate variables V and U, and solving initial values of R and t by using the following formulas, wherein O and O' are respectively the origin coordinates of the local coordinate systems of the two laser radars.
R = V·U^T, t = O′ - R·O
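This closed-form step is the standard SVD-based (Kabsch) solution; a numpy sketch, with a reflection guard added as an implementation detail not stated above:

```python
import numpy as np

def initial_extrinsics(n, n_prime, O, O_prime):
    """Closed-form R, t from three matched unit normals and the two local-frame origins.

    n, n_prime: 3x3 arrays whose columns are n1..n3 and n1'..n3'.
    O, O_prime: origins of the two lidars' local coordinate systems.
    """
    H = n @ n_prime.T                                          # H = P Q^T
    U, _, Vt = np.linalg.svd(H)
    V = Vt.T
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(V @ U.T))])   # guard against a reflection
    R = V @ D @ U.T                                            # R = V U^T (reflection-corrected)
    t = O_prime - R @ O                                        # t = O' - R O
    return R, t
```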
(4) Non-linear optimization
And finally, an optimization function is established with the squared point-to-plane distance as the objective function, and iterative optimization is performed. The optimization function sums the error values of the 3 matched planes separately, where the error of each matched plane consists of two parts: the first part, f_i²(R·p′ + t), is the squared distance from a point on a solid-state laser plane to the corresponding multi-line laser plane, with p′ the coordinates of the point on the solid-state laser plane; the second part, f_i²(R′·p + t), is the squared distance from a point on a multi-line laser plane to the corresponding solid-state laser plane, with p the coordinates of the point on the multi-line laser plane.
[Equation image BDA0003383193120000183]
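A sketch of the iterative refinement using SciPy's least-squares solver; parameterising the rotation as a rotation vector is an assumed implementation choice, and only the one-directional point-to-plane term is shown (the symmetric term is analogous):

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def point_plane_residuals(x, plane_pts, plane_coeffs):
    """x = [rx, ry, rz, tx, ty, tz]; one signed distance per point to its matched plane."""
    R = Rotation.from_rotvec(x[:3]).as_matrix()
    t = x[3:]
    res = []
    for pts, beta in zip(plane_pts, plane_coeffs):   # one entry per matched plane
        q = pts @ R.T + t                            # transform points into the other lidar frame
        res.append(q @ beta[:3] + beta[3])           # signed point-to-plane distances
    return np.concatenate(res)

def refine(R0, t0, plane_pts, plane_coeffs):
    """least_squares minimises the sum of squared residuals, i.e. squared point-to-plane distances."""
    x0 = np.concatenate([Rotation.from_matrix(R0).as_rotvec(), t0])
    sol = least_squares(point_plane_residuals, x0, args=(plane_pts, plane_coeffs))
    return Rotation.from_rotvec(sol.x[:3]).as_matrix(), sol.x[3:]
```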
2. High-precision positioning algorithm based on laser radar
The module mainly adopts a high-precision positioning algorithm based on direct matching and feature fusion.
2.1 high-precision positioning algorithm based on direct matching
For two sets of points:
X = {x_1, x_2, …, x_Nx}
Y = {y_1, y_2, …, y_Ny}
Referring to fig. 6: first, the point cloud data is preprocessed; the space is divided into grid cells and the points falling in each cell are counted; the mean and covariance of each cell are computed and a Gaussian distribution is constructed from the points in each cell; the joint probability is computed according to the predicted pose; R and t are solved and it is judged whether the solution has converged; if not, the procedure returns to computing the joint probability with the predicted pose; if it has, R and t are output.
An objective function is sought such that:
[Equation image BDA0003383193120000191]
where:
[Equation image BDA0003383193120000192]
The objective function is defined as:
[Equation image BDA0003383193120000193]
y′_i = T(p, y_i) = R·y_i + t
where μ is the centroid of the point set X and y′_i is the point coordinate obtained by transforming the point y_i with the predicted pose.
According to the process of the Gauss-Newton method, iterative optimization can be performed only by calculating the Jacobian of the residual function relative to the parameter to be solved.
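A compact sketch of the grid statistics and the Gaussian score that drive this matching (a simplified, assumed form; the Gauss-Newton update itself is omitted):

```python
import numpy as np
from collections import defaultdict

def build_grid(points, cell=1.0):
    """Mean and covariance of the reference points in each occupied grid cell."""
    cells = defaultdict(list)
    for p in points:
        cells[tuple(np.floor(p / cell).astype(int))].append(p)
    grid = {}
    for key, pts in cells.items():
        pts = np.asarray(pts)
        if len(pts) >= 5:                                       # need enough points for a covariance
            grid[key] = (pts.mean(axis=0), np.cov(pts.T) + 1e-6 * np.eye(3))
    return grid

def match_score(grid, points, R, t, cell=1.0):
    """Sum of Gaussian scores of the transformed points y'_i = R y_i + t."""
    score = 0.0
    for y in points:
        yp = R @ y + t
        key = tuple(np.floor(yp / cell).astype(int))
        if key in grid:
            mu, cov = grid[key]
            d = yp - mu
            score += np.exp(-0.5 * d @ np.linalg.solve(cov, d))
    return score
```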
2.2 high-precision positioning algorithm based on feature matching
(1) Line and surface feature extraction
Referring to fig. 7, from the laser point coordinates (x, y, z), the tilt angle ω of the beam of laser light compared to the radar level can be calculated:
ω = arctan( z / √(x² + y²) )
Referring to fig. 8, from the tilt angle and the radar's internal parameters (the design tilt angle of each scan line), it can be determined to which laser beam (scan line) the point belongs.
The curvature is then calculated from the ranges X (the distance from the laser point to the radar) of the current point and its front and back neighbouring points, and the feature type is judged from the curvature.
[Equation image BDA0003383193120000195]
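A sketch of how the vertical angle, scan-line index and a range-based curvature can be computed per point (ring spacing, neighbourhood size and the exact curvature normalisation are assumed values):

```python
import numpy as np

def vertical_angles(points):
    """omega = arctan(z / sqrt(x^2 + y^2)) for each point, in degrees."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    return np.degrees(np.arctan2(z, np.sqrt(x * x + y * y)))

def ring_index(omega_deg, min_angle=-15.0, step=2.0):
    """Map the vertical angle onto a scan line, assuming evenly spaced beams (e.g. a 16-line radar)."""
    return np.round((omega_deg - min_angle) / step).astype(int)

def curvature(ranges, k=5):
    """Smoothness from the ranges of the k points before and after the current point."""
    c = np.full(len(ranges), np.nan)
    for i in range(k, len(ranges) - k):
        diff = ranges[i - k:i + k + 1].sum() - (2 * k + 1) * ranges[i]
        c[i] = (diff / ((2 * k) * ranges[i])) ** 2
    return c   # large curvature -> edge/line feature, small -> planar feature
```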
(2) Line-surface feature correlation
The relative pose of the (k + 1) th frame and the (k) th frame is as follows:
[Equation image BDA0003383193120000201]
Transforming the points of the (k + 1)th frame into the kth frame coordinate system:
[Equation image BDA0003383193120000202]
where p_i is a line feature and p̃_i (the point transformed into the kth frame) denotes the predicted line feature.
(3) Pose optimization
Respectively constructing residual functions of line features and surface features:
d_ε = |(p̃_i - p_a) × (p̃_i - p_b)| / |p_a - p_b|
d_H = |(p̃_i - p_m)·((p_j - p_m) × (p_l - p_m))| / |(p_j - p_m) × (p_l - p_m)|
where p̃_i is the predicted feature point and |p_a - p_b| is the length of the line feature. When p_i is a line feature, the nearest line feature point p_a is searched for in the previous frame and a line feature point p_b is found on the adjacent scan line to form a straight line; when p_i is a plane feature, the nearest plane feature point p_m is searched for in the previous frame and two plane feature points p_j and p_l are found on the adjacent scan line to form a plane.
Based on convex optimization theory, as long as the Jacobian of the residual with respect to the variables to be solved is obtained, Gauss-Newton or similar methods can be used for optimization, thereby solving for the pose vector (see the sketch below).
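A numpy sketch of the two residuals, written as the standard point-to-line and point-to-plane distances assumed above; stacking these residuals and their Jacobian is what feeds the Gauss-Newton step:

```python
import numpy as np

def line_residual(p_pred, p_a, p_b):
    """Distance from the predicted point to the line through p_a and p_b."""
    num = np.linalg.norm(np.cross(p_pred - p_a, p_pred - p_b))
    return num / np.linalg.norm(p_a - p_b)          # |p_a - p_b| is the length of the line feature

def plane_residual(p_pred, p_m, p_j, p_l):
    """Distance from the predicted point to the plane through p_m, p_j and p_l."""
    n = np.cross(p_j - p_m, p_l - p_m)
    return abs(np.dot(p_pred - p_m, n)) / np.linalg.norm(n)
```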
3. Multimodal target recognition based on deep learning
Traditional deep-learning-based image target recognition technology is by now fairly stable and mature, but false detections and missed detections easily occur for small targets and in complex scenes, so the technology can currently only be applied to target recognition in relatively simple environments. Deep learning is essentially a data-driven method for learning feature representations, and its poor performance under complex conditions is mainly limited by the limitations of single-sensor data. Multimodal data can express the environment effectively along different feature dimensions, and its great advantages have gradually attracted wide attention from researchers. This subject adopts a novel deep neural network structure: the network input is multimodal sensor data, including visible light images and laser point clouds; feature representation learning of the different input data is realized through the network parameters; and the output is the type and position of the target. Training and learning with a certain amount of labelled data realizes automatic optimization of the parameters, greatly improving the accuracy and environmental adaptability of target recognition.
In principle, image information is dense and regular and contains rich color and texture information, but its disadvantage is that it is two-dimensional; because near objects appear large and distant objects small, there is a scale problem. Compared with the image, the point cloud representation is sparse and irregular, which makes direct processing of the point cloud with conventional CNN perception infeasible. However, the point cloud contains three-dimensional geometric structure and depth information, which is more beneficial for 3D object detection, so the two kinds of information are theoretically complementary. In addition, current two-dimensional image detection methods based on deep learning are all designed around CNNs, whereas in point cloud target detection there are networks designed with several basic structures such as MLP, CNN and GCN, so which network to fuse in the fusion process is a question that needs to be studied.
This project mainly adopts a feature-fusion-based method, as shown in fig. 9. Such fusion requires interaction at the feature level. The main fusion mode is to use a separate feature extractor for the point cloud branch and the image branch, and to fuse the image-branch and point-cloud-branch networks in a feed-forward hierarchy at matching semantic levels, performing semantic fusion of multi-scale information.
The network structure adopted in this project is shown in fig. 10: the point cloud branch is a point encoder-decoder structure, and the image branch is a stepwise-encoding network, with feature fusion performed layer by layer.
The network consists of a two-stream RPN network and a refinement network; the RPN network is mainly used for generating candidate boxes, the refinement network is mainly used for optimizing the target detection boxes, and the network can be trained directly end to end. Through the L1-Fusion module, the network can effectively fuse lidar features and visual semantic features.
L1-Fusion module network architecture, as shown in fig. 11:
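As a generic illustration of such a point-image fusion block (an assumed PyTorch sketch, not the L1-Fusion design itself): per-point lidar features are enriched with image features sampled at the points' projected pixel locations and admitted through a learned gate.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PointImageFusion(nn.Module):
    """Fuse per-point lidar features with image features sampled at projected pixel positions."""
    def __init__(self, point_dim, image_dim):
        super().__init__()
        self.img_proj = nn.Linear(image_dim, point_dim)
        self.gate = nn.Sequential(nn.Linear(2 * point_dim, point_dim), nn.Sigmoid())

    def forward(self, point_feats, image_feats, uv_norm):
        # point_feats: (B, N, Cp)   image_feats: (B, Ci, H, W)
        # uv_norm: (B, N, 2) projected pixel coordinates normalised to [-1, 1]
        grid = uv_norm.unsqueeze(2)                                      # (B, N, 1, 2)
        sampled = F.grid_sample(image_feats, grid, align_corners=True)   # (B, Ci, N, 1)
        sampled = sampled.squeeze(-1).transpose(1, 2)                    # (B, N, Ci)
        img = self.img_proj(sampled)                                     # (B, N, Cp)
        g = self.gate(torch.cat([point_feats, img], dim=-1))             # how much image info to admit
        return point_feats + g * img

# fused = PointImageFusion(128, 64)(point_feats, image_feats, uv_norm)
```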
in addition, the network further improves the consistency of the object classification and positioning confidence level by designing a new loss function (CE loss).
CE loss is defined as follows:
[Equation image BDA0003383193120000221]
where D and G represent the predicted and true bounding boxes, respectively, and c represents the classification confidence of D.
The total residual function L_total is defined as follows:
L_total = L_rpn + L_rcnn
where L_rpn and L_rcnn represent the residual functions of the RPN and RCNN sub-networks, respectively.
4. Target location acquisition based on multi-sensor fusion
Referring to fig. 12, after the three steps of sensor calibration, high-precision positioning, and target detection of multi-modal data, the results of the three steps may be used to perform fusion calculation, so as to obtain the absolute position of the target of interest in the world coordinate system, and the overall calculation process is summarized as follows:
(1) detecting an interested target at the time t according to the multi-mode fusion result, wherein the position of the interested target in a camera coordinate system is Pc;
(2) the pose of the basic coordinate center (multi-line laser) at the moment under the world coordinate system is solved as Twl;
(3) carrying out high-precision calibration on external parameters of each sensor, wherein the external parameter from a local coordinate system of a camera to a basic coordinate center is Tlc;
(4) the position Pw of the target of interest in the world coordinate system can be solved through spatial transformation, and the calculation formula is as follows:
Pw=Twl*Tlc*Pc
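A numpy sketch of this final fusion step with 4 x 4 homogeneous transforms (all numeric values are illustrative):

```python
import numpy as np

def to_homogeneous(R, t):
    """Pack rotation R and translation t into a 4 x 4 homogeneous transform."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

def world_position(Twl, Tlc, Pc):
    """Pw = Twl * Tlc * Pc, with Pc the detected position in the camera frame."""
    pc_h = np.append(Pc, 1.0)
    return (Twl @ Tlc @ pc_h)[:3]

# Illustrative values: lidar pose in the world and camera-to-lidar extrinsics
Twl = to_homogeneous(np.eye(3), np.array([10.0, 4.0, 1.2]))
Tlc = to_homogeneous(np.eye(3), np.array([0.05, 0.0, -0.08]))
Pc  = np.array([1.5, -0.3, 4.0])
print("Pw =", world_position(Twl, Tlc, Pc))
```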
those skilled in the art will appreciate that all or part of the flow of the method implementing the above embodiments may be implemented by a computer program, which is stored in a computer readable storage medium, to instruct related hardware. The computer readable storage medium is a magnetic disk, an optical disk, a read-only memory or a random access memory.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention.

Claims (10)

1. A method for acquiring a position of an indoor object, comprising:
carrying out high-precision spatial calibration on a multi-sensor to obtain an external parameter from a local coordinate system of a camera to a basic coordinate center, wherein the multi-sensor comprises a multi-line laser radar, a fixed laser radar and a camera;
acquiring a laser point set and an image point set of an indoor space through the multi-sensor to position the multi-line laser radar;
acquiring a laser point set and an image point set of an object to be recognized in the indoor space through the multi-sensor, and detecting the position of the object to be recognized in a camera coordinate system by using a target recognition model, wherein the target recognition model comprises a point cloud feature extractor for extracting laser radar features, an image feature extractor for extracting visual semantic features and a fusion module for effectively fusing the laser radar features and the visual semantic features; and
and acquiring the absolute position of the object to be recognized in the world coordinate system through coordinate transformation.
2. A position acquisition method of an indoor object according to claim 1,
the laser radar features comprise geometrical structure and depth information of a point cloud;
the visual semantic features comprise color and texture information of the image; and
the fusion module is used for fusing the geometrical structure and the depth information of the point cloud with the color and texture information of the image to generate a 3D model of the object to be identified.
3. A position acquisition method of an indoor object according to claim 2,
the point cloud feature extractor comprises a multi-scale point cloud feature extractor for extracting the geometric structure and depth information of the multi-scale point cloud;
the image feature extractor comprises a multi-scale image feature extractor which is used for extracting color and texture information of the multi-scale image; and
the fusion module comprises a multi-scale fusion module and is used for fusing the geometrical structures and the depth information of the point clouds with different scales and the color and the texture information of the image with the corresponding scales layer by layer.
4. The method according to claim 2, wherein acquiring the set of laser points and the set of image points of the object to be recognized located in the indoor space by the multi-sensor, and detecting the position of the object to be recognized in the camera coordinate system by using the target recognition model further comprises:
establishing a deep learning neural network, and training the deep learning neural network by using the marked visible light image and the marked laser point cloud image to obtain the target recognition model;
and shooting a visible light image of the target to be recognized by using the camera, scanning a laser point cloud picture of the target to be recognized by using the multi-line laser radar and the fixed laser radar, and inputting the visible light image of the target to be recognized and the laser point cloud picture of the target to be recognized into the target recognition model to obtain the type and the position of the target to be recognized in the camera coordinate system.
5. The method according to claim 4, wherein the following loss functions are constructed and the deep learning neural network is trained using the marked visible light image and the marked laser point cloud map based on the loss functions:
Lce = -log(c × Area(D ∩ G) / Area(D ∪ G))
wherein D and G represent the predicted bounding box and the true bounding box, respectively, and c represents the confidence of the classification of D;
the total residual function Ltotal is defined as follows:
Ltotal = Lrpn + Lrcnn
wherein Lrpn and Lrcnn respectively represent the residual functions of the RPN and RCNN sub-networks, the RPN network being used for generating candidate boxes, and the RCNN optimization network being used for optimizing target detection boxes.
6. The method of claim 1, wherein the acquiring a set of laser points and a set of image points of an indoor space by the multi-sensor to locate the multiline lidar further comprises: performing high-precision positioning based on feature matching, wherein performing high-precision positioning based on feature matching further comprises:
calculating the inclination angle ω of the laser beam of a laser point relative to the horizontal plane of the lidar from the coordinates (x, y, z) of the laser point:
ω = arctan(z / √(x² + y²))
the relative pose between the (k+1)th frame and the kth frame is:
T(k,k+1) = [R(k,k+1) | t(k,k+1)]
transforming to the (k+1)th frame coordinate system:
p̃i = R(k,k+1) · pi + t(k,k+1)
wherein pi is a line feature and p̃i is the predicted line feature;
constructing residual functions of the line features and the surface features to solve the pose vector Twl in the world coordinate system:
d_line = |(p̃i − pa) × (p̃i − pb)| / |pa − pb|
d_plane = |(p̃i − pm) · ((pm − pj) × (pm − pl))| / |(pm − pj) × (pm − pl)|
wherein |pa − pb| is the length of the line feature; when pi is a line feature, the nearest line feature point pa is searched in the previous frame and a line feature point pb is found on the adjacent line to form a straight line; when pi is a surface feature, the nearest surface feature point pm is searched in the previous frame and two surface feature points pj and pl are found on the adjacent line to form a plane.
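Purely as an illustrative sketch, the point-to-line and point-to-plane residuals described above can be evaluated as in the following numpy code; the function names and the specific residual forms follow the common LOAM-style formulation and are assumptions rather than a transcription of the patent's own equations.

    import numpy as np

    def line_residual(p_pred: np.ndarray, pa: np.ndarray, pb: np.ndarray) -> float:
        """Distance from the predicted point to the line through pa and pb."""
        return np.linalg.norm(np.cross(p_pred - pa, p_pred - pb)) / np.linalg.norm(pa - pb)

    def plane_residual(p_pred: np.ndarray, pm: np.ndarray, pj: np.ndarray, pl: np.ndarray) -> float:
        """Distance from the predicted point to the plane through pm, pj and pl."""
        n = np.cross(pm - pj, pm - pl)           # plane normal (unnormalized)
        return abs(np.dot(p_pred - pm, n)) / np.linalg.norm(n)

    # Example: one line feature and one surface feature from the previous frame
    p_tilde = np.array([1.0, 2.0, 0.5])          # feature point transformed into frame k+1
    pa, pb = np.array([0.0, 2.0, 0.5]), np.array([2.0, 2.1, 0.5])
    pm, pj, pl = np.array([0.0, 0.0, 0.4]), np.array([1.0, 0.0, 0.4]), np.array([0.0, 1.0, 0.4])
    print(line_residual(p_tilde, pa, pb), plane_residual(p_tilde, pm, pj, pl))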
7. The method of claim 1, wherein the performing high-precision spatial calibration on the multi-sensor to obtain the extrinsic parameters from the local coordinate system of the camera to the base coordinate center further comprises:
jointly calibrating the multiline lidar and the camera to obtain rotation and translation of the multiline lidar relative to the camera; and
jointly calibrating the multiline lidar and the fixed lidar to calculate an extrinsic parameter between the multiline lidar and the fixed lidar,
wherein jointly calibrating the multiline lidar and the camera further comprises:
Zc · [u, v, 1]^T = K · [R | t] · [Xw, Yw, Zw, 1]^T
wherein Zc is a scale parameter; (Xw, Yw, Zw) are coordinates in the world coordinate system; (u, v) are pixel coordinates; the camera coordinate system takes the optical axis of the camera as its z-axis and the optical center of the camera as its origin Oc; the Xc and Yc axes of the camera coordinate system are parallel to the x and y axes of the image coordinate system, respectively; and the distance f between the camera coordinate origin and the origin of the image coordinate system is the focal length;
K = [fx 0 u0; 0 fy v0; 0 0 1]
is the camera intrinsic parameter matrix, which is solved by using the Zhang Zhengyou calibration method;
[R t; 0 1]
is the camera extrinsic parameter matrix; and
carrying out external parameter combined rough calibration on the multi-line laser radar and the camera; and
and carrying out external parameter combined fine calibration on the multi-line laser radar and the camera.
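To make the projection model of this claim concrete, the sketch below projects a world point to pixel coordinates with an assumed intrinsic matrix; the values of fx, fy, u0, v0 and the extrinsic parameters are placeholders, and in practice the intrinsics would come from a calibration routine such as an implementation of Zhang's method.

    import numpy as np

    # Assumed intrinsic matrix K (focal lengths and principal point are placeholders)
    K = np.array([[800.0,   0.0, 320.0],
                  [  0.0, 800.0, 240.0],
                  [  0.0,   0.0,   1.0]])

    # Assumed extrinsic parameters: rotation R and translation t (world -> camera)
    R = np.eye(3)
    t = np.array([0.0, 0.0, 2.0])

    def project(Pw: np.ndarray) -> np.ndarray:
        """Project a 3D world point to pixel coordinates with the pinhole model."""
        Pc = R @ Pw + t                 # world point expressed in the camera frame
        uvw = K @ Pc                    # Zc * [u, v, 1]^T
        return uvw[:2] / uvw[2]         # divide by the scale parameter Zc

    print(project(np.array([0.5, -0.2, 3.0])))   # pixel coordinates (u, v)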
8. The method of claim 7, wherein jointly calibrating the multiline lidar and the fixed lidar to calculate the extrinsic parameters between the multiline lidar and the fixed lidar further comprises:
collecting point cloud data of two laser radars in a standard indoor space;
extracting planar features from the point cloud data;
matching the planar features;
after the plane feature matching is completed, solving initial values of R and t by using singular value decomposition; and
and establishing an optimization function with the square of the point-to-plane distance as the objective function.
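A minimal sketch of the singular-value-decomposition step is given below: assuming the normals and offsets of the matched planes have already been paired, the rotation is recovered Kabsch-style from the normal correspondences and the translation from the change in plane offsets by least squares; the variable names and the three-plane example are illustrative assumptions, not the patent's own procedure.

    import numpy as np

    def init_extrinsic(nA: np.ndarray, dA: np.ndarray, nB: np.ndarray, dB: np.ndarray):
        """Initial R, t between two lidars from matched planes.

        nA, nB: (k, 3) unit normals of the matched planes in each lidar frame,
        dA, dB: (k,) plane offsets (n . x = d); at least 3 non-parallel planes assumed.
        """
        # Rotation from normal correspondences via SVD (Kabsch-style)
        H = nA.T @ nB
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:          # enforce a proper rotation
            Vt[-1, :] *= -1
            R = Vt.T @ U.T
        # Translation from the offset change of each plane: nB_i . t = dB_i - dA_i
        t, *_ = np.linalg.lstsq(nB, dB - dA, rcond=None)
        return R, t

    # Example with three orthogonal walls of a standard indoor space
    nA = np.eye(3)                         # plane normals seen by lidar A
    dA = np.array([1.0, 2.0, 0.0])
    R_true = np.eye(3)
    t_true = np.array([0.5, -0.2, 0.1])
    nB = nA @ R_true.T                     # same planes seen by lidar B
    dB = dA + nB @ t_true
    print(init_extrinsic(nA, dA, nB, dB))  # recovers R ≈ I and t ≈ t_true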
9. The method according to claim 8, wherein the position Pw of the target to be recognized in the world coordinate system is obtained by the following formula:
Pw = Twl * Tlc * Pc
wherein the position of the target to be recognized in the camera coordinate system is Pc; the pose of the multi-line laser radar in the world coordinate system is Twl; and the extrinsic parameter from the camera local coordinate system to the basic coordinate center is Tlc.
10. A position acquisition apparatus for an indoor object, comprising:
the calibration module is used for carrying out high-precision spatial calibration on the multi-sensor to obtain external parameters from a local coordinate system of the camera to a basic coordinate center, wherein the multi-sensor comprises a multi-line laser radar, a fixed laser radar and a camera;
the positioning module is used for acquiring a laser point set and an image point set of an indoor space through the multi-sensor so as to position the multi-line laser radar;
the target recognition module is used for acquiring a laser point set and an image point set of an object to be recognized in the indoor space through the multi-sensor and detecting the position of the object to be recognized in a camera coordinate system by using a target recognition model, wherein the target recognition model comprises a point cloud feature extractor for extracting laser radar features, an image feature extractor for extracting visual semantic features and a fusion module for effectively fusing the laser radar features and the visual semantic features; and
and the coordinate transformation module is used for acquiring the absolute position of the object to be identified in the world coordinate system through coordinate transformation.
CN202111472804.8A 2021-11-30 2021-11-30 Method and device for acquiring position of indoor object Pending CN114140539A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111472804.8A CN114140539A (en) 2021-11-30 2021-11-30 Method and device for acquiring position of indoor object

Publications (1)

Publication Number Publication Date
CN114140539A true CN114140539A (en) 2022-03-04

Family

ID=80388015

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111472804.8A Pending CN114140539A (en) 2021-11-30 2021-11-30 Method and device for acquiring position of indoor object

Country Status (1)

Country Link
CN (1) CN114140539A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114241269A (en) * 2022-02-18 2022-03-25 聚时科技(江苏)有限公司 A collection card vision fuses positioning system for bank bridge automatic control
CN115294204A (en) * 2022-10-10 2022-11-04 浙江光珀智能科技有限公司 Outdoor target positioning method and system
CN117456146A (en) * 2023-12-21 2024-01-26 绘见科技(深圳)有限公司 Laser point cloud splicing method, device, medium and equipment
CN117456146B (en) * 2023-12-21 2024-04-12 绘见科技(深圳)有限公司 Laser point cloud splicing method, device, medium and equipment

Similar Documents

Publication Publication Date Title
CN111563442B (en) Slam method and system for fusing point cloud and camera image data based on laser radar
WO2021233029A1 (en) Simultaneous localization and mapping method, device, system and storage medium
CN109598765B (en) Monocular camera and millimeter wave radar external parameter combined calibration method based on spherical calibration object
Daftry et al. Building with drones: Accurate 3D facade reconstruction using MAVs
Kang et al. Automatic targetless camera–lidar calibration by aligning edge with gaussian mixture model
CN110415342A (en) A kind of three-dimensional point cloud reconstructing device and method based on more merge sensors
CN114140539A (en) Method and device for acquiring position of indoor object
CN104574406B (en) A kind of combined calibrating method between 360 degree of panorama laser and multiple vision systems
CN109961468A (en) Volume measuring method, device and storage medium based on binocular vision
CN113096183B (en) Barrier detection and measurement method based on laser radar and monocular camera
Moussa Integration of digital photogrammetry and terrestrial laser scanning for cultural heritage data recording
Hu et al. An automatic 3D registration method for rock mass point clouds based on plane detection and polygon matching
Liang et al. Automatic registration of terrestrial laser scanning data using precisely located artificial planar targets
CN114137564A (en) Automatic indoor object identification and positioning method and device
CN113916130B (en) Building position measuring method based on least square method
CN109341668A (en) Polyphaser measurement method based on refraction projection model and beam ray tracing method
US10432915B2 (en) Systems, methods, and devices for generating three-dimensional models
Özdemir et al. A multi-purpose benchmark for photogrammetric urban 3D reconstruction in a controlled environment
CN115359130A (en) Radar and camera combined calibration method and device, electronic equipment and storage medium
Wang et al. A survey of extrinsic calibration of lidar and camera
Coorg Pose imagery and automated three-dimensional modeling of urban environments
Partovi et al. Automatic integration of laser scanning and photogrammetric point clouds: From acquisition to co-registration
CN115656991A (en) Vehicle external parameter calibration method, device, equipment and storage medium
CN115457130A (en) Electric vehicle charging port detection and positioning method based on depth key point regression
Becker et al. Lidar inpainting from a single image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination