CN110930519B - Semantic ORB-SLAM perception method and device based on environment understanding

Publication number: CN110930519B (application CN201911113708.7A)
Authority: CN (China)
Legal status: Active (granted)
Application number: CN201911113708.7A
Other versions: CN110930519A (application publication; original language: Chinese)
Inventors: 柯晶晶, 周广兵, 蒙仕格, 郑辉, 林飞堞, 陈惠纲, 王珏
Current and original assignee: South China Robotics Innovation Research Institute
Application filed by South China Robotics Innovation Research Institute

Classifications

    • G - Physics
    • G06 - Computing; calculating or counting
    • G06T - Image data processing or generation, in general
    • G06T 19/00 - Manipulating 3D models or images for computer graphics
    • G06T 19/003 - Navigation within 3D models or images
    • Y02T 10/00 to Y02T 10/40 - Climate change mitigation technologies related to transportation (road transport; ICE-based vehicles; engine management systems)

Abstract

The invention discloses a semantic ORB-SLAM perception method and device based on environment understanding, wherein the method comprises the following steps: inputting sequence frames into the ORB-SLAM front-end Tracking thread for keyframe extraction to obtain keyframe data; inputting the keyframe data into an adjacent-keyframe graph-optimization thread for optimization to obtain graph-optimized keyframe data; calculating error values between the graph-optimized keyframe data and generating a candidate set based on the error values; and performing closed-loop correction on the candidate set based on global map optimization and loop fusion, and performing simultaneous localization and mapping based on the correction result. In the embodiment of the invention, the robot's perception of the environment is markedly improved and higher-level cognitive information about the scene can be obtained, providing a more natural mode of application for fields including robot navigation, augmented reality and autonomous driving.

Description

Semantic ORB-SLAM perception method and device based on environment understanding
Technical Field
The invention relates to the technical field of intelligent robot perception, in particular to a semantic ORB-SLAM perception method and device based on environment understanding.
Background
Simultaneous localization and mapping (SLAM) is the basis for a mobile robot to navigate autonomously in an unknown environment, and one of the preconditions for autonomy and intelligence. At present, visual SLAM can achieve real-time localization and three-dimensional map construction in a static environment within a certain range. However, the map generated by conventional visual SLAM contains only simple geometric information (points, lines, etc.) or low-level pixel information (color, brightness, etc.) and no semantic information. While such simple geometric and pixel-level information may be sufficient for autonomous navigation of the robot in a single environment, it is not sufficient for a mobile robot to accomplish higher-level tasks.
Patent CN201811514700 discloses a visual SLAM method based on ORB features. In the front-end stage it merely replaces traditional SIFT feature extraction with ORB features and uses the Hamming distance for feature-matching decisions, which reduces the amount of computation to a certain extent and improves the real-time performance of visual SLAM; in the back-end module it adopts the idea of graph optimization, and a point-cloud fusion strategy combining local and global loops can improve the accuracy of loop detection.
However, although replacing traditional SIFT feature extraction with ORB features effectively improves the computation speed, such a visual SLAM method only works in static scenes or scenes with few dynamic objects. If a large number of feature points fall on dynamic objects, the SLAM tracking and localization results drift with the motion of those objects, the mapping and localization accuracy of the robot is greatly degraded, and pose computation may even fail. Moreover, the feature-point generation process of an ORB-feature-based visual SLAM method discards most of the pixel information in the original image and lacks effective semantic information, which seriously limits the robot's further understanding of its environment.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a semantic ORB-SLAM perception method and device based on environment understanding, which markedly improve the robot's perception of the environment, obtain higher-level cognitive information about the scene, and provide a more natural mode of application for fields including robot navigation, augmented reality and autonomous driving.
In order to solve the above technical problems, an embodiment of the present invention provides a semantic ORB-SLAM perception method based on environment understanding, the method comprising:
inputting sequence frames into the ORB-SLAM front-end Tracking thread for keyframe extraction to obtain keyframe data;
inputting the keyframe data into an adjacent-keyframe graph-optimization thread for keyframe data optimization to obtain graph-optimized keyframe data;
calculating error values between the graph-optimized keyframe data and generating a candidate set based on the error values;
and performing closed-loop correction on the candidate set based on global map optimization and loop fusion, and performing simultaneous localization and mapping based on the correction result.
Optionally, the inputting of sequence frames into the ORB-SLAM front-end Tracking thread for keyframe extraction to obtain keyframe data comprises:
the ORB-SLAM front-end Tracking thread performing dynamic-background removal on the input sequence frames using the inter-frame difference method to obtain background-removed sequence frames;
establishing a mapping relation between the background-removed sequence frames and object feature points to obtain sequence frames with an object-feature-point mapping relation;
performing ORB feature extraction on the sequence frames with the object-feature-point mapping relation to obtain sequence-frame ORB features;
matching the ORB features of the current sequence frame with the ORB features of the previous sequence frame to obtain matched feature point pairs;
performing pose estimation and relocalization based on the matched feature point pairs to obtain pose estimation and relocalization results;
and optimizing the pose estimation and relocalization results over matched adjacent sequence frames to obtain adjacent-frame pose optimization, and obtaining a keyframe sequence based on the adjacent-frame pose optimization.
Optionally, the ORB-SLAM front-end Tracking thread performing dynamic-background removal on the input sequence frames using the inter-frame difference method to obtain background-removed sequence frames comprises:
performing a difference operation on adjacent frames at consecutive time intervals in the sequence, and detecting changes using the strong correlation between adjacent sequence frames to obtain the moving target;
and removing the moving target as dynamic background from the sequence frames based on a selected threshold to obtain the background-removed sequence frames.
Optionally, the establishing of a mapping relation between the background-removed sequence frames and object feature points to obtain sequence frames with an object-feature-point mapping relation comprises:
taking the map points observed by the current background-removed sequence frame, finding the other background-removed sequence frames that observe those map points, and using them as the adjacent sequence frames of the current background-removed frame;
generating a node tree with the current background-removed sequence frame as the root node and the adjacent sequence frames as child nodes;
and constructing the mapping relation between the background-removed sequence frames and the object feature points based on the node tree, to obtain the sequence frames with an object-feature-point mapping relation.
Optionally, the pose estimation and relocalization based on the matched feature point pairs comprises:
calculating the relative displacement between the current sequence frame and the previous sequence frame from the matched feature point pairs by minimizing the reprojection error.
Optionally, the method further comprises:
when pose estimation and relocalization based on the matched feature point pairs fail, obtaining the sequence frame most similar to the current frame based on the mapping relation with the object feature points;
obtaining the ORB features of that most similar sequence frame, and matching the ORB features of the current frame against them to obtain first matched feature point pairs;
and recomputing pose estimation and relocalization using the first matched feature point pairs to obtain the pose estimation and relocalization results.
Optionally, the obtaining of a keyframe sequence based on the adjacent-frame pose optimization comprises:
calculating the minimum reprojection error between adjacent frames and establishing a covisibility view based on the minimum reprojection error;
and extracting the sequence frames in the covisibility view as key sequence frames.
Optionally, the inputting of the keyframe data into the adjacent-keyframe graph-optimization thread for keyframe data optimization, obtaining the graph-optimized keyframe data, comprises:
inputting the keyframe data into the adjacent-keyframe graph-optimization thread and then sequentially performing redundant-point elimination, semantic extraction, new-map-point creation and adjacent-frame optimization on the keyframe data to obtain the graph-optimized keyframe data.
Optionally, the semantic extraction performed on the keyframe data after redundant-point elimination comprises:
performing object detection on the keyframe data after redundant-point elimination using the YOLO-v3 algorithm to obtain object detection results;
performing semantic association on the object detection results using a conditional random field, combining object-category probabilities with scene context information;
correcting and optimizing the combined object-category probabilities and scene context information to generate a candidate set of temporary object information;
judging whether each item of temporary object information in the candidate set is a new object or an existing object by searching, for each point of the temporary object, the corresponding neighbourhood to obtain the nearest three-dimensional point;
and calculating the Euclidean distance between the point and that three-dimensional point; if the Euclidean distance is smaller than a preset threshold, the two are considered the same point.
In addition, an embodiment of the invention also provides a semantic ORB-SLAM perception device based on environment understanding, the device comprising:
a keyframe extraction module, for inputting sequence frames into the ORB-SLAM front-end Tracking thread for keyframe extraction to obtain keyframe data;
a keyframe optimization module, for inputting the keyframe data into the adjacent-keyframe graph-optimization thread for keyframe data optimization to obtain graph-optimized keyframe data;
an error calculation module, for calculating error values between the graph-optimized keyframe data and generating a candidate set based on the error values;
and a simultaneous localization and mapping module, for performing closed-loop correction on the candidate set based on global map optimization and loop fusion, and performing simultaneous localization and mapping based on the correction result.
In the embodiment of the invention, the following defects of conventional visual ORB-SLAM are addressed: its feature extraction process is easily disturbed by dynamic targets, and the extracted feature points contain only color, brightness and geometric information, lacking semantic information about objects in the environment. In the ORB-SLAM front-end Tracking thread, a difference operation is first applied to adjacent sequence frames by the inter-frame difference method and a threshold is set to remove dynamic objects; the mapping relation between sequence frames and object feature points is then rebuilt and ORB feature extraction is performed; finally, object and environment information extracted by deep-learning-based semantic detection is integrated into the ORB-SLAM system. The resulting semantic ORB-SLAM perception method, which "understands" the environment, offers stable performance, resistance to environmental interference, accurate matching and a deeper understanding of the environment. The robot's perception of the environment is markedly improved, higher-level cognitive information about the scene can be obtained, and a more natural mode of application is provided for fields including robot navigation, augmented reality and autonomous driving.
Drawings
In order to explain the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings required for the description of the embodiments or the prior art are briefly introduced below. It is apparent that the drawings described below show only some embodiments of the invention, and that a person skilled in the art could obtain other drawings from them without inventive effort.
FIG. 1 is a flow diagram of the semantic ORB-SLAM perception method based on environment understanding in an embodiment of the present invention;
FIG. 2 is a schematic diagram of the structure of the semantic ORB-SLAM perception device based on environment understanding in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, embodiments of the invention. All other embodiments obtained by a person skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
Examples
Referring to fig. 1, fig. 1 is a flow chart of a semantic ORB-SLAM perception method based on environmental understanding in an embodiment of the present invention.
As shown in fig. 1, a semantic ORB-SLAM perception method based on environment understanding comprises:
s11, inputting the sequence frame into an ORB-SLAM front end Tracking thread to perform key frame extraction processing, and acquiring key frame data;
In the implementation of the invention, inputting sequence frames into the ORB-SLAM front-end Tracking thread for keyframe extraction to obtain keyframe data comprises the following steps: the ORB-SLAM front-end Tracking thread performs dynamic-background removal on the input sequence frames using the inter-frame difference method to obtain background-removed sequence frames; a mapping relation between the background-removed sequence frames and object feature points is established to obtain sequence frames with an object-feature-point mapping relation; ORB feature extraction is performed on these sequence frames to obtain sequence-frame ORB features; the ORB features of the current sequence frame are matched with those of the previous sequence frame to obtain matched feature point pairs; pose estimation and relocalization are performed based on the matched feature point pairs to obtain pose estimation and relocalization results; and the pose estimation and relocalization results are optimized over matched adjacent sequence frames to obtain adjacent-frame pose optimization, from which a keyframe sequence is obtained.
Further, the ORB-SLAM front-end Tracking thread performing dynamic-background removal on the input sequence frames by the inter-frame difference method to obtain background-removed sequence frames comprises: performing a difference operation on adjacent frames at consecutive time intervals in the sequence, and detecting changes using the strong correlation between adjacent sequence frames to obtain the moving target; and removing the moving target as dynamic background from the sequence frames based on a selected threshold to obtain the background-removed sequence frames.
Further, establishing the mapping relation between the background-removed sequence frames and object feature points to obtain sequence frames with an object-feature-point mapping relation comprises: taking the map points observed by the current background-removed sequence frame, finding the other background-removed sequence frames that observe those map points, and using them as the adjacent sequence frames of the current background-removed frame; generating a node tree with the current background-removed sequence frame as the root node and the adjacent sequence frames as child nodes; and constructing the mapping relation between the background-removed sequence frames and the object feature points based on the node tree, to obtain the sequence frames with an object-feature-point mapping relation.
Further, the pose estimation and relocalization based on the matched feature point pairs comprises: calculating the relative displacement between the current sequence frame and the previous sequence frame from the matched feature point pairs by minimizing the reprojection error.
Further, the method further comprises: when pose estimation and relocalization based on the matched feature point pairs fail, obtaining the sequence frame most similar to the current frame based on the mapping relation with the object feature points; obtaining the ORB features of that most similar sequence frame, and matching the ORB features of the current frame against them to obtain first matched feature point pairs; and recomputing pose estimation and relocalization using the first matched feature point pairs to obtain the pose estimation and relocalization results.
Further, obtaining a keyframe sequence based on the adjacent-frame pose optimization comprises: calculating the minimum reprojection error between adjacent frames and establishing a covisibility view based on the minimum reprojection error; and extracting the sequence frames in the covisibility view as key sequence frames.
Specifically, sequence frames are input into the ORB-SLAM front-end Tracking thread, where dynamic-background removal is performed first to eliminate noise interference and the influence of dynamic objects on the subsequent feature extraction and matching. Using the inter-frame difference method, adjacent frames at consecutive time intervals are extracted from the sequence and a difference operation is applied; changes are detected using the strong correlation between adjacent sequence frames, so that the moving target is detected, and the moving region is then removed from the sequence frames by choosing a threshold. Within the sequence, the change between the k-th frame f_k(x, y) and the (k+1)-th frame f_{k+1}(x, y) can be represented by the binarized difference value D(x, y) as follows:
$$D(x,y)=\begin{cases}1, & \left|f_{k+1}(x,y)-f_{k}(x,y)\right|>T\\0, & \left|f_{k+1}(x,y)-f_{k}(x,y)\right|\le T\end{cases}$$
where T is the chosen binarization threshold. The '1' regions of the binary difference are the pixels whose gray values change between the two frames, and generally contain the moving target together with noise; the '0' regions are the pixels whose gray values are unchanged between the two frames.
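Purely as an illustration (not part of the claimed method), the inter-frame difference above can be sketched in Python with OpenCV; the function name, the default threshold T = 25 and the 3x3 opening kernel are assumptions of the sketch:

```python
import cv2
import numpy as np

def frame_difference_mask(frame_k, frame_k1, T=25):
    """Binarized inter-frame difference D(x, y) of two consecutive frames.

    Returns a uint8 mask that is 255 where |f_{k+1} - f_k| > T (moving
    target plus noise) and 0 where the pixel gray values are unchanged.
    """
    gray_k = cv2.cvtColor(frame_k, cv2.COLOR_BGR2GRAY)
    gray_k1 = cv2.cvtColor(frame_k1, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(gray_k1, gray_k)
    _, mask = cv2.threshold(diff, T, 255, cv2.THRESH_BINARY)
    # A small morphological opening suppresses isolated noise pixels
    # before the mask is used to discard features on moving regions.
    kernel = np.ones((3, 3), np.uint8)
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
```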
In the front-end Tracking thread, in order to integrate the extracted semantic information into the ORB-SLAM framework, a mapping relation between the background-removed sequence frames and object feature points must be established. In ORB-SLAM, each background-removed sequence frame stores the map points it observes, and each map point in turn stores the background-removed sequence frames that observe it. From this relation between frames and map points, the ORB-SLAM spanning tree is built. To construct it, the map points observed by the current background-removed sequence frame are used to find the other background-removed sequence frames observing those points; these are the adjacent sequence frames of the current frame, and they share a large number of map points with it. Since each map point carries its associated background-removed sequence frames, a spanning tree can be generated with the current background-removed sequence frame as the root node and the adjacent sequence frames as child nodes. Within the spanning tree, the child-parent relation is determined by the number of shared map points. Using the spanning tree, the adjacent sequence frames of the current background-removed frame can be found conveniently, and through them more associated map points. The mapping relation between background-removed sequence frames and objects is established as follows:
Each object O_i comprises:
    • the point cloud data contained in the object in the world coordinate system, obtained by projection through the camera;
    • the number of object categories and the probability of each category, iteratively updated by a Bayesian process;
    • the set of keyframes observing the object;
    • the category with the highest probability, to which the object is taken to belong;
    • the number of times the object has been observed.
The color image corresponding to a background-removed sequence frame is used for object detection; the depth image corresponding to the frame is used to generate the object point cloud data; and the background-removed sequence frame records the object information it observes. Once the relation between objects and background-removed sequence frames has been constructed on top of the mapping between map points and frames, the associated objects can be found from a given background-removed sequence frame, and the associated background-removed sequence frames can likewise be found from a given object.
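A minimal data-structure sketch of this two-way mapping and of the spanning tree is given below; the class and attribute names (Frame, MapPoint, observe, attach_to_spanning_tree) are illustrative, not the invention's actual identifiers:

```python
class MapPoint:
    def __init__(self, pid, xyz):
        self.pid = pid
        self.xyz = xyz            # 3-D position in the world coordinate system
        self.observers = set()    # background-removed frames observing this point

class Frame:
    def __init__(self, fid):
        self.fid = fid
        self.points = set()       # map points observed by this frame
        self.parent = None        # parent node in the spanning tree
        self.children = []

    def observe(self, mp):
        # The mapping is stored in both directions, as described above.
        self.points.add(mp)
        mp.observers.add(self)

def shared_points(a, b):
    return len(a.points & b.points)

def attach_to_spanning_tree(current, neighbours):
    # The adjacent frame sharing the most map points becomes the parent,
    # so the child-parent relation is decided by the shared-point count.
    best = max(neighbours, key=lambda f: shared_points(current, f), default=None)
    if best is not None and shared_points(current, best) > 0:
        current.parent = best
        best.children.append(current)
```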
ORB feature extraction is then performed on the sequence frames with the object-feature-point mapping relation; extracting ORB feature points instead of SIFT feature points effectively reduces the amount of computation and improves efficiency.
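For illustration, ORB extraction and Hamming-distance matching can be sketched with OpenCV as follows; the feature count and the 0.75 ratio test are assumed tuning values, not values prescribed by the invention:

```python
import cv2

orb = cv2.ORB_create(nfeatures=1000)

def extract_orb(gray, static_mask=None):
    # static_mask should be 255 on static regions and 0 on the moving
    # regions flagged by the inter-frame difference, so that no ORB
    # keypoints are detected on dynamic objects.
    return orb.detectAndCompute(gray, static_mask)

def match_orb(desc_prev, desc_cur, ratio=0.75):
    # Hamming distance is the natural metric for binary ORB descriptors.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    pairs = matcher.knnMatch(desc_prev, desc_cur, k=2)
    return [m for m, n in pairs if m.distance < ratio * n.distance]
```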
In the ORB-SLAM front-end Tracking thread, pose estimation and relocalization are carried out with respect to the previous background-removed sequence frame: the ORB features of the current sequence frame are matched with those of the previous sequence frame to obtain matched feature point pairs, and the relative displacement between the current frame and the previous frame is then computed from these pairs by minimizing the reprojection error. If tracking and localization fail, relocalization is performed: the sequence frame most similar to the current frame is found, the current frame is matched against it to obtain matched map points, and the pose of the current frame is recomputed from the matched points.
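A hedged sketch of this pose step: the matched pairs are fed to a RANSAC PnP solver, which minimizes the reprojection error over inliers; the 3-pixel error bound and the 15-inlier floor are assumptions, and a None return stands for the tracking failure that triggers relocalization:

```python
import cv2
import numpy as np

def estimate_pose(pts3d_prev, pts2d_cur, K):
    """Relative pose of the current frame from matched feature point pairs.

    pts3d_prev: Nx3 map points observed in the previous frame;
    pts2d_cur:  Nx2 matched pixel locations in the current frame;
    K:          3x3 camera intrinsic matrix.
    """
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(pts3d_prev, dtype=np.float32),
        np.asarray(pts2d_cur, dtype=np.float32),
        K, None, reprojectionError=3.0)
    if not ok or inliers is None or len(inliers) < 15:
        return None                # tracking failed: fall back to relocalization
    R, _ = cv2.Rodrigues(rvec)     # rotation vector -> rotation matrix
    return R, tvec
```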
In general, two adjacent frames observe a portion of the same map points at the same time. The minimum reprojection error between two adjacent frames is computed; the smaller the reprojection error, the stronger the association between the two adjacent sequence frames. A corresponding preset threshold is therefore set, the projection error is compared against it, and adjacent sequence frames whose error is smaller than or equal to the threshold are retained. In this way a covisibility view can be established once pose optimization between adjacent frames has been formed, and the sequence frames in the covisibility view are taken as the key sequence frames.
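The covisibility test can be sketched as below; the 2-pixel bound is an assumed value standing in for the preset threshold mentioned above:

```python
import numpy as np

def mean_reprojection_error(pts3d, pts2d, R, t, K):
    """Mean pixel error of shared map points pts3d projected into the
    neighbouring frame with pose (R, t) against the measured pixels pts2d.
    t is a 3x1 translation column vector."""
    cam = (K @ (R @ pts3d.T + t)).T        # project into the image plane
    proj = cam[:, :2] / cam[:, 2:3]
    return float(np.linalg.norm(proj - pts2d, axis=1).mean())

def link_if_covisible(graph, fid_a, fid_b, err, max_err=2.0):
    # Adjacent frames whose mutual reprojection error stays at or below
    # the preset threshold are joined in the covisibility view; frames
    # that end up in the view are kept as key sequence frames.
    if err <= max_err:
        graph.setdefault(fid_a, set()).add(fid_b)
        graph.setdefault(fid_b, set()).add(fid_a)
```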
S12: inputting the keyframe data into the adjacent-keyframe graph-optimization thread for keyframe data optimization, and obtaining the graph-optimized keyframe data;
In the implementation of the invention, inputting the keyframe data into the adjacent-keyframe graph-optimization thread for optimization and obtaining the graph-optimized keyframe data comprises: inputting the keyframe data into the adjacent-keyframe graph-optimization thread and then sequentially performing redundant-point elimination, semantic extraction, new-map-point creation and adjacent-frame optimization on the keyframe data to obtain the graph-optimized keyframe data.
Further, the semantic extraction performed on the keyframe data after redundant-point elimination comprises: performing object detection on the keyframe data after redundant-point elimination using the YOLO-v3 algorithm to obtain object detection results; performing semantic association on the object detection results using a conditional random field, combining object-category probabilities with scene context information; correcting and optimizing the combined object-category probabilities and scene context information to generate a candidate set of temporary object information; judging whether each item of temporary object information in the candidate set is a new object or an existing one by searching, for each point of the temporary object, the corresponding neighbourhood to obtain the nearest three-dimensional point; and calculating the Euclidean distance between the point and that three-dimensional point; if the Euclidean distance is smaller than a preset threshold, the two are considered the same point.
Specifically, after the keyframes are obtained, they are input into the adjacent-keyframe graph-optimization thread and redundant points are eliminated; a semantic extraction algorithm is then designed to realize the graph-optimization process over adjacent keyframes. The designed semantic extraction algorithm covers object detection, object semantic association, temporary object generation, object association and object model updating. Object detection is responsible for extracting object information from the image using a deep-learning network; the extracted object information undergoes semantic association and is corrected and optimized, so that the detected objects are more accurate and reliable, and they are stored in the temporary object information set. Object association and updating is responsible for associating the temporary object information with the object information already in the object database, according to the mapping relation among keyframes, object information and map points, and for merging the updated temporary object information into the corresponding object entries.
Here, object detection uses a YOLO-v3-based algorithm: each picture is divided into an N×N grid, a single detection operation is performed over the grid cells, and the detection results are finally fused together.
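As a sketch only, a YOLO-v3 forward pass over a single image can be written with OpenCV's DNN module as below; the config/weight file paths, the 416x416 input size and the confidence/NMS thresholds are assumptions of the sketch:

```python
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")  # placeholder paths

def detect_objects(image, conf_thr=0.5, nms_thr=0.4):
    h, w = image.shape[:2]
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(net.getUnconnectedOutLayersNames())
    boxes, confs, classes = [], [], []
    for out in outputs:            # one output tensor per detection scale
        for det in out:            # det = [cx, cy, bw, bh, obj, class scores...]
            scores = det[5:]
            cls = int(np.argmax(scores))
            conf = float(scores[cls])
            if conf > conf_thr:
                cx, cy, bw, bh = det[:4] * np.array([w, h, w, h])
                boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
                confs.append(conf)
                classes.append(cls)
    keep = cv2.dnn.NMSBoxes(boxes, confs, conf_thr, nms_thr)
    # Each kept detection: (class id, class probability, [x, y, w, h]).
    return [(classes[i], confs[i], boxes[i]) for i in np.array(keep).flatten()]
```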
Semantic detection is carried out on the keyframes with the YOLO-v3 algorithm, and the objects extracted by deep-learning detection are further semantically associated using a conditional random field; detection and classification accuracy is improved by combining object-category probabilities with scene context information. The conditional random field designed to combine object-category probabilities with context information has the following probability and energy equations:
$$P(X=x)=\frac{1}{Z}\exp\bigl(-E(x)\bigr)$$

$$E(x)=\sum_{i}\psi_{u}(x_{i})+\sum_{i<j}\psi_{P}(x_{i},x_{j})$$
where x denotes the random variables of the object classes; i and j range from 1 to k, with k the number of objects detected in the image; Z is a normalization factor ensuring that the result is a probability; E(x) is the energy function of the conditional random field; the unary potential function ψ_u scores the class label of a node of the random-field graph, and the binary potential function ψ_P characterizes the correlation between nodes of the random-field graph.
The unary potential function ψ_u is:

$$\psi_{u}(x_{i})=-\log p(x_{i})$$
The binary potential function ψ_P is:

$$\psi_{P}(x_{i},x_{j})=\mu(x_{i},x_{j})\sum_{m}\omega_{m}\,k^{(m)}(f_{i},f_{j})$$

where k^{(m)}(f_i, f_j) are kernel functions over the feature vectors f_i and f_j of the corresponding nodes.
where p(x_i) is the probability distribution over categories for the i-th object given by the YOLO-v3 model, ω_m are linear combination weights, and μ is a label compatibility function expressing the likelihood that different classes occur together within a neighbourhood.
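To make the energy concrete, the sketch below evaluates E(x) for one label assignment over the k detected objects; the kernel matrices and compatibility matrix are assumed to be precomputed, and all names are illustrative:

```python
import numpy as np

def crf_energy(labels, probs, mu, kernels, weights):
    """E(x) = sum_i psi_u(x_i) + sum_{i<j} psi_P(x_i, x_j).

    labels:  length-k array, the class assigned to each detected object
    probs:   k x C matrix; row i is the YOLO-v3 distribution p(x_i)
    mu:      C x C label compatibility matrix (mu in the text)
    kernels: list of k x k kernel matrices k^(m)(f_i, f_j)
    weights: linear combination weights omega_m, one per kernel
    """
    k = len(labels)
    # Unary term: psi_u(x_i) = -log p(x_i).
    unary = float(-np.log(probs[np.arange(k), labels]).sum())
    # Pairwise term: mu(x_i, x_j) * sum_m omega_m * k^(m)(f_i, f_j), i < j.
    pair = sum(w * km for w, km in zip(weights, kernels))
    binary = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            binary += mu[labels[i], labels[j]] * pair[i, j]
    return unary + binary
```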
Semantic association of the detected objects is thus realized by the conditional random field; the detection results are corrected and optimized, and a candidate set of temporary object information is generated. Each temporary object is then judged to decide whether it is a new object or an object already in the candidate set: for the data of each candidate object, every point of the temporary object is searched in its neighbourhood, the three-dimensional point closest to it is found in the candidate object's point cloud, and the Euclidean distance between the two points is computed; if this distance is smaller than a set threshold, the two points are considered the same point.
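The new-vs-existing decision can be sketched with a KD-tree nearest-neighbour query as below; the 2 cm distance threshold and the majority-vote merge rule are assumptions added for the sketch:

```python
import numpy as np
from scipy.spatial import cKDTree

def is_same_object(temp_points, candidate_points, dist_thr=0.02):
    """Decide whether a temporary object matches a candidate object.

    For every 3-D point of the temporary object, the nearest point in the
    candidate object's point cloud is found; pairs whose Euclidean
    distance is below dist_thr are treated as the same physical point.
    """
    tree = cKDTree(candidate_points)          # candidate cloud, Nx3
    dists, _ = tree.query(temp_points)        # nearest-neighbour distances
    matched = np.count_nonzero(dists < dist_thr)
    # If most points coincide, the temporary object is the existing one;
    # otherwise it is added to the map as a new object.
    return matched / len(temp_points) > 0.5
```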
S13: calculating error values between the graph-optimized keyframe data, and generating a candidate set based on the error values;
In the implementation of the invention, the error values between the graph-optimized keyframe data are calculated, and the candidate set is generated from these error values.
S14: performing closed-loop correction on the candidate set based on global map optimization and loop fusion, and performing simultaneous localization and mapping based on the correction result.
In the implementation of the invention, closed-loop correction is applied to the candidate set through global map optimization and loop fusion; closed-loop detection is thereby realized, localization accuracy is improved and error is reduced; simultaneous localization and mapping are then performed based on the correction result.
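A minimal stand-in for the closed-loop correction is sketched below: the drift revealed by one accepted loop candidate is applied rigidly to the keyframes after the loop start, whereas the actual system would distribute it through global map optimization and loop fusion; the function name and the rigid-correction simplification are assumptions:

```python
import numpy as np

def correct_loop_drift(poses, i, j, T_ij_measured):
    """Apply a loop-closure correction to a list of 4x4 keyframe poses.

    poses: world-from-camera matrices; loop detection found that keyframe
    j revisits keyframe i with relative pose T_ij_measured. The gap
    between the measured and the estimated relative pose is the
    accumulated drift, here propagated rigidly to every keyframe after i.
    """
    T_ij_est = np.linalg.inv(poses[i]) @ poses[j]
    correction = T_ij_measured @ np.linalg.inv(T_ij_est)
    for k in range(i + 1, len(poses)):
        poses[k] = poses[i] @ correction @ np.linalg.inv(poses[i]) @ poses[k]
    return poses   # poses[j] now equals poses[i] @ T_ij_measured
```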
In the embodiment of the invention, the following defects of conventional visual ORB-SLAM are addressed: its feature extraction process is easily disturbed by dynamic targets, and the extracted feature points contain only color, brightness and geometric information, lacking semantic information about objects in the environment. In the ORB-SLAM front-end Tracking thread, a difference operation is first applied to adjacent sequence frames by the inter-frame difference method and a threshold is set to remove dynamic objects; the mapping relation between sequence frames and object feature points is then rebuilt and ORB feature extraction is performed; finally, object and environment information extracted by deep-learning-based semantic detection is integrated into the ORB-SLAM system. The resulting semantic ORB-SLAM perception method, which "understands" the environment, offers stable performance, resistance to environmental interference, accurate matching and a deeper understanding of the environment. The robot's perception of the environment is markedly improved, higher-level cognitive information about the scene can be obtained, and a more natural mode of application is provided for fields including robot navigation, augmented reality and autonomous driving.
Examples
Referring to fig. 2, fig. 2 is a schematic structural diagram of a semantic ORB-SLAM sensing device based on environmental understanding according to an embodiment of the present invention.
As shown in fig. 2, a semantic ORB-SLAM perception device based on environment understanding comprises:
key frame extraction module 21: the method comprises the steps of inputting a sequence frame into an ORB-SLAM front end Tracking thread to perform key frame extraction processing, and obtaining key frame data;
In the implementation of the invention, inputting sequence frames into the ORB-SLAM front-end Tracking thread for keyframe extraction to obtain keyframe data comprises the following steps: the ORB-SLAM front-end Tracking thread performs dynamic-background removal on the input sequence frames using the inter-frame difference method to obtain background-removed sequence frames; a mapping relation between the background-removed sequence frames and object feature points is established to obtain sequence frames with an object-feature-point mapping relation; ORB feature extraction is performed on these sequence frames to obtain sequence-frame ORB features; the ORB features of the current sequence frame are matched with those of the previous sequence frame to obtain matched feature point pairs; pose estimation and relocalization are performed based on the matched feature point pairs to obtain pose estimation and relocalization results; and the pose estimation and relocalization results are optimized over matched adjacent sequence frames to obtain adjacent-frame pose optimization, from which a keyframe sequence is obtained.
Further, the ORB-SLAM front-end Tracking thread performing dynamic-background removal on the input sequence frames by the inter-frame difference method to obtain background-removed sequence frames comprises: performing a difference operation on adjacent frames at consecutive time intervals in the sequence, and detecting changes using the strong correlation between adjacent sequence frames to obtain the moving target; and removing the moving target as dynamic background from the sequence frames based on a selected threshold to obtain the background-removed sequence frames.
Further, establishing the mapping relation between the background-removed sequence frames and object feature points to obtain sequence frames with an object-feature-point mapping relation comprises: taking the map points observed by the current background-removed sequence frame, finding the other background-removed sequence frames that observe those map points, and using them as the adjacent sequence frames of the current background-removed frame; generating a node tree with the current background-removed sequence frame as the root node and the adjacent sequence frames as child nodes; and constructing the mapping relation between the background-removed sequence frames and the object feature points based on the node tree, to obtain the sequence frames with an object-feature-point mapping relation.
Further, the pose estimation and relocalization based on the matched feature point pairs comprises: calculating the relative displacement between the current sequence frame and the previous sequence frame from the matched feature point pairs by minimizing the reprojection error.
Further, the method further comprises: when pose estimation and relocalization based on the matched feature point pairs fail, obtaining the sequence frame most similar to the current frame based on the mapping relation with the object feature points; obtaining the ORB features of that most similar sequence frame, and matching the ORB features of the current frame against them to obtain first matched feature point pairs; and recomputing pose estimation and relocalization using the first matched feature point pairs to obtain the pose estimation and relocalization results.
Further, obtaining a keyframe sequence based on the adjacent-frame pose optimization comprises: calculating the minimum reprojection error between adjacent frames and establishing a covisibility view based on the minimum reprojection error; and extracting the sequence frames in the covisibility view as key sequence frames.
Specifically, sequence frames are input into the ORB-SLAM front-end Tracking thread, where dynamic-background removal is performed first to eliminate noise interference and the influence of dynamic objects on the subsequent feature extraction and matching. Using the inter-frame difference method, adjacent frames at consecutive time intervals are extracted from the sequence and a difference operation is applied; changes are detected using the strong correlation between adjacent sequence frames, so that the moving target is detected, and the moving region is then removed from the sequence frames by choosing a threshold. Within the sequence, the change between the k-th frame f_k(x, y) and the (k+1)-th frame f_{k+1}(x, y) can be represented by the binarized difference value D(x, y) as follows:

$$D(x,y)=\begin{cases}1, & \left|f_{k+1}(x,y)-f_{k}(x,y)\right|>T\\0, & \left|f_{k+1}(x,y)-f_{k}(x,y)\right|\le T\end{cases}$$

where T is the chosen binarization threshold. The '1' regions of the binary difference are the pixels whose gray values change between the two frames, and generally contain the moving target together with noise; the '0' regions are the pixels whose gray values are unchanged between the two frames.
In the front-end Tracking thread, in order to integrate the extracted semantic information into the ORB-SLAM framework, a mapping relation between the background-removed sequence frames and object feature points must be established. In ORB-SLAM, each background-removed sequence frame stores the map points it observes, and each map point in turn stores the background-removed sequence frames that observe it. From this relation between frames and map points, the ORB-SLAM spanning tree is built. To construct it, the map points observed by the current background-removed sequence frame are used to find the other background-removed sequence frames observing those points; these are the adjacent sequence frames of the current frame, and they share a large number of map points with it. Since each map point carries its associated background-removed sequence frames, a spanning tree can be generated with the current background-removed sequence frame as the root node and the adjacent sequence frames as child nodes. Within the spanning tree, the child-parent relation is determined by the number of shared map points. Using the spanning tree, the adjacent sequence frames of the current background-removed frame can be found conveniently, and through them more associated map points. The mapping relation between background-removed sequence frames and objects is established as follows:
Each object O_i comprises:
    • the point cloud data contained in the object in the world coordinate system, obtained by projection through the camera;
    • the number of object categories and the probability of each category, iteratively updated by a Bayesian process;
    • the set of keyframes observing the object;
    • the category with the highest probability, to which the object is taken to belong;
    • the number of times the object has been observed.
The color image corresponding to a background-removed sequence frame is used for object detection; the depth image corresponding to the frame is used to generate the object point cloud data; and the background-removed sequence frame records the object information it observes. Once the relation between objects and background-removed sequence frames has been constructed on top of the mapping between map points and frames, the associated objects can be found from a given background-removed sequence frame, and the associated background-removed sequence frames can likewise be found from a given object.
ORB feature extraction is then performed on the sequence frames with the object-feature-point mapping relation; extracting ORB feature points instead of SIFT feature points effectively reduces the amount of computation and improves efficiency.
In the ORB-SLAM front-end Tracking thread, pose estimation and relocalization are carried out with respect to the previous background-removed sequence frame: the ORB features of the current sequence frame are matched with those of the previous sequence frame to obtain matched feature point pairs, and the relative displacement between the current frame and the previous frame is then computed from these pairs by minimizing the reprojection error. If tracking and localization fail, relocalization is performed: the sequence frame most similar to the current frame is found, the current frame is matched against it to obtain matched map points, and the pose of the current frame is recomputed from the matched points.
In general, two adjacent frames observe a portion of the same map points at the same time. The minimum reprojection error between two adjacent frames is computed; the smaller the reprojection error, the stronger the association between the two adjacent sequence frames. A corresponding preset threshold is therefore set, the projection error is compared against it, and adjacent sequence frames whose error is smaller than or equal to the threshold are retained. In this way a covisibility view can be established once pose optimization between adjacent frames has been formed, and the sequence frames in the covisibility view are taken as the key sequence frames.
Keyframe optimization module 22: for inputting the keyframe data into the adjacent-keyframe graph-optimization thread for keyframe data optimization to obtain graph-optimized keyframe data;
In the implementation of the invention, inputting the keyframe data into the adjacent-keyframe graph-optimization thread for optimization and obtaining the graph-optimized keyframe data comprises: inputting the keyframe data into the adjacent-keyframe graph-optimization thread and then sequentially performing redundant-point elimination, semantic extraction, new-map-point creation and adjacent-frame optimization on the keyframe data to obtain the graph-optimized keyframe data.
Further, the semantic extraction performed on the keyframe data after redundant-point elimination comprises: performing object detection on the keyframe data after redundant-point elimination using the YOLO-v3 algorithm to obtain object detection results; performing semantic association on the object detection results using a conditional random field, combining object-category probabilities with scene context information; correcting and optimizing the combined object-category probabilities and scene context information to generate a candidate set of temporary object information; judging whether each item of temporary object information in the candidate set is a new object or an existing one by searching, for each point of the temporary object, the corresponding neighbourhood to obtain the nearest three-dimensional point; and calculating the Euclidean distance between the point and that three-dimensional point; if the Euclidean distance is smaller than a preset threshold, the two are considered the same point.
Specifically, after the keyframes are obtained, they are input into the adjacent-keyframe graph-optimization thread and redundant points are eliminated; a semantic extraction algorithm is then designed to realize the graph-optimization process over adjacent keyframes. The designed semantic extraction algorithm covers object detection, object semantic association, temporary object generation, object association and object model updating. Object detection is responsible for extracting object information from the image using a deep-learning network and attaching semantic labels to it; objects are associated with the corresponding semantics and then corrected and optimized through semantic association, so that the detected objects are more accurate and reliable, and they are stored in the temporary object information set. Object association and updating is responsible for associating the temporary object information with the object information already in the object database, according to the mapping relation among keyframes, object information and map points, and for merging the updated temporary object information into the corresponding object entries.
Object detection is based on the YOLO algorithm: each picture is divided into an N×N grid, a single detection operation is performed over the grid cells, and the detection results are finally fused together; the design of YOLO avoids duplicate detections.
Semantic detection is carried out on the keyframes with the YOLO algorithm, and the objects extracted by deep-learning detection are further semantically associated using a conditional random field; detection and classification accuracy is improved by combining object-category probabilities with scene context information. The conditional random field designed to combine object-category probabilities with context information has the following probability and energy equations:
$$P(X=x)=\frac{1}{Z}\exp\bigl(-E(x)\bigr)$$

$$E(x)=\sum_{i}\psi_{u}(x_{i})+\sum_{i<j}\psi_{P}(x_{i},x_{j})$$
where x denotes the random variables of the object classes; i and j range from 1 to k, with k the number of objects detected in the image; Z is a normalization factor ensuring that the result is a probability; E(x) is the energy function of the conditional random field; the unary potential function ψ_u scores the class label of a node of the random-field graph, and the binary potential function ψ_P characterizes the correlation between nodes of the random-field graph.
The unary potential function ψ_u is:

$$\psi_{u}(x_{i})=-\log p(x_{i})$$
The binary potential function ψ_P is:

$$\psi_{P}(x_{i},x_{j})=\mu(x_{i},x_{j})\sum_{m}\omega_{m}\,k^{(m)}(f_{i},f_{j})$$

where k^{(m)}(f_i, f_j) are kernel functions over the feature vectors f_i and f_j of the corresponding nodes.
where p(x_i) is the probability distribution over categories for the i-th object given by the YOLO model, ω_m are linear combination weights, and μ is a label compatibility function expressing the likelihood that different classes occur together within a neighbourhood.
Semantic association of the detected objects is thus realized by the conditional random field; the detection results are corrected and optimized, and a candidate set of temporary object information is generated. Each temporary object is then judged to decide whether it is a new object or an object already in the candidate set: for the data of each candidate object, every point of the temporary object is searched in its neighbourhood, the three-dimensional point closest to it is found in the candidate object's point cloud, and the Euclidean distance between the two points is computed; if this distance is smaller than a set threshold, the two points are considered the same point.
Error calculation module 23: for calculating error values between the graph-optimized keyframe data and generating a candidate set based on the error values;
In the implementation of the invention, the error values between the graph-optimized keyframe data are calculated, and the candidate set is generated from these error values.
Simultaneous localization and mapping module 24: for performing closed-loop correction on the candidate set based on global map optimization and loop fusion, and performing simultaneous localization and mapping based on the correction result.
In the implementation of the invention, closed-loop correction is applied to the candidate set through global map optimization and loop fusion; closed-loop detection is thereby realized, localization accuracy is improved and error is reduced; simultaneous localization and mapping are then performed based on the correction result.
In the embodiment of the invention, the following defects of conventional visual ORB-SLAM are addressed: its feature extraction process is easily disturbed by dynamic targets, and the extracted feature points contain only color, brightness and geometric information, lacking semantic information about objects in the environment. In the ORB-SLAM front-end Tracking thread, a difference operation is first applied to adjacent sequence frames by the inter-frame difference method and a threshold is set to remove dynamic objects; the mapping relation between sequence frames and object feature points is then rebuilt and ORB feature extraction is performed; finally, object and environment information extracted by deep-learning-based semantic detection is integrated into the ORB-SLAM system. The resulting semantic ORB-SLAM perception method, which "understands" the environment, offers stable performance, resistance to environmental interference, accurate matching and a deeper understanding of the environment. The robot's perception of the environment is markedly improved, higher-level cognitive information about the scene can be obtained, and a more natural mode of application is provided for fields including robot navigation, augmented reality and autonomous driving.
Those of ordinary skill in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing the relevant hardware, and the program may be stored in a computer-readable storage medium; the storage medium may include: read-only memory (ROM), random access memory (RAM), magnetic disk, optical disk, and the like.
In addition, the semantic ORB-SLAM sensing method and device based on environmental understanding provided by the embodiments of the present invention have been described in detail above. Specific examples have been used herein to illustrate the principles and embodiments of the present invention, and the description of the above embodiments is intended only to help understand the method and its core ideas. Meanwhile, those skilled in the art may make changes to the specific embodiments and the scope of application in accordance with the ideas of the present invention; in view of the above, the contents of this description should not be construed as limiting the present invention.

Claims (8)

1. A semantic ORB-SLAM sensing method based on environmental understanding, the method comprising:
inputting sequence frames into an ORB-SLAM front-end Tracking thread for key frame extraction processing to obtain key frame data;
inputting the key frame data into an adjacent key frame graph optimization thread for key frame data optimization processing to obtain graph-optimized key frame data;
calculating error values among the key frame data after the graph optimization, and generating a candidate set based on the error values;
performing closed-loop correction processing on the candidate set based on global map optimization and loop fusion, and performing synchronous positioning and map construction based on correction results;
the step of inputting the sequence frame into the ORB-SLAM front end Tracking thread for key frame extraction processing to obtain key frame data comprises the following steps:
the ORB-SLAM front end Tracking thread adopts an inter-frame difference method to carry out dynamic background removal processing on the input sequence frames, and obtains sequence frames with the dynamic background removed;
establishing a mapping relation between the sequence frames with the dynamic background removed and object feature points, and obtaining a sequence frame with the mapping relation with the object feature points;
performing ORB feature extraction processing on the sequence frames with the object feature point mapping relation to obtain sequence frame ORB features;
matching the ORB characteristics of the sequence frame of the current frame with the ORB characteristics of the sequence frame of the previous frame to obtain matching characteristic point pairs;
performing pose estimation and repositioning processing based on the matched feature point pairs to obtain pose estimation and repositioning results;
optimizing the pose estimation and repositioning results according to the matched adjacent sequence frames to obtain pose optimization of the adjacent frames, and acquiring a key frame sequence based on the pose optimization of the adjacent frames;
the ORB-SLAM front end Tracking thread adopts an inter-frame difference method to carry out dynamic background removal processing on an input sequence frame to obtain a sequence frame with a dynamic background removed, and the method comprises the following steps:
performing differential operation on adjacent frames in continuous time intervals in the sequence frames, and performing change detection by using strong correlation of the adjacent frames in the sequence frames to obtain a moving target;
removing a dynamic background of a moving object in the sequence frames based on a selected threshold value, and acquiring sequence frames from which the dynamic background is removed;
in the sequence frames, the change between the k-th frame f_k(x, y) and the (k+1)-th frame f_{k+1}(x, y) is represented by a binarized difference value D_{k+1}(x, y), expressed as follows:

D_{k+1}(x, y) = 1, if |f_{k+1}(x, y) − f_k(x, y)| > T; otherwise 0

wherein T is the set binarization difference threshold; the '1' part of the binary difference consists of the pixels whose gray values change between the front and rear frames, and generally comprises moving targets and noise; the '0' part consists of the pixels whose gray values are unchanged between the two frames.
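An illustrative sketch of the ORB extraction and frame-to-frame matching recited above, using OpenCV; the feature budget and the brute-force Hamming matcher are assumptions about one possible realization, not the patent's prescribed implementation:

```python
import cv2

orb = cv2.ORB_create(nfeatures=1000)  # feature budget is an assumption

def match_orb(prev_img, curr_img):
    """Extract ORB features from two consecutive frames and return the
    matched feature point pairs, best (smallest distance) first."""
    kp1, des1 = orb.detectAndCompute(prev_img, None)
    kp2, des2 = orb.detectAndCompute(curr_img, None)
    if des1 is None or des2 is None:
        return []  # one of the frames yielded no features
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    return sorted(matches, key=lambda m: m.distance)
```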
2. The semantic ORB-SLAM sensing method of claim 1, wherein the creating a mapping relationship between the sequence frame with the dynamic background removed and the object feature points to obtain a sequence frame with the object feature point mapping relationship comprises:
according to the image points observed by the background-removed sequence frame of the current frame, taking the background-removed sequence frames of the next frame that observe the same image points as adjacent sequence frames of the background-removed sequence frame of the current frame;
generating a node tree by taking a sequence frame of the current frame with the dynamic background removed as a root node and taking an adjacent sequence frame as a child node;
and constructing a mapping relation between the sequence frames with the dynamic background removed and the object feature points based on the node tree, and acquiring the sequence frames with the mapping relation with the object feature points.
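One possible data-structure sketch for the node tree described above; all class and field names are hypothetical, chosen only to illustrate the root/child relationship:

```python
class FrameNode:
    """Node of the mapping tree: the background-removed current frame is
    the root; frames observing common image points become its children."""
    def __init__(self, frame_id, observed_points):
        self.frame_id = frame_id
        self.observed_points = set(observed_points)
        self.children = []

def build_node_tree(current_frame, other_frames):
    """current_frame and other_frames are (frame_id, observed_points) pairs."""
    root = FrameNode(*current_frame)
    for frame_id, points in other_frames:
        child = FrameNode(frame_id, points)
        # a frame observing any of the root's image points is adjacent
        if root.observed_points & child.observed_points:
            root.children.append(child)
    return root
```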
3. The semantic ORB-SLAM sensing method of claim 1, wherein the performing pose estimation and repositioning processing based on the matching feature point pairs comprises:
and calculating the relative displacement between the sequence frame of the current frame and the sequence frame of the previous frame by minimizing the re-projection error over the matched feature point pairs.
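A minimal numeric sketch of the re-projection error being minimized; K, R, t and all names are illustrative, and a real system would minimize this cost with a nonlinear solver rather than evaluate it once:

```python
import numpy as np

def reprojection_error(points_3d, points_2d, K, R, t):
    """Sum of squared re-projection errors of the matched pairs under a
    candidate relative pose (R, t); K is the camera intrinsic matrix.
    points_3d: (N, 3) map points, points_2d: (N, 2) observations."""
    cam = R @ points_3d.T + t.reshape(3, 1)   # camera coordinates, 3xN
    pix = (K @ cam).T                         # homogeneous pixel coords
    pix = pix[:, :2] / pix[:, 2:3]            # perspective divide
    return float(np.sum((pix - points_2d) ** 2))
```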
4. The semantic ORB-SLAM sensing method of claim 1, further comprising:
after pose estimation and repositioning processing based on the matched feature point pairs fails, obtaining the sequence frame most similar to the sequence frame of the current frame based on the mapping relation with the object feature points;
obtaining the ORB features of the most similar sequence frame, and matching the ORB features of the sequence frame of the current frame with the ORB features of the most similar sequence frame to obtain first matching feature point pairs;
and performing pose estimation and repositioning calculation using the first matching feature point pairs to obtain pose estimation and repositioning results.
5. The semantic ORB-SLAM sensing method of claim 1, wherein the acquiring a key frame sequence based on the pose optimization of the adjacent frames comprises:
calculating the minimum re-projection error between the adjacent frames, and establishing a common view based on the minimum re-projection error;
and extracting the sequence frames in the common view as key sequence frames.
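A hypothetical sketch of how the common view could be represented: frames become graph nodes, with an edge added when two frames share enough observed points (both the adjacency structure and the sharing threshold are assumptions):

```python
def build_common_view(frames, min_shared_points=15):
    """frames: list of (frame_id, set_of_observed_point_ids).
    Returns an adjacency dict; the frames appearing in this common
    view are then extracted as the key frame sequence."""
    graph = {fid: set() for fid, _ in frames}
    for i, (fid_a, pts_a) in enumerate(frames):
        for fid_b, pts_b in frames[i + 1:]:
            if len(pts_a & pts_b) >= min_shared_points:
                graph[fid_a].add(fid_b)
                graph[fid_b].add(fid_a)
    return graph
```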
6. The semantic ORB-SLAM sensing method of claim 1, wherein the inputting the key frame data into an adjacent key frame graph optimization thread for key frame data optimization processing to obtain graph-optimized key frame data comprises:
inputting the key frame data into the adjacent key frame graph optimization thread, and then sequentially performing redundant point elimination processing, semantic extraction processing, new map point creation processing and adjacent frame optimization processing on the key frame data to obtain the graph-optimized key frame data.
7. The semantic ORB-SLAM sensing method of claim 6, wherein the performing semantic extraction processing on the key frame data after redundant point elimination comprises:
performing object detection on the key frame data subjected to redundant point elimination processing based on a YOLO-v3 algorithm to obtain an object detection result;
carrying out semantic association processing on the object detection result by using a conditional random field to obtain combined object category probability and scene context information;
correcting and optimizing the combined object category probability and scene context information to generate a temporary object information candidate set;
judging whether each temporary object information item in the temporary object information candidate set is a new object or an existing object, and, for each point of the temporary object information item, searching the corresponding neighborhood to acquire the three-dimensional point nearest to that point;
and calculating the Euclidean distance between the point and the three-dimensional point, and if the Euclidean distance is smaller than a preset threshold, considering the point and the three-dimensional point to be the same point.
8. A semantic ORB-SLAM sensing apparatus based on environmental understanding, the apparatus comprising:
a key frame extraction module: used for inputting sequence frames into an ORB-SLAM front-end Tracking thread for key frame extraction processing to obtain key frame data;
a key frame optimization module: used for inputting the key frame data into an adjacent key frame graph optimization thread for key frame data optimization processing to obtain graph-optimized key frame data;
an error calculation module: used for calculating error values between the graph-optimized key frame data and generating a candidate set based on the error values;
a synchronous positioning and map construction module: used for performing closed-loop correction processing on the candidate set based on global map optimization and loop fusion, and performing synchronous positioning and map construction based on the correction results;
the step of inputting the sequence frame into the ORB-SLAM front end Tracking thread for key frame extraction processing to obtain key frame data comprises the following steps:
the ORB-SLAM front end Tracking thread adopts an inter-frame difference method to carry out dynamic background removal processing on the input sequence frames, and obtains sequence frames with the dynamic background removed;
establishing a mapping relation between the sequence frames with the dynamic background removed and object feature points, and obtaining a sequence frame with the mapping relation with the object feature points;
performing ORB feature extraction processing on the sequence frames with the object feature point mapping relation to obtain sequence frame ORB features;
Matching the ORB characteristics of the sequence frame of the current frame with the ORB characteristics of the sequence frame of the previous frame to obtain matching characteristic point pairs;
performing pose estimation and repositioning processing based on the matched feature point pairs to obtain pose estimation and repositioning results;
optimizing the pose estimation and repositioning results according to the matched adjacent sequence frames to obtain pose optimization of the adjacent frames, and acquiring a key frame sequence based on the pose optimization of the adjacent frames;
the ORB-SLAM front end Tracking thread adopts an inter-frame difference method to carry out dynamic background removal processing on an input sequence frame to obtain a sequence frame with a dynamic background removed, and the method comprises the following steps:
performing differential operation on adjacent frames in continuous time intervals in the sequence frames, and performing change detection by using strong correlation of the adjacent frames in the sequence frames to obtain a moving target;
removing a dynamic background of a moving object in the sequence frames based on a selected threshold value, and acquiring sequence frames from which the dynamic background is removed;
in the sequence frames, the change between the k-th frame f_k(x, y) and the (k+1)-th frame f_{k+1}(x, y) is represented by a binarized difference value D_{k+1}(x, y), expressed as follows:

D_{k+1}(x, y) = 1, if |f_{k+1}(x, y) − f_k(x, y)| > T; otherwise 0

wherein T is the set binarization difference threshold; the '1' part of the binary difference consists of the pixels whose gray values change between the front and rear frames, and generally comprises moving targets and noise; the '0' part consists of the pixels whose gray values are unchanged between the two frames.
CN201911113708.7A 2019-11-14 2019-11-14 Semantic ORB-SLAM sensing method and device based on environment understanding Active CN110930519B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911113708.7A CN110930519B (en) 2019-11-14 2019-11-14 Semantic ORB-SLAM sensing method and device based on environment understanding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911113708.7A CN110930519B (en) 2019-11-14 2019-11-14 Semantic ORB-SLAM sensing method and device based on environment understanding

Publications (2)

Publication Number Publication Date
CN110930519A CN110930519A (en) 2020-03-27
CN110930519B true CN110930519B (en) 2023-06-20

Family

ID=69852948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911113708.7A Active CN110930519B (en) 2019-11-14 2019-11-14 Semantic ORB-SLAM sensing method and device based on environment understanding

Country Status (1)

Country Link
CN (1) CN110930519B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115375869B (en) * 2022-10-25 2023-02-10 杭州华橙软件技术有限公司 Robot repositioning method, robot and computer-readable storage medium


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019057179A1 (en) * 2017-09-22 2019-03-28 华为技术有限公司 Visual slam method and apparatus based on point and line characteristic

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106373141A (en) * 2016-09-14 2017-02-01 上海航天控制技术研究所 Tracking system and tracking method of relative movement angle and angular velocity of slowly rotating space fragment
CN110125928A (en) * 2019-03-27 2019-08-16 浙江工业大学 A kind of binocular inertial navigation SLAM system carrying out characteristic matching based on before and after frames
CN110378345A (en) * 2019-06-04 2019-10-25 广东工业大学 Dynamic scene SLAM method based on YOLACT example parted pattern
CN110378997A (en) * 2019-06-04 2019-10-25 广东工业大学 A kind of dynamic scene based on ORB-SLAM2 builds figure and localization method
CN110363816A (en) * 2019-06-25 2019-10-22 广东工业大学 A kind of mobile robot environment semanteme based on deep learning builds drawing method

Also Published As

Publication number Publication date
CN110930519A (en) 2020-03-27


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A semantic ORB-SLAM perception method and device based on environmental understanding

Effective date of registration: 20231130

Granted publication date: 20230620

Pledgee: Guangdong Shunde Rural Commercial Bank Co.,Ltd. science and technology innovation sub branch

Pledgor: SOUTH CHINA ROBOTICS INNOVATION Research Institute

Registration number: Y2023980068232