CN111797938B - Semantic information and VSLAM fusion method for sweeping robot - Google Patents


Info

Publication number
CN111797938B
Authority
CN
China
Prior art keywords
semantic information
dictionary
indoor
information
semantic
Prior art date
Legal status
Active
Application number
CN202010681784.4A
Other languages
Chinese (zh)
Other versions
CN111797938A (en)
Inventor
金梅
张少阔
张立国
张子豪
孙胜春
刘博
张勇
郎梦园
王娜
Current Assignee
Yanshan University
Original Assignee
Yanshan University
Priority date
Filing date
Publication date
Application filed by Yanshan University filed Critical Yanshan University
Priority to CN202010681784.4A priority Critical patent/CN111797938B/en
Publication of CN111797938A publication Critical patent/CN111797938A/en
Application granted granted Critical
Publication of CN111797938B publication Critical patent/CN111797938B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G06F 18/232 Non-hierarchical techniques
    • G06F 18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • A HUMAN NECESSITIES
    • A47 FURNITURE; DOMESTIC ARTICLES OR APPLIANCES; COFFEE MILLS; SPICE MILLS; SUCTION CLEANERS IN GENERAL
    • A47L DOMESTIC WASHING OR CLEANING; SUCTION CLEANERS IN GENERAL
    • A47L 11/00 Machines for cleaning floors, carpets, furniture, walls, or wall coverings
    • A47L 11/24 Floor-sweeping machines, motor-driven
    • A HUMAN NECESSITIES
    • A47 FURNITURE; DOMESTIC ARTICLES OR APPLIANCES; COFFEE MILLS; SPICE MILLS; SUCTION CLEANERS IN GENERAL
    • A47L DOMESTIC WASHING OR CLEANING; SUCTION CLEANERS IN GENERAL
    • A47L 11/00 Machines for cleaning floors, carpets, furniture, walls, or wall coverings
    • A47L 11/40 Parts or details of machines not provided for in groups A47L11/02 - A47L11/38, or not restricted to one of these groups, e.g. handles, arrangements of switches, skirts, buffers, levers
    • A HUMAN NECESSITIES
    • A47 FURNITURE; DOMESTIC ARTICLES OR APPLIANCES; COFFEE MILLS; SPICE MILLS; SUCTION CLEANERS IN GENERAL
    • A47L DOMESTIC WASHING OR CLEANING; SUCTION CLEANERS IN GENERAL
    • A47L 11/00 Machines for cleaning floors, carpets, furniture, walls, or wall coverings
    • A47L 11/40 Parts or details of machines not provided for in groups A47L11/02 - A47L11/38, or not restricted to one of these groups, e.g. handles, arrangements of switches, skirts, buffers, levers
    • A47L 11/4011 Regulation of the cleaning machine by electric means; Control systems and remote control systems therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V 10/757 Matching configurations of points or features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computational Mathematics (AREA)
  • Multimedia (AREA)
  • Pure & Applied Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Algebra (AREA)
  • Remote Sensing (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a semantic information and VSLAM fusion method for a sweeping robot. A vector containing semantic information from a semantic dictionary is prepended to the vector of a traditional dictionary to generate a fusion dictionary combining traditional and semantic information. This enriches the information source of the VSLAM system, overcomes the inability of traditional VSLAM to acquire prior information about the environment, and uses the semantic information to improve the precision with which the VSLAM system solves the essential matrix. In loop detection, semantic information is matched first; if it cannot be matched, the point is considered a false match and no search of the bag of words is needed, which improves the robustness of the system and the accuracy of the constructed indoor map.

Description

Semantic information and VSLAM fusion method for sweeping robot
Technical Field
The invention belongs to the technical field of simultaneous localization and mapping (SLAM), and particularly relates to a method for fusing semantic information with VSLAM for a sweeping robot.
Background
Sweeping robots are increasingly common in daily life. Their core technologies cover sweeping, mopping, obstacle avoidance, map building, and human-machine interaction. Apart from sweeping, functions such as mopping still exhibit problems to varying degrees and remain at an exploratory stage.
Positioning and mapping are closely interdependent: positioning cannot be defined without a map, yet an indoor robot must both construct a map suitable for its own use and know its position within it. Because indoor objects such as tables, chairs, boxes, and cabinets are often moved, and because engineering blueprints of indoor scenes rarely match reality exactly, the robot cannot directly use a given man-made map and must build one by perceiving the environment itself. Positioning and mapping for indoor robots are mainly realized through SLAM technology, whose mainstream variants are visual SLAM and laser SLAM. Visual SLAM is the current research hotspot and offers low cost and rich information, but it is less stable and less accurate, and more complex, than laser SLAM. Laser SLAM performs better for indoor positioning and mapping, but laser data is limited in variety, so closed-loop detection cannot be realized well; moreover, with low-cost lidars the laser point density is low and occlusion occurs, so the constructed map often fails to close. The semantic information of images can provide SLAM with more information and more accurate semantics in subsequent processing, so combining semantic information with SLAM is a clear trend.
Disclosure of Invention
The invention aims to provide a method for combining visual SLAM (VSLAM) with semantic information and to improve the accuracy of positioning and mapping for a sweeping robot.
In order to solve this technical problem, the invention provides a method for fusing semantic information and VSLAM for a sweeping robot, which comprises the following steps:
S1, fusing semantic information into the VSLAM system and establishing an indoor map fused with semantic information, with the following specific steps:
S11, extracting and identifying semantic information;
S111, on the basis of an existing classification model, using ResNet-18 as the base network with the last fully connected layer removed, to extract and identify semantic information;
S112, establishing a public data set for indoor objects, dividing each indoor object into n regions ordered from left to right and top to bottom, and establishing an offline semantic information dictionary: the n components $c_1$ to $c_n$ of an n-dimensional vector c represent the n regions in turn, a component being 1 if the region it represents is present and 0 otherwise;
S12, performing I-shaped cleaning indoors;
S121, determining the maximum region, identifying and segmenting indoor objects in the process of determining the maximum region, and returning to the initial position;
S122, starting the I-shaped cleaning;
S13, generating a fusion dictionary that fuses traditional information and semantic information;
S131, generating a semantic dictionary:
identifying each object and its position during cleaning through the established offline semantic dictionary, obtaining a vector containing semantic information, and generating a semantic information dictionary;
S132, generating a traditional dictionary:
obtaining the feature points of each frame of image through feature-point extraction and matching, obtaining the pose of each feature point through motion estimation, and thereby determining the attributes of each feature point; the feature points of the image are placed into a dictionary to form the traditional dictionary;
S133, prepending the vector containing semantic information from the semantic dictionary to the vector of the traditional dictionary, generating a fusion dictionary that fuses traditional and semantic information;
S14, generating a map: constructing a point cloud map from the obtained pose of each point in space;
S15, performing loop detection:
semantic information is matched first; if it cannot be matched, the point is considered a false match and no search of the bag of words is needed; otherwise the positional relationships of objects in the traditional dictionary and the semantic information dictionary are compared, and when the matching degree of the semantic information exceeds a certain threshold, loop detection is performed and the accumulated error is optimized to obtain an optimized map;
S2, establishing a self-learning model of the sweeping robot: on the established indoor map fused with semantic information, the robot self-learns the degree of dirtiness of different areas during indoor cleaning and divides the indoor space to obtain a primary area, a secondary area, and a tertiary area;
S3, establishing a multi-mode cleaning mechanism:
a multi-mode cleaning mechanism is established on the basis of the sweeping robot's self-learning model.
Preferably, in step S132, feature points are extracted and matched by brute-force matching; the method selects ORB features and uses a fast approximate nearest-neighbor algorithm for feature-point matching.
Preferably, in step S132, the EPnP algorithm is used to obtain the poses of the feature points.
Preferably, in step S2, indoor area division is performed using an improved K-means clustering algorithm based on semantic information, comprising the following steps:
S21, selecting semantically recognized indoor objects from the indoor map and setting the k cluster centroids $\mu_1, \mu_2, \ldots, \mu_k$;
S22, using the formula $C^{(i)} = \arg\min_j \| x^{(i)} - \mu_j \|^2$ to cluster the indoor-space sample data set, assigning each data individual $i$ to the nearest class by Euclidean distance, where $C^{(i)}$ is the class among the $k$ classes closest to sample $i$, $\mu_j$ is the centroid of each cluster, and $x^{(i)}$ is the coordinate of each point in the room;
S23, grading the cluster area of each cluster $\mu_j$ according to the amount of dirt swept up by the sweeping robot;
S24, repeatedly executing steps S22 and S23 so that the area grades are continually updated.
Preferably, the multi-mode cleaning mechanism in step S3 comprises a power-saving mode, an intelligent mode, and a functional mode;
the power-saving mode: according to the divided area grades, mainly the primary area is cleaned;
the intelligent mode: according to the divided area grades, the primary area is cleaned intensively and the other areas are cleaned lightly;
the functional mode: according to the divided area grades, the indoor areas are cleaned uniformly, with no distinction or emphasis.
Compared with the prior art, the invention has the following beneficial effects:
by adding a vector containing semantic information to the vector of traditional information, a dictionary fusing semantic information is generated. This enriches the information source of the VSLAM system, overcomes the inability of traditional VSLAM to obtain prior information about the environment, uses semantic information to improve the precision with which the VSLAM system solves the essential matrix, and improves the robustness of the system and the accuracy of the constructed indoor map.
Drawings
FIG. 1 is a schematic diagram of a map object of an embodiment of the present invention;
FIG. 2 is a flow chart of a mapping algorithm based on semantic information according to an embodiment of the present invention;
FIG. 3 is a schematic view of an I-cleaning mechanism according to an embodiment of the present invention;
FIG. 4 is a flow chart of the improved K-means algorithm based on semantic information according to the embodiment of the present invention; and
FIG. 5 is a schematic diagram of a multi-mode cleaning mechanism according to an embodiment of the present invention.
Detailed Description
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
the embodiment provides a semantic information and VSLAM fusion method for a sweeping robot, which includes the following steps:
S1, fusing semantic information into the VSLAM system and establishing an indoor map fused with semantic information; the map-construction algorithm based on semantic information is shown in FIG. 2 and comprises the following specific steps:
S11, extracting and identifying semantic information;
S111, since existing classification models can extract and identify semantic information well, the embodiment of the invention uses ResNet-18 as the base network with the last fully connected layer removed, realizing the extraction and identification of semantic information;
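For illustration only (not the patented implementation itself), a feature extractor of this kind can be sketched in Python with PyTorch/torchvision; the library calls are standard, while the input tensor is a placeholder:

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Minimal sketch: ResNet-18 backbone with the final fully connected
# layer removed, leaving a 512-dimensional feature per image.
backbone = models.resnet18(weights=None)  # load pretrained weights in practice
feature_extractor = nn.Sequential(*list(backbone.children())[:-1])
feature_extractor.eval()

with torch.no_grad():
    image = torch.randn(1, 3, 224, 224)   # placeholder for a camera frame
    features = feature_extractor(image)    # shape: (1, 512, 1, 1)
    features = features.flatten(1)         # shape: (1, 512)
print(features.shape)
```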
S112, establishing a public data set for common indoor objects such as beds and sofas, dividing each object into 9 regions, and establishing an offline semantic information dictionary: a vector c indicates whether each region is present, with its components corresponding to regions 1-9 ordered from left to right and top to bottom; a component is 1 if its region is present and 0 otherwise.
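A minimal sketch of this encoding, assuming each detected object reports which of the 9 regions it occupies (the helper name region_vector is illustrative):

```python
import numpy as np

N_REGIONS = 9  # ordered left-to-right, top-to-bottom

def region_vector(occupied_regions):
    """Encode an object's footprint as a binary 9-dimensional vector c:
    c[k] = 1 if region k+1 is present, else 0."""
    c = np.zeros(N_REGIONS, dtype=np.uint8)
    for r in occupied_regions:
        c[r - 1] = 1
    return c

# Example from the description: an object at the left bedside position
# occupies only region 1, giving [1, 0, 0, 0, 0, 0, 0, 0, 0].
print(region_vector([1]))
```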
S12, performing I-shaped cleaning of an indoor environment, such as a bedroom;
S121, determining the maximum area of the room as shown in FIG. 3, identifying and segmenting indoor objects in the process of determining the maximum area, and returning to the initial position;
S122, starting the I-shaped cleaning, the cleaning path being shown by the dotted line in FIG. 3;
S13, during cleaning, recognizing the objects passed while the traditional SLAM generates its dictionary, obtaining a semantic information dictionary; combining the two yields the dictionary fused with semantic information;
S131, generating a semantic dictionary: identifying each object and its position during the I-shaped cleaning through the established offline semantic dictionary to obtain the semantic information dictionary. For example, at the left bedside position the semantic vector is [1,0,0,0,0,0,0,0,0]; if other objects are present, further vectors are appended. The indoor objects in the semantic information dictionary have a fixed order: the first 9 bits describe the bed, the next 9 bits describe the sofa, and so on in a fixed sequence;
S132, generating a traditional dictionary: obtaining the feature points of each frame of image through feature-point extraction and matching, and obtaining the pose of each feature point through motion estimation, thereby determining the attributes of each feature point; the feature points of a given frame are placed into a dictionary to form the traditional dictionary. For example, a frame of image is input, feature detection and description are performed, and each feature point is processed through a dictionary (provided by the traditional SLAM) to obtain a vector v (typically on the order of one million dimensions).
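A toy illustration of how a traditional dictionary maps a frame's descriptors to a vector v; here a tiny vocabulary and random float descriptors stand in for the roughly million-word binary-descriptor dictionary of a real system, so all sizes and names are illustrative assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Offline: build a toy visual vocabulary by clustering training descriptors.
train_desc = rng.random((500, 32))                 # stand-in for descriptors
vocab = KMeans(n_clusters=50, n_init=10).fit(train_desc)  # 50 "visual words"

def bow_vector(frame_desc, vocab):
    """Quantize each descriptor to its nearest visual word and
    histogram the word counts: the frame's dictionary vector v."""
    words = vocab.predict(frame_desc)
    v = np.bincount(words, minlength=vocab.n_clusters).astype(np.float32)
    return v / max(v.sum(), 1.0)                   # normalized term frequency

frame_desc = rng.random((120, 32))                 # descriptors of one frame
v = bow_vector(frame_desc, vocab)
print(v.shape)                                     # (50,)
```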
S1321, extracting and matching image features;
Edges, corners, points, regions, colors, and the like can serve as features representing elements of an image; good image features should exhibit scale invariance, rotation invariance, repeatability, and a degree of robustness to illumination. Because the embodiment of the invention targets real-time positioning and mapping, ORB features are selected: their matching precision is lower, but they require the least computing resources;
the feature point matching is to find a feature matching relationship between images or between images and maps. The simplest scheme is violence matching, and the basic principle is that all feature points are matched
Figure BDA0002586109880000061
And
Figure BDA0002586109880000062
calculating the hamming distance of the BRIEF descriptor in the ORB features, and leaving the nearest neighbor as a matching point, which becomes cumbersome as the number of feature points increases, in this embodiment, FLANN (fast approximate nearest neighbor) is used for feature point matching;
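As a generic OpenCV usage sketch (not the patent's own code; the image file names are placeholders), ORB extraction with brute-force Hamming matching and FLANN-based approximate matching looks like this:

```python
import cv2

img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)  # placeholder frames
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)   # binary BRIEF descriptors
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force matching: Hamming distance, keep the nearest neighbour.
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
bf_matches = bf.match(des1, des2)

# FLANN with an LSH index for binary descriptors: faster for many points.
index_params = dict(algorithm=6,  # FLANN_INDEX_LSH
                    table_number=6, key_size=12, multi_probe_level=1)
flann = cv2.FlannBasedMatcher(index_params, dict(checks=50))
flann_matches = flann.knnMatch(des1, des2, k=2)

# Lowe's ratio test to reject ambiguous matches.
good = [m for pair in flann_matches if len(pair) == 2
        for m, n in [pair] if m.distance < 0.7 * n.distance]
print(len(bf_matches), len(good))
```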
S1322, estimating the pose of the robot;
the embodiment of the invention selects an EPnP algorithm to estimate the pose; the EPnP algorithm has the core idea that the weighting is solved by adopting four virtual points which are not in the same plane, other points are represented by the four virtual points, and n three-dimensional points are represented as the weighted sum of four virtual control points, so that the problem to be solved is to solve the camera coordinates of the four points; the method comprises the following specific steps:
Denote the camera coordinate system by $F^c$ and the world coordinate system by $F^w$. For each three-dimensional reference point $p_i^w$ ($i = 1, \ldots, n$) expressed in the world coordinate system, four non-coplanar control points $c_j^w$ ($j = 1, 2, 3, 4$) can be found such that

$$p_i^w = \sum_{j=1}^{4} \alpha_{ij} c_j^w, \qquad \sum_{j=1}^{4} \alpha_{ij} = 1 \tag{1}$$

where the $\alpha_{ij}$ are homogeneous (barycentric) coordinates representing the weights of the control points, and $c_j^w$ are the coordinates of the four non-coplanar virtual points in the world coordinate system. The same weights hold in the camera coordinate system:

$$p_i^c = \sum_{j=1}^{4} \alpha_{ij} c_j^c \tag{2}$$

where $p_i^c$ ($i = 1, \ldots, n$) are the three-dimensional coordinates of the reference points and $c_j^c$ ($j = 1, \ldots, 4$) the coordinates of the four non-coplanar control points in the camera coordinate system.
Let $K$ be the camera intrinsic matrix and $\{u_i\}_{i=1,\ldots,n}$ the 2D projections of the reference points $\{p_i\}_{i=1,\ldots,n}$; then

$$w_i \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} = K\, p_i^c = K \sum_{j=1}^{4} \alpha_{ij} c_j^c \tag{3}$$

Written in matrix form:

$$w_i \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \sum_{j=1}^{4} \alpha_{ij} \begin{bmatrix} x_j^c \\ y_j^c \\ z_j^c \end{bmatrix} \tag{4}$$

where the 12 coordinates of the control points in the camera coordinate system, $\{c_j^c\}_{j=1,\ldots,4}$, and the $n$ projective parameters $\{w_i\}_{i=1,\ldots,n}$ are the unknowns of this linear system. The pixel coordinate system differs from the imaging plane by a scaling and a translation of the origin, and the intrinsic matrix $K$ expresses this relationship: $f_x = \alpha f$ scales the pixel coordinates by $\alpha$ on the $u$-axis, $f_y = \beta f$ scales them by $\beta$ on the $v$-axis, and $(c_x, c_y)$ is the translation of the origin. Expanding the last row of (4) gives

$$w_i = \sum_{j=1}^{4} \alpha_{ij} z_j^c \tag{5}$$

Substituting (5) into (4) and expanding the first and second rows yields

$$\sum_{j=1}^{4} \alpha_{ij} f_x x_j^c + \alpha_{ij} (c_x - u_i) z_j^c = 0 \tag{6}$$

$$\sum_{j=1}^{4} \alpha_{ij} f_y y_j^c + \alpha_{ij} (c_y - v_i) z_j^c = 0 \tag{7}$$

In (6) and (7) only the control points are unknown. Considering all $n$ reference points yields

$$M x = 0 \tag{8}$$

whose expansion is

$$\begin{bmatrix}
\alpha_{11} f_x & 0 & \alpha_{11}(c_x - u_1) & \cdots & \alpha_{14} f_x & 0 & \alpha_{14}(c_x - u_1) \\
0 & \alpha_{11} f_y & \alpha_{11}(c_y - v_1) & \cdots & 0 & \alpha_{14} f_y & \alpha_{14}(c_y - v_1) \\
\vdots & & & & & & \vdots \\
\alpha_{n1} f_x & 0 & \alpha_{n1}(c_x - u_n) & \cdots & \alpha_{n4} f_x & 0 & \alpha_{n4}(c_x - u_n) \\
0 & \alpha_{n1} f_y & \alpha_{n1}(c_y - v_n) & \cdots & 0 & \alpha_{n4} f_y & \alpha_{n4}(c_y - v_n)
\end{bmatrix} x = 0 \tag{9}$$

where the second factor on the left side, the vector of control points to be solved, is written as

$$x = \begin{bmatrix} c_1^{cT} & c_2^{cT} & c_3^{cT} & c_4^{cT} \end{bmatrix}^T \tag{10}$$

$x$ has 12 unknown variables and $M$ is a $2n \times 12$ matrix, so $x$ can be obtained as

$$x = \sum_{i=1}^{N} \beta_i v_i \tag{11}$$

where $v_i$ is a right singular vector of $M$, the integer $N \in \{1, \ldots, 4\}$ is the effective dimension of the null space of $M^T M$, and $\{\beta_i\}_{i=1,\ldots,N}$ are the linear-combination coefficients used in computing $x$.
Direct solution (a full SVD of $M$) has complexity $O(n^3)$; instead, the null-space eigenvectors of the $12 \times 12$ matrix $M^T M$ can be computed, reducing the complexity to $O(n)$. Since the four control points have the same pairwise distances in the world coordinate system and in the camera coordinate system, each pair contributes a constraint:

$$\| c_i^c - c_j^c \|^2 = \| c_i^w - c_j^w \|^2 \tag{12}$$

where $c_i^c$, $c_j^c$ are the coordinates of the points in the camera coordinate system and $c_i^w$, $c_j^w$ are the coordinates of the points in the world coordinate system.
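A numerical sketch of the null-space computation described above (illustrative NumPy code under these definitions, not the patented implementation; the input matrix is a random placeholder):

```python
import numpy as np

def null_space_basis(M, N=4):
    """Return the N eigenvectors of the 12x12 matrix M^T M with the
    smallest eigenvalues; building M^T M costs O(n) in the number of
    reference points instead of the O(n^3) of a direct SVD of M."""
    MtM = M.T @ M                           # 12 x 12 regardless of n
    eigvals, eigvecs = np.linalg.eigh(MtM)  # eigenvalues in ascending order
    return eigvecs[:, :N]                   # columns v_1 ... v_N

M = np.random.randn(2 * 50, 12)             # placeholder 2n x 12 system matrix
V = null_space_basis(M)
print(V.shape)                               # (12, 4)
```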
An appropriate linear combination $\{\beta_i\}_{i=1,\ldots,N}$ satisfying these constraints is found and used as the initial value; $\{\beta_i\}_{i=1,\ldots,N}$ is then refined by Gauss-Newton optimization, minimizing the residual of the control-point distance constraints.
Once the coordinates of the control points in the camera coordinate system are known, the coordinates of the reference points in the camera coordinate system follow from equation (2). Given the two sets of 3D coordinates $\{p_i^w\}$ and $\{p_i^c\}$, the pose is solved via SVD as follows:
(1) compute the centroids of the two sets of reference points, $\mu_w = \frac{1}{n} \sum_{i=1}^{n} p_i^w$ and $\mu_c = \frac{1}{n} \sum_{i=1}^{n} p_i^c$;
(2) compute the de-centroided coordinates $q_i^w = p_i^w - \mu_w$ and $q_i^c = p_i^c - \mu_c$;
(3) compute $S = \sum_{i=1}^{n} q_i^w q_i^{cT}$;
(4) take the singular value decomposition $S = U \Sigma V^T$;
(5) recover the rotation matrix $R = V U^T$;
(6) compute the translation vector $t = \mu_c - R \mu_w$. This completes the pose calculation.
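Steps (1)-(6) correspond to the standard closed-form alignment of two 3D point sets; a NumPy sketch under that reading, with a self-check on synthetic data:

```python
import numpy as np

def solve_pose_svd(p_w, p_c):
    """Recover R, t with p_c ~= R @ p_w + t from matched 3D points.
    p_w, p_c: (n, 3) arrays in world and camera coordinates."""
    mu_w = p_w.mean(axis=0)                 # step (1): centroids
    mu_c = p_c.mean(axis=0)
    q_w = p_w - mu_w                        # step (2): de-centroided coords
    q_c = p_c - mu_c
    S = q_w.T @ q_c                         # step (3): S = sum q_w q_c^T
    U, _, Vt = np.linalg.svd(S)             # step (4): S = U Sigma V^T
    R = Vt.T @ U.T                          # step (5): R = V U^T
    if np.linalg.det(R) < 0:                # guard against reflections
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = mu_c - R @ mu_w                     # step (6): translation
    return R, t

# Self-check with a random rigid transform.
rng = np.random.default_rng(0)
p_w = rng.standard_normal((10, 3))
R_true, _ = np.linalg.qr(rng.standard_normal((3, 3)))
R_true *= np.sign(np.linalg.det(R_true))    # ensure a proper rotation
t_true = np.array([0.5, -1.0, 2.0])
p_c = p_w @ R_true.T + t_true
R, t = solve_pose_svd(p_w, p_c)
print(np.allclose(R, R_true), np.allclose(t, t_true))
```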
S133, prepending the vector containing semantic information from the semantic dictionary to the vector of the traditional dictionary generates the fusion dictionary that fuses traditional and semantic information.
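A minimal sketch of this fusion step, assuming both vectors are already available as NumPy arrays (the dimensions and names are illustrative):

```python
import numpy as np

def fuse_vectors(semantic_vec, traditional_vec):
    """Prepend the semantic-information vector to the traditional
    dictionary vector to form the fused dictionary vector (S133)."""
    return np.concatenate([semantic_vec, traditional_vec])

semantic_vec = np.array([1, 0, 0, 0, 0, 0, 0, 0, 0], dtype=np.float32)  # bed, region 1
traditional_vec = np.zeros(1_000_000, dtype=np.float32)  # BoW vector v (~10^6 dims)
fused = fuse_vectors(semantic_vec, traditional_vec)
print(fused.shape)  # (1000009,)
```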
S14, generating a map: a point cloud map is constructed from the obtained pose of each point in space. A point cloud is a set of discrete points in three-dimensional space; besides the basic three-dimensional coordinates x, y, z, each point may also carry r, g, b color information, and the resulting metric map clearly expresses the relationships between objects in the environment;
S15, performing loop detection;
When the robot runs for a long time, errors inevitably accumulate over every time period. Loop detection associates current data with earlier data: because the same place is observed twice, a constraint can be established between the two observations, eliminating the accumulated error and producing a globally consistent map.
Loop detection in this embodiment matches semantic information first: since semantic information requires no search, it can be matched directly. If it cannot be matched, the point is considered a false match and no search of the bag of words is needed; otherwise the positional relationships of objects in the traditional dictionary and the semantic information dictionary are compared, and when the matching degree of the semantic information exceeds a certain threshold, traditional loop detection is carried out and the accumulated error is optimized to obtain an optimized map.
Current loop detection mainly relies on the Bag of Words (BoW) algorithm, an efficient image retrieval and matching algorithm. To enable fast retrieval, the BoW algorithm constructs a dictionary, usually as a K-ary tree. Traditional loop detection merely compares the geometric features of each node; the invention introduces semantic-information matching into loop detection as described above, so that mismatched candidates are rejected before the bag-of-words search. Once loop detection is completed, the map construction is complete.
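For illustration, the semantic pre-filter can be sketched as follows, under the assumption that each keyframe stores its fused vector with the semantic part in front; SEM_DIM, the threshold, and the helper names are illustrative assumptions, not values fixed by the patent:

```python
import numpy as np

SEM_DIM = 9 * 20          # assumed: 9 regions x 20 object categories
SEM_THRESHOLD = 0.8       # assumed matching-degree threshold

def semantic_match_degree(fused_a, fused_b):
    """Compare only the semantic prefix of two fused dictionary vectors
    (intersection-over-union of the binary region bits)."""
    a, b = fused_a[:SEM_DIM], fused_b[:SEM_DIM]
    union = np.count_nonzero(np.logical_or(a, b))
    if union == 0:
        return 0.0
    return np.count_nonzero(np.logical_and(a, b)) / union

def detect_loop(query, keyframes, bow_search):
    """Semantic information is matched first; only candidates passing the
    threshold are handed to the (expensive) bag-of-words search."""
    candidates = [kf for kf in keyframes
                  if semantic_match_degree(query, kf) >= SEM_THRESHOLD]
    if not candidates:
        return None            # false match: no need to search the word bag
    return bow_search(query, candidates)
```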
S2, establishing a self-learning model of the sweeping robot: on the established indoor map fused with semantic information, the robot self-learns the degree of dirtiness of different areas during indoor cleaning and divides them into a primary area (the dirtiest), a secondary area, and a tertiary area;
the traditional K-means algorithm is very sensitive to the initial point center, the clustering result is unstable, and aiming at the indoor environment, based on the established map fused with the semantic information, each dirty area is set as the initial center of the K-means algorithm according to the semantic information, and then the final clustering result is carried out according to the K-means algorithm, so that the classification of the dirty and dirty areas in the room is realized.
The K-means clustering algorithm is a typical unsupervised learning method that automatically groups similar samples: samples are divided into different categories according to their mutual similarity. The invention divides indoor areas into different grades according to their degree of dirtiness. The basic idea of traditional K-means is to preset k, initialize the cluster centers, and iterate, recomputing the centers until they no longer change and the sum of squared distance errors reaches a local minimum; the resulting compact, mutually independent classes are the algorithm's final goal. To obtain the best clustering effect, the threshold on the iteration count can be tuned using function-extremum methods.
The embodiment of the invention provides an improved K-means clustering algorithm based on semantic information, shown in FIG. 4; this dirty-area grading algorithm has environment self-learning capability and realizes area grading of the indoor map through the following steps:
S21, according to the indoor map, selecting semantically identified indoor objects such as the balcony, tea table, and toilet, and setting the k cluster centers $\mu_1, \mu_2, \ldots, \mu_k$;
S22, using the formula $C^{(i)} = \arg\min_j \| x^{(i)} - \mu_j \|^2$ to cluster the indoor-space sample data set, assigning each data individual $i$ to the nearest class by Euclidean distance, where $C^{(i)}$ is the class among the $k$ classes closest to sample $i$;
S23, grading the cluster area of each cluster $\mu_j$ according to the amount of dirt swept up by the sweeping robot;
S24, repeatedly executing steps S22 and S23 so that the area grades are continually updated.
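A sketch of the semantic initialization using scikit-learn, where the initial centroids are the coordinates of semantically recognized dirt-prone objects; the object coordinates and dirt samples here are placeholders:

```python
import numpy as np
from sklearn.cluster import KMeans

# Assumed: 2D coordinates of semantically recognized objects (balcony,
# tea table, toilet, ...) used as the k initial cluster centres.
semantic_centres = np.array([[0.5, 4.0],   # balcony
                             [2.5, 1.5],   # tea table
                             [4.0, 3.0]])  # toilet

# Indoor sample points where the robot recorded swept-up dirt.
dirt_points = np.random.default_rng(1).uniform(0, 5, size=(200, 2))

km = KMeans(n_clusters=len(semantic_centres),
            init=semantic_centres, n_init=1).fit(dirt_points)

# Grade each cluster by the amount of dirt it contains (S23).
counts = np.bincount(km.labels_, minlength=len(semantic_centres))
order = np.argsort(-counts)               # clusters ordered by dirt, most first
grade = {int(c): rank + 1 for rank, c in enumerate(order)}  # 1 = primary area
print(counts, grade)
```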
S3, establishing a multi-mode cleaning mechanism;
A multi-mode cleaning mechanism is established on top of the sweeping robot's self-learning model: in power-saving mode only the primary area is cleaned; in intelligent mode the primary area is cleaned intensively and the other areas are cleaned once; in functional mode the indoor space is cleaned uniformly.
The schematic diagram of the multi-mode cleaning mechanism of the embodiment of the invention is shown in FIG. 5; it comprises the following modes:
S31, power-saving mode: according to the divided area grades, the primary area is cleaned with emphasis (multiple passes);
S32, intelligent mode: according to the divided area grades, the primary area is cleaned intensively and the other areas are cleaned lightly;
S33, functional mode: according to the divided area grades, the indoor areas are cleaned uniformly, with no distinction or emphasis.
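The mode logic can be summarized in a small sketch mapping region grades to cleaning passes; the pass counts are illustrative assumptions, not values fixed by the patent:

```python
# Illustrative pass counts per region grade (1 = primary/dirtiest).
MODES = {
    "power_saving": {1: 2, 2: 0, 3: 0},  # clean only the primary area
    "intelligent":  {1: 2, 2: 1, 3: 1},  # emphasis on primary, light elsewhere
    "functional":   {1: 1, 2: 1, 3: 1},  # uniform, no emphasis
}

def plan_cleaning(mode, region_grades):
    """Return the number of passes for each region id given its grade."""
    passes = MODES[mode]
    return {region: passes[grade] for region, grade in region_grades.items()}

print(plan_cleaning("intelligent", {"A": 1, "B": 3, "C": 2}))
```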
According to the embodiment of the invention, semantic information is integrated into the original VSLAM system, enriching its information source, overcoming the inability of traditional VSLAM to obtain prior information about the environment, using semantic information to improve the precision with which the VSLAM system solves the essential matrix, and improving the robustness of the system and the accuracy of the indoor map. Based on the established indoor map, a self-learning model is built, and a dirty-area grading algorithm with environment self-learning capability based on a vector Taylor series compensation algorithm is proposed, realizing area grading of the indoor map. The embodiment also establishes a multi-mode conversion mechanism, realizing multi-mode switching of the sweeping robot, with advantages such as lower power consumption, higher efficiency, and higher indoor cleanliness.
The above embodiments merely illustrate preferred embodiments of the present invention and do not limit its scope; various modifications and improvements made to the technical solution of the invention by those skilled in the art without departing from its spirit shall fall within the scope of protection defined by the claims.

Claims (5)

1. A semantic information and VSLAM fusion method for a sweeping robot is characterized by comprising the following steps:
S1, fusing semantic information into the VSLAM system and establishing an indoor map fused with semantic information, with the following specific steps:
S11, extracting and identifying semantic information;
S111, on the basis of an existing classification model, using ResNet-18 as the base network with the last fully connected layer removed, to extract and identify semantic information;
S112, establishing a public data set for indoor objects, dividing each indoor object into n regions ordered from left to right and top to bottom, establishing an offline semantic information dictionary, and using the n components $c_1$ to $c_n$ of an n-dimensional vector c to represent the n regions in turn, a component being 1 if the region it represents is present and 0 otherwise;
S12, performing I-shaped cleaning indoors;
S121, determining the maximum region, identifying and segmenting indoor objects in the process of determining the maximum region, and returning to the initial position;
S122, starting the I-shaped cleaning;
S13, generating a fusion dictionary fusing the traditional information and the semantic information;
S131, generating a semantic dictionary:
identifying each object and its position during cleaning through the established offline semantic dictionary, obtaining a vector containing semantic information, and generating a semantic information dictionary;
S132, generating a traditional dictionary:
obtaining the feature points of each frame of image through feature-point extraction and matching, obtaining the pose of each feature point through motion estimation, and determining the attributes of each feature point; placing the feature points of the image into a dictionary to form the traditional dictionary;
S133, prepending the vector containing semantic information from the semantic dictionary to the vector of the traditional dictionary to generate a fusion dictionary fusing the traditional information and the semantic information;
S14, generating a map: constructing a point cloud map from the obtained pose of each point in space;
S15, performing loop detection:
firstly matching semantic information; if the semantic information cannot be matched, the point is considered a false match and need not be searched in the word bag; otherwise, comparing the positional relationships of objects in the traditional dictionary and the semantic information dictionary, and when the matching degree of the semantic information exceeds a certain threshold, carrying out loop detection and optimizing the accumulated error to obtain an optimized map;
S2, establishing a self-learning model of the sweeping robot: for the established indoor map fused with semantic information, self-learning the degree of dirtiness of different areas during indoor cleaning and dividing the indoor space to obtain a primary area, a secondary area, and a tertiary area;
S3, establishing a multi-mode cleaning mechanism:
establishing a multi-mode cleaning mechanism on the basis of the established self-learning model of the sweeping robot.
2. The semantic information and VSLAM fusion method for a sweeping robot according to claim 1, wherein in step S132 feature points are extracted and matched by brute-force matching; the method selects ORB features and uses a fast approximate nearest-neighbor algorithm for feature-point matching.
3. The semantic information and VSLAM fusion method for a sweeping robot according to claim 1, wherein the EPnP algorithm is used to obtain the poses of the feature points in step S132.
4. The semantic information and VSLAM fusion method for a sweeping robot according to claim 1, wherein in step S2 the indoor area division is performed using the improved K-means clustering algorithm based on semantic information, comprising the following steps:
S21, selecting semantically recognized indoor objects according to the indoor map, and setting the k cluster centroids $\mu_1, \mu_2, \ldots, \mu_k$;
S22, using the formula $C^{(i)} = \arg\min_j \| x^{(i)} - \mu_j \|^2$ to cluster the indoor-space sample data set, assigning each data individual $i$ to the nearest class by Euclidean distance, where $C^{(i)}$ is the class among the $k$ classes closest to sample $i$, $\mu_j$ is the centroid of each cluster, and $x^{(i)}$ is the coordinate of each point in the room;
S23, grading the cluster area of each cluster $\mu_j$ according to the amount of dirt swept up by the sweeping robot;
S24, repeatedly executing steps S22 and S23 so that the area grades are continually updated.
5. The semantic information and VSLAM fusion method for a sweeping robot according to claim 1, wherein the multi-mode cleaning mechanism in step S3 comprises a power-saving mode, an intelligent mode, and a functional mode;
the power-saving mode: according to the divided area grades, mainly the primary area is cleaned;
the intelligent mode: according to the divided area grades, the primary area is cleaned intensively and the other areas are cleaned lightly;
the functional mode: according to the divided area grades, the indoor areas are cleaned uniformly, with no distinction or emphasis.
CN202010681784.4A 2020-07-15 2020-07-15 Semantic information and VSLAM fusion method for sweeping robot Active CN111797938B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010681784.4A CN111797938B (en) 2020-07-15 2020-07-15 Semantic information and VSLAM fusion method for sweeping robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010681784.4A CN111797938B (en) 2020-07-15 2020-07-15 Semantic information and VSLAM fusion method for sweeping robot

Publications (2)

Publication Number Publication Date
CN111797938A CN111797938A (en) 2020-10-20
CN111797938B true CN111797938B (en) 2022-03-15

Family

ID=72807203

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010681784.4A Active CN111797938B (en) 2020-07-15 2020-07-15 Semantic information and VSLAM fusion method for sweeping robot

Country Status (1)

Country Link
CN (1) CN111797938B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112435296B (en) * 2020-12-01 2024-04-19 南京工程学院 Image matching method for VSLAM indoor high-precision positioning
CN115191866A (en) * 2021-04-09 2022-10-18 美智纵横科技有限责任公司 Recharging method and device, cleaning robot and storage medium
CN113405547B (en) * 2021-05-21 2022-03-22 杭州电子科技大学 Unmanned aerial vehicle navigation method based on semantic VSLAM

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107063258A (en) * 2017-03-07 2017-08-18 重庆邮电大学 A kind of mobile robot indoor navigation method based on semantic information
CN108171796A (en) * 2017-12-25 2018-06-15 燕山大学 A kind of inspection machine human visual system and control method based on three-dimensional point cloud
CN108230337A (en) * 2017-12-31 2018-06-29 厦门大学 A kind of method that semantic SLAM systems based on mobile terminal are realized
CN108596974A (en) * 2018-04-04 2018-09-28 清华大学 Dynamic scene robot localization builds drawing system and method
CN109559320A (en) * 2018-09-18 2019-04-02 华东理工大学 Realize that vision SLAM semanteme builds the method and system of figure function based on empty convolution deep neural network
CN109816686A (en) * 2019-01-15 2019-05-28 山东大学 Robot semanteme SLAM method, processor and robot based on object example match
CN110738673A (en) * 2019-10-21 2020-01-31 哈尔滨理工大学 Visual SLAM method based on example segmentation
CN110956651A (en) * 2019-12-16 2020-04-03 哈尔滨工业大学 Terrain semantic perception method based on fusion of vision and vibrotactile sense
CN111179426A (en) * 2019-12-23 2020-05-19 南京理工大学 Deep learning-based robot indoor environment three-dimensional semantic map construction method
CN111260661A (en) * 2020-01-15 2020-06-09 江苏大学 Visual semantic SLAM system and method based on neural network technology

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107063258A (en) * 2017-03-07 2017-08-18 重庆邮电大学 A kind of mobile robot indoor navigation method based on semantic information
CN108171796A (en) * 2017-12-25 2018-06-15 燕山大学 A kind of inspection machine human visual system and control method based on three-dimensional point cloud
CN108230337A (en) * 2017-12-31 2018-06-29 厦门大学 A kind of method that semantic SLAM systems based on mobile terminal are realized
CN108596974A (en) * 2018-04-04 2018-09-28 清华大学 Dynamic scene robot localization builds drawing system and method
CN109559320A (en) * 2018-09-18 2019-04-02 华东理工大学 Realize that vision SLAM semanteme builds the method and system of figure function based on empty convolution deep neural network
CN109816686A (en) * 2019-01-15 2019-05-28 山东大学 Robot semanteme SLAM method, processor and robot based on object example match
CN110738673A (en) * 2019-10-21 2020-01-31 哈尔滨理工大学 Visual SLAM method based on example segmentation
CN110956651A (en) * 2019-12-16 2020-04-03 哈尔滨工业大学 Terrain semantic perception method based on fusion of vision and vibrotactile sense
CN111179426A (en) * 2019-12-23 2020-05-19 南京理工大学 Deep learning-based robot indoor environment three-dimensional semantic map construction method
CN111260661A (en) * 2020-01-15 2020-06-09 江苏大学 Visual semantic SLAM system and method based on neural network technology

Non-Patent Citations (12)

* Cited by examiner, † Cited by third party
Title
3D Semantic Mapping Based on Convolutional Neural Networks;Jing Li等;《Proceedings of the 37th Chinese Control Conference》;20180727;第9303-9308页 *
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution,and Fully Connected CRFs;Liang-Chieh Chen等;《arXiv》;20170512;第1-14页 *
ORB-SLAM: A Versatile and Accurate Monocular SLAM System;Raúl Mur-Artal等;《IEEE TRANSACTIONS ON ROBOTICS》;20151031;第31卷(第5期);第1147-1163页 *
VSLAM的研究与发展;吴家伟等;《单片机与嵌入式系统应用》;20191231(第9期);第4-7页 *
基于深度学习的视觉SLAM综述;刘瑞军等;《系统仿真学报》;20200731;第32卷(第7期);第1244-1256页 *
基于特征点法和直接法VSLAM的研究;邹雄等;《计算机应用研究》;20200531;第37卷(第5期);第1281-1291页 *
基于视觉和IMU融合的定位算法研究;施振宇;《中国优秀博硕士学位论文全文数据库(硕士) 信息科技辑》;20200215(第2期);I138-1082 *
基于视觉的同时定位与地图构建的研究进展;陈常等;《计算机应用研究》;20180331;第35卷(第3期);第641-647页 *
基于语义信息与多视图几何的动态SLAM方法研究;仲星光;《中国优秀博硕士学位论文全文数据库(硕士) 信息科技辑》;20200215(第2期);I138-1668 *
无人系统视觉SLAM技术发展现状简析;李云天等;《控制与决策》;20200114;第1-10页 *
融合语义激光与地标信息的SLAM技术研究;杨爽等;《计算机工程与应用》;20190911;第56卷(第18期);第262-271页 *
视觉同时定位与地图创建综述;周彦等;《智能系统学报》;20180228;第13卷(第1期);第97-106页 *

Also Published As

Publication number Publication date
CN111797938A (en) 2020-10-20

Similar Documents

Publication Publication Date Title
CN111797938B (en) Semantic information and VSLAM fusion method for sweeping robot
CN110427877B (en) Human body three-dimensional posture estimation method based on structural information
Pal et al. Learning hierarchical relationships for object-goal navigation
Eade et al. Monocular graph SLAM with complexity reduction
CN103413347B (en) Based on the extraction method of monocular image depth map that prospect background merges
CN109559320A (en) Realize that vision SLAM semanteme builds the method and system of figure function based on empty convolution deep neural network
CN110827398B (en) Automatic semantic segmentation method for indoor three-dimensional point cloud based on deep neural network
CN110781262B (en) Semantic map construction method based on visual SLAM
CN110110694B (en) Visual SLAM closed-loop detection method based on target detection
CN110555408B (en) Single-camera real-time three-dimensional human body posture detection method based on self-adaptive mapping relation
CN109101864A (en) The upper half of human body action identification method returned based on key frame and random forest
CN109766790B (en) Pedestrian detection method based on self-adaptive characteristic channel
Langer et al. Robust and efficient object change detection by combining global semantic information and local geometric verification
CN111709317B (en) Pedestrian re-identification method based on multi-scale features under saliency model
Hu et al. Loop closure detection for visual SLAM fusing semantic information
Zhang et al. Joint segmentation of images and scanned point cloud in large-scale street scenes with low-annotation cost
CN109740405B (en) Method for detecting front window difference information of non-aligned similar vehicles
CN116503654A (en) Multimode feature fusion method for carrying out character interaction detection based on bipartite graph structure
Rituerto et al. Label propagation in videos indoors with an incremental non-parametric model update
Zhang et al. Scale-aware insertion of virtual objects in monocular videos
Liu et al. Building semantic maps for blind people to navigate at home
Villaverde et al. Morphological neural networks and vision based simultaneous localization and mapping
CN114973305A (en) Accurate human body analysis method for crowded people
Qiao et al. Objects matter: learning object relation graph for robust camera relocalization
Liu et al. Detection based object labeling of 3D point cloud for indoor scenes

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant