WO2022217794A1 - Method for positioning a mobile robot in a dynamic environment - Google Patents

Method for positioning a mobile robot in a dynamic environment

Info

Publication number
WO2022217794A1
Authority
WO
WIPO (PCT)
Prior art keywords
image frame
target
target image
object region
area
Prior art date
Application number
PCT/CN2021/112575
Other languages
English (en)
Chinese (zh)
Inventor
彭业萍
张晓伟
曹广忠
吴超
Original Assignee
深圳大学
Priority date
Filing date
Publication date
Application filed by 深圳大学
Publication of WO2022217794A1

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence

Definitions

  • the present application relates to the technical field of mobile robots, and in particular, to a positioning method of a mobile robot in a dynamic environment.
  • SLAM: Simultaneous Localization And Mapping, i.e., simultaneous localization and map construction.
  • Lidar, inertial sensors, or cameras are often used to collect the data.
  • Camera-based visual SLAM has become a hot spot in application research and development because of its low cost and the abundant information it can obtain.
  • Traditional visual SLAM generally localizes the mobile robot under the assumption of a static environment.
  • ORB: Oriented FAST and Rotated BRIEF, a feature extraction and description method.
  • Under the static-environment assumption, the extracted feature points are all static points and can be used to estimate the robot's own pose, yielding valid results.
  • When moving objects are present, their feature points interfere with the estimation of the robot's own pose, thus affecting the positioning accuracy of the mobile robot.
  • Aiming at the deficiencies of the prior art, the technical problem to be solved by the present application is to provide a positioning method for a mobile robot in a dynamic environment.
  • A first aspect of the embodiments of the present application provides a positioning method for a mobile robot in a dynamic environment, where the positioning method includes:
  • acquiring a target image frame, and determining a background area and an object area of the target image frame;
  • determining a candidate camera pose corresponding to the target image frame based on the background area;
  • determining a moving object area in the target image frame based on the background area, the object area, the image frame preceding the target image frame, and the candidate camera pose; and
  • determining a target camera pose corresponding to the target image frame based on the target image frame and the moving object area.
  • In the positioning method for a mobile robot in a dynamic environment, acquiring the target image frame and determining the background area and the object area of the target image frame specifically includes:
  • In the positioning method for a mobile robot in a dynamic environment, determining the moving object area in the target image frame based on the background area, the object area, the previous image frame of the target image frame, and the candidate camera pose specifically includes:
  • determining the moving object region in the target image frame based on the determined motion feature points.
  • In the positioning method for a mobile robot in a dynamic environment, determining, based on the candidate camera pose, the background error value between each target background feature point and its corresponding matching feature point, and the object error value between each target object feature point and its corresponding matching feature point, specifically includes:
  • for each target feature point in the target feature point set formed by the target background feature points and the target object feature points, determining, based on the transformation matrix, the target error value corresponding to the target feature point from the target feature point and its corresponding matching feature point.
  • In the positioning method for a mobile robot in a dynamic environment, the target error value corresponding to the target feature point is calculated with a formula in which:
  • d represents the target error value;
  • F represents the transformation matrix;
  • u_1 represents the target feature point;
  • u_2 represents the matching feature point of the target feature point;
  • (F u_1)_1 represents the first vector element of the vector F u_1;
  • (F u_1)_2 represents the second vector element of the vector F u_1.
  • In the positioning method for a mobile robot in a dynamic environment, determining the motion feature points in the target image frame based on each background error value and each object error value specifically includes:
  • selecting the target object error values that exceed the error threshold, and using the target object feature points corresponding to the selected error values as the motion feature points in the target image frame.
  • In the positioning method for a mobile robot in a dynamic environment, the object area includes several object regions, and determining the moving object area in the target image frame based on the determined motion feature points specifically includes:
  • for each object region, selecting the target motion feature points located in the object region from the motion feature points, and determining the ratio of the number of selected target motion feature points to the number of feature points included in the object region;
  • In the positioning method for a mobile robot in a dynamic environment, the target image frame includes several object regions; after determining the moving object area in the target image frame based on the background area, the object area, the previous image frame of the target image frame, and the candidate camera pose, the method further includes:
  • using the object region as a moving object region in the target image frame; or
  • using the object region as part of the background region in the target image frame.
  • In the positioning method for a mobile robot in a dynamic environment, before acquiring the reference motion state of the reference object region corresponding to the object region in each candidate image frame between the image frame corresponding to the candidate motion state and the target image frame, the method further includes:
  • for each of the several object regions, determining the spatial position matching degree between the region position of the moving object region and the region position of each reference moving object region in the previous image frame, and the matching coefficient between the feature points in the moving object region and the feature points of each reference moving object region; and
  • determining the reference moving object region corresponding to each moving object region.
  • In the positioning method for a mobile robot in a dynamic environment, acquiring the candidate motion state corresponding to the object area specifically includes:
  • determining a reference image frame whose frame number is a multiple of a preset frame number threshold, where the reference image frame is the image frame whose acquisition time is before that of the target image frame and closest to it; and
  • using the motion state of the candidate object region corresponding to the object region in the reference image frame as the candidate motion state corresponding to the object region.
  • A second aspect of the embodiments of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps in the positioning method for a mobile robot in a dynamic environment described above.
  • A third aspect of the embodiments of the present application provides a terminal device, which includes a processor, a memory, and a communication bus, where the memory stores a computer-readable program executable by the processor;
  • the communication bus implements connection and communication between the processor and the memory.
  • The present application provides a method for locating a mobile robot in a dynamic environment. The method includes acquiring a target image frame and determining a background area and an object area of the target image frame; determining a candidate camera pose corresponding to the target image frame based on the background area; determining the moving object area in the target image frame based on the background area, the object area, the previous image frame of the target image frame, and the candidate camera pose; and determining the target camera pose corresponding to the target image frame based on the target image frame and the moving object area.
  • The target image frame is segmented to obtain the object area and the background area, and the moving object area in the target image frame is determined in combination with the previous image frame of the target image frame, so that the accuracy of the moving object area can be improved.
  • This in turn improves the accuracy of determining the target camera pose from the image area of the target image frame with the moving object area removed, thereby improving the positioning accuracy of the mobile robot in a dynamic environment.
  • FIG. 1 is a flowchart of the positioning method for a mobile robot in a dynamic environment provided by the present application.
  • FIG. 2 is an example flow chart of the positioning method for a mobile robot in a dynamic environment provided by the present application.
  • FIG. 3 is an example diagram of matching between a target image frame and a previous image frame in the method for locating a mobile robot in a dynamic environment provided by the present application.
  • FIG. 4 is an error change diagram in the positioning method of a mobile robot in a dynamic environment provided by the present application.
  • FIG. 5 is a feature image in which the feature points of moving objects have not been removed.
  • FIG. 6 is a feature image from which the feature points of moving objects have been removed.
  • FIG. 7 is a schematic structural diagram of a terminal device provided by the present application.
  • the present application provides a positioning method for a mobile robot in a dynamic environment.
  • the present application will be further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are only used to explain the present application, but not to limit the present application.
  • In the present application, a target image frame is acquired, and a background area and an object area of the target image frame are determined; a candidate camera pose corresponding to the target image frame is determined based on the background area; the moving object area in the target image frame is determined based on the background area, the object area, the previous image frame of the target image frame, and the candidate camera pose; and the target camera pose corresponding to the target image frame is determined based on the target image frame and the moving object area.
  • The target image frame is segmented to obtain the object area and the background area, and the moving object area in the target image frame is determined in combination with the previous image frame of the target image frame, so that the accuracy of the moving object area can be improved.
  • This in turn improves the accuracy of determining the target camera pose from the image area of the target image frame with the moving object area removed, thereby improving the positioning accuracy of the mobile robot in a dynamic environment.
  • This embodiment provides a method for locating a mobile robot in a dynamic environment, as shown in FIG. 1 and FIG. 2 , the method includes:
  • The target image frame may be an image to be processed that is captured by an imaging module of the electronic device itself, or a target image frame captured by the imaging module of another electronic device and obtained through a network, Bluetooth, infrared, or the like.
  • In this embodiment, the target image frame is captured by an imaging module of the mobile robot itself, where the imaging module may be a camera such as a monocular camera, a binocular camera, or the like.
  • The mobile robot may be configured with a depth camera, and the target image frame is captured by the depth camera, so that the target image frame carries depth information; this resolves the scale-factor problem and therefore improves the positioning accuracy.
  • The target image frame captures the background of the shooting scene and the objects located in the scene; the background area occupied by the background and the object areas occupied by the objects in the target image frame can be obtained by segmenting the target image frame with a segmentation network model.
  • the acquiring the target image frame and determining the background area and the object area of the target image frame specifically include:
  • the segmentation network model is a trained deep learning model
  • the input item of the segmentation network model is the target image frame
  • the output item is the object area in the target image frame.
  • the segmentation network model outputs the annotated image carrying the annotation of the object area.
  • the object area in the target image frame can be determined based on the annotation image.
  • the image area from which the object area is removed in the target image frame is used as the background area of the target image frame, so as to obtain the background area and the object area of the target image frame.
  • The segmentation network model may use a yolact++ network model; instance segmentation is performed on the target image frame through the yolact++ network model to obtain the object regions in the target image frame, for example, the object regions corresponding to people, cars, animals, and so on.
  • The target image frame may include several objects (for example, a person, a dog, a kitten, etc.); accordingly, the object area may also include several object regions, each of which corresponds to one object in the shooting scene, and the objects corresponding to different object regions are different. For example, if the shooting scene includes human body A and human body B, the target image frame includes human body region a and human body region b, where human body A corresponds to human body region a and human body B corresponds to human body region b.
  • After the background region and the object region are obtained, feature point extraction may be performed on them to obtain background feature points corresponding to the background region and object feature points corresponding to the object region; the background feature points are then used to represent the background area, and the object feature points are used to represent the object area.
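  • As an illustration of this step (not the patent's own implementation), the sketch below splits a frame into object and background regions using per-instance masks, which are assumed to come from an instance-segmentation model such as yolact++, and extracts ORB feature points from each region with OpenCV.

```python
# Minimal sketch: per-region ORB feature extraction from instance masks.
import cv2
import numpy as np

def extract_region_features(frame_gray, instance_masks, n_features=1000):
    """frame_gray: HxW uint8 image; instance_masks: list of HxW boolean masks,
    one per detected object (assumed output of a segmentation model)."""
    orb = cv2.ORB_create(nfeatures=n_features)

    # Union of all object masks; everything outside it is the background area.
    object_mask = np.zeros(frame_gray.shape, dtype=np.uint8)
    for m in instance_masks:
        object_mask |= m.astype(np.uint8)
    object_mask *= 255
    background_mask = cv2.bitwise_not(object_mask)

    bg_kp, bg_desc = orb.detectAndCompute(frame_gray, mask=background_mask)
    obj_kp, obj_desc = orb.detectAndCompute(frame_gray, mask=object_mask)
    return (bg_kp, bg_desc), (obj_kp, obj_desc)
```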
  • The candidate camera pose is determined based on the background area in the target image frame, where the background area is the image area occupied by the background of the shooting scene in the target image.
  • Each feature point in the background area can be considered a static point, so the candidate camera pose corresponding to the target image frame can be determined based on the background area.
  • EPnP: Efficient Perspective-n-Point, which may be used here to solve the candidate camera pose.
  • The EPnP method determines the coordinates of each control point in the camera coordinate system from the positional relationship between four non-coplanar control points and the spatial points, together with the relationship between the spatial points and the target image frame, and then determines the camera pose; since the three-dimensional points are represented by only four control points, only those four control points need to be optimized, which improves the speed of camera pose determination.
  • each coordinate point in the camera coordinate system can be expressed as follows:
  • the camera rotation matrix and displacement vector are solved for multiple coordinate points, and the candidate camera pose is obtained.
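  • For illustration, the candidate camera pose can be obtained with an off-the-shelf EPnP solver; the sketch below uses OpenCV's cv2.solvePnP with the SOLVEPNP_EPNP flag. The matched 3-D background points (world_pts) and their pixel locations (image_pts) are assumed to be available, e.g. from depth back-projection and feature matching; they are not specified in this text.

```python
# Minimal sketch: candidate camera pose (R, t) from matched 3-D background
# points and their 2-D projections, solved with EPnP via OpenCV.
import cv2
import numpy as np

def candidate_pose_epnp(world_pts, image_pts, K, dist_coeffs=None):
    if dist_coeffs is None:
        dist_coeffs = np.zeros(4)
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(world_pts, dtype=np.float64),   # Nx3 world-frame points
        np.asarray(image_pts, dtype=np.float64),   # Nx2 pixel coordinates
        K, dist_coeffs, flags=cv2.SOLVEPNP_EPNP)
    if not ok:
        raise RuntimeError("EPnP solve failed")
    R, _ = cv2.Rodrigues(rvec)                     # rotation vector -> matrix
    return R, tvec                                 # candidate camera pose (R, t)
```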
  • The candidate camera pose (R, t) can be obtained by the above EPnP method; however, with the determined candidate camera pose, not all coordinate points coincide exactly after reprojection, so the candidate camera pose needs to be optimized.
  • The optimization process can be formulated as follows:
  • s_i is the scale factor
  • K is the camera internal parameter matrix
  • T is the camera transformation matrix
  • P_i represents the three-dimensional world coordinate point.
  • the optimal camera transformation matrix can be found by minimizing the cost function, and the expression is as follows:
  • P_i represents a three-dimensional world coordinate point
  • p_i represents the corresponding two-dimensional coordinate point in the projected image
  • The cost function is optimized by the Levenberg-Marquardt method to obtain the optimized camera rotation matrix and displacement vector, which are used as the candidate camera pose corresponding to the target image frame.
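  • The projection and cost-function equations referred to above are not reproduced in this text; a hedged reconstruction using the standard pinhole reprojection-error form, consistent with the symbols s_i, K, T, P_i, and p_i defined above, is:

```latex
% Assumed standard form, not the patent's own notation.
s_i \, p_i = K \, T \, P_i , \qquad
T^{*} = \arg\min_{T} \; \frac{1}{2} \sum_{i=1}^{n}
\left\| \, p_i - \frac{1}{s_i} \, K \, T \, P_i \, \right\|_2^{2}
```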
  • the acquisition moment of the previous image frame is located before the acquisition moment of the target image frame, and the previous image frame is adjacent to the target image frame.
  • The moving object area is the object area corresponding to a moving object in the target image frame, determined based on the background area, the object area, the previous image frame of the target image frame, and the candidate camera pose; the moving object area is contained in the object area.
  • The moving object area may be a part of the image area in the object area, or the moving object area may be the entire image area in the object area.
  • For example, the object area includes several object regions, denoted object region A and object region B, and the moving object region may be object region A.
  • In this embodiment, determining the moving object area in the target image frame based on the background area, the object area, the previous image frame of the target image frame, and the candidate camera pose specifically includes:
  • determining the moving object region in the target image frame.
  • The target background feature point is a feature point in the background area for which a matching feature point exists in the previous image frame, where the world point in the shooting scene corresponding to the target background feature point is the same as the world point in the shooting scene corresponding to the matching feature point.
  • For example, as shown in FIG. 3, the feature point P_1 in image frame I_1 and the feature point P_2 in image frame I_2 both correspond to the same world point P in the world coordinate system; the feature point P_1 in image frame I_1 is then taken as the target feature point, and the feature point P_2 in image frame I_2 is its matching feature point.
  • The mobile robot moves between capturing the previous image frame and the target image frame, so the shooting scene of the target image frame and that of the previous image frame may differ.
  • As a result, some feature points in the target image frame have no matching feature points in the previous image frame; these unmatched feature points may be partly contained in the background area and partly in the object area, entirely in the background area, or entirely in the object area.
  • Accordingly, the target background feature points in the background region and the target object feature points in the object region are determined respectively, where a target object feature point is a feature point in the object region for which a matching feature point exists in the previous image frame.
  • The epipolar constraint from multi-view geometry is used: as shown in FIG. 3, the plane formed by the feature point in the target image frame, the matching feature point in the previous image frame, and the world point intersects the imaging planes at the two matched feature points. When there is no error between the two matched feature points,
  • the relationship between the two matched feature points and the camera transformation can be expressed as equation (1), in which:
  • u_2 is the matching feature point
  • u_1 is the target feature point
  • K is the internal parameter matrix of the camera
  • R is the rotation matrix of the camera
  • t is the displacement vector of the camera.
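  • The constraint itself is not reproduced in this text; assuming the standard epipolar form consistent with the symbols u_1, u_2, K, R, and t defined above, equation (1) and the associated transformation (fundamental) matrix F can be written as:

```latex
% Assumed standard epipolar constraint and fundamental matrix.
u_2^{\top} \, K^{-\top} \, t^{\wedge} R \, K^{-1} \, u_1 = 0 ,
\qquad
F = K^{-\top} \, t^{\wedge} R \, K^{-1}
```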
  • Because of errors, equation (1) does not hold exactly, so the errors need to be estimated in order to determine the moving object region in the object area based on them.
  • Specifically, an error threshold may be determined from the errors between the target background feature points and their corresponding matching feature points, and the object error between each target object feature point and its corresponding matching feature point may then be measured against this threshold to determine whether the object corresponding to the target object feature point is a moving object.
  • Determining, based on the candidate camera pose, the background error value between each target background feature point and its corresponding matching feature point, and the object error value between each target object feature point and its corresponding matching feature point, specifically includes:
  • for each target feature point in the target feature point set formed by the target background feature points and the target object feature points, determining, based on the transformation matrix, the target error value corresponding to the target feature point from the target feature point and its corresponding matching feature point.
  • The transformation matrix is determined based on the candidate camera pose and the camera parameters and is used to convert the target pixel into the world coordinate system; its calculation formula can be expressed in terms of the following quantities:
  • K is the internal parameter of the camera
  • R is the rotation matrix of the camera
  • t is the displacement vector of the camera.
  • The target feature point is converted to homogeneous coordinates, i.e., into a three-dimensional feature point, and the converted three-dimensional feature point is used as the target feature point; the target error value corresponding to the target feature point is then calculated based on the transformation matrix and the converted three-dimensional feature point.
  • In the formula for calculating the target error value:
  • d represents the target error value
  • F represents the transformation matrix
  • u_1 represents the target feature point
  • u_2 represents the matching feature point of the target feature point
  • (F u_1)_1 represents the first vector element of the vector F u_1
  • (F u_1)_2 represents the second vector element of the vector F u_1.
  • The target feature point and its corresponding matching feature point are formed into a four-dimensional vector for the estimation, and the Sampson distance (originally used for fitting quadric surfaces) is used as the target error value, which improves both the calculation speed and the accuracy of the target error value.
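  • For illustration, the sketch below computes such an error from the quantities defined above. The patent's exact formula is not reproduced here; the code assumes the common point-to-epipolar-line (Sampson-style) form d = (u_2^T F u_1)^2 / ((F u_1)_1^2 + (F u_1)_2^2) and the standard construction F = K^-T t^ R K^-1 from the candidate camera pose.

```python
# Minimal sketch (assumed standard forms, not the patent's own equations).
import numpy as np

def transformation_matrix(K, R, t):
    """F = K^-T [t]_x R K^-1 built from the candidate camera pose (R, t)."""
    t = np.asarray(t, dtype=float).reshape(3)
    tx = np.array([[0.0, -t[2], t[1]],
                   [t[2], 0.0, -t[0]],
                   [-t[1], t[0], 0.0]])            # skew-symmetric matrix [t]_x
    K_inv = np.linalg.inv(K)
    return K_inv.T @ tx @ R @ K_inv

def target_error(F, u1, u2):
    """Error for a matched pixel pair: u1 in the target frame, u2 in the
    previous frame, both given as (x, y) pixel coordinates."""
    u1h = np.array([u1[0], u1[1], 1.0])            # homogeneous target feature point
    u2h = np.array([u2[0], u2[1], 1.0])            # homogeneous matching feature point
    Fu1 = F @ u1h
    return float((u2h @ Fu1) ** 2 / (Fu1[0] ** 2 + Fu1[1] ** 2))
```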
  • Determining the motion feature points in the target image frame based on each background error value and each object error value specifically includes:
  • selecting the target object error values that exceed the error threshold, and using the target object feature points corresponding to the selected error values as the motion feature points in the target image frame.
  • In principle, a background error value could be selected at random as the error threshold corresponding to the target image frame, and the motion feature points in the object area determined from this threshold and each target object error value. The reason is that when an object does not move, its pose transformation is consistent with the camera transformation, so its object error values are essentially the same as the background error values; conversely, when the object moves, the difference between the object error values and the background error values grows. Therefore, whether a target object feature point is a motion feature point is determined by comparing the object error value between the target object feature point and its corresponding matching feature point with the error threshold: if the object error value is greater than the error threshold, the target object feature point is determined to be a motion feature point, that is, if the expression d_i > t is satisfied, where:
  • d_i is the object error value corresponding to the i-th target object feature point
  • t is the error threshold
  • In practice, there may be an error between the estimated candidate camera pose and the real camera pose, and when the camera moves faster than a preset speed threshold, there may also be errors in extracting the background area and the object area of the target image frame and the previous image frame, as well as in matching the target background feature points and the target object feature points.
  • Consequently, using a single background error value as a preset threshold, or judging whether an object feature point is a motion point against a fixed error threshold, will introduce errors.
  • Therefore, an adaptive threshold method can be used: the mean of the background error values of the target background feature points is calculated, and this mean is used as the error threshold.
  • The calculation formula of the error threshold may be d_mean = (1/n) Σ_{i=1}^{n} d_i, where:
  • d_mean represents the error threshold
  • n represents the number of target background feature points
  • d_i represents the background error value of the i-th target background feature point.
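  • A minimal sketch of this adaptive threshold and the motion-point test described above (the threshold is the mean of the background error values, and object feature points whose error exceeds it are flagged as motion feature points):

```python
import numpy as np

def motion_feature_mask(background_errors, object_errors):
    """Return a boolean array marking which object feature points are motion points."""
    d_mean = float(np.mean(background_errors))   # adaptive error threshold d_mean
    return np.asarray(object_errors) > d_mean    # True where d_i > d_mean
```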
  • The object area includes several object regions; determining the moving object area in the target image frame based on the determined motion feature points specifically includes:
  • for each object region, selecting the target motion feature points located in the object region from the motion feature points, and determining the ratio of the number of selected target motion feature points to the number of feature points included in the object region; if the ratio exceeds the preset ratio threshold, the object region is taken as a moving object region.
  • the preset ratio threshold is preset and is used to measure whether the object area is a moving object area.
  • Generally, the preset ratio threshold is greater than 50%, for example, 60%.
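  • For illustration, the per-region decision can be sketched as follows (the 0.6 default mirrors the 60% example above):

```python
def is_moving_region(n_motion_points, n_region_points, ratio_threshold=0.6):
    """An object region is treated as a moving object region when the share of
    motion feature points among all feature points in the region exceeds the threshold."""
    if n_region_points == 0:
        return False
    return n_motion_points / n_region_points > ratio_threshold
```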
  • After the moving object area in the target image frame has been determined, the method may further include:
  • using the object region as a moving object region in the target image frame; or
  • using the object region as part of the background region in the target image frame.
  • The motion state includes motion and stillness; for example, the candidate motion state may be motion, the reference motion state may be stillness, and the target motion state may be motion.
  • The target motion state is the motion state of the object area in the target image frame.
  • If the object area is determined to be a moving object area, the target motion state of the object area is motion; if it is not a moving object area, the target motion state of the object area is static.
  • The candidate motion state is the motion state determined in a reference image frame, where the frame number of the reference image frame is a multiple of a preset frame number threshold, and the reference image frame is the image frame whose acquisition time is before that of the target image frame and closest to it.
  • the acquiring the candidate motion state corresponding to the object region specifically includes:
  • the motion state of the candidate object region corresponding to the object region in the reference image frame is used as the candidate motion state corresponding to the object region.
  • The preset frame number threshold is set in advance, and the candidate motion state of each object region is updated at this interval; the candidate motion state can be stored as a configuration parameter, and the configuration parameter is updated once every preset-frame-number-threshold image frames, so that when the reference image frame corresponding to the target image frame is obtained, the stored configuration parameter can be read directly.
  • The candidate motion states corresponding to each object region are stored so that the motion state of an object region can be quickly acquired.
  • the camera configured on the mobile robot collects image frames at a frame rate of 30 frames per second.
  • The preset frame number threshold is 10 frames, i.e., about 0.3 seconds per stage; the motion state of the object area corresponding to the same object is calculated continuously within the 10 frames, and the candidate motion state of the object is updated every 10 frames.
  • The probability is between 0 and 1; if it were increased or decreased directly over successive image frames, it could exceed this range.
  • Therefore, the logarithmic value of the probability is used to describe the motion state of the object, and the expression for this value can be:
  • y is the logarithmic value of probability
  • P is the probability of object motion
  • If the object region is judged to be moving in the current image frame, the probability logarithm value of the object is increased by one; otherwise, it is decreased by one.
  • The candidate motion state of the object is updated every preset-frame-number-threshold image frames, and when the state of the object changes, its motion probability changes accordingly.
  • If the probability logarithm value indicates motion at the update, the candidate motion state is set to 1; otherwise, the candidate motion state is set to -1.
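  • A minimal sketch of this bookkeeping, assuming the logarithmic value is the log-odds y = log(P / (1 - P)) (the patent's own expression is not reproduced in this text):

```python
import math

class ObjectMotionState:
    """Per-object motion-state bookkeeping with a log-odds style counter."""

    def __init__(self):
        self.log_value = 0.0                 # y = log(P / (1 - P)), P = 0.5 initially

    def update(self, judged_moving: bool):
        # +1 when the region is judged moving in the current frame, -1 otherwise.
        self.log_value += 1.0 if judged_moving else -1.0

    @property
    def probability(self) -> float:
        return 1.0 / (1.0 + math.exp(-self.log_value))

    @property
    def candidate_state(self) -> int:
        # 1 = moving, -1 = static; refreshed every preset number of frames.
        return 1 if self.log_value > 0 else -1
```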
  • The reference motion state is the motion state of the object region corresponding to the object area in a candidate image frame, so when the reference motion state is acquired, the object needs to be tracked in order to determine, in each candidate image frame, the candidate object area corresponding to the same object as the object area.
  • When there is only one candidate image frame, the object regions in the candidate image frame can be directly matched with the object regions in the target image frame to determine the candidate object region corresponding to each object region.
  • When there are multiple candidate image frames, the correspondence between the objects of each candidate image frame and those of its own previous image frame has already been determined and recorded when the motion state of that candidate image frame was determined; therefore, the correspondence between the objects of each candidate image frame preceding the previous image frame of the target image frame and the objects in the previous image frame of the target image frame can be obtained directly, and only the correspondence between the target image frame and its previous image frame needs to be computed.
  • Accordingly, the method may further include a process of determining the correspondence between the object regions in the target image frame and those in its previous image frame, which may specifically include:
  • for each of the several object regions, determining the spatial position matching degree between the region position of the moving object region and the region position of each reference moving object region in the previous image frame, and the matching coefficient between the feature points in the moving object region and the feature points of each reference moving object region; and
  • determining the reference moving object region corresponding to each moving object region.
  • the object needs to be tracked.
  • a method based on Kalman filtering and image feature fusion is used to track the moving object.
  • the Kalman filtering equation is as follows:
  • x_k is the state at time k
  • A_k is the state transition matrix
  • u_k is the input at time k
  • w_k is the process noise
  • z_k is the measured value at time k
  • C_k is the observation matrix
  • δ_k is the observation noise.
  • The object state in the target image frame can thus be estimated from the state of the previous image frame and the input of the target image frame.
  • Among the Kalman filter equations, the motion (state-transition) equation is used to calculate the object's position from its motion state.
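  • The Kalman filter equations themselves are not reproduced in this text; assuming the standard discrete-time state-space form consistent with the symbols x_k, A_k, u_k, w_k, z_k, C_k, and δ_k defined above, they can be written as:

```latex
% Assumed standard state-space form: motion (state-transition) equation
% followed by the observation equation.
x_k = A_k \, x_{k-1} + u_k + w_k , \qquad
z_k = C_k \, x_k + \delta_k
```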
  • Through this calculation, the estimated position of each tracked object in the target image frame can be obtained; then, according to the position of each object detected in the target image frame, the spatial position matching degree between each tracked object and each detected object can be calculated.
  • the expression of spatial position matching degree can be:
  • area_i is the area of the i-th detected target rectangle
  • area_j is the area of the j-th tracked target rectangle
  • area_in is the area of the overlapping region of the two rectangles
  • iou_ij is the overlap ratio between the i-th detected target and the j-th tracked target.
  • the spatial position similarity matrix is constructed by the overlap ratio relationship between each tracked object and each detected object:
  • In addition to the spatial overlap, the similarity of feature point matching is calculated for feature fusion.
  • ORB feature points are extracted from the tracked object and the detected object, and ORB feature point matching is then performed to obtain the number of matching points between the tracked object and the detected object; the ratio of the number of matching points between the two frames to the total number of feature points is calculated as the matching coefficient.
  • the calculation formula of the matching coefficient can be as follows:
  • n_i is the number of feature points of the i-th detected target
  • n_j is the number of feature points of the j-th tracked target
  • n_in is the number of successfully matched points between the two targets
  • rate_ij is the matching coefficient between the i-th detected target and the j-th tracked target.
  • W is the fusion similarity matrix
  • a is the fusion coefficient
  • A fusion coefficient of 0.5 is selected to construct the fusion similarity matrix, and the Hungarian algorithm is used to solve the assignment problem, yielding a one-to-one correspondence between the tracked objects and the detected objects and thereby realizing the tracking of the moving objects.
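  • The sketch below illustrates this association step: the IoU between tracked and detected boxes and the ORB match ratio are fused into W = a·IOU + (1 − a)·RATE with a = 0.5, and the Hungarian algorithm (SciPy's linear_sum_assignment) yields the one-to-one correspondence. The IoU and match-ratio matrices are assumed to be precomputed as described above.

```python
# Minimal sketch of the fused data association (assumed formulas).
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(box_a, box_b):
    """Boxes as (x1, y1, x2, y2); iou = area_in / (area_i + area_j - area_in)."""
    xa, ya = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xb, yb = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, xb - xa) * max(0.0, yb - ya)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def associate(iou_matrix, match_ratio_matrix, a=0.5):
    """Fuse the two similarity matrices and solve the assignment problem."""
    W = a * np.asarray(iou_matrix) + (1.0 - a) * np.asarray(match_ratio_matrix)
    det_idx, trk_idx = linear_sum_assignment(-W)   # Hungarian algorithm (maximize W)
    return list(zip(det_idx.tolist(), trk_idx.tolist()))
```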
  • Determining the target camera pose corresponding to the target image frame based on the target image frame and the moving object area means determining the stationary regions in the target image frame, i.e., the image area with the moving object area removed, and then using those stationary regions as the background region to perform camera pose estimation and obtain the target camera pose.
  • The camera pose estimation process here is the same as the camera pose estimation process described above and is not repeated. For example, FIG. 5 shows the feature image before the feature points of moving objects are removed; after removing the moving feature points, the feature image shown in FIG. 6, with the feature points of moving objects removed, is obtained.
  • The target camera pose is then determined based on this feature image.
  • this embodiment provides a method for locating a mobile robot in a dynamic environment.
  • The method includes acquiring a target image frame and determining a background area and an object area of the target image frame; determining a candidate camera pose corresponding to the target image frame based on the background area; determining the moving object area in the target image frame based on the background area, the object area, the previous image frame of the target image frame, and the candidate camera pose; and determining the target camera pose corresponding to the target image frame based on the target image frame and the moving object area.
  • The present application segments the target image frame to obtain the object area and the background area, and then combines the previous image frame to determine the moving object area in the target image frame, which can improve the accuracy of the moving object area; this in turn improves the accuracy of determining the target camera pose from the image area of the target image frame with the moving object area removed, thereby improving the positioning accuracy of the mobile robot in a dynamic environment.
  • In addition, this embodiment adopts a yolact++ instance segmentation model and, based on the epipolar constraint of multi-view geometry, adds the Sampson distance as a motion judgment criterion, so as to improve the accuracy of moving object detection.
  • a feature fusion tracking algorithm based on Kalman filtering and feature descriptor matching is used in the tracking of moving objects, which is beneficial to the accuracy of moving object tracking.
  • Based on the above method, this embodiment provides a computer-readable storage medium storing one or more programs, and the one or more programs can be executed by one or more processors to implement the steps in the method for positioning a mobile robot in a dynamic environment described in the foregoing embodiments.
  • The present application also provides a terminal device, as shown in FIG. 7, which includes at least one processor 20, a display screen 21, and a memory 22, and may also include a communication interface (Communications Interface) 23 and a bus 24.
  • the processor 20 , the display screen 21 , the memory 22 and the communication interface 23 can communicate with each other through the bus 24 .
  • the display screen 21 is set to display a user guide interface preset in the initial setting mode.
  • the communication interface 23 can transmit information.
  • the processor 20 may invoke logic instructions in the memory 22 to perform the methods in the above-described embodiments.
  • logic instructions in the memory 22 can be implemented in the form of software functional units and can be stored in a computer-readable storage medium when sold or used as an independent product.
  • the memory 22 may be configured to store software programs and computer-executable programs, such as program instructions or modules corresponding to the methods in the embodiments of the present disclosure.
  • the processor 20 executes functional applications and data processing by running the software programs, instructions or modules stored in the memory 22, that is, implements the methods in the above embodiments.
  • The memory 22 may include a storage program area and a storage data area, where the storage program area may store an operating system and an application program required for at least one function, and the storage data area may store data created according to the use of the terminal device, and the like. In addition, the memory 22 may include high-speed random access memory and may also include non-volatile memory, for example, a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or optical disk, or other media that can store program code, or a temporary-state storage medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The present application discloses a method for positioning a mobile robot in a dynamic environment. The method comprises: obtaining a target image frame and determining a background region and an object region of the target image frame; determining, on the basis of the background region, a candidate camera pose corresponding to the target image frame; determining a moving object region in the target image frame on the basis of the background region, the object region, the previous image frame of the target image frame, and the candidate camera pose; and determining, on the basis of the target image frame and the moving object region, a target camera pose corresponding to the target image frame. According to the present application, the target image frame is segmented to obtain the object region and the background region, and the moving object region in the target image frame is then determined in combination with the previous image frame, so that the accuracy of the moving object region can be improved; this makes it possible to improve the accuracy of determining the target camera pose on the basis of the image region of the target image frame with the moving object region removed, thereby improving the positioning accuracy of the mobile robot in the dynamic environment.
PCT/CN2021/112575 2021-04-12 2021-08-13 Method for positioning a mobile robot in a dynamic environment WO2022217794A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110388370.7A CN113052907B (zh) 2021-04-12 2021-04-12 一种动态环境移动机器人的定位方法
CN202110388370.7 2021-04-12

Publications (1)

Publication Number Publication Date
WO2022217794A1 true WO2022217794A1 (fr) 2022-10-20

Family

ID=76519234

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/112575 WO2022217794A1 (fr) Method for positioning a mobile robot in a dynamic environment

Country Status (2)

Country Link
CN (1) CN113052907B (fr)
WO (1) WO2022217794A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116051915A (zh) * 2023-02-22 2023-05-02 东南大学 基于聚类与几何残差的动态场景rgb-d slam方法
CN117934571A (zh) * 2024-03-21 2024-04-26 广州市艾索技术有限公司 一种4k高清的kvm坐席管理系统

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113052907B (zh) * 2021-04-12 2023-08-15 深圳大学 一种动态环境移动机器人的定位方法
CN113997295B (zh) * 2021-12-30 2022-04-12 湖南视比特机器人有限公司 机械臂的手眼标定方法、装置、电子设备及存储介质

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3506621A1 (fr) * 2017-12-28 2019-07-03 Canon Kabushiki Kaisha Appareil de traitement d'images numériques, procédé de traitement d'images numériques et programme informatique
CN110232379A (zh) * 2019-06-03 2019-09-13 上海眼控科技股份有限公司 一种车辆姿态检测方法及系统
CN110738667A (zh) * 2019-09-25 2020-01-31 北京影谱科技股份有限公司 一种基于动态场景的rgb-d slam方法和系统
CN110825123A (zh) * 2019-10-21 2020-02-21 哈尔滨理工大学 一种基于运动算法的自动跟随载物车的控制系统及方法
CN111724439A (zh) * 2019-11-29 2020-09-29 中国科学院上海微系统与信息技术研究所 一种动态场景下的视觉定位方法及装置
CN112132897A (zh) * 2020-09-17 2020-12-25 中国人民解放军陆军工程大学 一种基于深度学习之语义分割的视觉slam方法
CN113052907A (zh) * 2021-04-12 2021-06-29 深圳大学 一种动态环境移动机器人的定位方法

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108537829B (zh) * 2018-03-28 2021-04-13 哈尔滨工业大学 一种监控视频人员状态识别方法
CN110378345B (zh) * 2019-06-04 2022-10-04 广东工业大学 基于yolact实例分割模型的动态场景slam方法
CN112313536B (zh) * 2019-11-26 2024-04-05 深圳市大疆创新科技有限公司 物体状态获取方法、可移动平台及存储介质
CN111402336B (zh) * 2020-03-23 2024-03-12 中国科学院自动化研究所 基于语义slam的动态环境相机位姿估计及语义地图构建方法
CN112101160B (zh) * 2020-09-04 2024-01-05 浙江大学 一种面向自动驾驶场景的双目语义slam方法

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3506621A1 (fr) * 2017-12-28 2019-07-03 Canon Kabushiki Kaisha Appareil de traitement d'images numériques, procédé de traitement d'images numériques et programme informatique
CN110232379A (zh) * 2019-06-03 2019-09-13 上海眼控科技股份有限公司 一种车辆姿态检测方法及系统
CN110738667A (zh) * 2019-09-25 2020-01-31 北京影谱科技股份有限公司 一种基于动态场景的rgb-d slam方法和系统
CN110825123A (zh) * 2019-10-21 2020-02-21 哈尔滨理工大学 一种基于运动算法的自动跟随载物车的控制系统及方法
CN111724439A (zh) * 2019-11-29 2020-09-29 中国科学院上海微系统与信息技术研究所 一种动态场景下的视觉定位方法及装置
CN112132897A (zh) * 2020-09-17 2020-12-25 中国人民解放军陆军工程大学 一种基于深度学习之语义分割的视觉slam方法
CN113052907A (zh) * 2021-04-12 2021-06-29 深圳大学 一种动态环境移动机器人的定位方法

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116051915A (zh) * 2023-02-22 2023-05-02 东南大学 基于聚类与几何残差的动态场景rgb-d slam方法
CN117934571A (zh) * 2024-03-21 2024-04-26 广州市艾索技术有限公司 一种4k高清的kvm坐席管理系统
CN117934571B (zh) * 2024-03-21 2024-06-07 广州市艾索技术有限公司 一种4k高清的kvm坐席管理系统

Also Published As

Publication number Publication date
CN113052907B (zh) 2023-08-15
CN113052907A (zh) 2021-06-29

Similar Documents

Publication Publication Date Title
WO2022217794A1 (fr) Method for positioning a mobile robot in a dynamic environment
CN110349250B (zh) 一种基于rgbd相机的室内动态场景的三维重建方法
US11830216B2 (en) Information processing apparatus, information processing method, and storage medium
WO2021196294A1 (fr) Procédé et système de suivi d'emplacement de personne à travers des vidéos, et dispositif
KR101725060B1 (ko) 그래디언트 기반 특징점을 이용한 이동 로봇의 위치를 인식하기 위한 장치 및 그 방법
CN111665842B (zh) 一种基于语义信息融合的室内slam建图方法及系统
US8903161B2 (en) Apparatus for estimating robot position and method thereof
CN110176032B (zh) 一种三维重建方法及装置
CN110363817B (zh) 目标位姿估计方法、电子设备和介质
CN109472820B (zh) 单目rgb-d相机实时人脸重建方法及装置
CN112784873B (zh) 一种语义地图的构建方法及设备
JP6860620B2 (ja) 情報処理装置、情報処理方法、及びプログラム
WO2022042304A1 (fr) Procédé et appareil pour identifier un contour de lieu, support lisible par ordinateur et dispositif électronique
JP2015522200A (ja) 人顔特徴点の位置決め方法、装置及び記憶媒体
WO2012023593A1 (fr) Appareil de mesure de position et d'orientation, procédé de mesure de position et d'orientation et support de stockage
TWI795885B (zh) 視覺定位方法、設備和電腦可讀儲存介質
CN112083403B (zh) 用于虚拟场景的定位追踪误差校正方法及系统
WO2023173950A1 (fr) Procédé de détection d'obstacle, robot mobile et support de stockage lisible par une machine
JP6817742B2 (ja) 情報処理装置およびその制御方法
CN111161334A (zh) 一种基于深度学习的语义地图构建方法
CN112465858A (zh) 基于概率网格滤波的语义视觉slam方法
JP6922348B2 (ja) 情報処理装置、方法、及びプログラム
JP5976089B2 (ja) 位置姿勢計測装置、位置姿勢計測方法、およびプログラム
CN116468786A (zh) 一种面向动态环境的基于点线联合的语义slam方法
CN117218210A (zh) 一种基于仿生眼的双目主动视觉半稠密深度估计方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21936666

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 15/02/2024)