CN112435278B - Visual SLAM method and device based on dynamic target detection

Visual SLAM method and device based on dynamic target detection

Info

Publication number
CN112435278B
CN112435278B (application number CN202110100524.8A)
Authority
CN
China
Prior art keywords
feature point
dynamic
static
image
frame image
Prior art date
Legal status
Active
Application number
CN202110100524.8A
Other languages
Chinese (zh)
Other versions
CN112435278A (en)
Inventor
徐雪松 (Xu Xuesong)
曾昱 (Zeng Yu)
Current Assignee
East China Jiaotong University
Original Assignee
East China Jiaotong University
Priority date
Filing date
Publication date
Application filed by East China Jiaotong University
Priority to CN202110100524.8A
Publication of CN112435278A
Application granted
Publication of CN112435278B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a visual SLAM method based on dynamic target detection. The method temporarily removes potential dynamic regions from the images using the Yolov3 target detection network, optimizes a homography matrix through the reprojection error to solve for a motion compensation frame and obtain a four-frame difference map, then applies filtering, binarization and morphological processing to the four-frame difference map while refining the dynamic target detection result in combination with the Yolov3 network to obtain an improved dynamic target region, and finally performs tracking, mapping and loop closure detection for the visual SLAM using the feature points of the static regions. The method removes potential dynamic regions in the scene with a deep learning target detection network and roughly estimates a homography matrix; it then judges, with a method combining the reprojection error and the inter-class variance, whether the feature points in a potential dynamic region can be used to calculate the homography matrix, so as to optimize the homography matrix and improve its accuracy.

Description

Visual SLAM method and device based on dynamic target detection
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a visual SLAM method and device based on dynamic target detection.
Background
Simultaneous localization and mapping (SLAM) technology is being applied ever more widely in fields such as robot localization and autonomous driving. Visual sensors are easy to carry and inexpensive, so they are widely used in SLAM. Most traditional visual SLAM algorithms, such as ORB-SLAM2, DSO and SVO, assume that the camera operates in a static environment; when the scene contains dynamic regions, the feature points that the visual SLAM extracts on dynamic objects degrade the accuracy of the algorithm.
For the problem of accuracy degradation of visual odometry in dynamic scenes, a commonly adopted method is to detect the dynamic objects in the image in advance, remove the feature points of the dynamic regions, and retain the feature points of the static regions for tracking and mapping of the visual SLAM. However, in images where the dynamic regions are large, removing them severely degrades the accuracy of tracking and mapping of the visual SLAM.
The defects of the prior art mainly stem from the following: a deep learning target detection network used alone can classify objects with mobility, such as people and automobiles, as potential dynamic targets in advance, but it cannot judge whether a potential dynamic target is actually in motion; if the potential dynamic target is in fact static, too many static feature points are removed. Algorithms that combine depth information for dynamic detection exist, but when the depth information of some image regions is uncertain, or when the foreground and background depths are close, their classification may be inaccurate.
Disclosure of Invention
The present invention provides a visual SLAM method and apparatus based on dynamic target detection, which is used to solve at least one of the above technical problems.
The invention provides a visual SLAM method based on dynamic target detection, which comprises the following steps: in response to each acquired image frame, performing region segmentation on the image frame based on a deep learning target detection network, wherein each image frame comprises a potential dynamic region and/or a static region, the potential dynamic region comprises motion feature points and/or first static feature points, and the static region comprises second static feature points; matching the second static feature points of the previous frame image with the second static feature points of the current frame image; in response to the acquired matching relation, calculating a first homography matrix based on the RANSAC (Random Sample Consensus) algorithm; respectively extracting the first static feature points of the previous frame image and of the current frame image based on a motion feature point filtering method, wherein the motion feature point filtering method combines the reprojection error of the feature points with the maximum inter-class variance method; optimizing the first homography matrix to obtain a second homography matrix based on the matching relation between the first static feature points of the previous frame image and those of the current frame image; and performing motion compensation on the previous frame image according to the second homography matrix, so as to obtain a motion compensation frame image.
The invention provides a visual SLAM device based on dynamic target detection, which comprises: the segmentation module is configured to perform region segmentation on each image frame based on a deep learning target detection network in response to the acquired image frame, wherein each image frame comprises a potential dynamic region and/or a static region, the potential dynamic region comprises a motion feature point and/or a first static feature point, and the static region comprises a second static feature point; the matching module is configured to match the second static feature point of the previous frame image with the second static feature point of the current frame image; the calculation module is configured to respond to the acquired matching relationship and calculate to obtain a first homography matrix based on a RANSAC algorithm; the extraction module is configured to respectively extract the first static feature point of the previous frame image and the first static feature point of the current frame image based on a motion feature point filtering method, wherein the motion feature point filtering method is a method formed by combining a reprojection error of feature points with a maximum inter-class variance method; the optimization module is configured to optimize the first homography matrix and obtain a second homography matrix based on the matching relation between the first static feature point of the previous frame image and the first static feature point of the current frame image; and the compensation module is configured to perform motion compensation on the previous frame image according to the second homography matrix, so that a motion compensation frame image is obtained.
An electronic device is provided, which includes: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the visual SLAM method based on dynamic object detection of the present invention.
The present invention also provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the dynamic object detection based visual SLAM method of the present invention.
The method and the device adopt a deep learning target detection network to eliminate potential dynamic regions in a scene, roughly estimate a homography matrix, judge whether feature points on the potential dynamic regions can be used for calculation of the homography matrix or not based on a method of combining a reprojection error and an inter-class variance, and optimize the homography matrix, so that the precision of the homography matrix is effectively improved, the result of motion compensation is further optimized, and a dynamic target in an image can be accurately obtained through a frame difference method.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.
Fig. 1 is a flowchart of a visual SLAM method based on dynamic target detection according to an embodiment of the present invention;
Fig. 2 is a flowchart of a visual SLAM method based on dynamic target detection according to another embodiment of the present invention;
Fig. 3 is a flowchart of a visual SLAM method based on dynamic target detection according to yet another embodiment of the present invention;
Fig. 4 is a diagram illustrating the effect of detecting a motion region when an image is blurred according to an embodiment of the present invention;
Fig. 5 is a block diagram of a visual SLAM device based on dynamic target detection according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a flowchart of an embodiment of a visual SLAM method based on dynamic object detection according to the present application is shown.
As shown in fig. 1, the visual SLAM method based on dynamic object detection includes the following steps:
in S101, responding to each acquired image frame, performing region segmentation on each image frame based on a deep learning target detection network, wherein each image frame comprises a potential dynamic region and/or a static region, the potential dynamic region comprises a motion feature point and/or a first static feature point, and the static region comprises a second static feature point;
in this embodiment, in response to each acquired image frame, each image frame is subjected to region segmentation based on a deep learning target detection network, the deep learning target detection network adopts a Darknet53 network and a multi-scale feature to perform target detection, and has good identification speed and accuracy, common objects with motility, such as pedestrians and vehicles, can be effectively identified, such objects with motility are classified as potential dynamic objects, the region where the potential dynamic objects are located is a potential dynamic region, the potential dynamic region contains a movement feature point and/or a first static feature point, the region where the static objects are located is a static region, and the static region contains a second static feature point.
According to the scheme of this embodiment, a deep learning target detection network is adopted for target detection: dynamic target detection is performed on each image frame, and the potential dynamic regions and/or static regions in each image frame are screened and segmented. Each image frame may contain a potential dynamic region, and a potential dynamic region may contain first static feature points; this allows the subsequent visual SLAM device to temporarily remove the potential dynamic regions for feature point matching and to roughly calculate a homography matrix with the RANSAC algorithm.
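As an illustration only (the patent specifies the Yolov3 network but no implementation), the following sketch uses OpenCV's DNN module with standard Darknet config/weight files to obtain potential dynamic boxes and split keypoints accordingly; the file names and COCO class ids are assumptions, not values from the patent:

```python
import cv2
import numpy as np

POTENTIAL_DYNAMIC_CLASSES = {0, 2}  # assumed COCO ids for "person" and "car"

net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")

def detect_potential_dynamic_boxes(frame, conf_thresh=0.5):
    """Return [x, y, w, h] boxes of potentially dynamic objects in the frame."""
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(net.getUnconnectedOutLayersNames())
    h, w = frame.shape[:2]
    boxes = []
    for out in outputs:
        for det in out:
            scores = det[5:]
            cls = int(np.argmax(scores))
            if cls in POTENTIAL_DYNAMIC_CLASSES and scores[cls] > conf_thresh:
                cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
                boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
    return boxes

def split_keypoints(keypoints, boxes):
    """Separate keypoints inside potential dynamic boxes from static-region ones."""
    def inside(pt):
        return any(x <= pt[0] <= x + w and y <= pt[1] <= y + h for x, y, w, h in boxes)
    potential = [kp for kp in keypoints if inside(kp.pt)]
    static = [kp for kp in keypoints if not inside(kp.pt)]
    return potential, static
```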
In S102, the second static feature point of the previous frame image and the second static feature point of the current frame image are matched.
In this embodiment, the second static feature points of the previous frame image and of the current frame image are matched, so as to obtain the matching relationship between them.
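A minimal sketch of this matching step, assuming ORB features and brute-force Hamming matching (the patent does not name a specific descriptor or matcher); the masks are assumed to be 255 in the static region and 0 inside potential dynamic regions:

```python
import cv2

orb = cv2.ORB_create(nfeatures=1000)

def match_static_points(prev_gray, curr_gray, prev_mask=None, curr_mask=None):
    """Detect and match second static feature points between two frames."""
    kp1, des1 = orb.detectAndCompute(prev_gray, prev_mask)
    kp2, des2 = orb.detectAndCompute(curr_gray, curr_mask)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    return kp1, kp2, matches
```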
In S103, in response to the obtained matching relationship, a first homography matrix is calculated based on the RANSAC algorithm.
In this embodiment, in response to the obtained matching relationship, the first homography matrix is calculated based on the RANSAC algorithm. Specifically, in scenes where the dynamic area is small, the calculated first homography matrix can be used directly to motion-compensate the image.
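This step maps naturally onto cv2.findHomography with the RANSAC flag; a minimal sketch, where the 3.0-pixel reprojection threshold is an illustrative assumption:

```python
import cv2
import numpy as np

def first_homography(kp1, kp2, matches, ransac_thresh=3.0):
    """Estimate the first homography H1 from matched static-region points."""
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H1, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, ransac_thresh)
    return H1, inlier_mask
```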
In S104, a first static feature point of the previous frame image and a first static feature point of the current frame image are respectively extracted based on a motion feature point filtering method, where the motion feature point filtering method combines the reprojection error of the feature points with the maximum inter-class variance method.
In this embodiment, to determine whether the feature points of the potential dynamic region can be used to calculate the homography matrix H, and thus to optimize the first homography matrix, a method combining the feature point reprojection error with the maximum inter-class variance is adopted. The specific steps are as follows:
Suppose $p_1$ and $p_2$ are a pair of feature points matched between the previous and current frames; the matched points and the homography matrix $H$ satisfy formula (1). Assuming the two frames have $N$ pairs of matched feature points, there are $N$ reprojection errors, and the reprojection error $e$ of one pair of matched points is computed by formula (2). The $N$ reprojection errors are divided into $k$ levels; the number of feature points at level $i$ ($1 \le i \le k$) is $n_i$, so that $\sum_{i=1}^{k} n_i = N$.

$$p_2 = H p_1 \tag{1}$$

$$e = \lVert p_2 - H p_1 \rVert \tag{2}$$

Let the average of the $N$ reprojection errors be $\mu$. Denote the set of first static feature points and second static feature points by $C_0$ and the set of dynamic feature points by $C_1$. For a candidate threshold level $t$, the proportion $\omega_0$ of $C_0$ and the proportion $\omega_1$ of $C_1$ are given by formula (3), and the mean $\mu_0$ of the static feature point set and the mean $\mu_1$ of the dynamic feature point set are given by formula (4).

$$\omega_0 = \frac{1}{N}\sum_{i=1}^{t} n_i, \qquad \omega_1 = \frac{1}{N}\sum_{i=t+1}^{k} n_i \tag{3}$$

$$\mu_0 = \frac{1}{N\omega_0}\sum_{i=1}^{t} i\, n_i, \qquad \mu_1 = \frac{1}{N\omega_1}\sum_{i=t+1}^{k} i\, n_i \tag{4}$$

Thus, the inter-class variance $\sigma^2$ can be estimated as shown in formula (5):

$$\sigma^2 = \omega_0 (\mu_0 - \mu)^2 + \omega_1 (\mu_1 - \mu)^2 \tag{5}$$

According to formula (6), formula (5) can be simplified to formula (7).

$$\mu = \omega_0 \mu_0 + \omega_1 \mu_1 \tag{6}$$

$$\sigma^2 = \omega_0 \omega_1 (\mu_0 - \mu_1)^2 \tag{7}$$

Traversing $t$ between 0 and $k$, the residual distance that maximizes the variance $\sigma^2$ is recorded as $e^{*}$. If the reprojection error of a pair of matched points satisfies $e > e^{*}$, the feature point is a dynamic feature point; if $e \le e^{*}$, it is a first static feature point or a second static feature point.
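A minimal sketch of this motion feature point filtering, assuming the errors are quantized into k = 64 uniform levels (the patent does not state k); it evaluates formulas (3)-(7) numerically and returns the residual threshold $e^{*}$ that maximizes the inter-class variance:

```python
import cv2
import numpy as np

def reprojection_errors(H, pts_prev, pts_curr):
    """||p2 - H p1|| for each matched pair; pts_* are Nx2 coordinate arrays."""
    src = np.float32(pts_prev).reshape(-1, 1, 2)
    proj = cv2.perspectiveTransform(src, H).reshape(-1, 2)
    return np.linalg.norm(np.float32(pts_curr) - proj, axis=1)

def otsu_error_threshold(errors, k=64):
    """Residual threshold e* maximizing the inter-class variance over k levels."""
    e_max = errors.max() + 1e-9
    levels = np.minimum((errors / e_max * k).astype(int), k - 1)
    p = np.bincount(levels, minlength=k).astype(float) / len(errors)
    idx = np.arange(k)
    best_sigma, best_t = -1.0, 0
    for t in range(k):                      # traverse candidate threshold levels
        w0, w1 = p[:t + 1].sum(), p[t + 1:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (idx[:t + 1] * p[:t + 1]).sum() / w0
        mu1 = (idx[t + 1:] * p[t + 1:]).sum() / w1
        sigma = w0 * w1 * (mu0 - mu1) ** 2  # formula (7)
        if sigma > best_sigma:
            best_sigma, best_t = sigma, t
    return (best_t + 1) / k * e_max         # map the level back to the error scale

# feature points with error > e* are treated as dynamic; the rest as static
```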
In S105, the first homography matrix is optimized and a second homography matrix is obtained based on the matching relationship between the first stationary feature point of the previous frame image and the first stationary feature point of the current frame image.
In this embodiment, the first homography matrix is optimized and the second homography matrix is obtained based on the matching relationship between the first static feature point of the previous frame image and the first static feature point of the current frame image.
In S106, motion compensation is performed on the previous frame image according to the optimized second homography matrix, so that a motion compensated frame image is obtained.
In the scheme of this embodiment, the matching relationship between the first static feature points of the previous frame image and of the current frame image is used to optimize the first homography matrix into the second homography matrix, and motion compensation is performed on the image according to the second homography matrix, which effectively improves the precision of the motion compensation frame image. Specifically, according to the second homography matrix, the expression for performing motion compensation on the previous frame image is:
$$p'_{t-1} = H_{t-1,t}\, p_{t-1}$$

where $p_{t-1}$ is a pixel point of the previous frame, $p'_{t-1}$ is the compensated pixel point of $p_{t-1}$, and $H_{t-1,t}$ is the homography matrix of the previous frame and the current frame;
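In OpenCV terms, applying the second homography to every pixel of the previous frame is a perspective warp; a minimal sketch, assuming H2 maps previous-frame coordinates to current-frame coordinates:

```python
import cv2

def compensate_previous_frame(prev_frame, H2):
    """Warp the previous frame with the second homography to obtain the motion compensation frame."""
    h, w = prev_frame.shape[:2]
    return cv2.warpPerspective(prev_frame, H2, (w, h))
```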
according to the method, the traditional visual SLAM is used in a static environment, when a dynamic object exists in a scene, the accuracy of the visual SLAM is reduced, the dynamic object in the image is mainly detected, and the accuracy of the SLAM is improved. When the camera moves, the current frame image can be subjected to motion compensation and then subjected to a frame difference method to obtain a dynamic region in the picture.
When the translation distance of the camera is small relative to the depth of the scene, the homography matrix H can be used as the motion compensation matrix. Calculating H requires matching the previous and current frames, and if dynamic objects exist in the scene, the estimate of H is inaccurate. A deep learning target detection network is therefore used to eliminate potential dynamic objects in the scene and roughly estimate H. However, the network cannot judge whether a potential dynamic object is actually in motion; if the object is in fact static, the feature points on it can also participate in the calculation of H and improve its precision. Whether the feature points on a potential dynamic object can be used for the calculation of H is therefore judged with a method combining the reprojection error and the inter-class variance, so that the precision of H is improved.
Referring to fig. 2, a flowchart of steps additional to and further defining the flow of fig. 1 is shown, according to yet another embodiment of the visual SLAM method based on dynamic object detection of the present application.
As shown in fig. 2, in S201, the motion-compensated frame image is differenced with the current frame image so as to obtain a frame difference map;
in S202, analyzing the frame difference map subjected to denoising and morphological processing based on a connected region algorithm, so as to determine a dynamic region, where the dynamic region only includes motion feature points;
in S203, the current frame image is subjected to dynamic region elimination, and tracking, image creation, and loop detection of the visual SLAM are performed based on the current frame image from which the dynamic region is eliminated.
In this embodiment, for S201, the motion compensation frame image and the current frame image are differenced so as to obtain a frame difference map, where the difference is computed as:

$$D_t(x, y) = \lvert I_t(x, y) - I'_{t-1}(x, y) \rvert$$

where $I_t(x, y)$ is the pixel value of the $t$-th frame at $(x, y)$, $I'_{t-1}(x, y)$ is the pixel value of the compensated frame $t-1$ at $(x, y)$, and $D_t(x, y)$ is the frame difference of the $t$-th frame at $(x, y)$. Then, for S202, the frame difference map subjected to denoising and morphological processing is analyzed based on the connected region algorithm to determine a dynamic region, where the dynamic region only includes motion feature points. Then, for S203, the dynamic region is removed from the current frame image, and tracking, mapping, and loop closure detection of the visual SLAM are performed based on the current frame image from which the dynamic region has been removed.
Please refer to fig. 3, which shows a flowchart of another embodiment of the visual SLAM method based on dynamic target detection according to the present application; this flowchart mainly further defines S202, i.e., the step of analyzing the denoised and morphologically processed frame difference map based on the connected region algorithm to determine the dynamic region.
As shown in fig. 3, in S301, in response to the acquired frame difference map, the frame difference map is denoised based on filtering and binarization processing so as to obtain a binary map;
in S302, in response to the obtained binary image, setting each pixel value of the static region in the binary image to zero based on the deep learning target detection network;
in S303, the processed binary image is morphologically processed, and a dynamic region is obtained based on a connected region algorithm analysis.
In the present embodiment, for S301, in response to the acquired frame difference map, the frame difference map is denoised based on filtering and binarization processing to obtain a binary map. Thereafter, for S302, in response to the acquired binary map, each pixel value of the static region in the binary map is set to zero based on the deep learning target detection network. Then, for S303, the processed binary map is morphologically processed, and the dynamic region is obtained through connected region analysis.
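A sketch of S201 together with S301-S303 under stated assumptions (the median filter, the fixed threshold, the 5x5 kernel and min_area are illustrative choices, not values from the patent): difference the compensated previous frame with the current frame, denoise and binarize, zero the static region using the detector's potential dynamic boxes, then apply morphology and keep large connected components as dynamic regions:

```python
import cv2
import numpy as np

def detect_dynamic_regions(curr_gray, compensated_prev_gray, dynamic_boxes,
                           thresh=30, min_area=200):
    diff = cv2.absdiff(curr_gray, compensated_prev_gray)        # S201 frame difference
    blur = cv2.medianBlur(diff, 5)                              # S301 filtering
    _, binary = cv2.threshold(blur, thresh, 255, cv2.THRESH_BINARY)
    mask = np.zeros_like(binary)                                # S302: zero the static region
    for x, y, w, h in dynamic_boxes:
        mask[max(y, 0):y + h, max(x, 0):x + w] = 255
    binary = cv2.bitwise_and(binary, mask)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    binary = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)  # S303 morphology
    n, _, stats, _ = cv2.connectedComponentsWithStats(binary)
    return [tuple(stats[i, :4]) for i in range(1, n)
            if stats[i, cv2.CC_STAT_AREA] >= min_area]          # dynamic region boxes
```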
According to the method, in strong-parallax scenes or when the image is blurred, the dynamic target detection result is optimized in combination with the deep learning target detection network, so that the influence of blur noise is reduced.
In a particular embodiment, the potential dynamic region is a region containing potential dynamic objects, wherein the potential dynamic objects are pedestrians or vehicles.
In some optional embodiments, the deep learning target detection network is a Yolov3 network. Therefore, the Darknet53 network and the multi-scale features are adopted for target detection, so that the method has better recognition speed and precision, and can effectively recognize common objects with motility, such as pedestrians, vehicles and the like.
It should be noted that the above method steps are not intended to limit the execution order of the steps, and in fact, some steps may be executed simultaneously or in the reverse order of the steps, which is not limited herein.
In some optional embodiments, the visual SLAM method based on dynamic target detection comprises the following steps:
(1) performing frame processing on the image to obtain each image frame;
(2) extracting feature points from a previous frame image and a current frame image;
(3) detecting potential dynamic targets through the Yolov3 network and temporarily removing them;
(4) matching the second static feature points in the previous frame image with the second static feature points of the current frame image, and calculating the first homography matrix with the RANSAC algorithm based on the matching relation; then matching the first static feature points extracted from the potential dynamic targets of the previous frame image with those of the current frame image, and optimizing the first homography matrix based on this matching relation to obtain the second homography matrix;
(5) performing motion compensation on the previous frame image through the second homography matrix, and obtaining a four-frame difference map through a four-frame difference method (differencing the adjacent four frames); specifically, the t-th frame and the (t-1)-th frame, the (t-1)-th frame and the (t-2)-th frame, and the (t-2)-th frame and the (t-3)-th frame are differenced in sequence to obtain three two-frame difference maps $D_1$, $D_2$ and $D_3$, which are then combined into the four-frame difference map $D$ (a sketch of this step is given after this list);
(6) after the four-frame difference map is obtained, further denoising the image with filtering and binarization, and, after morphological processing, determining the dynamic target with a connected region algorithm;
(7) removing the truly moving dynamic targets, and performing tracking, mapping and loop closure detection of the visual SLAM using the first static feature points of the potential dynamic regions and the second static feature points of the static regions.
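As referenced in step (5) above, a minimal sketch of the four-frame difference, assuming the three two-frame difference maps are binarized and fused by pixel-wise AND (the exact fusion operator appears only as an image in the source and is an assumption here), and assuming the frames have already been motion-compensated into a common coordinate frame:

```python
import cv2

def four_frame_difference(f_t, f_t1, f_t2, f_t3, thresh=30):
    """f_t..f_t3: grayscale frames t, t-1, t-2, t-3 (already motion-compensated)."""
    pairs = ((f_t, f_t1), (f_t1, f_t2), (f_t2, f_t3))
    bins = [cv2.threshold(cv2.absdiff(a, b), thresh, 255, cv2.THRESH_BINARY)[1]
            for a, b in pairs]
    # fuse the three two-frame difference maps D1, D2, D3 into D
    return cv2.bitwise_and(cv2.bitwise_and(bins[0], bins[1]), bins[2])
```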
As shown in fig. 4, image blurring may be caused by the motion of the camera: in fig. 4(a) the motion-compensated image is blurred. Likewise, if strong parallax arises while the camera moves, the motion compensation matrix is calculated inaccurately and the compensation result is not ideal. In such cases the binary map of the above method cannot suppress the excessive background noise, so pixels with non-zero values also appear in the static regions; this is noise produced by image blurring, and many static regions are mistakenly judged as dynamic regions, as shown in fig. 4(b). To eliminate the background noise under image blurring, the binary map is optimized in combination with the Yolov3 network by setting the pixel values of the non-potential-dynamic regions in the binary map to 0, so the final detection result becomes fig. 4(c). Comparing fig. 4(b) and fig. 4(c), the dynamic target identified by the box in fig. 4(c) is more accurate, and background false detections are also markedly reduced.
After the dynamic objects of the image are obtained according to the above process, the mapping and loop closure detection of the subsequent visual SLAM are performed with the retained feature points of the static regions.
Testing was performed using the TUM (Technische Universität München) dataset, and quantitative evaluation was obtained using the absolute trajectory error (ATE). In the TUM dataset, sequences prefixed with walking are high-dynamic sequences and sequences prefixed with sitting are low-dynamic sequences; the suffix rpy indicates that the camera rotates about the roll, pitch and yaw axes, xyz indicates that the camera translates along the x, y and z directions, halfsphere indicates that the camera additionally moves along an arc on top of rpy and xyz, and static indicates that the camera remains almost still.
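For reference, the ATE RMSE reported below is the root mean square of the translational distances between time-aligned estimated and ground-truth camera positions; a minimal sketch (the standard TUM evaluation also performs a rigid alignment step, omitted here for brevity):

```python
import numpy as np

def ate_rmse(est_xyz, gt_xyz):
    """est_xyz, gt_xyz: Nx3 arrays of corresponding camera positions."""
    err = np.linalg.norm(np.asarray(est_xyz) - np.asarray(gt_xyz), axis=1)
    return float(np.sqrt(np.mean(err ** 2)))
```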
The comparison of the present algorithm with other algorithms is shown in table 1. ORB-SLAM2 is the original algorithm without dynamic filtering; "DVO+MR" (dense visual odometry with motion removal) judges dynamic objects using a motion compensation algorithm; the map-point-weight method assigns a weight to each feature point to judge whether it is a dynamic feature point, and depends on the accuracy of the depth information; DS-SLAM judges dynamic feature points with a method combining deep learning and geometric constraints; "orbslam2+Yolov3" combines ORB-SLAM2 directly with the Yolov3 target detector and indiscriminately filters out the feature points of semantically dynamic regions.
Table 1. RMSE of the absolute trajectory error for each algorithm on the TUM sequences (the table appears only as an image in the source; its values are not recoverable).
As the comparison of the root mean square error (RMSE) of the absolute trajectory error in table 1 shows, the ORB-SLAM2 algorithm has high accuracy on the low-dynamic datasets but large errors on the high-dynamic datasets. Because the walking_rpy dataset contains partially blurred and strong-parallax images, after the dynamic feature points are filtered out the number of remaining feature points available for tracking decreases, so the algorithm's tracking fails. Combining Yolov3 reduces the effects of blurred images and strong parallax, improving robustness to a certain extent. On the walking_halfsphere dataset, the camera motion in a strong-parallax environment affects the calculation of the homography matrix and the motion compensation, so the accuracy is lower than that of DS-SLAM.
Referring to fig. 5, a block diagram of a visual SLAM device based on dynamic object detection according to the present application is shown.
As shown in fig. 5, the visual SLAM device includes a segmentation module 410, a matching module 420, a calculation module 430, an extraction module 440, an optimization module 450, and a compensation module 460.
The segmentation module 410 is configured to perform region segmentation on each image frame based on a deep learning target detection network in response to each acquired image frame, where each image frame includes a potential dynamic region and/or a static region, the potential dynamic region includes a motion feature point and/or a first static feature point, and the static region includes a second static feature point; a matching module 420 configured to match the second still feature point of the previous frame image with the second still feature point of the current frame image; a calculating module 430, configured to obtain a first homography matrix by calculating based on a RANSAC algorithm in response to the obtained matching relationship; the extracting module 440 is configured to extract a first stationary feature point of the previous frame image and a first stationary feature point of the current frame image based on a motion feature point filtering method, wherein the motion feature point filtering method is a method formed by combining a reprojection error of feature points with a maximum inter-class variance method; the optimization module 450 is configured to optimize the first homography matrix and obtain a second homography matrix based on a matching relationship between the first static feature point of the previous frame image and the first static feature point of the current frame image; and a compensation module 460 configured to perform motion compensation on the previous frame image according to the second homography matrix, so as to obtain a motion compensated frame image.
It should be understood that the modules recited in fig. 5 correspond to various steps in the methods described with reference to fig. 1, 2, and 3. Thus, the operations and features described above for the method and the corresponding technical effects are also applicable to the modules in fig. 5, and are not described again here.
In other embodiments, an embodiment of the present invention further provides a non-volatile computer storage medium, where the computer storage medium stores computer-executable instructions, where the computer-executable instructions may perform the visual SLAM method based on dynamic target detection in any of the above method embodiments;
as one embodiment, a non-volatile computer storage medium of the present invention stores computer-executable instructions configured to:
responding to each acquired image frame, and performing region segmentation on each image frame based on a deep learning target detection network, wherein each image frame comprises a potential dynamic region and/or a static region, the potential dynamic region comprises a motion characteristic point and/or a first static characteristic point, and the static region comprises a second static characteristic point;
matching the second static characteristic point of the previous frame image with the second static characteristic point of the current frame image;
responding to the acquired matching relation, and calculating to obtain a first homography matrix based on a RANSAC algorithm;
respectively extracting a first static characteristic point of a previous frame image and a first static characteristic point of a current frame image based on a motion characteristic point filtering method, wherein the motion characteristic point filtering method is a method formed by combining a reprojection error of the characteristic points with a maximum inter-class variance method;
optimizing the first homography matrix and obtaining a second homography matrix based on the matching relation between the first static characteristic point of the previous frame image and the first static characteristic point of the current frame image;
and performing motion compensation on the previous frame image according to the second homography matrix so as to obtain a motion compensation frame image.
The non-volatile computer-readable storage medium may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created from use of a visual SLAM device based on dynamic object detection, and the like. Further, the non-volatile computer-readable storage medium may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the non-transitory computer readable storage medium optionally includes memory remotely located from the processor, and these remote memories may be connected over a network to a visual SLAM device based on dynamic object detection. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Embodiments of the present invention also provide a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions that, when executed by a computer, cause the computer to perform any of the above-mentioned visual SLAM methods based on dynamic object detection.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 6, the electronic device includes a processor 510 and a memory 520 (one processor 510 is taken as an example in fig. 6). The apparatus for the visual SLAM method based on dynamic object detection may further include an input device 530 and an output device 540. The processor 510, the memory 520, the input device 530, and the output device 540 may be connected by a bus or other means; connection by a bus is taken as an example in fig. 6. The memory 520 is a non-volatile computer-readable storage medium as described above. The processor 510 executes the various functional applications and data processing of the server, i.e., implements the visual SLAM method based on dynamic object detection of the above method embodiments, by running the non-volatile software programs, instructions, and modules stored in the memory 520. The input device 530 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the visual SLAM device based on dynamic object detection. The output device 540 may include a display device such as a display screen.
The product can execute the method provided by the embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method. For technical details that are not described in detail in this embodiment, reference may be made to the method provided by the embodiment of the present invention.
As an embodiment, the electronic device is applied to a visual SLAM device based on dynamic object detection, and is used for a client, and includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to:
responding to each acquired image frame, and performing region segmentation on each image frame based on a deep learning target detection network, wherein each image frame comprises a potential dynamic region and/or a static region, the potential dynamic region comprises a motion characteristic point and/or a first static characteristic point, and the static region comprises a second static characteristic point;
matching the second static characteristic point of the previous frame image with the second static characteristic point of the current frame image;
responding to the acquired matching relation, and calculating to obtain a first homography matrix based on a RANSAC algorithm;
respectively extracting a first static characteristic point of a previous frame image and a first static characteristic point of a current frame image based on a motion characteristic point filtering method, wherein the motion characteristic point filtering method is a method formed by combining a reprojection error of the characteristic points with a maximum inter-class variance method;
optimizing the first homography matrix and obtaining a second homography matrix based on the matching relation between the first static characteristic point of the previous frame image and the first static characteristic point of the current frame image;
and performing motion compensation on the previous frame image according to the second homography matrix so as to obtain a motion compensation frame image.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods of the various embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. A visual SLAM method based on dynamic object detection, comprising:
performing region segmentation on each image frame based on a deep learning target detection network in response to each acquired image frame, wherein each image frame comprises a potential dynamic region and a static region, the potential dynamic region comprises a motion feature point and a first static feature point, and the static region comprises a second static feature point;
matching the second static characteristic point of the previous frame image with the second static characteristic point of the current frame image;
responding to the acquired matching relation, and calculating to obtain a first homography matrix based on a RANSAC algorithm;
respectively extracting a first static feature point of the previous frame image and a first static feature point of the current frame image based on a motion feature point filtering method, wherein the motion feature point filtering method is formed by combining a reprojection error of feature points with a maximum inter-class variance method, and the specific steps of respectively extracting the first static feature point of the previous frame image and the first static feature point of the current frame image are as follows:
suppose $p_1$ and $p_2$ are a pair of feature points matched between the previous and current frames; the matched feature points and the homography matrix $H$ satisfy the relation $p_2 = H p_1$; assuming the two frames have $N$ pairs of matched feature points, the two frames have $N$ reprojection errors, and the reprojection error of one pair of matched points can be calculated as $e = \lVert p_2 - H p_1 \rVert$; the $N$ reprojection errors are divided into $k$ levels, the number of feature points at level $i$ ($1 \le i \le k$) being $n_i$, so that $\sum_{i=1}^{k} n_i = N$; let the average of the $N$ reprojection errors be $\mu$; the set of first static feature points and second static feature points is denoted $C_0$, and the set of dynamic feature points is denoted $C_1$; for a threshold level $t$, the proportion of $C_0$ is $\omega_0 = \frac{1}{N}\sum_{i=1}^{t} n_i$ and the proportion of the dynamic feature point set $C_1$ is $\omega_1 = \frac{1}{N}\sum_{i=t+1}^{k} n_i$; the mean of the set of first and second static feature points is $\mu_0 = \frac{1}{N\omega_0}\sum_{i=1}^{t} i\, n_i$ and the mean of the set of dynamic feature points is $\mu_1 = \frac{1}{N\omega_1}\sum_{i=t+1}^{k} i\, n_i$; thus the inter-class variance can be estimated as $\sigma^2 = \omega_0(\mu_0 - \mu)^2 + \omega_1(\mu_1 - \mu)^2$, which, based on $\mu = \omega_0\mu_0 + \omega_1\mu_1$, simplifies to $\sigma^2 = \omega_0\omega_1(\mu_0 - \mu_1)^2$; traversing $t$ between 0 and $k$, the residual distance that maximizes the variance $\sigma^2$ is recorded as $e^{*}$; if the reprojection error of a pair of matched points satisfies $e > e^{*}$, the feature point is a dynamic feature point, and if $e \le e^{*}$, the feature point is the first static feature point or the second static feature point;
optimizing the first homography matrix and obtaining a second homography matrix based on the matching relation between the first static characteristic point of the previous frame image and the first static characteristic point of the current frame image;
and performing motion compensation on the previous frame image according to the second homography matrix so as to obtain a motion compensation frame image.
2. The visual SLAM method based on dynamic object detection according to claim 1, wherein after motion compensation is performed on the previous frame image according to the second homography matrix to obtain the motion compensated frame image, the method further comprises:
performing difference on the motion compensation frame image and the current frame image to obtain a frame difference image;
analyzing the frame difference image subjected to denoising and morphological processing based on a connected region algorithm to determine a dynamic region, wherein the dynamic region only comprises motion characteristic points;
and removing the dynamic area from the current frame image, and performing tracking, drawing building and loop detection of the visual SLAM based on the current frame image from which the dynamic area is removed.
3. The visual SLAM method based on dynamic object detection according to claim 2, wherein analyzing, based on the connected region algorithm, the frame difference map subjected to denoising and morphological processing to determine the dynamic region comprises:
responding to the acquired frame difference image, denoising the frame difference image based on filtering and binarization processing to obtain a binary image;
setting each pixel value of a static area in the binary image to be zero based on the deep learning target detection network in response to the acquired binary image;
and carrying out morphological processing on the processed binary image, and analyzing to obtain a dynamic region based on the connected region algorithm.
4. A visual SLAM method based on dynamic object detection as claimed in claim 1, wherein said expression for motion compensation of said previous frame image according to said second homography matrix is:
$$p'_{t-1} = H_{t-1,t}\, p_{t-1}$$

where $p_{t-1}$ is a pixel point of the previous frame, $p'_{t-1}$ is the compensated pixel point of $p_{t-1}$, and $H_{t-1,t}$ is the homography matrix of the previous frame and the current frame.
5. The visual SLAM method based on dynamic object detection as claimed in claim 1 wherein the potential dynamic area is an area containing potential dynamic objects, wherein the potential dynamic objects are pedestrians or vehicles.
6. The visual SLAM method based on dynamic target detection of claim 1 wherein the deep learning target detection network is a Yolov3 network.
7. A visual SLAM apparatus based on dynamic object detection, comprising:
the segmentation module is configured to perform region segmentation on each image frame based on a deep learning target detection network in response to the acquired image frame, wherein each image frame comprises a potential dynamic region and a static region, the potential dynamic region comprises a motion feature point and a first static feature point, and the static region comprises a second static feature point;
the matching module is configured to match the second static feature point of the previous frame image with the second static feature point of the current frame image;
the calculation module is configured to respond to the acquired matching relationship and calculate to obtain a first homography matrix based on a RANSAC algorithm;
the extraction module is configured to respectively extract the first static feature point of the previous frame image and the first static feature point of the current frame image based on a motion feature point filtering method, wherein the motion feature point filtering method is formed by combining a reprojection error of feature points with a maximum inter-class variance method, and the specific steps of respectively extracting the first static feature point of the previous frame image and the first static feature point of the current frame image are as follows:
suppose $p_1$ and $p_2$ are a pair of feature points matched between the previous and current frames; the matched feature points and the homography matrix $H$ satisfy the relation $p_2 = H p_1$; assuming the two frames have $N$ pairs of matched feature points, the two frames have $N$ reprojection errors, and the reprojection error of one pair of matched points can be calculated as $e = \lVert p_2 - H p_1 \rVert$; the $N$ reprojection errors are divided into $k$ levels, the number of feature points at level $i$ ($1 \le i \le k$) being $n_i$, so that $\sum_{i=1}^{k} n_i = N$; let the average of the $N$ reprojection errors be $\mu$; the set of first static feature points and second static feature points is denoted $C_0$, and the set of dynamic feature points is denoted $C_1$; for a threshold level $t$, the proportion of $C_0$ is $\omega_0 = \frac{1}{N}\sum_{i=1}^{t} n_i$ and the proportion of the dynamic feature point set $C_1$ is $\omega_1 = \frac{1}{N}\sum_{i=t+1}^{k} n_i$; the mean of the set of first and second static feature points is $\mu_0 = \frac{1}{N\omega_0}\sum_{i=1}^{t} i\, n_i$ and the mean of the set of dynamic feature points is $\mu_1 = \frac{1}{N\omega_1}\sum_{i=t+1}^{k} i\, n_i$; thus the inter-class variance can be estimated as $\sigma^2 = \omega_0(\mu_0 - \mu)^2 + \omega_1(\mu_1 - \mu)^2$, which, based on $\mu = \omega_0\mu_0 + \omega_1\mu_1$, simplifies to $\sigma^2 = \omega_0\omega_1(\mu_0 - \mu_1)^2$; traversing $t$ between 0 and $k$, the residual distance that maximizes the variance $\sigma^2$ is recorded as $e^{*}$; if the reprojection error of a pair of matched points satisfies $e > e^{*}$, the feature point is a dynamic feature point, and if $e \le e^{*}$, the feature point is the first static feature point or the second static feature point;
the optimization module is configured to optimize the first homography matrix based on the matching relation between the first static feature point of the previous frame image and the first static feature point of the current frame image, so as to obtain a second homography matrix;
and the compensation module is configured to perform motion compensation on the previous frame image according to the second homography matrix, thereby obtaining a motion-compensated frame image (a code sketch covering these homography and warping steps follows claim 9 below).
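To make the extraction module's filtering step concrete, the following is a minimal NumPy sketch of the reprojection-error and maximum inter-class variance (Otsu) split recited above; it is an illustration, not the patented implementation. It assumes the matched points arrive as N×2 float arrays, that the homography H comes from the calculation module, and that binning the errors into k histogram levels stands in for the claim's division into k levels; all function and variable names are illustrative.

```python
import numpy as np

def reprojection_errors(pts_prev, pts_curr, H):
    """Reprojection error e_i = ||p_i' - H p_i|| for each of the N matched pairs."""
    n = pts_prev.shape[0]
    homog = np.hstack([pts_prev, np.ones((n, 1))])   # lift to homogeneous coordinates
    proj = (H @ homog.T).T
    proj = proj[:, :2] / proj[:, 2:3]                # project back to the image plane
    return np.linalg.norm(pts_curr - proj, axis=1)

def otsu_threshold(errors, k=256):
    """Return the residual distance T that maximizes the inter-class variance."""
    hist, edges = np.histogram(errors, bins=k)
    P = hist / errors.size                # level probabilities P_j, summing to 1
    levels = np.arange(1, k + 1)
    best_sigma2, best_t = -1.0, 0
    for t in range(k - 1):
        w0 = P[:t + 1].sum()              # proportion omega_0 of static class C0
        w1 = 1.0 - w0                     # proportion omega_1 of dynamic class C1
        if w0 == 0.0 or w1 == 0.0:
            continue
        mu0 = (levels[:t + 1] * P[:t + 1]).sum() / w0   # class mean mu_0
        mu1 = (levels[t + 1:] * P[t + 1:]).sum() / w1   # class mean mu_1
        sigma2 = w0 * w1 * (mu0 - mu1) ** 2             # simplified sigma^2(t)
        if sigma2 > best_sigma2:
            best_sigma2, best_t = sigma2, t
    return edges[best_t + 1]              # upper edge of the winning level as T

def split_static_dynamic(pts_prev, pts_curr, H):
    """Boolean mask over the N pairs: True where e_i <= T (static), else dynamic."""
    e = reprojection_errors(pts_prev, pts_curr, H)
    return e <= otsu_threshold(e)
```

For example, calling `split_static_dynamic(prev_pts, curr_pts, H1)` on the matches inside the potential dynamic region yields the first static feature points as the True entries of the mask. The choice k = 256 mirrors Otsu's grey-level origin and is an assumption; the claim leaves k unspecified.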
8. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any of claims 1 to 6.
9. A storage medium on which a computer program is stored, which program, when executed by a processor, carries out the method according to any one of claims 1 to 6.
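Reading the calculation, optimization, and compensation modules of claim 7 together, the claimed pipeline maps naturally onto OpenCV's homography utilities. The sketch below is one hypothetical reading, not the patent's implementation: it assumes the matched points are float32 N×2 NumPy arrays, `split_static_dynamic` is the helper from the sketch after claim 7, the 3.0-pixel RANSAC threshold is an assumption, and whether the refit uses the first static points alone or their union with the second static points is a design choice the claim leaves open.

```python
import cv2
import numpy as np

def compensate_frame(prev_img, prev_static2, curr_static2,
                     prev_potential, curr_potential):
    # Calculation module: first homography via RANSAC over static-region matches.
    H1, _ = cv2.findHomography(prev_static2, curr_static2, cv2.RANSAC, 3.0)

    # Extraction module: keep only first static points in the potential dynamic region.
    mask = split_static_dynamic(prev_potential, curr_potential, H1)

    # Optimization module: second homography refit on all retained static matches.
    src = np.vstack([prev_static2, prev_potential[mask]]).astype(np.float32)
    dst = np.vstack([curr_static2, curr_potential[mask]]).astype(np.float32)
    H2, _ = cv2.findHomography(src, dst, 0)   # method 0 = plain least squares

    # Compensation module: warp the previous frame into the current frame's viewpoint.
    h, w = prev_img.shape[:2]
    return cv2.warpPerspective(prev_img, H2, (w, h))
```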
CN202110100524.8A 2021-01-26 2021-01-26 Visual SLAM method and device based on dynamic target detection Active CN112435278B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110100524.8A CN112435278B (en) 2021-01-26 2021-01-26 Visual SLAM method and device based on dynamic target detection

Publications (2)

Publication Number Publication Date
CN112435278A (en) 2021-03-02
CN112435278B (en) 2021-05-04

Family

ID=74697251

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110100524.8A Active CN112435278B (en) 2021-01-26 2021-01-26 Visual SLAM method and device based on dynamic target detection

Country Status (1)

Country Link
CN (1) CN112435278B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113222891B (en) * 2021-04-01 2023-12-22 东方电气集团东方锅炉股份有限公司 Line laser-based binocular vision three-dimensional measurement method for rotating object
CN116452647B (en) * 2023-06-15 2023-12-08 广州安特激光技术有限公司 Dynamic image registration method, system and device based on matching pursuit

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111145251B (en) * 2018-11-02 2024-01-02 深圳市优必选科技有限公司 Robot and synchronous positioning and mapping method thereof and computer storage device
CN110533716B (en) * 2019-08-20 2022-12-02 西安电子科技大学 Semantic SLAM system and method based on 3D constraint

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107610177A (en) * 2017-09-29 2018-01-19 联想(北京)有限公司 Method and apparatus for determining feature points in simultaneous localization and mapping
US10825424B2 (en) * 2018-06-05 2020-11-03 Magic Leap, Inc. Homography transformation matrices based temperature calibration of a viewing system
CN110084850A (en) * 2019-04-04 2019-08-02 东南大学 Dynamic scene visual localization method based on image semantic segmentation
CN110378345A (en) * 2019-06-04 2019-10-25 广东工业大学 Dynamic scene SLAM method based on the YOLACT instance segmentation model
CN111156984A (en) * 2019-12-18 2020-05-15 东南大学 Monocular visual-inertial SLAM method for dynamic scenes

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"交互多模型算法在目标跟踪领域的应用";许天野等;《四川兵工学报》;20131130;第34卷(第11期);第116-119页 *
"基于高斯金字塔的视觉里程计算法研究";刘瑞等;《华东交通大学学报》;20200831;第37卷(第4期);第48-53页 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant