CN116824638A - Dynamic object feature point detection method and device, electronic equipment and storage medium - Google Patents

Dynamic object feature point detection method and device, electronic equipment and storage medium

Info

Publication number
CN116824638A
Authority
CN
China
Prior art keywords
feature point
displacement
determining
detection
feature points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310868495.9A
Other languages
Chinese (zh)
Inventor
何常鑫
徐慧明
王妙妙
蔡含宇
杜振东
林伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Uisee Technologies Beijing Co Ltd
Original Assignee
Uisee Technologies Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Uisee Technologies Beijing Co Ltd filed Critical Uisee Technologies Beijing Co Ltd
Priority to CN202310868495.9A priority Critical patent/CN116824638A/en
Publication of CN116824638A publication Critical patent/CN116824638A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a dynamic object feature point detection method and device, electronic equipment and a storage medium. The method comprises the following steps: acquiring a current frame image and a previous frame image, and extracting feature points of the current frame image to obtain image feature points; performing object detection on the image feature points, determining the detection frame corresponding to each object, and determining a first background feature point set and a first object feature point set corresponding to each detection frame according to the detection frames; if the total area of the detection frames meets a preset condition, determining a first displacement vector set corresponding to the background feature points according to the first background feature point set and the previous frame image; and, for each detection frame, determining a second displacement vector set corresponding to the object according to the first object feature point set corresponding to the detection frame and the previous frame image, and determining whether the object feature points in the detection frame are dynamic object feature points based on the first displacement vector set and the second displacement vector set. This solves the problem of inaccurate results when a monocular camera detects dynamic objects.

Description

Dynamic object feature point detection method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method and apparatus for detecting feature points of a dynamic object, an electronic device, and a storage medium.
Background
Visual mapping is a technology that reconstructs a three-dimensional model of the environment, and supports positioning and navigation within it, using algorithms applied to data acquired by sensors such as cameras. Visual positioning and mapping has wide application prospects in fields such as robotics and automatic driving.
Conventional visual localization and mapping is generally based on the assumption that all objects in the scene are stationary. In settings such as logistics parks and open roads, however, a certain number of moving vehicles usually appear during mapping. They interfere with the correct interpretation and analysis of the images, which disturbs map building and in turn degrades the map-based positioning of automatic driving vehicles.
Dynamic object feature point removal in conventional visual mapping typically relies on binocular, RGB-D and similar camera devices that can capture depth information: by checking whether the depth of a feature point changes between different moments, feature points belonging to dynamic objects can be identified and removed. When a monocular camera is used for mapping, depth information of pixels cannot be acquired, so removing dynamic objects is difficult. One existing approach removes all pixels belonging to prior dynamic object classes (vehicles and pedestrians) using an object detection mask; another identifies the detection frame containing an object with an algorithm and directly treats all points inside the frame as dynamic points. The former depends on the accuracy of the object detection mask and is prone to false and missed detections of dynamic objects; the latter is inaccurate because the points inside a detection frame are not uniformly dynamic, and some may be static. The accuracy of dynamic object detection in the prior art is therefore low, which in turn affects mapping quality.
Disclosure of Invention
The invention provides a dynamic object feature point detection method and device, electronic equipment and a storage medium, which improve the detection accuracy when dynamic objects are detected with a monocular camera and thereby improve the quality of monocular visual mapping.
According to an aspect of the present invention, there is provided a dynamic object feature point detection method, including:
acquiring a current frame image and a previous frame image, and extracting feature points of the current frame image to obtain image feature points, wherein the current frame image is acquired through a monocular camera;
detecting the image feature points, determining at least one detection frame corresponding to the object, and determining a first background feature point set and a first object feature point set corresponding to each detection frame according to each detection frame;
if the total area of each detection frame meets a preset condition, determining a first displacement vector set corresponding to the background feature point according to the first background feature point set and the previous frame image;
for each detection frame, determining a second displacement vector set corresponding to an object according to a first object feature point set corresponding to the detection frame and the previous frame image, and determining whether the object feature points in the detection frame are dynamic object feature points or not based on the first displacement vector set and the second displacement vector set.
According to another aspect of the present invention, there is provided a dynamic object feature point detection apparatus including:
the feature point extraction module is used for acquiring a current frame image and a previous frame image, and extracting feature points of the current frame image to obtain image feature points, wherein the current frame image is acquired through a monocular camera;
the object detection module is used for carrying out object detection on the image feature points, determining at least one detection frame corresponding to an object, and determining a first background feature point set and a first object feature point set corresponding to each detection frame according to each detection frame;
the first displacement set determining module is used for determining a first displacement vector set corresponding to the background feature points according to the first background feature point set and the previous frame image if the total area of each detection frame meets a preset condition;
the first dynamic point detection module is used for determining a second displacement vector set corresponding to an object according to a first object characteristic point set corresponding to each detection frame and the previous frame image, and determining whether the object characteristic points in the detection frames are dynamic object characteristic points or not based on the first displacement vector set and the second displacement vector set.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the dynamic object feature point detection method according to any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to implement the dynamic object feature point detection method according to any one of the embodiments of the present invention when executed.
According to the technical scheme of this embodiment, the current frame image and the previous frame image are obtained, and feature points of the current frame image are extracted to obtain image feature points, wherein the current frame image is acquired through a monocular camera; object detection is performed on the image feature points, at least one detection frame corresponding to an object is determined, and a first background feature point set and a first object feature point set corresponding to each detection frame are determined according to the detection frames; if the total area of the detection frames meets a preset condition, a first displacement vector set corresponding to the background feature points is determined according to the first background feature point set and the previous frame image; and, for each detection frame, a second displacement vector set corresponding to the object is determined according to the first object feature point set corresponding to the detection frame and the previous frame image, and whether the object feature points in the detection frame are dynamic object feature points is determined based on the first displacement vector set and the second displacement vector set. This solves the problem of inaccurate detection results when a monocular camera performs dynamic object detection: detection frames are obtained by performing object detection on the image feature points in the current frame image; a first background feature point set formed by background feature points and a first object feature point set corresponding to each detection frame are then determined according to the detection frames; when the total area of the detection frames meets the preset condition, dynamic object detection is performed based on the first background feature point set, and the object feature points in each detection frame are classified as dynamic or static through the first and second displacement vector sets. This improves the accuracy of the detection result, reduces false detections, and keeps the detection cost low.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a method for detecting feature points of a dynamic object according to a first embodiment of the present invention;
fig. 2 is a flowchart of a method for detecting feature points of a dynamic object according to a second embodiment of the present invention;
fig. 3 is a schematic structural diagram of a dynamic object feature point detection device according to a third embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device implementing a method for detecting feature points of a dynamic object according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 1 is a flowchart of a method for detecting feature points of a dynamic object according to an embodiment of the present invention, where the method may be performed by a device for detecting feature points of a dynamic object, where the device may be implemented in hardware and/or software, and the device may be configured in an electronic device. As shown in fig. 1, the method includes:
S101, acquiring a current frame image and a previous frame image, and extracting feature points of the current frame image to obtain image feature points, wherein the current frame image is acquired through a monocular camera.
In this embodiment, the current frame image may be specifically understood as an image currently subjected to detection of a feature point of a dynamic object, where the current frame image is collected by a monocular camera. The previous frame image can be understood as an image acquired before the current frame; the image feature points can be understood as feature points in the current frame image in particular.
Specifically, the monocular camera can be arranged on a vehicle, a robot or other equipment; its working frequency is preset, and it periodically acquires images at that frequency. When detecting dynamic object feature points, the images can be processed at the same frequency, i.e., each time the monocular camera acquires a frame it is sent to the execution device for processing. Considering the performance of the execution device, the images may instead be processed at a different frequency: for example, when the acquisition frequency of the monocular camera is higher than the processing frequency of the execution device, the execution device cannot process every acquired image while maintaining real-time performance, so when processing it takes the most recently acquired frame as the current frame image. The previous frame image may be the image acquired in the acquisition period immediately before the current frame image; alternatively, the image processed the last time dynamic object feature point detection was performed is taken as the previous frame image.
Feature points of the current frame image can be extracted with conventional feature extraction methods, such as the Harris corner detector or the SIFT feature point detector, or with deep learning feature extraction methods such as SuperPoint; the image feature points in the current frame image are obtained through this feature extraction.
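As an illustrative aid only (not part of the disclosed embodiments), a minimal Python sketch of this feature extraction step is given below. It assumes OpenCV's Shi-Tomasi corner detector (goodFeaturesToTrack) as a stand-in for the Harris/SIFT/SuperPoint extractors named above; the function name extract_feature_points and all parameter values are illustrative assumptions.

    # Illustrative sketch: extract feature points from the current frame.
    # The detector choice and parameter values are assumptions, not patent values.
    import cv2
    import numpy as np

    def extract_feature_points(frame_gray: np.ndarray) -> np.ndarray:
        """Return an (N, 2) array of [x, y] pixel coordinates of feature points."""
        corners = cv2.goodFeaturesToTrack(
            frame_gray, maxCorners=1000, qualityLevel=0.01, minDistance=8)
        return corners.reshape(-1, 2) if corners is not None else np.empty((0, 2))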
S102, object detection is carried out on the image feature points, detection frames corresponding to at least one object are determined, and a first background feature point set and a first object feature point set corresponding to each detection frame are determined according to each detection frame.
In this embodiment, the detection frame may be rectangular, square, or the like; the first set of background feature points may be understood as a set of coordinates of background feature points in the current frame image, the background feature points being used to represent a background in the current frame image, the first set of background feature points comprising coordinates of at least one background feature point. The first object feature point set may be specifically understood as a set formed by coordinates of object feature points in one detection frame in the current frame image, where the object feature points are used to represent objects in the current frame image, and the first object feature point set includes coordinates of at least one object feature point, where each object corresponds to one detection frame, and each detection frame corresponds to one first object feature point set, i.e. the number of first object feature point sets is the same as the number of detection frames.
It should be noted that, in the embodiment of the present application, the feature points are generally represented by coordinate values, and each feature point set refers to a set formed by coordinates of the feature points.
Specifically, a deep learning algorithm, for example, a Yolo algorithm, an RCNN algorithm, etc., may be used for object detection on the image feature points. When a plurality of objects are included in the current frame image, the sizes of the objects in the image may be different due to factors such as the volumes and the distances of the objects, so that the sizes of the detection frames corresponding to each object may be the same or different. Distinguishing the background characteristic points and the object characteristic points in the current frame image according to the position, the size and other information of all the detection frames in the current frame image, forming a first background characteristic point set based on the background characteristic points representing the background, and forming a first object characteristic point set corresponding to each detection frame based on the object characteristic points representing the object.
And S103, if the total area of each detection frame meets the preset condition, determining a first displacement vector set corresponding to the background feature points according to the first background feature point set and the previous frame image.
In this embodiment, the preset conditions may be preset according to requirements of detection accuracy, processing speed, and the like; for example, the preset condition is that the total area of each detection frame is smaller than a certain threshold, or the preset condition is that the ratio of the total area of each detection frame to the total area of the current frame image is smaller than a certain threshold, or the like. The first set of displacement vectors may be understood as a set of displacements of each background feature point relative to the previous frame image, where the first set of displacement vectors includes at least one first displacement.
Specifically, the area of each detection frame is calculated according to the height and the width of each detection frame, the total area of all detection frames is further calculated, and the total area is compared with a preset condition. If the total area meets the preset condition, detecting the first background characteristic point set and the previous frame image, determining the corresponding background characteristic points in the two frames of images, and further calculating the displacement between the background characteristic points in one-to-one correspondence to form a first displacement vector set.
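For illustration, a possible Python sketch of this area test follows; the helper name detection_area_ok and the example preset condition (total detection-frame area below half of the image area) are assumptions, since the patent leaves the concrete threshold open.

    def detection_area_ok(boxes, image_height, image_width, max_ratio=0.5):
        """Check the preset condition on the total detection-frame area.
        boxes holds [x, y, w, h] entries; max_ratio is an assumed example
        threshold, not a value fixed by the patent."""
        total_area = sum(w * h for (_x, _y, w, h) in boxes)
        return total_area / (image_height * image_width) < max_ratio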
S104, determining a second displacement vector set corresponding to the object according to the first object feature point set corresponding to the detection frame and the previous frame image for each detection frame, and determining whether the object feature points in the detection frame are dynamic object feature points or not based on the first displacement vector set and the second displacement vector set.
In this embodiment, the second set of displacement vectors may be specifically understood as a set formed by displacements of each object feature point with respect to the previous frame image, where the second set of displacement vectors includes at least one second displacement.
For each detection frame, whether the object feature points in the detection frame are dynamic object feature points can be judged in the same way. For each detection frame, a first object feature point set corresponding to the detection frame is determined, the first object feature point set and the previous frame of image are detected, corresponding object feature points in the two frames of images are determined, and then displacement between the object feature points in one-to-one correspondence is calculated to form a second displacement vector set. Determining a judging condition for distinguishing the feature points of the dynamic and static object based on the first displacement vector set, for example, calculating the value of the mean value, the median value, the weighted value and the like of each first displacement in the first displacement vector set as the judging condition; and further judging the second displacement in the second displacement vector set according to the judging condition, for example, comparing the magnitude of the value in the second displacement and the judging condition, or calculating whether the difference value between the value in the second displacement and the judging condition is within a certain range, and the like, so as to further determine whether the object feature point is a dynamic object feature point.
In the embodiment of the application, the object feature points in the detection frame are further judged to be dynamic points or static points; compared with the prior art, which directly treats all points in a detection frame as dynamic, the dynamic point identification provided here yields more accurate results.
The embodiment of the application provides a dynamic object feature point detection method, which solves the problem of inaccurate detection results when a monocular camera detects dynamic objects. Detection frames are obtained by performing object detection on the image feature points in the current frame image; a first background feature point set formed by background feature points and a first object feature point set corresponding to each detection frame are then determined according to the detection frames. When the total area of the detection frames meets the preset condition, dynamic object detection is performed based on the first background feature point set, and the object feature points in each detection frame are classified as dynamic or static through the first and second displacement vector sets. This improves the accuracy of the detection result and reduces false detections; because a monocular camera is used for image acquisition, the detection cost is low.
Example two
Fig. 2 is a flowchart of a method for detecting feature points of a dynamic object according to a second embodiment of the present application, where the method is optimized based on the foregoing embodiment. As shown in fig. 2, the method includes:
S201, acquiring a current frame image and a previous frame image, and extracting feature points of the current frame image to obtain image feature points, wherein the current frame image is acquired through a monocular camera.
S202, object detection is carried out on the image feature points, detection frames corresponding to at least one object are determined, and a first background feature point set and a first object feature point set corresponding to each detection frame are determined according to each detection frame.
As an optional embodiment of the present embodiment, the present optional embodiment further optimizes determining, according to each detection frame, the first background feature point set and the first object feature point set corresponding to each detection frame, to:
a1, judging whether the image feature points are in the detection frames or not according to the coordinates of the image feature points and the size and position information of each detection frame aiming at each image feature point, and if so, executing A2; otherwise, A3 is performed.
In this embodiment, the size of the detection frame may be the height and width of the detection frame, and the position information of the detection frame may be coordinates of points such as a center point, an upper left vertex, and an upper right vertex of the detection frame. In the embodiment of the application, when the detection frame is determined, the size and the position information of the detection frame can be obtained at the same time, the coordinates of the image characteristic points are compared with the size and the position information of the detection frame, whether the image characteristic points are in the detection frame or not is judged, if so, the A2 is executed; otherwise, A3 is performed.
For example, the embodiment of the application provides a method for judging whether an image feature point is in a detection frame (a runnable sketch of this check follows the present embodiment). Taking the horizontal rightward direction as the x-axis and the vertical downward direction as the y-axis, the coordinates of an image feature point can be expressed as [x1, y1], where x1 is the pixel abscissa and y1 is the pixel ordinate of the image feature point. The detection frame can generally be represented as [x, y, w, h], where [x, y] is the upper left vertex of the detection frame, w is its width, and h is its height.
Whether the image feature point is located in the detection frame can be judged with the following inequalities:
x < x1 < x + w
y < y1 < y + h
in the embodiment of the application, the image characteristic points are sequentially compared with all the detection frames, and whether the image characteristic points are in a certain detection frame or not is determined. And traversing all the image characteristic points, and judging whether each image characteristic point is a background characteristic point or an object characteristic point.
A2, adding the image characteristic points serving as object characteristic points into a first object characteristic point set corresponding to a detection frame where the image characteristic points are located.
If the image feature point is in the detection frame, determining the image feature point as an object feature point, determining the detection frame in which the image feature point is located, and adding the image feature point into a first object feature point set corresponding to the detection frame in which the image feature point is located.
A3, adding the image feature points serving as background feature points into the first background feature point set.
If the image feature point is not in any detection frame, determining the image feature point as a background feature point, and adding the image feature point into the first background feature point set.
In the embodiment of the application, when the object is detected, the object can be numbered, namely the detection frame is numbered. When the first object feature point set corresponding to the detection frame is determined, an object number can be added to the object feature points in the first object feature point set. After each image feature point is judged, the object feature point in each first object feature point set obtained at this time is added with a corresponding object number.
In the embodiment of the application, each image feature point is judged to be an object feature point or a background feature point through the size and position information of the detection frames, and is added to the corresponding first object feature point set or the first background feature point set, so that dynamic object feature points can then be detected in different ways depending on whether the total area of the detection frames meets the preset condition.
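For illustration, the following Python sketch implements steps A1-A3 using the inequalities given above; the helper name split_feature_points is an assumption.

    def split_feature_points(points, boxes):
        """A1-A3: assign each [x1, y1] feature point either to the first
        detection frame [x, y, w, h] containing it (object feature point)
        or to the background feature point set."""
        background = []
        per_box = {i: [] for i in range(len(boxes))}  # one set per detection frame
        for (x1, y1) in points:
            for i, (x, y, w, h) in enumerate(boxes):
                if x < x1 < x + w and y < y1 < y + h:
                    per_box[i].append((x1, y1))   # A2: object feature point
                    break
            else:
                background.append((x1, y1))       # A3: background feature point
        return background, per_box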
S203, judging whether the total area of each detection frame meets a preset condition, if so, executing S204; otherwise, S211 is performed.
In the embodiment of the application, the preset condition relates to the area of the detection frames. If the total area of the detection frames meets the preset condition, the number of background feature points is sufficient, so steps S204-S210 are used to distinguish dynamic from static objects. If the total area does not meet the preset condition, few background feature points can be determined and they are not sufficiently representative, so steps S211-S214 are used instead. Selecting different approaches for distinguishing dynamic and static feature points according to the total area of the detection frames effectively improves the accuracy of the detection result.
S204, processing the first background characteristic point set and the previous frame image based on a sparse optical flow method to obtain a second background characteristic point set and a third background characteristic point set.
The background feature points in the second background feature point set are the background feature points in the current frame image, and the background feature points in the third background feature point set are the background feature points in the previous frame image.
In this embodiment, the second background feature point set and the third background feature point set are each a set composed of image feature points representing the background, and the number of the background feature points in the second background feature point set and the third background feature point set is the same and corresponds to each other one by one.
The optical flow method is a method for finding out the correspondence existing between the previous frame and the current frame by utilizing the change of pixels in an image sequence in a time domain and the correlation between adjacent frames, thereby calculating the motion information of an object between the adjacent frames. The sparse optical flow method only calculates the movement of a small number of characteristic points, so that the sparse optical flow method has higher calculation efficiency and less memory requirement. In the embodiment of the application, the number of the characteristic points can be reduced by extracting the characteristic points of the current frame image, and then the data in the two frames of images are processed by a sparse optical flow method to determine the motion change of the characteristic points.
The first background feature point set and the previous frame image are processed based on the sparse optical flow method; the background feature points in the first background feature point set that match one-to-one with points in the previous frame image are determined, forming the second and third background feature point sets. Owing to the motion of the vehicle, robot or other equipment, some background feature points in the first background feature point set may find no corresponding feature point in the previous frame image, so the number of background feature points in the second background feature point set is smaller than or equal to that in the first; that is, the second background feature point set is formed from the background feature points remaining in the first set after those without a match in the previous frame image are filtered out.
S205, subtracting the second background characteristic point set and the third background characteristic point set to obtain a first displacement vector set.
The coordinates of the background feature points that correspond one-to-one between the second and third background feature point sets are subtracted to obtain the first displacements. Coordinate subtraction in the embodiment of the application subtracts abscissa from abscissa and ordinate from ordinate, so each first displacement comprises a displacement in the abscissa direction and a displacement in the ordinate direction. The first displacement vector set is formed from the first displacements of each pair of matched background feature points.
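For illustration, a Python sketch of steps S204-S205 is given below, assuming OpenCV's pyramidal Lucas-Kanade tracker (calcOpticalFlowPyrLK) as the sparse optical flow method; the helper name match_and_displace is an assumption. The same helper can serve steps S206-S207 by passing a first object feature point set instead of the background set.

    import cv2
    import numpy as np

    def match_and_displace(curr_gray, prev_gray, pts_curr):
        """Track current-frame feature points into the previous frame with
        sparse optical flow; return the matched point sets (second and third
        sets) and their displacement vectors (coordinate-wise subtraction)."""
        p0 = np.asarray(pts_curr, dtype=np.float32).reshape(-1, 1, 2)
        p1, status, _err = cv2.calcOpticalFlowPyrLK(curr_gray, prev_gray, p0, None)
        ok = status.reshape(-1) == 1           # drop points with no match
        second_set = p0.reshape(-1, 2)[ok]     # feature points in the current frame
        third_set = p1.reshape(-1, 2)[ok]      # matched points in the previous frame
        return second_set, third_set, second_set - third_set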
S206, processing the first object feature point set and the previous frame image corresponding to the detection frames based on a sparse optical flow method to obtain a second object feature point set and a third object feature point set.
The object feature points in the second object feature point set are object feature points in the current frame image, and the object feature points in the third object feature point set are object feature points in the previous frame image.
In this embodiment, the second object feature point set and the third object feature point set are each a set made up of image feature points representing an object, and the number of object feature points in the second object feature point set and the third object feature point set are the same and correspond to each other one by one.
For each detection frame, the mode of S206-S210 can be adopted to determine whether the corresponding object feature point is a dynamic object feature point. And processing the first object feature point set and the previous frame image based on a sparse optical flow method, determining object feature points in the first object feature point set, which are matched with the previous frame image in a one-to-one correspondence manner, and forming a second object feature point set and a third object feature point set. Similarly, the number of object feature points in the second object feature point set in the embodiment of the present application is smaller than or equal to the number of object feature points in the first object feature point set, that is, the second object feature point set is formed by filtering out the remaining object feature points in the first object feature point set after the matching object feature points are not found in the previous frame image.
S207, subtracting the second object feature point set and the third object feature point set to obtain a second displacement vector set.
The coordinates of the object feature points that correspond one-to-one between the second and third object feature point sets are subtracted to obtain the second displacements. As before, coordinate subtraction subtracts abscissa from abscissa and ordinate from ordinate, so each second displacement comprises a displacement in the abscissa direction and a displacement in the ordinate direction. The second displacement vector set is formed from the second displacements of each pair of matched object feature points.
S208, determining a displacement average value and a displacement threshold value of the background area according to the first displacement vector set.
In this embodiment, the displacement average may be specifically understood as an average value of the moving distances of all the background feature points; the displacement threshold value can be understood as a threshold value for determining whether or not the distance traveled by the background feature point is excessively large.
The distances moved by all background feature points in the background area are calculated from the first displacements in the first displacement vector set; in the embodiment of the application, the distance a background feature point moves can be computed from its first displacement using the L1 norm, the L2 norm, or a similar norm. The average moving distance is then calculated from these distances to obtain the displacement average. The displacement threshold is obtained from the first displacements by computing, for example, a maximum difference, a minimum difference, or a standard deviation.
As an optional embodiment of the present embodiment, the present optional embodiment further optimizes a displacement average value and a displacement threshold value for determining the background area according to the first displacement vector set to:
b1, calculating L2 norms of the first displacement for each first displacement in the first displacement vector set.
For each first displacement in the first displacement vector set, the L2 norm of the first displacement is calculated, i.e., the square root of the sum of the squares of its abscissa and ordinate components.
B2, determining the average value of L2 norms of the first displacements as the displacement average value of the background area.
And calculating the average value of each L2 norm according to the L2 norms of each first displacement, and taking the calculated result as the displacement average value of the background area.
B3, determining the standard deviation of the L2 norm of each first displacement as a displacement threshold of the background area.
And calculating the standard deviation of each L2 norm according to the L2 norms of each first displacement, and taking the calculation result as a displacement threshold value of the background area.
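A short Python sketch of steps B1-B3, assuming the first displacements are stored as an (N, 2) NumPy array; the helper name background_statistics is an assumption.

    import numpy as np

    def background_statistics(first_displacements):
        """B1-B3: L2 norm of each first displacement, then the mean
        (displacement average) and standard deviation (displacement
        threshold) of those norms."""
        norms = np.linalg.norm(first_displacements, axis=1)  # B1
        return norms.mean(), norms.std()                     # B2, B3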
S209, determining the displacement of each object characteristic point corresponding to the second displacement vector set.
The second displacement vector set includes a second displacement corresponding to at least one object feature point, and the distance moved by the object feature point is calculated based on the second displacement.
As an optional embodiment of the present embodiment, the present optional embodiment further optimizes the determining of the displacement of each object feature point included in the second set of displacement vectors to:
and C1, determining a second displacement corresponding to each object feature point in the second displacement vector set, and calculating an L2 norm of the second displacement.
The L2 norm of the second displacement is the square root of the sum of the squares of its abscissa and ordinate components; the L2 norm of the second displacement corresponding to each object feature point is calculated in this way.
C2, determining the L2 norm of the second displacement as the displacement of the object characteristic point.
S210, calculating the difference value of the displacement and the displacement average value of each object feature point for each object feature point, and determining the object feature point as a dynamic object feature point if the absolute value of the difference value is not smaller than a displacement threshold value.
For each object feature point in each detection frame, the dynamic/static distinction can be made as follows: calculate the difference between the displacement of the object feature point and the displacement average, and compare the absolute value of the difference with the displacement threshold. If the absolute value of the difference is not smaller than the displacement threshold, the object feature point is determined to be a dynamic object feature point; if it is smaller than the displacement threshold, the object feature point is determined to be a static feature point.
In this step, when making the dynamic/static determination, if the number of object feature points in the second object feature point set is smaller than that in the first object feature point set, only the object feature points matched in the previous frame image may be tested, i.e., only the points in the second object feature point set are judged. Alternatively, for an object feature point N1 with no match in the previous frame image, the closest object feature point N2 in the current frame image may be determined according to coordinates, and the state of N1 follows that of N2: when N2 is a dynamic object feature point, N1 is also a dynamic object feature point, and when N2 is a static object feature point, N1 is also a static object feature point.
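For illustration, the following Python sketch combines steps S209 and S210 (and C1-C2), assuming an (N, 2) NumPy array of second displacements; the helper name classify_object_points is an assumption.

    import numpy as np

    def classify_object_points(second_displacements, displacement_mean, displacement_threshold):
        """S209-S210: an object feature point is dynamic when the absolute
        difference between its displacement (L2 norm of its second
        displacement) and the background displacement average is not
        smaller than the displacement threshold."""
        displacements = np.linalg.norm(second_displacements, axis=1)   # C1-C2
        return np.abs(displacements - displacement_mean) >= displacement_threshold  # True = dynamic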
S211, determining a basic matrix between the current frame image and the previous frame image.
The relative pose (translation) and relative rotation between the current frame image and the previous frame image are acquired; they can be obtained from equipment or sensors such as an RTK device or wheel speed meters. The basic matrix between the two frames is determined from their relative pose and relative rotation.
Exemplary, the embodiment of the application provides a determination formula of a basic matrix:
F = K^{-T} ([t]_x R)^T K^{-1}
where K is the camera intrinsic matrix, t is the relative pose (translation) between the two frames, [t]_x is the antisymmetric matrix of the relative pose t, and R is the relative rotation between the two frames.
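For illustration, a Python sketch assembling the basic matrix from the formula above; the helper name fundamental_matrix is an assumption, and t is taken as a length-3 relative translation vector.

    import numpy as np

    def fundamental_matrix(K, t, R):
        """S211: F = K^{-T} ([t]_x R)^T K^{-1}, with [t]_x the
        antisymmetric matrix of the relative pose t."""
        tx = np.array([[0.0, -t[2], t[1]],
                       [t[2], 0.0, -t[0]],
                       [-t[1], t[0], 0.0]])   # [t]_x
        K_inv = np.linalg.inv(K)
        return K_inv.T @ (tx @ R).T @ K_inv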
S212, processing object feature points in the detection frames and the previous frame of image based on a sparse optical flow method aiming at the detection frames corresponding to each object to obtain a matching point set corresponding to the previous frame of image.
In this embodiment, the matching point set may be specifically understood as a set formed by object feature points that are matched with object feature points in the detection frame one by one in the previous frame image.
For each detection frame, the corresponding matching point set can be determined as follows: the object feature points in the first object feature point set corresponding to the detection frame and the previous frame image are processed based on the sparse optical flow method, the one-to-one matches between object feature points in the detection frame and object feature points in the previous frame image are determined, and the matching point set is formed. The matching point set obtained in this step contains object feature points corresponding one-to-one to the object feature points in the detection frame.
S213, determining target matching characteristic points corresponding to the object characteristic points in the matching point set aiming at each object characteristic point in the detection frame.
In this embodiment, the target matching feature point is an object feature point in the matching point set, determined from an object feature point in the detection frame via the established correspondence. After an object feature point is determined, its target matching feature point in the matching point set is determined according to the correspondence.
S214, determining the position relation of the object feature points in the two frames of images based on the coordinates of the object feature points and the target matching feature points in combination with the basic matrix, and determining the object feature points as dynamic object feature points if the position relation of the object feature points in the two frames of images does not meet the preset position condition.
In this embodiment, the preset position condition may be that the two points coincide, or coincide within an allowable error range. The coordinates of the object feature point and the target matching feature point, together with the basic matrix, are substituted into a calculation formula to obtain the positional relationship of the object feature point across the two frame images. If the positional relationship satisfies the preset position condition, the object feature point is determined to be a static feature point; if it does not, the object feature point is determined to be a dynamic object feature point.
Exemplary, the embodiment of the present application provides a formula for determining whether an object feature point is a dynamic object feature point:
p^T F p* < thre
where p is the coordinate of the object feature point, F is the basic matrix, p* is the coordinate of the target matching feature point, and p^T F p* represents the positional relationship of the object feature point across the two frame images; thre is a preset threshold, which can be set according to the speed of the vehicle or robot and the measurement accuracy of the RTK receiver. When p^T F p* < thre, the preset position condition is determined to be satisfied.
In the embodiment of the application, the two corresponding object feature points can be substituted into this formula, and whether they represent a dynamic point or a static point is judged from the result. For a point on an object observed at two moments, the coordinates at both moments are available; if the point is dynamic, its actual position has changed, so its coordinates change in the same coordinate system, the two observations no longer satisfy the constraint for the same static point, and the calculated value exceeds the threshold.
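For illustration, a Python sketch of the check in S214; the helper name satisfies_position_condition is an assumption, and taking the absolute value of p^T F p* (the signed epipolar residual) is likewise an assumption the patent text does not spell out.

    import numpy as np

    def satisfies_position_condition(p, p_star, F, thre):
        """S214: evaluate |p^T F p*| < thre for an object feature point p
        (current frame) and its target matching feature point p* (previous
        frame), both lifted to homogeneous pixel coordinates. Points
        satisfying the bound are treated as static."""
        p_h = np.array([p[0], p[1], 1.0])
        p_star_h = np.array([p_star[0], p_star[1], 1.0])
        return abs(p_h @ F @ p_star_h) < thre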
The embodiment of the application provides a dynamic object feature point detection method, which solves the problem of inaccurate detection results when a monocular camera detects dynamic objects. Detection frames are obtained by performing object detection on the image feature points in the current frame image, and a first background feature point set formed by background feature points and a first object feature point set corresponding to each detection frame are then determined according to the detection frames. Different processing modes are selected for the dynamic/static distinction according to whether the total area of the detection frames meets the preset condition: when it is met, dynamic object detection is based on the first background feature point set; when it is not met, the background feature points are too few and unrepresentative, so the distinction is made from the object feature points alone. The detection result is therefore more accurate, the object feature points within a detection frame are re-examined, and false detections are reduced. Because a monocular camera is used for image acquisition, the detection cost is low, no feature point depth is required, and the method is more robust.
Example III
Fig. 3 is a schematic structural diagram of a dynamic object feature point detection device according to a third embodiment of the present invention. As shown in fig. 3, the apparatus includes: the device comprises a feature point extraction module 31, an object detection module 32, a first displacement set determination module 33 and a first dynamic point detection module 34;
the feature point extraction module 31 is configured to obtain a current frame image and a previous frame image, and extract feature points of the current frame image to obtain image feature points, where the current frame image is collected by a monocular camera;
the object detection module 32 is configured to perform object detection on the image feature points, determine a detection frame corresponding to at least one object, and determine a first background feature point set and a first object feature point set corresponding to each detection frame according to each detection frame;
a first displacement set determining module 33, configured to determine a first displacement vector set corresponding to a background feature point according to the first background feature point set and the previous frame image if the total area of each detection frame meets a preset condition;
the first dynamic point detection module 34 is configured to determine, for each detection frame, a second set of displacement vectors corresponding to the object according to a first set of object feature points corresponding to the detection frame and the previous frame image, and determine, based on the first set of displacement vectors and the second set of displacement vectors, whether the object feature points in the detection frame are dynamic object feature points.
The embodiment of the application provides a dynamic object feature point detection device, which solves the problem of inaccurate detection results when a monocular camera detects dynamic objects. Detection frames are obtained by performing object detection on the image feature points in the current frame image; a first background feature point set formed by background feature points and a first object feature point set corresponding to each detection frame are then determined according to the detection frames. When the total area of the detection frames meets the preset condition, dynamic object detection is performed based on the first background feature point set, and the object feature points in each detection frame are classified as dynamic or static through the first and second displacement vector sets. This improves the accuracy of the detection result and reduces false detections; because a monocular camera is used for image acquisition, the detection cost is low.
Optionally, the object detection module 32 is specifically configured to: judging whether each image feature point is in a detection frame or not according to the coordinates of the image feature point and the size and position information of each detection frame aiming at each image feature point; if yes, the image feature points are used as object feature points and added into a first object feature point set corresponding to a detection frame where the image feature points are located; otherwise, the image feature points are used as background feature points to be added into a first background feature point set.
Optionally, the first displacement set determining module 33 includes:
the first background set determining unit is used for processing the first background characteristic point set and the previous frame image based on a sparse optical flow method to obtain a second background characteristic point set and a third background characteristic point set, wherein the background characteristic points in the second background characteristic point set are background characteristic points in the current frame image, and the background characteristic points in the third background characteristic point set are background characteristic points in the previous frame image;
and the first displacement set determining unit is used for subtracting the third background feature point set from the second background feature point set to obtain the first displacement vector set.
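One way to realize these two units is OpenCV's pyramidal Lucas-Kanade tracker. The sketch below is an assumption: the tracking direction (current frame into previous frame), window size, and pyramid depth are choices made here for illustration, not values taken from the text.

```python
import cv2
import numpy as np

def displacement_vectors(curr_gray, prev_gray, points):
    """Track `points` (from the current 8-bit grayscale frame) into the
    previous frame with sparse optical flow, then subtract the matched
    point sets to get one displacement vector per tracked point."""
    pts = np.asarray(points, dtype=np.float32).reshape(-1, 1, 2)
    matched, status, _err = cv2.calcOpticalFlowPyrLK(
        curr_gray, prev_gray, pts, None, winSize=(21, 21), maxLevel=3)
    ok = status.ravel() == 1                 # keep successfully tracked points
    second_set = pts.reshape(-1, 2)[ok]      # points in the current frame
    third_set = matched.reshape(-1, 2)[ok]   # matches in the previous frame
    return second_set - third_set            # the displacement vector set
```

The same helper applies unchanged to the first object feature point set of each detection frame, yielding the second displacement vector set handled by the units below.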
Optionally, the first dynamic point detection module 34 includes:
the first object set determining unit is used for processing the first object feature point set corresponding to the detection frame and the previous frame image based on a sparse optical flow method to obtain a second object feature point set and a third object feature point set, wherein object feature points in the second object feature point set are object feature points in the current frame image, and object feature points in the third object feature point set are object feature points in the previous frame image;
and the second displacement set determining unit is used for subtracting the third object feature point set from the second object feature point set to obtain the second displacement vector set.
Optionally, the first dynamic point detection module 34 includes:
the background displacement determining unit is used for determining a displacement average value and a displacement threshold value of a background area according to the first displacement vector set;
an object displacement determining unit, configured to determine a displacement of each object feature point corresponding to the second displacement vector set;
and the feature point judging unit is used for calculating, for each object feature point, the difference between the displacement of the object feature point and the displacement average value, and for determining the object feature point as a dynamic object feature point if the absolute value of the difference is not smaller than the displacement threshold.
Optionally, the background displacement determining unit is specifically configured to: calculate, for each first displacement in the first displacement vector set, the L2 norm of the first displacement; determine the average value of the L2 norms of the first displacements as the displacement average value of the background area; and determine the standard deviation of the L2 norms of the first displacements as the displacement threshold of the background area.
Optionally, the object displacement determining unit is specifically configured to: determine, for each object feature point, the second displacement corresponding to the object feature point in the second displacement vector set; calculate the L2 norm of the second displacement; and determine the L2 norm of the second displacement as the displacement of the object feature point.
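Taken together, the units above reduce to a short statistical test: the L2 norms of the background displacements supply a mean and a standard deviation, and an object feature point counts as dynamic when its own displacement norm deviates from that mean by at least the standard deviation. A NumPy sketch under these definitions (illustrative only):

```python
import numpy as np

def flag_dynamic_points(bg_displacements, obj_displacements):
    """bg_displacements, obj_displacements: (N, 2) and (M, 2) arrays of
    per-point displacement vectors. Returns a boolean mask over the M
    object feature points, True where a point is classified as dynamic."""
    bg_norms = np.linalg.norm(bg_displacements, axis=1)  # L2 norms
    mean = bg_norms.mean()       # displacement average of the background
    threshold = bg_norms.std()   # standard deviation as the threshold
    obj_norms = np.linalg.norm(obj_displacements, axis=1)
    # dynamic if |displacement - mean| is not smaller than the threshold
    return np.abs(obj_norms - mean) >= threshold
```

The intuition: background points move only with the camera, so an object point whose apparent motion departs from the background's typical motion by more than its natural spread is likely moving on its own.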
Optionally, the apparatus further comprises:
the fundamental matrix determining module is used for determining a fundamental matrix between the current frame image and the previous frame image if the total area of the detection frames does not meet the preset condition;
the matching point set determining module is used for, for the detection frame corresponding to each object, processing the object feature points in the detection frame and the previous frame image based on a sparse optical flow method to obtain a matching point set corresponding to the previous frame image;
the target point determining module is used for determining, for each object feature point, a target matching feature point corresponding to the object feature point in the matching point set;
and the second dynamic point detection module is used for determining the positional relationship of each object feature point across the two frames of images based on the coordinates of the object feature point and its target matching feature point together with the fundamental matrix, and for determining the object feature point as a dynamic object feature point if this positional relationship does not meet a preset position condition.
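This branch is, in essence, an epipolar-consistency check. The OpenCV sketch below is one possible reading: the RANSAC estimator, the point-to-epipolar-line distance as the "positional relationship", and the one-pixel threshold are all assumptions, since the text leaves the preset position condition open.

```python
import cv2
import numpy as np

def flag_dynamic_by_epipolar(curr_pts, prev_pts, obj_curr, obj_prev,
                             dist_thresh=1.0):
    """Estimate the fundamental matrix F from tracked correspondences
    (curr_pts, prev_pts; (N, 2) float arrays), then flag an object
    feature point as dynamic when its match in the previous frame lies
    farther than dist_thresh pixels from the epipolar line under F."""
    F, _mask = cv2.findFundamentalMat(curr_pts, prev_pts, cv2.FM_RANSAC)
    if F is None:
        return np.zeros(len(obj_curr), dtype=bool)  # estimation failed
    ones = np.ones((len(obj_curr), 1))
    x_curr = np.hstack([np.asarray(obj_curr, float), ones])  # homogeneous
    x_prev = np.hstack([np.asarray(obj_prev, float), ones])
    lines = x_curr @ F.T              # epipolar lines in the previous frame
    num = np.abs(np.sum(lines * x_prev, axis=1))
    den = np.linalg.norm(lines[:, :2], axis=1)
    return num / den > dist_thresh    # True -> dynamic object point
```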
The dynamic object feature point detection device provided by the embodiment of the invention can execute the dynamic object feature point detection method provided by any embodiment of the invention, and has the functional modules and beneficial effects corresponding to the executed method.
Example IV
Fig. 4 shows a schematic diagram of an electronic device 40 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches), and other similar computing devices. The components shown here, their connections and relationships, and their functions are meant to be exemplary only and are not meant to limit implementations of the invention described and/or claimed herein.
As shown in Fig. 4, the electronic device 40 includes at least one processor 41 and a memory communicatively connected to the at least one processor 41, such as a Read Only Memory (ROM) 42 and a Random Access Memory (RAM) 43. The memory stores a computer program executable by the at least one processor, and the processor 41 may perform various suitable actions and processes according to the computer program stored in the ROM 42 or loaded from the storage unit 48 into the RAM 43. The RAM 43 may also store various programs and data required for the operation of the electronic device 40. The processor 41, the ROM 42, and the RAM 43 are connected to one another via a bus 44, to which an input/output (I/O) interface 45 is also connected.
Various components in electronic device 40 are connected to I/O interface 45, including: an input unit 46 such as a keyboard, a mouse, etc.; an output unit 47 such as various types of displays, speakers, and the like; a storage unit 48 such as a magnetic disk, an optical disk, or the like; and a communication unit 49 such as a network card, modem, wireless communication transceiver, etc. The communication unit 49 allows the electronic device 40 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 41 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the processor 41 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, or microcontroller. The processor 41 performs the methods and processes described above, such as the dynamic object feature point detection method.
In some embodiments, the dynamic object feature point detection method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 48. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 40 via the ROM 42 and/or the communication unit 49. When the computer program is loaded into the RAM 43 and executed by the processor 41, one or more steps of the dynamic object feature point detection method described above may be performed. Alternatively, in other embodiments, the processor 41 may be configured to perform the dynamic object feature point detection method in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor able to receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server (also called a cloud computing server or cloud host), a host product in the cloud computing service system that remedies the defects of difficult management and weak service scalability found in traditional physical hosts and VPS services.
It should be appreciated that steps may be reordered, added, or deleted in the various flows shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention can be achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method for detecting feature points of a dynamic object, comprising:
acquiring a current frame image and a previous frame image, and extracting feature points of the current frame image to obtain image feature points, wherein the current frame image is acquired through a monocular camera;
performing object detection on the image feature points, determining a detection frame corresponding to each of at least one object, and determining, according to the detection frames, a first background feature point set and a first object feature point set corresponding to each detection frame;
If the total area of each detection frame meets a preset condition, determining a first displacement vector set corresponding to the background feature point according to the first background feature point set and the previous frame image;
for each detection frame, determining a second displacement vector set corresponding to an object according to a first object feature point set corresponding to the detection frame and the previous frame image, and determining whether the object feature points in the detection frame are dynamic object feature points or not based on the first displacement vector set and the second displacement vector set.
2. The method of claim 1, wherein the determining a first displacement vector set corresponding to the background feature points according to the first background feature point set and the previous frame image comprises:
based on a sparse optical flow method, processing the first background feature point set and the previous frame image to obtain a second background feature point set and a third background feature point set, wherein the background feature point in the second background feature point set is the background feature point in the current frame image, and the background feature point in the third background feature point set is the background feature point in the previous frame image;
and subtracting the third background feature point set from the second background feature point set to obtain the first displacement vector set.
3. The method of claim 1, wherein the determining a second displacement vector set corresponding to the object according to the first object feature point set corresponding to the detection frame and the previous frame image comprises:
processing the first object feature point set corresponding to the detection frame and the previous frame image based on a sparse optical flow method to obtain a second object feature point set and a third object feature point set, wherein the object feature points in the second object feature point set are object feature points in the current frame image, and the object feature points in the third object feature point set are object feature points in the previous frame image;
and subtracting the third object feature point set from the second object feature point set to obtain the second displacement vector set.
4. The method of claim 1, wherein the determining whether the object feature points in the detection frame are dynamic object feature points based on the first displacement vector set and the second displacement vector set comprises:
determining a displacement average value and a displacement threshold value of the background area according to the first displacement vector set;
determining the displacement of each object feature point corresponding to the second displacement vector set;
and for each object feature point, calculating the difference between the displacement of the object feature point and the displacement average value, and determining the object feature point as a dynamic object feature point if the absolute value of the difference is not smaller than the displacement threshold.
5. The method of claim 4, wherein the determining a displacement average value and a displacement threshold value of the background area according to the first displacement vector set comprises:
calculating, for each first displacement in the first displacement vector set, the L2 norm of the first displacement;
determining the average value of L2 norms of the first displacements as the displacement average value of the background area;
and determining the standard deviation of the L2 norms of the first displacements as the displacement threshold of the background area.
6. The method of claim 4, wherein the determining the displacement of each object feature point corresponding to the second displacement vector set comprises:
determining a second displacement corresponding to each object feature point in the second displacement vector set, and calculating an L2 norm of the second displacement;
and determining the L2 norm of the second displacement as the displacement of the object feature point.
7. The method according to any one of claims 1-6, further comprising:
If the total area of each detection frame does not meet the preset condition, determining a fundamental matrix between the current frame image and the previous frame image;
for the detection frame corresponding to each object, processing the object feature points in the detection frame and the previous frame image based on a sparse optical flow method to obtain a matching point set corresponding to the previous frame image;
for each object feature point in the detection frame, determining a target matching feature point corresponding to the object feature point in the matching point set;
and determining the positional relationship of the object feature point across the two frames of images based on the coordinates of the object feature point and the target matching feature point together with the fundamental matrix, and determining the object feature point as a dynamic object feature point if the positional relationship does not meet a preset position condition.
8. A dynamic object feature point detection apparatus, comprising:
the feature point extraction module is used for acquiring a current frame image and a previous frame image, and extracting feature points from the current frame image to obtain image feature points, wherein the current frame image is acquired through a monocular camera;
the object detection module is used for performing object detection on the image feature points, determining a detection frame corresponding to each of at least one object, and determining, according to the detection frames, a first background feature point set and a first object feature point set corresponding to each detection frame;
The first displacement set determining module is used for determining a first displacement vector set corresponding to the background feature points according to the first background feature point set and the previous frame image if the total area of each detection frame meets a preset condition;
the first dynamic point detection module is used for determining, for each detection frame, a second displacement vector set corresponding to the object according to the first object feature point set corresponding to the detection frame and the previous frame image, and determining whether the object feature points in the detection frame are dynamic object feature points based on the first displacement vector set and the second displacement vector set.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the dynamic object feature point detection method of any one of claims 1-7.
10. A computer readable storage medium storing computer instructions for causing a processor to perform the dynamic object feature point detection method of any one of claims 1-7.
CN202310868495.9A 2023-07-14 2023-07-14 Dynamic object feature point detection method and device, electronic equipment and storage medium Pending CN116824638A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310868495.9A CN116824638A (en) 2023-07-14 2023-07-14 Dynamic object feature point detection method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116824638A true CN116824638A (en) 2023-09-29

Family

ID=88125823

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310868495.9A Pending CN116824638A (en) 2023-07-14 2023-07-14 Dynamic object feature point detection method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116824638A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination