CN115761558A - Method and device for determining key frame in visual positioning - Google Patents

Method and device for determining key frame in visual positioning

Info

Publication number
CN115761558A
Authority
CN
China
Prior art keywords
target image
frame
image frame
key frame
key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111021732.5A
Other languages
Chinese (zh)
Inventor
刘世蔷
谢铭诗
张义
张慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SAIC Motor Corp Ltd
Shanghai Automotive Industry Corp Group
Original Assignee
SAIC Motor Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SAIC Motor Corp Ltd filed Critical SAIC Motor Corp Ltd
Priority to CN202111021732.5A priority Critical patent/CN115761558A/en
Publication of CN115761558A publication Critical patent/CN115761558A/en
Pending legal-status Critical Current

Abstract

The embodiment of the application discloses a method and a device for determining key frames in visual positioning. A video stream has a plurality of image frames arranged in time sequence and contains a first key frame. A first target image frame located after the first key frame in the video stream can be obtained, and the actual parallax between the first target image frame and the first key frame is calculated according to the distance between the matched feature points in the two frames. If the actual parallax is greater than or equal to the preset parallax, the number of matched feature points is less than or equal to a first preset value, or the frame number difference between the two frames is greater than or equal to a second preset value, then the first target image frame differs sufficiently from the first key frame and carries little redundant information, so the first target image frame can be determined to be a second key frame. Each key frame so determined carries little redundant information while ensuring an accurate pose, the amount of stored data is reduced, and computing resources are saved.

Description

Method and device for determining key frame in visual positioning
Technical Field
The invention relates to the field of computers, in particular to a method and a device for determining a key frame in visual positioning.
Background
The visual positioning technology is a technology for positioning an object by using a video stream acquired by a camera, and specifically, the video stream can be acquired by using the camera on a vehicle, and the moving track of the vehicle is determined by using the positions of key points in the video stream in a multi-frame image, wherein the key points are usually static points in a scene. However, in the video stream, if each frame is stored, the amount of global data is huge, and if the moving speed of the vehicle is slow or stops, a large amount of repeated information exists in the video stream, and how to reduce the amount of data and save the computing resources is an important problem for those skilled in the art.
Disclosure of Invention
In order to solve the above technical problems, embodiments of the present application provide a method and an apparatus for determining a key frame in visual positioning, so that redundant information is reduced, data amount is reduced, and computational resources are saved.
The embodiment of the application provides a method for determining key frames in visual positioning, where a video stream has a plurality of image frames arranged in time sequence and the video stream contains a first key frame. The method comprises the following steps:
determining a first target image frame in the video stream that is located after the first keyframe;
determining the first target image frame as a second key frame if the first target image frame meets at least one of the following conditions:
the actual parallax of the first target image frame and the first key frame is greater than or equal to a first preset parallax, the number of matched feature points in the first key frame and the first target image frame is less than or equal to a first preset value, and the frame number difference between the first key frame and the first target image frame is greater than or equal to a second preset value;
and calculating the actual parallax of the first target image frame and the first key frame according to the distance between the first key frame and the matched feature points in the first target image frame.
Optionally, the method further includes:
acquiring a second target image frame positioned after the second key frame in the video stream;
if the second target image frame meets at least one of the following conditions, determining that the second target image frame is a third key frame:
the number of matched feature points in the second key frame and the second target image frame is less than or equal to a third preset value, the frame number difference between the second key frame and the second target image frame is greater than or equal to a fourth preset value, the actual parallax between the second target image frame and the second key frame is greater than or equal to a second preset parallax, and the actual parallax between the second target image frame and the first key frame is greater than or equal to a third preset parallax;
and calculating the actual parallax between the second target image frame and the second key frame according to the distance between the second key frame and the matched feature points in the second target image frame.
Optionally, the feature points in the first keyframe and the first target image frame are detected by an ORB algorithm.
Optionally, the video stream is acquired by a camera on a vehicle, the preset parallax corresponds to the first target image frame, and the preset parallax c' is represented by the following formula:
c′ = w1|Δx| + w2|Δy| + w3|Δz|,
wherein Δx is a translation distance of the vehicle in a first direction, Δy is a translation distance of the vehicle in a second direction, and Δz is a rotation angle of the vehicle around a third direction; the first direction, the second direction and the third direction are the three coordinate axis directions of a three-dimensional rectangular coordinate system, the third direction being the vertical direction; Δx, Δy and Δz are determined according to the poses of the vehicle in the first key frame and the first target image frame, and those poses are determined according to the relative positions of the matched feature points; w1, w2 and w3 are the weights of Δx, Δy and Δz, respectively, and each is a number from 0 to 1.
Optionally, the first key frame is an image frame of the first frame.
The embodiment of the application provides a device for determining key frames in visual positioning, where a video stream has a plurality of image frames arranged in time sequence and the video stream contains a first key frame. The device comprises:
a key frame determination unit, configured to determine the first target image frame as a second key frame if the first target image frame satisfies at least one of the following conditions:
the actual parallax of the first target image frame and the first key frame is greater than or equal to a first preset parallax, the number of matched feature points in the first key frame and the first target image frame is less than or equal to a first preset value, and the frame number difference between the first key frame and the first target image frame is greater than or equal to a second preset value;
and calculating the actual parallax of the first target image frame and the first key frame according to the distance between the first key frame and the matched feature points in the first target image frame.
Optionally, the frame determining unit is further configured to: acquiring a second target image frame positioned after the second key frame in the video stream;
the key frame determination unit is further configured to:
if the second target image frame meets at least one of the following conditions, determining that the second target image frame is a third key frame:
the number of matched feature points in the second key frame and the second target image frame is less than or equal to a third preset value, the frame number difference between the second key frame and the second target image frame is greater than or equal to a fourth preset value, the actual parallax between the second target image frame and the second key frame is greater than or equal to a second preset parallax, and the actual parallax between the second target image frame and the first key frame is greater than or equal to a third preset parallax;
and calculating the actual parallax between the second target image frame and the second key frame according to the distance between the second key frame and the matched feature points in the second target image frame.
Optionally, the feature points in the first keyframe and the first target image frame are detected by an ORB algorithm.
Optionally, the video stream is acquired by a camera on a vehicle, the preset parallax corresponds to the first target image frame, and the preset parallax c' is expressed by the following formula:
c′ = w1|Δx| + w2|Δy| + w3|Δz|,
wherein Δx is a translation distance of the vehicle in a first direction, Δy is a translation distance of the vehicle in a second direction, and Δz is a rotation angle of the vehicle around a third direction; the first direction, the second direction and the third direction are the three coordinate axis directions of a three-dimensional rectangular coordinate system, the third direction being the vertical direction; Δx, Δy and Δz are determined according to the poses of the vehicle in the first key frame and the first target image frame, and those poses are determined according to the relative positions of the matched feature points; w1, w2 and w3 are the weights of Δx, Δy and Δz, respectively, and each is a number from 0 to 1.
Optionally, the first key frame is an image frame of the first frame.
The embodiment of the application provides a method and a device for determining key frames in visual positioning. A video stream has a plurality of image frames arranged in time sequence and contains a first key frame. A first target image frame located after the first key frame in the video stream can be obtained, and the actual parallax between the first target image frame and the first key frame is calculated according to the distance between the matched feature points in the first key frame and the first target image frame. If the actual parallax is greater than or equal to the preset parallax, the number of matched feature points is less than or equal to a first preset value, or the frame number difference is greater than or equal to a second preset value, the difference between the first target image frame and the first key frame is large and the redundant information is small, so the first target image frame can be determined to be a second key frame. Each determined key frame carries little redundant information while ensuring an accurate pose, the amount of stored data is reduced, and computing resources are thereby saved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings can be obtained by those skilled in the art according to the drawings.
Fig. 1 is a flowchart of a method for determining a key frame in visual positioning according to an embodiment of the present disclosure;
fig. 2 is a block diagram of a device for determining a keyframe in visual positioning according to an embodiment of the present application.
Detailed Description
In order to make those skilled in the art better understand the technical solutions of the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
At present, an object may be located by using a video stream acquired by a camera, specifically, a camera on a vehicle may be used to acquire a video stream, and a moving track of the vehicle is determined by using positions of key points in a multi-frame image in the video stream, where the key points are usually static points in a scene. However, in the video stream, if the moving speed of the vehicle is slow, the position difference of the key points in the multi-frame images is small, and storing each frame results in huge global data volume, and meanwhile, the multi-frame images with small position difference of the key points have repeated information, so how to reduce the data volume and save the computing resources is an important problem for those skilled in the art.
Based on this, an embodiment of the present application provides a method and an apparatus for determining a key frame in visual positioning. A video stream has a plurality of image frames arranged in time sequence and contains a first key frame. A first target image frame located after the first key frame in the video stream may be obtained, and the actual parallax between the first target image frame and the first key frame is calculated according to the distance between the matched feature points in the two frames. If the actual parallax is greater than or equal to a preset parallax, the number of matched feature points is less than or equal to a first preset value, or the frame number difference is greater than or equal to a second preset value, the difference between the first target image frame and the first key frame is large and there is little redundant information, so the first target image frame may be determined to be a second key frame. Each determined key frame carries little redundant information while ensuring an accurate pose, and the amount of stored data is reduced, thereby saving computing resources.
The following describes in detail a specific implementation of a method and an apparatus for determining a keyframe in visual positioning according to an embodiment of the present application with reference to the accompanying drawings.
Referring to fig. 1, a flowchart of a method for determining a key frame in visual positioning according to an embodiment of the present application is provided, where the method may include the following steps.
S101, a first target image frame located after a first key frame in the video stream is determined.
In the embodiment of the application, the image frames in the video stream can be analyzed, so that the moving track of the vehicle in the image frames is obtained. The video stream comprises a plurality of image frames which are arranged according to a time sequence, the image frames are acquired by a camera, and the time of the image frames is the time of acquiring the image frames by the camera. Specifically, the video stream may be captured by a camera mounted on the vehicle, and each image frame represents a position of an object around the vehicle relative to the camera, so that a movement track of the vehicle may be obtained based on the position of the surrounding object relative to the camera by using the video stream. The camera that captures the video stream may be a binocular vision system.
Because the video stream is used for realizing visual positioning, only one or a few image frames with smaller parallax in the video stream can be reserved, and the image frames needing to be reserved are used as key frames, so that redundant information is reduced, and the amount of stored data is reduced.
In this embodiment, a first key frame may be determined in a video stream, where the first key frame may be an image frame of the first frame or an nth image frame, and then a second key frame, a third key frame, and the like may be determined according to the first key frame. That is, whether a subsequent image frame is a key frame may be determined from a previous key frame, and for convenience of description, the previous key frame is taken as a first key frame and a key frame determined from the first key frame is taken as a second key frame. The previous key frame may be a first key frame in the video stream, or may be a key frame determined according to other key frames in the video stream, and after determining a second key frame according to the first key frame, the second key frame may be used as a new first key frame for determining a new second key frame.
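The sliding-reference scheme described above — each new key frame replaces the previous one as the reference for the frames that follow — can be sketched as a simple loop. This is an illustrative sketch, not the patent's implementation; `qualifies` is a hypothetical stand-in for the parallax, match-count and frame-gap tests described later.

```python
def select_keyframes(frames, qualifies):
    """Return indices of key frames; frame 0 is taken as the first key frame.

    qualifies(key_frame, candidate_frame, frame_gap) -> bool decides whether
    the candidate becomes the next key frame.
    """
    key_indices = [0]
    for i in range(1, len(frames)):
        if qualifies(frames[key_indices[-1]], frames[i], i - key_indices[-1]):
            key_indices.append(i)  # the new key frame becomes the reference
    return key_indices

# Toy example: treat each "frame" as a 1-D position and key on a gap >= 3 units.
frames = [0, 1, 2, 4, 5, 8, 9]
keys = select_keyframes(frames, lambda k, f, gap: abs(f - k) >= 3)
print(keys)  # → [0, 3, 5]
```

The point of the loop is that the test is always made against the most recent key frame, so slow motion (small differences between neighbouring frames) produces few key frames.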
After determining that the video stream has the first key frame, some image frame after the first key frame may be analyzed as the first target image frame to determine whether the first target image frame is the second key frame. The first target image frame may be an image frame after the first key frame, that is, the acquisition time of the first target image frame is later than that of the first key frame, the first target image frame and the first key frame are adjacent image frames, or other image frames may be separated from the first key frame. Specifically, each image frame after the first key frame may be sequentially used as the first target image frame.
S102, calculating the actual parallax between the first target image frame and the first key frame according to the distance between the first key frame and the matched feature points in the first target image frame, and determining that the first target image frame is the second key frame if the actual parallax between the first key frame and the first target image frame is larger than or equal to a first preset parallax.
In the embodiment of the application, the first keyframe and the first target image frame may have matched feature points, the matched feature points are at least one pair of feature points in the first keyframe and the first target image frame, two feature points in the pair of feature points have substantially the same features, and represent points at the same position corresponding to the same object collected by the camera, and the feature points are stationary and easily recognized points in the field of view of the camera, and may be feature points on a stationary vehicle ahead, points on a lane line on a road surface, points of surrounding trees, and the like. The feature points are usually corner points where the change of the gray value of pixels in the image is obvious, and the feature points can be matched according to the description similarity of each feature point.
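Matching by descriptor similarity, as described above, can be illustrated with a brute-force nearest-neighbour search under Hamming distance, which is how binary descriptors such as ORB/BRIEF are typically compared. This is a sketch under assumptions: the function name, the 256-bit descriptor layout and the distance threshold are all chosen for the example, not taken from the patent.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, max_dist=30):
    """desc_a, desc_b: uint8 arrays of shape (n, 32), i.e. 256-bit descriptors.

    Returns a list of (i, j) index pairs: descriptor i in frame A matched
    with its nearest neighbour j in frame B, if close enough in Hamming distance.
    """
    bits_a = np.unpackbits(desc_a, axis=1)  # (n_a, 256) arrays of 0/1 bits
    bits_b = np.unpackbits(desc_b, axis=1)  # (n_b, 256)
    # Hamming distance between every pair of descriptors via broadcasting.
    dist = (bits_a[:, None, :] != bits_b[None, :, :]).sum(axis=2)
    matches = []
    for i in range(dist.shape[0]):
        j = int(dist[i].argmin())
        if dist[i, j] <= max_dist:
            matches.append((i, j))  # a pair of matched feature points
    return matches

# An all-zero descriptor matches the identical descriptor in the other frame.
a = np.zeros((1, 32), dtype=np.uint8)
b = np.vstack([np.full((1, 32), 255, np.uint8), np.zeros((1, 32), np.uint8)])
print(match_descriptors(a, b))  # → [(0, 1)]
```

In practice a library routine (e.g. a brute-force matcher with cross-checking) would be used instead of this explicit loop, but the pairing principle is the same.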
The feature points can be detected by the ORB (Oriented FAST and Rotated BRIEF) algorithm. The ORB algorithm has two parts: Oriented FAST feature point extraction and Rotated BRIEF feature point description. The feature point extraction is developed from the FAST (Features from Accelerated Segment Test) algorithm; Oriented FAST is an improved FAST corner detector that also computes the main direction of each FAST feature point. The feature point description is improved from the BRIEF (Binary Robust Independent Elementary Features) descriptor algorithm; by using the direction information of the key points, the descriptor of the rotated feature can be computed, which gives the feature point rotation invariance. The ORB feature thus combines the FAST feature point detection method with the BRIEF feature descriptor and improves and optimizes them on their original basis.
For example, the first key frame has a first feature point that is the center point of the left wheel of the stationary vehicle ahead, with a first coordinate (x1, y1) in the first key frame; the first target image frame has a second feature point that is the center point of the same left wheel, with a second coordinate (x2, y2) in the first target image frame. If the first feature point and the second feature point have similar descriptors, they can be taken as matched feature points and form a pair of feature points. The first coordinate of the first feature point and the second coordinate of the second feature point represent the movement of the center point of the left wheel of the vehicle ahead relative to the camera, and because the vehicle on which the two feature points lie is a stationary vehicle ahead, the two coordinates represent the moving track of the vehicle carrying the camera.
According to the distance between the matched feature points in the first key frame and the first target image frame, the actual parallax between the first target image frame and the first key frame can be calculated; the actual parallax represents the position difference of the matched feature points between the two frames. The distance between matched feature points may be a Euclidean distance; for example, the Euclidean distance between the first feature point and the second feature point may be expressed as:
d = √((x1 − x2)² + (y1 − y2)²)
the distance between the matched feature points can be calculated, and the average value of the distances between the matched feature points can be used as the actual disparity c of the first target image frame and the first key frame.
In the embodiment of the application, the actual parallax between the first key frame and the first target image frame represents the position difference between their matched feature points. If the actual parallax is large, the position difference between the matched feature points in the two frames is large, so the first target image frame can be taken as the second key frame; the first key frame and the second key frame then store different positions of the matched feature points, which ensures the accuracy of the pose.
The first preset parallax may be a preset value for evaluating the size of the position difference between the first key frame and the matched feature point in the first target image frame, and the first preset parallax may be a uniform value.
When the video stream is captured by a camera on the vehicle, the vehicle moves only within a plane, so the parallax between the first key frame and the first target image frame may arise from translation of the vehicle in the horizontal direction and rotation within the horizontal plane. For convenience of description, a three-dimensional rectangular coordinate system may be established for the vehicle, with the first direction, the second direction and the third direction as its three coordinate axis directions and the third direction vertical. The movement of the vehicle then has three degrees of freedom: the translation distance Δx of the vehicle in the first direction, the translation distance Δy in the second direction, and the rotation angle Δz around the third direction. Δx, Δy and Δz may be determined according to the poses of the vehicle in the first key frame and the first target image frame, and those poses are determined according to the relative positions of the matched feature points.
Therefore, in the embodiment of the application, a first preset parallax matched with the first target image frame can be determined according to the pose of the vehicle in the first key frame and the first target image frame, and when the actual parallax between the first key frame and the first target image frame is greater than or equal to the first preset parallax corresponding to the first target image frame, the first target image frame can be determined to be a second key frame. Specifically, the first preset parallax c' is represented by the following formula:
c′ = w1|Δx| + w2|Δy| + w3|Δz|,
wherein w1, w2 and w3 are the weights of Δx, Δy and Δz, respectively, and each is a number from 0 to 1; their values can be set according to actual conditions. For example, w1, w2 and w3 may each be 0.5. Δx, Δy and Δz may be determined according to the distance between the matched feature points.
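The adaptive threshold c′ = w1|Δx| + w2|Δy| + w3|Δz| from the formula above can be sketched as follows. The 0.5 default weights are the example values given in the text; deriving Δx, Δy, Δz from the two vehicle poses is outside this snippet, and the function name is chosen for the illustration.

```python
def preset_parallax(dx, dy, dz, w1=0.5, w2=0.5, w3=0.5):
    """dx, dy: translation distances of the vehicle in the two horizontal
    directions; dz: rotation angle about the vertical axis.
    Returns the preset parallax c' for the current target image frame."""
    return w1 * abs(dx) + w2 * abs(dy) + w3 * abs(dz)

# Example motion: 2 m forward, 1 m sideways, 0.5 rad of yaw.
print(preset_parallax(2.0, -1.0, 0.5))  # → 1.75
```

Because c′ grows with the vehicle's motion between the two frames, a fast-moving vehicle must accumulate more parallax before a frame is promoted to a key frame, which matches the adaptive intent described above.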
S103, if the number of the matched feature points in the first key frame and the first target image frame is smaller than or equal to a first preset value, determining that the first target image frame is a second key frame.
In the embodiment of the present application, the next key frame may be determined based on the previous key frame according to the number of matched feature points. Specifically, when the number of matched feature points in the first key frame and the first target image frame is less than or equal to a first preset value, many newly extracted feature points are present and the first target image frame needs to be retained, so it may be determined to be the second key frame. When the number of matched feature points is greater than the first preset value, the feature points of the two frames are close to each other, so the first target image frame need not be retained and is not taken as a key frame. The first preset value can be determined according to actual conditions and may be denoted by k.
And S104, if the frame number difference between the first key frame and the first target image frame is greater than or equal to a second preset value, determining that the first target image frame is a second key frame.
In this embodiment, the next key frame may be determined based on the previous key frame according to the frame number difference from the previous key frame. Specifically, when the frame number difference between the first key frame and the first target image frame is greater than or equal to a second preset value, the first target image frame is determined to be a second key frame, where the second preset value may be determined according to an actual situation and may be denoted by T.
In addition, a third key frame may be determined according to the first key frame, or a third key frame may be determined according to the second key frame, where the third key frame is located after the second key frame, and specifically, an actual disparity between the third key frame and the first key frame is greater than an actual disparity between the second key frame and the first key frame. In a specific implementation, a second target image frame located after the second key frame in the video stream may be obtained first.
In this embodiment of the application, when the actual parallax between the second target image frame and the second key frame is greater than or equal to the second preset parallax, it may be determined that the second target image frame is a third key frame; alternatively, when the actual parallax between the second target image frame and the first key frame is greater than or equal to the third preset parallax, the second target image frame is determined to be a third key frame, where the third preset parallax is greater than the first preset parallax. The actual parallax between the second target image frame and the second key frame can be calculated according to the distance between the matched feature points in the second key frame and the second target image frame. The second preset parallax may be determined in a manner that refers to the first preset parallax.
In the embodiment of the present application, the next key frame may also be determined based on the previous key frame according to the number of matched feature points. Specifically, when the number of matched feature points in the second key frame and the second target image frame is less than or equal to a third preset value, many newly extracted feature points are present and the second target image frame needs to be retained, so it may be determined to be the third key frame. When the number of matched feature points is greater than the third preset value, the feature points of the two frames are close to each other, so the second target image frame need not be retained and is not taken as a key frame. The third preset value may be determined according to actual conditions; it may be the same as the first preset value and may be denoted by k.
In the embodiment of the present application, a next key frame may be determined based on a previous key frame according to a frame number difference from the previous key frame. Specifically, when the frame number difference between the second keyframe and the second target image frame is greater than or equal to a fourth preset value, the second target image frame is determined as a third keyframe, where the fourth preset value may be determined according to an actual situation, and the fourth preset value may be the same as the second preset value and may be represented by T.
That is, for the second target image frame, at least one of the following conditions is satisfied, and it may be regarded as the third key frame: 1) The actual parallax between the second target image frame and the second key frame is larger than or equal to a second preset parallax; 2) The actual parallax between the second target image frame and the first key frame is larger than or equal to a third preset parallax; 3) The number of the matched feature points in the second key frame and the second target image frame is less than or equal to a third preset value; 4) The difference in frame number between the second keyframe and the second target image frame is greater than or equal to a fourth preset value. The determination of whether the second target image frame satisfies the four conditions may be performed in any order, which is not limited herein.
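The four conditions listed above can be combined into a single predicate: the target frame becomes a key frame when any one of them holds. This is a hedged sketch, not the patent's implementation; the symbols k (match-count threshold) and T (frame-gap threshold) follow the text, while the parameter names are chosen for the example.

```python
def is_new_keyframe(parallax_to_prev, parallax_to_first, n_matches,
                    frame_gap, preset_prev, preset_first, k, T):
    """True if the candidate frame should become the next key frame."""
    return (parallax_to_prev >= preset_prev        # 1) parallax vs. previous key frame
            or parallax_to_first >= preset_first   # 2) parallax vs. earlier key frame
            or n_matches <= k                      # 3) too few matched feature points
            or frame_gap >= T)                     # 4) too many frames since last key frame

# Example: only the frame-gap condition (4) fires.
print(is_new_keyframe(1.0, 1.5, 120, 30, 2.0, 3.0, k=50, T=25))  # → True
```

As the text notes, the four tests may be evaluated in any order; short-circuit `or` simply stops at the first condition that holds.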
In this embodiment of the present application, the third key frame determined above may in turn be used as a new first key frame to determine a new second key frame; the repeated details are not described again here.
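The rolling update above, where each newly selected key frame becomes the reference for the next decision, can be sketched as a greedy sweep over the stream. Here `is_key_frame` stands in for whatever combination of the parallax, match-count, and frame-gap tests is used; the names are illustrative:

```python
def select_key_frames(frames, is_key_frame):
    """Greedy key-frame sweep: the most recent key frame is the
    reference against which each later frame is tested."""
    key_frames = [0]  # the first image frame is taken as the first key frame
    ref = 0
    for i in range(1, len(frames)):
        if is_key_frame(frames[ref], frames[i]):
            key_frames.append(i)
            ref = i  # the new key frame becomes the new reference
    return key_frames
```

With a toy predicate that fires every three frames, a ten-frame stream yields key frames at indices 0, 3, 6, and 9.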
The embodiment of the application provides a method for determining a key frame in visual positioning. A video stream has a plurality of image frames arranged in time sequence and contains a first key frame. A first target image frame located after the first key frame in the video stream is obtained, and the actual parallax between the first target image frame and the first key frame is calculated according to the distances between the matched feature points in the first key frame and the first target image frame. If the actual parallax is greater than or equal to a preset parallax, the number of matched feature points is less than or equal to a first preset value, or the frame number difference is greater than or equal to a second preset value, the first target image frame differs substantially from the first key frame and carries little redundant information, so the first target image frame may be determined to be a second key frame. Each key frame so determined carries low redundant information while ensuring an accurate pose, which reduces the amount of stored data and saves computing resources.
Based on the foregoing method for determining a key frame in visual positioning, an embodiment of the present application further provides a device for determining a key frame in visual positioning, and referring to fig. 2, a block diagram of a device for determining a key frame in visual positioning provided in an embodiment of the present application is shown, where a video stream has a plurality of image frames arranged in a time sequence, the video stream has a first key frame, and the device includes:
a frame determination unit 110 for determining a first target image frame located after the first key frame in the video stream;
a key frame determining unit 120, configured to determine that the first target image frame is a second key frame if at least one of the following conditions is met:
the actual parallax of the first target image frame and the first key frame is greater than or equal to a first preset parallax, the number of matched feature points in the first key frame and the first target image frame is less than or equal to a first preset value, and the frame number difference between the first key frame and the first target image frame is greater than or equal to a second preset value;
and calculating the actual parallax of the first target image frame and the first key frame according to the distance between the first key frame and the matched feature points in the first target image frame.
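The patent computes the actual parallax from the distances between matched feature points. One plausible reading, shown here as an assumption rather than the patent's exact definition, is the mean pixel displacement of the matched pairs:

```python
import math

def actual_parallax(pts_key, pts_target):
    """Mean Euclidean distance between matched feature point pairs.

    pts_key and pts_target are equal-length sequences of (x, y) pixel
    coordinates, where pts_key[i] matches pts_target[i].
    """
    if len(pts_key) != len(pts_target) or not pts_key:
        raise ValueError("need equally sized, non-empty point lists")
    total = sum(math.hypot(xt - xk, yt - yk)
                for (xk, yk), (xt, yt) in zip(pts_key, pts_target))
    return total / len(pts_key)
```

A large mean displacement indicates significant camera motion between the key frame and the candidate frame, which is exactly when the candidate is worth keeping.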
Optionally, the frame determining unit is further configured to: acquiring a second target image frame positioned after the second key frame in the video stream;
the key frame determination unit is further configured to:
if the second target image frame meets at least one of the following conditions, determining that the second target image frame is a third key frame:
the number of matched feature points in the second key frame and the second target image frame is less than or equal to a third preset value, the frame number difference between the second key frame and the second target image frame is greater than or equal to a fourth preset value, the actual parallax between the second target image frame and the second key frame is greater than or equal to a second preset parallax, and the actual parallax between the second target image frame and the first key frame is greater than or equal to a third preset parallax;
and calculating the actual parallax between the second target image frame and the second key frame according to the distance between the second key frame and the matched feature points in the second target image frame.
Optionally, the feature points in the first keyframe and the first target image frame are detected by an ORB algorithm.
Optionally, the video stream is acquired by a camera on a vehicle, the preset parallax corresponds to the first target image frame, and the preset parallax c′ is expressed by the following formula:

c′ = w₁|Δx| + w₂|Δy| + w₃|Δz|,

where Δx is the translation distance of the vehicle along a first direction, Δy is the translation distance of the vehicle along a second direction, and Δz is the rotation angle of the vehicle around a third direction; the first direction, the second direction, and the third direction are the three coordinate axis directions of a three-dimensional rectangular coordinate system, the third direction being the vertical direction. Δx, Δy, and Δz are determined according to the poses of the vehicle in the first key frame and the first target image frame, and those poses are determined according to the relative positions of the matched feature points. w₁, w₂, and w₃ are the weights of Δx, Δy, and Δz, respectively, each being a number between 0 and 1.
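As a worked check of the threshold formula, assuming illustrative weights and pose deltas (none of these numbers come from the patent):

```python
def preset_parallax(dx, dy, dz, w1=0.5, w2=0.3, w3=0.2):
    """c' = w1*|dx| + w2*|dy| + w3*|dz|: a weighted measure of the pose
    change between the first key frame and the first target image frame.
    The weights here are placeholder values, not from the patent."""
    return w1 * abs(dx) + w2 * abs(dy) + w3 * abs(dz)

# e.g. 2 m forward translation, 1 m lateral translation, 0.5 rad of yaw:
c_prime = preset_parallax(2.0, -1.0, 0.5)  # 0.5*2 + 0.3*1 + 0.2*0.5 = 1.4
```

Because the threshold grows with the vehicle's own motion, a fast-moving vehicle demands a proportionally larger actual parallax before a frame qualifies as a key frame.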
Optionally, the first key frame is the first image frame of the video stream.
The embodiment of the application provides a device for determining key frames in visual positioning. A video stream has a plurality of image frames arranged in time sequence and contains a first key frame. A first target image frame located after the first key frame in the video stream is obtained, and the actual disparity between the first target image frame and the first key frame is calculated according to the distances between the matched feature points in the first key frame and the first target image frame. If the actual disparity is greater than or equal to a preset disparity, the number of matched feature points is less than or equal to a first preset value, or the frame number difference is greater than or equal to a second preset value, the first target image frame differs substantially from the first key frame and carries little redundant information, so the first target image frame may be determined to be a second key frame. Each key frame so determined carries low redundant information while ensuring an accurate pose, which reduces the amount of stored data and thus saves computing resources.
As can be seen from the above description of the embodiments, those skilled in the art can clearly understand that all or part of the steps in the above embodiment methods can be implemented by software plus a general hardware platform. Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a storage medium, such as a read-only memory (ROM)/RAM, a magnetic disk, an optical disk, or the like, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network communication device such as a router) to execute the method according to the embodiments or some parts of the embodiments of the present application.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, it is relatively simple to describe, and reference may be made to some descriptions of the method embodiment for relevant points. The above-described embodiments of the apparatus and system are merely illustrative, wherein modules described as separate parts may or may not be physically separate, and parts shown as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The above description is only a preferred embodiment of the present application and is not intended to limit the scope of the present application. It should be noted that, for a person skilled in the art, several modifications and refinements can be made without departing from the application, and these modifications and refinements should also be regarded as the protection scope of the application.

Claims (10)

1. A method for determining a key frame in visual positioning, wherein a video stream has a plurality of image frames arranged in time sequence and the video stream has a first key frame, the method comprising:
determining a first target image frame in the video stream that is located after the first keyframe;
determining the first target image frame as a second key frame if the first target image frame meets at least one of the following conditions:
the actual parallax of the first target image frame and the first key frame is greater than or equal to a first preset parallax, the number of matched feature points in the first key frame and the first target image frame is less than or equal to a first preset value, and the frame number difference between the first key frame and the first target image frame is greater than or equal to a second preset value;
and calculating the actual parallax of the first target image frame and the first key frame according to the distance between the first key frame and the matched feature points in the first target image frame.
2. The method of claim 1, further comprising:
acquiring a second target image frame positioned after the second key frame in the video stream;
if the second target image frame meets at least one of the following conditions, determining that the second target image frame is a third key frame:
the number of the matched feature points in the second key frame and the second target image frame is less than or equal to a third preset value, the frame number difference between the second key frame and the second target image frame is greater than or equal to a fourth preset value, the actual parallax between the second target image frame and the second key frame is greater than or equal to a second preset parallax, and the actual parallax between the second target image frame and the first key frame is greater than or equal to a third preset parallax;
and calculating the actual parallax between the second target image frame and the second key frame according to the distance between the second key frame and the matched feature points in the second target image frame.
3. The method of claim 1, wherein the feature points in the first keyframe and the first target image frame are detected using an ORB algorithm.
4. The method according to any one of claims 1 to 3, wherein the video stream is captured by a camera on a vehicle, the preset disparity corresponds to the first target image frame, and the preset disparity c′ is expressed by the following formula:

c′ = w₁|Δx| + w₂|Δy| + w₃|Δz|,

wherein Δx is a translation distance of the vehicle along a first direction, Δy is a translation distance of the vehicle along a second direction, and Δz is a rotation angle of the vehicle around a third direction; the first direction, the second direction and the third direction are three coordinate axis directions of a three-dimensional rectangular coordinate system, and the third direction is a vertical direction; the Δx, the Δy and the Δz are determined according to poses of the vehicle in the first key frame and the first target image frame, and the poses of the vehicle in the first key frame and the first target image frame are determined according to relative positions of the matched feature points; and the w₁, the w₂ and the w₃ are weights of Δx, Δy and Δz, respectively, each being a number between 0 and 1.
5. The method of any of claims 1-3, wherein the first key frame is the first image frame of the video stream.
6. An apparatus for determining a key frame in visual positioning, wherein a video stream has a plurality of image frames arranged in time sequence and the video stream has a first key frame, the apparatus comprising:
a frame determination unit for determining a first target image frame located after the first key frame in the video stream;
a key frame determining unit, configured to determine that the first target image frame is a second key frame if the first target image frame satisfies at least one of the following conditions:
the actual parallax of the first target image frame and the first key frame is greater than or equal to a first preset parallax, the number of matched feature points in the first key frame and the first target image frame is less than or equal to a first preset value, and the frame number difference between the first key frame and the first target image frame is greater than or equal to a second preset value;
and calculating the actual parallax of the first target image frame and the first key frame according to the distance between the first key frame and the matched feature points in the first target image frame.
7. The apparatus of claim 6,
the frame determination unit is further configured to: acquiring a second target image frame positioned after the second key frame in the video stream;
the key frame determination unit is further configured to: if the second target image frame meets at least one of the following conditions, determining that the second target image frame is a third key frame:
the number of the matched feature points in the second key frame and the second target image frame is less than or equal to a third preset value, the frame number difference between the second key frame and the second target image frame is greater than or equal to a fourth preset value, the actual parallax between the second target image frame and the second key frame is greater than or equal to a second preset parallax, and the actual parallax between the second target image frame and the first key frame is greater than or equal to a third preset parallax;
and calculating the actual parallax between the second target image frame and the second key frame according to the distance between the second key frame and the matched feature points in the second target image frame.
8. The apparatus of claim 6, wherein the feature points in the first keyframe and the first target image frame are detected using an ORB algorithm.
9. The apparatus according to any one of claims 6-8, wherein the video stream is captured by a camera on a vehicle, the preset disparity corresponds to the first target image frame, and the preset disparity c′ is expressed by the following formula:

c′ = w₁|Δx| + w₂|Δy| + w₃|Δz|,

wherein Δx is a translation distance of the vehicle along a first direction, Δy is a translation distance of the vehicle along a second direction, and Δz is a rotation angle of the vehicle around a third direction; the first direction, the second direction and the third direction are three coordinate axis directions of a three-dimensional rectangular coordinate system, and the third direction is a vertical direction; the Δx, the Δy and the Δz are determined according to poses of the vehicle in the first key frame and the first target image frame, and the poses of the vehicle in the first key frame and the first target image frame are determined according to relative positions of the matched feature points; and the w₁, the w₂ and the w₃ are weights of Δx, Δy and Δz, respectively, each being a number between 0 and 1.
10. The apparatus of any of claims 6-8, wherein the first key frame is the first image frame of the video stream.
CN202111021732.5A 2021-09-01 2021-09-01 Method and device for determining key frame in visual positioning Pending CN115761558A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111021732.5A CN115761558A (en) 2021-09-01 2021-09-01 Method and device for determining key frame in visual positioning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111021732.5A CN115761558A (en) 2021-09-01 2021-09-01 Method and device for determining key frame in visual positioning

Publications (1)

Publication Number Publication Date
CN115761558A true CN115761558A (en) 2023-03-07

Family

ID=85332192

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111021732.5A Pending CN115761558A (en) 2021-09-01 2021-09-01 Method and device for determining key frame in visual positioning

Country Status (1)

Country Link
CN (1) CN115761558A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116704403A (en) * 2023-05-11 2023-09-05 杭州晶彩数字科技有限公司 Building image vision identification method and device, electronic equipment and medium

Similar Documents

Publication Publication Date Title
CN112435325B (en) VI-SLAM and depth estimation network-based unmanned aerial vehicle scene density reconstruction method
CN110322500B (en) Optimization method and device for instant positioning and map construction, medium and electronic equipment
CN109166149B (en) Positioning and three-dimensional line frame structure reconstruction method and system integrating binocular camera and IMU
Lim et al. Real-time image-based 6-dof localization in large-scale environments
CN107329962B (en) Image retrieval database generation method, and method and device for enhancing reality
CN108519102B (en) Binocular vision mileage calculation method based on secondary projection
CN110853075A (en) Visual tracking positioning method based on dense point cloud and synthetic view
EP3293700B1 (en) 3d reconstruction for vehicle
CN111127524A (en) Method, system and device for tracking trajectory and reconstructing three-dimensional image
CN110349212B (en) Optimization method and device for instant positioning and map construction, medium and electronic equipment
WO2023016271A1 (en) Attitude determining method, electronic device, and readable storage medium
CN111340922A (en) Positioning and mapping method and electronic equipment
CN112115980A (en) Binocular vision odometer design method based on optical flow tracking and point line feature matching
WO2019057197A1 (en) Visual tracking method and apparatus for moving target, electronic device and storage medium
CN112785705B (en) Pose acquisition method and device and mobile equipment
CN110070578B (en) Loop detection method
WO2019175532A1 (en) Urban environment labelling
Jung et al. Object detection and tracking-based camera calibration for normalized human height estimation
CN112991441A (en) Camera positioning method and device, electronic equipment and storage medium
CN115761558A (en) Method and device for determining key frame in visual positioning
CN113592706A (en) Method and device for adjusting homography matrix parameters
CN112085842B (en) Depth value determining method and device, electronic equipment and storage medium
Long et al. Detail preserving residual feature pyramid modules for optical flow
Wong et al. Single camera vehicle localization using feature scale tracklets
JP2023065296A (en) Planar surface detection apparatus and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230627

Address after: No. 203, Shanghai Songjiang Road, No. 201563, Pudong New Area

Applicant after: SAIC Motor Corp.,Ltd.

Applicant after: Shanghai automotive industry (Group) Co.,Ltd.

Address before: Room 509, building 1, 563 Songtao Road, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai, 201203

Applicant before: SAIC Motor Corp.,Ltd.
