CN115014324A - Positioning method, device, medium, equipment and vehicle - Google Patents

Positioning method, device, medium, equipment and vehicle

Info

Publication number
CN115014324A
Authority
CN
China
Prior art keywords
real
positioning
pose
time
camera
Prior art date
Legal status
Pending
Application number
CN202210606914.7A
Other languages
Chinese (zh)
Inventor
赵楠
王亚慧
张丹
Current Assignee
Uisee Technologies Beijing Co Ltd
Original Assignee
Uisee Technologies Beijing Co Ltd
Priority date
Filing date
Publication date
Application filed by Uisee Technologies Beijing Co Ltd
Priority to CN202210606914.7A
Publication of CN115014324A

Classifications

    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01C - MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 21/00 - Navigation; Navigational instruments not provided for in groups G01C 1/00 - G01C 19/00
    • G01S - RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 19/00 - Satellite radio beacon positioning systems; Determining position, velocity or attitude using signals transmitted by such systems
    • G01S 19/38 - Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system
    • G01S 19/39 - Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system, the system transmitting time-stamped messages, e.g. GPS [Global Positioning System], GLONASS [Global Orbiting Navigation Satellite System] or GALILEO
    • G01S 19/42 - Determining position
    • G01S 19/53 - Determining attitude

Abstract

The disclosure relates to a positioning method, apparatus, medium, device and vehicle. The method includes: acquiring a prior three-dimensional map of the current scene, acquiring a real-time reference pose of the target object to be positioned, and acquiring real-time images in corresponding orientations captured by at least two cameras carried by the target object to be positioned; performing data fusion based on the at least two real-time images to obtain camera fusion data; and determining the real-time pose of the target object to be positioned based on the prior three-dimensional map, the real-time reference pose and the camera fusion data. In this way, real-time images of the target object to be positioned are obtained through at least two cameras, the real-time images are fused into camera fusion data, and the real-time pose of the target object to be positioned is determined from the prior three-dimensional map, the real-time reference pose and the camera fusion data. This avoids positioning loss caused by view-angle changes, occlusion or mismatching of a single camera, effectively improves the fault tolerance, and improves positioning accuracy and robustness.

Description

Positioning method, device, medium, equipment and vehicle
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a positioning method, apparatus, medium, device, and vehicle.
Background
In recent years, the demand for automatic driving has been increasing day by day. Unmanned driving plays an important role in fields such as warehouse logistics and inspection monitoring, where long-term stable operation in a relatively fixed environment and accurate self-positioning are required. Accurate positioning is the most important prerequisite of unmanned driving and provides the technical support for perception, prediction, planning and control. Because vision sensors are inexpensive to produce and acquire a large amount of information, vision-based positioning methods have been widely studied and applied.
Traditional visual positioning methods that estimate the camera fundamental matrix from feature point matching, as well as relocation methods based on a known map, are easily affected by view-angle changes, dynamic occlusion and similar problems, resulting in frequent tracking-loss events and a low positioning fault tolerance.
Disclosure of Invention
To solve the technical problem or at least partially solve the technical problem, the present disclosure provides a positioning method, apparatus, medium, device, and vehicle.
In a first aspect, the present disclosure provides a positioning method, including:
acquiring a prior three-dimensional map of a current scene, acquiring a real-time reference pose of a target object to be positioned, and acquiring real-time images in corresponding directions acquired by at least two paths of cameras carried by the target object to be positioned;
performing data fusion based on the at least two real-time images to obtain camera fusion data;
and determining the real-time pose of the object to be positioned based on the prior three-dimensional map, the real-time reference pose and the camera fusion data.
Optionally, before acquiring the prior three-dimensional map of the current scene, the method further includes:
and constructing the prior three-dimensional map.
Optionally, the acquiring a real-time reference pose of the target object to be positioned includes:
acquiring at least one of a real-time reference pose of an object to be positioned under a global positioning system, a real-time reference pose under a Beidou satellite navigation system, a real-time reference pose under a Galileo satellite navigation system and a real-time reference pose under the global navigation satellite system;
the real-time reference pose is acquired by adopting a corresponding reference pose sensor, and the reference pose sensor and the at least two cameras are synchronously triggered.
Optionally, the obtaining the camera fusion data includes:
performing data preprocessing on the real-time images corresponding to the at least two paths of cameras to obtain corresponding single-camera feature point data;
constructing camera fusion data based on the single-camera feature point data corresponding to at least two cameras at the same time;
the real-time pose of the target object to be positioned is determined, and the method comprises the following steps:
triggering state bit judgment based on the camera fusion data, and carrying out initial positioning by combining a corresponding positioning mode associated with a judgment result;
and determining to obtain the real-time pose of the target object to be positioned when the preset positioning condition is met based on the positioning results in different positioning modes.
Optionally, the performing data preprocessing on the real-time images corresponding to the at least two cameras to obtain corresponding single-camera feature point data includes:
aiming at the corresponding real-time image of each camera:
constructing an image pyramid based on the real-time image;
constructing interest points of each layer based on the image pyramid;
extracting feature points and determining descriptors of the feature points based on the interest points of each layer to obtain single-camera feature point data;
the method for constructing camera fusion data based on the single-camera feature point data corresponding to the at least two cameras at the same time comprises the following steps:
and constructing camera fusion data based on the feature points corresponding to the at least two cameras at the same time and the descriptors corresponding to the feature points.
Optionally, the triggering, based on the camera fusion data, the status bit determination, and performing initial positioning in combination with a corresponding positioning mode associated with a determination result includes:
triggering a status bit judgment based on the camera fusion data;
when the state bit judgment result is passed, positioning by adopting a constant-speed tracking mode, and then positioning again by adopting a tracking local map mode;
and when the state bit judgment result is lost, positioning by adopting a repositioning mode, and then positioning again by adopting a tracking local map mode.
Optionally, the initial state of the status bit is lost;
the method further comprises the following steps:
and confirming that the judgment result of the status bit is lost aiming at the first frame data.
Optionally, the prior three-dimensional map includes three-dimensional point features, a key frame sequence number, and position information of a key frame in a camera coordinate system;
positioning by adopting the relocation mode comprises the following steps:
respectively extracting key frames with the position similarity of the real-time images corresponding to the at least two cameras of the current frame being greater than a preset similarity threshold from the prior three-dimensional map based on the real-time reference poses corresponding to the real-time images of the at least two cameras of the current frame to obtain a candidate key frame sequence;
matching each key frame in the sequence with the corresponding real-time image based on the candidate key frame sequence, and determining the candidate key frame with the matching degree larger than a preset matching degree threshold value to obtain an expected candidate key frame;
establishing a multi-camera model based on the expected candidate key frames corresponding to the at least two cameras, introducing an OpenGv library, and calling a target function and a target algorithm to obtain the positioning pose of the target object to be positioned in the repositioning mode.
Optionally, after obtaining the position and posture of the object to be positioned in the repositioning mode, the method further includes:
taking the obtained pose of the target object to be positioned as an initial value, and establishing an error function for each path of camera by using the minimum reprojection error of the matching points;
establishing a total error function aiming at least two paths of cameras based on the error function of each path of camera;
and optimizing the total error function to obtain the optimized pose of the target object to be positioned in the repositioning mode.
Optionally, the positioning using the constant-speed tracking mode includes:
acquiring the poses of the at least two cameras in a current frame, the real-time pose of the target object to be positioned in the previous frame and the constant speed of the target object to be positioned; wherein the constant speed is determined based on the real-time pose of the previous frame and the real-time pose of the current frame;
determining an initial position and posture value of the target object to be positioned in the current frame based on the real-time position and posture of the previous frame and the constant speed;
projecting the feature points of the previous frame into the image corresponding to the pose initial value of the current frame, searching a target area in the image corresponding to the pose initial value of the current frame, and screening to obtain points to be matched;
and when the number of the points to be matched is larger than a preset number threshold, performing pose optimization to obtain the pose of the target object to be positioned in the constant-speed tracking mode.
Optionally, the performing the positioning again by using the tracking local map mode includes:
searching a key frame sequence which is viewed with the current frame in the prior three-dimensional map;
judging whether to update the existing local prior map based on the reference key frame having the most common-view points with the current frame in the key frame sequence and on the number of its common-view points;
when the local prior map is judged to be updated, determining the number of corresponding inner points obtained by statistics of all paths of cameras under the first positioning pose obtained by adopting a constant-speed tracking mode or a repositioning mode;
judging whether the number of the inner points is smaller than a preset inner point threshold value or not;
when the number of the interior points is judged to be smaller than a preset interior point threshold value, screening out a candidate key frame closest to the current frame by using reference position information, adding the candidate key frame to a local prior map, and adding a reference key frame with the most common view points with the current frame to the local prior map;
when the number of the interior points is judged to be equal to or larger than a preset interior point threshold value, adding the reference key frame with the most common viewpoints with the current frame to the local prior map;
updating a key frame corresponding to the current frame in the local prior map;
and updating the three-dimensional point characteristics in the local prior map.
Optionally, when it is determined not to update the local prior map, and after the updating of the three-dimensional point feature in the local prior map, the method further includes:
performing feature point matching on the basis of the current frame and a local prior map to obtain a matching point pair;
and constructing and optimizing a reprojection error based on the matching point pairs to obtain the pose of the target object to be positioned in the local map tracking mode.
Optionally, the obtaining a real-time pose of the target object to be positioned when determining that a preset positioning condition is met based on the positioning results in different positioning modes includes:
counting the number of interior points based on the positioning results in the different positioning modes;
when the number of interior points is smaller than a preset number threshold, the status bit is set to lost, and the real-time pose is obtained by re-positioning based on the prior three-dimensional map, the real-time reference pose and the at least two real-time images;
and when the number of interior points is not less than the preset number threshold, the status bit is set to pass, and the initial positioning pose is determined as the real-time pose.
In a second aspect, the present disclosure also provides a positioning device, the device comprising:
the data acquisition module is used for acquiring a prior three-dimensional map of a current scene, acquiring a real-time reference pose of the target object to be positioned and acquiring real-time images under corresponding directions acquired by at least two paths of cameras carried by the target object to be positioned;
the data fusion module is used for carrying out data fusion on the basis of the at least two real-time images to obtain camera fusion data;
and the pose determination module is used for determining the real-time pose of the target object to be positioned based on the prior three-dimensional map, the real-time reference pose and the camera fusion data.
In a third aspect, the present disclosure also provides a computer-readable storage medium storing a computer program for performing the steps of any one of the above methods.
In a fourth aspect, the present disclosure also provides an on-vehicle positioning apparatus, including:
a processor;
a memory for storing the processor-executable instructions;
the processor is used for reading the executable instructions from the memory and executing the executable instructions to realize any one of the methods.
Optionally, the vehicle-mounted positioning device further comprises: at least two cameras disposed in different orientations;
the camera is configured to take real-time images in corresponding orientations.
In a fifth aspect, the present disclosure further provides a vehicle, including any one of the above vehicle-mounted positioning apparatuses.
Compared with the prior art, the technical scheme provided by the disclosure has the following advantages:
the disclosure provides a positioning method, a positioning device, a positioning medium, a positioning device and a vehicle, wherein the method comprises the following steps: acquiring a prior three-dimensional map of a current scene, acquiring a real-time reference pose of an object to be positioned, and acquiring real-time images under corresponding directions acquired by at least two paths of cameras carried by the object to be positioned; performing data fusion based on at least two real-time images to obtain camera fusion data; and determining the real-time pose of the target object to be positioned based on the prior three-dimensional map, the real-time reference pose and the camera fusion data. Therefore, the real-time image of the object to be positioned is obtained through at least two cameras, data fusion is carried out on the real-time image to obtain camera fusion data, the real-time pose of the object to be positioned is determined based on the prior three-dimensional map, the real-time reference pose and the camera fusion data, and the problem that positioning is lost due to the problems of visual angle change, shielding or mismatching and the like of a single camera is solved, so that the fault-tolerant rate is effectively improved, and the positioning accuracy and the robustness are improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
In order to more clearly illustrate the embodiments or technical solutions in the prior art of the present disclosure, the drawings used in the description of the embodiments or prior art will be briefly described below, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.
Fig. 1 is a schematic flow chart of a positioning method according to an embodiment of the present disclosure;
fig. 2 is a schematic flow chart of another positioning method provided in the embodiment of the present disclosure;
fig. 3 is a schematic flowchart of another positioning method provided in the embodiment of the present disclosure;
fig. 4 is a detailed flowchart of S321 and S322 in the positioning method shown in fig. 3;
fig. 5 is a detailed flowchart of S331 in the positioning method shown in fig. 3;
fig. 6 is a detailed flowchart of "positioning using the relocation mode" in the positioning method shown in fig. 5;
FIG. 7 is a detailed flow chart of "positioning using constant-speed tracking mode" in the positioning method shown in FIG. 5;
FIG. 8 is a detailed flowchart of "perform re-positioning using tracking local map mode" in the positioning method shown in FIG. 5;
fig. 9 is a detailed flowchart of S332 in the positioning method shown in fig. 3;
fig. 10 is a schematic structural diagram of a positioning device according to an embodiment of the disclosure;
fig. 11 is a schematic structural diagram of an on-vehicle positioning apparatus provided in the embodiment of the present disclosure.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, aspects of the present disclosure will be further described below. It should be noted that the embodiments and features of the embodiments of the present disclosure may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced in other ways than those described herein; it is to be understood that the embodiments disclosed in the specification are only a few embodiments of the present disclosure, and not all embodiments.
Fig. 1 is a schematic flow chart of a positioning method according to an embodiment of the present disclosure. Referring to fig. 1, the method includes:
s110, acquiring a prior three-dimensional map of a current scene, acquiring a real-time reference pose of the object to be positioned, and acquiring real-time images in a corresponding direction acquired by at least two cameras carried by the object to be positioned.
The prior three-dimensional map refers to a reference map established in advance. It may be constructed using RGB cameras, such as monocular or binocular cameras, or using cameras in other forms known to those skilled in the art, which is not limited herein.
The real-time reference pose of the target object to be positioned is acquired by a navigation positioning system and is acquired by a reference pose sensor of the navigation positioning system; the real-time reference pose at least comprises position data and attitude angle data of a target object to be positioned; the Navigation and Positioning System at least includes one of a Global Positioning System (GPS), a BeiDou Navigation Satellite System (BDS), a Galileo Navigation Satellite System (GSNS), a Global Navigation Satellite System (GNSS), and a GLONASS Satellite Positioning System (GLONASS), or other Navigation and Positioning systems known to those skilled in the art, and is not limited herein.
At least two monocular cameras are mounted on the target object to be positioned, and there is no common-view area between the cameras; for example, two cameras with front and rear view angles are mounted on the target object to be positioned and are used to acquire real-time images in front of and behind it, respectively. With this arrangement, the at least two cameras overcome the limited field of view of a single camera and enlarge the observation field of the target object to be positioned; compared with a monocular camera, whose positioning may fail when it is occluded or the view angle changes, an accurate vehicle position can still be output from the other cameras, improving positioning robustness.
The reference pose sensor and the at least two cameras of the navigation positioning system are synchronously triggered, and synchronous data are acquired in a hardware pulse synchronous triggering mode.
And S120, performing data fusion based on the at least two real-time images to obtain camera fusion data.
Specifically, data preprocessing is respectively performed on real-time images acquired by at least two cameras, such as feature point extraction and feature point descriptor calculation; and then fusing the data of all the cameras at the same time to obtain camera fusion data.
And S130, determining the real-time pose of the object to be positioned based on the prior three-dimensional map, the real-time reference pose and the camera fusion data.
Specifically, the real-time reference poses corresponding to the at least two real-time images at the current moment are respectively matched against the prior three-dimensional map, that is, the feature points extracted from the real-time images are matched with the feature points of the prior three-dimensional map, and finally the candidate key frames with the highest matching degree to the at least two real-time images at the current moment are extracted from the prior three-dimensional map; a multi-camera model is established, the OpenGv library is introduced, the coordinate systems of the at least two cameras are unified, the target function and target algorithm in the OpenGv library are called to solve an initial pose of the target object to be positioned, and the initial pose is further optimized to determine the real-time pose of the target object to be positioned.
The embodiment of the disclosure provides a positioning method including: acquiring a prior three-dimensional map of the current scene, acquiring a real-time reference pose of the target object to be positioned, and acquiring real-time images in corresponding orientations captured by at least two cameras carried by the target object to be positioned; performing data fusion based on the at least two real-time images to obtain camera fusion data; and determining the real-time pose of the target object to be positioned based on the prior three-dimensional map, the real-time reference pose and the camera fusion data. In this way, real-time images of the target object to be positioned are obtained through at least two cameras, the real-time images are fused into camera fusion data, and the real-time pose of the target object to be positioned is determined from the prior three-dimensional map, the real-time reference pose and the camera fusion data. This avoids positioning loss caused by view-angle changes, occlusion or mismatching of a single camera, effectively improves the fault tolerance, and improves positioning accuracy and robustness.
In one embodiment, as shown in fig. 2, a schematic flow chart of another positioning method provided in the embodiments of the present disclosure is shown. Referring to fig. 2, before "acquiring a prior three-dimensional map of a current scene", the method further includes:
and S210, constructing a prior three-dimensional map.
The prior three-dimensional map refers to a reference map established in advance, and may be constructed by using, for example, an RGB binocular camera, other cameras, or other cameras known to those skilled in the art, which is not limited herein.
In one embodiment, acquiring a real-time reference pose of the object to be positioned includes: acquiring at least one of a real-time reference pose of the object to be positioned under the global positioning system, a real-time reference pose under the BeiDou satellite navigation system, a real-time reference pose under the Galileo satellite navigation system and a real-time reference pose under a global navigation satellite system; the real-time reference pose is acquired by a corresponding reference pose sensor, and the reference pose sensor and the at least two cameras are triggered synchronously.
Wherein the real-time reference pose is acquired by a reference pose sensor; and the reference pose sensor and the at least two cameras acquire synchronous data in a hardware pulse synchronous triggering mode.
It should be noted that, the embodiment of the present disclosure only exemplarily shows an acquisition path of a real-time reference pose of an object to be positioned, but does not constitute a limitation to the positioning method provided by the embodiment of the present disclosure. In other embodiments, the real-time reference pose of the target object to be positioned may also be obtained by other navigation and positioning systems known to those skilled in the art, which is not limited herein.
In an embodiment, as shown in fig. 3, a schematic flowchart of another positioning method provided in the embodiment of the present disclosure is shown. Referring to fig. 3, "obtaining camera fusion data" in S120 includes:
s321, preprocessing data of the real-time images corresponding to the at least two cameras to obtain corresponding single-camera feature point data.
The real-time images acquired by the at least two cameras are preprocessed separately and the feature point data of each camera are extracted, with the following specific steps: first, an image pyramid is constructed based on the real-time image, and the interest points of each pyramid level are computed; then, feature points are extracted on each level and their descriptors are computed, yielding the corresponding single-camera feature point data.
S322, camera fusion data are constructed based on single-camera feature point data corresponding to at least two cameras at the same time.
Specifically, the single-camera feature point data corresponding to the at least two cameras at the same moment are fused to construct the camera fusion data (i.e., a MultiFrame type); this is analogous to a container for each moment: the data of the at least two cameras at the corresponding moment are placed in the container, and the resulting collection is the camera fusion data, completing the data preparation.
In S130, "determining a real-time pose of the target object to be positioned" includes:
and S331, triggering state bit judgment based on the camera fusion data, and carrying out initial positioning by combining with a corresponding positioning mode associated with a judgment result.
Specifically, the judgment result of the status bit includes two cases, pass (OK) and lost (LOST), and the status bit is initially set to LOST. When the judgment result of the status bit is pass, the constant-speed tracking mode is used for initial positioning; when the judgment result of the status bit is lost, the relocation mode is used for initial positioning.
S332, determining that the real-time pose of the target object to be positioned is obtained when the preset positioning condition is met based on the positioning results in different positioning modes.
The preset positioning condition may be set as whether the number of inliers (interior points) is not less than a preset number threshold: when the number of inliers is not less than the preset number threshold, the preset positioning condition is determined to be met; when the number of inliers is less than the preset number threshold, the preset condition is determined not to be met.
Specifically, according to positioning results in different positioning modes, when the preset positioning conditions are met, the real-time pose of the target object to be positioned is obtained; and when the preset positioning condition is not met, re-positioning is carried out based on the prior three-dimensional map, the real-time reference pose and at least two real-time images to obtain the real-time pose.
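As an illustration of this status-bit logic, the following is a minimal Python sketch; the Status enum, the per-camera inlier counts and the threshold value are assumptions for illustration, not taken from the disclosure.

```python
from enum import Enum

class Status(Enum):
    OK = 0    # previous frame was tracked successfully
    LOST = 1  # tracking lost; relocation is required

INLIER_THRESHOLD = 30  # assumed preset number threshold on the inlier count

def update_status(localization_result):
    """Check the preset positioning condition after initial positioning.

    `localization_result` is assumed to expose the inlier counts produced per
    camera by the initial positioning step and the initial positioning pose.
    """
    num_inliers = sum(localization_result.inliers_per_camera)
    if num_inliers < INLIER_THRESHOLD:
        # Condition not met: set the status bit to LOST so that positioning is
        # performed again from the prior 3D map, the real-time reference pose
        # and the real-time images.
        return Status.LOST, None
    # Condition met: the initial positioning pose is taken as the real-time pose.
    return Status.OK, localization_result.pose
```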
In one embodiment, as shown in fig. 4, a detailed flowchart of S321 and S322 in the positioning method shown in fig. 3 is shown. Referring to fig. 4, S321 "pre-process data of real-time images corresponding to at least two cameras to obtain corresponding single-camera feature point data" includes:
aiming at the corresponding real-time image of each camera:
s411, constructing an image pyramid based on the real-time image.
And S412, constructing interest points of each layer based on the image pyramid.
And S413, extracting the feature points and determining descriptors of the feature points based on the interest points of each layer to obtain single-camera feature point data.
The obtained single-camera feature point data at least comprises feature points and descriptors corresponding to the feature points.
S322 "constructing camera fusion data based on single-camera feature point data corresponding to at least two cameras at the same time", includes:
and S420, constructing camera fusion data based on the feature points corresponding to the at least two cameras at the same time and the descriptors corresponding to the feature points.
In the step, all single-camera feature point data corresponding to at least two paths of cameras at the same moment are classified into one type, and camera fusion data are constructed; the single-camera feature point data at least comprises feature points and descriptors corresponding to the feature points.
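For illustration, the data-preparation step can be sketched in Python with OpenCV's ORB detector standing in for the pyramid-based feature extraction described above; the MultiFrame-like dictionary, the camera identifiers and the pyramid parameters are assumptions.

```python
import cv2

def extract_single_camera_features(image, n_features=1000, n_levels=8, scale_factor=1.2):
    """Build an image pyramid, detect interest points on each level and compute
    descriptors, yielding the single-camera feature point data."""
    orb = cv2.ORB_create(nfeatures=n_features, nlevels=n_levels, scaleFactor=scale_factor)
    keypoints, descriptors = orb.detectAndCompute(image, None)
    return {"keypoints": keypoints, "descriptors": descriptors}

def build_camera_fusion_data(timestamp, images_by_camera):
    """Fuse the single-camera feature point data of all cameras captured at the
    same (synchronously triggered) moment into one record (a MultiFrame-like type)."""
    return {
        "timestamp": timestamp,
        "cameras": {
            cam_id: extract_single_camera_features(img)
            for cam_id, img in images_by_camera.items()
        },
    }

# Example usage with two synchronously triggered cameras (front / rear view):
# fusion_data = build_camera_fusion_data(t, {"front": img_front, "rear": img_rear})
```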
In one embodiment, as shown in fig. 5, a detailed flowchart of S331 in the positioning method shown in fig. 3 is shown. Referring to fig. 5, S331 "triggering the status bit determination based on the camera fusion data and performing the initial positioning by combining the corresponding positioning mode associated with the determination result" includes:
and S510, triggering state bit judgment based on the camera fusion data.
The judgment result of the status bit includes two types, namely pass (OK) and Loss (LOST).
Specifically, determining which positioning mode is adopted when a new frame of real-time image is acquired according to the state bit; if the determination result of the status bit is pass, executing S520; if the determination result of the status bit is lost, S530 is executed.
And S520, when the state bit judgment result is that the current position passes, positioning by adopting a constant-speed tracking mode, and then positioning again by adopting a tracking local map mode.
When the state bit passes, a constant-speed tracking mode is adopted for positioning, and the specific steps are as follows:
(1) The constant speed can be estimated from the pose of the target object to be positioned in the current frame and its real-time pose in the previous frame, according to the following formula:

$$T_{v_{last} v_{cur}} = T_{vw}^{last} \cdot T_{wv}^{cur}$$

wherein $T_{v_{last} v_{cur}}$ is the speed of the current moment relative to the previous frame; $T_{wv}^{last}$ is the pose of the target object to be positioned at the moment corresponding to the previous frame; and $T_{wv}^{cur}$ is the pose of the target object to be positioned at the current moment in the world coordinate system. The corner marks are read from bottom to top, where $w$ denotes the world coordinate system, $v$ denotes the coordinate system of the target object to be positioned, $cur$ denotes the current moment and $last$ denotes the moment corresponding to the previous frame; thus $T_{vw}^{last} = (T_{wv}^{last})^{-1}$ denotes the transformation from the world coordinate system to the coordinate system of the target object to be positioned at the previous-frame moment, and $T_{wv}^{cur}$ denotes the transformation from the coordinate system of the target object to be positioned to the world coordinate system at the current moment.

(2) According to the pose of the target object to be positioned at the moment corresponding to the previous frame and the estimated constant speed, the initial pose value of the target object to be positioned in the current frame is calculated by the following formula:

$$\hat{T}_{wv}^{cur} = T_{wv}^{last} \cdot T_{v_{last} v_{cur}}$$

(3) The initial pose value of the target object to be positioned in the current frame is converted from the world coordinate system to the camera coordinate system by the following formula:

$$\hat{T}_{cam\,w}^{cur} = T_{cam\,v} \cdot \left(\hat{T}_{wv}^{cur}\right)^{-1}$$

wherein the corner mark $cam$ denotes the camera coordinate system and $T_{cam\,v}$ is the extrinsic transformation from the coordinate system of the target object to be positioned to the camera.

(4) The feature points of the previous frame image are projected into the image corresponding to the initial pose value of the current frame, the target area in that image is searched, and the points to be matched are obtained by screening; when the number of points to be matched is greater than a preset number threshold, pose optimization is performed to obtain the pose of the target object to be positioned in the constant-speed tracking mode; at the same time, the number of inliers corresponding to each camera is counted for subsequent updating of the local map.
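Written with 4x4 homogeneous transforms, the constant-speed prediction above can be sketched as follows in Python; the variable names and the object-to-camera extrinsic T_cam_v are assumptions for illustration.

```python
import numpy as np

def estimate_velocity(T_wv_last, T_wv_cur):
    """Speed of the current moment relative to the previous frame:
    T_vlast_vcur = (T_wv_last)^-1 * T_wv_cur."""
    return np.linalg.inv(T_wv_last) @ T_wv_cur

def predict_initial_pose(T_wv_last, velocity):
    """Initial pose value of the current frame: previous-frame pose composed
    with the estimated constant speed."""
    return T_wv_last @ velocity

def to_camera_frame(T_wv_cur_init, T_cam_v):
    """Convert the predicted object pose from the world coordinate system to
    the camera coordinate system using the object-to-camera extrinsic."""
    return T_cam_v @ np.linalg.inv(T_wv_cur_init)
```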
S530, when the state bit judgment result is lost, positioning is carried out by adopting a relocation mode, and then positioning is carried out again by adopting a local map tracking mode.
When the state bit is lost, positioning by adopting a repositioning mode, and the method comprises the following specific steps:
(1) Based on the real-time reference poses corresponding to the real-time images of the at least two cameras of the current frame, candidate key frame sequences similar in position to the real-time images corresponding to the at least two cameras of the current frame are extracted from the prior three-dimensional map; the position similarity between the candidate key frame sequence and the real-time images of the current frame at the at least two cameras is greater than or equal to a preset similarity threshold.
(2) Each key frame in the candidate key frame sequence is matched with the feature points of the corresponding real-time image, and the candidate key frames whose matching degree is greater than a preset matching degree threshold are determined, yielding the expected candidate key frames; the matching degree index is the number of matching points, and when the number of matching points is smaller than the preset matching degree threshold (for example, 15), the candidate key frame is discarded; when the number of matching points is greater than or equal to the preset matching degree threshold, the key frame is determined as a candidate key frame.
(3) A multi-camera model is established based on the expected candidate key frames corresponding to the at least two cameras, the OpenGv library is introduced, the coordinate system of the target object to be positioned is set as the viewpoint, representing the unified coordinate system of the at least two cameras, and the target function and target algorithm in the OpenGv library are called to obtain the positioning pose of the target object to be positioned in the repositioning mode.
It should be noted that, after the constant-speed tracking mode or the repositioning mode, the tracking local map mode is used for positioning again; the purpose of this process is to add new matching points so that the current frame obtains more three-dimensional points, adding more constraints while preventing loss and making the pose of the current frame more accurate.
In one embodiment, the initial state position of the state bit is lost; the method further comprises the following steps: and confirming that the judgment result of the status bit is lost aiming at the first frame data.
For the camera fusion data corresponding to the first frame image, the status bit judgment result is confirmed as lost, and the repositioning mode is entered.
Specifically, when the first frame data is processed, there is no data available for tracking, and at this time, the relocation mode is entered, and the relocation mode is adopted for positioning.
In one embodiment, as shown in fig. 6, a detailed flowchart of "positioning using relocation as mode" in the positioning method shown in fig. 5 is shown. Referring to fig. 6, in the method, a priori three-dimensional map includes three-dimensional point features, a key frame sequence number, and position information of a key frame in a camera coordinate system; positioning in a relocation mode, comprising:
s610, respectively extracting key frames with the position similarity of the real-time images corresponding to the at least two cameras of the current frame being greater than a preset similarity threshold from the prior three-dimensional map based on the real-time reference poses corresponding to the real-time images of the at least two cameras of the current frame to obtain a candidate key frame sequence.
Specifically, according to the real-time reference poses corresponding to the real-time images of the at least two cameras of the current frame, key frames whose position similarity to the real-time images corresponding to the at least two cameras of the current frame is greater than a preset similarity threshold are extracted from the global prior three-dimensional map as the candidate key frame sequence.
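A minimal Python sketch of this candidate retrieval; interpreting "position similarity" as the Euclidean distance between the real-time reference position and the stored keyframe position is an assumption, as are the data layout and the threshold value.

```python
import numpy as np

POSITION_THRESHOLD = 5.0  # assumed threshold (metres) standing in for the similarity threshold

def extract_candidate_keyframes(prior_map_keyframes, reference_position):
    """Select keyframes of the prior 3D map whose stored position is close to
    the real-time reference pose of the current frame's image."""
    candidates = []
    for kf in prior_map_keyframes:  # each kf is assumed to be a dict with 'id' and 'position'
        dist = np.linalg.norm(np.asarray(kf["position"]) - np.asarray(reference_position))
        if dist <= POSITION_THRESHOLD:
            candidates.append(kf)
    return candidates
```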
And S620, matching each key frame in the sequence with the corresponding real-time image based on the candidate key frame sequence, and determining the candidate key frame with the matching degree larger than a preset matching degree threshold value to obtain the expected candidate key frame.
Specifically, each candidate key frame in the candidate key frame sequence is traversed and matched against the feature points of the real-time image corresponding to the current frame: the Euclidean distances between the descriptors of the matching points are calculated, consistency is checked through a histogram, inconsistent matching points are eliminated, and the number of matching points between each candidate key frame of each camera and the current frame is obtained. If the number of matching points is less than a preset matching degree threshold (for example, 15), the key frame is discarded; if it is greater than or equal to the preset matching degree threshold, the EPnP algorithm is set up from the current frame and the matched feature points to solve the real-time pose of the target object to be positioned corresponding to the current frame. BA optimization is then performed on this real-time pose using the obtained matching points, yielding the number of optimized inliers (opti_inliers). If the number of optimized inliers is smaller than a first preset inlier threshold, the key frame is abandoned; if it is larger than a second preset inlier threshold, the key frame is determined as a candidate key frame; if it lies between the first and second preset inlier thresholds (the first being smaller than the second), unmatched feature points in the candidate key frame are projected onto the current frame to add matching points (additional). When the sum of the number of optimized inliers and the number of added matching points is larger than a preset value, a further pose BA optimization is performed, yielding the number of secondary optimized inliers (good); if the number of secondary optimized inliers does not meet the preset condition, the projection is repeated until the preset condition is met; once the condition is met, the key frame is determined as a candidate key frame.
For example, each key frame in the sequence is matched with the corresponding real-time image; consistency is checked by computing the Euclidean distances of the feature-point descriptors and a histogram, inconsistent matching points are eliminated, and the number of matching points between each candidate key frame of each camera and the current frame is obtained. If the number of matching points is less than 15, the key frame is discarded; if it is greater than or equal to 15, the EPnP algorithm is set up from the current frame and the matched feature points to solve the real-time pose of the target object to be positioned corresponding to the current frame. The real-time pose is then optimized using the obtained matching points to yield the number of optimized inliers; if opti_inliers is less than or equal to 10, the key frame is abandoned; if it is greater than or equal to 50, the key frame is determined as a candidate key frame; if 10 < opti_inliers < 50, unmatched feature points in the candidate key frame are projected onto the current frame to add matching points, and if opti_inliers + additional is greater than or equal to 50, pose optimization is performed again to obtain the number of secondary optimized inliers; if 30 < good < 50, the projection is repeated until it succeeds; otherwise the key frame is returned as the candidate key frame that best matches the positions of the real-time images corresponding to the at least two cameras of the current frame.
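A simplified Python sketch of this candidate-keyframe verification, using OpenCV's EPnP solver inside RANSAC as a stand-in for the EPnP plus BA optimization described above; the thresholds 15/10/50 follow the example values, while the input layout and the omitted search-by-projection step are assumptions.

```python
import cv2
import numpy as np

MIN_MATCHES = 15
LOW_INLIERS, HIGH_INLIERS = 10, 50

def verify_candidate_keyframe(points_3d, points_2d, K, dist_coeffs=None):
    """Return (pose, accepted) for one candidate keyframe of one camera.

    points_3d: Nx3 prior-map points matched to the keyframe features.
    points_2d: Nx2 pixel coordinates of the matched current-frame features.
    K: 3x3 intrinsic matrix of the corresponding camera.
    """
    if len(points_3d) < MIN_MATCHES:
        return None, False  # fewer than 15 matching points: discard the keyframe
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(points_3d, dtype=np.float64),
        np.asarray(points_2d, dtype=np.float64),
        K, dist_coeffs, flags=cv2.SOLVEPNP_EPNP)
    if not ok or inliers is None:
        return None, False
    n_inliers = len(inliers)
    if n_inliers <= LOW_INLIERS:
        return None, False            # too few optimized inliers: abandon
    if n_inliers >= HIGH_INLIERS:
        return (rvec, tvec), True     # enough inliers: keep as candidate
    # Between the two thresholds: the method above projects unmatched keyframe
    # features into the current frame to add matches and re-optimizes; that
    # search-by-projection step is omitted in this sketch.
    return (rvec, tvec), False
```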
S630, establishing a multi-camera model based on the expected candidate key frames corresponding to the at least two cameras, introducing the OpenGv library, and calling the target function and the target algorithm to obtain the positioning pose of the target object to be positioned in the repositioning mode.
Specifically, according to the step of determining the candidate key frames, the candidate key frames that best match the positions of the real-time images corresponding to the at least two cameras of the current frame are extracted from the prior three-dimensional map, and feature point matching is performed between the current frame and the candidate key frames to obtain matching points; a multi-camera model is established, the matching points and the extrinsic parameter matrices that unify each camera to the viewpoint are input to the OpenGv library, and the OpenGv library unifies the at least two cameras to the viewpoint coordinate system according to the input values; the API function corresponding to AbsolutePoseSacProblem in the OpenGv library is called, GP3P is selected as the algorithm, the pose of the target object to be positioned corresponding to the current frame is solved, and the solved pose of the target object to be positioned is taken as the initial pose.
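The data preparation for this multi-camera (non-central) absolute-pose solve can be sketched as follows in Python; the final solver call is deliberately left as a comment because the exact OpenGv binding is not specified in the disclosure, and all names here are illustrative.

```python
import numpy as np

def pixel_to_bearing(uv, K):
    """Convert a pixel coordinate into a unit bearing vector in the camera frame."""
    ray = np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])
    return ray / np.linalg.norm(ray)

def build_noncentral_inputs(matches_per_camera, extrinsics_per_camera, intrinsics_per_camera):
    """Express every 2D-3D match of every camera relative to the unified
    'viewpoint' (object) coordinate system, as required by a non-central
    absolute-pose solver.

    matches_per_camera[i]   : list of (uv, point3d_world) pairs for camera i
    extrinsics_per_camera[i]: (R_v_cam, t_v_cam), rotation and offset of camera i
                              in the viewpoint frame (assumed known calibration)
    intrinsics_per_camera[i]: 3x3 intrinsic matrix of camera i
    """
    bearings, cam_offsets, cam_rotations, points = [], [], [], []
    for i, matches in enumerate(matches_per_camera):
        R_v_cam, t_v_cam = extrinsics_per_camera[i]
        K = intrinsics_per_camera[i]
        for uv, p_world in matches:
            bearings.append(pixel_to_bearing(uv, K))
            cam_rotations.append(R_v_cam)
            cam_offsets.append(t_v_cam)
            points.append(np.asarray(p_world))
    return bearings, cam_offsets, cam_rotations, points

# These arrays would then be handed to a GP3P-based RANSAC solver (e.g. the
# AbsolutePoseSacProblem of the OpenGv library with GP3P as the algorithm) to
# obtain the pose of the viewpoint, i.e. of the target object to be positioned.
```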
In one embodiment, as shown in fig. 6, after obtaining the pose where the object to be positioned is positioned in the repositioning mode, the method further includes:
and S640, establishing an error function for each path of camera by using the obtained pose of the target object to be positioned as an initial value and the minimum reprojection error of the matching point.
Specifically, in the repositioning mode, the pose of the target object to be positioned obtained through the OpenGv library is taken as the initial value, and an error function is established for pose optimization by minimizing the reprojection error of the matching points, using the extrinsic parameter matrices that unify each camera to the viewpoint.
The optimization variable is the pose of the target object to be positioned, $T_{vw}$. Define $(u_{ij}, v_{ij})$ as the coordinates of the $j$-th two-dimensional feature point observed by the $i$-th camera, $P_{ij}$ as the three-dimensional point in the prior three-dimensional map matched with $(u_{ij}, v_{ij})$, $K_i$ as the intrinsic parameters of the $i$-th camera, and $T_{iv}$ as the extrinsic transformation from the target object to be positioned to the $i$-th camera. The reprojection error of one feature point is constructed as the error function:

$$e_{ij} = \begin{bmatrix} u_{ij} \\ v_{ij} \end{bmatrix} - \pi\left( K_i\, T_{iv}\, T_{vw}\, P_{ij} \right)$$

where $\pi(\cdot)$ denotes projection onto the image plane.
s650, establishing a total error function aiming at least two paths of cameras based on the error function of each path of camera.
$$E = \sum_{i=1}^{K} \sum_{j=1}^{N_i} e_{ij}^{T}\, \Omega_{ij}\, e_{ij}$$

wherein $E$ is the total error function, established by computing the reprojection errors of all feature points observed by each camera; $K$ is the number of cameras; $N_i$ is the number of all matching point pairs between the feature points observed by the $i$-th camera and the prior three-dimensional map; and $\Omega_{ij}$ is the information matrix of the error.
And S660, optimizing the total error function to obtain the optimized pose of the target object to be positioned in the repositioning mode.
The total error function $E$ is optimized, with the real-time pose of the target object to be positioned as the optimization variable. The optimization problem is a nonlinear least-squares problem; the Levenberg algorithm is used to converge the function, and the optimal solution $T_{vw}^{*}$ of the error function is the optimized pose of the target object to be positioned in the repositioning mode.
It should be noted that, in the embodiment of the present disclosure, the error functions of the cameras are fused to establish a total error function, and the total error function is then optimized to obtain the optimized pose of the target object to be positioned in the repositioning mode; each camera adopts a unified coordinate system during the optimization. This differs from the prior art, in which each camera adopts an independent coordinate system, each camera constructs its own error function, and these functions are optimized separately to obtain the optimized pose of each camera.
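A compact Python sketch of this joint optimization over all cameras, using SciPy's Levenberg-Marquardt least-squares routine on a rotation-vector plus translation parameterization; the parameterization, the helper names and the identity weighting (information matrices omitted) are assumptions, while the error terms follow the formulas above.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def project(K, T_cam_w, p_w):
    """Project a world point into pixel coordinates for one camera."""
    p_cam = (T_cam_w @ np.append(p_w, 1.0))[:3]
    uv = K @ (p_cam / p_cam[2])
    return uv[:2]

def pose_from_params(x):
    """6-DoF parameters (rotation vector, translation) -> 4x4 matrix T_vw."""
    T = np.eye(4)
    T[:3, :3] = Rotation.from_rotvec(x[:3]).as_matrix()
    T[:3, 3] = x[3:]
    return T

def residuals(x, observations):
    """Stack the reprojection errors e_ij of all cameras.

    observations: list of (K_i, T_iv, uv_ij, P_ij) tuples, one per matched
    feature point, with T_iv the object-to-camera-i extrinsic (4x4 matrix)."""
    T_vw = pose_from_params(x)
    res = []
    for K_i, T_iv, uv, P in observations:
        res.extend(np.asarray(uv) - project(K_i, T_iv @ T_vw, P))
    return np.asarray(res)

def optimize_pose(x0, observations):
    """Minimize the total error E with a Levenberg-Marquardt solver, starting
    from the initial pose x0 obtained in the repositioning mode."""
    sol = least_squares(residuals, x0, args=(observations,), method="lm")
    return pose_from_params(sol.x)
```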
In one embodiment, as shown in fig. 7, a detailed flow chart of "positioning using constant-speed tracking mode" in the positioning method shown in fig. 5 is shown. Referring to fig. 7, the constant-speed tracking mode is used for positioning, and includes:
s710, acquiring the poses of the at least two cameras in the current frame, the real-time poses of the target object to be positioned in the previous frame and the constant speed of the target object to be positioned.
Wherein the constant speed is determined based on the real-time pose of the previous frame and the real-time pose of the current frame.
Specifically, the constant speed can be estimated from the pose of the target object to be positioned in the current frame and its real-time pose in the previous frame, according to the following formula:

$$T_{v_{last} v_{cur}} = T_{vw}^{last} \cdot T_{wv}^{cur}$$

wherein $T_{v_{last} v_{cur}}$ is the speed of the current moment relative to the previous frame; $T_{wv}^{last}$ is the pose of the target object to be positioned at the moment corresponding to the previous frame; and $T_{wv}^{cur}$ is the pose of the target object to be positioned at the current moment in the world coordinate system. The corner marks are read from bottom to top, where $w$ denotes the world coordinate system, $v$ denotes the coordinate system of the target object to be positioned, $cur$ denotes the current moment and $last$ denotes the moment corresponding to the previous frame; thus $T_{vw}^{last} = (T_{wv}^{last})^{-1}$ denotes the transformation from the world coordinate system to the coordinate system of the target object to be positioned at the previous-frame moment, and $T_{wv}^{cur}$ denotes the transformation from the coordinate system of the target object to be positioned to the world coordinate system at the current moment.
S720, determining the initial value of the pose of the target object to be positioned in the current frame based on the real-time pose and the constant speed of the previous frame.
Specifically, according to the pose of the target object to be positioned at the moment corresponding to the previous frame and the estimated constant speed, the initial pose value of the target object to be positioned in the current frame is calculated by the following formula:

$$\hat{T}_{wv}^{cur} = T_{wv}^{last} \cdot T_{v_{last} v_{cur}}$$

The initial pose value of the target object to be positioned in the current frame is converted from the world coordinate system to the camera coordinate system by the following formula:

$$\hat{T}_{cam\,w}^{cur} = T_{cam\,v} \cdot \left(\hat{T}_{wv}^{cur}\right)^{-1}$$

wherein the corner mark $cam$ denotes the camera coordinate system and $T_{cam\,v}$ is the extrinsic transformation from the coordinate system of the target object to be positioned to the camera.
And S730, projecting the feature points of the previous frame to the image corresponding to the pose initial value of the current frame, searching a target area in the image corresponding to the pose initial value of the current frame, and screening to obtain points to be matched.
Specifically, the feature points of the previous frame image are projected into the image corresponding to the initial pose value of the current frame, the target area in that image is searched, the Euclidean distances of the descriptors between matching points are calculated, and the best matching points are obtained by histogram screening.
And S740, when the number of the points to be matched is larger than a preset number threshold, performing pose optimization to obtain the pose of the target object to be positioned in the constant-speed tracking mode.
Specifically, when the number of points to be matched is greater than the preset number threshold, reprojection-error-function optimization is performed on all matching points of each camera to obtain the pose of the target object to be positioned in the constant-speed tracking mode; the pose obtained in this way is relatively accurate.
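A minimal Python sketch of this search-by-projection step; the window radius, the descriptor-distance threshold and the data layout are assumptions, and the histogram screening mentioned above is omitted.

```python
import numpy as np

SEARCH_RADIUS = 15.0   # assumed search-window radius in pixels
MAX_DESC_DIST = 50.0   # assumed descriptor Euclidean-distance threshold

def search_by_projection(prev_points_3d, prev_descs, cur_keypoints, cur_descs,
                         K, T_cam_w_pred):
    """Project the previous frame's 3D points with the predicted pose and match
    them to current-frame keypoints inside a search window.

    cur_keypoints: list of (x, y) pixel coordinates of current-frame features."""
    matches = []
    for p_w, d_prev in zip(prev_points_3d, prev_descs):
        p_cam = (T_cam_w_pred @ np.append(p_w, 1.0))[:3]
        if p_cam[2] <= 0:  # point behind the camera
            continue
        uv = (K @ (p_cam / p_cam[2]))[:2]
        best_idx, best_dist = None, MAX_DESC_DIST
        for idx, kp in enumerate(cur_keypoints):
            if np.linalg.norm(np.asarray(kp) - uv) > SEARCH_RADIUS:
                continue
            dist = np.linalg.norm(cur_descs[idx].astype(np.float32)
                                  - d_prev.astype(np.float32))
            if dist < best_dist:
                best_idx, best_dist = idx, dist
        if best_idx is not None:
            matches.append((p_w, cur_keypoints[best_idx]))
    return matches
```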
In one embodiment, as shown in fig. 8, a detailed flow diagram of "performing the positioning again by using the tracking local map mode" in the positioning method shown in fig. 5 is shown. Referring to fig. 8, the relocation is performed in a tracking local map mode, including:
and S810, searching a key frame sequence which is viewed together with the current frame in the prior three-dimensional map.
And S820, judging whether to update the existing local prior map or not based on the reference key frame with the most common view points with the current frame in the key frame sequence and the number of the common view points of the reference key frame.
Specifically, if the reference key frame (ref_kf_id) having the most common-view points with the current frame is the same frame as the reference key frame (last_ref_kf_id) that had the most common-view points with the previous frame, and its number of common-view points is not less than a first preset common-view-point threshold (e.g., 30); or if the previous frame's most co-visible reference key frame is contained in the key frame sequence co-visible with the current frame, its number of common-view points is greater than or equal to a second preset common-view-point threshold (e.g., 0.9 times the number of common-view points of the most co-visible key frame of the current frame), and at the same time its number of common-view points is not less than a third preset common-view-point threshold (e.g., 30), then the current local map is not updated; otherwise, the point clouds observed by the co-visible key frames are added to and updated in the local prior map (a sketch of this decision is given after the step flow below).
It is understood that the embodiment of the present disclosure only exemplarily illustrates that the first preset co-viewpoint number threshold is 30, the second preset co-viewpoint number threshold is 0.9 times of the number of key frame co-viewpoints where the current frame co-viewpoint is the most, and the third preset co-viewpoint number threshold is equal to the first preset co-viewpoint number threshold, but does not constitute a limitation on the positioning method provided by the embodiment of the present disclosure. In other embodiments, values of the first preset common-viewpoint threshold, the second preset common-viewpoint number threshold, and the third preset common-viewpoint number threshold may be flexibly set according to requirements of the positioning method, and are not limited herein.
And S830, when the local prior map is judged to be updated, determining the number of interior points obtained by corresponding statistics of each path of camera under the first positioning pose obtained by adopting a constant-speed tracking mode or a repositioning mode.
And S840, judging whether the number of the inner points is smaller than a preset inner point threshold value.
Specifically, if the counted number of interior points is smaller than the preset interior point threshold, the determination result is yes, and S851 and S852 are executed; if the counted number of interior points is greater than or equal to the preset interior point threshold, the determination result is no, and S852 is executed.
S851, screening out candidate key frames nearest to the current frame by using the reference position information, and adding the candidate key frames to a local prior map.
S852, adding the reference key frame with the most common view point with the current frame to the local prior map.
Specifically, when the number of the determined interior points is smaller than a preset interior point threshold value, screening out a candidate key frame closest to the current frame by using reference position information, and adding point cloud observed by the candidate key frame into a local prior map; and adding the reference key frame with the most common view point with the current frame to the local prior map.
Specifically, when the number of interior points is equal to or greater than a preset threshold of the number of interior points, only the point cloud observed by the key frame having a common-view relationship with the current frame is added to the local prior map.
Therefore, no matter whether the number of the interior points is smaller than the preset interior point threshold value or not, the reference key frame with the most common view points with the current frame is added to the local prior map; if the number of the interior points is less than the preset threshold of the number of the interior points, adding the candidate key frames into the local prior map, and adding the reference key frame with the most common viewpoints with the current frame into the local prior map; and if the number of the interior points is equal to or greater than a preset interior point threshold value, directly adding the reference key frame with the most common view points with the current frame into the local prior map.
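The branching of S840 to S852 may be sketched as follows; the interior point threshold and the object and method names (local_map, observed_points, and so on) are illustrative assumptions rather than part of the described method.

```python
def augment_local_map(local_map, ref_kf, inlier_count, candidate_kf=None, inlier_th=50):
    """Grow the local prior map according to the interior point statistics (S840-S852)."""
    if inlier_count < inlier_th and candidate_kf is not None:
        # S851: too few interior points -- also pull in the candidate key frame
        # nearest to the current frame, screened out with the reference position
        # information, and add its observed point cloud to the local prior map.
        local_map.add_points(candidate_kf.observed_points())
    # S852: in both branches, the reference key frame sharing the most common-view
    # points with the current frame is added to the local prior map.
    local_map.add_points(ref_kf.observed_points())
    return local_map
```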
And S860, updating the key frame corresponding to the current frame in the local prior map.
And S870, updating the three-dimensional point characteristics in the local prior map.
In one embodiment, as shown in fig. 8, when it is determined not to update the local prior map, and after the three-dimensional point feature in the local prior map is updated, the method further includes:
and S880, performing feature point matching based on the current frame and the local prior map to obtain a matching point pair.
Specifically, three-dimensional points in a local prior map are projected onto a current frame, and feature point matching is performed to obtain matching point pairs.
And S890, constructing and optimizing a reprojection error based on the matching point pairs to obtain the pose of the target object to be positioned in the local map tracking mode.
Specifically, based on the matching point pairs, reprojection error function optimization is performed to obtain a relatively accurate pose of the target object to be positioned in the local map tracking mode; the number of interior points is then counted, and if it is smaller than a preset threshold, positioning again in the tracking local map mode fails, the state position of the state bit is set to lost, and repositioning is performed based on a new frame of real-time image.
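A minimal sketch of the interior point statistics and the state-bit fallback after the reprojection error optimization is given below; the pixel error threshold and the interior point threshold are assumed values used only for illustration.

```python
import numpy as np

def check_tracking_result(reproj_residuals_px, inlier_px=5.0, inlier_th=30):
    """Count interior points after optimization and decide the state bit.

    reproj_residuals_px: flat array of per-point (du, dv) reprojection residuals in pixels.
    """
    errors = np.linalg.norm(reproj_residuals_px.reshape(-1, 2), axis=1)
    inliers = int(np.sum(errors < inlier_px))
    if inliers < inlier_th:
        # Tracking the local map failed: mark the state bit as lost so that
        # repositioning is performed on a new frame of real-time image.
        return inliers, "lost"
    return inliers, "pass"
```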
In one embodiment, as shown in fig. 9, a detailed flow diagram of S332 in the positioning method shown in fig. 3 is shown. Referring to fig. 9, S332 "obtaining a real-time pose of the target object to be positioned when determining that the preset positioning condition is satisfied based on the positioning results in different positioning modes" includes:
And S910, counting the number of the inner points based on the positioning results in different positioning modes.
Wherein, a constant-speed tracking mode or a repositioning mode is adopted for initial positioning, and the number of the inner points is counted.
And S920, when the number of the inner points is smaller than a preset number threshold, the state position is lost, and the real-time pose is obtained by re-positioning based on the prior three-dimensional map, the real-time reference pose and at least two real-time images.
And S930, when the number of the inner points is not less than the preset number threshold, the state position is passed, and the initial positioning pose is determined to be a real-time pose.
Specifically, when the number of the inner points is smaller than a preset number threshold, determining that the state position of the state bit is lost, and performing relocation by adopting a relocation mode according to a prior three-dimensional map, a real-time reference pose and at least two real-time images; and when the number of the inner points is not less than the preset number threshold, determining the state position of the state bit as a passing state, and determining the initial positioning pose as a real-time pose.
Based on the same inventive concept, the embodiments of the present disclosure further provide a positioning apparatus, which can perform any of the steps of the positioning method provided in the embodiments of the present disclosure, and has functional modules and beneficial effects corresponding to the execution method, and therefore, in order to avoid repeated descriptions, the description is not repeated here.
Fig. 10 is a schematic structural diagram of a positioning device according to an embodiment of the disclosure. Referring to fig. 10, the apparatus 100 includes: the data acquisition module 101 is configured to acquire a priori three-dimensional map of a current scene, acquire a real-time reference pose of an object to be positioned, and acquire real-time images in a corresponding direction acquired by at least two cameras carried by the object to be positioned; the data fusion module 102 is configured to perform data fusion based on at least two real-time images to obtain camera fusion data; the pose determining module 103 is configured to determine a real-time pose of the object to be positioned based on the prior three-dimensional map, the real-time reference pose, and the camera fusion data.
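A minimal Python sketch of how the three modules of fig. 10 may be composed is given below; the class and method names are assumptions made only for illustration and do not limit the apparatus.

```python
class PositioningDevice:
    """Illustrative composition of the modules shown in fig. 10."""

    def __init__(self, data_acquisition, data_fusion, pose_determination):
        self.data_acquisition = data_acquisition        # module 101
        self.data_fusion = data_fusion                  # module 102
        self.pose_determination = pose_determination    # module 103

    def locate(self):
        # Acquire the prior map, the real-time reference pose, and the real-time images.
        prior_map, ref_pose, images = self.data_acquisition.acquire()
        fused = self.data_fusion.fuse(images)           # camera fusion data
        return self.pose_determination.solve(prior_map, ref_pose, fused)
```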
In one embodiment, the apparatus further comprises: and the map building module is used for building the prior three-dimensional map before obtaining the prior three-dimensional map of the current scene.
In one embodiment, the data acquisition module is configured to acquire a real-time reference pose of the target object to be positioned, including: acquiring at least one of a real-time reference pose of the object to be positioned under the Global Positioning System, a real-time reference pose under the BeiDou satellite navigation system, a real-time reference pose under the Galileo satellite navigation system, and a real-time reference pose under the GLONASS satellite navigation system; the real-time reference pose is acquired by a corresponding reference pose sensor, and the reference pose sensor and the at least two cameras are triggered synchronously.
In one embodiment, the data fusion module is configured to obtain camera fusion data, including: performing data preprocessing on real-time images corresponding to at least two cameras to obtain corresponding single-camera feature point data; and constructing camera fusion data based on single-camera feature point data corresponding to at least two cameras at the same time.
The data fusion module is used for preprocessing data of real-time images corresponding to at least two cameras to obtain corresponding single-camera feature point data, and comprises:
aiming at the corresponding real-time image of each camera: constructing an image pyramid based on the real-time image; constructing interest points of each layer based on the image pyramid; and extracting the feature points and determining descriptors of the feature points based on the interest points of each layer to obtain single-camera feature point data.
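The embodiment does not mandate a particular feature detector or descriptor; as one possible realization, OpenCV's ORB already builds the image pyramid, extracts interest points on each layer, and computes descriptors, as sketched below (the parameter values are illustrative only).

```python
import cv2

def extract_single_camera_features(image_bgr, n_features=1000, n_levels=8, scale=1.2):
    """Pyramid-based feature extraction for the real-time image of one camera.

    ORB is used here only as an example detector/descriptor; the embodiment
    does not require a particular feature type.
    """
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    orb = cv2.ORB_create(nfeatures=n_features, nlevels=n_levels, scaleFactor=scale)
    # detectAndCompute builds the image pyramid, finds interest points on each
    # level, and returns the feature points together with their descriptors.
    keypoints, descriptors = orb.detectAndCompute(gray, None)
    return keypoints, descriptors
```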
The data fusion module is used for constructing camera fusion data based on single-camera feature point data corresponding to at least two cameras at the same moment, and the data fusion module comprises: and constructing camera fusion data based on the feature points corresponding to the at least two cameras at the same time and the descriptors corresponding to the feature points.
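The camera fusion data may, for example, be organized as a simple container that groups the feature points and descriptors of all cameras captured at the same moment; the following dataclass layout is an illustrative assumption, not a required structure.

```python
from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class CameraFusionData:
    """Feature points and descriptors of all cameras captured at the same moment."""
    timestamp: float
    camera_ids: List[int] = field(default_factory=list)
    keypoints: List[Any] = field(default_factory=list)    # one keypoint list per camera
    descriptors: List[Any] = field(default_factory=list)  # one descriptor array per camera

def build_camera_fusion_data(timestamp, per_camera_features):
    """per_camera_features: {camera_id: (keypoints, descriptors)} acquired at the same time."""
    fused = CameraFusionData(timestamp=timestamp)
    for cam_id, (kps, desc) in sorted(per_camera_features.items()):
        fused.camera_ids.append(cam_id)
        fused.keypoints.append(kps)
        fused.descriptors.append(desc)
    return fused
```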
The pose determination module is used for determining the real-time pose of the target object to be positioned, and comprises the following steps: triggering state bit judgment based on the camera fusion data, and carrying out initial positioning by combining a corresponding positioning mode associated with a judgment result; and determining to obtain the real-time pose of the target object to be positioned when the preset positioning condition is met based on the positioning results in different positioning modes.
The pose determination module is configured to trigger the state bit judgment based on the camera fusion data and to perform initial positioning in combination with the positioning mode associated with the judgment result, including: triggering the state bit judgment based on the camera fusion data; when the state bit judgment result is passed, positioning by adopting a constant-speed tracking mode, and then positioning again by adopting a tracking local map mode; and when the state bit judgment result is lost, positioning by adopting a repositioning mode, and then positioning again by adopting a tracking local map mode.
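The state-bit-driven dispatch described above can be summarized by the following sketch; the tracker object and its method names are assumptions used only for illustration.

```python
def initial_positioning(state_bit, tracker):
    """State-bit-driven dispatch of the positioning modes."""
    if state_bit == "pass":
        pose = tracker.track_constant_velocity()   # first positioning
    else:                                          # state bit judgment result is lost
        pose = tracker.relocalize()
    # In either branch, the result is refined by tracking the local map.
    return tracker.track_local_map(pose)
```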
In one embodiment, the initial state position of the state bit is lost; the method further includes: confirming, for the first frame of data, that the state bit judgment result is lost.
In one embodiment, the prior three-dimensional map comprises three-dimensional point features, key frame serial numbers, and position information of the key frames in the camera coordinate system; the pose determination module is configured to perform positioning in the repositioning mode, including: based on the real-time reference poses corresponding to the real-time images of the at least two cameras of the current frame, respectively extracting, from the prior three-dimensional map, key frames whose position similarity with the real-time images of the at least two cameras of the current frame is greater than a preset similarity threshold, to obtain a candidate key frame sequence; matching each key frame in the candidate key frame sequence with the corresponding real-time image, and determining the candidate key frames whose matching degree is greater than a preset matching degree threshold, to obtain expected candidate key frames; and establishing a multi-camera model based on the expected candidate key frames corresponding to the at least two cameras, introducing the OpenGV library, and calling a target function and a target algorithm to obtain the positioning pose of the target object to be positioned in the repositioning mode.
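An illustrative pre-selection of candidate and expected candidate key frames for one camera is sketched below; the similarity measure, the thresholds, and the key frame attributes are assumptions, and the subsequent multi-camera pose solve (for example, through OpenGV's absolute-pose solvers) is intentionally omitted rather than invented here.

```python
import numpy as np

def select_expected_candidates(prior_keyframes, ref_position, current_descriptors,
                               sim_th=0.8, match_th=0.3, desc_dist_th=0.7):
    """Relocation pre-selection for one camera (illustrative thresholds only).

    prior_keyframes: iterable of objects with .position (3,) and .descriptors (M,D).
    ref_position: position taken from the real-time reference pose.
    """
    candidates = []
    for kf in prior_keyframes:
        # Position similarity derived from the real-time reference pose; an
        # inverse-distance score is one simple, assumed choice.
        similarity = 1.0 / (1.0 + np.linalg.norm(kf.position - ref_position))
        if similarity > sim_th:
            candidates.append(kf)

    expected = []
    for kf in candidates:
        # Matching degree: fraction of current descriptors that have a close
        # counterpart in the key frame (Euclidean distance between descriptors).
        d = np.linalg.norm(current_descriptors[:, None, :] - kf.descriptors[None, :, :], axis=2)
        degree = np.mean(d.min(axis=1) < desc_dist_th)
        if degree > match_th:
            expected.append(kf)
    return expected
```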
In one embodiment, the pose determining module is further configured to use the obtained pose of the target object to be positioned as an initial value, and establish an error function for each path of camera by using the minimum reprojection error of the matching point; establishing a total error function aiming at least two paths of cameras based on the error function of each path of camera; and optimizing the total error function to obtain the optimized pose of the target object to be positioned in the repositioning mode.
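A minimal sketch of the total error function is given below, assuming each camera has known intrinsics and known extrinsics relative to the body of the target object; the pose parameterization and the robust loss are illustrative choices and not fixed by the text.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def total_reprojection_error(x, cameras):
    """Stack the reprojection residuals of every camera into one total error.

    x = [rx, ry, rz, tx, ty, tz] parameterizes the body pose (rotation vector of
    the body-to-world rotation and the body position in the world frame); each
    entry of `cameras` holds intrinsics K, extrinsics T_cb (camera from body),
    matched world points points_w (N,3), and pixel observations obs (N,2).
    """
    R_bw = Rotation.from_rotvec(x[:3]).as_matrix().T
    t_wb = x[3:]
    residuals = []
    for cam in cameras:
        pts_b = (R_bw @ (cam["points_w"] - t_wb).T).T                 # world -> body
        pts_c = (cam["T_cb"][:3, :3] @ pts_b.T).T + cam["T_cb"][:3, 3]  # body -> camera
        uv = (cam["K"] @ pts_c.T).T
        uv = uv[:, :2] / uv[:, 2:3]
        residuals.append((uv - cam["obs"]).ravel())                    # per-camera error
    return np.concatenate(residuals)

def optimize_pose(x0, cameras):
    # A robust loss limits the influence of outlier matching points.
    return least_squares(total_reprojection_error, x0, args=(cameras,), loss="huber").x
```

Under this formulation a single body pose is optimized jointly against the observations of all cameras, which is what allows the at least two camera paths to constrain one another.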
In one embodiment, the pose determination module is configured to perform positioning in the constant-speed tracking mode, including: acquiring the poses of the at least two cameras in the current frame, the real-time pose of the target object to be positioned in the previous frame, and the constant speed of the target object to be positioned, wherein the constant speed is determined based on the real-time pose of the frame before the previous frame and the real-time pose of the previous frame; determining the initial value of the pose of the target object to be positioned in the current frame based on the real-time pose of the previous frame and the constant speed; projecting the feature points of the previous frame into the image corresponding to the pose initial value of the current frame, searching a target area in that image, and screening to obtain points to be matched; and when the number of the points to be matched is greater than a preset number threshold, performing pose optimization to obtain the pose of the target object to be positioned in the constant-speed tracking mode.
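The constant-speed prediction can be written compactly with homogeneous poses, as in the sketch below; the pose convention (4x4 transforms from the world frame into the body frame) is an assumption made only for illustration.

```python
import numpy as np

def predict_pose_constant_velocity(T_prev2, T_prev1):
    """Predict the initial pose of the current frame from the last two poses.

    T_prev2 and T_prev1 are 4x4 transforms (world -> body) of the frame before
    the previous frame and of the previous frame; the constant speed is the
    relative motion between them.
    """
    velocity = T_prev1 @ np.linalg.inv(T_prev2)   # relative motion over one frame
    return velocity @ T_prev1                     # initial pose value for the current frame
```

With this initial value, the feature points of the previous frame are projected and matched, and the pose is then refined by the reprojection error optimization described above.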
In one embodiment, the pose determination module is configured to perform the repositioning using a tracking local map mode, and includes: searching a key frame sequence which is viewed together with the current frame in the prior three-dimensional map; judging whether to update the existing local prior map or not based on the reference key frame with the most common viewpoints with the current frame in the key frame sequence and the number of the common viewpoints of the reference key frame; when the local prior map is judged to be updated, determining the number of corresponding inner points obtained by statistics of all paths of cameras under the first positioning pose obtained by adopting a constant-speed tracking mode or a repositioning mode; judging whether the number of the internal points is smaller than a preset internal point threshold value or not; when the number of the interior points is judged to be smaller than a preset interior point threshold value, screening out a candidate key frame closest to the current frame by using the reference position information, adding the candidate key frame to the local prior map, and adding the reference key frame most sharing the common viewpoint with the current frame to the local prior map; when the number of the inner points is judged to be equal to or larger than a preset inner point threshold value, adding the reference key frame with the most common viewpoints with the current frame to the local prior map; updating a key frame corresponding to the current frame in the local prior map; and updating the three-dimensional point features in the local prior map.
In one embodiment, when the local prior map is judged not to be updated and after the three-dimensional point features in the local prior map are updated, the pose determination module is further configured to perform feature point matching based on the current frame and the local prior map to obtain matching point pairs; and constructing and optimizing a reprojection error based on the matching point pairs to obtain the pose of the target object to be positioned in the local map tracking mode.
In one embodiment, the pose determination module is configured to determine, based on positioning results in different positioning modes, a real-time pose of the target object to be positioned when a preset positioning condition is satisfied, and includes: counting the number of the inner points based on the positioning results in different positioning modes; when the number of the interior points is smaller than a preset number threshold, the state position is lost, and the real-time pose is obtained by re-positioning based on the prior three-dimensional map, the real-time reference pose and at least two real-time images; and when the number of the inner points is not less than the preset number threshold, the state position is passed, and the initial positioning pose is determined as the real-time pose.
On the basis of the above embodiment, the embodiment of the present disclosure further provides a vehicle-mounted positioning device. This on-vehicle positioning apparatus includes: a processor and a memory for storing processor-executable instructions; the processor is configured to read the executable instructions from the memory and execute the executable instructions to implement any of the methods described above.
FIG. 11 is a schematic diagram of a vehicle-mounted positioning apparatus suitable for use in implementing embodiments of the present disclosure. As shown in fig. 11, the in-vehicle positioning apparatus 200 includes a Central Processing Unit (CPU) 201 that can execute various processes in the foregoing embodiments in accordance with a program stored in a Read Only Memory (ROM) 202 or a program loaded from a storage section 208 into a Random Access Memory (RAM) 203. In the RAM 203, various programs and data necessary for the operation of the in-vehicle positioning apparatus 200 are also stored. The CPU 201, the ROM 202, and the RAM 203 are connected to each other via a bus 204. An input/output (I/O) interface 205 is also connected to the bus 204.
The following components are connected to the I/O interface 205: an input device 206 including a keyboard, a mouse, and the like; output devices 207 including devices such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage device 208 including a hard disk and the like; and a communication device 209 including a network interface card such as a LAN card, modem, or the like. The communication device 209 performs communication processing via a network such as the internet. A drive 210 is also connected to the I/O interface 205 as needed. A removable medium 211, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is mounted on the drive 210 as necessary, so that a computer program read out therefrom is mounted into the storage device 208 as necessary.
In particular, the methods described above may be implemented as computer software programs according to embodiments of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for performing the aforementioned positioning method. In such embodiments, the computer program may be downloaded and installed from a network via the communication device 209 and/or installed from the removable medium 211.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units or modules described in the embodiments of the present disclosure may be implemented by software or hardware. The units or modules described may also be provided in a processor, and the names of the units or modules do not in some cases constitute a limitation of the units or modules themselves.
In one embodiment, the in-vehicle positioning apparatus further comprises: at least two cameras disposed in different orientations; the camera is used to take real-time images in the corresponding orientation.
In addition, the embodiment of the present disclosure also provides a computer-readable storage medium, which may be the computer-readable storage medium included in the apparatus in the foregoing embodiment; or it may be a separate computer readable storage medium not incorporated into the device. The computer-readable storage medium stores computer-executable instructions that, when executed by a computing device, may be used to implement the positioning method described in any of the embodiments of the present disclosure.
On the basis of the foregoing embodiment, the embodiment of the present disclosure further provides a vehicle, including any one of the above vehicle-mounted positioning devices, which has corresponding beneficial effects, and is not described herein again in order to avoid repeated descriptions.
It is noted that, in this document, relational terms such as "first" and "second," and the like, are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a/an …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present disclosure, which enable those skilled in the art to understand or practice the present disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of positioning, comprising:
acquiring a prior three-dimensional map of a current scene, acquiring a real-time reference pose of an object to be positioned, and acquiring real-time images under corresponding directions acquired by at least two paths of cameras carried by the object to be positioned;
performing data fusion based on the at least two real-time images to obtain camera fusion data;
and determining the real-time pose of the object to be positioned based on the prior three-dimensional map, the real-time reference pose and the camera fusion data.
2. The method of claim 1, wherein prior to obtaining the a priori three-dimensional map of the current scene, further comprising:
and constructing the prior three-dimensional map.
3. The method of claim 1, wherein the obtaining of the real-time reference pose of the object to be positioned comprises:
acquiring at least one of a real-time reference pose of an object to be positioned under a global positioning system, a real-time reference pose under a Beidou satellite navigation system, a real-time reference pose under a Galileo satellite navigation system, and a real-time reference pose under a GLONASS satellite navigation system;
the real-time reference pose is acquired by adopting a corresponding reference pose sensor, and the reference pose sensor and the at least two cameras are synchronously triggered.
4. The method of any one of claims 1-3, wherein the obtaining camera fusion data comprises:
performing data preprocessing on the real-time images corresponding to the at least two paths of cameras to obtain corresponding single-camera feature point data;
constructing camera fusion data based on the single-camera feature point data corresponding to at least two cameras at the same time;
the determining of the real-time pose of the target object to be positioned comprises:
triggering state bit judgment based on the camera fusion data, and carrying out initial positioning by combining a corresponding positioning mode associated with a judgment result;
and determining to obtain the real-time pose of the target object to be positioned when the preset positioning condition is met based on the positioning results in different positioning modes.
5. The method of claim 4,
the pre-processing the real-time images corresponding to the at least two cameras to obtain corresponding single-camera feature point data comprises:
aiming at the corresponding real-time image of each camera:
constructing an image pyramid based on the real-time image;
constructing interest points of each layer based on the image pyramid;
extracting feature points and determining descriptors of the feature points based on the interest points of each layer to obtain single-camera feature point data;
the method for constructing camera fusion data based on the single-camera feature point data corresponding to the at least two cameras at the same time comprises the following steps:
and constructing camera fusion data based on the feature points corresponding to the at least two cameras at the same time and the descriptors corresponding to the feature points.
6. The method of claim 4, wherein triggering a status bit determination based on the camera fusion data and performing an initial positioning in conjunction with a corresponding positioning mode associated with the determination result comprises:
triggering a status bit judgment based on the camera fusion data;
when the state bit judgment result is passed, positioning by adopting a constant-speed tracking mode, and then positioning again by adopting a tracking local map mode;
and when the state bit judgment result is lost, positioning by adopting a repositioning mode, and then positioning again by adopting a tracking local map mode.
7. A positioning device, comprising:
the data acquisition module is used for acquiring a prior three-dimensional map of a current scene, acquiring a real-time reference pose of the target object to be positioned and acquiring real-time images under corresponding directions acquired by at least two paths of cameras carried by the target object to be positioned;
the data fusion module is used for carrying out data fusion on the basis of the at least two real-time images to obtain camera fusion data;
and the pose determination module is used for determining the real-time pose of the target object to be positioned based on the prior three-dimensional map, the real-time reference pose and the camera fusion data.
8. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for performing the steps of the method according to any one of claims 1-6.
9. An on-board positioning apparatus, comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor is configured to read the executable instructions from the memory and execute the executable instructions to implement the method of any one of claims 1-6.
10. A vehicle comprising the on-board positioning apparatus of claim 9.
CN202210606914.7A 2022-05-31 2022-05-31 Positioning method, device, medium, equipment and vehicle Pending CN115014324A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210606914.7A CN115014324A (en) 2022-05-31 2022-05-31 Positioning method, device, medium, equipment and vehicle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210606914.7A CN115014324A (en) 2022-05-31 2022-05-31 Positioning method, device, medium, equipment and vehicle

Publications (1)

Publication Number Publication Date
CN115014324A true CN115014324A (en) 2022-09-06

Family

ID=83071376

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210606914.7A Pending CN115014324A (en) 2022-05-31 2022-05-31 Positioning method, device, medium, equipment and vehicle

Country Status (1)

Country Link
CN (1) CN115014324A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116626036A (en) * 2023-05-24 2023-08-22 北京盛和信科技股份有限公司 Appearance quality inspection method and system based on machine vision recognition
CN116626036B (en) * 2023-05-24 2024-04-02 北京盛和信科技股份有限公司 Appearance quality inspection method and system based on machine vision recognition

Similar Documents

Publication Publication Date Title
CN107990899B (en) Positioning method and system based on SLAM
KR102145109B1 (en) Methods and apparatuses for map generation and moving entity localization
Zhou et al. Ground-plane-based absolute scale estimation for monocular visual odometry
US8199977B2 (en) System and method for extraction of features from a 3-D point cloud
CN113406682B (en) Positioning method, positioning device, electronic equipment and storage medium
WO2022007776A1 (en) Vehicle positioning method and apparatus for target scene region, device and storage medium
CN108519102B (en) Binocular vision mileage calculation method based on secondary projection
US20070070069A1 (en) System and method for enhanced situation awareness and visualization of environments
CN107167826B (en) Vehicle longitudinal positioning system and method based on variable grid image feature detection in automatic driving
CN111986261B (en) Vehicle positioning method and device, electronic equipment and storage medium
CN111830953A (en) Vehicle self-positioning method, device and system
US11880931B2 (en) High-definition city mapping
CN111008660A (en) Semantic map generation method, device and system, storage medium and electronic equipment
CN114359476A (en) Dynamic 3D urban model construction method for urban canyon environment navigation
GB2578721A (en) Method and system for processing image data utilizing deep neural network
CN115014324A (en) Positioning method, device, medium, equipment and vehicle
CN113932796A (en) High-precision map lane line generation method and device and electronic equipment
CN112270748A (en) Three-dimensional reconstruction method and device based on image
CN115239899B (en) Pose map generation method, high-precision map generation method and device
CN110796706A (en) Visual positioning method and system
CN114111817B (en) Vehicle positioning method and system based on SLAM map and high-precision map matching
CN115719436A (en) Model training method, target detection method, device, equipment and storage medium
US11908198B2 (en) Contextualization and refinement of simultaneous localization and mapping
CN116245730A (en) Image stitching method, device, equipment and storage medium
CN113763468A (en) Positioning method, device, system and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination