CN109323709B - Visual odometry method, device and computer-readable storage medium - Google Patents

Visual odometry method, device and computer-readable storage medium

Info

Publication number
CN109323709B
CN109323709B · application CN201710639962.5A
Authority
CN
China
Prior art keywords
frame
frames
target
attitude
posture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710639962.5A
Other languages
Chinese (zh)
Other versions
CN109323709A (en)
Inventor
李昊鑫
李静雯
王刚
刘殿超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd filed Critical Ricoh Co Ltd
Priority to CN201710639962.5A
Publication of CN109323709A
Application granted
Publication of CN109323709B
Legal status: Active

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C22/00 Measuring distance traversed on the ground by vehicles, persons, animals or other moving solid bodies, e.g. using odometers, using pedometers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/254 Analysis of motion involving subtraction of images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods

Abstract

Visual odometry methods, apparatus, and computer-readable storage media for estimating a target pose are provided. The method may comprise: obtaining a first number of current frames whose target poses are to be estimated, a second number of history frames immediately preceding the first of the current frames and for which pose estimation has already been performed, and a subsequent frame immediately following the last of the current frames; inferring the target pose in the current frames from the pose estimation results of the history frames; calculating the pose change between the target pose of one of the history frames and the target pose of the subsequent frame as a constraint condition; and optimizing the target pose in the current frames based on the constraint condition.

Description

Visual odometry method, device and computer-readable storage medium
Technical Field
The present disclosure relates to visual odometry, and more particularly, to visual odometry methods, apparatus, and computer-readable storage media for estimating a target pose.
Background
In the field of mobile robotics, simultaneous localization and mapping (SLAM) technology has been studied and developed for many years. Visual odometry is part of the SLAM problem: it incrementally estimates the position and pose of a target based on vision.
The most important problem in visual odometry is how to estimate the motion of the target from several neighboring images. Feature-based methods are the mainstream of current visual odometry and have a long history of research. In a feature-based method, for two images, some representative points, called feature points, are first selected. Thereafter, the motion of the target is estimated only from these feature points, while the spatial positions of the feature points are estimated at the same time; the information of the other, non-feature points in the image is discarded. The feature point method thus converts motion estimation over images into motion estimation between two sets of points.
At present, visual odometry technology is gradually being applied in the field of automatic driving. Compared with indoor robots, automatic driving has a huge application market; at the same time, however, the moving speed of a vehicle in an automatic driving environment is often high, and factors such as illumination and weather in an outdoor environment change frequently, so the quality of the obtained images varies. Sometimes it is difficult to extract effective feature points from some images and to match them, making the poses of the corresponding frames hard to estimate, which poses certain challenges to visual odometry technology.
Among existing visual odometry methods, in order to estimate the pose of the vehicle more accurately, some improve accuracy by introducing sensor data such as IMU or GPS data, but doing so increases cost.
Methods using the SLAM framework localize mainly through place recognition and landmark recognition. In the field of automatic driving, a vehicle rarely travels the same road repeatedly, so when pose estimation fails because some image frames in a complex scene have low image quality, the pose may not be recoverable by relocalization. Moreover, the SLAM method imposes pose constraints by continuously propagating landmark points forward, but when the number of features in an image is small, the landmarks are difficult to propagate, so the problem that the poses of such frames are hard to estimate cannot be solved well.
Disclosure of Invention
In view of the foregoing, the present disclosure proposes a visual odometry method, apparatus, and computer-readable storage medium for estimating a target pose.
According to one aspect of the present disclosure, a visual odometry method for estimating a target pose is provided, which may include: obtaining a first number of current frames whose target poses are to be estimated, a second number of history frames immediately preceding the first of the current frames and for which pose estimation has already been performed, and a subsequent frame immediately following the last of the current frames; inferring the target pose in the current frames from the pose estimation results of the history frames; calculating the pose change between the target pose of one of the history frames and the target pose of the subsequent frame as a constraint condition; and optimizing the target pose in the current frames based on the constraint condition.
In an alternative embodiment, the step of inferring the target pose in the current frame from the pose estimation results of the historical frames may comprise: obtaining a local motion model according to the attitude estimation result of the historical frame; and calculating a target pose in the current frame based on the local motion model.
In an alternative embodiment, the step of obtaining a local motion model according to the pose estimation result in the historical frame may include: calculating a motion vector according to feature point matching between adjacent frames in the historical frames; obtaining a local motion direction category according to the motion vector by utilizing a pre-trained classifier; selecting a corresponding local motion model based on the local motion direction category; and solving the parameters of the local motion model by using the attitude estimation result of the historical frame.
In an alternative embodiment, the step of calculating a motion vector according to feature point matching between adjacent frames in the historical frames may comprise: obtaining mutually matched feature points between adjacent frames in the historical frames; transforming the matched feature points into a world coordinate system according to the camera parameters and the target postures of the frames where the matched feature points are located; and calculating motion vectors between the feature points matched with each other in the world coordinate system.
In an alternative embodiment, the step of calculating a pose change between the target pose of one frame of the historical frames and the target pose of the subsequent frame may comprise: performing feature point matching on one frame in the historical frames and the subsequent frames; and calculating the attitude change between the target attitude of one frame in the historical frames and the target attitude of the subsequent frame based on the feature point matching result.
In an alternative embodiment, the step of calculating a pose change between the target pose of one frame in the historical frames and the target pose of the subsequent frame based on the feature point matching result may include: transforming the matched characteristic points in one frame of the historical frame and the subsequent frame into a world coordinate system according to the camera parameters and the target postures of the frames where the matched characteristic points are located; and calculating the rotation and translation quantity between one frame in the historical frames and the subsequent frame according to the matched characteristic points in the two frames in the world coordinate system as the posture change.
In an optional embodiment, the method may further comprise: calculating a target pose in the subsequent frame using the local motion model. Wherein the step of optimizing the target pose in the current frame based on the constraint condition may include: establishing a posture graph by taking the calculated target posture in the current frame and the target posture in the subsequent frame as nodes and the constraint condition as edges; the energy function of the pose graph is minimized to obtain an optimized target pose.
In an optional embodiment, the method may further comprise: calculating the average value of the changes of the target postures of all the frames in the current frame; and smoothing the posture change between the adjacent frames in the current frame based on the average value to obtain a smoothed target posture.
According to another aspect of the present disclosure, there is provided a visual odometry apparatus for estimating a target pose, which may include: an obtaining component for obtaining a first number of current frames whose target poses are to be estimated, a second number of history frames immediately preceding the first of the current frames and for which pose estimation has already been performed, and a subsequent frame immediately following the last of the current frames; an inference component for inferring the target pose in the current frames from the pose estimation results of the history frames; a calculation component for calculating the pose change between the target pose of one of the history frames and the target pose of the subsequent frame as a constraint condition; and an optimization component for optimizing the target pose in the current frames based on the constraint condition.
According to another aspect of the present disclosure, there is provided an apparatus for estimating a target pose, which may include: a memory storing computer program instructions; and a processor coupled to the memory and configured to execute the computer program instructions to perform the following: obtaining a first number of current frames whose target poses are to be estimated, a second number of history frames immediately preceding the first of the current frames and for which pose estimation has already been performed, and a subsequent frame immediately following the last of the current frames; inferring the target pose in the current frames from the pose estimation results of the history frames; calculating the pose change between the target pose of one of the history frames and the target pose of the subsequent frame as a constraint condition; and optimizing the target pose in the current frames based on the constraint condition.
According to another aspect of the present disclosure, there is provided a computer-readable storage medium storing computer program instructions which, when executed, may perform the following: obtaining a first number of current frames whose target poses are to be estimated, a second number of history frames immediately preceding the first of the current frames and for which pose estimation has already been performed, and a subsequent frame immediately following the last of the current frames; inferring the target pose in the current frames from the pose estimation results of the history frames; calculating the pose change between the target pose of one of the history frames and the target pose of the subsequent frame as a constraint condition; and optimizing the target pose in the current frames based on the constraint condition.
According to the visual odometry method, apparatus and computer-readable storage medium for estimating a target pose of the embodiments of the present disclosure, the target pose in the current frames is inferred from the pose estimation results of history frames on which pose estimation has previously been performed, and the pose change between the target pose of a history frame and the target pose of a subsequent frame after the current frames is calculated as a constraint condition, with which the target pose in the current frames is optimized. Target pose estimation can therefore be performed even for those image frames that have low image quality and on which feature point matching is difficult, so that the visual odometer can operate normally in complex scenes, improving the robustness and accuracy of the visual odometer.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and are intended to provide further explanation of the claimed technology.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in more detail embodiments of the present disclosure with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the principles of the disclosure and not to limit the disclosure. In the drawings, like reference numbers generally represent like parts or steps.
FIG. 1 is a flow chart illustrating the main steps of a visual odometry method for estimating a target pose according to an embodiment of the present disclosure;
FIG. 2 is a flow chart illustrating the main steps of a gesture inference method according to another embodiment of the present disclosure;
FIG. 3 is a flow chart illustrating the main steps of a local motion model calculation method according to another embodiment of the present disclosure;
FIG. 4 is a flow chart illustrating the main steps of a pose optimization method according to another embodiment of the present disclosure;
FIG. 5 is a schematic diagram illustrating an example pose graph according to another embodiment of the present disclosure;
FIG. 6 is a block diagram illustrating a primary configuration of a visual odometry device for estimating a target pose, according to an embodiment of the present disclosure; and
fig. 7 is a block diagram illustrating a main configuration of an apparatus for estimating a target posture according to another embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the present disclosure more apparent, example embodiments according to the present disclosure will be described in detail below with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a subset of the embodiments of the present disclosure and not all embodiments of the present disclosure, with the understanding that the present disclosure is not limited to the example embodiments described herein. All other embodiments made by those skilled in the art without inventive efforts based on the embodiments of the present disclosure described in the present disclosure should fall within the scope of the present disclosure.
First, an application scenario of the present disclosure is briefly introduced. As described above, visual odometry provides a method of estimating the motion of a target from a continuous sequence of images obtained from a camera. Applicable fields include, but are not limited to, automatic driving, mobile robots, unmanned aerial vehicles, and the like. For example, when applied to an autonomous driving environment, a continuous sequence of images of the target scene may be captured by an onboard camera, and the pose of the target (i.e., the vehicle in this case) may be estimated from that sequence using a visual odometry method.
Different camera types may be used in different applications, such as monocular cameras, stereo cameras, RGBD cameras, etc. Thus, the image captured by the camera may include a color image, a grayscale image, a depth image, an RGBD image, and so on. These image types are all applicable to the visual odometry method of the present disclosure for estimating the pose of a target. That is, the present disclosure does not limit the type of image captured by the camera.
As understood by those skilled in the art, the "pose" of a target refers to its position and orientation, which may be represented by a six-dimensional vector (x, y, z, θ, φ, ψ). In general, the current pose of the target may be represented by the amount of rotation and translation relative to the target's initial pose.
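For illustration only, the following Python sketch (not part of the disclosure; the class and method names are our own) shows the rotation-plus-translation representation of a pose and how relative motions compose, which is how the rotation and translation amounts are combined throughout this description:

```python
import numpy as np

class Pose:
    """Minimal rigid-body pose: rotation matrix R (3x3) plus translation t (3,)."""
    def __init__(self, R=None, t=None):
        self.R = np.eye(3) if R is None else np.asarray(R, dtype=float)
        self.t = np.zeros(3) if t is None else np.asarray(t, dtype=float)

    def compose(self, other):
        """Return self * other: apply `other` first, then `self`."""
        return Pose(self.R @ other.R, self.R @ other.t + self.t)

    def inverse(self):
        """Inverse pose, so that p.compose(p.inverse()) is the identity."""
        Rt = self.R.T
        return Pose(Rt, -Rt @ self.t)

    def apply(self, points):
        """Transform (N, 3) points from this pose's frame into the reference frame."""
        return np.asarray(points) @ self.R.T + self.t
```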
Next, a visual odometry method for estimating a target pose according to one embodiment of the present disclosure is described with reference to fig. 1.
As shown in fig. 1, the visual odometry method 100 according to this embodiment may include the following steps.
In step S110, a first number of current frames of the target pose to be estimated, a second number of history frames that have been subjected to pose estimation immediately before the first frame in the current frames, and a subsequent frame immediately after the last frame in the current frames are obtained.
As described above, in an automatic driving environment the vehicle often moves fast, and due to changes in illumination, weather and other factors in the outdoor environment, the quality of some image frames in the sequence captured by the camera is not ideal, making it difficult to extract effective feature points from them and perform feature point matching. Such consecutive images with sparse feature points are the frames whose target poses the visual odometry method of this embodiment is designed to estimate.
Specifically, when the target pose is estimated by a conventional visual odometry method, feature points are extracted from each successive image frame, and the motion of the target is estimated by feature point matching to obtain the target pose of each frame. If, during feature extraction and pose estimation, an image frame has too few feature points for its pose to be estimated by matching, that frame is added to the current image sequence. Feature points are then extracted from the next frame; if the next frame still does not have enough feature points, it is also added to the current image sequence. This continues until a frame with enough feature points is reached.
Thus, a current image sequence composed of the image frames with too few feature points, preceding the frame with sufficient feature points, is obtained as the current frames. It is clear to those skilled in the art that the current frames described herein may be one frame or several consecutive frames.
Meanwhile, the above-described frame with sufficient feature points, immediately after the last of the current frames, is obtained as the subsequent frame. In addition, in order to infer the pose of the current frames from the historical pose of the target, history frames immediately before the first of the current frames are also obtained. As described above, a history frame is a frame whose target pose could be estimated by the conventional visual odometry method and for which a pose estimation result has already been obtained. Those skilled in the art will appreciate that at least two history frames are needed for pose estimation, e.g., estimating the target pose by feature point matching. In one example, five history frames before the current frames may be chosen. Of course, it will be apparent to those skilled in the art that the number of history frames is not limited to five and may vary depending on the particular application.
In the above feature extraction, the methods that can be adopted include, but are not limited to, extracting corners, color blocks, etc. in the image. Feature point extraction algorithms developed in recent years can extract the same points even after the image has changed to a certain degree, and can judge the correlation between them. For example, commonly used features include Harris corners, SIFT features, SURF features, ORB features, and the like.
For each feature point, in order to distinguish it from other points, a "Descriptor" may also be computed. A descriptor is usually a vector containing information about the feature point and its surrounding area. If the descriptors of two feature points are similar, the two points can be considered the same point. From the feature points and their descriptors, the matching points between two images can be calculated.
Of course, it is clear to those skilled in the art that how feature points are extracted is not the focus here; the methods listed above are only examples, and the applicable feature point extraction methods are not limited herein. Any feature point extraction method now known or developed in the future can be applied to the embodiments of the present disclosure, and those skilled in the art can select an appropriate method according to actual needs.
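As a concrete illustration of the feature extraction and matching described above, the sketch below uses OpenCV's ORB detector and brute-force Hamming matching; the detector choice and parameter values are assumptions made for the example, since the disclosure deliberately leaves the feature type open:

```python
import cv2

def match_features(img1, img2, max_matches=200):
    """Detect ORB keypoints in two grayscale images and return matched point pairs."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    if des1 is None or des2 is None:
        return [], []  # too few features: the low-quality case discussed above
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    pts1 = [kp1[m.queryIdx].pt for m in matches[:max_matches]]
    pts2 = [kp2[m.trainIdx].pt for m in matches[:max_matches]]
    return pts1, pts2
```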
Next, in step S120 of the visual odometry method 100, the target pose in the current frames may be inferred from the pose estimation results of the history frames. Since the frame rate of current cameras is usually tens of frames per second, e.g. 24, 30, 60, or even more than 120 frames per second, the motion of the target between two adjacent frames usually does not change drastically, while the history frames selected in step S110 are several consecutive frames that have already undergone pose estimation. The present disclosure therefore contemplates that the pose of the target in the current frames can be roughly inferred from the pose estimation results of the history frames.
FIG. 2 illustrates one example of a gesture inference method that may be applied to embodiments of the present disclosure. As shown in FIG. 2, the pose inference method 200 can include the steps of: step S210, obtaining a local motion model according to the attitude estimation result of the historical frame; and step S220, calculating the target posture in the current frame based on the local motion model.
Regarding the method of obtaining the local motion model in step S210, fig. 3 shows one example of a local motion model calculation method applicable to an embodiment of the present disclosure. As shown in fig. 3, the local motion model calculation method 300 may include the following steps.
In step S310, a motion vector is calculated based on feature point matching between adjacent frames in the history frame.
As described above, the history frames are image frames for which pose estimation has already been performed; therefore, the matched feature points in adjacent frames extracted during pose estimation of the history frames, as well as the pose of each frame, are available. For any image frame i among the history frames, the feature points extracted from it are denoted P_i^j, where j is the feature point index within frame i, and the pose of that frame is denoted T_i.
Next, the matched feature points may be transformed into the world coordinate system according to the camera parameters and the target pose of the frame in which they are located. For example, every feature point P_i^j in image frame i can be transformed, in combination with the known camera parameters, into a point P_a^j in the camera coordinate system. Further, based on the pose T_i of image frame i, the point P_a^j can be converted into a point P_w^j in the world coordinate system, as shown in Equation 1:

P_w^j = R_i · P_a^j + t_i    (Equation 1)

where R_i and t_i respectively denote the rotational and translational components of the pose T_i of image frame i relative to the target's initial pose.
Thus, for each image frame i among the history frames, its feature points P_w^j in the world coordinate system can be obtained. A motion vector V_i^j can then be calculated from the known feature point matching relationships between adjacent history frames, as shown in Equation 2:

V_i^j = Q_w^j − P_w^j    (Equation 2)

where Q_w^j denotes the world-coordinate position of the feature point in frame i+1 that matches the point P_w^j of frame i. Thus, in step S310, the motion vectors over all the history frames can be obtained, denoted collectively as V = {V_i^j}.
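A minimal sketch of step S310, assuming the matched points have already been back-projected into each frame's camera coordinate system (e.g., from stereo or RGB-D depth) and reusing the Pose class sketched earlier:

```python
import numpy as np

def motion_vectors(P_cam_i, P_cam_next, pose_i, pose_next):
    """Transform matched points of two adjacent history frames into the world
    coordinate system (Equation 1) and take their difference (Equation 2).
    P_cam_i and P_cam_next are row-aligned (N, 3) arrays of matched points;
    pose_i and pose_next carry each frame's rotation R and translation t."""
    Pw_i = P_cam_i @ pose_i.R.T + pose_i.t                # Equation 1, frame i
    Pw_next = P_cam_next @ pose_next.R.T + pose_next.t    # Equation 1, frame i+1
    return Pw_next - Pw_i                                 # Equation 2: V_i^j
```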
Next, in step S320, a local motion direction category may be obtained from the motion vector obtained in step S310 using a pre-trained classifier.
To determine the motion direction class of the target in the history frames, the motion vector V obtained in step S310 may be input to a pre-trained classifier to obtain the corresponding motion direction class as output. For example, the motion direction may be divided into different categories such as accelerating motion, decelerating motion, straight motion, and turning motion. The motion of the target in the history frames is local motion relative to the target's motion from the initial pose, so the motion direction class obtained here is likewise a local motion direction class.
Regarding the classifier, in its training stage a large number of local motion vectors V with corresponding motion direction labels y may be collected as training samples; the classifier is then trained on the local motion vectors V labeled with their local motion directions y. Various types of classifiers may be employed, such as BOW, K-means, and the like.
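The disclosure fixes neither the classifier's input features nor its type, so the following sketch is one hypothetical realization: a handcrafted summary feature over the motion vectors plus a nearest-centroid classifier standing in for the pre-trained classifier:

```python
import numpy as np

def motion_feature(V):
    """Collapse a set of motion vectors V (N, 3) into a fixed-length feature.
    This particular feature (mean speed, speed trend, heading spread) is an
    assumption; the text only states that V is fed to a trained classifier."""
    V = np.asarray(V, dtype=float)
    speeds = np.linalg.norm(V, axis=1)
    headings = np.arctan2(V[:, 1], V[:, 0])
    return np.array([speeds.mean(), speeds[-1] - speeds[0], np.ptp(headings)])

class NearestCentroidDirectionClassifier:
    """Tiny stand-in for the pre-trained motion-direction classifier.
    Labels y: 1 = acceleration (per the text); assigning 2 = deceleration,
    3 = straight, 4 = turning is an assumption about the remaining labels."""
    def fit(self, features, labels):
        features, labels = np.asarray(features), np.asarray(labels)
        self.labels_ = np.unique(labels)
        self.centroids_ = np.array(
            [features[labels == y].mean(axis=0) for y in self.labels_])
        return self

    def predict(self, feature):
        d = np.linalg.norm(self.centroids_ - feature, axis=1)
        return self.labels_[np.argmin(d)]
```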
After the local motion direction category is obtained, in step S330 a corresponding local motion model may be selected based on that category. The local motion model may be set according to the discriminated local motion direction category y of the history frames. For example, as described above, the local motion directions can be classified into the following four motion classes: acceleration, deceleration, straight-ahead motion, and steering motion, i.e. y ∈ {1, 2, 3, 4}. For each local motion category y, a corresponding local motion model L_y may be set.
For example, for accelerated motion, i.e. when y = 1, the corresponding local motion model may be set to L_1 = a·t² + b, where a and b are parameters of the model. This accelerated-motion model may be used to fit the motion poses T_p of the history frames, which include both translational and rotational components. Since in practice the motion of a vehicle can often be approximated by a two-dimensional translation plus a one-dimensional rotation, such local motions are easily fitted by a simple mathematical model.
Similarly, motion models for the other motion directions may be set; these may be implemented using any local trajectory-fitting model, for example a mathematical polynomial, a function, or a probabilistic model, which is easy for those skilled in the art and is not described again here.
For the selected local motion model L_y, in step S340 its parameters may be solved using the pose estimation results of the history frames.
As an example, the parameters of the local motion model L_y may be calculated from the known poses of the history frames by the least-squares method, as shown in Equation 3:

w_y = argmin_w Σ_i ‖ L_y(t_i; w) − T_p^i ‖²    (Equation 3)

where w_y denotes the parameters of the local motion model L_y, and T_p^i denotes the rotational and translational components of the pose of the i-th history frame.
For example, for the acceleration motion described above, with y = 1, the parameters a and b of the local motion model L_1 = a·t² + b can be calculated. The parameters of the local motion models corresponding to the other motion types can of course be calculated in the same way.
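A sketch of steps S340 and S220 for the acceleration model, fitting L_1 = a·t² + b to one pose component of the history frames by least squares (Equation 3) and then evaluating the model at the current-frame time indices (Equation 4); each translational and rotational component would be fitted independently, as the text suggests:

```python
import numpy as np

def fit_acceleration_model(component_hist, t_hist):
    """Least-squares fit of L_1 = a*t^2 + b to one pose component (one
    translation coordinate or the rotation angle) of the history frames."""
    t = np.asarray(t_hist, dtype=float)
    A = np.column_stack([t**2, np.ones_like(t)])
    (a, b), *_ = np.linalg.lstsq(A, np.asarray(component_hist, dtype=float),
                                 rcond=None)
    return a, b

def infer_current_poses(a, b, t_current):
    """Evaluate the fitted model at the current-frame indices, which
    continue the history frames' time axis."""
    return a * np.asarray(t_current, dtype=float) ** 2 + b

# e.g. five history frames at t = 0..4 and four current frames at t = 5..8:
#   a, b = fit_acceleration_model(x_hist, np.arange(5))
#   x_current = infer_current_poses(a, b, np.arange(5, 9))
```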
Thus, with the local motion model calculation method 300, in step S210, a local motion model may be obtained from the pose estimation results of the historical frames.
Then, returning to the method 200, in step S220, a target pose in the current frame may be calculated based on the local motion model obtained in step S210.
For a target such as a vehicle, whose local movement tends to be linear, the pose T_c in the current frames can be inferred from the local motion model L_y obtained above. For example, the pose T_c^t of the t-th current frame can be calculated as shown in Equation 4:

T_c^t = L_y(t; w_y),  t = 1, …, n    (Equation 4)

where n denotes the number of image frames included in the current-frame image sequence and t is the frame index.
Thus, with this pose estimation method 200, the target pose in the current frame can be estimated from the pose estimation results of the history frames in step S120.
Returning to the method 100, next, in step S130, a pose change between the target pose of one frame in the history frame and the target pose of the subsequent frame is calculated as a constraint.
Since the history frames and the subsequent frame are image frames with enough feature points, feature point matching can be performed between one of the history frames and the subsequent frame, and motion estimation can be performed based on the matching result, so as to calculate the target pose change from that history frame to the subsequent frame.
Any one of the historical frames may be selected to compute a pose change with a subsequent frame. In a preferred embodiment, the last frame in the historical frames may be selected.
When the pose change is calculated by motion estimation through feature point matching, the computation can be the same as in a general feature-based visual odometry method, and any known feature point extraction method can be used. In a preferred embodiment, for the history frame, the same feature points already extracted in step S310 above may be reused to improve efficiency; in that case only the feature points of the subsequent frame need to be extracted and matched.
To calculate the pose change, the mutually matched feature points in the last history frame and the subsequent frame are transformed into the world coordinate system according to the camera parameters and the target poses of the frames in which they are located, and the amount of rotation and translation between the last history frame and the subsequent frame is calculated from the matched feature points of the two frames in the world coordinate system as the pose change.
As noted above, when the already-extracted feature points are used for the last history frame, their coordinates were transformed into the world coordinate system in step S310, so only the matched feature points in the subsequent frame need to be transformed; the transformation is the same and is not repeated here.
Thus, for a matched feature point pair P^j between one of the history frames and the subsequent frame, its projection P_w^j in the world coordinate system can be obtained by combining the camera parameters with the poses of the frames in which the matched feature points are located. The relative motion between the history frame and the subsequent frame can then be computed in the world coordinate system, with the amount of rotation and translation of the target from the history frame to the subsequent frame representing the pose change E_c, as shown in Equation 5:

{R, T} = argmin_{R,T} Σ_j ‖ p_f^j − proj(R · P_w^j + T) ‖²    (Equation 5)

where R and T respectively represent the amounts of rotation and translation in the target pose change from the history frame to the subsequent frame, p_f^j denotes the image coordinates of the matched feature point in the subsequent frame, and proj(·) denotes the projection of a point, combined with the camera parameters, from the world coordinate system into the image coordinate system. Equation 5 can be solved by the well-known RANSAC or Gauss-Newton methods to obtain the pose change E_c as the constraint condition.
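Equation 5 is posed as a reprojection-error minimization solved by RANSAC or Gauss-Newton. As a simplified stand-in, the sketch below estimates R and T in closed form directly from the matched world-coordinate point sets (the Kabsch algorithm); it produces the same kind of pose-change constraint E_c, though it is not the reprojection formulation the text describes:

```python
import numpy as np

def rigid_align(P_hist, P_next):
    """Closed-form rotation R and translation T mapping the matched
    world-coordinate points of a history frame onto those of the subsequent
    frame. P_hist and P_next are row-aligned (N, 3) arrays."""
    mu_h, mu_n = P_hist.mean(axis=0), P_next.mean(axis=0)
    H = (P_hist - mu_h).T @ (P_next - mu_n)   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    T = mu_n - R @ mu_h
    return R, T
```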
In step S140, the target pose in the current frame is optimized based on the constraint conditions obtained in step S130.
FIG. 4 illustrates one example of a pose optimization method that can be used with embodiments of the present disclosure. As shown in FIG. 4, the pose optimization method 400 can include the following steps.
In step S410, a pose graph is built with the target pose in the current frame and the target pose in the subsequent frame as nodes and the constraint condition as edges.
In step S120 described earlier, the target pose in the current frames has already been inferred from the pose estimation results of the history frames. For example, as described in step S220, the pose T_c in the current frames may be calculated from the local motion model L_y as shown in Equation 4. Likewise, the target pose T_f in the subsequent frame can be calculated from the local motion model L_y. The constraint condition E_c has already been calculated in step S130 above.
FIG. 5 illustrates one example of a pose graph according to an embodiment of the present disclosure. In this example there are four current frames, with poses T_c1, T_c2, T_c3 and T_c4; the pose of the subsequent frame is T_f, and the constraint condition is E_c. The pose graph established with T_c1, T_c2, T_c3, T_c4 and T_f as nodes and the constraint condition E_c as edges is shown in FIG. 5.
After the pose graph is built, it is optimized in step S420 using a graph optimization algorithm to obtain the optimized target pose.
In one specific example, the optimized current-frame target pose T_c may be obtained by minimizing the energy function of the graph using an algorithm such as Gauss-Newton or Levenberg-Marquardt. Of course, this optimization method is merely an example, and those skilled in the art can adopt any other suitable optimization method as needed.
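The toy sketch below conveys the structure of this optimization for the translation components only: the nodes are the inferred current-frame positions plus the subsequent frame's position, the motion-model steps act as smoothness edges, and E_c ties the chain's end to the measured pose change. The reduction to translations and the edge weights are assumptions made for brevity; a production system would optimize full 6-DoF poses with a graph-optimization library and Gauss-Newton or Levenberg-Marquardt, as noted above:

```python
import numpy as np
from scipy.optimize import least_squares

def optimize_chain(nodes_init, rel0, T_hist, E_c, w_rel=1.0, w_ec=10.0):
    """nodes_init: (m, 3) inferred positions of the current frames followed by
    the subsequent frame; rel0: (m-1, 3) model-predicted steps between
    consecutive nodes; T_hist: known position of the history frame; E_c:
    measured translation from that history frame to the subsequent frame."""
    nodes_init = np.asarray(nodes_init, dtype=float)
    rel0 = np.asarray(rel0, dtype=float)

    def residuals(x):
        T = x.reshape(-1, 3)
        r_rel = w_rel * (np.diff(T, axis=0) - rel0).ravel()  # motion-model edges
        r_ec = w_ec * (T[-1] - (T_hist + E_c))               # constraint edge E_c
        return np.concatenate([r_rel, r_ec])

    sol = least_squares(residuals, nodes_init.ravel())
    return sol.x.reshape(-1, 3)
```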
It should be noted that, although in the above embodiment the last history frame was selected for calculating the pose change relative to the subsequent frame as the constraint condition, those skilled in the art will understand that in other embodiments another history frame may be selected for calculating the pose change relative to the subsequent frame; the pose change between the last history frame and the subsequent frame can then be derived by chaining it with the pose change between that other frame and the last history frame, and used as the constraint condition. Since all of the history frames have already undergone pose estimation, the pose change between any other history frame and the last history frame is readily available.
Thus, in step S140, the optimized current frame target pose is obtained, and the method 100 ends.
In an optional embodiment, after obtaining the optimized target pose in step S140, optionally, step S150 (not shown in the figure) may be further performed to smooth the optimized target pose.
In step S150, an average of changes in the target poses of all frames in the optimized current frame obtained in step S140 may be calculated, and pose changes between adjacent frames in the current frame are smoothed based on the average to obtain a smoothed target pose.
In one example, the optimized poses of the current frames, together with the poses of the history frames and of the subsequent frame, may be maintained in the world coordinate system, arranged in chronological order. The poses of adjacent frames are then smoothed in a linear fashion.
Specifically, for two adjacent frames k and k−1, with target poses T_k and T_{k−1} respectively, the relative pose change between the two frames can be calculated as

T_r^k = T_{k−1}^{−1} · T_k

In this way, the pose change T_r between adjacent frames can be calculated for all frames, along with the average pose change, denoted T_avg. As described above, a pose change is typically represented by its rotational and translational components, and the average T_avg may likewise be taken as the mean of the rotational and translational components.
If, for some frame k, ‖T_r^k − T_avg‖ > θ·‖T_avg‖, then let T_r^k = T_avg and recalculate the pose T_k of the k-th frame as shown in Equation 6:

T_k = T_{k−1} · T_r^k    (Equation 6)

where θ is a threshold coefficient that may be preset by the user.
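A sketch of this smoothing on the translation components: each relative step is compared against the mean step and outliers are clamped to the mean, after which the absolute poses are re-accumulated per Equation 6. The default θ is an assumption; the disclosure leaves it user-preset:

```python
import numpy as np

def smooth_relative_steps(T_rel, theta=0.5):
    """Clamp relative pose steps (translations) that deviate from the mean by
    more than theta times the mean step's magnitude."""
    steps = np.array(T_rel, dtype=float)   # (n, 3) per-frame steps, copied
    mean = steps.mean(axis=0)
    scale = np.linalg.norm(mean)
    for k in range(len(steps)):
        if np.linalg.norm(steps[k] - mean) > theta * scale:
            steps[k] = mean                # replace outlier step with the mean
    return steps

# absolute poses are then rebuilt: T_k = T_{k-1} + steps[k]
# (Equation 6, translation-only form)
```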
By smoothing the estimated poses in the current frames for a target such as a vehicle as described above, the local motion of the target can be made smoother and kept closer to linear.
According to the visual odometry method 100 for estimating the target pose of this embodiment, in order to estimate the target pose in the current frames, the target pose is inferred from the pose estimation results of the history frames, a pose constraint is established in combination with the subsequent frame, and the inferred pose is optimized. Target pose estimation can therefore be performed even for those image frames that have low image quality and on which feature point matching is difficult, so that the visual odometer can operate normally in complex scenes, improving the robustness and accuracy of the visual odometer.
Furthermore, according to the visual odometry method of this embodiment, when the target pose in the current frames is inferred from the pose estimation results of the history frames, the local motion direction is discriminated using the poses and motion vectors of the history frames, and a corresponding local motion model is selected to infer the poses of the current image sequence, which is more accurate than using a single model.
In addition, according to the visual odometry method of this embodiment, during pose optimization the local motion model is applied to the subsequent frame to establish the pose constraint, and the inferred rough pose is further refined using graph optimization.
Next, a visual odometry apparatus for estimating a target pose according to an embodiment of the present disclosure will be described with reference to fig. 6.
Fig. 6 is a block diagram illustrating the main configuration of a visual odometry apparatus for estimating a target pose according to an embodiment of the present disclosure. As shown in fig. 6, the visual odometry apparatus 600 of this embodiment mainly includes: an obtaining component 610 for obtaining a first number of current frames whose target poses are to be estimated, a second number of history frames immediately preceding the first of the current frames and for which pose estimation has already been performed, and a subsequent frame immediately following the last of the current frames; an inference component 620 for inferring the target pose in the current frames from the pose estimation results of the history frames; a calculation component 630 for calculating the pose change between the target pose of one of the history frames and the target pose of the subsequent frame as a constraint condition; and an optimization component 640 for optimizing the target pose in the current frames based on the constraint condition.
In one embodiment, the inference component 620 can include: a model obtaining unit 621 (not shown in the figure) configured to obtain a local motion model according to the pose estimation result of the historical frame; and a pose calculation unit 622 (not shown in the figure) for calculating a pose of the target in the current frame based on the local motion model.
In one embodiment, the model obtaining unit 621 may calculate a motion vector according to feature point matching between adjacent frames in the historical frames; obtaining a local motion direction category according to the motion vector by utilizing a pre-trained classifier; selecting a corresponding local motion model based on the local motion direction category; and solving parameters of the local motion model by using the attitude estimation result of the historical frame to obtain a local motion model.
In one embodiment, the model obtaining component 621 may calculate the motion vector as follows: and obtaining mutually matched feature points between adjacent frames in the historical frames, transforming the mutually matched feature points into a world coordinate system according to the camera parameters and the target postures of the frames in which the mutually matched feature points are located, and calculating motion vectors between the mutually matched feature points in the world coordinate system.
In one embodiment, the calculation component 630 may calculate the pose change as follows: performing feature point matching on one frame in the historical frames and the subsequent frames; and calculating the attitude change between the target attitude of one frame in the historical frames and the target attitude of the subsequent frame based on the feature point matching result.
In one embodiment, the calculation component 630 may calculate the pose change as follows: according to camera parameters and the target postures of the frames where the matched feature points are located, the matched feature points in one frame and the subsequent frame in the historical frame are transformed into a world coordinate system; and calculating the rotation and translation quantity between one frame in the historical frames and the subsequent frame according to the matched characteristic points in the two frames in the world coordinate system as the posture change.
In one embodiment, the pose computation component 622 can utilize the local motion model to compute the target pose in the subsequent frame. The optimization component 640 can then optimize the target pose as follows: establishing a pose graph with the computed target pose in the current frames and the target pose in the subsequent frame as nodes and the constraint condition as edges; and optimizing the pose graph with a graph optimization algorithm to obtain the optimized target pose.
In an embodiment, the visual odometry apparatus 600 may further include a smoothing unit 650 (not shown in the figure) for calculating an average value of changes in the target poses of all frames in the current frame, and smoothing pose changes between adjacent frames in the current frame based on the average value to obtain a smoothed target pose.
In one embodiment, the first number of current frames may be one or more frames and the second number of historical frames may be at least two frames.
It is readily understood that the obtaining component 610, the inferring component 620, the calculating component 630, the optimizing component 640, and the optional smoothing component 650 in the visual odometry apparatus 600 of this embodiment may be configured by a Central Processing Unit (CPU) of the apparatus 600. Alternatively, the obtaining means 610, the inferring means 620, the calculating means 630, the optimizing means 640, and the optional smoothing means 650 may also be configured by a dedicated processing unit in the apparatus 600, such as an Application Specific Integrated Circuit (ASIC) or the like. That is, the obtaining component 610, the inferring component 620, the calculating component 630, the optimizing component 640, and the optional smoothing component 650 may be configured, for example, by hardware, software, firmware, and any feasible combination thereof.
For simplicity, only those components of the visual odometry apparatus 600 that are germane to the present disclosure are shown in fig. 6. The apparatus 600 may also include other modules, such as input-output components, display components, communication components, and the like. It may further comprise a storage device for storing, in a volatile or non-volatile manner, the images, data, obtained results, commands, intermediate data, etc. involved in the above-described processing; the storage device may include various volatile or non-volatile memories such as random access memory (RAM), read-only memory (ROM), a hard disk, or semiconductor memory. In addition, components such as buses and input/output interfaces are omitted from the drawing. The visual odometry apparatus 600 may include any other suitable components, depending on the particular application.
Next, an apparatus for estimating a target posture of another embodiment of the present disclosure is described with reference to fig. 7.
Fig. 7 is a block diagram illustrating a main configuration of an apparatus for estimating a target posture according to another embodiment of the present disclosure.
As shown in fig. 7, the apparatus 700 for estimating a target pose of the present embodiment mainly includes a memory 710, a processor 720, an input/output device (e.g., keyboard, mouse, speaker, etc.) 730, a display device 740, and the like, which are interconnected by a bus system 750 and/or other form of connection mechanism (not shown). It should be noted that the components and configuration of the device 700 shown in fig. 7 are exemplary only, and not limiting, and that the device 700 may have other components and configurations as desired. For example, device 700 may also have an image acquisition component, such as a camera, for acquiring images of the target scene.
Memory 710 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, EPROM memory, EEPROM memory, and the like. The computer-readable storage media may also include registers, a hard disk, a floppy disk, a solid-state disk, a removable disk, a CD-ROM, a DVD-ROM, a Blu-ray disc, and the like. One or more computer program instructions may be stored on these media and executed by processor 720 to implement the desired functionality.
Processor 720 may be a central processing unit (CPU) or another form of processing unit having data processing capabilities and/or instruction execution capabilities, including but not limited to one or more processors or microprocessors, and may be coupled to memory 710 to execute the computer program instructions stored therein to perform the following: obtaining a first number of current frames whose target poses are to be estimated, a second number of history frames immediately preceding the first of the current frames and for which pose estimation has already been performed, and a subsequent frame immediately following the last of the current frames; inferring the target pose in the current frames from the pose estimation results of the history frames; calculating the pose change between the target pose of one of the history frames and the target pose of the subsequent frame as a constraint condition; and optimizing the target pose in the current frames based on the constraint condition.
Furthermore, embodiments of the present invention also provide a computer-readable storage medium having stored thereon computer program instructions that, when executed by a computer, perform any of the embodiments of the visual odometry method for estimating a target pose described above with reference to fig. 1 to 5.
As described above, the computer-readable storage medium may include, for example, volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), a hard disk, flash memory, EPROM memory, EEPROM memory, and the like. The computer-readable storage medium may also include registers, hard disk, floppy disk, solid state disk, removable disk, CD-ROM, DVD-ROM, Blu-ray disk, and the like.
The visual odometry method, apparatus and computer-readable storage medium for estimating a target pose according to embodiments of the present disclosure have been described above with reference to figs. 1 to 7.
According to the present disclosure, the target pose in the current frames is inferred from the pose estimation results of history frames on which pose estimation has previously been performed, and the pose change between the target pose of a history frame and the target pose of a subsequent frame following the current frames is calculated as a constraint condition, with which the target pose in the current frames is optimized. Target pose estimation can therefore be performed even for those image frames that have low image quality and on which feature point matching is difficult, so that the visual odometer can operate normally in complex scenes, improving the robustness and accuracy of the visual odometer.
It should be noted that, in the present specification, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
Also, as used herein, including in the claims, "or" as used in a list of items beginning with "at least one" indicates a separate list, such that a list of "A, B or at least one of C" means a or B or C, or AB or AC or BC, or ABC (i.e., a and B and C). Furthermore, the word "exemplary" does not mean that the described example is preferred or better than other examples.
Further, it should be noted that each component or each step described in the present specification may be decomposed and/or recombined. These decompositions and/or recombinations are to be regarded as equivalents of the present invention. Also, the steps of executing the series of processes described above may naturally be executed chronologically in the order described, but need not necessarily be executed chronologically. Some steps may be performed in parallel or independently of each other.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present invention may be implemented by software plus a necessary hardware platform, and may also be implemented by hardware entirely. With this understanding in mind, all or part of the technical solutions of the present invention that contribute to the background can be embodied in the form of a software product, which can be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes instructions for causing a computer device (which can be a personal computer, a server, or a network device, etc.) to execute the methods according to the embodiments or some parts of the embodiments of the present invention.
In embodiments of the present invention, the units/modules may be implemented in software for execution by various types of processors. An identified module of executable code may, for instance, comprise one or more physical or logical blocks of computer instructions which may, for instance, be constructed as an object, procedure, or function. Nevertheless, the executables of an identified module need not be physically located together, but may comprise disparate instructions stored in different bits which, when joined logically together, comprise the unit/module and achieve the stated purpose for the unit/module.
Where a unit/module can be implemented in software, given the level of existing hardware technology, those skilled in the art could also, cost aside, build corresponding hardware circuits to implement the corresponding functions; such hardware circuits include conventional very-large-scale integration (VLSI) circuits or gate arrays and existing semiconductors such as logic chips and transistors, or other discrete components. A module may also be implemented in programmable hardware devices such as field-programmable gate arrays, programmable array logic, programmable logic devices, or the like.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the invention. Thus, the present invention is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A visual odometry method for estimating a target pose, comprising:
obtaining a first number of current frames whose target poses are to be estimated, a second number of history frames immediately preceding the first of the current frames and for which pose estimation has already been performed, and a subsequent frame immediately following the last of the current frames;
obtaining a local motion model according to the attitude estimation result of the historical frame, and calculating the target attitude in the current frame based on the local motion model;
calculating the attitude change between the target attitude of one frame in the historical frames and the target attitude of the subsequent frame as a constraint condition;
calculating a target pose in a subsequent frame using the local motion model;
establishing a posture graph by taking the calculated target posture in the current frame and the target posture in the subsequent frame as nodes and the constraint condition as edges; and
and optimizing the attitude map by using a map optimization algorithm to obtain an optimized target attitude.
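By way of illustration only (this is not the claimed implementation), the pose-graph step of claim 1 can be sketched in Python under strongly simplifying assumptions: poses are reduced to 3-D translations, so each graph node is a frame position, each edge is a relative-motion constraint (including the historical-frame-to-subsequent-frame constraint), and the graph is optimized as a linear least-squares problem. All function and variable names here are hypothetical.

    import numpy as np

    def optimize_pose_graph(n_nodes, edges, dim=3):
        """Nodes: frame positions t_0..t_{n-1}, with t_0 fixed at the origin.
        Edges: (i, j, delta) constraints meaning t_j - t_i should equal delta."""
        A = np.zeros((len(edges) * dim, (n_nodes - 1) * dim))
        b = np.zeros(len(edges) * dim)
        for k, (i, j, delta) in enumerate(edges):
            rows = slice(k * dim, (k + 1) * dim)
            if i > 0:
                A[rows, (i - 1) * dim:i * dim] = -np.eye(dim)
            if j > 0:
                A[rows, (j - 1) * dim:j * dim] = np.eye(dim)
            b[rows] = delta
        x, *_ = np.linalg.lstsq(A, b, rcond=None)
        return np.vstack([np.zeros(dim), x.reshape(n_nodes - 1, dim)])

    # Two chained motion-model predictions plus one direct constraint from
    # a historical frame (node 0) to the subsequent frame (node 2).
    edges = [(0, 1, np.array([1.0, 0.0, 0.0])),
             (1, 2, np.array([1.1, 0.0, 0.0])),
             (0, 2, np.array([2.0, 0.0, 0.0]))]
    print(optimize_pose_graph(3, edges))

A full implementation would optimize 6-DoF poses on SE(3), typically with a graph-optimization framework such as g2o or GTSAM, but the node/edge/least-squares structure is the same.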
2. The method of claim 1, wherein obtaining a local motion model from the pose estimation results of the historical frames comprises:
calculating motion vectors from feature point matches between adjacent frames among the historical frames;
obtaining a local motion direction category from the motion vectors using a pre-trained classifier;
selecting a corresponding local motion model based on the local motion direction category; and
solving for the parameters of the local motion model using the pose estimation results of the historical frames.
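As an illustrative sketch only: the claim does not fix the classifier or the family of motion models, so the code below assumes two hypothetical direction categories ("straight" and "turning"), a stand-in pre-trained classifier exposing a scikit-learn-style predict() method, a linear constant-velocity model for the first category, and a planar circular-arc model for the second.

    import numpy as np

    def infer_poses_from_motion_model(history_positions, n_current, classifier):
        """history_positions: (m, 3) estimated positions of the historical
        frames (m >= 3 for the turning branch); classifier: pre-trained
        stand-in with a scikit-learn-style predict() method."""
        deltas = np.diff(history_positions, axis=0)          # motion vectors
        category = classifier.predict(deltas.mean(axis=0).reshape(1, -1))[0]
        heading = np.arctan2(deltas[-1, 1], deltas[-1, 0])
        if category == "straight":
            velocity = deltas.mean(axis=0)                   # linear-model parameter
            step = lambda p, h: (p + velocity, h)
        else:
            headings = np.arctan2(deltas[:, 1], deltas[:, 0])
            dtheta = np.diff(headings).mean()                # arc-model parameters
            speed = np.linalg.norm(deltas, axis=1).mean()
            step = lambda p, h: (p + speed * np.array([np.cos(h + dtheta),
                                                       np.sin(h + dtheta), 0.0]),
                                 h + dtheta)
        poses, p = [], history_positions[-1]
        for _ in range(n_current):
            p, heading = step(p, heading)
            poses.append(p)
        return np.array(poses)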
3. The method of claim 2, wherein calculating motion vectors from feature point matches between adjacent frames among the historical frames comprises:
obtaining mutually matched feature points between adjacent frames among the historical frames;
transforming the matched feature points into a world coordinate system according to the camera parameters and the target poses of the frames in which the matched feature points are located; and
calculating the motion vectors between the mutually matched feature points in the world coordinate system.
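A minimal sketch of this transformation, assuming the matched feature points have already been back-projected to 3-D coordinates in each camera's frame (i.e., the camera intrinsics were applied upstream) and each target pose is given as a camera-to-world rotation matrix R and translation vector t:

    import numpy as np

    def motion_vectors_in_world(pts_cam_a, pts_cam_b, pose_a, pose_b):
        """pts_cam_*: (n, 3) matched 3-D feature points in camera coordinates;
        pose = (R, t): camera-to-world rotation and translation."""
        Ra, ta = pose_a
        Rb, tb = pose_b
        world_a = pts_cam_a @ Ra.T + ta   # features of frame A in world coords
        world_b = pts_cam_b @ Rb.T + tb   # the same features seen in frame B
        return world_b - world_a          # one motion vector per matched feature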
4. The method of any of claims 1-3, wherein calculating the pose change between the target pose of one of the historical frames and the target pose of the subsequent frame comprises:
performing feature point matching between one of the historical frames and the subsequent frame; and
calculating the pose change between the target pose of that historical frame and the target pose of the subsequent frame based on the feature point matching result.
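The claims do not name a particular detector or matcher; as one common concrete choice, the following sketch uses OpenCV's ORB features with brute-force Hamming matching (function name and parameters are illustrative):

    import cv2

    def match_features(img_hist, img_subsequent, max_matches=200):
        # ORB keypoints with binary descriptors; brute-force Hamming matching
        # with cross-check to suppress one-sided (asymmetric) matches.
        orb = cv2.ORB_create(nfeatures=1000)
        kp1, des1 = orb.detectAndCompute(img_hist, None)
        kp2, des2 = orb.detectAndCompute(img_subsequent, None)
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
        pts1 = [kp1[m.queryIdx].pt for m in matches[:max_matches]]
        pts2 = [kp2[m.trainIdx].pt for m in matches[:max_matches]]
        return pts1, pts2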
5. The method of claim 4, wherein calculating the pose change between the target pose of one of the historical frames and the target pose of the subsequent frame based on the feature point matching result comprises:
transforming the matched feature points in the historical frame and the subsequent frame into a world coordinate system according to the camera parameters and the target poses of the frames in which the matched feature points are located; and
calculating, from the matched feature points in the two frames, the rotation and translation between the historical frame and the subsequent frame in the world coordinate system as the pose change.
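One standard way to realize this step is the Kabsch/SVD method for the least-squares rigid transform between two matched 3-D point sets; the sketch below assumes the two point arrays are already expressed in the world coordinate system and row-aligned by match.

    import numpy as np

    def relative_pose(world_a, world_b):
        """Least-squares rigid transform (R, t) such that
        world_b ~ world_a @ R.T + t, via the Kabsch/SVD method."""
        ca, cb = world_a.mean(axis=0), world_b.mean(axis=0)
        H = (world_a - ca).T @ (world_b - cb)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:   # repair an improper (reflection) solution
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = cb - R @ ca
        return R, t

In practice this estimate is usually wrapped in a RANSAC loop so that outlier matches do not corrupt the recovered rotation and translation.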
6. The method of claim 1, further comprising:
calculating the average of the target pose changes over all of the current frames; and
smoothing the pose changes between adjacent frames among the current frames based on the average to obtain smoothed target poses.
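A minimal sketch of such smoothing, under the assumption that only the translational part of the pose is smoothed and that a single blending weight trades the raw inter-frame changes against their mean:

    import numpy as np

    def smooth_positions(positions, weight=0.5):
        """Blend each inter-frame change toward the mean change over the
        current frames; weight = 1 reproduces the raw trajectory."""
        deltas = np.diff(positions, axis=0)
        blended = weight * deltas + (1 - weight) * deltas.mean(axis=0)
        return np.vstack([positions[:1],
                          positions[0] + np.cumsum(blended, axis=0)])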
7. The method of claim 1, wherein the first number of current frames is one or more frames and the second number of historical frames is at least two frames.
8. A visual odometry apparatus for estimating a target pose, comprising:
an obtaining component configured to obtain a first number of current frames whose target poses are to be estimated, a second number of historical frames immediately preceding the first of the current frames for which pose estimation has already been performed, and a subsequent frame immediately following the last of the current frames;
an inference component configured to obtain a local motion model from the pose estimation results of the historical frames and to calculate the target poses in the current frames based on the local motion model;
a calculation component configured to calculate the pose change between the target pose of one of the historical frames and the target pose of the subsequent frame as a constraint condition, and to calculate the target pose in the subsequent frame using the local motion model; and
an optimization component configured to establish a pose graph with the calculated target poses in the current frames and the target pose in the subsequent frame as nodes and the constraint condition as an edge, and to optimize the pose graph using a graph optimization algorithm to obtain optimized target poses.
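Purely to illustrate the decomposition of claim 8 (none of these interfaces come from the patent), the four components can be wired together as plain Python callables:

    class VisualOdometryPipeline:
        """Hypothetical wiring of the four components of claim 8; the
        injected callables could be the illustrative sketches given above."""
        def __init__(self, obtain, infer, calculate, optimize):
            self.obtain, self.infer = obtain, infer
            self.calculate, self.optimize = calculate, optimize

        def step(self):
            current, history, subsequent = self.obtain()
            model, current_poses = self.infer(history)
            constraint, subsequent_pose = self.calculate(history, subsequent, model)
            return self.optimize(current_poses, subsequent_pose, constraint)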
9. An apparatus for estimating a target pose, comprising:
a memory storing computer program instructions; and
a processor coupled to the memory and configured to execute the computer program instructions to perform operations comprising:
obtaining a first number of current frames whose target poses are to be estimated, a second number of historical frames immediately preceding the first of the current frames for which pose estimation has already been performed, and a subsequent frame immediately following the last of the current frames;
obtaining a local motion model from the pose estimation results of the historical frames, and calculating the target poses in the current frames based on the local motion model;
calculating the pose change between the target pose of one of the historical frames and the target pose of the subsequent frame as a constraint condition;
calculating the target pose in the subsequent frame using the local motion model;
establishing a pose graph with the calculated target poses in the current frames and the target pose in the subsequent frame as nodes and the constraint condition as an edge; and
optimizing the pose graph using a graph optimization algorithm to obtain optimized target poses.
10. A computer-readable storage medium storing computer program instructions which, when executed, perform a process comprising:
obtaining a first number of current frames whose target poses are to be estimated, a second number of historical frames immediately preceding the first of the current frames for which pose estimation has already been performed, and a subsequent frame immediately following the last of the current frames;
obtaining a local motion model from the pose estimation results of the historical frames, and calculating the target poses in the current frames based on the local motion model;
calculating the pose change between the target pose of one of the historical frames and the target pose of the subsequent frame as a constraint condition;
calculating the target pose in the subsequent frame using the local motion model;
establishing a pose graph with the calculated target poses in the current frames and the target pose in the subsequent frame as nodes and the constraint condition as an edge; and
optimizing the pose graph using a graph optimization algorithm to obtain optimized target poses.
CN201710639962.5A 2017-07-31 2017-07-31 Visual odometry method, device and computer-readable storage medium Active CN109323709B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710639962.5A 2017-07-31 2017-07-31 Visual odometry method, device and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN109323709A (en) 2019-02-12
CN109323709B (en) 2022-04-08

Family

ID=65244917

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710639962.5A Visual odometry method, device and computer-readable storage medium 2017-07-31 2017-07-31 Active

Country Status (1)

Country Link
CN (1) CN109323709B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110170167B * 2019-05-28 2023-02-28 Shanghai miHoYo Network Technology Co., Ltd. Picture display method, device, equipment and medium
CN112766023B * 2019-11-04 2024-01-19 Beijing Horizon Robotics Technology R&D Co., Ltd. Method, device, medium and equipment for determining gesture of target object
CN111157757A * 2019-12-27 2020-05-15 Suzhou Botian Automation Technology Co., Ltd. Vision-based crawler speed detection device and method
CN112509047A * 2020-12-10 2021-03-16 Beijing Horizon Information Technology Co., Ltd. Image-based pose determination method and device, storage medium and electronic equipment
CN113689497A * 2021-08-11 2021-11-23 Arashi Vision Inc. (Insta360) Pose optimization method, device, equipment and storage medium
CN116503958B * 2023-06-27 2023-10-03 Jiangxi Normal University Human body posture recognition method, system, storage medium and computer equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111351495A * 2015-02-10 2020-06-30 Mobileye Vision Technologies Ltd. Server system, method and machine-readable medium
EP3118814A1 (en) * 2015-07-15 2017-01-18 Thomson Licensing Method and apparatus for object tracking in image sequences

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105938619A * 2016-04-11 2016-09-14 China University of Mining and Technology Visual odometer realization method based on fusion of RGB and depth information
CN106556412A * 2016-11-01 2017-04-05 Harbin Engineering University RGB-D visual odometry method considering ground-surface constraints in indoor environments
CN106780484A * 2017-01-11 2017-05-31 Shandong University Robot inter-frame pose estimation method based on convolutional neural network feature descriptors

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on an indoor positioning algorithm based on sequential image block matching; Cai Shengli et al.; Computer Measurement & Control; 2010-07-25; Vol. 18, No. 7; pp. 1641-1644 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant