CN115371699B - Visual inertial odometer method and device and electronic equipment - Google Patents

Visual inertial odometer method and device and electronic equipment

Info

Publication number
CN115371699B
CN115371699B (granted publication) · Application CN202111162776.XA
Authority
CN
China
Prior art keywords
points
luminosity
pose
optimization
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111162776.XA
Other languages
Chinese (zh)
Other versions
CN115371699A (en)
Inventor
郑泽玲
高军强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cloudminds Beijing Technologies Co Ltd
Original Assignee
Cloudminds Beijing Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cloudminds Beijing Technologies Co Ltd filed Critical Cloudminds Beijing Technologies Co Ltd
Priority to CN202111162776.XA priority Critical patent/CN115371699B/en
Priority to PCT/CN2022/109502 priority patent/WO2023051019A1/en
Publication of CN115371699A publication Critical patent/CN115371699A/en
Application granted granted Critical
Publication of CN115371699B publication Critical patent/CN115371699B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/165Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • G01C21/1656Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments with passive imaging devices, e.g. cameras
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/28Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network with correlation of data from several navigational instruments
    • G01C21/30Map- or contour-matching
    • G01C21/32Structuring or formatting of map data
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C22/00Measuring distance traversed on the ground by vehicles, persons, animals or other moving solid bodies, e.g. using odometers, using pedometers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10052Images from lightfield camera
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30244Camera pose
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Automation & Control Theory (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure provides a visual inertial odometry method and device and an electronic apparatus, wherein the method includes the following steps: fusing pose information of at least one sensor to obtain a first initial pose; performing photometric optimization on the first initial pose according to the pose transformation between two frames of images to obtain a second initial pose; performing feature point optimization on the second initial pose by extracting and matching feature points in the images to generate map points; creating a back-end tight coupling optimization according to the correspondence between the map points and the corner points; constructing a feature point constraint matrix and a photometric constraint matrix through the back-end tight coupling optimization; and splicing the feature point constraint matrix and the photometric constraint matrix to obtain a final constraint matrix. In this method, the feature point method and the photometric method are loosely coupled at the front end, and a VIO approach that tightly couples the feature point method and the photometric method is adopted for back-end optimization, so that positioning and three-dimensional reconstruction in different scenes can be satisfied and the robustness of the algorithm is improved.

Description

Visual inertial odometer method and device and electronic equipment
Technical Field
The present disclosure relates to the field of SLAM (Simultaneous Localization and Mapping), and more particularly, to a visual inertial odometry (VIO, Visual-Inertial Odometry) method, apparatus, and electronic device.
Background
With the development of positioning and navigation, and in particular the wide application of SLAM technology, a robot can start moving from an unknown position in an unknown environment, localize itself during motion according to position estimates and a map, and simultaneously build an incremental map on the basis of this self-localization, thereby realizing autonomous positioning and navigation.
Different scenes impose different requirements on SLAM mapping, VO (Visual Odometry) positioning and three-dimensional reconstruction. Current visual SLAM can be divided into two categories according to principle: feature point method SLAM and photometric SLAM. The photometric method establishes an objective function from the photometric errors of points with large pixel gradient changes in the image, and solves for the camera pose and map point positions by optimization. The feature point method first extracts feature points and descriptors in the image, then determines the matching relationship between feature points according to the descriptors, and establishes an objective function from the projection errors between the matched points. The photometric method has low requirements on scene texture but is sensitive to illumination changes, while the feature point method is insensitive to illumination changes but has high texture requirements. How to construct a SLAM algorithm that meets the requirements of different scenes is a problem that needs to be addressed in the art.
Disclosure of Invention
In order to solve the above technical problems and satisfy positioning and three-dimensional reconstruction under various lighting scenes, an object of the embodiments of the present disclosure is to provide a visual inertial odometry method, device, electronic apparatus and storage medium.
According to a first aspect of the present disclosure, embodiments of the present disclosure provide a method of a visual inertial odometer, comprising:
fusing pose information of at least one sensor to obtain a first initial pose;
performing photometric optimization on the first initial pose according to the pose transformation between two frames of images to obtain a second initial pose;
performing feature point optimization on the second initial pose by extracting and matching feature points in the image to generate map points;
creating back-end tight coupling optimization according to the corresponding relation between the map points and the corner points;
constructing a feature point constraint matrix and a photometric constraint matrix through the back-end tight coupling optimization;
and splicing the feature point constraint matrix and the photometric constraint matrix to obtain a final constraint matrix.
Further, the fusing the pose information of the at least one sensor to obtain a first initial pose includes:
rotation information in the pose information is obtained by rotation of an inertial sensor (IMU, inertial Measurement Unit);
Obtaining displacement information in pose information by utilizing translation of a wheel type odometer;
and carrying out information fusion on the rotation information and the displacement information to obtain the first initial pose.
Further, the performing photometric optimization on the first initial pose according to the pose transformation between two frames of images to obtain a second initial pose includes:
acquiring an image by an image sensor;
converting the color image into a gray scale image;
extracting pixel points with pixel gradient change larger than a certain threshold value from the gray image;
obtaining the position in the next frame of image through optical flow tracking and constructing a photometric error function;
constructing a nonlinear least square function according to the error function;
and optimizing the first initial pose by using the nonlinear least square function to obtain a second initial pose.
Further, the photometric error function is expressed as follows:
r = ω_h( I_2[x_2] - (a_21·I_1[x_1] + b_21) )
where r represents the photometric error, ω_h is the Huber weight, I_1 and I_2 represent two adjacent grayscale images, x_1 and x_2 are the pixel coordinates of the spatial point X in the two images, x_2 is obtained by projecting x_1, and a, b are the photometric affine transformation parameters.
Obtaining x_2 by projecting x_1 requires the relative pose transformation ξ_21 between the two frames and the inverse depth of x_1 in I_1.
Further, the nonlinear least squares function is expressed as follows:
ξ* = argmin_ξ (1/2)·Σ_{i=1..N} ||r_i(ξ)||², and the increment of the variables to be optimized is solved from the normal equation (Σ_i J_iᵀ·J_i)·Δξ = -Σ_i J_iᵀ·r_i, where J represents the Jacobian matrix, ξ represents the Lie algebra of the pose, r represents the photometric error, and i and N are natural numbers.
Further, the step of optimizing the feature points of the second initial pose by extracting and matching the feature points in the image to generate map points includes:
converting the color image into a gray scale image;
extracting pixel points with pixel gradient change larger than a certain threshold value from the gray image as characteristic points of the image;
screening out characteristic points with pixel gradient change larger than a second threshold value from the characteristic points of the image as corner points;
performing association matching with the data of the second initial pose through the descriptors of the corner points;
constructing a reprojection error function according to the data after the association matching;
performing feature point optimization on the second initial pose according to a reprojection error function;
and generating map points according to the second initial pose after feature point optimization and the corner points.
Further, the reprojection error function is expressed as follows:
ξ* = argmin_ξ (1/2)·Σ_{i=1..N} || u_i - (1/s_i)·K·exp(ξ^)·P_i ||², where ξ* is the camera pose that minimizes the reprojection error, u_i is a known 2D homogeneous coordinate, s_i represents the scale, K is the camera intrinsic matrix, ξ is the unknown camera pose, P_i is the known homogeneous coordinate of the 3D point, and i and N are natural numbers.
Further, the creating the back-end tight coupling optimization according to the correspondence between the map points and the corner points includes:
creating a back-end tight coupling optimization by whether the map points are generated by the corner points;
if the map points are generated by the corner points, the back-end tight coupling optimization comprises characteristic point constraint and luminosity constraint, and the characteristic point constraint and the luminosity constraint are tightly coupled;
and if the map points are not formed by the corner points, only the photometric constraint is included, and the update amount of the variables to be optimized is solved by nonlinear least squares.
Further, the constructing the feature point constraint matrix and the luminosity constraint matrix through the back-end tight coupling optimization includes:
respectively obtaining the derivatives of the feature point constraint and the photometric constraint with respect to the variables to be optimized according to the photometric error function and the reprojection error function;
and respectively constructing a characteristic point constraint matrix and a luminosity constraint matrix according to the derivative.
Further, the splicing of the feature point constraint matrix and the photometric constraint matrix to obtain a final constraint matrix includes:
Adding the same variable parts to be optimized in the characteristic point constraint matrix and the luminosity constraint matrix to obtain the update quantity of the variable to be optimized;
and performing elimination treatment on the update quantity of the variable to be optimized to obtain a final constraint matrix.
Further, the method further comprises:
updating variables to be optimized in the characteristic point constraint matrix and the luminosity constraint matrix; comprising the following steps:
solving an update amount by whether the map points are formed by corner points or not;
if the map points are generated by the corner points, the back-end tight coupling optimization contains both feature point constraints and photometric constraints; if the map points are not formed by the corner points, it contains only photometric constraints;
and the increments are added directly to the variables to be optimized, while the rotation increment is multiplied onto the rotation information in the second initial pose.
Further, the variables to be optimized comprise camera internal parameters, pose, feature point inverse depth and photometric affine transformation parameters.
In a second aspect, another embodiment of the present disclosure provides a visual inertial odometer device, comprising:
the fusion module is used for fusing the pose information of at least one sensor to obtain a first initial pose;
the photometric optimization module is used for performing photometric optimization on the first initial pose according to the pose transformation between two frames of images to obtain a second initial pose;
the feature point optimization module is used for performing feature point optimization on the second initial pose by extracting and matching feature points in the image to generate map points;
the close coupling optimization module is used for creating back-end close coupling optimization according to the corresponding relation between the map points and the corner points;
the matrix construction module is used for constructing a characteristic point constraint matrix and a luminosity constraint matrix through the back-end tight coupling optimization;
and the matrix splicing module is used for splicing the feature point constraint matrix and the photometric constraint matrix to obtain a final constraint matrix.
In a third aspect, another embodiment of the present disclosure provides an electronic device, including:
a memory for storing computer readable instructions; and
a processor configured to execute the computer readable instructions to cause the electronic device to implement the method of any one of the first aspect.
In a fourth aspect, another embodiment of the present disclosure provides a non-transitory computer-readable storage medium storing computer-readable instructions that, when executed by a computer, cause the computer to implement the method of any one of the first aspects.
The embodiments of the disclosure disclose a visual inertial odometry method, device, electronic apparatus and computer readable storage medium, wherein the method includes the following steps: fusing pose information of at least one sensor to obtain a first initial pose; performing photometric optimization on the first initial pose according to the pose transformation between two frames of images to obtain a second initial pose; performing feature point optimization on the second initial pose by extracting and matching feature points in the images to generate map points; creating a back-end tight coupling optimization according to the correspondence between the map points and the corner points; constructing a feature point constraint matrix and a photometric constraint matrix through the back-end tight coupling optimization; and splicing the feature point constraint matrix and the photometric constraint matrix to obtain a final constraint matrix. In this method, the feature point method and the photometric method in visual SLAM are effectively combined: the two are loosely coupled at the front end, and a VIO approach that tightly couples the feature point method and the photometric method is adopted for back-end optimization, so that positioning and three-dimensional reconstruction in different scenes can be satisfied and the robustness of the algorithm is improved.
The method takes into account the complementary effects of the feature point method and the photometric method in different environments (illumination changes and weak texture) and the intrinsic relationship between them, and addresses the robustness problem of existing approaches that run the two methods separately and simply stitch them together. The method has good stability in scenes with intense illumination changes but rich texture, or with little illumination change but weak texture, and can be applied to the fields of positioning and navigation.
The foregoing description is only an overview of the disclosed technology, and may be implemented in accordance with the disclosure of the present disclosure, so that the above-mentioned and other objects, features and advantages of the present disclosure can be more clearly understood, and the following detailed description of the preferred embodiments is given with reference to the accompanying drawings.
Drawings
FIG. 1 is a schematic flow diagram of a method for a visual inertial odometer according to an embodiment of the disclosure;
FIG. 2 is a schematic diagram of a reprojection error of a feature point method according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a back-end optimization constraint factor provided by an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a Hessian matrix in back-end optimization provided by an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a feature point method constraint matrix according to an embodiment of the present disclosure;
FIG. 6 is a diagram illustrating the elimination of a Hessian constraint matrix provided by an embodiment of the present disclosure;
FIG. 7 is a block-diagram flow chart of the visual inertial odometry method provided by an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a visual inertial odometer device provided by another embodiment of the disclosure;
fig. 9 is a schematic structural diagram of an electronic device according to another embodiment of the present disclosure.
Detailed Description
In order to more clearly describe the technical contents of the present invention, a further description will be made below in connection with specific embodiments.
The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims.
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure have been shown in the accompanying drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but are provided to provide a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are intended to be open-ended, i.e., including, but not limited to. The term "based on" is based at least in part on. The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments. Related definitions of other terms will be given in the description below.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that references to "one", "a" and "a plurality" in this disclosure are intended to be illustrative rather than limiting, and those of ordinary skill in the art will appreciate that, unless the context clearly indicates otherwise, "a" or "an" should be understood as "one or more".
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The embodiments disclosed are described in detail below with reference to the accompanying drawings.
Visual SLAM is increasingly applied in the fields of autonomous driving and intelligent robotics, and the requirements on the accuracy and robustness of positioning and three-dimensional reconstruction are increasingly high. Visual SLAM is mainly divided into the feature point method and the photometric (direct) method. The feature point method mainly relies on feature points in the image to realize data association and depends strongly on scene texture. The photometric method operates directly on pixels but is relatively sensitive to illumination changes. Neither can run robustly in all practical scenes, which often involve severe illumination changes and insufficient scene texture. However, the two methods are complementary: if they can be fused together well, the robustness of the visual SLAM algorithm can be improved to a certain extent.
Among SLAM algorithms, there are not many that combine the feature point method and the photometric method. The two methods can be run in two independent threads, with the photometric method re-initializing the feature point method when the latter gets lost in a weak-texture scene, and the feature point method mainly used to eliminate accumulated errors; this provides no help in the case of illumination changes, and combining the two methods in such a loosely coupled manner cannot make good use of their complementarity. Therefore, in order to combine the advantages of the two methods and achieve a complementary effect, the present disclosure proposes a VIO method that combines the feature point method and the photometric method, so as to improve the robustness of the VIO algorithm.
The method takes into account the complementary effects of the feature point method and the photometric method in different environments (illumination changes and weak texture) and the intrinsic relationship between them, and addresses the robustness problem of existing approaches that run the two methods separately and simply stitch them together. The method has good stability in scenes with intense illumination changes but rich texture, or with little illumination change but weak texture, and can be applied to the fields of positioning and navigation.
Fig. 1 is a schematic flow chart of a visual inertial odometry method according to an embodiment of the disclosure. The method may be implemented by a visual inertial odometer device, which may be implemented as software, or as a combination of software and hardware; the device may be integrated inside an electronic device or a robot and implemented by a processor of the electronic device or by a robot control system. As shown in fig. 1, the method comprises the following steps:
step S101: and fusing pose information of at least one sensor to obtain a first initial pose.
In step S101, the sensors in the present disclosure are a camera, an inertial sensor (IMU, Inertial Measurement Unit) and a wheel odometer. The IMU can predict a relatively accurate pose (more accurate rotation information) through pre-integration over a short time, and the wheel odometer can also provide a pose (more accurate displacement information); fusing the pose information of the two yields a better initial pose. In order to provide a better initial value for the subsequent photometric optimization, the IMU pre-integrated rotation and the wheel odometer translation are combined to obtain the first initial pose.
Specifically, fusing the pose information of at least one sensor to obtain a first initial pose includes: obtaining rotation information in the pose information by using the rotation of the IMU sensor; obtaining displacement information in the pose information by using the translation of the wheel odometer; and fusing the rotation information and the displacement information to obtain the first initial pose.
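As an illustrative aid (not part of the patent text), the fusion of the IMU rotation with the wheel-odometer translation can be sketched as follows in Python; the function name, the matrix layout and the assumption that the pre-integrated rotation is available as a 3x3 matrix are hypothetical:

```python
import numpy as np

def fuse_initial_pose(R_imu: np.ndarray, t_wheel: np.ndarray) -> np.ndarray:
    """Combine the rotation predicted by IMU pre-integration with the
    translation measured by the wheel odometer into one SE(3) pose.

    R_imu   : 3x3 rotation matrix from IMU pre-integration
    t_wheel : translation vector (3,) from the wheel odometer
    """
    T = np.eye(4)
    T[:3, :3] = R_imu   # the more accurate rotation comes from the IMU
    T[:3, 3] = t_wheel  # the more accurate displacement comes from the wheel odometer
    return T
```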
Step S102: and performing photometric optimization on the initial pose according to pose transformation between the two frames of images to obtain a second initial pose.
In step S102, an image is acquired by an image sensor (e.g., a camera). After the image information is acquired, the color image is first converted into a grayscale image, points with large pixel gradient changes are extracted, their positions in the next frame image are obtained by optical flow tracking, a photometric error function is constructed, and the initial pose is optimized by nonlinear least squares. Since the pose transformation between the two frames has already been predicted by the IMU and the wheel odometer, the position of a point from the previous image in the subsequent image can be estimated, and the photometric error function is then constructed.
Specifically, the performing photometric optimization on the first initial pose according to the pose transformation between two frames of images to obtain a second initial pose includes: acquiring an image by an image sensor; converting the color image into a grayscale image; extracting pixel points whose pixel gradient change is larger than a certain threshold from the grayscale image; obtaining their positions in the next frame image through optical flow tracking and constructing a photometric error function; constructing a nonlinear least squares function according to the error function; and optimizing the first initial pose by using the nonlinear least squares function to obtain the second initial pose.
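A minimal sketch of this front-end step, assuming OpenCV is used for the gradient computation and for pyramidal Lucas-Kanade optical flow; the threshold value and the function names are illustrative only:

```python
import cv2
import numpy as np

def select_high_gradient_points(gray: np.ndarray, grad_thresh: float = 30.0) -> np.ndarray:
    """Return (x, y) pixel coordinates whose gradient magnitude exceeds grad_thresh."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    mag = np.sqrt(gx * gx + gy * gy)
    ys, xs = np.where(mag > grad_thresh)
    return np.stack([xs, ys], axis=1).astype(np.float32)

def track_points(gray1: np.ndarray, gray2: np.ndarray, pts1: np.ndarray):
    """Track the selected points into the next frame with pyramidal LK optical flow."""
    pts2, status, _ = cv2.calcOpticalFlowPyrLK(gray1, gray2, pts1.reshape(-1, 1, 2), None)
    ok = status.ravel() == 1
    return pts1[ok], pts2.reshape(-1, 2)[ok]

# hypothetical usage, assuming frame1 and frame2 are consecutive color frames:
# gray1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
# gray2 = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)
# pts1 = select_high_gradient_points(gray1)
# p1, p2 = track_points(gray1, gray2, pts1)
```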
The photometric error function is expressed as follows:
r = ω_h( I_2[x_2] - (a_21·I_1[x_1] + b_21) )    (1)
where r represents the photometric error, ω_h is the Huber weight, I_1 and I_2 represent two adjacent grayscale images, x_1 and x_2 are the pixel coordinates of the spatial point X in the two images, x_2 is obtained by projecting x_1, and a, b are the photometric affine transformation parameters.
Obtaining x_2 by projecting x_1 requires the relative pose transformation ξ_21 between the two frames and the inverse depth of x_1 in I_1.
The nonlinear least squares function is expressed as follows:
ξ* = argmin_ξ (1/2)·Σ_{i=1..N} ||r_i(ξ)||², and the increment of the variables to be optimized is solved from the normal equation (Σ_i J_iᵀ·J_i)·Δξ = -Σ_i J_iᵀ·r_i, where J represents the Jacobian matrix, ξ represents the Lie algebra of the pose, r represents the photometric error, and i and N are natural numbers.
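The per-point photometric residual of equation (1) can be sketched as follows; nearest-pixel lookup is used here for brevity (a real implementation would interpolate sub-pixel intensities), and the Huber threshold is an assumed value:

```python
import numpy as np

def huber_weight(r: float, delta: float = 9.0) -> float:
    """Huber robust weight applied to the raw photometric residual."""
    return 1.0 if abs(r) <= delta else delta / abs(r)

def photometric_residual(I1, I2, x1, x2, a21: float, b21: float) -> float:
    """r = w_h * ( I2[x2] - (a21 * I1[x1] + b21) ), following equation (1).

    I1, I2   : grayscale images as 2-D arrays
    x1, x2   : integer pixel coordinates (u, v) in I1 and I2
    a21, b21 : photometric affine transformation parameters between the frames
    """
    raw = float(I2[x2[1], x2[0]]) - (a21 * float(I1[x1[1], x1[0]]) + b21)
    return huber_weight(raw) * raw
```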
Step S103: and carrying out feature point optimization on the second initial pose by extracting and matching the feature points in the image, and generating map points.
In step S103, the photometric optimization in steps S101 and S102 yields a relatively accurate pose, i.e. the second initial pose, but the tracking process is made more accurate by further optimization with the feature point method. Feature extraction and matching are first required. Points whose pixel gradient change is larger than a first threshold are extracted as feature points, and among them the feature points whose pixel gradient change is larger than a second threshold (the second threshold being larger than the first) are further screened out as corner points. Data association is then implemented by matching the descriptors of the extracted corner points. Feature points in the image are extracted and their descriptors are matched; finally, feature point reprojection optimization is carried out to obtain a more accurate pose, and map points are generated using this pose.
Specifically, the performing feature point optimization on the second initial pose by extracting and matching feature points in the image to generate map points includes: converting the color image into a gray scale image; extracting pixel points with pixel gradient change larger than a certain threshold value from the gray image as characteristic points of the image; screening out characteristic points with pixel gradient change larger than a second threshold value from the characteristic points of the image as corner points; performing association matching with the data of the second initial pose through the description of the corner points; constructing a reprojection error function according to the data after the association matching; performing feature point optimization on the second initial pose according to a reprojection error function; and generating map points according to the second initial pose after feature point optimization and the corner points.
Although the second initial pose after photometric optimization is relatively accurate, an error still exists; therefore, a reprojection error function is constructed after feature matching is completed, as shown in fig. 2, which shows a schematic diagram of the feature point method reprojection error provided by an embodiment of the present disclosure.
A reprojection error function is constructed from the relationship shown in the figure, in the same way as the photometric error function; its expression is as follows:
ξ* = argmin_ξ (1/2)·Σ_{i=1..N} || u_i - (1/s_i)·K·exp(ξ^)·P_i ||², where ξ* is the camera pose that minimizes the reprojection error, u_i is a known 2D homogeneous coordinate, s_i represents the scale, K is the camera intrinsic matrix, ξ is the unknown camera pose, P_i is the known homogeneous coordinate of the 3D point, and i and N are natural numbers. After the reprojection error function is optimized, the front-end tracking is more accurate.
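A sketch of one reprojection residual term, assuming the pose is held as a 4x4 world-to-camera matrix and the scale s_i is the depth of the transformed point; the names and data layout are illustrative:

```python
import numpy as np

def reprojection_residual(u: np.ndarray, K: np.ndarray, T_cw: np.ndarray, P_w: np.ndarray) -> np.ndarray:
    """2-D reprojection error of one map point observation.

    u    : observed pixel coordinate (2,), the known 2D measurement
    K    : 3x3 camera intrinsic matrix
    T_cw : 4x4 camera pose (world -> camera), the quantity being optimized
    P_w  : 3D map point in world coordinates (3,)
    """
    P_c = T_cw[:3, :3] @ P_w + T_cw[:3, 3]  # transform the point into the camera frame
    p = K @ P_c
    proj = p[:2] / p[2]                     # divide by the scale (depth) s_i
    return u - proj
```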
Step S104: and creating back-end tight coupling optimization according to the corresponding relation between the map points and the corner points.
In step S104, in the embodiment of the present disclosure, the attributes of the map points are first described, referring specifically to fig. 3, which shows a schematic diagram of the back-end optimization constraint factors provided by an embodiment of the present disclosure. In the figure, L_1 and L_2 represent map points, the open circles represent feature point reprojection constraint factors, and the filled circles represent photometric constraint factors. T_h is the host frame (the frame from which the inverse depth of the map point comes) and T_t is the target frame. It can be seen from the figure that the map point L_1 is generated by a corner point and therefore carries two constraints: a photometric constraint factor and a reprojection constraint factor. L_2 is generated by an ordinary point and therefore carries only one constraint factor: a photometric constraint factor. Therefore, in the back-end optimization, a map point generated by a corner point carries both constraint factors, and the feature point method constraints and photometric constraints are tightly coupled together.
In this embodiment, the back-end optimization problem is constructed according to whether the map points are generated by corner points. If the map points are generated by corner points, the back-end optimization includes both the feature point constraint and the photometric constraint, and the corresponding parts of the Hessian matrices constructed from the two are added to obtain the final constraint matrix. If the map points are not formed by corner points, only the photometric constraint is included, and the update amount of the variables to be optimized is solved by nonlinear least squares.
Specifically, the creating of the back-end tight coupling optimization according to the correspondence between the map points and the corner points includes: creating the back-end tight coupling optimization according to whether the map points are generated by the corner points; if the map points are generated by the corner points, the back-end tight coupling optimization includes both the feature point constraint and the photometric constraint, and the two are tightly coupled; and if the map points are not formed by the corner points, only the photometric constraint is included, and the update amount of the variables to be optimized is solved by nonlinear least squares. The variables to be optimized include the camera intrinsic parameters, the pose, the feature point inverse depth and the photometric affine transformation parameters.
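The rule that decides which constraint factors a map point carries can be illustrated as follows; the data structure is an assumption made for illustration, not the patent's implementation:

```python
from dataclasses import dataclass, field

@dataclass
class MapPoint:
    inv_depth: float
    from_corner: bool              # True if the point was generated from a corner point
    constraints: list = field(default_factory=list)

def attach_constraints(point: MapPoint) -> MapPoint:
    """Corner-generated map points carry both constraints (tight coupling);
    ordinary high-gradient points carry only the photometric constraint."""
    point.constraints = ["photometric"]
    if point.from_corner:
        point.constraints.append("reprojection")
    return point
```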
Step S105: and constructing a characteristic point constraint matrix and a luminosity constraint matrix through the back-end tight coupling optimization.
In step S105, in the embodiment of the present disclosure, the variables to be optimized in the feature point method are: the camera intrinsic parameters, the pose, and the inverse depth of the landmark points. In addition to those of the feature point method, the variables to be optimized in the photometric method also contain the photometric affine transformation parameters a, b.
Specifically, the construction of the feature point constraint matrix and the photometric constraint matrix through the back-end tight coupling optimization includes: respectively obtaining the derivatives of the feature point constraint and the photometric constraint with respect to the variables to be optimized according to the photometric error function and the reprojection error function; and respectively constructing a feature point constraint matrix and a photometric constraint matrix according to the derivatives.
The mathematical expressions of the photometric error function and the feature point method reprojection error function have been given in the previous steps, so the derivatives of the error functions with respect to the variables to be optimized can be derived in order to construct the Hessian constraint matrix shown in fig. 4. If the map points are generated by corner points, the back-end optimization includes both the feature point constraint and the photometric constraint, and the corresponding parts of the Hessian matrices constructed from the two are added to obtain the final constraint matrix.
The content of the constraint matrix is shown in fig. 5, where p represents the inverse depth of the map point, C represents the camera intrinsic parameters, and x represents the camera pose. The transpose of H_px is equal to H_xp. The H_xx and b_x parts in the figure are filled taking only x_m as the host frame and x_t as the target frame as an example; if the host and target are two other frames, the constraints need to be accumulated into the parts of H_xx and b_x corresponding to those frame numbers. For example, if the host frame is x_1 and the target frame is x_t, the constraint of this part needs to be added to the corresponding parts of H_xx and b_x.
In the same way, the photometric constraint also yields a matrix of a similar form; compared with the feature point method constraint, the photometric constraint has two additional photometric affine transformation parameters.
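How one constraint's Jacobians are accumulated into a Gauss-Newton (Hessian-approximation) block can be sketched as follows; the use of Huber weights in a diagonal weight matrix and the dense layout are simplifying assumptions:

```python
import numpy as np

def gauss_newton_block(J: np.ndarray, r: np.ndarray, w: np.ndarray):
    """Accumulate one constraint's contribution to the normal equations.

    J : N x M stacked Jacobians of the residuals w.r.t. the variables to optimize
    r : N residuals
    w : N robust (Huber) weights
    Returns H = J^T W J and b = J^T W r.
    """
    W = np.diag(w)
    H = J.T @ W @ J
    b = J.T @ W @ r
    return H, b
```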
Step S106: and splicing the feature point matrix and the luminosity constraint matrix to obtain a final constraint matrix.
In step S106, the difference in the variables to be optimized between the photometric constraint and the reprojection constraint is the photometric affine transformation parameters. In the tightly coupled back-end optimization, the back-end optimization problem is constructed according to whether the map points are generated by corner points. If the map points are generated by corner points, the back-end optimization includes both the feature point constraint and the photometric constraint; a feature point constraint matrix and a photometric constraint matrix are constructed simultaneously, and their corresponding parts are added. If the map points are not formed by corner points, only the photometric constraint is included, and the update amount of the variables to be optimized is solved by nonlinear least squares. Fig. 6 shows the elimination of the Hessian constraint matrix provided by an embodiment of the present disclosure, which is used to eliminate and solve for the update amount of the variables to be optimized; H and b are obtained from the constraint matrices, the photometric constraint yielding H_guang and b_guang and the feature point constraint yielding H_point and b_point:
H_guang·x_guang = -b_guang    (4)
H_point·x_point = -b_point    (5)
H_final·x_final = -b_final, where H_final = H_guang + H_point and b_final = b_guang + b_point    (6)
Note that the '+' above means that the parts corresponding to derivatives of the residuals with respect to the same variables to be optimized are added. x_point and x_guang represent the update amounts of the variables to be optimized.
Specifically, the splicing of the feature point constraint matrix and the photometric constraint matrix to obtain a final constraint matrix includes: adding the parts of the feature point constraint matrix and the photometric constraint matrix that correspond to the same variables to be optimized, so as to obtain the update amount of the variables to be optimized; and performing elimination on the update amount of the variables to be optimized to obtain the final constraint matrix.
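A sketch of the splicing of equations (4) to (6): the feature point constraint matrix is added onto the photometric constraint matrix at the indices of the shared variables to be optimized, and the update is solved from the final system. A dense solve is shown for brevity; an actual implementation would first eliminate (marginalize) the inverse-depth block, and the variable ordering assumed here is hypothetical:

```python
import numpy as np

def splice_and_solve(H_photo, b_photo, H_feat, b_feat, idx_shared):
    """Add the two constraint systems on their shared variables and solve
    H_final * dx = -b_final for the update of the variables to optimize.

    H_photo, b_photo : photometric constraint system (assumed to be the larger one)
    H_feat,  b_feat  : feature point constraint system
    idx_shared       : index of each row/column of H_feat inside H_photo
    """
    H_final = H_photo.copy()
    b_final = b_photo.copy()
    H_final[np.ix_(idx_shared, idx_shared)] += H_feat  # add parts for the same variables
    b_final[idx_shared] += b_feat
    dx = np.linalg.solve(H_final, -b_final)            # update of the variables to optimize
    return dx
```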
Further, in addition to the above steps S101 to S106, the visual inertial odometry method in the embodiment of the present disclosure further includes: updating the variables to be optimized in the feature point constraint matrix and the photometric constraint matrix. The specific steps of the updating process are as follows: the update amount is solved according to whether the map points are formed by corner points; if the map points are generated by the corner points, the back-end tight coupling optimization contains both feature point constraints and photometric constraints, and if the map points are not formed by the corner points, it contains only photometric constraints; the increments are then added directly to the variables to be optimized, while the rotation increment is multiplied onto the rotation information in the second initial pose.
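The update rule just described (additive increments for most variables, a multiplicative update for the rotation) can be sketched as follows; the state layout and the use of an SO(3) exponential map are assumptions made for illustration:

```python
import numpy as np

def so3_exp(phi: np.ndarray) -> np.ndarray:
    """Exponential map from an so(3) increment to a rotation matrix (Rodrigues formula)."""
    theta = np.linalg.norm(phi)
    if theta < 1e-12:
        return np.eye(3)
    a = phi / theta
    A = np.array([[0.0, -a[2], a[1]], [a[2], 0.0, -a[0]], [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(theta) * A + (1.0 - np.cos(theta)) * (A @ A)

def apply_update(state: dict, dx: dict) -> dict:
    """Add increments directly to intrinsics, inverse depths, affine parameters and
    translation; multiply the rotation increment onto the rotation of the pose."""
    state["intrinsics"] += dx["intrinsics"]
    state["inv_depths"] += dx["inv_depths"]
    state["affine_ab"]  += dx["affine_ab"]
    state["t"]          += dx["t"]
    state["R"] = so3_exp(dx["phi"]) @ state["R"]
    return state
```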
Fig. 7 is a block-diagram flow chart of the visual inertial odometry method provided by an embodiment of the present disclosure. As shown in the figure, and in combination with the flow of fig. 1, the sensors in the electronic device are used first; in this figure they are a camera, an inertial sensor (IMU, Inertial Measurement Unit) and a wheel odometer. The IMU can predict a relatively accurate pose (more accurate rotation information) through pre-integration over a short time, and the wheel odometer can likewise provide a pose (more accurate displacement information); the pose information of the two is fused to obtain a better initial pose. Therefore, in order to provide a better initial value for the subsequent photometric optimization, the IMU pre-integrated rotation and the wheel odometer translation are combined to obtain the first initial pose. An image is acquired by an image sensor (for example, a camera), and photometric optimization is performed on the obtained first initial pose according to the pose transformation between two frames of the acquired images: the acquired color image is converted into a grayscale image, points with large pixel gradient changes are extracted, their positions in the next frame image are obtained by optical flow tracking, a photometric error function is constructed, and the initial pose is optimized by nonlinear least squares to obtain the optimized second initial pose; the positions of points from the previous image in the subsequent image are estimated from the pose transformation between the two frames, and the photometric error function is then constructed. Feature extraction and matching are then performed for the second initial pose: points whose pixel gradient change is larger than a first threshold are extracted as feature points, and among them the feature points whose pixel gradient change is larger than a second threshold (the second threshold being larger than the first) are further screened out as corner points. Feature points in the image are extracted and their descriptors are matched, and feature point reprojection optimization is carried out to obtain a more accurate pose, with which map points are generated.
After the map points are generated, a back-end tight coupling optimization is created according to the correspondence between the map points and the corner points, and the back-end optimization problem is constructed according to whether the map points are generated by corner points. If the map points are generated by corner points, the back-end optimization includes both the feature point constraint and the photometric constraint; a feature point constraint matrix and a photometric constraint matrix are constructed at the same time, and their corresponding parts are added. If the map points are not formed by corner points, only the photometric constraint is included, and the update amount of the variables to be optimized is solved by nonlinear least squares. The error terms in the matrices relating to the same variables to be optimized are added to obtain the final constraint matrix. After the final constraint matrix is obtained, the increments of the variables to be optimized are solved for iteratively, thereby updating the variables to be optimized.
Fig. 8 shows a schematic view of a visual inertial odometer device provided in another embodiment of the disclosure. The device comprises: a fusion module 801, a luminosity optimization module 802, a feature point optimization module 803, a close coupling optimization module 804, a matrix construction module 805 and a matrix splicing module 806. Wherein:
the fusion module 801 is configured to fuse the pose information of at least one sensor to obtain a first initial pose. The module is specifically used for: obtaining rotation information in the pose information by using the rotation of the IMU sensor; obtaining displacement information in the pose information by using the translation of the wheel odometer; and fusing the rotation information and the displacement information to obtain the first initial pose. In order to provide a better initial value for the subsequent photometric optimization, the IMU pre-integrated rotation and the wheel odometer translation are combined to obtain the first initial pose.
The photometric optimization module 802 is configured to perform photometric optimization on the first initial pose according to the pose transformation between two frames of images, so as to obtain a second initial pose.
The module is specifically used for: acquiring an image by an image sensor; converting the color image into a gray scale image; extracting pixel points with pixel gradient change larger than a certain threshold value from the gray image; obtaining the position in the next frame of image through optical flow tracking and constructing a photometric error function; constructing a nonlinear least square function according to the error function; and optimizing the first initial pose by using the nonlinear least square function to obtain a second initial pose.
The photometric error function is expressed as follows:
r = ω_h( I_2[x_2] - (a_21·I_1[x_1] + b_21) )    (1)
where r represents the photometric error, ω_h is the Huber weight, I_1 and I_2 represent two adjacent grayscale images, x_1 and x_2 are the pixel coordinates of the spatial point X in the two images, x_2 is obtained by projecting x_1, and a, b are the photometric affine transformation parameters.
Obtaining x_2 by projecting x_1 requires the relative pose transformation ξ_21 between the two frames and the inverse depth of x_1 in I_1.
The nonlinear least squares function is expressed as follows:
ξ* = argmin_ξ (1/2)·Σ_{i=1..N} ||r_i(ξ)||², and the increment of the variables to be optimized is solved from the normal equation (Σ_i J_iᵀ·J_i)·Δξ = -Σ_i J_iᵀ·r_i, where J represents the Jacobian matrix, ξ represents the Lie algebra of the pose, r represents the photometric error, and i and N are natural numbers.
The feature point optimization module 803 is configured to perform feature point optimization on the second initial pose by extracting and matching feature points in the image, so as to generate map points.
The second initial pose is obtained after photometric optimization; feature extraction and matching are then performed on it. Points whose pixel gradient change is larger than a first threshold are extracted as feature points, and among them the feature points whose pixel gradient change is larger than a second threshold (the second threshold being larger than the first) are further screened out as corner points. Data association is then implemented by matching the descriptors of the extracted corner points. Feature points in the image are extracted and their descriptors are matched; finally, feature point reprojection optimization is carried out to obtain a more accurate pose, and map points are generated using this pose.
The feature point optimization module is specifically configured to: convert the color image into a grayscale image; extract pixel points whose pixel gradient change is larger than a certain threshold from the grayscale image as feature points of the image; screen out feature points whose pixel gradient change is larger than a second threshold from the feature points of the image as corner points; perform association matching with the data of the second initial pose through the descriptors of the corner points; construct a reprojection error function according to the data after association matching; perform feature point optimization on the second initial pose according to the reprojection error function; and generate map points according to the second initial pose after feature point optimization and the corner points.
The close-coupling optimization module 804 is configured to create a back-end close-coupling optimization according to the correspondence between the map points and the corner points.
The back-end tight coupling optimization module constructs the back-end optimization problem according to whether the map points are generated by corner points. If the map points are generated by corner points, the back-end optimization includes both the feature point constraint and the photometric constraint, and the corresponding parts of the Hessian matrices constructed from the two are added to obtain the final constraint matrix. If the map points are not formed by corner points, only the photometric constraint is included, and the update amount of the variables to be optimized is solved by nonlinear least squares.
The module is specifically used for: creating the back-end tight coupling optimization according to whether the map points are generated by the corner points; if the map points are generated by the corner points, the back-end tight coupling optimization includes both the feature point constraint and the photometric constraint, and the two are tightly coupled; and if the map points are not formed by the corner points, only the photometric constraint is included, and the update amount of the variables to be optimized is solved by nonlinear least squares. The variables to be optimized include the camera intrinsic parameters, the pose, the feature point inverse depth and the photometric affine transformation parameters.
The matrix construction module 805 is configured to construct a feature point constraint matrix and a luminosity constraint matrix through the back-end tight coupling optimization.
The module is specifically used for: respectively obtaining the derivatives of the feature point constraint and the photometric constraint with respect to the variables to be optimized according to the photometric error function and the reprojection error function; and respectively constructing a feature point constraint matrix and a photometric constraint matrix according to the derivatives.
The mathematical expressions of the photometric error function and the feature point reprojection error function have been described above, so the derivatives of the error functions with respect to the variables to be optimized can be derived in order to construct the Hessian constraint matrix shown in fig. 4. If the map points are generated by corner points, the back-end optimization includes both the feature point constraint and the photometric constraint, and the corresponding parts of the Hessian matrices constructed from the two are added to obtain the final constraint matrix.
The matrix splicing module 806 is configured to splice the feature point matrix and the luminosity constraint matrix to obtain a final constraint matrix.
The module is specifically used for: constructing a back-end optimization problem by whether the map points are generated by corner points, if the map points are generated by the corner points, the back-end optimization comprises feature point constraint and luminosity constraint, and simultaneously constructing a feature point constraint matrix and a luminosity constraint matrix, and adding corresponding parts of the feature point constraint matrix and the luminosity constraint matrix; and if the map points are not formed by the corner points, only luminosity constraint is contained, and the updating quantity of the variable to be optimized is solved through nonlinear least square. Adding the same variable parts to be optimized in the characteristic point constraint matrix and the luminosity constraint matrix to obtain the update quantity of the variable to be optimized; and performing elimination treatment on the update quantity of the variable to be optimized to obtain a final constraint matrix.
In addition, the visual odometer device further comprises:
and an updating module, used for updating the variables to be optimized in the feature point constraint matrix and the luminosity constraint matrix.
The updating module is specifically used for: solving the update amount according to whether the map points are generated by corner points; if the map points are generated by the corner points, the back-end tight coupling optimization contains the feature point constraint and the luminosity constraint, and if not, only the luminosity constraint; the update is then applied by direct incremental addition to the variables to be optimized, while the rotation information in the second initial pose is updated by multiplication.
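The split between additive updates and the multiplicative rotation update can be sketched as follows; the state layout and the use of the SO(3) exponential map (Rodrigues formula) are illustrative assumptions rather than the prescribed implementation:

import numpy as np

def so3_exp(phi):
    """Rodrigues formula: map a small rotation vector to a rotation matrix."""
    theta = np.linalg.norm(phi)
    if theta < 1e-12:
        return np.eye(3)
    k = phi / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def apply_update(state, dx):
    """Apply the solved update amount to the variables to be optimized."""
    # Additive (incremental) updates for intrinsics, translation,
    # feature point inverse depth and photometric affine parameters.
    state["intrinsics"] += dx["intrinsics"]
    state["translation"] += dx["translation"]
    state["inv_depth"] += dx["inv_depth"]
    state["affine"] += dx["affine"]
    # Multiplicative update of the rotation information in the second initial pose.
    state["rotation"] = state["rotation"] @ so3_exp(dx["rotation"])
    return state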
The apparatus shown in fig. 8 may perform the method of the embodiment shown in fig. 1. For the parts of this embodiment that are not described in detail, as well as for the implementation process and the technical effect of this technical solution, reference is made to the description of the embodiment shown in fig. 1, which is not repeated here.
Referring now to fig. 9, a schematic diagram of an electronic device 900 suitable for use in implementing another embodiment of the present disclosure is shown. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 9 is merely an example, and should not impose any limitations on the functionality and scope of use of embodiments of the present disclosure.
As shown in fig. 9, the electronic device 900 may include a processing means (e.g., a central processor, a graphics processor, etc.) 901, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 902 or a program loaded from a storage means 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data necessary for the operation of the electronic device 900 are also stored. The processing apparatus 901, the ROM 902, and the RAM 903 are connected to each other via a communication line 904. An input/output (I/O) interface 905 is also connected to the communication line 904.
In general, the following devices may be connected to the I/O interface 905: input devices 906 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 907 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 908 including, for example, magnetic tape, hard disk, etc.; and a communication device 909. The communication means 909 may allow the electronic device 900 to communicate wirelessly or by wire with other devices to exchange data. While fig. 9 shows an electronic device 900 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 909, or installed from the storage device 908, or installed from the ROM 902. When the computer program is executed by the processing device 901, the above-described functions defined in the methods of the embodiments of the present disclosure are performed.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some implementations, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: the interaction method in the above embodiment is performed.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including, but not limited to, object oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. Wherein the names of the units do not constitute a limitation of the units themselves in some cases.
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform any one of the methods of the first aspect.
According to one or more embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium, characterized in that the non-transitory computer-readable storage medium stores computer instructions for causing a computer to perform any of the methods of the foregoing first aspect.
The foregoing description is only of the preferred embodiments of the present disclosure and an explanation of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the disclosure is not limited to the specific combinations of the features described above, but also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure, for example, technical solutions formed by mutually substituting the above features with (but not limited to) technical features having similar functions disclosed in the present disclosure.

Claims (13)

1. A method of visual odometry, comprising:
fusing pose information of at least one sensor to obtain a first initial pose;
performing photometric optimization on the initial pose according to pose transformation between two frames of images to obtain a second initial pose;
performing feature point optimization on the second initial pose by extracting and matching feature points in the image to generate map points;
creating back-end tight coupling optimization according to the corresponding relation between the map points and the corner points, comprising: creating the back-end tight coupling optimization according to whether the map points are generated by the corner points; if the map points are generated by the corner points, the back-end tight coupling optimization comprises a feature point constraint and a luminosity constraint, and the feature point constraint and the luminosity constraint are tightly coupled; if the map points are not generated by the corner points, only the luminosity constraint is included, and an update amount of variables to be optimized is solved by nonlinear least squares;
constructing a feature point constraint matrix and a luminosity constraint matrix through the back-end tight coupling optimization;
and splicing the feature point constraint matrix and the luminosity constraint matrix to obtain a final constraint matrix.
2. The method of claim 1, wherein fusing pose information of at least one sensor to obtain a first initial pose comprises:
obtaining rotation information in the pose information by utilizing rotation of the inertial sensor;
obtaining displacement information in pose information by utilizing translation of a wheel type odometer;
and carrying out information fusion on the rotation information and the displacement information to obtain the first initial pose.
3. The method of claim 1, wherein the performing photometric optimization on the initial pose according to pose transformation between two frames of images to obtain a second initial pose comprises:
acquiring an image by an image sensor;
converting the color image into a gray scale image;
extracting pixel points with pixel gradient change larger than a certain threshold value from the gray image;
obtaining the positions of the pixel points in the next frame of image through optical flow tracking and constructing a photometric error function;
constructing a nonlinear least square function according to the error function;
and optimizing the first initial pose by using the nonlinear least square function to obtain a second initial pose.
4. A method according to claim 3, wherein the photometric error function is expressed as follows:
r = ω_h ( I_2[x_2] − ( a_21 I_1[x_1] + b_21 ) )
wherein r represents the photometric error, ω_h is the Huber weight, I_1 and I_2 represent two adjacent gray scale images, x_1 and x_2 are the pixel coordinates of the spatial point X in the two images, x_2 is obtained by projecting x_1, and a_21 and b_21 are the photometric affine transformation parameters.
5. A method according to claim 3, wherein the nonlinear least squares function is expressed as follows:
wherein J represents the Jacobian matrix, ξ represents the Lie algebra of the pose, r represents the photometric error, and i and N are natural numbers.
6. The method of claim 1, wherein the performing feature point optimization on the second initial pose by extracting and matching feature points in the image to generate map points comprises:
converting the color image into a gray scale image;
extracting pixel points with pixel gradient change larger than a certain threshold value from the gray image as feature points of the image;
screening out, from the feature points of the image, feature points with pixel gradient change larger than a second threshold value as corner points;
performing association matching with the data of the second initial pose through the descriptors of the corner points;
constructing a reprojection error function according to the data after the association matching;
performing feature point optimization on the second initial pose according to a reprojection error function;
and generating map points according to the second initial pose after feature point optimization and the corner points.
7. The method of claim 6, wherein the re-projection error function is expressed as follows:
wherein ξ* is the pose obtained by minimizing the reprojection error, u_i is a known 2D homogeneous coordinate, s_i represents the scale, K is the camera internal parameter matrix, ξ is the unknown camera pose, P_i is the known homogeneous coordinate of the 3D point, and i and N are natural numbers.
8. The method of claim 1, wherein the constructing a feature point constraint matrix and a luminosity constraint matrix through the back-end tight coupling optimization comprises:
obtaining the derivatives of the feature point constraint and the luminosity constraint with respect to the variables to be optimized according to the reprojection error function and the luminosity error function, respectively;
and constructing the feature point constraint matrix and the luminosity constraint matrix according to the derivatives, respectively.
9. The method according to claim 1, wherein the splicing the feature point constraint matrix and the luminosity constraint matrix to obtain a final constraint matrix comprises:
adding the parts of the feature point constraint matrix and the luminosity constraint matrix that correspond to the same variables to be optimized, to obtain the update amount of the variables to be optimized;
and performing elimination on the update amount of the variables to be optimized to obtain the final constraint matrix.
10. The method of claim 1, further comprising:
updating the variables to be optimized in the feature point constraint matrix and the luminosity constraint matrix, comprising:
solving an update amount according to whether the map points are generated by corner points;
wherein the back-end tight coupling optimization contains the feature point constraint and the luminosity constraint if the map points are generated by the corner points, and only the luminosity constraint if the map points are not generated by the corner points;
and directly performing incremental addition on the variables to be optimized, while updating the rotation information in the second initial pose by multiplication.
11. The method according to one of claims 8 to 10, characterized in that the variables to be optimized comprise camera internal parameters, pose, feature point inverse depth and photometric affine transformation parameters.
12. A visual odometer device, comprising:
the fusion module is used for fusing the pose information of at least one sensor to obtain a first initial pose;
the luminosity optimization module is used for performing luminosity optimization on the initial pose according to pose transformation between two frames of images to obtain a second initial pose;
The feature point optimization module is used for performing feature point optimization on the second initial pose by extracting and matching feature points in the image to generate map points;
the tight coupling optimization module is used for creating back-end tight coupling optimization according to the corresponding relation between the map points and the corner points, comprising: creating the back-end tight coupling optimization according to whether the map points are generated by the corner points; if the map points are generated by the corner points, the back-end tight coupling optimization comprises a feature point constraint and a luminosity constraint, and the feature point constraint and the luminosity constraint are tightly coupled; if the map points are not generated by the corner points, only the luminosity constraint is included, and an update amount of variables to be optimized is solved by nonlinear least squares;
the matrix construction module is used for constructing a characteristic point constraint matrix and a luminosity constraint matrix through the back-end tight coupling optimization;
and the matrix splicing module is used for splicing the feature point constraint matrix and the luminosity constraint matrix to obtain a final constraint matrix.
13. An electronic device, comprising:
a memory for storing computer readable instructions; and
a processor for executing the computer readable instructions to cause the electronic device to implement the method according to any one of claims 1-11.
CN202111162776.XA 2021-09-30 2021-09-30 Visual inertial odometer method and device and electronic equipment Active CN115371699B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111162776.XA CN115371699B (en) 2021-09-30 2021-09-30 Visual inertial odometer method and device and electronic equipment
PCT/CN2022/109502 WO2023051019A1 (en) 2021-09-30 2022-08-01 Visual-inertial odometry method and apparatus, electronic device, storage medium and computer program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111162776.XA CN115371699B (en) 2021-09-30 2021-09-30 Visual inertial odometer method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN115371699A CN115371699A (en) 2022-11-22
CN115371699B true CN115371699B (en) 2024-03-15

Family

ID=84060740

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111162776.XA Active CN115371699B (en) 2021-09-30 2021-09-30 Visual inertial odometer method and device and electronic equipment

Country Status (2)

Country Link
CN (1) CN115371699B (en)
WO (1) WO2023051019A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116389887A (en) * 2023-04-07 2023-07-04 北京拙河科技有限公司 Dynamic optimization-based light field camera configuration method and device
CN116188536B (en) * 2023-04-23 2023-07-18 深圳时识科技有限公司 Visual inertial odometer method and device and electronic equipment
CN117474180A (en) * 2023-12-28 2024-01-30 深圳市中远通电源技术开发有限公司 Regional power supply optimization system, method and medium based on power distribution cabinet adjustment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109387204A (en) * 2018-09-26 2019-02-26 东北大学 The synchronous positioning of the mobile robot of dynamic environment and patterning process in faced chamber
CN109523589A (en) * 2018-11-13 2019-03-26 浙江工业大学 A kind of design method of more robust visual odometry
CN109544636A (en) * 2018-10-10 2019-03-29 广州大学 A kind of quick monocular vision odometer navigation locating method of fusion feature point method and direct method
CN111983639A (en) * 2020-08-25 2020-11-24 浙江光珀智能科技有限公司 Multi-sensor SLAM method based on Multi-Camera/Lidar/IMU
CN112419497A (en) * 2020-11-13 2021-02-26 天津大学 Monocular vision-based SLAM method combining feature method and direct method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10964030B2 (en) * 2018-02-12 2021-03-30 Samsung Electronics Co., Ltd. Device and method with pose estimator based on current predicted motion state array
CN109166149B (en) * 2018-08-13 2021-04-02 武汉大学 Positioning and three-dimensional line frame structure reconstruction method and system integrating binocular camera and IMU
CN110345944A (en) * 2019-05-27 2019-10-18 浙江工业大学 Merge the robot localization method of visual signature and IMU information
CN110375738B (en) * 2019-06-21 2023-03-14 西安电子科技大学 Monocular synchronous positioning and mapping attitude calculation method fused with inertial measurement unit
CN112347840B (en) * 2020-08-25 2022-12-02 天津大学 Vision sensor laser radar integrated unmanned aerial vehicle positioning and image building device and method
CN112435325B (en) * 2020-09-29 2022-06-07 北京航空航天大学 VI-SLAM and depth estimation network-based unmanned aerial vehicle scene density reconstruction method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109387204A (en) * 2018-09-26 2019-02-26 东北大学 The synchronous positioning of the mobile robot of dynamic environment and patterning process in faced chamber
CN109544636A (en) * 2018-10-10 2019-03-29 广州大学 A kind of quick monocular vision odometer navigation locating method of fusion feature point method and direct method
CN109523589A (en) * 2018-11-13 2019-03-26 浙江工业大学 A kind of design method of more robust visual odometry
CN111983639A (en) * 2020-08-25 2020-11-24 浙江光珀智能科技有限公司 Multi-sensor SLAM method based on Multi-Camera/Lidar/IMU
CN112419497A (en) * 2020-11-13 2021-02-26 天津大学 Monocular vision-based SLAM method combining feature method and direct method

Also Published As

Publication number Publication date
WO2023051019A1 (en) 2023-04-06
CN115371699A (en) 2022-11-22

Similar Documents

Publication Publication Date Title
CN115371699B (en) Visual inertial odometer method and device and electronic equipment
Fathi et al. Automated as-built 3D reconstruction of civil infrastructure using computer vision: Achievements, opportunities, and challenges
Kneip et al. OpenGV: A unified and generalized approach to real-time calibrated geometric vision
CN110073313A (en) Using female equipment and at least one with equipment and environmental interaction
CN110660098B (en) Positioning method and device based on monocular vision
US11557083B2 (en) Photography-based 3D modeling system and method, and automatic 3D modeling apparatus and method
Li et al. Large-scale, real-time 3D scene reconstruction using visual and IMU sensors
CN108898669A (en) Data processing method, device, medium and calculating equipment
CN115272494B (en) Calibration method and device for camera and inertial measurement unit and computer equipment
CN113029128A (en) Visual navigation method and related device, mobile terminal and storage medium
CN113870379A (en) Map generation method and device, electronic equipment and computer readable storage medium
Chen et al. PALVO: visual odometry based on panoramic annular lens
CN112348887A (en) Terminal pose determining method and related device
CN113850859A (en) Methods, systems, articles, and apparatus for enhancing image depth confidence maps
CN115585818A (en) Map construction method and device, electronic equipment and storage medium
Ling et al. Real‐time dense mapping for online processing and navigation
Bao et al. Robust tightly-coupled visual-inertial odometry with pre-built maps in high latency situations
CN110378948B (en) 3D model reconstruction method and device and electronic equipment
CN113628284B (en) Pose calibration data set generation method, device and system, electronic equipment and medium
Lee et al. Confidence analysis of feature points for visual‐inertial odometry of urban vehicles
CN114119973A (en) Spatial distance prediction method and system based on image semantic segmentation network
Porzi et al. An automatic image-to-DEM alignment approach for annotating mountains pictures on a smartphone
CN116630436B (en) Camera external parameter correction method, camera external parameter correction device, electronic equipment and computer readable medium
CN112880675B (en) Pose smoothing method and device for visual positioning, terminal and mobile robot
CN116310059A (en) Method and device for determining material parameters, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant