CN114119885A - Image feature point matching method, device and system and map construction method and system - Google Patents

Image feature point matching method, device and system and map construction method and system Download PDF

Info

Publication number
CN114119885A
CN114119885A (Application CN202010801651.6A)
Authority
CN
China
Prior art keywords
image
feature point
image sensor
pose information
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010801651.6A
Other languages
Chinese (zh)
Inventor
牛思杰
庞涛
潘碧莹
沙通
杨婷婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd filed Critical China Telecom Corp Ltd
Priority to CN202010801651.6A priority Critical patent/CN114119885A/en
Publication of CN114119885A publication Critical patent/CN114119885A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05 Geographic models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Remote Sensing (AREA)
  • Computer Graphics (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure relates to an image feature point matching method, device and system. The image feature point matching method includes: acquiring accurate pose information of an image sensor at time t, where t is a positive integer; determining a plurality of first feature points of an i-th image captured by the image sensor at time t, together with the feature descriptor and pixel coordinates of each first feature point, where i is a positive integer; acquiring IMU data of the image sensor measured between time t and time t+1 by an inertial measurement unit (IMU) rigidly connected to the image sensor; determining estimated pose information of the image sensor at time t+1 according to the IMU data and the accurate pose information; determining, according to the pixel coordinates of each first feature point, the accurate pose information and the estimated pose information, the region in which a second feature point matching each first feature point is located in an (i+1)-th image captured by the image sensor at time t+1; and, within that region, determining the second feature point according to the feature descriptor of each first feature point.

Description

Image feature point matching method, device and system and map construction method and system
Technical Field
The present disclosure relates to the field of computer vision, and in particular, to a method, an apparatus, and a system for matching image feature points, a method and a system for constructing a map, and a computer-readable storage medium.
Background
The SLAM (Simultaneous Localization and Mapping) algorithm is an autonomous positioning and navigation technique. At present, SLAM algorithms are mainly applied to positioning and navigation of mobile robots or unmanned aerial vehicles, VR (Virtual Reality), AR (Augmented Reality) and mixed reality, and automatic driving. Depending on the sensor used, SLAM algorithms are mainly divided into laser SLAM algorithms based on lidar and visual SLAM algorithms based on visual sensors. At present, a visual SLAM algorithm mainly consists of three parts: front-end visual odometry, back-end nonlinear optimization, and global loop closure detection.
The principle of visual odometry is to calculate the pose information of the visual sensor from a large number of identical (i.e., matching) feature points present in adjacent image frames. The main work of visual odometry is to extract feature points from the image returned by the visual sensor, match them against the feature points of the previous frame, and calculate their three-dimensional space coordinates, thereby calculating the pose information of the visual sensor. The local motion trajectory of the visual sensor is generated by continuously tracking feature points across image frames.
The main work of the nonlinear optimization is to perform local bundle adjustment on the pose information corresponding to the keyframes of the visual odometry, correct accumulated errors, and solve for more accurate poses of the visual sensor and more accurate positions of the feature points in three-dimensional space.
The main work of loop detection is to perform loop closure detection on the pose trajectory and the keyframes, and to readjust the trajectory and the map so as to eliminate accumulated errors as much as possible.
In the related art, when matching a feature point of a previous frame against the current frame, global matching is performed over the entire image of the current frame in order to determine the feature point of the current frame that matches it.
Disclosure of Invention
The inventors have recognized the following: in the related art, the amount of computation for feature point matching is large, the efficiency of feature point matching is low, and matching errors occur easily.
In view of the above technical problems, the present disclosure provides a solution that can reduce the amount of computation and improve the efficiency and accuracy of feature point matching.
According to a first aspect of the present disclosure, there is provided an image feature point matching method, including: acquiring accurate pose information of the image sensor at a time t, wherein t is a positive integer; determining a plurality of first characteristic points, a characteristic descriptor and pixel coordinates of each first characteristic point of an ith image shot by the image sensor at the time t, wherein i is a positive integer; acquiring IMU data of the image sensor measured between t moment and t +1 moment by an inertial measurement unit IMU which is rigidly connected with the image sensor; determining estimated pose information of the image sensor at the t +1 moment according to the IMU data and the accurate pose information; determining an area where a second feature point matched with each first feature point in an i +1 th image shot by the image sensor at the t +1 moment is located according to the pixel coordinate of each first feature point, the accurate pose information and the estimated pose information; and in the region, determining the second characteristic point according to the characteristic descriptor of each first characteristic point.
In some embodiments, determining a region in which a second feature point matching the each first feature point in an i +1 th image captured by the image sensor at the time t +1 is located includes: determining the pixel coordinates of candidate feature points which are possibly matched with each first feature point in the (i + 1) th image of the image sensor according to the pixel coordinates of each first feature point, the accurate pose information and the estimated pose information; and determining a specified area in the i +1 th image by taking the pixel coordinates of the candidate feature points as the center as an area where a second feature point matched with each first feature point is located.
In some embodiments, determining the pixel coordinates of the candidate feature points in the i +1 th image that are likely to match the each first feature point comprises: determining the position coordinates of the three-dimensional space characteristic points of each first characteristic point in the three-dimensional space by using an imaging model according to the pixel coordinates of each first characteristic point and the accurate pose information; and determining the pixel coordinates of the candidate characteristic points according to the position coordinates of the three-dimensional space characteristic points and the estimated pose information.
In some embodiments, determining the pixel coordinates of the candidate feature point comprises: determining the position coordinates of the optical center and the focus of the image sensor at the t +1 moment according to the estimated pose information and the focal length of the image sensor; and determining the pixel coordinates of the candidate characteristic points according to the position coordinates of the three-dimensional space characteristic points, the position coordinates of the optical center of the image sensor at the moment of t +1 and the position coordinates of the focus.
In some embodiments, determining the pixel coordinates of the candidate feature point according to the position coordinates of the three-dimensional space feature point, the position coordinates of the optical center of the image sensor at the time t +1, and the position coordinates of the focal point comprises: calculating Euclidean distances between the three-dimensional space feature points, the optical centers and the focal points according to the position coordinates of the three-dimensional space feature points, the position coordinates of the optical centers and the position coordinates of the focal points; calculating the angle of an included angle between a line segment formed by the three-dimensional space characteristic point and the optical center and a line segment formed by the optical center and the focus according to the Euclidean distance; determining a product of a euclidean distance between the optical center and the focal point and a cosine value of the angle as a euclidean distance between the optical center and the candidate feature point; and determining the pixel coordinates of the candidate characteristic points according to the Euclidean distance between the optical center and the candidate characteristic points.
In some embodiments, the image feature point matching method further includes: preprocessing the ith image and the (i + 1) th image, wherein the preprocessing comprises at least one of color space conversion and scale conversion.
According to a second aspect of the present disclosure, there is provided an image feature point matching apparatus including: the first acquisition module is configured to acquire accurate pose information of the image sensor at a time t, wherein t is a positive integer; a first determination module configured to determine a plurality of first feature points, a feature descriptor of each first feature point, and pixel coordinates of an ith image captured by the image sensor at time t, i being a positive integer; a second acquisition module configured to acquire IMU data of the image sensor measured between time t and time t +1 by an inertial measurement unit IMU rigidly connected to the image sensor; a second determination module configured to determine estimated pose information of the image sensor at a time t +1 according to the IMU data and the accurate pose information; the third determining module is configured to determine a region where a second feature point matched with each first feature point in an i +1 th image shot by the image sensor at the t +1 moment is located according to the pixel coordinate of each first feature point, the accurate pose information and the estimated pose information; a fourth determining module configured to determine the second feature point according to the feature descriptor of each first feature point in the region.
According to a third aspect of the present disclosure, there is provided an image feature point matching apparatus including: a memory; and a processor coupled to the memory, the processor configured to perform the image feature point matching method of any of the above embodiments based on instructions stored in the memory.
According to a fourth aspect of the present disclosure, there is provided an image feature point matching system including: the image feature point matching device according to any one of the embodiments above; the image sensor is configured to shoot an ith image and an (i + 1) th image at the time t and the time t +1 during the moving process and send the ith image and the (i + 1) th image to the image feature point matching device, wherein both t and i are positive integers; and the inertial measurement unit IMU is rigidly connected with the image sensor and is configured to measure IMU data of the image sensor between the time t and the time t +1 and send the IMU data to the image feature point matching device.
According to a fifth aspect of the present disclosure, there is provided a map construction method including: acquiring multi-frame images shot by an image sensor at multiple moments in the moving process, wherein the multi-frame images comprise an ith image at the t moment and an i +1 th image at the t +1 moment, and both t and i are positive integers; determining a plurality of feature point pairs for the i-th image and the i + 1-th image using the image feature point matching method according to any one of claims 1 to 6, each feature point pair including a first feature point in the i-th image and a second feature point in the i + 1-th image that matches the first feature point; determining the accurate pose information of the image sensor at the t +1 th moment according to the accurate pose information of the image sensor at the t th moment and the plurality of characteristic point pairs; and constructing a map according to the accurate pose information of the plurality of moments and the multi-frame image.
In some embodiments, the map construction method further comprises: and carrying out nonlinear optimization and loop detection on the map.
In some embodiments, the multi-frame image is an image that satisfies a preset rule selected from all images captured by the image sensor during movement.
According to a sixth aspect of the present disclosure, there is provided a map construction system including: the acquisition module is configured to acquire a plurality of frames of images shot by the image sensor at a plurality of moments in the moving process, wherein the plurality of frames of images comprise an ith image at the t moment and an i +1 image at the t +1 moment, and both t and i are positive integers; a first determining module configured to determine, for the ith image and the (i + 1) th image, a plurality of feature point pairs using the image feature point matching method according to any one of claims 1 to 6, each feature point pair including a first feature point in the ith image and a second feature point matching the first feature point in the (i + 1) th image; a second determination module configured to determine the accurate pose information of the image sensor at the t +1 th moment according to the accurate pose information of the image sensor at the t th moment and the plurality of feature point pairs; and the construction module is configured to construct a map according to the accurate pose information of the moments and the multi-frame images.
According to a seventh aspect of the present disclosure, there is provided a computer-storable medium having stored thereon computer program instructions that, when executed by a processor, implement the image feature point matching method of any of the above embodiments or the map construction method of any of the above embodiments.
In the embodiment, the calculation amount can be reduced, and the efficiency and the accuracy of feature point matching can be improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.
The present disclosure may be more clearly understood from the following detailed description, taken with reference to the accompanying drawings, in which:
FIG. 1 is a flow diagram illustrating an image feature point matching method according to some embodiments of the present disclosure;
FIG. 2 is a schematic diagram illustrating pose information of an image sensor according to some embodiments of the present disclosure;
FIG. 3 is a flow chart illustrating the determination of the region in which second feature points matching each first feature point in the i +1 th image are located according to some embodiments of the present disclosure;
FIG. 4 is a flow diagram illustrating the determination of pixel coordinates of candidate feature points according to some embodiments of the present disclosure;
FIG. 5 is a flow diagram illustrating a mapping method according to some embodiments of the present disclosure;
fig. 6 is a block diagram illustrating an image feature point matching apparatus according to some embodiments of the present disclosure;
FIG. 7 is a block diagram illustrating an image feature point matching apparatus according to further embodiments of the present disclosure;
FIG. 8 is a block diagram illustrating an image feature point matching system according to some embodiments of the present disclosure;
FIG. 9 is a block diagram illustrating a mapping system according to some embodiments of the present disclosure;
FIG. 10 is a block diagram illustrating a computer system for implementing some embodiments of the present disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
Fig. 1 is a flow diagram illustrating an image feature point matching method according to some embodiments of the present disclosure.
As shown in fig. 1, the image feature point matching method includes steps S110 to S160.
In step S110, accurate pose information of the image sensor at time t is acquired. t is a positive integer. For example, the image sensor is a camera or the like.
When t is 1, the accurate pose information of the image sensor at time 0 is the initialized pose information. For example, the initialized pose information includes position information (0, 0, 0) and attitude information (0, 0, 0) of the image sensor at time 0.
When t is greater than 1, the accurate pose information of the image sensor at time t is determined as follows: a plurality of feature point pairs between the (i-1)-th image captured by the image sensor at time t-1 and the i-th image captured at time t are obtained by the image feature point matching method of any embodiment of the present disclosure, and the accurate pose information at time t is then determined using these feature point pairs and the accurate pose information of the image sensor at time t-1. Each feature point pair includes a first feature point in the (i-1)-th image and a second feature point in the i-th image that matches the first feature point. Here, i is a positive integer.
The pose information of the image sensor in the above-described embodiment will be described in detail below with reference to fig. 2.
Fig. 2 is a schematic diagram illustrating pose information of an image sensor according to some embodiments of the present disclosure.
The pose information of the image sensor includes position information and attitude information. The position information is the position coordinates in the x-y-z coordinate system as shown, which may be denoted as (x, y, z). The attitude information includes a heading (yaw) angle, a roll angle and a pitch angle, which may be denoted as (a, b, c). As shown in fig. 2, the heading angle is the deviation angle of the image sensor with respect to the y coordinate axis; the roll angle is the deviation angle of the image sensor with respect to the z coordinate axis; and the pitch angle is the deviation angle of the image sensor with respect to the x coordinate axis.
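As an illustrative sketch only (not part of the disclosure), the pose described above could be held in a simple data structure; the class and field names below are assumptions chosen for readability.

```python
from dataclasses import dataclass

@dataclass
class Pose:
    """Pose of the image sensor: position (x, y, z) and attitude (a, b, c)."""
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    a: float = 0.0  # heading (yaw) angle, deviation with respect to the y axis
    b: float = 0.0  # roll angle, deviation with respect to the z axis
    c: float = 0.0  # pitch angle, deviation with respect to the x axis

# Initialized pose at time 0, as in the example above.
initial_pose = Pose()
```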
Returning to fig. 1, in step S120, a plurality of first feature points, the feature descriptor of each first feature point, and the pixel coordinates of each first feature point are determined for the i-th image captured by the image sensor at time t. For example, a feature point detection algorithm such as ORB (Oriented FAST and Rotated BRIEF, which combines the FAST (Features from Accelerated Segment Test) detector with the BRIEF (Binary Robust Independent Elementary Features) descriptor), SURF (Speeded-Up Robust Features) or SIFT (Scale-Invariant Feature Transform) is used: a feature point extraction threshold is set, feature points are extracted from the image, and feature point descriptors are generated, thereby determining the plurality of first feature points, the feature descriptor of each first feature point, and the pixel coordinates. The ORB algorithm has a certain advantage in speed, and adopting the ORB algorithm can further improve the rate of feature point matching.
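For illustration only, a minimal sketch of this extraction step using OpenCV's ORB implementation is shown below; the number of features and the FAST threshold are assumed values, not values specified by the disclosure.

```python
import cv2

def extract_features(gray_image, n_features=500, fast_threshold=20):
    """Detect ORB feature points and compute their descriptors and pixel coordinates."""
    orb = cv2.ORB_create(nfeatures=n_features, fastThreshold=fast_threshold)
    keypoints, descriptors = orb.detectAndCompute(gray_image, None)
    # Pixel coordinates (x, y) of each detected feature point.
    pixel_coords = [kp.pt for kp in keypoints]
    return keypoints, descriptors, pixel_coords
```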
In some embodiments, the ith image is also preprocessed before performing step S120. The pre-processing includes at least one of color space conversion and scale conversion. For example, the ith image is subjected to color space conversion, and the ith image is converted into a gray image, so that interference information is reduced, and the rate and accuracy of feature point matching are improved. For another example, the ith image may be subjected to scale conversion, and the ith image is cropped to a certain extent, so as to improve the rate and accuracy of feature point matching. Empirically, the size of the ith image can be uniformly converted to 320 pixels × 240 pixels. In general, the larger the size of an image is, the better the robustness of the feature point matching method is; the smaller the size of the image, the higher the rate and accuracy of feature point matching. The size of 320 pixels multiplied by 240 pixels can not only ensure better robustness, but also improve the matching rate and accuracy.
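A minimal preprocessing sketch matching the description above (grayscale conversion and scale conversion to 320 × 240 pixels) is given below; it assumes the input is a BGR image as returned by OpenCV, which the disclosure does not specify.

```python
import cv2

def preprocess(image, size=(320, 240)):
    """Color space conversion (to grayscale) and scale conversion, as described above."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return cv2.resize(gray, size)
```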
In step S130, IMU data of the image sensor measured between time t and time t+1 by an IMU (Inertial Measurement Unit) rigidly connected to the image sensor is acquired. The IMU data includes gyroscope data and accelerometer data. The gyroscope data includes angular velocity data of the image sensor in the x, y and z directions of the coordinate system shown in fig. 2. The accelerometer data includes acceleration data of the image sensor in the x, y and z directions of the coordinate system shown in fig. 2.
In step S140, estimated pose information of the image sensor at time t+1 is determined according to the IMU data and the accurate pose information. The estimated pose information in the present disclosure is estimated from the IMU data. Because IMU data has limited accuracy, it is used here only for estimation.
Take as an example the case where the accurate pose information at time t includes position information (x, y, z) and attitude information (a, b, c), and the time difference between time t and time t+1 is Δt. The accelerometer data in the IMU data can be used to obtain a plurality of accelerations of the image sensor in the x, y and z directions of the coordinate system shown in fig. 2. For the x, y and z directions respectively, the average of the corresponding accelerations over the time period Δt is determined as the average acceleration a_x, a_y, a_z. From the position information (x, y, z) and the determined average accelerations a_x, a_y, a_z in the three directions, the position information (x', y', z') of the image sensor at time t+1 can be obtained by calculation according to the laws of physics.
Similarly, the gyroscope data in the IMU data can be used to obtain the angular velocities of the image sensor in the x, y and z directions of the coordinate system shown in fig. 2. For the x, y and z directions respectively, the average of the corresponding angular velocities over the time period Δt is determined as the average angular velocity w_x, w_y, w_z. From the attitude information (a, b, c) and the determined average angular velocities w_x, w_y, w_z in the three directions, the attitude information (a', b', c') of the image sensor at time t+1 can be obtained by calculation according to the laws of physics.
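The following sketch illustrates the simple propagation described above. It is an assumption-laden illustration, not the disclosure's implementation: it assumes the accelerations have already been gravity-compensated and expressed in the world frame, that an initial velocity estimate is available, and that small-angle integration of the attitude is acceptable over Δt.

```python
import numpy as np

def estimate_pose(position, attitude, velocity, accel_samples, gyro_samples, dt):
    """Estimate the pose at time t+1 from the accurate pose at time t and IMU data.

    position, attitude, velocity: length-3 arrays (x, y, z), (a, b, c), (vx, vy, vz).
    accel_samples, gyro_samples: arrays of shape (N, 3) measured between t and t+1.
    dt: time difference between time t and time t+1.
    """
    position = np.asarray(position, dtype=float)
    attitude = np.asarray(attitude, dtype=float)
    velocity = np.asarray(velocity, dtype=float)

    a_avg = np.mean(np.asarray(accel_samples, dtype=float), axis=0)  # a_x, a_y, a_z
    w_avg = np.mean(np.asarray(gyro_samples, dtype=float), axis=0)   # w_x, w_y, w_z

    new_position = position + velocity * dt + 0.5 * a_avg * dt ** 2  # basic kinematics
    new_attitude = attitude + w_avg * dt                             # small-angle integration
    return new_position, new_attitude
```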
In step S150, according to the pixel coordinate, the accurate pose information, and the estimated pose information of each first feature point, a region where a second feature point matching each first feature point in an i +1 th image captured by the image sensor at a time t +1 is located is determined.
The above step S150 is realized by fig. 3, for example. Fig. 3 is a flow chart illustrating determining a region in which a second feature point matching each first feature point in an i +1 th image is located according to some embodiments of the present disclosure.
As shown in fig. 3, determining the area in which the second feature point matching each first feature point in the i +1 th image is located includes steps S151 to S152.
In step S151, pixel coordinates of candidate feature points in the i +1 th image that may be matched with each first feature point are determined according to the pixel coordinates, the accurate pose information, and the estimated pose information of each first feature point.
For example, according to the pixel coordinates and the accurate pose information of each first feature point, the position coordinates of the three-dimensional space feature points of each first feature point in the three-dimensional space are determined by using the imaging model; and determining the pixel coordinates of the candidate characteristic points according to the position coordinates and the estimated pose information of the three-dimensional space characteristic points.
Fig. 4 is a flow diagram illustrating determining pixel coordinates of candidate feature points according to some embodiments of the present disclosure.
As shown in FIG. 4, the i-th image includes a first feature point p_t, and p is the three-dimensional space feature point of the first feature point p_t in three-dimensional space. o_t-x_t-y_t-z_t is the coordinate system of the image sensor at time t, and the pose information of o_t is the accurate pose information at time t. O_t-X_t-Y_t is the imaging plane coordinate system of the image sensor at time t, O_t is the focal point of the image sensor, and the distance from o_t to O_t is the focal length f of the image sensor. o_(t+1)-x_(t+1)-y_(t+1)-z_(t+1) is the coordinate system of the image sensor at time t+1, and the pose information of o_(t+1) is the estimated pose information determined for time t+1. O_(t+1)-X_(t+1)-Y_(t+1) is the imaging plane coordinate system of the image sensor at time t+1, O_(t+1) is the focal point of the image sensor, and the distance from o_(t+1) to O_(t+1) is the focal length f of the image sensor. The optical center is the center of the lens of the image sensor.
The pixel coordinates of the first feature point p_t have already been obtained by the feature point detection algorithm and are denoted, for example, (x_1, y_1). According to the pixel coordinates of the first feature point p_t and the pose information of o_t (the accurate pose information at time t), the position coordinates of the three-dimensional space feature point p are determined as (x_p, y_p, z_p) using the imaging model. The calculation process using the imaging model is prior art and will not be described in detail in this disclosure.
Thus far, the position coordinates (x_p, y_p, z_p) of the three-dimensional space feature point p have been determined.
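A sketch of this back-projection step is given below. The disclosure only states that an imaging model is used; the sketch assumes a standard pinhole model with known intrinsics (fx, fy, cx, cy), a known depth for the feature point (e.g., from triangulation or a depth sensor), and the accurate pose at time t expressed as a rotation matrix R_t and translation t_t from the camera frame to the world frame. All of these are assumptions made for illustration.

```python
import numpy as np

def back_project(pixel, depth, fx, fy, cx, cy, R_t, t_t):
    """Map a pixel (x1, y1) in the i-th image to a 3D point p = (x_p, y_p, z_p) in the world frame."""
    u, v = pixel
    # Pinhole model: camera-frame coordinates of the feature point.
    p_cam = np.array([(u - cx) * depth / fx,
                      (v - cy) * depth / fy,
                      depth])
    # Transform into the world frame using the accurate pose at time t.
    return np.asarray(R_t, dtype=float) @ p_cam + np.asarray(t_t, dtype=float)
```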
For example, the pixel coordinates of the candidate feature points are determined according to the position coordinates and the estimated pose information of the three-dimensional space feature points in the following manner.
First, the position coordinates of the optical center and the position coordinates of the focal point of the image sensor at time t+1 are determined according to the estimated pose information and the focal length of the image sensor. For example, in fig. 4, the position coordinates (x', y', z') in the estimated pose information at time t+1 are determined as the position coordinates of the optical center o_(t+1); the position coordinates (X', Y', Z') of the focal point O_(t+1) can then be determined by simply adding or subtracting the focal length f to or from the coordinate values of (x', y', z').
Then, the pixel coordinates of the candidate feature points are determined according to the position coordinates of the three-dimensional space feature points, the position coordinates of the optical center of the image sensor at the time t +1 and the position coordinates of the focal point.
In some embodiments, calculating Euclidean distances between every two of the three-dimensional space feature point, the optical center and the focus according to the position coordinates of the three-dimensional space feature point, the position coordinates of the optical center and the position coordinates of the focus; calculating the angle of an included angle between a line segment formed by the three-dimensional space characteristic point and the optical center and a line segment formed by the optical center and the focus according to the Euclidean distance; determining the product of the Euclidean distance between the optical center and the focal point and the cosine value of the angle as the Euclidean distance between the optical center and the candidate characteristic point; and determining the pixel coordinates of the candidate characteristic points according to the Euclidean distance between the optical center and the candidate characteristic points.
For example, in FIG. 4, the pairwise Euclidean distances between p, o_(t+1) and O_(t+1) are calculated, and for the triangle Δp o_(t+1) O_(t+1) the law of cosines can be used to solve for the angle ∠p o_(t+1) O_(t+1). Suppose the candidate feature point is p_(t+1); p_(t+1) is the imaging point of the first feature point p_t in the (i+1)-th image, and p_(t+1), p and o_(t+1) lie on the same straight line. According to the imaging principle, the triangle Δp_(t+1) o_(t+1) O_(t+1) is a right triangle with ∠o_(t+1) p_(t+1) O_(t+1) as the right angle, and the angle ∠p_(t+1) o_(t+1) O_(t+1) has the same degree as ∠p o_(t+1) O_(t+1); therefore, the product of the Euclidean distance between o_(t+1) and O_(t+1) and the cosine of this angle is the Euclidean distance between o_(t+1) and p_(t+1). Since the position coordinates of o_(t+1) are known, the position coordinates (world coordinates) of p_(t+1) can be determined; by converting between the world coordinate system and the pixel coordinate system, the pixel coordinates (x_2, y_2) of p_(t+1) can be obtained.
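The geometric construction above can be sketched as follows. This is an illustrative sketch of the distance/angle steps just described, not a definitive implementation: how the world position of the focal point O_(t+1) is obtained from the estimated pose, and the final world-to-pixel conversion, are assumptions supplied by the caller.

```python
import numpy as np

def candidate_pixel(p_world, optical_center, focal_point, world_to_pixel):
    """Locate the candidate feature point p_(t+1) for the 3D point p in the (i+1)-th image.

    p_world: position coordinates of the three-dimensional space feature point p.
    optical_center: position of o_(t+1), taken from the estimated pose information.
    focal_point: position of O_(t+1), at distance f from o_(t+1) along the optical axis.
    world_to_pixel: callable converting a world-frame point to pixel coordinates (assumed).
    """
    p_world = np.asarray(p_world, dtype=float)
    optical_center = np.asarray(optical_center, dtype=float)
    focal_point = np.asarray(focal_point, dtype=float)

    d_po = np.linalg.norm(p_world - optical_center)      # |p  o_(t+1)|
    d_of = np.linalg.norm(focal_point - optical_center)  # |o_(t+1)  O_(t+1)|
    d_pf = np.linalg.norm(p_world - focal_point)          # |p  O_(t+1)|

    # Law of cosines: angle at the optical center between segments o->p and o->O.
    cos_angle = (d_po ** 2 + d_of ** 2 - d_pf ** 2) / (2 * d_po * d_of)

    # Product of |o_(t+1) O_(t+1)| and the cosine gives |o_(t+1) p_(t+1)|, as described above.
    d_op = d_of * cos_angle

    # p_(t+1) lies on the ray from o_(t+1) toward p, at that distance.
    direction = (p_world - optical_center) / d_po
    p_t1_world = optical_center + direction * d_op
    return world_to_pixel(p_t1_world)
```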
Returning to fig. 3, in step S152, a specified region in the i +1 th image centered on the pixel coordinates of the candidate feature point is determined as a region where the second feature point matching each of the first feature points is located.
In some embodiments, a rectangular region in the (i+1)-th image that is centered on the pixel coordinates of the candidate feature point and whose length and width are each 5% of the length and width of the (i+1)-th image is determined as the region in which the second feature point matching each first feature point is located. In other embodiments, a circular region centered on the pixel coordinates of the candidate feature point with a specified value as its radius may be determined as the region in which the second feature point matching each first feature point is located. In still other embodiments, a diamond-shaped, triangular or other polygonal region centered on the pixel coordinates of the candidate feature point may also be determined as the region in which the second feature point matching each first feature point is located.
In some embodiments, the i +1 th image may also be preprocessed similarly or identically to the i-th image before performing step S150.
Returning to fig. 1, in step S160, in the area where the second feature point matching each first feature point is located, the second feature point is determined according to the feature descriptor of each first feature point.
For example, in FIG. 4, within a rectangular region centered on the candidate feature point p_(t+1) whose length and width are each 5% of the length and width of the (i+1)-th image, a plurality of feature points to be matched, the feature descriptor of each feature point to be matched, and the pixel coordinates of each feature point to be matched are determined using the feature point detection algorithm. A second feature point is then determined from the plurality of feature points to be matched according to the feature descriptor of the first feature point p_t and the feature descriptors of the feature points to be matched. In some embodiments, the feature point to be matched whose feature descriptor has the minimum Hamming distance to the feature descriptor of the first feature point p_t is determined as the second feature point.
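A sketch of this region-restricted matching step is given below, assuming ORB descriptors (binary, stored as uint8 arrays) and a candidate pixel (x_2, y_2) obtained as above; the 5% window size follows the example in the text. For simplicity the sketch detects feature points over the whole image and then filters them to the region, whereas the text describes detecting directly within the region; the effect on the selected match is the same under that assumption.

```python
import cv2
import numpy as np

def hamming(d1, d2):
    """Hamming distance between two binary descriptors stored as uint8 arrays."""
    return int(np.count_nonzero(np.unpackbits(np.bitwise_xor(d1, d2))))

def match_in_region(gray_next, candidate_xy, first_descriptor, win_ratio=0.05):
    """Find the second feature point near the candidate pixel in the (i+1)-th (grayscale) image."""
    h, w = gray_next.shape[:2]
    half_w, half_h = 0.5 * win_ratio * w, 0.5 * win_ratio * h
    cx, cy = candidate_xy

    orb = cv2.ORB_create()
    keypoints, descriptors = orb.detectAndCompute(gray_next, None)
    if descriptors is None:
        return None

    best_kp, best_dist = None, None
    for kp, desc in zip(keypoints, descriptors):
        x, y = kp.pt
        # Keep only feature points that fall inside the rectangular search region.
        if abs(x - cx) <= half_w and abs(y - cy) <= half_h:
            dist = hamming(first_descriptor, desc)
            if best_dist is None or dist < best_dist:
                best_kp, best_dist = kp, dist
    return best_kp  # feature point with the minimum Hamming distance, or None
```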
In the above embodiment, the estimated pose information of the image sensor at time t+1 is determined by combining the IMU data, the region in which the second feature point matching the first feature point is located is determined according to the pixel coordinates of the first feature point and the estimated pose information, and feature point matching is performed within this limited region. This reduces the amount of computation and the time consumed by matching, and improves the rate and accuracy of feature point matching. In addition, by reducing the matching region, mismatching can be avoided to a large extent, and even if a matching error occurs it is confined to a small range, so its overall influence is small. Because reducing the matching region improves the stability of feature point matching, the number of extracted feature points, descriptors and keyframes can be appropriately reduced without incurring accumulated errors or tracking failure caused by an excessive proportion of mismatches, thereby further reducing the amount of computation.
FIG. 5 is a flow chart illustrating a mapping method according to some embodiments of the present disclosure.
As shown in fig. 5, the map construction method includes steps S510 to S540.
In step S510, a plurality of frame images captured by the image sensor at a plurality of times during the movement are acquired. The multi-frame image includes an ith image at a time t and an i +1 th image at a time t + 1. t and i are both positive integers. For example, the plurality of frames of images are images satisfying a preset rule selected from all images captured by the image sensor during movement. An image satisfying a preset rule may be referred to as a key frame. The images satisfying the preset rule may be, for example, images photographed at preset intervals or at preset positions.
In step S520, a plurality of pairs of feature points are determined for the ith image and the (i + 1) th image by an image feature point matching method. Each of the feature point pairs includes a first feature point in the i-th image and a second feature point matching the first feature point in the i + 1-th image. For example, the image feature point matching algorithm is an image feature point matching method in any embodiment of the present disclosure.
In step S530, the accurate pose information of the image sensor at time t+1 is determined according to the accurate pose information of the image sensor at time t and the plurality of feature point pairs. For example, a PnP (Perspective-n-Point) algorithm may be used with at least four feature point pairs to determine the accurate pose information of the image sensor at time t+1.
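A sketch of this step using OpenCV's solvePnP is shown below. It assumes the 3D positions of the first feature points (obtained as described earlier) and the pixel coordinates of the matched second feature points are available, along with the camera intrinsic matrix; lens distortion is ignored for simplicity, which the disclosure does not specify.

```python
import cv2
import numpy as np

def solve_pose(points_3d, points_2d, camera_matrix):
    """Estimate the pose at time t+1 from at least four matched feature point pairs."""
    object_points = np.asarray(points_3d, dtype=np.float64).reshape(-1, 3)
    image_points = np.asarray(points_2d, dtype=np.float64).reshape(-1, 2)

    ok, rvec, tvec = cv2.solvePnP(object_points, image_points,
                                  np.asarray(camera_matrix, dtype=np.float64),
                                  distCoeffs=None)
    if not ok:
        raise RuntimeError("PnP failed")
    R, _ = cv2.Rodrigues(rvec)  # rotation matrix of the image sensor at time t+1
    return R, tvec
```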
In step S540, a map is constructed according to the accurate pose information at multiple times and the multiple frames of images. For example, the map is a local map. In some embodiments, from the accurate pose information at multiple times, a motion trajectory of the image sensor may be constructed. And constructing a map according to the motion trail and the multi-frame images.
For example, after step S540, the map may also be subjected to nonlinear optimization and loop back detection. The nonlinear optimization comprises bundle adjustment of the motion trail, so that accumulated errors can be reduced, and a map is optimized. The nonlinear optimization may also include screening key frames to remove redundant (including duplicate or invalid) key frames. The loop detection can optimize the whole map, and eliminate accumulated errors to the maximum extent. Both non-linear optimization and loop-back detection are prior art and will not be described in detail in this disclosure.
In the above embodiments, by determining a plurality of pairs of feature points by using the image feature point matching method according to any embodiment of the present disclosure, the rate and accuracy of matching image feature points can be improved, and thus the rate and accuracy of map construction can be improved.
Fig. 6 is a block diagram illustrating an image feature point matching apparatus according to some embodiments of the present disclosure.
As shown in fig. 6, the image feature point matching device 61 includes a first obtaining module 611, a first determining module 612, a second obtaining module 613, a second determining module 614, a third determining module 615, and a fourth determining module 616. The image feature point matching device can be deployed on terminal equipment such as mobile phones, robots and unmanned aerial vehicles, and serves as a functional module for VR/AR, autonomous positioning and navigation, and the like.
The first obtaining module 611 is configured to obtain the accurate pose information of the image sensor at the time t, for example, execute step S110 shown in fig. 1. t is a positive integer.
The first determining module 612 is configured to determine a plurality of first feature points, a feature descriptor of each first feature point, and pixel coordinates of an ith image captured by the image sensor at time t, for example, to perform step S120 as shown in fig. 1. i is a positive integer.
The second acquiring module 613 is configured to acquire IMU data of the image sensor measured between time t and time t +1 by the inertial measurement unit IMU rigidly connected to the image sensor, for example, to perform step S130 as shown in fig. 1.
The second determination module 614 is configured to determine the estimated pose information of the image sensor at the time t +1 according to the IMU data and the precise pose information, for example, to perform step S140 shown in fig. 1.
The third determining module 615 is configured to determine, according to the pixel coordinate, the accurate pose information, and the estimated pose information of each first feature point, an area where a second feature point matching each first feature point in an i +1 th image captured by the image sensor at the time t +1 is located, for example, perform step S150 shown in fig. 1.
The fourth determining module 616 is configured to determine the second feature point according to the feature descriptor of each first feature point in the region where the second feature point matched with each first feature point is located, for example, execute step S160 shown in fig. 1.
Fig. 7 is a block diagram illustrating an image feature point matching apparatus according to further embodiments of the present disclosure.
As shown in fig. 7, the image feature point matching device 71 includes a memory 711; and a processor 712 coupled to the memory 711. The memory 711 is used for storing instructions for performing the corresponding embodiment of the image feature point matching method. The processor 712 is configured to perform the image feature point matching method in any of the embodiments of the present disclosure based on instructions stored in the memory 711.
Fig. 8 is a block diagram illustrating an image feature point matching system according to some embodiments of the present disclosure.
As shown in fig. 8, the image feature point matching system 8 includes an image feature point matching device 81, an image sensor 82, and an IMU 83. For example, the image feature point matching device 81 is the same as or similar to the image feature point matching device 61 and the image feature point matching device 71 in the above-described embodiments.
The image sensor 82 is configured to take the ith image and the (i + 1) th image at time t and time t +1 during the movement, and transmit the ith image and the (i + 1) th image to the image feature point matching device 81. t and i are both positive integers.
The IMU 83 is configured to measure IMU data of the image sensor 82 between time t and time t +1 and send the IMU data to the image feature point matching device 81. The IMU 83 is rigidly connected to the image sensor 82. For example, the IMU 83 and the image sensor 82 are both mounted on a mobile robot or drone.
FIG. 9 is a block diagram illustrating a mapping system according to some embodiments of the present disclosure.
As shown in fig. 9, the map construction system 9 includes an acquisition module 91, a first determination module 92, a second determination module 93, and a construction module 94.
The acquiring module 91 is configured to acquire a plurality of frames of images taken by the image sensor at a plurality of times during the movement, the plurality of frames of images including an i-th image at a t-th time and an i + 1-th image at a t + 1-th time, for example, step S510 shown in fig. 5 is performed. t and i are both positive integers.
The first determining module 92 is configured to determine a plurality of feature point pairs for the i-th image and the (i+1)-th image by using an image feature point matching method, for example, to perform step S520 shown in fig. 5. Each feature point pair includes a first feature point in the i-th image and a second feature point in the (i+1)-th image that matches the first feature point. For example, the image feature point matching method is the image feature point matching method in any of the embodiments of the present disclosure.
The second determining module 93 is configured to determine the accurate pose information of the image sensor at the t +1 th time point according to the accurate pose information of the image sensor at the t th time point and the plurality of feature point pairs, for example, to execute step S530 shown in fig. 5.
The building module 94 is configured to build a map according to the precise pose information at multiple time instants and the multiple frames of images, for example, execute step S540 shown in fig. 5.
FIG. 10 is a block diagram illustrating a computer system for implementing some embodiments of the present disclosure.
As shown in FIG. 10, computer system 100 may be embodied in the form of a general purpose computing device. Computer system 100 includes a memory 1010, a processor 1020, and a bus 1000 that couples various system components.
The memory 1010 may include, for example, system memory, non-volatile storage media, and the like. The system memory stores, for example, an operating system, an application program, a Boot Loader (Boot Loader), and other programs. The system memory may include volatile storage media such as Random Access Memory (RAM) and/or cache memory. The non-volatile storage medium stores, for example, instructions to perform corresponding embodiments of at least one of the image feature point matching methods. Non-volatile storage media include, but are not limited to, magnetic disk storage, optical storage, flash memory, and the like.
The processor 1020 may be implemented as discrete hardware components, such as a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gates or transistors, or the like. Accordingly, each of the modules, such as the judging module and the determining module, may be implemented by a Central Processing Unit (CPU) executing instructions in a memory for performing the corresponding step, or may be implemented by a dedicated circuit for performing the corresponding step.
Bus 1000 may use any of a variety of bus architectures. For example, bus structures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, and Peripheral Component Interconnect (PCI) bus.
The computer system 100 may also include an input-output interface 1030, a network interface 1040, a storage interface 1050, and the like. These interfaces 1030, 1040, 1050 and the memory 1010 and the processor 1020 may be connected by a bus 1000. The input/output interface 1030 may provide a connection interface for input/output devices such as a display, a mouse, and a keyboard. Network interface 1040 provides a connection interface for various networking devices. The storage interface 1050 provides a connection interface for external storage devices such as a floppy disk, a usb disk, and an SD card.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable apparatus to produce a machine, such that the execution of the instructions by the processor results in an apparatus that implements the functions specified in the flowchart and/or block diagram block or blocks.
These computer-readable program instructions may also be stored in a computer-readable memory that can direct a computer to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instructions which implement the function specified in the flowchart and/or block diagram block or blocks.
The present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects.
By the image feature point matching method, the image feature point matching device and the image feature point matching system, the map construction method and the map construction system, and the computer storage medium in the embodiments, the calculation amount can be reduced, and the efficiency and the accuracy of feature point matching can be improved.
So far, an image feature point matching method, apparatus and system, a map construction method and system, and a computer-readable storage medium according to the present disclosure have been described in detail. Some details that are well known in the art have not been described in order to avoid obscuring the concepts of the present disclosure. It will be fully apparent to those skilled in the art from the foregoing description how to practice the presently disclosed embodiments.

Claims (14)

1. An image feature point matching method includes:
acquiring accurate pose information of the image sensor at a time t, wherein t is a positive integer;
determining a plurality of first characteristic points, a characteristic descriptor and pixel coordinates of each first characteristic point of an ith image shot by the image sensor at the time t, wherein i is a positive integer;
acquiring IMU data of the image sensor measured between t moment and t +1 moment by an inertial measurement unit IMU which is rigidly connected with the image sensor;
determining estimated pose information of the image sensor at the t +1 moment according to the IMU data and the accurate pose information;
determining an area where a second feature point matched with each first feature point in an i +1 th image shot by the image sensor at the t +1 moment is located according to the pixel coordinate of each first feature point, the accurate pose information and the estimated pose information;
and in the region, determining the second characteristic point according to the characteristic descriptor of each first characteristic point.
2. The image feature point matching method according to claim 1, wherein determining an area where a second feature point matching each of the first feature points is located in an i +1 th image captured by the image sensor at a time t +1 comprises:
determining the pixel coordinates of candidate feature points which are possibly matched with each first feature point in the (i + 1) th image according to the pixel coordinates of each first feature point, the accurate pose information and the estimated pose information;
and determining a specified area in the i +1 th image by taking the pixel coordinates of the candidate feature points as the center as an area where a second feature point matched with each first feature point is located.
3. The image feature point matching method according to claim 2, wherein determining pixel coordinates of candidate feature points that are likely to match the each first feature point in the i +1 th image includes:
determining the position coordinates of the three-dimensional space characteristic points of each first characteristic point in the three-dimensional space by using an imaging model according to the pixel coordinates of each first characteristic point and the accurate pose information;
and determining the pixel coordinates of the candidate characteristic points according to the position coordinates of the three-dimensional space characteristic points and the estimated pose information.
4. The image feature point matching method according to claim 3, wherein determining pixel coordinates of the candidate feature point includes:
determining the position coordinates of the optical center and the focus of the image sensor at the t +1 moment according to the estimated pose information and the focal length of the image sensor;
and determining the pixel coordinates of the candidate characteristic points according to the position coordinates of the three-dimensional space characteristic points, the position coordinates of the optical center of the image sensor at the moment of t +1 and the position coordinates of the focus.
5. The image feature point matching method according to claim 4, wherein determining pixel coordinates of the candidate feature point based on the position coordinates of the three-dimensional space feature point, the position coordinates of the optical center of the image sensor at time t +1, and the position coordinates of the focal point comprises:
calculating Euclidean distances between the three-dimensional space feature points, the optical centers and the focal points according to the position coordinates of the three-dimensional space feature points, the position coordinates of the optical centers and the position coordinates of the focal points;
calculating the angle of an included angle between a line segment formed by the three-dimensional space characteristic point and the optical center and a line segment formed by the optical center and the focus according to the Euclidean distance;
determining a product of a euclidean distance between the optical center and the focal point and a cosine value of the angle as a euclidean distance between the optical center and the candidate feature point;
and determining the pixel coordinates of the candidate characteristic points according to the Euclidean distance between the optical center and the candidate characteristic points.
6. The image feature point matching method according to claim 1, further comprising:
preprocessing the ith image and the (i + 1) th image, wherein the preprocessing comprises at least one of color space conversion and scale conversion.
7. An image feature point matching apparatus comprising:
the first acquisition module is configured to acquire accurate pose information of the image sensor at a time t, wherein t is a positive integer;
a first determination module configured to determine a plurality of first feature points, a feature descriptor of each first feature point, and pixel coordinates of an ith image captured by the image sensor at time t, i being a positive integer;
a second acquisition module configured to acquire IMU data of the image sensor measured between time t and time t +1 by an inertial measurement unit IMU rigidly connected to the image sensor;
a second determination module configured to determine estimated pose information of the image sensor at a time t +1 according to the IMU data and the accurate pose information;
the third determining module is configured to determine a region where a second feature point matched with each first feature point in an i +1 th image shot by the image sensor at the t +1 moment is located according to the pixel coordinate of each first feature point, the accurate pose information and the estimated pose information;
a fourth determining module configured to determine the second feature point according to the feature descriptor of each first feature point in the region.
8. An image feature point matching apparatus comprising:
a memory; and
a processor coupled to the memory, the processor configured to perform the image feature point matching method of any of claims 1 to 6 based on instructions stored in the memory.
9. An image feature point matching system comprising:
the image feature point matching device according to any one of claims 7 to 8;
the image sensor is configured to shoot an ith image and an (i + 1) th image at the time t and the time t +1 during the moving process and send the ith image and the (i + 1) th image to the image feature point matching device, wherein both t and i are positive integers;
and the inertial measurement unit IMU is rigidly connected with the image sensor and is configured to measure IMU data of the image sensor between the time t and the time t +1 and send the IMU data to the image feature point matching device.
10. A map construction method, comprising:
acquiring multiple frames of images captured by an image sensor at multiple moments during movement, wherein the multiple frames of images comprise an ith image at time t and an (i+1)th image at time t+1, t and i both being positive integers;
determining, for the ith image and the (i+1)th image, a plurality of feature point pairs using the image feature point matching method according to any one of claims 1 to 6, each feature point pair comprising a first feature point in the ith image and a second feature point in the (i+1)th image that matches the first feature point;
determining accurate pose information of the image sensor at time t+1 according to the accurate pose information of the image sensor at time t and the plurality of feature point pairs;
and constructing a map according to the accurate pose information at the multiple moments and the multiple frames of images.
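A high-level sketch of the loop that claim 10 describes is given below; `match_feature_points` stands in for the method of claims 1 to 6, and `estimate_pose` for a standard two-view pose solver (e.g. PnP or essential-matrix estimation). Both are assumptions made for illustration, not the claimed implementation.

```python
def build_map(images, initial_pose, imu_segments, match_feature_points, estimate_pose):
    """Sketch of the map construction flow in claim 10 (illustrative only)."""
    poses = [initial_pose]      # accurate pose at each selected moment
    observations = []           # (pose at t, pose at t+1, feature point pairs)

    for i in range(len(images) - 1):
        # Feature point pairs between the ith and (i+1)th image (claims 1-6).
        pairs = match_feature_points(poses[-1], images[i], imu_segments[i], images[i + 1])
        # Accurate pose at time t+1 from the accurate pose at time t and the pairs.
        poses.append(estimate_pose(poses[-1], pairs))
        observations.append((poses[-2], poses[-1], pairs))

    # A map would then be constructed from the poses and images, e.g. by
    # triangulating the matched pairs into 3D landmarks.
    return poses, observations
```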
11. The map construction method according to claim 10, further comprising:
and performing nonlinear optimization and loop closure detection on the map.
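Claim 11 only names the two refinement steps. As an illustration of what such nonlinear optimization operates on, the sketch below computes the residual of one pose-graph edge (for example a loop closure constraint) for planar (x, y, theta) poses; the 2D representation and all names are assumptions made for brevity.

```python
import math
import numpy as np

def pose_graph_residual(pose_i, pose_j, measured_ij):
    """Residual of one pose-graph edge between planar poses (x, y, theta).

    pose_i, pose_j: absolute poses as (x, y, theta).
    measured_ij: measured relative pose of j expressed in the frame of i.
    """
    xi, yi, ti = pose_i
    xj, yj, tj = pose_j
    # Relative translation of j expressed in the frame of i.
    c, s = math.cos(ti), math.sin(ti)
    dx, dy = xj - xi, yj - yi
    rel = np.array([c * dx + s * dy, -s * dx + c * dy, tj - ti])
    res = rel - np.asarray(measured_ij, dtype=float)
    # Wrap the angular component to [-pi, pi).
    res[2] = (res[2] + math.pi) % (2 * math.pi) - math.pi
    return res
```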
12. The map construction method according to claim 10, wherein the multiple frames of images are images that satisfy a preset rule, selected from all images captured by the image sensor during movement.
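Claim 12 leaves the "preset rule" unspecified. One common keyframe-style rule, assumed here purely for illustration, is to keep a frame only when enough time has elapsed or enough motion has accumulated since the last kept frame:

```python
import numpy as np

def satisfies_preset_rule(frame_time, frame_translation,
                          last_kept_time, last_kept_translation,
                          min_interval=0.5, min_distance=0.1):
    """Example preset rule (assumption): keep a frame if enough time or motion has accumulated."""
    time_ok = (frame_time - last_kept_time) >= min_interval
    motion_ok = np.linalg.norm(
        np.asarray(frame_translation) - np.asarray(last_kept_translation)
    ) >= min_distance
    return time_ok or motion_ok
```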
13. A map construction system, comprising:
an acquisition module configured to acquire multiple frames of images captured by an image sensor at multiple moments during movement, wherein the multiple frames of images comprise an ith image at time t and an (i+1)th image at time t+1, t and i both being positive integers;
a first determination module configured to determine, for the ith image and the (i+1)th image, a plurality of feature point pairs using the image feature point matching method according to any one of claims 1 to 6, each feature point pair comprising a first feature point in the ith image and a second feature point in the (i+1)th image that matches the first feature point;
a second determination module configured to determine accurate pose information of the image sensor at time t+1 according to the accurate pose information of the image sensor at time t and the plurality of feature point pairs;
and a construction module configured to construct a map according to the accurate pose information at the multiple moments and the multiple frames of images.
14. A computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the image feature point matching method of any one of claims 1 to 6 or the map construction method of any one of claims 10 to 12.
CN202010801651.6A 2020-08-11 2020-08-11 Image feature point matching method, device and system and map construction method and system Pending CN114119885A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010801651.6A CN114119885A (en) 2020-08-11 2020-08-11 Image feature point matching method, device and system and map construction method and system

Publications (1)

Publication Number Publication Date
CN114119885A (en) 2022-03-01

Family

ID=80373652

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010801651.6A Pending CN114119885A (en) 2020-08-11 2020-08-11 Image feature point matching method, device and system and map construction method and system

Country Status (1)

Country Link
CN (1) CN114119885A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107869989A (en) * 2017-11-06 2018-04-03 东北大学 Localization method and system based on visual-inertial navigation information fusion
US20190251696A1 (en) * 2018-02-12 2019-08-15 Samsung Electronics Co., Ltd. Device and method with pose estimator
CN110631554A (en) * 2018-06-22 2019-12-31 北京京东尚科信息技术有限公司 Robot posture determining method and device, robot and readable storage medium
CN109166140A (en) * 2018-07-27 2019-01-08 长安大学 Vehicle motion trajectory estimation method and system based on multi-line laser radar
CN110047142A (en) * 2019-03-19 2019-07-23 中国科学院深圳先进技术研究院 Unmanned aerial vehicle three-dimensional map construction method and device, computer equipment and storage medium
CN109993113A (en) * 2019-03-29 2019-07-09 东北大学 Pose estimation method based on fusion of RGB-D and IMU information
CN111156998A (en) * 2019-12-26 2020-05-15 华南理工大学 Mobile robot positioning method based on RGB-D camera and IMU information fusion
CN111340851A (en) * 2020-05-19 2020-06-26 北京数字绿土科技有限公司 SLAM method based on binocular vision and IMU fusion

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024001526A1 (en) * 2022-06-28 2024-01-04 北京字跳网络技术有限公司 Image processing method and apparatus, and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination