CN116664682A - Robot, map generation method, and storage medium

Robot, map generation method, and storage medium

Info

Publication number: CN116664682A
Application number: CN202210145368.1A
Authority: CN (China)
Prior art keywords: target, frame, images, original image, image
Legal status: Pending
Other languages: Chinese (zh)
Inventor: 龚喜
Current assignee: Shenzhen Pudu Technology Co Ltd
Original assignee: Shenzhen Pudu Technology Co Ltd
Application filed by Shenzhen Pudu Technology Co Ltd
Priority to CN202210145368.1A
Publication of CN116664682A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 - Image enhancement or restoration
    • G06T 5/80 - Geometric correction
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration


Abstract

The application relates to a robot, a map generation method, and a storage medium. The robot includes a large-field-angle camera, a memory, and a processor. The processor is configured to: perform distortion correction processing on each frame of original image in a plurality of frames of original images obtained by shooting a target place with the large-field-angle camera, to obtain a plurality of frames of distortion-corrected target images; extract feature points from each frame of target image to obtain a plurality of feature points in each frame of target image; match the feature points in every two adjacent frames of target images and determine a plurality of matched feature point pairs; determine the target relative pose between every two adjacent frames of target images according to the direction vectors of the matched feature point pairs; obtain, from these relative poses, the coordinate information of each feature point in the target place; and construct a map of the target place. With this method, the accuracy of the visual map can be improved.

Description

Robot, map generation method, and storage medium
Technical Field
The present application relates to the field of robots, and in particular, to a robot, a map generating method, and a storage medium.
Background
In robotics research, navigation is a core technology and the key to intelligent, autonomous movement. Map construction is the basis of robot navigation: only when an accurate and reliable map has been established can the robot's navigation be guaranteed. Owing to its low construction cost and high reliability, the visual map is widely used.
In the prior art, a pinhole camera with a smaller field angle is generally used for acquiring images for constructing a visual map.
However, in an environment where the indoor texture is weak, the accuracy of a visual map constructed using a pinhole camera is poor.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a robot, a map generation method, and a computer-readable storage medium that can improve visual map accuracy.
In a first aspect, the present application provides a robot, including a large field angle camera, a memory, and a processor, where the large field angle camera is configured to capture a target location to obtain a plurality of frames of original images, the memory stores a computer program that can be run on the processor, and the processor is configured to implement the following steps when executing the computer program:
carrying out distortion correction processing on each frame of original image in the multi-frame original image to obtain a multi-frame target image after distortion correction;
extracting feature points of each frame of target image in the multi-frame target image to obtain a plurality of feature points of each frame of target image in the multi-frame target image;
matching a plurality of characteristic points in every two adjacent frames of target images in the multi-frame target images, and determining a plurality of characteristic point pairs matched in every two adjacent frames of target images in the multi-frame target images, wherein the characteristic point pairs comprise two mutually matched characteristic points;
obtaining direction vectors of a plurality of feature point pairs matched in every two adjacent frames of target images in the multi-frame target images, and determining target relative pose between every two adjacent frames of target images in the multi-frame target images based on the obtained direction vectors;
according to the target relative pose between every two adjacent frames of target images in the multi-frame target images, coordinate information of each feature point in every two adjacent frames of target images in the multi-frame target images in a target place is obtained;
and constructing a map of the target place based on the obtained coordinate information of each characteristic point in every two adjacent frames of target images in the multi-frame target images and the multi-frame original images.
In one embodiment, the performing distortion correction processing on each frame of original image in the multiple frames of original images to obtain a multiple frames of target images after distortion correction includes:
for a plurality of frames of original images, mapping each pixel point in each frame of original image in the plurality of frames of original images onto a target spherical surface to obtain a first mapping coordinate of each pixel point in each frame of original image in the plurality of frames of original images on the target spherical surface, wherein the target spherical surface is determined according to camera calibration internal parameters of a large-view camera;
mapping the first mapping coordinates of the pixel points in each frame of original image in the multi-frame original image on the target sphere onto the surface of a target virtual cube again, to obtain second mapping coordinates of the pixel points in each frame of original image in the multi-frame original image on the target virtual cube;
and obtaining a distortion corrected target image corresponding to each frame of original image in the multi-frame original image according to the second mapping coordinates of each pixel point in each frame of original image in the multi-frame original image on the surface of the target virtual cube.
In one embodiment, the acquiring the direction vector of the matched plurality of feature point pairs in each two adjacent frames of target images in the multi-frame target image includes:
For each feature point pair in every two adjacent frames of target images in the multi-frame target images, obtaining first mapping coordinates of two feature points matched with each other in the feature point pairs on a target sphere; and taking the first mapping coordinates corresponding to each characteristic point in the characteristic point pair as the direction vector of each characteristic point in the characteristic point pair.
In one embodiment, the determining, based on the obtained direction vector, the target relative pose between each two adjacent frames of target images in the multiple frames of target images includes:
according to the shooting moment of each frame of original image in the multi-frame original image, determining two earliest frames of original images from the multi-frame original images, wherein the earliest two frames of original images comprise a first original image and a second original image, and the shooting moment of the first original image is earlier than that of the second original image;
based on the direction vectors of a plurality of matched characteristic point pairs in two adjacent frame target images corresponding to the first original image and the second original image, obtaining target relative pose between the two adjacent frame target images corresponding to the first original image and the second original image;
acquiring a plurality of frames of residual target images except the target image corresponding to the first original image in the plurality of frames of target images;
And obtaining the target relative pose between every two adjacent frames of residual target images in the multi-frame residual target images based on the direction vectors and the coordinate information of the matched multiple feature point pairs in every two adjacent frames of residual target images in the multi-frame residual target images.
In one embodiment, the target relative pose comprises a target relative rotation matrix and a target relative translation matrix; the method for obtaining the target relative pose between two adjacent frames of target images corresponding to the first original image and the second original image based on the direction vectors of a plurality of matched feature point pairs in the two adjacent frames of target images corresponding to the first original image and the second original image comprises the following steps:
for each first characteristic point pair of a plurality of characteristic point pairs matched in two adjacent frame target images corresponding to a first original image and a second original image, acquiring a direction vector of the first characteristic point pair and a reference relative rotation matrix, wherein the reference relative rotation matrix is determined according to a projection plane of a target virtual cube corresponding to the first characteristic point pair;
obtaining a reference relative translation matrix of the first characteristic point pair based on the direction vector of the first characteristic point pair and the reference relative rotation matrix;
Taking the reference relative rotation matrix as a target relative rotation matrix between two adjacent frame target images corresponding to the first original image and the second original image;
and taking the reference relative translation matrix as a target relative translation matrix between two adjacent frame target images corresponding to the first original image and the second original image.
In one embodiment, the obtaining the target relative pose between each two adjacent frames of residual target images in the multiple frames of residual target images based on the direction vectors and the coordinate information of the matched feature point pairs in each two adjacent frames of residual target images in the multiple frames of residual target images includes:
for each two adjacent frames of residual target images in the multi-frame residual target images, acquiring coordinate information of a first target feature point of a first residual target image with earlier shooting time in the two adjacent frames of residual target images and a direction vector of a second target feature point matched with the first target feature point in a second residual target image with later shooting time;
and calculating the relative pose between the first residual target image and the second residual target image based on the coordinate information of the first target feature point and the direction vector of the second target feature point.
In one embodiment, the processor is further configured to implement the following steps when executing the computer program:
for any feature point of the first residual target image, judging whether a feature point matching the feature point can be found in a reference target image corresponding to the first residual target image, wherein the shooting moment of the original image corresponding to the reference target image is adjacent to, and earlier than, the shooting moment of the original image corresponding to the first residual target image;
if such a feature point can be found, taking the feature point as the first target feature point.
In one embodiment, the obtaining the coordinate information of each feature point in each two adjacent frame of target images in the multi-frame target image according to the target relative pose between each two adjacent frames of target images in the multi-frame target image includes:
and according to the target relative pose between every two adjacent frames of target images in the multi-frame target images, acquiring the coordinate information of each characteristic point in the target place in every two adjacent frames of target images in the multi-frame target images by utilizing a triangulation technology.
In a second aspect, the application further provides a map generation method. The method comprises the following steps:
Obtaining a plurality of frames of original images obtained by shooting a target place through a large-field-angle camera, and carrying out distortion correction processing on each frame of original image in the plurality of frames of original images to obtain a plurality of frames of target images after distortion correction;
extracting feature points of each frame of target image in the multi-frame target image to obtain a plurality of feature points of each frame of target image in the multi-frame target image;
matching a plurality of characteristic points in every two adjacent frames of target images in the multi-frame target images, and determining a plurality of characteristic point pairs matched in every two adjacent frames of target images in the multi-frame target images, wherein the characteristic point pairs comprise two mutually matched characteristic points;
obtaining direction vectors of a plurality of feature point pairs matched in every two adjacent frames of target images in the multi-frame target images, and determining target relative pose between every two adjacent frames of target images in the multi-frame target images based on the obtained direction vectors;
according to the target relative pose between every two adjacent frames of target images in the multi-frame target images, coordinate information of each feature point in every two adjacent frames of target images in the multi-frame target images in a target place is obtained;
and constructing a map of the target place based on the obtained coordinate information of each characteristic point in every two adjacent frames of target images in the multi-frame target images and the multi-frame original images.
In a third aspect, the application further provides a map generation device. The device comprises:
the acquisition module is used for acquiring a plurality of frames of original images obtained by shooting a target place through a large-field-of-view angle camera, and carrying out distortion correction processing on each frame of original image in the plurality of frames of original images to obtain a plurality of frames of target images after distortion correction;
the matching module is used for extracting the characteristic points of each frame of target image in the multi-frame target image to obtain a plurality of characteristic points in each frame of target image in the multi-frame target image; matching a plurality of characteristic points in every two adjacent frames of target images in the multi-frame target images, and determining a plurality of characteristic point pairs matched in every two adjacent frames of target images in the multi-frame target images, wherein the characteristic point pairs comprise two mutually matched characteristic points;
the determining module is used for acquiring the direction vectors of a plurality of feature point pairs matched in every two adjacent frames of target images in the multi-frame target images and determining the target relative pose between every two adjacent frames of target images in the multi-frame target images based on the acquired direction vectors;
the construction module is used for obtaining the coordinate information of each characteristic point in each adjacent two-frame target image in the multi-frame target image in the target place according to the target relative pose between each adjacent two-frame target image in the multi-frame target image; and constructing a map of the target place based on the obtained coordinate information of each characteristic point in every two adjacent frames of target images in the multi-frame target images and the multi-frame original images.
In a fourth aspect, the present application also provides a computer-readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the map generation method as described in the second aspect above.
According to the robot, the map generation method, the map generation device and the storage medium, the large-field-angle camera is used for shooting a target place to obtain multiple frames of original images, and distortion correction processing is carried out on each frame of original image in the multiple frames of original images to obtain multiple frames of target images; extracting feature points of each frame of target image in the multi-frame target image to obtain a plurality of feature points of each frame of target image; matching a plurality of characteristic points in every two adjacent frames of target images in the multi-frame target images, and determining a plurality of characteristic point pairs matched in every two adjacent frames of target images in the multi-frame target images; obtaining direction vectors of a plurality of feature point pairs matched in every two adjacent frames of target images in the multi-frame target images, and determining target relative pose between every two adjacent frames of target images in the multi-frame target images based on the obtained direction vectors; according to the target relative pose between every two adjacent frames of target images in the multi-frame target images, coordinate information of each feature point in every two adjacent frames of target images in the multi-frame target images in a target place is obtained; and constructing a map of the target place based on the obtained coordinate information of each characteristic point in every two adjacent frames of target images in the multi-frame target images and the multi-frame original images. Because the map construction is carried out based on the multi-frame original image shot by the large-field-angle camera, the shooting field of view of the large-field-angle camera is wide, the number of feature points which can be extracted in a target place is increased, and the map construction precision is improved. In addition, as the original image is subjected to distortion correction before the feature extraction, the accuracy of feature point matching is improved, and the accuracy of map construction is further improved.
Drawings
FIG. 1 is a schematic diagram of a robot in one embodiment;
FIG. 2 is a flow chart of steps performed by the processor in one embodiment;
FIG. 3 is a flow chart of step 101 in one embodiment;
FIG. 4 is an original image in one embodiment;
FIG. 5 is a target image in one embodiment;
FIG. 6 is a flow chart of step 104 in one embodiment;
FIG. 7 is a flowchart of step 104 in another embodiment;
FIG. 8 is a flow chart of step 402 in one embodiment;
FIG. 9 is a flow chart of step 404 in one embodiment;
FIG. 10 is a flowchart illustrating steps performed by a processor in another embodiment;
FIG. 11 is a block diagram showing a map generating apparatus in one embodiment;
fig. 12 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
Referring to fig. 1, a robot is provided in an embodiment of the present application. The robot comprises a large field angle camera 1, a memory and a processor.
The large-field angle camera 1 is used for shooting a target place to obtain a plurality of frames of original images.
Wherein the target site is a closed space, such as a room.
Optionally, the large angle-of-view camera 1 is a camera with an angle of view greater than a preset threshold. The range of the angle of view and the installation position of the large angle of view camera 1 can be adjusted according to the actual situation, and the present invention is not limited thereto.
Specifically, the large-field-angle camera 1 is a wide-angle camera, that is, a camera whose focal length is shorter and whose field angle is larger than those of a standard lens, but whose focal length is longer and whose field angle is smaller than those of a fisheye lens. Wide-angle cameras are divided into common wide-angle cameras and ultra-wide-angle cameras: a common wide-angle lens generally has a focal length of 38-24 mm and a field angle of 60-84 degrees, while an ultra-wide-angle lens has a focal length of 20-13 mm and a field angle of 94-118 degrees.
Optionally, the robot shoots the target place to obtain a plurality of frames of original images in the process of moving the target place.
And a memory for storing a computer program executable on the processor.
The memory comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media.
A processor for implementing the steps shown in fig. 2 when executing a computer program, comprising steps 101-106:
Step 101, performing distortion correction processing on each frame of original image in the multi-frame original image to obtain a multi-frame target image after distortion correction.
Optionally, distortion correction processing is performed on the original image based on a camera distortion model. The distortion model may be one of the following: the Brown distortion model, the equidistant distortion model, the unified camera model (UCM), the enhanced unified camera model (eUCM), the FOV distortion model, and the like.
Step 102, extracting the characteristic points of each frame of target image in the multi-frame target image to obtain a plurality of characteristic points in each frame of target image in the multi-frame target image.
Optionally, for each target image, a feature point extraction algorithm is used to extract the feature points in the target image. Feature point extraction algorithms include the SIFT (Scale-Invariant Feature Transform) algorithm, the Harris corner detection algorithm, the SURF (Speeded-Up Robust Features) algorithm, the ORB (Oriented FAST and Rotated BRIEF) algorithm, and the like.
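By way of illustration, a minimal sketch of this extraction step using OpenCV's ORB detector; the detector choice and the parameter values are assumptions, and cv2.SIFT_create() could be substituted.

```python
import cv2

# Detect feature points and compute descriptors on one distortion-corrected target image.
orb = cv2.ORB_create(nfeatures=2000)

def extract_features(target_image_gray):
    """Return keypoints and descriptors for a single frame of target image."""
    keypoints, descriptors = orb.detectAndCompute(target_image_gray, None)
    return keypoints, descriptors
```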
Step 103, matching a plurality of feature points in every two adjacent frames of target images in the multi-frame target images, and determining a plurality of feature point pairs matched in every two adjacent frames of target images in the multi-frame target images.
Wherein the pair of feature points includes two feature points that match each other.
Optionally, feature descriptors of feature points of each target image in two adjacent frames of target images are acquired, and feature point matching is performed according to the feature descriptors of the feature points.
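By way of illustration, a minimal sketch of this matching step with a brute-force matcher and a ratio test; the Hamming norm assumes binary (ORB-style) descriptors, and the ratio-test filtering is an assumed choice rather than a requirement of the embodiment.

```python
import cv2

# Match feature descriptors between two adjacent frames of target images and return
# the index pairs of mutually matched feature points (the matched feature point pairs).
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)

def match_feature_points(desc_prev, desc_next, ratio=0.75):
    pairs = []
    for candidates in matcher.knnMatch(desc_prev, desc_next, k=2):
        if len(candidates) < 2:
            continue
        best, second = candidates
        if best.distance < ratio * second.distance:   # keep only unambiguous matches
            pairs.append((best.queryIdx, best.trainIdx))
    return pairs
```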
Step 104, obtaining the direction vectors of a plurality of feature point pairs matched in every two adjacent frames of target images in the multi-frame target images, and determining the target relative pose between every two adjacent frames of target images in the multi-frame target images based on the obtained direction vectors.
The target relative pose comprises a target relative rotation matrix and a target relative translation matrix.
Optionally, the coordinate information of the projection of each feature point onto the target sphere is determined based on a projection mapping relation, and the direction vector of the feature point is obtained from this coordinate information. The projection mapping relation is determined according to the camera calibration internal parameters of the wide-angle camera.
Optionally, a first pose calculation formula is used to calculate the target relative pose between every two adjacent frames of target images. The first pose calculation formula is the epipolar constraint on the direction vectors of a matched feature point pair:

$(x_2\ \ y_2\ \ z_2)\,[T_1]_{\times}\,R_1\,(x_1\ \ y_1\ \ z_1)^{T} = 0, \qquad T_1 = [t_1\ \ t_2\ \ t_3]^{T},$

wherein $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ respectively represent the direction vectors of the two feature points included in a certain feature point pair in the two adjacent frames of target images; $R_1$ represents the target relative rotation matrix; $T_1$ represents the target relative translation matrix; and $[T_1]_{\times}$ denotes the skew-symmetric matrix of $T_1$.
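By way of illustration, a minimal sketch of this step in code: the constraint above is stacked into a linear system to estimate $E = [T_1]_{\times} R_1$ from at least eight matched direction vectors, and $E$ is then decomposed into four candidate poses, of which the one yielding positive depths is kept. The helper names and the plain least-squares formulation are assumptions, not part of the embodiment.

```python
import numpy as np

# Estimate E = [T1]x R1 from matched direction vectors d1, d2 of shape (N, 3),
# using the constraint d2^T E d1 = 0 for every matched feature point pair (N >= 8).
def estimate_essential(d1, d2):
    A = np.column_stack([d2[:, i] * d1[:, j] for i in range(3) for j in range(3)])
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt      # enforce essential-matrix structure

# Decompose E into the four candidate (R1, T1) pairs (translation up to scale).
def decompose_essential(E):
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    t = U[:, 2]
    return [(R, s * t) for R in (U @ W @ Vt, U @ W.T @ Vt) for s in (1.0, -1.0)]
```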
Step 105, obtaining the coordinate information of each characteristic point in each adjacent two-frame target image in the multi-frame target image at the target site according to the target relative pose between each adjacent two-frame target image in the multi-frame target image.
Wherein the coordinate information is determined based on a predefined coordinate system. Optionally, the predefined coordinate system is a world coordinate system.
Optionally, for any feature point pair of every two adjacent frames of target images, inputting the pixel coordinates and the target relative pose of each feature point in the feature point pair in the corresponding target image into the trained neural network model to obtain the coordinate information of the feature point pair in the target place. The neural network model may be a convolutional neural network model or a deep neural network model.
Optionally, according to the target relative pose between every two adjacent frames of target images in the multi-frame target images, the coordinate information of each feature point in every two adjacent frames of target images in the multi-frame target images in the target place is obtained by using a triangulation technology.
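By way of illustration, a minimal sketch of triangulating one matched feature point pair from its direction vectors and the target relative pose; the midpoint formulation is an assumed choice, since the embodiment only specifies that a triangulation technique is used.

```python
import numpy as np

# The first camera is taken as the reference frame; the second camera satisfies
# lam2 * d2 = R @ P + t for the 3D point P, where (R, t) is the target relative
# pose and d1, d2 are the pair's unit direction vectors.
def triangulate_pair(d1, d2, R, t):
    # Rays: P = lam1 * d1 (camera 1) and P = lam2 * (R^T @ d2) - R^T @ t (camera 2).
    A = np.column_stack([d1, -(R.T @ d2)])
    b = -(R.T @ t)
    lam, *_ = np.linalg.lstsq(A, b, rcond=None)
    p1 = lam[0] * d1                   # closest point on ray 1
    p2 = R.T @ (lam[1] * d2 - t)       # closest point on ray 2, expressed in frame 1
    return 0.5 * (p1 + p2)             # midpoint estimate of the coordinate information
```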
Step 106, constructing a map of the target place based on the coordinate information of each feature point in every two adjacent frames of target images in the obtained multi-frame target images and the multi-frame original images.
Optionally, the multiple frames of original images are combined, and the map of the target place is established according to the calculated coordinate information corresponding to each feature point pair.
Specifically, the multiple frames of original images are ordered according to the shooting sequence of the multiple frames of original images. And constructing a map of the target place based on the coordinate information of the feature points corresponding to the original images according to the arrangement sequence of the original images.
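By way of illustration, a minimal sketch of this assembly step, assuming each frame is represented as a (capture time, original image, triangulated feature point coordinates) tuple; the data layout is an assumption.

```python
# Order the frames by capture time and collect the triangulated coordinates of their
# feature points into one point map of the target place.
def build_map(frames):
    """frames: iterable of (capture_time, original_image, feature_point_coordinates)."""
    point_map = []
    for _, _, coords in sorted(frames, key=lambda f: f[0]):
        point_map.extend(coords)
    return point_map
```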
In the embodiment, a large-field-angle camera is used for shooting a target place to obtain a plurality of frames of original images, and distortion correction processing is carried out on each frame of original image in the plurality of frames of original images to obtain a plurality of frames of target images; extracting feature points of each frame of target image in the multi-frame target image to obtain a plurality of feature points of each frame of target image; matching a plurality of characteristic points in every two adjacent frames of target images in the multi-frame target images, and determining a plurality of characteristic point pairs matched in every two adjacent frames of target images in the multi-frame target images; obtaining direction vectors of a plurality of feature point pairs matched in every two adjacent frames of target images in the multi-frame target images, and determining target relative pose between every two adjacent frames of target images in the multi-frame target images based on the obtained direction vectors; according to the target relative pose between every two adjacent frames of target images in the multi-frame target images, coordinate information of each feature point in every two adjacent frames of target images in the multi-frame target images in a target place is obtained; and constructing a map of the target place based on the obtained coordinate information of each characteristic point in every two adjacent frames of target images in the multi-frame target images and the multi-frame original images. Because the map construction is carried out based on the multi-frame original image shot by the large-field-angle camera, the shooting field of view of the large-field-angle camera is wide, the number of feature points which can be extracted in a target place is increased, and the map construction precision is improved. In addition, as the original image is subjected to distortion correction before the feature extraction, the accuracy of feature point matching is improved, and the accuracy of map construction is further improved.
In one embodiment, as shown in fig. 3, based on the embodiment shown in fig. 2, the embodiment relates to a process of performing distortion correction processing on each frame of original image in multiple frames of original images in step 101 to obtain a multi-frame target image after distortion correction, where the implementation process includes the following steps:
step 201, for a plurality of frames of original images, mapping each pixel point in each frame of original image in the plurality of frames of original images onto a target sphere, so as to obtain a first mapping coordinate of each pixel point in each frame of original image in the plurality of frames of original images on the target sphere.
The target spherical surface is determined according to camera calibration internal parameters of the large-view-angle camera. The radius of the target sphere is 1.
Optionally, the first mapping coordinate of each pixel point in the original image on the target sphere is calculated according to the first coordinate conversion formula; for the unified camera model mentioned above, this unprojection takes the form

$m_x = \dfrac{u - c_x}{f_x},\quad m_y = \dfrac{v - c_y}{f_y},\quad r^2 = m_x^2 + m_y^2,\quad s = \dfrac{\xi + \sqrt{1 + (1 - \xi^2)\,r^2}}{1 + r^2},\quad (x, y, z) = (s\,m_x,\ s\,m_y,\ s - \xi),$

wherein $(u, v)$ represents the coordinates of a pixel point in the original image; $f_x, f_y, c_x, c_y, \xi$ represent the camera calibration internal parameters; and $(x, y, z)$ represents the first mapping coordinate corresponding to the pixel point on the unit target sphere.
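By way of illustration, a minimal sketch of this first mapping in code, assuming the unified-camera-model form given above; a different calibrated model would use its own lifting formula.

```python
import numpy as np

# Back-project a pixel (u, v) of an original image onto the unit target sphere
# (its first mapping coordinate), given intrinsics fx, fy, cx, cy and parameter xi.
def pixel_to_sphere(u, v, fx, fy, cx, cy, xi):
    mx = (u - cx) / fx
    my = (v - cy) / fy
    r2 = mx * mx + my * my
    s = (xi + np.sqrt(1.0 + (1.0 - xi * xi) * r2)) / (1.0 + r2)   # lifting factor
    return np.array([s * mx, s * my, s - xi])   # unit-norm first mapping coordinate
```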
Step 202, mapping the first mapping coordinates of each pixel point in each frame of original image in the multi-frame original image on the target sphere to the surface of the target virtual cube again to obtain the second mapping coordinates of each pixel point in each frame of original image in the multi-frame original image on the target virtual cube.
Optionally, for each pixel point in each frame of original image, an internal reference matrix K corresponding to the target virtual cube is obtained, and the second mapping coordinate of the pixel point on the target virtual cube is calculated from K by using the following second coordinate conversion formula:

$s'\,[u'\ \ v'\ \ 1]^{T} = K\,R\,[x\ \ y\ \ z]^{T},$

wherein $(x, y, z)$ represents the first mapping coordinate corresponding to the pixel point in the original image; $R$ is the 3x3 projection matrix corresponding to the first mapping coordinate; $s'$ is a normalization factor; and $(u', v')$ represents the second mapping coordinate corresponding to the pixel point.
Specifically, the internal reference matrix K may be determined according to the side length of the target virtual cube. If the side length of the target virtual cube is a, the corresponding internal reference matrix is

$K = \begin{bmatrix} a/2 & 0 & a/2 \\ 0 & a/2 & a/2 \\ 0 & 0 & 1 \end{bmatrix}.$

Specifically, the center of the target sphere coincides with the center point of the virtual cube. For each first mapping coordinate, the corresponding projection matrix is determined according to the face of the virtual cube intersected by the line connecting the first mapping coordinate and the origin. In particular, the projection matrix R corresponding to the front face of the virtual cube is the identity matrix, and the projection matrices corresponding to the left and right faces are the rotations of 90 degrees about the vertical axis that bring the respective face in front of the camera.
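By way of illustration, a minimal sketch of this second mapping; the face intrinsic matrix, the face rotations, and the rule of picking the face with the largest forward component follow the conventions just described and are assumptions rather than values fixed by the embodiment.

```python
import numpy as np

# Rotations that bring the front, left and right faces of the virtual cube in front
# of the camera (the front face is assumed to look along +z).
FACE_ROTATIONS = {
    "front": np.eye(3),
    "left":  np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]]),
    "right": np.array([[0.0, 0.0, -1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]),
}

# Project a first mapping coordinate (x, y, z) on the target sphere onto one face of
# the target virtual cube of side length a, returning the second mapping coordinate.
def sphere_to_cube(point_on_sphere, a):
    K = np.array([[a / 2.0, 0.0, a / 2.0],
                  [0.0, a / 2.0, a / 2.0],
                  [0.0, 0.0, 1.0]])
    # Pick the face whose rotated ray has the largest positive z component.
    name, R = max(FACE_ROTATIONS.items(), key=lambda f: (f[1] @ point_on_sphere)[2])
    p = K @ (R @ point_on_sphere)
    return name, p[0] / p[2], p[1] / p[2]
```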
Step 203, obtaining a distortion-corrected target image corresponding to each frame of original image in the multi-frame original image according to the second mapping coordinates of each pixel point in each frame of original image in the multi-frame original image on the surface of the target virtual cube.
The distortion correction processing effect corresponding to the above steps can be seen in fig. 4 and fig. 5, and the original image shown in fig. 4 is subjected to the distortion correction processing corresponding to the above steps to obtain the target image shown in fig. 5.
In the embodiment, the distortion correction processing of the original image is realized by performing two mapping processes on the pixel points in the original image, so that the distortion degree of the image is reduced, and the accuracy of map construction is improved.
In one embodiment, as shown in fig. 6, based on the embodiment shown in fig. 3, the embodiment relates to an implementation process of obtaining, in step 104, a direction vector of a plurality of feature point pairs matched in every two adjacent frames of target images in a plurality of frames of target images, where the implementation process includes the following steps:
step 301, for each feature point pair in every two adjacent frames of target images in multiple frames of target images, obtaining first mapping coordinates of two feature points matched with each other in the feature point pairs on a target sphere.
Optionally, a world coordinate system is constructed with the center point of the target sphere as the origin. The first mapped coordinates are based on coordinates obtained from the constructed world coordinate system.
Step 302, using the first mapping coordinates corresponding to each feature point in the feature point pair as the direction vector of each feature point in the feature point pair.
Optionally, the direction vector is a vector from the origin to the first mapping coordinate.
In this embodiment, the first mapping coordinates corresponding to each feature point in the feature point pair are used as the direction vectors of each feature point in the feature point pair, so that the method is simple and the calculation amount is small.
In one embodiment, as shown in fig. 7, based on the embodiment shown in fig. 6, the embodiment relates to the implementation process of determining, in step 104, the target relative pose between every two adjacent frames of target images in the multiple frames of target images based on the obtained direction vector:
step 401, determining two original images shot earliest from the multiple original images according to shooting time of each original image in the multiple original images.
The two frames of original images which are shot earliest comprise a first original image and a second original image, and the shooting time of the first original image is earlier than that of the second original image.
And step 402, obtaining the target relative pose between the two adjacent frames of target images corresponding to the first original image and the second original image based on the direction vectors of the matched feature point pairs in the two adjacent frames of target images corresponding to the first original image and the second original image.
Optionally, the target relative pose between two adjacent frames of target images corresponding to the first original image and the second original image is calculated by using the first pose calculation formula.
Step 403, obtaining a plurality of frames of residual target images except the target image corresponding to the first original image in the plurality of frames of target images.
The multi-frame residual target images are the target images corresponding to the original images other than the first original image in the multi-frame original images.
And step 404, obtaining the target relative pose between every two adjacent frames of residual target images in the multi-frame residual target images based on the direction vectors and the coordinate information of the matched feature point pairs in every two adjacent frames of residual target images in the multi-frame residual target images.
Optionally, for a certain pair of adjacent frames of residual target images, two adjacent frames of target images whose shooting times precede them are acquired, and the coordinate information corresponding to the feature point pairs in those two adjacent frames of target images is calculated according to the target relative pose between them, so as to obtain the coordinate information of the matched feature point pairs in the two adjacent frames of residual target images.
Optionally, for each two adjacent frames of residual target images, a PnP algorithm is used to obtain the target relative pose corresponding to the two adjacent frames of residual target images from the direction vectors and the coordinate information of the feature point pairs, where the PnP algorithm may be the P3P algorithm, the Direct Linear Transformation (DLT) algorithm, EPnP, and the like. Specifically, the optimization function corresponding to the PnP algorithm is a reprojection error optimization function.
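By way of illustration, a minimal sketch of this PnP step using OpenCV; converting direction vectors with positive z into normalized image points is an assumed workaround so that cv2.solvePnPRansac can be applied with an identity camera matrix, while a dedicated bearing-vector PnP solver would also handle directions pointing behind the image plane.

```python
import cv2
import numpy as np

# Estimate the target relative pose from the coordinate information of first target
# feature points (points_3d, shape (N, 3)) and the direction vectors of their matches
# in the second residual target image (direction_vectors, shape (N, 3)).
def pnp_pose(points_3d, direction_vectors):
    pts = np.asarray(points_3d, dtype=np.float64)
    dirs = np.asarray(direction_vectors, dtype=np.float64)
    keep = dirs[:, 2] > 1e-6                          # usable as normalized image points
    normalized = dirs[keep, :2] / dirs[keep, 2:3]     # (x/z, y/z)
    ok, rvec, tvec, _ = cv2.solvePnPRansac(
        pts[keep], normalized, np.eye(3), None, flags=cv2.SOLVEPNP_EPNP)
    R, _ = cv2.Rodrigues(rvec)                        # rotation vector -> matrix
    return ok, R, tvec.ravel()
```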
In this embodiment, two different rules are used to calculate the target relative pose between adjacent frames of target images (one for the two earliest frames and another for the remaining frames), which improves the accuracy of the target relative pose.
In one embodiment, the target relative pose includes a target relative rotation matrix and a target relative translation matrix, as shown in fig. 8, based on the embodiment shown in fig. 6, the embodiment relates to the implementation of step 402, including the steps of:
step 501, for each first feature point pair of a plurality of feature point pairs matched in two adjacent frame target images corresponding to the first original image and the second original image, a direction vector of the first feature point pair and a reference relative rotation matrix are obtained.
Wherein the reference relative rotation matrix is determined according to the projection plane of the first feature point pair corresponding to the target virtual cube.
Optionally, the reference relative rotation matrix is the projection matrix R corresponding to the first mapping coordinate mentioned above.
Step 502, obtaining a reference relative translation matrix of the first feature point pair based on the direction vector of the first feature point pair and the reference relative rotation matrix.
Optionally, the projection matrix R corresponding to the first feature point pair is used as the target relative rotation matrix $R_1$ in the above-mentioned first pose calculation formula; substituting it into the first pose calculation formula yields $T_1$, which is the reference relative translation matrix of the first feature point pair.
In step 503, the reference relative rotation matrix is used as a target relative rotation matrix between two adjacent frame target images corresponding to the first original image and the second original image.
And step 504, taking the reference relative translation matrix as a target relative translation matrix between two adjacent frame target images corresponding to the first original image and the second original image.
In this embodiment, when calculating the relative pose of the two adjacent frames of target images corresponding to the first original image and the second original image, the target relative rotation matrix is determined directly from the projection face of the target virtual cube corresponding to the first feature point pair, so that the calculation amount is small.
In one embodiment, the target relative pose includes a target relative rotation matrix and a target relative translation matrix, as shown in fig. 9, based on the embodiment shown in fig. 6, the embodiment involves the implementation of step 404, which includes the following steps:
in step 601, for each two adjacent frames of residual target images in the multi-frame residual target images, coordinate information of a first target feature point of a first residual target image with an earlier shooting time in the two adjacent frames of residual target images and a direction vector of a second target feature point matched with the first target feature point in a second residual target image with a later shooting time are obtained.
Step 602, calculating a relative pose between the first residual target image and the second residual target image based on the coordinate information of the first target feature point and the direction vector of the second target feature point.
Optionally, the relative pose between the first residual target image and the second residual target image is calculated according to a second pose calculation formula, which relates the coordinate of the first target feature point to the direction vector of its match:

$\lambda\,d_1 = R_{12}\,P + t_{12},$

wherein $R_{12}$ is the target relative rotation matrix between the first residual target image and the second residual target image; $t_{12}$ is the target relative translation matrix between the first residual target image and the second residual target image; $P$ represents the coordinate information corresponding to the first target feature point; $d_1$ represents the direction vector corresponding to the second target feature point; and $\lambda$ is the depth of the point along $d_1$.
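By way of illustration, the per-correspondence residual that a reprojection-error optimization of the formula above could minimize; the angular form of the residual is an assumption.

```python
import numpy as np

# Angle between the direction predicted by the second pose calculation formula
# (R12 @ P + t12) and the observed direction vector d1 of the second target feature point.
def bearing_residual(R12, t12, P, d1):
    predicted = R12 @ P + t12
    predicted = predicted / np.linalg.norm(predicted)
    observed = d1 / np.linalg.norm(d1)
    return np.arccos(np.clip(predicted @ observed, -1.0, 1.0))
```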
In this embodiment, the relative pose between the first residual target image and the second residual target image is obtained through the coordinate information of the first target feature point and the direction vector of the second target feature point, so that the calculated amount is small, and the data processing efficiency is high.
In an embodiment, as shown in fig. 10, based on the embodiment shown in fig. 9, the processor is further configured to implement the following steps when executing the computer program:
in step 701, it is determined whether a feature point matching any feature point can be found in the reference target image corresponding to the first residual target image, for any feature point of the first residual target image.
The shooting time of the original image corresponding to the reference target image is adjacent to the shooting time of the original image corresponding to the first residual target image and is located before the shooting time of the original image corresponding to the first residual target image.
Step 702, if the target feature point can be found, taking any one feature point as the first target feature point.
Optionally, in order to reduce the calculation amount of the coordinate information corresponding to each feature point in the target image, for each two adjacent frames of residual target images in the multi-frame residual target image, only the coordinate information corresponding to the feature points except the first target feature point in each two adjacent frames of residual target images is calculated.
According to the embodiment, the accuracy of the relative pose is improved by determining the first target feature point so that the relative pose can be calculated only according to the coordinate information corresponding to the first target feature point.
In one embodiment, a robot is provided. The robot comprises a large field angle camera 1, a memory and a processor for implementing the following steps when executing a computer program:
step 801, for a plurality of frames of original images, mapping each pixel point in each frame of original image in the plurality of frames of original images onto a target sphere, so as to obtain a first mapping coordinate of each pixel point in each frame of original image in the plurality of frames of original images on the target sphere.
Step 802, mapping the first mapping coordinates of each pixel point in each frame of original image in the multi-frame original image on the target sphere to the surface of the target virtual cube again to obtain the second mapping coordinates of each pixel point in each frame of original image in the multi-frame original image on the target virtual cube.
Step 803, obtaining a distortion corrected target image corresponding to each frame of original image in the multi-frame original image according to the second mapping coordinates of each pixel point in each frame of original image in the multi-frame original image on the surface of the target virtual cube.
Step 804, extracting feature points from each frame of target image in the multi-frame target image to obtain a plurality of feature points in each frame of target image in the multi-frame target image.
And step 805, matching a plurality of feature points in every two adjacent frames of target images in the multi-frame target image, and determining a plurality of feature point pairs matched in every two adjacent frames of target images in the multi-frame target image.
Wherein the pair of feature points includes two feature points that match each other.
Step 806, for each feature point pair in every two adjacent frames of target images in the multi-frame target image, obtaining the first mapping coordinates of two feature points matched with each other in the feature point pair on the target sphere.
Step 807, using the first mapping coordinates corresponding to each feature point in the feature point pair as the direction vector of each feature point in the feature point pair.
Step 808, determining the two original images shot earliest from the multiple original images according to the shooting time of each original image in the multiple original images.
The two frames of original images which are shot earliest comprise a first original image and a second original image, and the shooting time of the first original image is earlier than that of the second original image.
Step 809, for each first feature point pair of the plurality of feature point pairs matched in the two adjacent frame target images corresponding to the first original image and the second original image, acquiring a direction vector of the first feature point pair and a reference relative rotation matrix.
Wherein the reference relative rotation matrix is determined according to the projection plane of the first feature point pair corresponding to the target virtual cube.
Step 810, obtaining a reference relative translation matrix of the first feature point pair based on the direction vector of the first feature point pair and the reference relative rotation matrix.
Step 811, taking the reference relative rotation matrix as a target relative rotation matrix between two adjacent frames of target images corresponding to the first original image and the second original image; and taking the reference relative translation matrix as a target relative translation matrix between two adjacent frame target images corresponding to the first original image and the second original image.
And step 812, obtaining coordinate information of each first characteristic point pair at the target site by utilizing a triangulation technology according to the target relative rotation matrix and the target relative translation matrix between two adjacent frames of target images corresponding to the first original image and the second original image.
Step 813, obtaining a plurality of frames of residual target images except the target image corresponding to the first original image in the plurality of frames of target images;
step 814, for any feature point of the first remaining target image, it is determined whether a feature point matching any feature point can be found in the reference target image corresponding to the first remaining target image.
The shooting time of the original image corresponding to the reference target image is adjacent to the shooting time of the original image corresponding to the first residual target image and is located before the shooting time of the original image corresponding to the first residual target image.
Step 815, if such a feature point can be found, taking that feature point as the first target feature point.
In step 816, for each two adjacent frames of residual target images in the multi-frame residual target images, coordinate information of a first target feature point of a first residual target image with an earlier shooting time in the two adjacent frames of residual target images and a direction vector of a second target feature point matched with the first target feature point in a second residual target image with a later shooting time are obtained.
In step 817, based on the coordinate information of the first target feature point and the direction vector of the second target feature point, the relative pose between the first residual target image and the second residual target image is calculated.
Step 818, obtaining coordinate information of each first feature point pair at the target site by utilizing a triangulation technology according to the relative pose between the first residual target image and the second residual target image.
And step 819, constructing a map of the target place based on the obtained coordinate information of each feature point in every two adjacent frames of target images in the multiple frames of target images and the multiple frames of original images.
In an embodiment, based on the same inventive concept, an embodiment of the present application provides a map generation method. The execution subject of the method may be a map generation apparatus, which is disposed on a robot and may be implemented, in part or in whole, by software, hardware, or a combination of software and hardware on a terminal of the robot. The terminal may be a personal computer, a notebook computer, a media player, a smart television, a smart phone, a tablet computer, a portable wearable device, or the like. The implementation of the solution provided by the method is similar to that described for the robot above, so for the specific limitations in one or more embodiments of the map generation method provided below, reference may be made to the limitations on the robot above, which are not repeated here.
In one embodiment, a map generation method is provided, the method comprising:
obtaining a plurality of frames of original images obtained by shooting a target place through a large-field-angle camera, and carrying out distortion correction processing on each frame of original image in the plurality of frames of original images to obtain a plurality of frames of target images after distortion correction;
extracting feature points of each frame of target image in the multi-frame target image to obtain a plurality of feature points of each frame of target image in the multi-frame target image;
matching a plurality of characteristic points in every two adjacent frames of target images in the multi-frame target images, and determining a plurality of characteristic point pairs matched in every two adjacent frames of target images in the multi-frame target images, wherein the characteristic point pairs comprise two mutually matched characteristic points;
obtaining direction vectors of a plurality of feature point pairs matched in every two adjacent frames of target images in the multi-frame target images, and determining target relative pose between every two adjacent frames of target images in the multi-frame target images based on the obtained direction vectors;
according to the target relative pose between every two adjacent frames of target images in the multi-frame target images, coordinate information of each feature point in every two adjacent frames of target images in the multi-frame target images in a target place is obtained;
And constructing a map of the target place based on the obtained coordinate information of each characteristic point in every two adjacent frames of target images in the multi-frame target images and the multi-frame original images.
In one embodiment, based on the above embodiment, the performing distortion correction processing on each frame of original image in the multiple frames of original images to obtain a multiple frame target image after distortion correction includes:
for a plurality of frames of original images, mapping each pixel point in each frame of original image in the plurality of frames of original images onto a target spherical surface to obtain a first mapping coordinate of each pixel point in each frame of original image in the plurality of frames of original images on the target spherical surface, wherein the target spherical surface is determined according to camera calibration internal parameters of a large-view camera;
mapping the first mapping coordinates of the pixel points in each frame of original image in the multi-frame original image on the target sphere onto the surface of a target virtual cube again, to obtain second mapping coordinates of the pixel points in each frame of original image in the multi-frame original image on the target virtual cube;
and obtaining a distortion corrected target image corresponding to each frame of original image in the multi-frame original image according to the second mapping coordinates of each pixel point in each frame of original image in the multi-frame original image on the surface of the target virtual cube.
In one embodiment, based on the above embodiment, the obtaining a direction vector of a plurality of feature point pairs matched in every two adjacent frames of target images in the multi-frame target image includes:
for each feature point pair in every two adjacent frames of target images in the multi-frame target images, obtaining first mapping coordinates of two feature points matched with each other in the feature point pairs on a target sphere; and taking the first mapping coordinates corresponding to each characteristic point in the characteristic point pair as the direction vector of each characteristic point in the characteristic point pair.
In one embodiment, based on the above embodiment, the determining, based on the obtained direction vector, the target relative pose between each two adjacent frames of target images in the multiple frames of target images includes:
according to the shooting moment of each frame of original image in the multi-frame original image, determining two earliest frames of original images from the multi-frame original images, wherein the earliest two frames of original images comprise a first original image and a second original image, and the shooting moment of the first original image is earlier than that of the second original image;
based on the direction vectors of a plurality of matched characteristic point pairs in two adjacent frame target images corresponding to the first original image and the second original image, obtaining target relative pose between the two adjacent frame target images corresponding to the first original image and the second original image;
Acquiring a plurality of frames of residual target images except the target image corresponding to the first original image in the plurality of frames of target images;
and obtaining the target relative pose between every two adjacent frames of residual target images in the multi-frame residual target images based on the direction vectors and the coordinate information of the matched multiple feature point pairs in every two adjacent frames of residual target images in the multi-frame residual target images.
In one embodiment, based on the above embodiment, the target relative pose comprises a target relative rotation matrix and a target relative translation matrix; the method for obtaining the target relative pose between two adjacent frames of target images corresponding to the first original image and the second original image based on the direction vectors of a plurality of matched feature point pairs in the two adjacent frames of target images corresponding to the first original image and the second original image comprises the following steps:
for each first characteristic point pair of a plurality of characteristic point pairs matched in two adjacent frame target images corresponding to a first original image and a second original image, acquiring a direction vector of the first characteristic point pair and a reference relative rotation matrix, wherein the reference relative rotation matrix is determined according to a projection plane of a target virtual cube corresponding to the first characteristic point pair;
Obtaining a reference relative translation matrix of the first characteristic point pair based on the direction vector of the first characteristic point pair and the reference relative rotation matrix;
taking the reference relative rotation matrix as a target relative rotation matrix between two adjacent frame target images corresponding to the first original image and the second original image; and taking the reference relative translation matrix as a target relative translation matrix between two adjacent frame target images corresponding to the first original image and the second original image.
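For illustration, the sketch below obtains a relative rotation and translation from the matched direction vectors of the first two frames using the standard essential-matrix route in OpenCV. This is a generic substitute for, not a reproduction of, the cube-face-based reference rotation described above, and it assumes all direction vectors point into the half-space z > 0 so they can be projected onto a normalized image plane.

```python
import cv2
import numpy as np

def initial_relative_pose(bearings_a, bearings_b):
    """Generic two-view initialization from matched direction vectors
    (essential-matrix estimation followed by pose recovery)."""
    # Project unit bearings onto the z = 1 plane so OpenCV's pinhole
    # routines can be reused (assumes all rays satisfy z > 0).
    pts_a = np.array([[b[0] / b[2], b[1] / b[2]] for b in bearings_a])
    pts_b = np.array([[b[0] / b[2], b[1] / b[2]] for b in bearings_b])
    K = np.eye(3)                                  # already normalized
    E, inliers = cv2.findEssentialMat(pts_a, pts_b, K,
                                      method=cv2.RANSAC, threshold=1e-3)
    _, R, t, _ = cv2.recoverPose(E, pts_a, pts_b, K, mask=inliers)
    return R, t                                    # rotation + unit-norm translation
```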
In one embodiment, the obtaining the target relative pose between each two adjacent frames of residual target images in the multiple frames of residual target images based on the direction vectors and the coordinate information of the matched feature point pairs in each two adjacent frames of residual target images in the multiple frames of residual target images includes:
for each two adjacent frames of residual target images in the multi-frame residual target images, acquiring coordinate information of a first target feature point of a first residual target image with earlier shooting time in the two adjacent frames of residual target images and a direction vector of a second target feature point matched with the first target feature point in a second residual target image with later shooting time;
and calculating the relative pose between the first residual target image and the second residual target image based on the coordinate information of the first target feature point and the direction vector of the second target feature point.
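A common way to realize this computation for the remaining frames is a perspective-n-point (PnP) solve on the already-triangulated coordinates and the matched direction vectors. The sketch below is that generic formulation, shown only for illustration, and again assumes the direction vectors point into z > 0.

```python
import cv2
import numpy as np

def pose_from_points_and_bearings(points_3d, bearings):
    """Pose of the later frame from the 3D coordinates of tracked feature
    points and their matched direction vectors in that frame (generic
    RANSAC PnP; needs at least four correspondences)."""
    obj = np.asarray(points_3d, dtype=np.float64)                    # (N, 3)
    img = np.array([[b[0] / b[2], b[1] / b[2]] for b in bearings])   # z = 1 plane
    K = np.eye(3)                                  # normalized coordinates
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj, img, K, None)
    R, _ = cv2.Rodrigues(rvec)                     # rotation vector -> matrix
    return R, tvec
```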
In one embodiment, based on the above embodiment, the map generating method includes:
for any feature point of the first residual target image, judging whether a feature point matched with that feature point can be found in a reference target image corresponding to the first residual target image, wherein the shooting moment of the original image corresponding to the reference target image is adjacent to, and earlier than, the shooting moment of the original image corresponding to the first residual target image;
if such a matched feature point can be found, that feature point is used as the first target feature point.
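The bookkeeping implied by this embodiment can be sketched as follows; the match format and the known_3d store (mapping a reference-image feature index to its triangulated coordinate) are hypothetical stand-ins for whatever structure the implementation keeps.

```python
def select_tracked_points(features_current, matches_to_reference, known_3d):
    """Keep only features of the current image that were matched in the
    preceding reference image and already have a triangulated coordinate."""
    tracked = []
    for cur_idx, ref_idx in matches_to_reference:      # (current, reference) index pairs
        if ref_idx in known_3d:
            tracked.append((features_current[cur_idx], known_3d[ref_idx]))
    return tracked
```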
In one embodiment, based on the above embodiment, the obtaining, according to the target relative pose between every two adjacent frames of target images in the multi-frame target images, the coordinate information of each feature point in every two adjacent frames of target images includes:
acquiring, by a triangulation technique and according to the target relative pose between every two adjacent frames of target images, the coordinate information, in the target place, of each feature point in every two adjacent frames of target images.
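As an illustration of the triangulation step, the sketch below performs linear (DLT) triangulation of one matched feature pair from the two camera poses and the pair's direction vectors; it is a generic formulation and not necessarily the triangulation technique used in the method.

```python
import numpy as np

def triangulate(pose_a, pose_b, bearing_a, bearing_b):
    """Linear (DLT) triangulation of one matched feature pair.
    Each pose is (R, t) with R a 3x3 rotation and t a 3x1 translation
    mapping world coordinates into that camera frame."""
    P_a = np.hstack(pose_a)          # 3x4 projection in normalized coordinates
    P_b = np.hstack(pose_b)
    A = np.vstack([
        bearing_a[0] * P_a[2] - bearing_a[2] * P_a[0],
        bearing_a[1] * P_a[2] - bearing_a[2] * P_a[1],
        bearing_b[0] * P_b[2] - bearing_b[2] * P_b[0],
        bearing_b[1] * P_b[2] - bearing_b[2] * P_b[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]                       # homogeneous solution
    return X[:3] / X[3]              # coordinates in the target place
```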
It should be understood that, although the steps in the flowcharts of the above embodiments are shown in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may include a plurality of sub-steps or stages, which are not necessarily executed at the same time but may be executed at different times; these sub-steps or stages are not necessarily executed sequentially, but may be executed in turn or alternately with at least some of the other steps or sub-steps.
Based on the same inventive concept, an embodiment of the application further provides a map generation apparatus for implementing the above map generation method. The solution provided by the apparatus is implemented in a manner similar to that described for the method, so for specific limitations of the one or more embodiments of the map generation apparatus provided below, reference may be made to the limitations of the map generation method above, and details are not repeated here.
In one embodiment, as shown in fig. 11, there is provided a map generating apparatus including: an acquisition module 100, a matching module 200, a determination module 300, and a construction module 400, wherein:
the acquisition module 100 is configured to acquire a plurality of frames of original images obtained by shooting a target location with a large field angle camera, and perform distortion correction processing on each frame of original image in the plurality of frames of original images to obtain a plurality of frames of target images after distortion correction;
the matching module 200 is configured to perform feature point extraction processing on each frame of target image in the multiple frames of target images to obtain multiple feature points in each frame of target image in the multiple frames of target images; matching a plurality of characteristic points in every two adjacent frames of target images in the multi-frame target images, and determining a plurality of characteristic point pairs matched in every two adjacent frames of target images in the multi-frame target images, wherein the characteristic point pairs comprise two mutually matched characteristic points;
The determining module 300 is configured to obtain direction vectors of a plurality of feature point pairs matched in every two adjacent frame of target images in the multiple frame of target images, and determine a target relative pose between every two adjacent frames of target images in the multiple frame of target images based on the obtained direction vectors;
the construction module 400 is configured to obtain coordinate information of each feature point in each two adjacent frame of target images in the multi-frame target image at the target location according to the target relative pose between each two adjacent frames of target images in the multi-frame target image; and constructing a map of the target place based on the obtained coordinate information of each characteristic point in every two adjacent frames of target images in the multi-frame target images and the multi-frame original images.
In one embodiment, the obtaining module 100 is specifically configured to:
map each pixel point in each frame of original image onto a target sphere to obtain a first mapping coordinate of each pixel point on the target sphere, wherein the target sphere is determined according to the camera calibration internal parameters of the large-field-angle camera;
re-map the first mapping coordinates of each pixel point in each frame of original image on the surface of the target sphere onto the surface of a target virtual cube to obtain second mapping coordinates of each pixel point on the target virtual cube;
and obtain a distortion-corrected target image corresponding to each frame of original image according to the second mapping coordinates of each pixel point on the surface of the target virtual cube.
In one embodiment, the matching module 200 is specifically configured to:
obtain, for each feature point pair in every two adjacent frames of target images, the first mapping coordinates, on the target sphere, of the two mutually matched feature points in the feature point pair; and take the first mapping coordinate corresponding to each feature point in the feature point pair as the direction vector of that feature point.
In one embodiment, the determining module 300 is specifically configured to:
according to the shooting moment of each frame of original image in the multi-frame original image, determining two earliest frames of original images from the multi-frame original images, wherein the earliest two frames of original images comprise a first original image and a second original image, and the shooting moment of the first original image is earlier than that of the second original image;
based on the direction vectors of a plurality of matched characteristic point pairs in two adjacent frame target images corresponding to the first original image and the second original image, obtaining target relative pose between the two adjacent frame target images corresponding to the first original image and the second original image;
Acquiring a plurality of frames of residual target images except the target image corresponding to the first original image in the plurality of frames of target images;
and obtaining the target relative pose between every two adjacent frames of residual target images in the multi-frame residual target images based on the direction vectors and the coordinate information of the matched multiple feature point pairs in every two adjacent frames of residual target images in the multi-frame residual target images.
In one embodiment, the target relative pose comprises a target relative rotation matrix and a target relative translation matrix; the determining module 300 is further specifically configured to:
for each first characteristic point pair of a plurality of characteristic point pairs matched in two adjacent frame target images corresponding to a first original image and a second original image, acquiring a direction vector of the first characteristic point pair and a reference relative rotation matrix, wherein the reference relative rotation matrix is determined according to a projection plane of a target virtual cube corresponding to the first characteristic point pair;
obtaining a reference relative translation matrix of the first characteristic point pair based on the direction vector of the first characteristic point pair and the reference relative rotation matrix;
taking the reference relative rotation matrix as a target relative rotation matrix between two adjacent frame target images corresponding to the first original image and the second original image; and taking the reference relative translation matrix as a target relative translation matrix between two adjacent frame target images corresponding to the first original image and the second original image.
In one embodiment, the determining module 300 is further specifically configured to:
for each two adjacent frames of residual target images in the multi-frame residual target images, acquiring coordinate information of a first target feature point of a first residual target image with earlier shooting time in the two adjacent frames of residual target images and a direction vector of a second target feature point matched with the first target feature point in a second residual target image with later shooting time;
and calculating the relative pose between the first residual target image and the second residual target image based on the coordinate information of the first target feature point and the direction vector of the second target feature point.
In one embodiment, the map generating apparatus is further configured to:
for any feature point of the first residual target image, judge whether a feature point matched with that feature point can be found in a reference target image corresponding to the first residual target image, wherein the shooting moment of the original image corresponding to the reference target image is adjacent to, and earlier than, the shooting moment of the original image corresponding to the first residual target image;
if such a matched feature point can be found, use that feature point as the first target feature point.
In one embodiment, the construction module 400 is specifically configured to:
and according to the target relative pose between every two adjacent frames of target images in the multi-frame target images, acquiring the coordinate information of each characteristic point in the target place in every two adjacent frames of target images in the multi-frame target images by utilizing a triangulation technology.
The respective modules in the map generation apparatus described above may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a terminal, and the internal structure thereof may be as shown in fig. 12. The computer device includes a processor, a memory, and a communication interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless mode can be realized through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement a map generation method.
It will be appreciated by those skilled in the art that the structure shown in FIG. 12 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a particular computer device may include more or fewer components than shown, combine some of the components, or have a different arrangement of components.
In one embodiment, a computer readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps in the above method embodiments.
Those skilled in the art will appreciate that all or part of the above methods may be implemented by a computer program stored on a non-transitory computer-readable storage medium; when executed, the computer program may include the processes of the above method embodiments. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. The volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take a variety of forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processor referred to in the embodiments provided herein may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computing, or the like, but is not limited thereto.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered to fall within the scope of this specification.
The foregoing examples describe only a few embodiments of the application in detail and are not to be construed as limiting its scope. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of protection of the application. Accordingly, the scope of protection of the application shall be determined by the appended claims.

Claims (10)

1. The robot is characterized by comprising a large-field-angle camera, a memory and a processor, wherein the large-field-angle camera is used for shooting a target place to obtain a plurality of frames of original images, the memory stores a computer program capable of running on the processor, and the processor is used for realizing the following steps when executing the computer program:
Carrying out distortion correction processing on each frame of original image in the multi-frame original image to obtain a multi-frame target image after distortion correction;
extracting feature points from each frame of target image in the multi-frame target image to obtain a plurality of feature points in each frame of target image in the multi-frame target image;
matching a plurality of characteristic points in every two adjacent frames of target images in the multi-frame target image, and determining a plurality of characteristic point pairs matched in every two adjacent frames of target images in the multi-frame target image, wherein the characteristic point pairs comprise two mutually matched characteristic points;
obtaining direction vectors of a plurality of feature point pairs matched in every two adjacent frames of target images in the multi-frame target images, and determining target relative pose between every two adjacent frames of target images in the multi-frame target images based on the obtained direction vectors;
according to the target relative pose between every two adjacent frames of target images in the multi-frame target images, coordinate information of each feature point in every two adjacent frames of target images in the multi-frame target images in the target place is obtained;
and constructing a map of the target place based on the obtained coordinate information of each characteristic point in every two adjacent frames of target images in the multiple frames of target images and the multiple frames of original images.
2. The robot of claim 1, wherein the performing distortion correction processing on each of the plurality of frames of original images to obtain a plurality of frames of target images after distortion correction comprises:
for the multi-frame original image, mapping each pixel point in each frame of original image in the multi-frame original image onto a target sphere to obtain a first mapping coordinate of each pixel point in each frame of original image in the multi-frame original image on the target sphere, wherein the target sphere is determined according to camera calibration internal parameters of the large-field-angle camera;
re-mapping the first mapping coordinates of each pixel point in each frame of original image in the multi-frame original image on the surface of the target sphere onto the surface of a target virtual cube to obtain the second mapping coordinates of each pixel point in each frame of original image in the multi-frame original image on the target virtual cube;
and obtaining a distortion corrected target image corresponding to each frame of original image in the multi-frame original image according to the second mapping coordinates of each pixel point in each frame of original image in the multi-frame original image on the surface of the target virtual cube.
3. The robot of claim 2, wherein the acquiring the direction vector of the matched pairs of feature points in each two adjacent frames of the multi-frame target image comprises:
for each feature point pair in every two adjacent frames of target images in the multi-frame target image, obtaining first mapping coordinates of two feature points matched with each other in the feature point pairs on a target sphere;
and taking the first mapping coordinates corresponding to each characteristic point in the characteristic point pair as the direction vector of each characteristic point in the characteristic point pair.
4. The robot of claim 3, wherein the determining the target relative pose between every two adjacent frames of target images in the multi-frame target image based on the obtained direction vectors comprises:
according to the shooting time of each frame of original image in the multi-frame original image, determining two earliest frames of original images from the multi-frame original images, wherein the earliest two frames of original images comprise a first original image and a second original image, and the shooting time of the first original image is earlier than the shooting time of the second original image;
obtaining a target relative pose between two adjacent frame target images corresponding to the first original image and the second original image based on the direction vectors of a plurality of feature point pairs matched in the two adjacent frame target images corresponding to the first original image and the second original image;
Acquiring a plurality of frames of residual target images except for the target image corresponding to the first original image in the plurality of frames of target images;
and obtaining the target relative pose between every two adjacent frames of residual target images in the multi-frame residual target image based on the direction vectors and the coordinate information of the matched multiple feature point pairs in every two adjacent frames of residual target images in the multi-frame residual target image.
5. The robot of claim 4, wherein the target relative pose comprises a target relative rotation matrix and a target relative translation matrix;
the obtaining the target relative pose between the two adjacent frame target images corresponding to the first original image and the second original image based on the direction vectors of the plurality of feature point pairs matched in the two adjacent frame target images corresponding to the first original image and the second original image includes:
for each first characteristic point pair of a plurality of characteristic point pairs matched in two adjacent frame target images corresponding to the first original image and the second original image, acquiring a direction vector of the first characteristic point pair and a reference relative rotation matrix, wherein the reference relative rotation matrix is determined according to a projection plane of the target virtual cube corresponding to the first characteristic point pair;
Obtaining a reference relative translation matrix of the first feature point pair based on the direction vector of the first feature point pair and a reference relative rotation matrix;
taking the reference relative rotation matrix as a target relative rotation matrix between two adjacent frame target images corresponding to the first original image and the second original image;
and taking the reference relative translation matrix as a target relative translation matrix between two adjacent frame target images corresponding to the first original image and the second original image.
6. The robot of claim 4, wherein the obtaining the target relative pose between each two adjacent frames of the multi-frame remaining target image based on the direction vectors and the coordinate information of the matched pairs of the plurality of feature points in each two adjacent frames of the multi-frame remaining target image comprises:
for each two adjacent frames of residual target images in the multi-frame residual target images, acquiring coordinate information of a first target feature point of a first residual target image with earlier shooting time in the two adjacent frames of residual target images and a direction vector of a second target feature point matched with the first target feature point in a second residual target image with later shooting time;
And calculating the relative pose between the first residual target image and the second residual target image based on the coordinate information of the first target feature point and the direction vector of the second target feature point.
7. The robot of claim 6, wherein the processor is further configured to perform the following steps when executing the computer program:
for any feature point of the first residual target image, judging whether a feature point matched with that feature point can be found in a reference target image corresponding to the first residual target image, wherein the shooting moment of an original image corresponding to the reference target image is adjacent to, and earlier than, the shooting moment of the original image corresponding to the first residual target image;
and if such a matched feature point can be found, taking that feature point as the first target feature point.
8. The robot of claim 1, wherein the obtaining the coordinate information of each feature point in each two adjacent frame of the multi-frame target image according to the target relative pose between each two adjacent frames of the multi-frame target image comprises:
And according to the target relative pose between every two adjacent frames of target images in the multi-frame target images, utilizing a triangulation technology to obtain the coordinate information of each characteristic point in every two adjacent frames of target images in the multi-frame target images at the target site.
9. A map generation method, the method comprising:
obtaining a plurality of frames of original images obtained by shooting a target place through a large-field-angle camera, and carrying out distortion correction processing on each frame of original image in the plurality of frames of original images to obtain a plurality of frames of target images after distortion correction;
extracting feature points from each frame of target image in the multi-frame target image to obtain a plurality of feature points in each frame of target image in the multi-frame target image;
matching a plurality of characteristic points in every two adjacent frames of target images in the multi-frame target image, and determining a plurality of characteristic point pairs matched in every two adjacent frames of target images in the multi-frame target image, wherein the characteristic point pairs comprise two mutually matched characteristic points;
obtaining direction vectors of a plurality of feature point pairs matched in every two adjacent frames of target images in the multi-frame target images, and determining target relative pose between every two adjacent frames of target images in the multi-frame target images based on the obtained direction vectors;
According to the target relative pose between every two adjacent frames of target images in the multi-frame target images, coordinate information of each feature point in every two adjacent frames of target images in the multi-frame target images in the target place is obtained;
and constructing a map of the target place based on the obtained coordinate information of each characteristic point in every two adjacent frames of target images in the multiple frames of target images and the multiple frames of original images.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of claim 9.
CN202210145368.1A 2022-02-17 2022-02-17 Robot, map generation method, and storage medium Pending CN116664682A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202210145368.1A | 2022-02-17 | 2022-02-17 | Robot, map generation method, and storage medium

Publications (1)

Publication Number | Publication Date
CN116664682A (en) | 2023-08-29

Family

ID=87721137

Country Status (1)

Country | Link
CN (1) | CN116664682A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination