WO2023160445A1 - Simultaneous localization and mapping method and apparatus, electronic device, and readable storage medium - Google Patents


Info

Publication number
WO2023160445A1
Authority
WO
WIPO (PCT)
Prior art keywords
pose information
image
information
frame image
frame
Prior art date
Application number
PCT/CN2023/076247
Other languages
French (fr)
Chinese (zh)
Inventor
向学勤
Original Assignee
维沃移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 维沃移动通信有限公司 filed Critical 维沃移动通信有限公司
Publication of WO2023160445A1 publication Critical patent/WO2023160445A1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/20 Drawing from basic elements, e.g. lines or circles
    • G06T11/206 Drawing of charts or graphs
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20 Instruments for performing navigational calculations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras

Definitions

  • the application belongs to the technical field of communication, and in particular relates to a real-time positioning and map construction method, device, electronic equipment and readable storage medium.
  • electronic devices can perform real-time positioning and map construction by tracking the pose of each image frame of the current scene.
  • the electronic device can process the pose of each image frame of the current scene as soon as the frame is acquired, using either a filtering method or an optimization method, and output the processed pose, so as to track the pose of each image frame.
  • however, on the one hand, the filter-based method cannot correct the pose of an image frame, so the accuracy of the pose tracked by the electronic device is poor; on the other hand, the optimization-based method requires a large amount of calculation, so the electronic device takes a long time to process the pose of a single image frame, which leads to a long pose-tracking delay. As a result, the real-time positioning and map construction performed by the electronic device may be of poor quality.
  • the purpose of the embodiment of the present application is to provide a real-time positioning and map construction method, device, electronic device and readable storage medium, which can solve the problem that the electronic device performs poor real-time positioning and map construction.
  • the embodiment of the present application provides a real-time positioning and map construction method, the method includes: determining the initial pose information of the collected i-th frame image according to the final pose information of the collected i-1th frame image, where i is an integer greater than 1; fusing the initial pose information of the i-th frame image with a first interpolation variable to obtain the final pose information of the i-th frame image, where the first interpolation variable is the last interpolation variable obtained before the i-th frame image is collected, the first interpolation variable is the interpolation variable between the initial pose information of a first image and target pose information, the first image is a key-frame image among the images collected before the i-th frame image, and the target pose information is the pose information obtained by optimizing the initial pose information of the first image; and performing real-time positioning and map construction based on the i-th frame image and the final pose information of the i-th frame image.
  • the embodiment of the present application provides a real-time positioning and map construction device, the device includes a collection module, a determination module, a fusion module and a processing module; the determination module is used to determine the initial pose information of the i-th frame image collected by the collection module according to the final pose information of the i-1th frame image collected by the collection module, where i is an integer greater than 1; the fusion module is used to fuse the initial pose information of the i-th frame image with a first interpolation variable to obtain the final pose information of the i-th frame image, where the first interpolation variable is the last interpolation variable obtained before the collection module collects the i-th frame image, the first interpolation variable is the interpolation variable between the initial pose information of a first image and target pose information, the first image is a key-frame image among the images collected by the collection module before the i-th frame image, and the target pose information is the pose information obtained by optimizing the initial pose information of the first image;
  • the processing module
  • the embodiment of the present application provides an electronic device, the electronic device includes a processor and a memory, the memory stores programs or instructions that can run on the processor, and when the programs or instructions are executed by the processor, the steps of the method described in the first aspect are implemented.
  • an embodiment of the present application provides a readable storage medium, on which a program or an instruction is stored, and when the program or instruction is executed by a processor, the steps of the method described in the first aspect are implemented.
  • the embodiment of the present application provides a chip, the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is used to run programs or instructions, so as to implement the method described in the first aspect.
  • an embodiment of the present application provides a computer program product, the program product is stored in a storage medium, and the program product is executed by at least one processor to implement the method described in the first aspect.
  • the initial pose information of the collected i-th frame image can be determined according to the final pose information of the collected i-1th frame image, where i is an integer greater than 1; the initial pose information of the i-th frame image is fused with the first interpolation variable to obtain the final pose information of the i-th frame image, where the first interpolation variable is the last interpolation variable obtained before the i-th frame image is collected, the first interpolation variable is the interpolation variable between the initial pose information of the first image and the target pose information, the first image is a key-frame image among the images collected before the i-th frame image, and the target pose information is the pose information obtained by optimizing the initial pose information of the first image; and real-time positioning and map construction are performed based on the i-th frame image and the final pose information of the i-th frame image.
  • in this way, the electronic device performs real-time positioning and map construction based on the i-th frame image and on the pose information obtained by fusing the first interpolation variable with the initial pose information of the i-th frame image, which is determined according to the final pose information of the i-1th frame image; the first interpolation variable is the interpolation variable between the pose information, before and after optimization, of a key-frame image collected by the electronic device before the i-th frame image. Therefore, on the one hand, the electronic device can correct the initial pose of the i-th frame image, thereby improving the accuracy of the tracked pose; on the other hand, the electronic device only needs to calculate interpolation variables for key-frame images, which shortens the pose-tracking delay. In this way, it is possible to ensure that the electronic device outputs high-frequency and high-precision pose information, thereby improving the effect of the electronic device's real-time positioning and map construction.
  • Fig. 1 is a flow chart of the real-time positioning and map construction method provided by the embodiment of the present application;
  • Fig. 2 is a schematic diagram of the real-time positioning and map construction device provided by the embodiment of the present application;
  • Fig. 3 is a schematic diagram of an electronic device provided by an embodiment of the present application;
  • Fig. 4 is a schematic diagram of hardware of an electronic device provided by an embodiment of the present application.
  • Real-time positioning and map construction refers to the process in which a moving object calculates its own position and builds an environmental map based on sensor information.
  • real-time positioning and map construction have been applied in the fields of robotics, virtual reality and augmented reality, and its uses include the positioning of the sensor itself, as well as subsequent path planning and scene understanding.
  • the mainstream real-time positioning and map construction methods are generally divided into two types based on filtering and optimization.
  • the filter-based method uses the values of the various states at the previous moment to estimate the states at the next moment, while the optimization-based method regards all states as variables, regards the equations of motion and the observation equations as constraints between the variables, constructs an error function, and minimizes the quadratic form of this error.
  • the filter-based method is represented by the Multi-State Constraint Kalman Filter (MSCKF).
  • the method based on optimization has the characteristics of complete system modules (including real-time positioning module and map building module), high real-time positioning accuracy, and strong robustness of the whole system.
  • however, this method requires a relatively large amount of calculation during operation, so the corresponding algorithm consumes a lot of power and has a low output frequency; it is therefore not suited to mobile terminals, which need a high output rate and low computing power consumption. As a result, the effect of real-time positioning and map construction by the electronic device is poor.
  • the electronic device can determine the initial pose information of the collected i-th frame image according to the final pose information of the collected i-1th frame image, where i is an integer greater than 1; fuse the initial pose information of the i-th frame image with the first interpolation variable to obtain the final pose information of the i-th frame image, where the first interpolation variable is the last interpolation variable obtained before the i-th frame image is collected, the first interpolation variable is the interpolation variable between the initial pose information of the first image and the target pose information, the first image is a key-frame image among the images collected before the i-th frame image, and the target pose information is the pose information obtained by optimizing the initial pose information of the first image; and perform real-time positioning and map construction based on the i-th frame image and the final pose information of the i-th frame image.
  • on the one hand, the electronic device can correct the initial pose of the i-th frame image, so the accuracy of the tracked pose can be improved; on the other hand, the electronic device only needs to calculate interpolation variables for key-frame images, so the pose-tracking delay can be shortened. In this way, it is possible to ensure that the electronic device outputs high-frequency and high-precision pose information, thereby improving the effect of the electronic device's real-time positioning and map construction.
  • FIG. 1 shows a flow chart of the method for real-time positioning and map construction provided by the embodiment of the present application.
  • the real-time positioning and map construction method provided by the embodiment of the present application may include the following steps 101 to 103 .
  • Step 101 The electronic device determines the initial pose information of the collected i-th frame image according to the collected final pose information of the i-1th frame image.
  • the above i is an integer greater than 1.
  • the sensor in the electronic device may collect images of the scene where the electronic device is located in real time.
  • the pose information of the image may indicate the position of the image in the three-dimensional space.
  • the pose information may include rotation coordinates and displacement coordinates.
  • pose information T = {R, P}, where R is the rotation coordinates, including the rotation about the X axis, the rotation about the Y axis and the rotation about the Z axis in three-dimensional space; P is the displacement coordinates, including the coordinates on the X axis, the Y axis and the Z axis in three-dimensional space.
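  • As a minimal sketch of the pose information T = {R, P} described above, the following Python snippet stores R as a 3x3 rotation matrix built from the three axis rotations and P as a 3-vector. The names `Pose` and `rotation_from_axis_angles`, the matrix storage, and the Rz·Ry·Rx composition order are assumptions made for illustration; the text does not specify them.

```python
import math
from dataclasses import dataclass
from typing import List

Matrix3 = List[List[float]]
Vector3 = List[float]


@dataclass
class Pose:
    """Pose information T = {R, P}."""
    R: Matrix3  # rotation, stored as a 3x3 matrix (an assumption of this sketch)
    P: Vector3  # displacement coordinates on the X, Y and Z axes


def mat_mul(a: Matrix3, b: Matrix3) -> Matrix3:
    """Multiply two 3x3 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def rotation_from_axis_angles(rx: float, ry: float, rz: float) -> Matrix3:
    """Build R from rotations (radians) about X, Y and Z.

    The composition order Rz * Ry * Rx is an assumption, not taken from the text.
    """
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    rot_x = [[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]]
    rot_y = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    rot_z = [[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]]
    return mat_mul(rot_z, mat_mul(rot_y, rot_x))


# A pose rotated a quarter turn about Z and displaced to (1, 2, 3).
pose = Pose(R=rotation_from_axis_angles(0.0, 0.0, math.pi / 2), P=[1.0, 2.0, 3.0])
```

  • Storing R as a matrix keeps later operations such as transposition and multiplication straightforward; a quaternion would serve equally well.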
  • step 101 can be specifically implemented through the following step 101a, and step A or step B.
  • Step 101a the electronic device uses a filtering algorithm to process the final pose information of the i-1th frame image to obtain the first pose information.
  • the principle of the filtering algorithm may be: x = f · x_{i-1} + n
  • x is the first pose information
  • x_{i-1} is the final pose information of the i-1th frame image
  • f is the transfer matrix
  • n is the noise term.
  • the electronic device can calculate the first pose information according to the final pose information of the i-1 frame image.
  • after the electronic device obtains the first pose information, it can judge how well the first pose information matches the final pose information of the i-1th frame image, and decide according to the result whether to perform step A or step B.
  • Step A When the matching degree of the first pose information and the final pose information of the i-1th frame image is less than or equal to the preset matching degree, the electronic device determines the final pose information of the i-1th frame image as the initial pose information of the i-th frame image.
  • the preset matching degree may be set by default by the system or set by the user according to actual usage requirements.
  • the matching degree between the first pose information and the final pose information of the i-1th frame image being less than or equal to the preset matching degree means that the difference between the first pose information and the final pose information of the i-1th frame image is too large.
  • if the electronic device fails to localize while collecting images of the current scene, for example because it collects images of a white wall for a long time, errors accumulate when the electronic device performs calculations with the filtering algorithm and the calculated pose information becomes abnormal; that is, the matching degree between this pose information (the first pose information) and the final pose information of the previous frame image (the i-1th frame image) is less than or equal to the preset matching degree.
  • since the electronic device may save the historical information of the entire positioning process, when the matching degree of the first pose information and the final pose information of the i-1th frame image is less than or equal to the preset matching degree, it can determine the final pose information of the i-1th frame image as the initial pose information of the i-th frame image, thereby reducing errors and ensuring the accuracy of the initial pose information of the i-th frame image.
  • Step B When the matching degree of the first pose information and the final pose information of the i-1th frame image is greater than the preset matching degree, the electronic device determines the first pose information as the initial pose information of the i-th frame image.
  • in this case, the initial pose information of the i-th frame image is the pose information obtained by the electronic device processing the final pose information of the i-1th frame image with the filtering algorithm.
  • the electronic device can compare the final pose information of the i-1th frame image with the pose information obtained by processing it with the filtering algorithm (that is, the first pose information), and determine either the final pose information of the i-1th frame image or the first pose information as the initial pose information of the i-th frame image, so the accuracy of the initial pose information of the i-th frame image can be ensured.
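  • The selection between step A and step B can be sketched as follows. This is an illustration under stated assumptions: the pose is flattened to a plain vector, the noise term of the filter is omitted, and `matching_degree` is a hypothetical metric (1 / (1 + Euclidean distance)), since the text does not define how the matching degree is computed.

```python
import math
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def predict_pose(f: Matrix, x_prev: Sequence[float]) -> List[float]:
    """Step 101a: apply the transfer matrix f to the previous final pose.

    The noise term is omitted in this sketch.
    """
    return [sum(row[c] * x_prev[c] for c in range(len(x_prev))) for row in f]


def matching_degree(a: Sequence[float], b: Sequence[float]) -> float:
    """Hypothetical metric in (0, 1]: identical poses score 1.0."""
    return 1.0 / (1.0 + math.dist(a, b))


def initial_pose(f: Matrix, x_prev: Sequence[float],
                 preset_matching_degree: float) -> List[float]:
    """Return the initial pose of frame i from the final pose of frame i-1."""
    first_pose = predict_pose(f, x_prev)
    if matching_degree(first_pose, x_prev) <= preset_matching_degree:
        # Step A: the prediction drifted too far, reuse the previous final pose.
        return list(x_prev)
    # Step B: the prediction is plausible, use it.
    return first_pose
```

  • With this metric, a threshold closer to 1.0 makes the fallback of step A trigger more readily.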
  • Step 102 the electronic device fuses the initial pose information of the i-th frame image with the first interpolation variable to obtain the final pose information of the i-th frame image.
  • the first interpolation variable is the last interpolation variable obtained before the i-th frame of image is collected.
  • the electronic device can obtain an interpolation variable by performing a calculation on the pose information of a key-frame image; while that calculation is in progress, the electronic device does not start a calculation on the pose information of any newly collected key-frame image. If the electronic device collects another key-frame image after the calculation has ended, it can obtain a new interpolation variable by performing a calculation on the pose information of that image.
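  • The scheduling rule described above (one calculation at a time, with key frames that arrive during a calculation left out) could be sketched as below. The class and method names are hypothetical, and the optimization itself is not modelled; only the bookkeeping is shown.

```python
from typing import Any, Optional


class InterpolationScheduler:
    """Tracks whether an interpolation-variable calculation is in progress."""

    def __init__(self) -> None:
        self.busy: bool = False
        self.current_key_frame: Optional[int] = None
        self.latest_variable: Optional[Any] = None  # the "first interpolation variable"

    def on_key_frame(self, frame_id: int) -> bool:
        """Return True if a calculation is started for this key frame."""
        if self.busy:
            # A calculation is already running: this key frame is not processed.
            return False
        self.busy = True
        self.current_key_frame = frame_id
        return True

    def on_calculation_done(self, interpolation_variable: Any) -> None:
        """Store the newest interpolation variable and accept key frames again."""
        self.latest_variable = interpolation_variable
        self.busy = False
        self.current_key_frame = None
```

  • Every ordinary frame then reads `latest_variable`, so tracking never waits for an optimization to finish.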
  • the first interpolation variable is an interpolation variable between initial pose information and target pose information of the first image.
  • the target pose information may be pose information optimized from the initial pose information of the first image.
  • the first image is an image of a key frame among the images collected before the image of the i-th frame.
  • a key-frame image collected by the electronic device may be, for example, a frame of image collected by the electronic device after a long interval, or an image that includes elements not present in the images previously collected by the electronic device.
  • the electronic device may fuse the initial pose information of the i-th frame image with the first interpolation variable through coordinate calculation, and obtain the final pose information of the i-th frame image.
  • the electronic device may fuse the initial pose information of the i-th frame image with the first interpolation variable by using a Kalman filter method to obtain the final pose information of the i-th frame image.
  • for a key-frame image (denoted the a-th frame image), the electronic device can calculate the interpolation variable between the initial pose information of the a-th frame image and the pose information obtained by optimizing that initial pose information (that is, the target pose information). If this is the interpolation variable calculated most recently before the i-th frame image is collected (that is, the first interpolation variable), the electronic device can fuse the initial pose information of the i-th frame image with it to obtain the final pose information of the i-th frame image. In this way, the accuracy of the final pose information of the obtained i-th frame image can be ensured.
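  • One possible reading of fusing "through coordinate calculation" is to apply the interpolation variable as a rigid correction to the initial pose. The sketch below assumes rotations are 3x3 matrices and that the correction acts as R_final = ΔR · R_init and P_final = ΔR · P_init + ΔP; the function name `fuse_pose` and this convention are assumptions, and the Kalman-filter variant mentioned in the text is not shown.

```python
from typing import List, Tuple

Matrix3 = List[List[float]]
Vector3 = List[float]


def mat_mul(a: Matrix3, b: Matrix3) -> Matrix3:
    """Multiply two 3x3 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def mat_vec(m: Matrix3, v: Vector3) -> Vector3:
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[r][k] * v[k] for k in range(3)) for r in range(3)]


def fuse_pose(r_init: Matrix3, p_init: Vector3,
              delta_r: Matrix3, delta_p: Vector3) -> Tuple[Matrix3, Vector3]:
    """Apply the interpolation variable (delta_r, delta_p) to an initial pose."""
    r_final = mat_mul(delta_r, r_init)
    rotated = mat_vec(delta_r, p_init)
    p_final = [rotated[k] + delta_p[k] for k in range(3)]
    return r_final, p_final


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# A pure translation correction shifts the position and leaves the rotation alone.
r_final, p_final = fuse_pose(IDENTITY, [1.0, 2.0, 3.0], IDENTITY, [0.5, 0.0, -0.5])
```

  • The fusion costs one matrix product and one matrix-vector product per frame, which is why it can run at the frame rate while the optimization runs only on key frames.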
  • Step 103 the electronic device performs real-time positioning and map construction based on the final pose information of the i-th frame of image and the i-th frame of image.
  • the electronic device may perform real-time positioning and map construction based on the final pose information of the i-th frame image and the i-th frame image, so as to construct a three-dimensional map corresponding to the i-th frame image.
  • the electronic device may superimpose the three-dimensional map constructed for the i-th frame image on the three-dimensional maps constructed for the frame images before the i-th frame image, thereby performing real-time positioning and map construction of the current scene.
  • the electronic device can directly perform real-time positioning and map construction based on the i-th frame image and the initial pose information of the i-th frame image; or, the electronic device can optimize the initial pose information of the i-th frame image, determine an interpolation variable, and fuse the initial pose information of the i-th frame image with the interpolation variable to obtain the final pose information of the i-th frame image, so that the electronic device can perform real-time positioning and map construction based on that final pose information.
  • the electronic device performs real-time positioning and map construction based on the i-th frame image and on the pose information obtained by fusing the first interpolation variable with the initial pose information of the i-th frame image, which is determined according to the final pose information of the i-1th frame image; the first interpolation variable is the interpolation variable between the pose information, before and after optimization, of a key-frame image collected by the electronic device before the i-th frame image. Therefore, on the one hand, the electronic device can correct the initial pose of the i-th frame image, thereby improving the accuracy of the tracked pose; on the other hand, the electronic device only needs to calculate interpolation variables for key-frame images, which can reduce the pose-tracking delay. In this way, it is possible to ensure that the electronic device outputs high-frequency and high-precision pose information, thereby improving the effect of the electronic device's real-time positioning and map construction.
  • the real-time positioning and map construction method provided in the embodiment of the present application may further include the following steps 104 and 105.
  • Step 104 The electronic device optimizes the initial pose information of the first image based on the M pieces of pose information, the M sets of offset information, and the initial pose information of the first image to obtain target pose information.
  • the above M pieces of pose information are the most recently optimized pose information of M frames of images, the M frames of images are the key-frame images among the images collected before the first image, and each set of offset information in the above M sets of offset information is the offsets of the feature points of the first image relative to the feature points of one frame of the above M frames of images.
  • the feature point of the image may be any possible point such as a vertex, a corner point, or a center point in the image.
  • the number of feature points in the image may be one or multiple, and specifically may be determined by the electronic device according to the collected image.
  • the number of feature points of different images may be the same or different.
  • the number of corresponding feature points between the first image and one frame of the above M frames of images may be N, where N is an integer greater than or equal to 0; in this case, the set of offset information for that frame includes N offsets.
  • step 104 may be specifically implemented through the following steps 104a and 104b.
  • Step 104a the electronic device determines M sets of three-dimensional position information according to the above M sets of offset information.
  • the above M sets of offset information are in one-to-one correspondence with the above M sets of three-dimensional position information, and each set of three-dimensional position information can be used to indicate the feature points in the three-dimensional map constructed based on one frame of the above M frames of images .
  • the set of three-dimensional position information determined by the electronic device according to the set of offset information may indicate the N feature points in the constructed three-dimensional map.
  • for one frame of the above M frames of images, the electronic device may, according to the offsets in the corresponding set of offset information, determine a set of three-dimensional position information that indicates the feature points in the three-dimensional map constructed based on that frame image.
  • Step 104b The electronic device uses a preset bundle adjustment algorithm to process the above M pieces of pose information, the initial pose information of the first image, and the above M sets of three-dimensional position information to obtain target pose information.
  • the principle of the preset bundle adjustment algorithm involves the following quantities: T is the initial pose information of the image, P is the three-dimensional position information, Z is the two-dimensional observation, M is the number of key-frame images before the first image, and N is the number of feature points in the three-dimensional map indicated by the M sets of three-dimensional position information; a projection equation relates T and P to Z.
  • the electronic device can process the above M pieces of pose information, the initial pose information of the first image, and the above M sets of three-dimensional position information, so as to obtain target pose information.
  • when the electronic device uses the preset bundle adjustment algorithm, it can optimize the above M pieces of pose information at the same time as it obtains the target pose information, so that the next time the electronic device uses the preset bundle adjustment algorithm to obtain the target pose information of a new first image, it can start from the optimized M pieces of pose information, thereby improving the accuracy of the real-time positioning of the electronic device.
  • the electronic device can determine, according to the M sets of offset information, the corresponding M sets of three-dimensional position information, which indicate the feature points in the constructed three-dimensional map, and use the preset bundle adjustment algorithm to calculate the pose information obtained by optimizing the initial pose information of the first image. This improves the accuracy with which the electronic device optimizes the initial pose information of the first image, and therefore the accuracy of map construction when the electronic device performs real-time positioning and map construction.
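  • The formula for the adjustment algorithm of step 104b is not reproduced in this text. A standard reprojection-error objective that is consistent with the quantities named there (T, P, Z, the projection equation, M and N) can be sketched as follows. The symbol π for the projection equation, the indices j and k, and the convention that j = 0 denotes the first image while j = 1, …, M denote the M key-frame images are assumptions made for illustration, not the patent's own notation.

```latex
\{T_j^{*}\},\ \{P_k^{*}\}
  \;=\; \arg\min_{\{T_j\},\,\{P_k\}}
  \sum_{j=0}^{M} \sum_{k=1}^{N}
  \bigl\lVert Z_{jk} - \pi\!\left(T_j, P_k\right) \bigr\rVert^{2}
```

  • Under this reading, T_0^{*} is the target pose information of the first image, and T_1^{*}, …, T_M^{*} are the re-optimized key-frame poses that the next optimization starts from. Terms for feature points that are not observed in a given image would be dropped from the sum.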
  • Step 105 the electronic device determines a first interpolation variable according to the first rotation coordinate and the first displacement coordinate in the initial pose information of the first image, and the second rotation coordinate and second displacement coordinate in the target pose information.
  • step 105 may be specifically implemented through the following steps 105a to 105c.
  • Step 105a the electronic device determines the target rotation coordinates according to the first rotation coordinates and the second rotation coordinates.
  • the electronic device may multiply the transpose of the first rotation coordinates and the second rotation coordinates to obtain the target rotation coordinates.
  • Step 105b the electronic device determines the target displacement coordinates according to the target rotation coordinates, the first displacement coordinates, and the second displacement coordinates.
  • the electronic device may multiply the target rotation coordinates and the first displacement coordinates to obtain intermediate displacement coordinates, and subtract the intermediate displacement coordinates from the second displacement coordinates to obtain the target displacement coordinates.
  • Step 105c the electronic device determines the target rotation coordinates and the target displacement coordinates as the first interpolation variable.
  • the electronic device can determine the target rotation coordinates according to the first rotation coordinates and the second rotation coordinates, and determine the target displacement coordinates according to the target rotation coordinates, the first displacement coordinates, and the second displacement coordinates. In this way, the electronic device can determine the target rotation coordinates and the target displacement coordinates as the first interpolation variable.
  • the electronic device can determine the target rotation coordinates according to the first rotation coordinates in the initial pose information of the first image and the second rotation coordinates in the target pose information of the first image, and according to the target rotation coordinates, the first displacement coordinates in the initial pose information of the first image, and the second displacement coordinates in the target pose information of the first image to determine the target displacement coordinates. Therefore, the electronic device can determine the target rotation coordinates and the target displacement coordinates as the first interpolation variables, so as to facilitate further fusion processing.
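  • Steps 105a to 105c can be sketched as follows, assuming that the rotation coordinates are 3x3 rotation matrices. The text does not state the order of the multiplication in step 105a; this sketch uses ΔR = R2 · R1ᵀ, chosen because applying (ΔR, ΔP) to the initial pose then reproduces the target pose exactly. The function names are hypothetical.

```python
from typing import List, Tuple

Matrix3 = List[List[float]]
Vector3 = List[float]


def transpose(m: Matrix3) -> Matrix3:
    return [[m[c][r] for c in range(3)] for r in range(3)]


def mat_mul(a: Matrix3, b: Matrix3) -> Matrix3:
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def mat_vec(m: Matrix3, v: Vector3) -> Vector3:
    return [sum(m[r][k] * v[k] for k in range(3)) for r in range(3)]


def interpolation_variable(r1: Matrix3, p1: Vector3,
                           r2: Matrix3, p2: Vector3) -> Tuple[Matrix3, Vector3]:
    """(r1, p1): initial pose of the key frame; (r2, p2): its target pose."""
    # Step 105a: target rotation from the transposed first rotation and the
    # second rotation (multiplication order is an assumption of this sketch).
    delta_r = mat_mul(r2, transpose(r1))
    # Step 105b: intermediate displacement, then subtraction from p2.
    intermediate = mat_vec(delta_r, p1)
    delta_p = [p2[k] - intermediate[k] for k in range(3)]
    # Step 105c: the pair is the interpolation variable.
    return delta_r, delta_p


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ROT_Z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
delta_r, delta_p = interpolation_variable(IDENTITY, [1.0, 0.0, 0.0],
                                          ROT_Z_90, [0.0, 2.0, 0.0])
```

  • In this example the optimization turned the key frame a quarter turn about Z and moved it to (0, 2, 0), so the correction is that rotation plus a displacement of (0, 1, 0).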
  • the electronic device can optimize the initial pose information of the first image based on the most recently optimized pose information of the M key-frame images collected before the first image, the M sets of offset information, and the initial pose information of the first image, and then determine the first interpolation variable. That is, the electronic device determines the first interpolation variable from the coordinates in the pose information of the first image before and after optimization, so the accuracy with which the electronic device determines the first interpolation variable can be improved.
  • the real-time positioning and map construction method provided in the embodiment of the present application may further include the following step 106.
  • Step 106 The electronic device determines the M sets of offset information according to the two-dimensional position information of the feature points of the first image and the two-dimensional position information of the feature points of the M frames of images.
  • the two-dimensional position information of the feature points of the first image may be determined by an electronic device through a filtering method.
  • the two-dimensional position information may be used to indicate the position of the feature point of the first image in the first image.
  • the principle of determining the two-dimensional position information of the feature points of the first image by the electronic device is as follows:
  • z_k is the two-dimensional position information of the feature points of the first image (i.e., the two-dimensional observation)
  • x_k is the initial pose information of the first image
  • r_k is the noise term
  • h is the observation matrix.
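  • The symbols defined above match the usual observation model of a filtering method. The equation itself did not survive in this text, so the form below is an assumption: a linear model in which h acts as a matrix on the state (nonlinear filters write h(x_k) instead).

```latex
z_k = h\,x_k + r_k
```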
  • the electronic device may determine the two-dimensional position information of the feature points of the first image through a filtering method according to the initial pose information of the first image.
  • each set of offset information in the M sets of offset information may be an image coordinate difference between a feature point of the first image and a feature point of one frame of the above-mentioned M frames of images.
  • each set of offset information includes the offsets of all feature points that correspond between the first image and one frame of the above-mentioned M frames of images.
  • the set of offset information may include the image coordinate difference between feature point 1 and feature point 1', and the image coordinate difference between feature point 3 and feature point 3'.
  • Because the electronic device can determine the M sets of offset information from the two-dimensional position information of the feature points of the first image and of the M frames of images, it can obtain the offset information between the first image and each key-frame image collected before the first image. The electronic device can then obtain accurate target pose information based on the offset information, which can improve the precision of its real-time positioning and map construction.
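  • Step 106 can be illustrated with a short sketch. It assumes that feature points are already matched by a shared integer id and that an offset is the first image's coordinates minus the key frame's coordinates. Both are assumptions, and the function names are hypothetical.

```python
def offset_set(first_image_points, keyframe_points):
    """Image-coordinate differences for feature ids present in both images.

    Each argument maps a feature id to its (x, y) position in that image.
    """
    common = sorted(first_image_points.keys() & keyframe_points.keys())
    return {
        fid: (
            first_image_points[fid][0] - keyframe_points[fid][0],
            first_image_points[fid][1] - keyframe_points[fid][1],
        )
        for fid in common
    }

def all_offset_sets(first_image_points, keyframes):
    """One set of offsets per key frame, i.e. M sets for M key frames."""
    return [offset_set(first_image_points, kf) for kf in keyframes]

# Feature points 1 and 3 appear in both images; 2 and 4 have no counterpart.
first = {1: (120.0, 80.0), 2: (200.0, 150.0), 3: (64.0, 32.0)}
keyframe = {1: (118.0, 83.0), 3: (60.0, 30.0), 4: (10.0, 10.0)}
offsets = offset_set(first, keyframe)
```

  • As in the example in the text, only the pairs (1, 1') and (3, 3') contribute to the set of offset information.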
  • the real-time positioning and map building method provided in the embodiment of the present application may be executed by a real-time positioning and map building device.
  • In the embodiments of the present application, the real-time positioning and map construction device is described by taking the case in which that device performs the real-time positioning and map construction method as an example.
  • the embodiment of the present application provides a real-time positioning and map construction device 20, which may include: an acquisition module 21, a determination module 22, a fusion module 23 and a processing module 24.
  • the determination module 22 can be used to determine the initial pose information of the i-th frame image collected by the acquisition module 21 according to the final pose information of the i-1th frame image collected by the acquisition module 21, where i is an integer greater than 1.
  • the fusion module 23 can be used to fuse the initial pose information of the i-th frame image with the first interpolation variable to obtain the final pose information of the i-th frame image.
  • The first interpolation variable is the last interpolation variable obtained before the acquisition module 21 collects the i-th frame image, and it is the interpolation variable between the initial pose information of the first image and the target pose information. The first image is a key-frame image among the images collected by the acquisition module 21 before the i-th frame image, and the target pose information is the pose information obtained by optimizing the initial pose information of the first image.
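  • The fusion step performed by the fusion module can be sketched as applying the stored correction to the initial pose of the current frame. This is a minimal sketch under assumptions: a pose is a 3x3 rotation matrix plus a 3-vector, the interpolation variable is a pair (d_rot, d_trans), and the correction is applied by left multiplication. The text does not fix this convention, and all names are hypothetical.

```python
import math

def mat_mul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(a, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def rot_z(theta):
    """Rotation by theta radians about the z axis (used only to build examples)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def fuse(r_init, t_init, d_rot, d_trans):
    """Final pose of frame i from its initial pose and the last correction."""
    r_final = mat_mul(d_rot, r_init)
    t_final = [a + b for a, b in zip(mat_vec(d_rot, t_init), d_trans)]
    return r_final, t_final

# Example: a quarter-turn correction about z plus a lift of one unit along z.
r_final, t_final = fuse(IDENTITY, [1.0, 0.0, 0.0],
                        rot_z(math.pi / 2), [0.0, 0.0, 1.0])
```

  • The per-frame cost here is one matrix product and one matrix-vector product, which is why applying a stored correction is much cheaper than re-optimizing every frame.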
  • the processing module 24 can be used for real-time positioning and map construction based on the final pose information of the i-th frame image and the i-th frame image.
  • the device 20 for real-time positioning and map construction may further include an optimization module.
  • The optimization module can be used to, before the determination module 22 determines the initial pose information of the i-th frame image collected by the acquisition module 21 according to the final pose information of the (i-1)-th frame image collected by the acquisition module 21, optimize the initial pose information of the first image based on M pieces of pose information, M sets of offset information, and the initial pose information of the first image, to obtain the target pose information. The M pieces of pose information are the most recently optimized pose information of M frames of images, the M frames of images are the key-frame images among the images collected by the acquisition module 21 before the first image, and each set of offset information is the offset of the feature points of the first image relative to the feature points of one frame of the M frames of images.
  • The determination module 22 can also be used to determine the first interpolation variable according to the first rotation coordinates and the first displacement coordinates in the initial pose information of the first image, and the second rotation coordinates and the second displacement coordinates in the target pose information.
  • The determination module 22 can specifically be used to determine M sets of three-dimensional position information according to the M sets of offset information, where the M sets of offset information correspond to the M sets of three-dimensional position information, and each set of three-dimensional position information indicates the feature points in the three-dimensional map constructed based on one frame of the M frames of images.
  • The optimization module can specifically be used to process the M pieces of pose information, the initial pose information of the first image, and the M sets of three-dimensional position information by using a preset bundle adjustment algorithm to obtain the target pose information.
  • The determination module 22 can also be used to, before the optimization module optimizes the initial pose information of the first image based on the M pieces of pose information, the M sets of offset information, and the initial pose information of the first image to obtain the target pose information, determine the M sets of offset information according to the two-dimensional position information of the feature points of the first image and the two-dimensional position information of the feature points of the M frames of images.
  • the determining module 22 may specifically be configured to determine the target rotation coordinates according to the first rotation coordinates and the second rotation coordinates.
  • the determination module 22 may be specifically configured to determine the target displacement coordinates according to the target rotation coordinates, the first displacement coordinates, and the second displacement coordinates.
  • the determining module 22 may be specifically configured to determine the target rotation coordinates and the target displacement coordinates as the first interpolation variables.
  • the processing module 24 may be specifically configured to use a filtering algorithm to process the final pose information of the i-1th frame image to obtain the first pose information.
  • The determination module 22 can specifically be used to determine the final pose information of the (i-1)-th frame image as the initial pose information of the i-th frame image.
  • The determination module 22 can specifically be used to determine the first pose information as the initial pose information of the i-th frame image when the matching degree between the first pose information and the final pose information of the (i-1)-th frame image is greater than the preset matching degree.
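  • The choice between the two candidates for the initial pose can be sketched as follows. The text does not define how the matching degree is computed, so `matching_degree` below is a hypothetical stand-in based on Euclidean distance between flat pose tuples. Falling back to the previous final pose when the threshold is not exceeded is likewise an assumption, since the condition for that branch is not stated in this text.

```python
import math

def matching_degree(pose_a, pose_b):
    """Hypothetical similarity score in (0, 1]; 1.0 means identical poses."""
    return 1.0 / (1.0 + math.dist(pose_a, pose_b))

def choose_initial_pose(prev_final, first_pose, preset_matching_degree):
    """Pick the initial pose of frame i.

    prev_final: final pose of frame i-1.
    first_pose: result of running the filtering algorithm on prev_final.
    """
    if matching_degree(first_pose, prev_final) > preset_matching_degree:
        return first_pose
    # Assumed fallback: reuse the previous final pose unchanged.
    return prev_final

# A small filter update is accepted; a large jump is rejected.
accepted = choose_initial_pose((0.0, 0.0, 0.0), (0.1, 0.0, 0.0), 0.8)
rejected = choose_initial_pose((0.0, 0.0, 0.0), (5.0, 0.0, 0.0), 0.8)
```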
  • In the real-time positioning and map construction device provided in the embodiment of the present application, the device can perform real-time positioning and map construction based on the i-th frame image and on the pose information obtained by fusing the first interpolation variable with the initial pose information of the i-th frame image, which is determined according to the final pose information of the (i-1)-th frame image. The first interpolation variable is the interpolation variable between the pose information, before and after optimization, of a key-frame image among the images collected by the device before the i-th frame image. Therefore, on the one hand, the device can correct the initial pose of the i-th frame image, which can improve the accuracy of pose tracking; on the other hand, the device only needs to calculate the interpolation variable for key-frame images, which can shorten the pose-tracking delay. In this way, the device can output high-frequency, high-precision pose information.
  • the device for real-time positioning and map construction in the embodiment of the present application may be an electronic device, or a component in the electronic device, such as an integrated circuit or a chip.
  • the electronic device may be a terminal, or other devices other than the terminal.
  • the electronic device can be a mobile phone, a tablet computer, a notebook computer, a handheld computer, a vehicle electronic device, a mobile Internet device (Mobile Internet Device, MID), an augmented reality (augmented reality, AR)/virtual reality (virtual reality, VR) device, a robot, a wearable device, an ultra-mobile personal computer (ultra-mobile personal computer, UMPC), a netbook or a personal digital assistant (personal digital assistant, PDA), etc.
  • the device for real-time positioning and map construction in the embodiment of the present application may be a device with an operating system.
  • the operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in this embodiment of the present application.
  • the real-time positioning and map construction device provided by the embodiment of the present application can realize various processes realized by the method embodiment in FIG. 1 , and details are not repeated here to avoid repetition.
  • the embodiment of the present application also provides an electronic device 300, including a processor 301 and a memory 302.
  • the memory 302 stores programs or instructions that can run on the processor 301.
  • When the programs or instructions are executed by the processor 301, the steps of the above real-time positioning and map construction method embodiment are implemented, and the same technical effect can be achieved. To avoid repetition, details are not repeated here.
  • the electronic devices in the embodiments of the present application include the above-mentioned mobile electronic devices and non-mobile electronic devices.
  • FIG. 4 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.
  • the electronic device 1000 includes, but is not limited to, components such as a radio frequency unit 1001, a network module 1002, an audio output unit 1003, an input unit 1004, a sensor 1005, a display unit 1006, a user input unit 1007, an interface unit 1008, a memory 1009, and a processor 1010.
  • the electronic device 1000 can also include a power supply (such as a battery) for supplying power to the various components. The power supply can be logically connected to the processor 1010 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system.
  • the structure of the electronic device shown in FIG. 4 does not constitute a limitation on the electronic device; the electronic device may include more or fewer components than shown in the figure, combine some components, or arrange the components differently, which will not be repeated here.
  • the processor 1010 can be configured to determine the initial pose information of the i-th frame image collected by the sensor 1005 according to the final pose information of the i-1th frame image collected by the sensor 1005, where i is an integer greater than 1.
  • the processor 1010 can also be used to fuse the initial pose information of the i-th frame image with the first interpolation variable to obtain the final pose information of the i-th frame image. The first interpolation variable is the last interpolation variable obtained before the acquisition module collects the i-th frame image, and it is the interpolation variable between the initial pose information of the first image and the target pose information. The first image is a key-frame image among the images collected by the acquisition module before the i-th frame image, and the target pose information is the pose information obtained by optimizing the initial pose information of the first image.
  • the processor 1010 can also be used for real-time positioning and map construction based on the final pose information of the i-th frame image and the i-th frame image.
  • The processor 1010 may also be configured to, before determining the initial pose information of the i-th frame image collected by the sensor 1005 according to the final pose information of the (i-1)-th frame image collected by the sensor 1005, optimize the initial pose information of the first image based on M pieces of pose information, M sets of offset information, and the initial pose information of the first image, to obtain the target pose information. The M pieces of pose information are the most recently optimized pose information of M frames of images, the M frames of images are the key-frame images among the images collected by the sensor 1005 before the first image, and each set of offset information is the offset of the feature points of the first image relative to the feature points of one frame of the M frames of images.
  • The processor 1010 may also be configured to determine the first interpolation variable according to the first rotation coordinates and the first displacement coordinates in the initial pose information of the first image, and the second rotation coordinates and the second displacement coordinates in the target pose information.
  • The processor 1010 may be specifically configured to determine M sets of three-dimensional position information according to the M sets of offset information, where the M sets of offset information correspond to the M sets of three-dimensional position information, and each set of three-dimensional position information indicates the feature points in the three-dimensional map constructed based on one frame of the M frames of images.
  • The processor 1010 may be specifically configured to use a preset bundle adjustment algorithm to process the M pieces of pose information, the initial pose information of the first image, and the M sets of three-dimensional position information to obtain the target pose information.
  • The processor 1010 may also be configured to, before optimizing the initial pose information of the first image based on the M pieces of pose information, the M sets of offset information, and the initial pose information of the first image to obtain the target pose information, determine the M sets of offset information according to the two-dimensional position information of the feature points of the first image and the two-dimensional position information of the feature points of the M frames of images.
  • the processor 1010 may specifically be configured to determine the target rotation coordinates according to the first rotation coordinates and the second rotation coordinates.
  • the processor 1010 may specifically be configured to determine the target displacement coordinates according to the target rotation coordinates, the first displacement coordinates, and the second displacement coordinates.
  • the processor 1010 may be specifically configured to determine the target rotation coordinates and the target displacement coordinates as the first interpolation variables.
  • the processor 1010 may be specifically configured to use a filtering algorithm to process the final pose information of the i-1th frame image to obtain the first pose information.
  • The processor 1010 can be specifically configured to determine the final pose information of the (i-1)-th frame image as the initial pose information of the i-th frame image.
  • The processor 1010 may specifically be configured to determine the first pose information as the initial pose information of the i-th frame image when the matching degree between the first pose information and the final pose information of the (i-1)-th frame image is greater than a preset matching degree.
  • In the electronic device provided in the embodiment of the present application, the electronic device can perform real-time positioning and map construction based on the i-th frame image and on the pose information obtained by fusing the first interpolation variable with the initial pose information of the i-th frame image, which is determined according to the final pose information of the (i-1)-th frame image. The first interpolation variable is the interpolation variable between the pose information, before and after optimization, of a key-frame image among the images collected by the electronic device before the i-th frame image. Therefore, on the one hand, the electronic device can correct the initial pose of the i-th frame image, which can improve the accuracy of pose tracking; on the other hand, the electronic device only needs to calculate the interpolation variable for key-frame images, which can shorten the pose-tracking delay. In this way, the electronic device can output high-frequency, high-precision pose information, which improves the effect of its real-time positioning and map construction.
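  • The division of work described above (a cheap fusion for every frame, an expensive optimization only for key frames) can be sketched as a control loop. This is a deliberately simplified toy, not the claimed method: the pose is a single number, fusion is addition, the optimization runs synchronously, and `predict`, `optimize`, and `is_keyframe` are hypothetical callables supplied by the caller.

```python
def run_tracking(first_pose, num_frames, predict, optimize, is_keyframe):
    """Return the final pose of frames 1..num_frames.

    predict(prev_final)  -> initial pose of the next frame (e.g. a filter step)
    optimize(initial)    -> target pose of a key frame (e.g. bundle adjustment)
    is_keyframe(i)       -> True if frame i is a key frame
    """
    delta = 0.0  # last interpolation variable; no correction known yet
    finals = [first_pose]
    for i in range(2, num_frames + 1):
        initial = predict(finals[-1])   # from the final pose of frame i-1
        finals.append(initial + delta)  # fuse with the last interpolation variable
        if is_keyframe(i):
            # Only key frames pay for optimization; the result updates the
            # correction that later frames will be fused with.
            target = optimize(initial)
            delta = target - initial
    return finals

# Toy run: every third frame is a key frame.
poses = run_tracking(
    first_pose=0.0,
    num_frames=5,
    predict=lambda p: p + 1.0,
    optimize=lambda p: p - 0.5,
    is_keyframe=lambda i: i % 3 == 0,
)
```

  • In a real system the optimization would typically run in the background and its result would arrive some frames later; the loop above only shows which steps run per frame and which run per key frame.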
  • the input unit 1004 may include a graphics processing unit (Graphics Processing Unit, GPU) 10041 and a microphone 10042; the graphics processing unit 10041 processes the image data of still pictures or video obtained by an image capture device (such as a camera).
  • the display unit 1006 may include a display panel 10061, and the display panel 10061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like.
  • the user input unit 1007 includes at least one of a touch panel 10071 and other input devices 10072.
  • the touch panel 10071 is also called a touch screen.
  • the touch panel 10071 may include two parts, a touch detection device and a touch controller.
  • Other input devices 10072 may include, but are not limited to, physical keyboards, function keys (such as volume control buttons, switch buttons, etc.), trackballs, mice, and joysticks, which will not be repeated here.
  • the memory 1009 can be used to store software programs as well as various data.
  • the memory 1009 may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, wherein the first storage area may store an operating system, an application program or instructions required by at least one function (such as a sound playing function, image playback function, etc.), etc.
  • memory 1009 may include volatile memory or nonvolatile memory, or, memory 1009 may include both volatile and nonvolatile memory.
  • the non-volatile memory can be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM) or flash memory.
  • Volatile memory can be random access memory (Random Access Memory, RAM), static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDRSDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (Synchlink DRAM, SLDRAM) and direct Rambus random access memory (Direct Rambus RAM, DRRAM).
  • the processor 1010 may include one or more processing units; optionally, the processor 1010 integrates an application processor and a modem processor, where the application processor mainly handles operations related to the operating system, the user interface, and application programs, and the modem processor, such as a baseband processor, mainly handles wireless communication signals. It can be understood that the modem processor may also not be integrated into the processor 1010.
  • the embodiment of the present application also provides a readable storage medium on which programs or instructions are stored. When the programs or instructions are executed by a processor, the processes of the above real-time positioning and map construction method embodiment are implemented, and the same technical effect can be achieved. To avoid repetition, details are not repeated here.
  • the processor is the processor in the electronic device described in the above embodiments.
  • the readable storage medium includes a computer-readable storage medium, such as a computer read-only memory ROM, a random access memory RAM, a magnetic disk or an optical disk, and the like.
  • the embodiment of the present application further provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement the real-time positioning and map construction method described above.
  • the chip mentioned in the embodiments of the present application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip.
  • the embodiment of the present application provides a computer program product. The program product is stored in a storage medium and is executed by at least one processor to implement the processes of the above real-time positioning and map construction method embodiments, and the same technical effect can be achieved. To avoid repetition, details are not repeated here.
  • the terms "comprising" and "including", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus comprising a set of elements includes not only those elements but also other elements not expressly listed, or elements inherent in the process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not preclude the presence of additional identical elements in the process, method, article, or apparatus comprising that element.
  • the scope of the methods and devices in the embodiments of the present application is not limited to performing functions in the order shown or discussed; functions may also be performed in a substantially simultaneous manner or in reverse order, depending on the functions involved. For example, the described methods may be performed in an order different from that described, and various steps may also be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a simultaneous localization and mapping method and apparatus, an electronic device, and a readable storage medium, belonging to the technical field of communications. The method comprises: according to final pose information of an acquired (i-1)th frame of image, determining initial pose information of an acquired i-th frame of image, wherein i is an integer greater than 1; fusing the initial pose information of the i-th frame of image with a first interpolation variable, so as to obtain final pose information of the i-th frame of image, wherein the first interpolation variable is the last interpolation variable obtained before the i-th frame of image is acquired, the first interpolation variable is an interpolation variable between the initial pose information and target pose information of a first image, the first image is an image which is the key frame in the images acquired before the i-th frame of image, and the target pose information is pose information after optimizing the initial pose information of the first image; and performing simultaneous localization and mapping on the basis of the final pose information of the i-th frame of image, and the i-th frame of image.

Description

Real-time positioning and map construction method and apparatus, electronic device, and readable storage medium
Cross-Reference to Related Applications
This application claims priority to Chinese Patent Application No. 202210163295.9, filed in China on February 22, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
This application belongs to the technical field of communications, and specifically relates to a real-time positioning and map construction method and apparatus, an electronic device, and a readable storage medium.
Background
With the continuous development of communication technology, electronic devices offer more and more functions. For example, an electronic device can perform real-time positioning and map construction by tracking the pose of each image frame of the current scene.
In the related art, an electronic device can use a filtering-based method or an optimization-based method to process the pose of an image frame of the current scene acquired in real time, and output the processed pose, thereby tracking the pose of each image frame.
However, with the above methods, on the one hand, the filtering-based method cannot correct the pose of an image frame, so the accuracy with which the electronic device tracks the pose is poor; on the other hand, the optimization-based method is computationally expensive, so the electronic device takes a long time to process the pose of a single image frame, which leads to a long pose-tracking delay. As a result, the electronic device may perform real-time positioning and map construction poorly.
Summary
The purpose of the embodiments of the present application is to provide a real-time positioning and map construction method and apparatus, an electronic device, and a readable storage medium that can solve the problem that an electronic device performs real-time positioning and map construction poorly.
In a first aspect, an embodiment of the present application provides a real-time positioning and map construction method. The method includes: determining the initial pose information of a collected i-th frame image according to the final pose information of a collected (i-1)-th frame image, where i is an integer greater than 1; fusing the initial pose information of the i-th frame image with a first interpolation variable to obtain the final pose information of the i-th frame image, where the first interpolation variable is the last interpolation variable obtained before the i-th frame image is collected, the first interpolation variable is the interpolation variable between the initial pose information and the target pose information of a first image, the first image is a key-frame image among the images collected before the i-th frame image, and the target pose information is the pose information obtained by optimizing the initial pose information of the first image; and performing real-time positioning and map construction based on the final pose information of the i-th frame image and the i-th frame image.
In a second aspect, an embodiment of the present application provides a real-time positioning and map construction apparatus. The apparatus includes an acquisition module, a determination module, a fusion module, and a processing module. The determination module is used to determine the initial pose information of the i-th frame image collected by the acquisition module according to the final pose information of the (i-1)-th frame image collected by the acquisition module, where i is an integer greater than 1. The fusion module is used to fuse the initial pose information of the i-th frame image with a first interpolation variable to obtain the final pose information of the i-th frame image, where the first interpolation variable is the last interpolation variable obtained before the acquisition module collects the i-th frame image, the first interpolation variable is the interpolation variable between the initial pose information and the target pose information of a first image, the first image is a key-frame image among the images collected by the acquisition module before the i-th frame image, and the target pose information is the pose information obtained by optimizing the initial pose information of the first image. The processing module is used to perform real-time positioning and map construction based on the final pose information of the i-th frame image and the i-th frame image.
In a third aspect, an embodiment of the present application provides an electronic device. The electronic device includes a processor and a memory, the memory stores programs or instructions that can run on the processor, and the programs or instructions, when executed by the processor, implement the steps of the method described in the first aspect.
In a fourth aspect, an embodiment of the present application provides a readable storage medium on which programs or instructions are stored; the programs or instructions, when executed by a processor, implement the steps of the method described in the first aspect.
In a fifth aspect, an embodiment of the present application provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement the method described in the first aspect.
In a sixth aspect, an embodiment of the present application provides a computer program product. The program product is stored in a storage medium and is executed by at least one processor to implement the method described in the first aspect.
In the embodiments of the present application, initial pose information of a captured i-th frame image may be determined according to final pose information of a captured (i-1)-th frame image, where i is an integer greater than 1. The initial pose information of the i-th frame image is fused with a first interpolation variable to obtain final pose information of the i-th frame image, where the first interpolation variable is the last interpolation variable obtained before the i-th frame image is captured, the first interpolation variable is an interpolation variable between initial pose information of a first image and target pose information, the first image is an image that is a keyframe among the images captured before the i-th frame image, and the target pose information is pose information obtained by optimizing the initial pose information of the first image. Simultaneous localization and mapping is then performed based on the final pose information of the i-th frame image and the i-th frame image. With this solution, the electronic device performs simultaneous localization and mapping based on the i-th frame image and on the pose information obtained by fusing the first interpolation variable with the initial pose information of the i-th frame image, which is itself determined from the final pose information of the (i-1)-th frame image. The first interpolation variable is the interpolation variable between the pre-optimization and post-optimization pose information of a keyframe image captured before the i-th frame image. Therefore, on the one hand, the electronic device can correct the initial pose of the i-th frame image, which improves the accuracy of the tracked pose; on the other hand, the electronic device only needs to compute interpolation variables for keyframe images, which shortens the latency of pose tracking. In this way, the electronic device can output pose information at high frequency and with high precision, which improves the result of simultaneous localization and mapping performed by the electronic device.
Brief Description of the Drawings
FIG. 1 is a flowchart of the simultaneous localization and mapping method provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of the simultaneous localization and mapping apparatus provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of an electronic device provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of the hardware of an electronic device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application fall within the protection scope of the present application.
The terms "first", "second", and the like in the specification and claims of the present application are used to distinguish between similar objects, and are not used to describe a specific order or sequence. It should be understood that terms used in this way are interchangeable under appropriate circumstances, so that the embodiments of the present application can be implemented in an order other than those illustrated or described herein. Objects distinguished by "first", "second", and the like are generally of one type, and the number of objects is not limited; for example, there may be one first object or a plurality of first objects. In addition, "and/or" in the specification and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects before and after it.
The simultaneous localization and mapping method and apparatus, electronic device, and readable storage medium provided by the embodiments of the present application are described in detail below through specific embodiments and their application scenarios with reference to the accompanying drawings.
Simultaneous localization and mapping refers to the process in which a moving object, using sensor information, computes its own position while building a map of the environment. Simultaneous localization and mapping is currently applied in fields such as robotics, virtual reality, and augmented reality, and its uses include localization of the sensor itself as well as subsequent path planning and scene understanding. Mainstream simultaneous localization and mapping methods are generally divided into two types: filter-based and optimization-based. A filter-based method uses the values of the various states at the previous moment to estimate the next moment. An optimization-based method treats all states as variables, treats the motion equation and the observation equation as constraints between the variables, constructs an error function, and minimizes the quadratic form of this error.
Filter-based methods, represented by the Multi-State Constraint Kalman Filter (MSCKF), feature low algorithm power consumption, a high algorithm output frequency, and good localization accuracy, and are therefore well suited to mobile terminal scenarios. However, their theoretical framework is incomplete and lacks a mapping function. Moreover, the spatial localization error gradually grows as the running time of the method increases. In addition, once the electronic device fails to localize in a practical application, the entire MSCKF system cannot continue to run, so the robustness of existing MSCKF systems may be poor. Optimization-based methods have a complete set of system modules (including both a localization module and a mapping module), high localization accuracy, and strong overall robustness. However, because such a method involves a relatively large amount of computation at run time, the corresponding algorithm power consumption is high and the algorithm output frequency is low, so the method does not meet the mobile terminal's need for a high output rate and low computing power consumption. As a result, the electronic device performs simultaneous localization and mapping poorly.
To solve the above problems, in the embodiments of the present application, the electronic device may determine initial pose information of a captured i-th frame image according to final pose information of a captured (i-1)-th frame image, where i is an integer greater than 1. The electronic device fuses the initial pose information of the i-th frame image with a first interpolation variable to obtain final pose information of the i-th frame image, where the first interpolation variable is the last interpolation variable obtained before the i-th frame image is captured, the first interpolation variable is an interpolation variable between initial pose information of a first image and target pose information, the first image is an image that is a keyframe among the images captured before the i-th frame image, and the target pose information is pose information obtained by optimizing the initial pose information of the first image. The electronic device then performs simultaneous localization and mapping based on the final pose information of the i-th frame image and the i-th frame image. With this method, on the one hand, because the electronic device can correct the initial pose of the i-th frame image, the accuracy of the tracked pose can be improved; on the other hand, because the electronic device only needs to compute interpolation variables for keyframe images, the latency of pose tracking can be shortened. In this way, the electronic device can output pose information at high frequency and with high precision, which improves the result of simultaneous localization and mapping performed by the electronic device.
An embodiment of the present application provides a simultaneous localization and mapping method, and FIG. 1 shows a flowchart of the method. As shown in FIG. 1, the simultaneous localization and mapping method provided by the embodiment of the present application may include the following steps 101 to 103.
Step 101: The electronic device determines initial pose information of a captured i-th frame image according to final pose information of a captured (i-1)-th frame image.
In this embodiment of the present application, i is an integer greater than 1.
Optionally, in this embodiment of the present application, during simultaneous localization and mapping, the electronic device may capture images of the scene in which it is located in real time through a sensor in the electronic device.
In this embodiment of the present application, the pose information of an image may indicate the position of the image in three-dimensional space.
Optionally, in this embodiment of the present application, the pose information may include rotation coordinates and displacement coordinates.
For example, pose information T = {R, P}, where R denotes the rotation coordinates, comprising the rotation coordinate about the X axis, the rotation coordinate about the Y axis, and the rotation coordinate about the Z axis in three-dimensional space; and P denotes the displacement coordinates, comprising the coordinate on the X axis, the coordinate on the Y axis, and the coordinate on the Z axis in three-dimensional space.
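As a purely illustrative sketch, the pose information T = {R, P} described above could be held in a small record type. The class name `Pose`, the field layout, and the use of radians are assumptions made for this example and are not specified by the application.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Pose:
    """Pose information T = {R, P} of one frame (hypothetical container)."""
    R: Vec3  # rotation coordinates about the X, Y and Z axes (radians assumed)
    P: Vec3  # displacement coordinates along the X, Y and Z axes


# Two example poses: the origin, and a pose rotated about Z and shifted in space.
origin = Pose(R=(0.0, 0.0, 0.0), P=(0.0, 0.0, 0.0))
example = Pose(R=(0.0, 0.0, 1.5708), P=(1.0, 2.0, 0.5))
```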
The specific method by which the electronic device determines the initial pose information of the captured i-th frame image is described in detail below.
Optionally, in this embodiment of the present application, step 101 may be implemented through the following step 101a together with step A or step B.
Step 101a: The electronic device processes the final pose information of the (i-1)-th frame image by using a filtering algorithm to obtain first pose information.
Optionally, in this embodiment of the present application, the principle of the filtering algorithm may be:
$x = f(x_{i-1}) + n$
where $x$ is the first pose information, $x_{i-1}$ is the final pose information of the (i-1)-th frame image, $f$ is the transition matrix, and $n$ is the noise term.
It can be seen that the electronic device can compute the first pose information from the final pose information of the (i-1)-th frame image.
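The prediction relation x = f(x_{i-1}) + n can be sketched as a plain matrix-vector product. The following is a minimal sketch under stated assumptions: the pose is flattened into a vector, f is applied as a matrix `F`, and the toy two-element state (position, velocity) with a 0.1 s time step is hypothetical and chosen only to keep the example short.

```python
from typing import List, Optional, Sequence


def predict_pose(x_prev: Sequence[float],
                 F: Sequence[Sequence[float]],
                 noise: Optional[Sequence[float]] = None) -> List[float]:
    """Return x = F * x_prev + n for a state stored as a flat vector."""
    # With no noise sample supplied, the noise term n is taken as zero.
    n = list(noise) if noise is not None else [0.0] * len(F)
    return [sum(f_rc * x_c for f_rc, x_c in zip(row, x_prev)) + n_r
            for row, n_r in zip(F, n)]


# Hypothetical 1-D example: state = [position, velocity], time step 0.1 s.
F = [[1.0, 0.1],
     [0.0, 1.0]]
x_first = predict_pose([2.0, 0.5], F)  # position advances by 0.1 * velocity
```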
Optionally, in this embodiment of the present application, after obtaining the first pose information, the electronic device may evaluate how well the first pose information matches the final pose information of the (i-1)-th frame image, and decide according to the result whether to perform step A or step B.
Step A: When the matching degree between the first pose information and the final pose information of the (i-1)-th frame image is less than or equal to a preset matching degree, the electronic device determines the final pose information of the (i-1)-th frame image as the initial pose information of the i-th frame image.
Optionally, in this embodiment of the present application, the preset matching degree may be a system default or may be set by the user according to actual usage requirements.
It can be understood that a matching degree less than or equal to the preset matching degree means that the first pose information differs too much from the final pose information of the (i-1)-th frame image.
For example, if localization fails while the electronic device is capturing images of the current scene, for instance because the electronic device captures images of a white wall for a long time, then accumulated error may cause the pose information computed by the filtering algorithm to be abnormal. That is, the matching degree between this pose information (the first pose information) and the final pose information of the previous frame image (the (i-1)-th frame image) is less than or equal to the preset matching degree.
Optionally, in this embodiment of the present application, the electronic device may save the history information of the entire localization process. When the matching degree between the first pose information and the final pose information of the (i-1)-th frame image is less than or equal to the preset matching degree, the electronic device determines the final pose information of the (i-1)-th frame image as the initial pose information of the i-th frame image, which reduces error and ensures the accuracy of the initial pose information of the i-th frame image.
Step B: When the matching degree between the first pose information and the final pose information of the (i-1)-th frame image is greater than the preset matching degree, the electronic device determines the first pose information as the initial pose information of the i-th frame image.
It can be understood that a matching degree greater than the preset matching degree means that the first pose information differs little from the final pose information of the (i-1)-th frame image. In this case, the electronic device may determine the first pose information as the initial pose information of the i-th frame image; that is, the initial pose information of the i-th frame image is the pose information obtained by processing the final pose information of the (i-1)-th frame image with the filtering algorithm.
In this embodiment of the present application, the electronic device evaluates the matching degree between the final pose information of the (i-1)-th frame image and the pose information obtained by processing it with the filtering algorithm (the first pose information), and determines either the final pose information of the (i-1)-th frame image or the first pose information as the initial pose information of the i-th frame image. The accuracy of the initial pose information of the i-th frame image can therefore be ensured.
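The choice between step A and step B can be sketched as a single comparison. The application does not define how the matching degree is computed, so the `matching_degree` function below (a score derived from Euclidean distance) and the default threshold of 0.5 are assumptions made only for illustration.

```python
import math
from typing import Sequence


def matching_degree(a: Sequence[float], b: Sequence[float]) -> float:
    """Hypothetical score in (0, 1]: 1.0 for identical poses, lower as they diverge."""
    return 1.0 / (1.0 + math.dist(a, b))


def initial_pose(first_pose: Sequence[float],
                 prev_final_pose: Sequence[float],
                 preset_degree: float = 0.5) -> Sequence[float]:
    """Select the initial pose of frame i from the filter output or the previous final pose."""
    if matching_degree(first_pose, prev_final_pose) <= preset_degree:
        return prev_final_pose  # step A: filter output is implausible, reuse frame i-1
    return first_pose           # step B: filter output is close enough, accept it


prev_final = (0.0, 0.0, 0.0)
plausible = initial_pose((0.1, 0.0, 0.0), prev_final)   # small jump, step B
implausible = initial_pose((5.0, 0.0, 0.0), prev_final)  # large jump, step A
```

Any other similarity measure could be substituted for `matching_degree`; only the comparison against the preset value follows the text.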
Step 102: The electronic device fuses the initial pose information of the i-th frame image with the first interpolation variable to obtain the final pose information of the i-th frame image.
In this embodiment of the present application, the first interpolation variable is the last interpolation variable obtained before the i-th frame image is captured.
It should be noted that the electronic device obtains an interpolation variable by performing a computation on the pose information of a keyframe image. While that computation is in progress, the electronic device does not start a computation on the pose information of any newly captured keyframe image. If the electronic device captures another keyframe image after the computation has finished, it can obtain a new interpolation variable by performing a computation on the pose information of that image.
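The scheduling rule described above (a keyframe arriving during an ongoing computation is not processed) can be sketched with a simple busy flag. The class and method names below are hypothetical; a real system would likely run the computation on a separate thread, which this sketch omits.

```python
from typing import Any, Optional


class InterpolationWorker:
    """Tracks one in-flight keyframe computation and the last finished result."""

    def __init__(self) -> None:
        self.busy = False
        self.current_keyframe: Optional[int] = None
        self.latest: Any = None  # last interpolation variable that was obtained

    def offer_keyframe(self, keyframe_id: int) -> bool:
        """Start a computation for this keyframe only if none is in progress."""
        if self.busy:
            return False  # keyframes captured during a computation are skipped
        self.busy = True
        self.current_keyframe = keyframe_id
        return True

    def finish(self, interpolation_variable: Any) -> None:
        """Record the result and become available for the next keyframe."""
        self.latest = interpolation_variable
        self.busy = False
        self.current_keyframe = None


worker = InterpolationWorker()
accepted_first = worker.offer_keyframe(3)   # accepted: worker was idle
accepted_second = worker.offer_keyframe(5)  # rejected: keyframe 3 still in progress
worker.finish("delta_for_keyframe_3")
accepted_third = worker.offer_keyframe(8)   # accepted again after finishing
```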
In this embodiment of the present application, the first interpolation variable is an interpolation variable between the initial pose information of the first image and the target pose information.
In this embodiment of the present application, the target pose information may be pose information obtained by optimizing the initial pose information of the first image.
In this embodiment of the present application, the first image is an image that is a keyframe among the images captured before the i-th frame image.
Optionally, in this embodiment of the present application, a keyframe image among the images captured by the electronic device may be a frame that the electronic device has been capturing for a long time, or may be an image that contains elements not captured before, and so on.
Optionally, in this embodiment of the present application, the electronic device may fuse the initial pose information of the i-th frame image with the first interpolation variable through a coordinate operation to obtain the final pose information of the i-th frame image.
For example, the initial pose information of the i-th frame image and the first interpolation variable are each given by an expression, and the electronic device fuses them through a coordinate operation (the three expressions are missing from this text), obtaining the final pose information $T_{out} = \{R_{out}, P_{out}\}$ of the i-th frame image.
Optionally, in this embodiment of the present application, the electronic device may fuse the initial pose information of the i-th frame image with the first interpolation variable by using a Kalman filtering method to obtain the final pose information of the i-th frame image.
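One plausible form of the coordinate-operation fusion is a rigid-transform composition. The sketch below is an assumption, not the application's formula: it represents rotation as a 3x3 matrix rather than as three rotation coordinates, treats the interpolation variable as a correction (dR, dP) that maps a keyframe's initial pose onto its optimized pose, and applies that same correction to a later frame. All function names are hypothetical.

```python
from typing import List, Tuple

Mat3 = List[List[float]]
Vec3 = List[float]
Pose = Tuple[Mat3, Vec3]  # (rotation matrix R, displacement P)


def mat_mul(a: Mat3, b: Mat3) -> Mat3:
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def mat_vec(a: Mat3, v: Vec3) -> Vec3:
    return [sum(a[r][k] * v[k] for k in range(3)) for r in range(3)]


def transpose(a: Mat3) -> Mat3:
    return [[a[c][r] for c in range(3)] for r in range(3)]


def interpolation_variable(kf_initial: Pose, kf_optimized: Pose) -> Pose:
    """Correction (dR, dP) mapping the keyframe's initial pose onto its optimized pose."""
    R_init, P_init = kf_initial
    R_opt, P_opt = kf_optimized
    dR = mat_mul(R_opt, transpose(R_init))  # the inverse of a rotation is its transpose
    dP = [o - m for o, m in zip(P_opt, mat_vec(dR, P_init))]
    return dR, dP


def fuse(initial: Pose, correction: Pose) -> Pose:
    """Apply the correction: R_out = dR * R_init, P_out = dR * P_init + dP."""
    R_init, P_init = initial
    dR, dP = correction
    R_out = mat_mul(dR, R_init)
    P_out = [m + d for m, d in zip(mat_vec(dR, P_init), dP)]
    return R_out, P_out


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ROT_Z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

kf_initial = (IDENTITY, [1.0, 0.0, 0.0])
kf_optimized = (ROT_Z_90, [1.0, 1.0, 0.0])
delta = interpolation_variable(kf_initial, kf_optimized)

frame_initial = (IDENTITY, [2.0, 0.0, 0.0])
frame_final = fuse(frame_initial, delta)
```

By construction, fusing the keyframe's own initial pose with `delta` reproduces its optimized pose, which is the property that makes the correction reusable for later frames.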
The simultaneous localization and mapping method provided by the embodiment of the present application is illustrated below by way of example.
For example, assume that the a-th frame image is a keyframe image among the images captured before the i-th frame image. The electronic device can compute the interpolation variable between the initial pose information of the a-th frame image and the pose information obtained by optimizing that initial pose information (the target pose information). If the i-th frame image is captured after the electronic device has finished computing this interpolation variable, the electronic device fuses the initial pose information of the i-th frame image with this interpolation variable (which is then the first interpolation variable) to obtain the final pose information of the i-th frame image. If the i-th frame image is captured before the electronic device has finished computing this interpolation variable, the electronic device fuses the initial pose information of the i-th frame image with the interpolation variable obtained from the previous computation (which is then the first interpolation variable) to obtain the final pose information of the i-th frame image. In this way, the accuracy of the final pose information of the i-th frame image can be ensured.
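Selecting "the last interpolation variable obtained before the i-th frame image is captured" amounts to a lookup by completion time. The sketch below assumes that each interpolation variable is stored with the time at which its computation finished, in ascending order; the timestamps and labels are hypothetical.

```python
from bisect import bisect_left
from typing import Any, Optional, Sequence


def first_interpolation_variable(ready_times: Sequence[float],
                                 variables: Sequence[Any],
                                 frame_time: float) -> Optional[Any]:
    """Return the last variable whose computation finished strictly before frame_time."""
    # bisect_left gives the count of completion times that are < frame_time.
    idx = bisect_left(ready_times, frame_time)
    return variables[idx - 1] if idx > 0 else None


ready_times = [1.0, 4.0]            # completion times, ascending
variables = ["delta_a", "delta_b"]  # interpolation variables, same order

before_any = first_interpolation_variable(ready_times, variables, 0.5)
during_second = first_interpolation_variable(ready_times, variables, 2.0)
after_second = first_interpolation_variable(ready_times, variables, 5.0)
```

A frame captured at time 2.0, while the second computation is still running, falls back to `delta_a`, which mirrors the second case in the example above.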
The specific methods by which the electronic device determines the first interpolation variable and optimizes the initial pose information of the first image are described in detail in the following embodiments and, to avoid repetition, are not described here.
Step 103: The electronic device performs simultaneous localization and mapping based on the final pose information of the i-th frame image and the i-th frame image.
In this embodiment of the present application, the electronic device may perform simultaneous localization and mapping based on the final pose information of the i-th frame image and the i-th frame image, so as to construct a three-dimensional map corresponding to the i-th frame image.
Optionally, in this embodiment of the present application, the electronic device may superimpose the three-dimensional map constructed for the i-th frame image on the three-dimensional maps constructed for each frame image before the i-th frame image, thereby achieving simultaneous localization and mapping of the current scene.
It should be noted that if the electronic device has not obtained any interpolation variable before capturing the i-th frame image, the electronic device may perform simultaneous localization and mapping directly based on the initial pose information of the i-th frame image and the i-th frame image. Alternatively, the electronic device may optimize the initial pose information of the i-th frame image, determine an interpolation variable, and fuse the initial pose information of the i-th frame image with that interpolation variable to obtain the final pose information of the i-th frame image, so that the electronic device can perform simultaneous localization and mapping based on the final pose information of the i-th frame image and the i-th frame image.
In the simultaneous localization and mapping method provided by the embodiment of the present application, the electronic device performs simultaneous localization and mapping based on the i-th frame image and on the pose information obtained by fusing the first interpolation variable with the initial pose information of the i-th frame image, which is itself determined from the final pose information of the (i-1)-th frame image. The first interpolation variable is the interpolation variable between the pre-optimization and post-optimization pose information of a keyframe image captured before the i-th frame image. Therefore, on the one hand, the electronic device can correct the initial pose of the i-th frame image, which improves the accuracy of the tracked pose; on the other hand, the electronic device only needs to compute interpolation variables for keyframe images, which shortens the latency of pose tracking. In this way, the electronic device can output pose information at high frequency and with high precision, which improves the result of simultaneous localization and mapping performed by the electronic device.
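The per-frame flow of steps 101 and 102, including the case where no interpolation variable is available yet, can be sketched as a loop. This is a toy illustration under stated assumptions: poses are single numbers, the caller supplies the `predict`, `latest_correction`, and `fuse` callables (all hypothetical names), and list index 0 corresponds to the first frame.

```python
from typing import Callable, List, Optional, Sequence


def track(frames: Sequence[object],
          first_pose: float,
          predict: Callable[[float], float],
          latest_correction: Callable[[int], Optional[float]],
          fuse: Callable[[float, float], float]) -> List[float]:
    """Return the final pose of every frame, following steps 101 and 102."""
    finals = [first_pose]
    for i in range(1, len(frames)):
        initial = predict(finals[i - 1])  # step 101: from the previous final pose
        delta = latest_correction(i)      # last interpolation variable before frame i
        # Step 102, or the fallback of using the initial pose when no variable exists.
        final = initial if delta is None else fuse(initial, delta)
        finals.append(final)              # step 103 would consume (final, frames[i])
    return finals


# Toy run: 1-D poses, constant forward motion, a correction available from index 3 on.
finals = track(
    frames=["f0", "f1", "f2", "f3", "f4"],
    first_pose=0.0,
    predict=lambda pose: pose + 1.0,
    latest_correction=lambda i: -0.5 if i >= 3 else None,
    fuse=lambda pose, delta: pose + delta,
)
```

The loop body stays cheap for every frame because the expensive optimization that produces the correction is assumed to run elsewhere, on keyframes only.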
Optionally, in this embodiment of the present application, before step 101, the simultaneous localization and mapping method provided by the embodiment of the present application may further include the following steps 104 and 105.
Step 104: The electronic device optimizes the initial pose information of the first image based on M pieces of pose information, M sets of offset information, and the initial pose information of the first image to obtain the target pose information.
In this embodiment of the present application, the M pieces of pose information are the most recently optimized pose information of M frame images, and the M frame images are the images that are keyframes among the images captured before the first image.
In this embodiment of the present application, each of the M sets of offset information is the offset of the feature points of the first image relative to the feature points of one of the M frame images.
Optionally, in this embodiment of the present application, a feature point of an image may be any suitable point in the image, such as a vertex, a corner point, or a center point.
Optionally, in this embodiment of the present application, an image may contain one feature point or a plurality of feature points, which may be determined by the electronic device according to the captured image.
Optionally, in this embodiment of the present application, different images may have the same or different numbers of feature points.
Optionally, in this embodiment of the present application, the number of corresponding feature points between the first image and one of the M frame images may be N, where N is an integer greater than or equal to 0. It can be understood that the corresponding set of offset information then includes N offsets.
The specific method by which the electronic device determines the M sets of offset information is described in detail in the following embodiments and, to avoid repetition, is not described here.
The specific method by which the electronic device optimizes the initial pose information of the first image is described in detail below.
Optionally, in this embodiment of the present application, step 104 may be implemented through the following steps 104a and 104b.
Step 104a: The electronic device determines M sets of three-dimensional position information according to the M sets of offset information.
In this embodiment of the present application, the M sets of offset information correspond one-to-one to the M sets of three-dimensional position information, and each set of three-dimensional position information may be used to indicate the feature points in the three-dimensional map constructed based on one of the M frame images.
Optionally, in this embodiment of the present application, when one of the M sets of offset information includes N offsets, the set of three-dimensional position information that the electronic device determines according to that set of offset information may indicate N feature points in the constructed three-dimensional map.
Optionally, in this embodiment of the present application, for each of the M sets of offset information, the electronic device may determine, according to the offset of the feature points of the first image relative to the feature points of one of the M frame images, a set of three-dimensional position information indicating the feature points in the three-dimensional map constructed based on that frame image.
Step 104b: The electronic device uses a preset bundle adjustment algorithm to process the above M pieces of pose information, the initial pose information of the first image, and the above M sets of three-dimensional position information, to obtain the target pose information.
Optionally, in this embodiment of the present application, the principle of the preset bundle adjustment algorithm may be:
where T is the initial pose information of an image, P is the three-dimensional position information, Z is the two-dimensional observation, π is the projection equation, M is the number of key-frame images preceding the first image, and N is the number of feature points in the three-dimensional map indicated by the M sets of three-dimensional position information.
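The expression for this principle is not reproduced in the text above. As a hedged sketch only, a standard bundle adjustment objective that is consistent with the symbols just defined would take the following form; the indices i and j and the squared-norm error are assumptions, not taken from the source:

$$\{T^{*}, P^{*}\} = \arg\min_{T,\,P} \sum_{i=1}^{M} \sum_{j=1}^{N} \left\| Z_{ij} - \pi\left(T_i, P_j\right) \right\|^{2}$$

Here $Z_{ij}$ would denote the two-dimensional observation of feature point j in image i, and terms for feature points not observed in a given image would simply be omitted. Since the pose of the first image is optimized together with the M key-frame poses, the outer sum may need one additional term for the first image.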
It can be seen that the electronic device can process the above M pieces of pose information, the initial pose information of the first image, and the above M sets of three-dimensional position information to obtain the target pose information.
It should be noted that, in this embodiment of the present application, while obtaining the target pose information with the preset bundle adjustment algorithm, the electronic device may also optimize the above M pieces of pose information. The next time the electronic device uses the preset bundle adjustment algorithm to obtain target pose information for a new first image, it can start from the optimized M pieces of pose information, which can improve the localization accuracy of the electronic device.
In this embodiment of the present application, the electronic device can determine, from the M sets of offset information, M sets of three-dimensional position information that correspond one-to-one to them and indicate feature points in the constructed three-dimensional map, and can use the preset bundle adjustment algorithm to compute the pose information obtained by optimizing the initial pose information of the first image. This can improve the accuracy with which the electronic device optimizes the initial pose information of the first image, and therefore the accuracy of map construction during simultaneous localization and mapping.
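The quantity that a bundle adjustment step of this kind typically minimizes is a sum of squared reprojection errors. The following Python sketch only evaluates such a cost and does not perform the optimization. It assumes a pinhole projection with unit focal length and a pose (R, t) that maps world points into the camera frame; the names `project` and `reprojection_cost` are hypothetical and not from the source.

```python
def mat_vec(R, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(R[r][c] * v[c] for c in range(3)) for r in range(3)]

def project(pose, point, focal=1.0):
    """Assumed pinhole projection pi(T, P): 3D point -> 2D image coordinates."""
    R, t = pose
    x, y, z = (a + b for a, b in zip(mat_vec(R, point), t))
    return (focal * x / z, focal * y / z)

def reprojection_cost(poses, points, observations):
    """Sum of squared reprojection errors.

    observations maps (image_index, point_index) -> observed (u, v);
    pairs that were not observed are simply absent from the dict.
    """
    cost = 0.0
    for (i, j), (u, v) in observations.items():
        pu, pv = project(poses[i], points[j])
        cost += (u - pu) ** 2 + (v - pv) ** 2
    return cost

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
poses = [(IDENTITY, [0.0, 0.0, 0.0]), (IDENTITY, [1.0, 0.0, 0.0])]
points = [[0.0, 0.0, 2.0], [1.0, 1.0, 4.0]]
# Noise-free observations generated from the same poses and points.
observations = {(i, j): project(poses[i], points[j])
                for i in range(2) for j in range(2)}
```

With noise-free observations the cost is zero; an optimizer would adjust the poses and the three-dimensional points to drive this cost down when the observations are noisy.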
Step 105: The electronic device determines the first interpolation variable according to the first rotation coordinate and the first displacement coordinate in the initial pose information of the first image, and the second rotation coordinate and the second displacement coordinate in the target pose information.
For a detailed description of the rotation coordinate and the displacement coordinate, refer to the relevant description in the foregoing embodiments; to avoid repetition, details are not repeated here.
The specific method by which the electronic device determines the first interpolation variable is described in detail below.
Optionally, in this embodiment of the present application, the above step 105 may be implemented through the following steps 105a to 105c.
Step 105a: The electronic device determines the target rotation coordinate according to the first rotation coordinate and the second rotation coordinate.
Optionally, in this embodiment of the present application, the electronic device may multiply the transpose of the first rotation coordinate by the second rotation coordinate to obtain the target rotation coordinate.
Step 105b: The electronic device determines the target displacement coordinate according to the target rotation coordinate, the first displacement coordinate, and the second displacement coordinate.
Optionally, in this embodiment of the present application, the electronic device may multiply the target rotation coordinate by the first displacement coordinate to obtain an intermediate displacement coordinate, and subtract the intermediate displacement coordinate from the second displacement coordinate to obtain the target displacement coordinate.
Step 105c: The electronic device determines the target rotation coordinate and the target displacement coordinate as the first interpolation variable.
The simultaneous localization and mapping method provided in the embodiments of the present application is described below by way of example.
Exemplarily, suppose the initial pose information of the first image contains a first rotation coordinate and a first displacement coordinate, and the target pose information of the first image contains a second rotation coordinate and a second displacement coordinate. The electronic device can then determine the target rotation coordinate from the first rotation coordinate and the second rotation coordinate, and determine the target displacement coordinate from the target rotation coordinate, the first displacement coordinate, and the second displacement coordinate. In this way, the electronic device can determine the target rotation coordinate and the target displacement coordinate as the first interpolation variable.
In this embodiment of the present application, the electronic device can determine the target rotation coordinate from the first rotation coordinate in the initial pose information of the first image and the second rotation coordinate in the target pose information of the first image, and can determine the target displacement coordinate from the target rotation coordinate, the first displacement coordinate in the initial pose information, and the second displacement coordinate in the target pose information. The electronic device can then determine the target rotation coordinate and the target displacement coordinate as the first interpolation variable, which facilitates the subsequent fusion processing.
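Steps 105a to 105c can be sketched in Python as follows. The sketch assumes that rotation coordinates are 3x3 rotation matrices and displacement coordinates are 3-vectors, and that the subtraction in step 105b is the second displacement minus the intermediate displacement. The function names are hypothetical. The `fuse` helper is an additional assumption used only as a sanity check, since the fusion rule is not stated in this passage.

```python
def transpose(R):
    return [list(row) for row in zip(*R)]

def mat_mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def mat_vec(R, v):
    return [sum(R[r][c] * v[c] for c in range(3)) for r in range(3)]

def interpolation_variable(R1, t1, R2, t2):
    """(R1, t1): initial pose of the first image; (R2, t2): target pose."""
    R_target = mat_mul(transpose(R1), R2)                  # step 105a
    intermediate = mat_vec(R_target, t1)                   # step 105b, multiplication
    t_target = [b - a for a, b in zip(intermediate, t2)]   # step 105b, subtraction
    return R_target, t_target                              # step 105c

def fuse(R, t, delta):
    """Assumed fusion rule (not given here): R * dR and dR * t + dt."""
    dR, dt = delta
    return mat_mul(R, dR), [a + b for a, b in zip(mat_vec(dR, t), dt)]

# Example: initial pose rotated 90 degrees about z, target pose rotated 180 degrees.
R1 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
t1 = [1.0, 0.0, 0.0]
R2 = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]]
t2 = [0.5, 1.0, 0.0]
delta = interpolation_variable(R1, t1, R2, t2)
```

Under the assumed fusion rule, applying `delta` back to the initial pose (R1, t1) reproduces the target pose (R2, t2), which suggests that the two formulas in steps 105a and 105b are mutually consistent.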
In this embodiment of the present application, the electronic device can optimize the initial pose information of the first image based on the most recently optimized pose information of the M key-frame images captured before the first image, the M sets of offset information, and the initial pose information of the first image. It can then determine the first interpolation variable from the rotation coordinate and displacement coordinate in the optimized pose information and the rotation coordinate and displacement coordinate in the initial pose information of the first image. In other words, the electronic device determines the first interpolation variable from the coordinates in the pose information of the first image before and after optimization, which can improve the accuracy of the first interpolation variable.
The specific method by which the electronic device determines the above M sets of offset information is described in detail below.
Optionally, in this embodiment of the present application, before the above step 104, the simultaneous localization and mapping method provided in the embodiments of the present application may further include the following step 106.
Step 106: The electronic device determines the above M sets of offset information according to the two-dimensional position information of the feature points of the first image and the two-dimensional position information of the feature points of the above M frames of images.
Optionally, in this embodiment of the present application, the two-dimensional position information of the feature points of the first image may be determined by the electronic device through a filtering method.
Optionally, in this embodiment of the present application, the two-dimensional position information may be used to indicate the positions of the feature points of the first image within the first image.
Optionally, in this embodiment of the present application, the principle by which the electronic device determines the two-dimensional position information of the feature points of the first image is as follows:
$$z_k = h(x_k) + r_k$$
where $z_k$ is the two-dimensional position information of the feature points of the first image (that is, the two-dimensional observation), $x_k$ is the initial pose information of the first image, $r_k$ is the noise term, and $h$ is the observation matrix.
It can be seen that the electronic device can determine the two-dimensional position information of the feature points of the first image from the initial pose information of the first image through a filtering method.
Optionally, in this embodiment of the present application, each set of offset information among the M sets may be the differences in image coordinates between the feature points of the first image and the corresponding feature points of one of the above M frames of images.
For example, the offset of feature point A(x1, y1) of the first image relative to feature point A'(x2, y2) of one of the above M frames of images is x = |x1 - x2|, y = |y1 - y2|.
It should be noted that, in this embodiment of the present application, each set of offset information includes the offsets of all feature points of the first image that have a corresponding feature point in one of the above M frames of images.
For example, suppose the first image includes feature point 1, feature point 2, and feature point 3, and one frame a among the above M frames of images includes feature point 1' corresponding to feature point 1, feature point 3' corresponding to feature point 3, and feature point 4. The set of offset information for frame a may then include the difference in image coordinates between feature point 1 and feature point 1', and the difference in image coordinates between feature point 3 and feature point 3'.
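The construction of one set of offset information can be sketched in Python as follows. The sketch assumes that feature correspondences are already known and are expressed as shared integer IDs, which is an illustrative simplification; the function name `offset_group` and the coordinate values are hypothetical.

```python
def offset_group(first_image_points, keyframe_points):
    """Offsets (|x1 - x2|, |y1 - y2|) for every feature present in both images.

    Both arguments map a feature ID to its (x, y) image coordinates.
    Features without a counterpart in the other image are skipped.
    """
    offsets = {}
    for fid, (x1, y1) in first_image_points.items():
        if fid in keyframe_points:
            x2, y2 = keyframe_points[fid]
            offsets[fid] = (abs(x1 - x2), abs(y1 - y2))
    return offsets

# First image has features 1, 2, 3; frame a has 1', 3' and an unrelated feature 4.
first_image = {1: (10.0, 20.0), 2: (30.0, 40.0), 3: (50.0, 60.0)}
frame_a = {1: (12.0, 19.0), 3: (47.0, 64.0), 4: (5.0, 5.0)}
group = offset_group(first_image, frame_a)
```

As in the example above, only features 1 and 3 contribute to the set. The M sets of offset information would be obtained by calling the function once per key frame.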
In this embodiment of the present application, the electronic device can determine the M sets of offset information from the two-dimensional position information of the feature points of the first image and the two-dimensional position information of the feature points of the M frames of images. The electronic device can therefore obtain the offset information between the first image and each key-frame image captured before the first image, and use it to obtain accurate target pose information, which can improve the accuracy of simultaneous localization and mapping.
The simultaneous localization and mapping (SLAM) method provided in the embodiments of the present application may be executed by a SLAM apparatus. The SLAM apparatus provided in the embodiments of the present application is described below by taking as an example the case in which the SLAM apparatus executes the SLAM method.
With reference to FIG. 2, an embodiment of the present application provides a simultaneous localization and mapping apparatus 20, which may include an acquisition module 21, a determining module 22, a fusion module 23, and a processing module 24. The determining module 22 may be configured to determine the initial pose information of the i-th frame image captured by the acquisition module 21 according to the final pose information of the (i-1)-th frame image captured by the acquisition module 21, where i is an integer greater than 1. The fusion module 23 may be configured to fuse the initial pose information of the i-th frame image with the first interpolation variable to obtain the final pose information of the i-th frame image. The first interpolation variable is the last interpolation variable obtained before the acquisition module captures the i-th frame image, and is the interpolation variable between the initial pose information of the first image and the target pose information. The first image is a key-frame image among the images captured by the acquisition module before the i-th frame image, and the target pose information is the pose information obtained by optimizing the initial pose information of the first image. The processing module 24 may be configured to perform simultaneous localization and mapping based on the final pose information of the i-th frame image and the i-th frame image.
In a possible implementation, the simultaneous localization and mapping apparatus 20 may further include an optimization module. The optimization module may be configured to, before the determining module 22 determines the initial pose information of the i-th frame image captured by the acquisition module 21 according to the final pose information of the (i-1)-th frame image captured by the acquisition module 21, optimize the initial pose information of the first image based on M pieces of pose information, M sets of offset information, and the initial pose information of the first image, to obtain the target pose information. The M pieces of pose information are the most recently optimized pose information of M frames of images, the M frames of images are the key-frame images among the images captured by the acquisition module 21 before the first image, and each set of offset information is the offsets of the feature points of the first image relative to the feature points of one of the M frames of images. The determining module 22 may be further configured to determine the first interpolation variable according to the first rotation coordinate and the first displacement coordinate in the initial pose information of the first image, and the second rotation coordinate and the second displacement coordinate in the target pose information.
In a possible implementation, the determining module 22 may be specifically configured to determine M sets of three-dimensional position information according to the M sets of offset information, where the M sets of offset information correspond one-to-one to the M sets of three-dimensional position information, and each set of three-dimensional position information is used to indicate the feature points in a three-dimensional map constructed based on one of the M frames of images. The optimization module may be specifically configured to use a preset bundle adjustment algorithm to process the M pieces of pose information, the initial pose information of the first image, and the M sets of three-dimensional position information, to obtain the target pose information.
In a possible implementation, the determining module 22 may be further configured to, before the optimization module optimizes the initial pose information of the first image based on the M pieces of pose information, the M sets of offset information, and the initial pose information of the first image to obtain the target pose information, determine the M sets of offset information according to the two-dimensional position information of the feature points of the first image and the two-dimensional position information of the feature points of the M frames of images.
In a possible implementation, the determining module 22 may be specifically configured to determine the target rotation coordinate according to the first rotation coordinate and the second rotation coordinate; to determine the target displacement coordinate according to the target rotation coordinate, the first displacement coordinate, and the second displacement coordinate; and to determine the target rotation coordinate and the target displacement coordinate as the first interpolation variable.
In a possible implementation, the processing module 24 may be specifically configured to process the final pose information of the (i-1)-th frame image using a filtering algorithm to obtain first pose information. The determining module 22 may be specifically configured to determine the final pose information of the (i-1)-th frame image as the initial pose information of the i-th frame image when the matching degree between the first pose information and the final pose information of the (i-1)-th frame image is less than or equal to a preset matching degree, and to determine the first pose information as the initial pose information of the i-th frame image when that matching degree is greater than the preset matching degree.
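The selection rule for the initial pose information can be sketched in Python as follows. How the matching degree is computed is not specified in this passage, so the sketch takes it as an input; the function and parameter names are hypothetical.

```python
def select_initial_pose(prev_final_pose, first_pose, matching_degree,
                        preset_matching_degree):
    """Choose the initial pose information of the i-th frame image.

    prev_final_pose: final pose information of the (i-1)-th frame image.
    first_pose: output of the filtering algorithm applied to prev_final_pose.
    matching_degree: agreement between the two, computed elsewhere.
    """
    if matching_degree <= preset_matching_degree:
        # Filter output is not trusted enough: reuse the previous final pose.
        return prev_final_pose
    # Filter output matches well enough: use it as the initial pose.
    return first_pose

low = select_initial_pose("pose_prev", "pose_filtered", 0.4, 0.5)
high = select_initial_pose("pose_prev", "pose_filtered", 0.9, 0.5)
```

The boundary case, in which the matching degree equals the preset matching degree, falls back to the final pose information of the (i-1)-th frame image, as stated in the text.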
In the simultaneous localization and mapping apparatus provided in the embodiments of the present application, the apparatus performs simultaneous localization and mapping based on the i-th frame image and on the pose information obtained by fusing the first interpolation variable with the initial pose information of the i-th frame image, where that initial pose information is determined from the final pose information of the (i-1)-th frame image. The first interpolation variable is the interpolation variable between the pose information, before and after optimization, of a key-frame image captured by the apparatus before the i-th frame image. On the one hand, the apparatus can therefore correct the initial pose of the i-th frame image, which can improve the accuracy of the tracked pose. On the other hand, the apparatus only needs to compute an interpolation variable for key-frame images, which can shorten the pose-tracking latency. In this way, the apparatus can output pose information at high frequency and with high accuracy, which can improve the quality of its simultaneous localization and mapping.
For the beneficial effects of the various implementations in this embodiment, refer to the beneficial effects of the corresponding implementations in the foregoing method embodiments; to avoid repetition, details are not repeated here.
The simultaneous localization and mapping apparatus in the embodiments of the present application may be an electronic device, or a component in an electronic device, such as an integrated circuit or a chip. The electronic device may be a terminal, or a device other than a terminal. Exemplarily, the electronic device may be a mobile phone, a tablet computer, a notebook computer, a handheld computer, a vehicle-mounted electronic device, a mobile Internet device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA); it may also be a server, a network attached storage (NAS) device, a personal computer (PC), a television (TV), a teller machine, or a self-service machine. This is not specifically limited in the embodiments of the present application.
The simultaneous localization and mapping apparatus in the embodiments of the present application may be an apparatus with an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
The simultaneous localization and mapping apparatus provided in the embodiments of the present application can implement each process implemented by the method embodiment of FIG. 1; to avoid repetition, details are not repeated here.
Optionally, as shown in FIG. 3, an embodiment of the present application further provides an electronic device 300, including a processor 301 and a memory 302. The memory 302 stores a program or instructions executable on the processor 301. When executed by the processor 301, the program or instructions implement the steps of the foregoing simultaneous localization and mapping method embodiments and can achieve the same technical effects; to avoid repetition, details are not repeated here.
It should be noted that the electronic devices in the embodiments of the present application include the mobile electronic devices and non-mobile electronic devices described above.
FIG. 4 is a schematic diagram of the hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 1000 includes, but is not limited to, a radio frequency unit 1001, a network module 1002, an audio output unit 1003, an input unit 1004, a sensor 1005, a display unit 1006, a user input unit 1007, an interface unit 1008, a memory 1009, a processor 1010, and other components.
Those skilled in the art will understand that the electronic device 1000 may further include a power supply (such as a battery) that supplies power to the components. The power supply may be logically connected to the processor 1010 through a power management system, so that charging, discharging, and power consumption are managed through the power management system. The structure shown in FIG. 4 does not limit the electronic device; the electronic device may include more or fewer components than shown, combine certain components, or arrange the components differently, and details are not described here.
The processor 1010 may be configured to determine the initial pose information of the i-th frame image captured by the sensor 1005 according to the final pose information of the (i-1)-th frame image captured by the sensor 1005, where i is an integer greater than 1. The processor 1010 may be further configured to fuse the initial pose information of the i-th frame image with the first interpolation variable to obtain the final pose information of the i-th frame image. The first interpolation variable is the last interpolation variable obtained before the acquisition module captures the i-th frame image, and is the interpolation variable between the initial pose information of the first image and the target pose information. The first image is a key-frame image among the images captured by the acquisition module before the i-th frame image, and the target pose information is the pose information obtained by optimizing the initial pose information of the first image. The processor 1010 may be further configured to perform simultaneous localization and mapping based on the final pose information of the i-th frame image and the i-th frame image.
In a possible implementation, the processor 1010 may be further configured to, before determining the initial pose information of the i-th frame image captured by the sensor 1005 according to the final pose information of the (i-1)-th frame image captured by the sensor 1005, optimize the initial pose information of the first image based on M pieces of pose information, M sets of offset information, and the initial pose information of the first image, to obtain the target pose information. The M pieces of pose information are the most recently optimized pose information of M frames of images, the M frames of images are the key-frame images among the images captured by the sensor 1005 before the first image, and each set of offset information is the offsets of the feature points of the first image relative to the feature points of one of the M frames of images. The processor 1010 may be further configured to determine the first interpolation variable according to the first rotation coordinate and the first displacement coordinate in the initial pose information of the first image, and the second rotation coordinate and the second displacement coordinate in the target pose information.
In a possible implementation, the processor 1010 may be specifically configured to determine M sets of three-dimensional position information according to the M sets of offset information, where the M sets of offset information correspond one-to-one to the M sets of three-dimensional position information, and each set of three-dimensional position information is used to indicate the feature points in a three-dimensional map constructed based on one of the M frames of images. The processor 1010 may be specifically configured to use a preset bundle adjustment algorithm to process the M pieces of pose information, the initial pose information of the first image, and the M sets of three-dimensional position information, to obtain the target pose information.
In a possible implementation, the processor 1010 may be further configured to, before optimizing the initial pose information of the first image based on the M pieces of pose information, the M sets of offset information, and the initial pose information of the first image to obtain the target pose information, determine the M sets of offset information according to the two-dimensional position information of the feature points of the first image and the two-dimensional position information of the feature points of the M frames of images.
In a possible implementation, the processor 1010 may be specifically configured to determine the target rotation coordinate according to the first rotation coordinate and the second rotation coordinate; to determine the target displacement coordinate according to the target rotation coordinate, the first displacement coordinate, and the second displacement coordinate; and to determine the target rotation coordinate and the target displacement coordinate as the first interpolation variable.
In a possible implementation, the processor 1010 may be specifically configured to process the final pose information of the (i-1)-th frame image using a filtering algorithm to obtain first pose information. The processor 1010 may be specifically configured to determine the final pose information of the (i-1)-th frame image as the initial pose information of the i-th frame image when the matching degree between the first pose information and the final pose information of the (i-1)-th frame image is less than or equal to a preset matching degree, and to determine the first pose information as the initial pose information of the i-th frame image when that matching degree is greater than the preset matching degree.
In the electronic device provided in the embodiments of the present application, the electronic device can perform simultaneous localization and mapping based on the i-th frame image and on the pose information obtained by fusing the first interpolation variable with the initial pose information of the i-th frame image, where that initial pose information is determined according to the final pose information of the (i-1)-th frame image, and the first interpolation variable is the interpolation variable between the initial pose information, before and after optimization, of an image that is a keyframe among the images captured by the electronic device before the i-th frame image. Therefore, on the one hand, the electronic device can correct the initial pose of the i-th frame image, which can improve the accuracy of the tracked pose; on the other hand, the electronic device only needs to compute interpolation variables for keyframe images, which can shorten the latency of pose tracking. In this way, the electronic device can output high-frequency, high-precision pose information, which in turn can improve the effect of simultaneous localization and mapping by the electronic device.
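The fusion of a frame's initial pose with the most recent interpolation variable can be sketched as applying a rigid correction to the initial pose. The composition rule used here (R_final = R_delta * R_init, t_final = R_delta * t_init + t_delta) is an assumption chosen for illustration; the application only states that the two are fused. The function and variable names are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def fuse(r_init: Matrix, t_init: Vector, r_delta: Matrix, t_delta: Vector) -> Tuple[Matrix, Vector]:
    """Correct the initial pose of frame i with the latest interpolation
    variable (r_delta, t_delta) obtained from the last optimized keyframe."""
    r_final = matmul(r_delta, r_init)
    t_final = [a + b for a, b in zip(matvec(r_delta, t_init), t_delta)]
    return r_final, t_final
```

Because the correction is a pair of small matrix operations, it can run on every frame, while the more expensive optimization that produces `(r_delta, t_delta)` runs only when a new keyframe arrives.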
For the beneficial effects of the various implementations in this embodiment, refer to the beneficial effects of the corresponding implementations in the foregoing method embodiments. To avoid repetition, details are not repeated here.
It should be understood that, in the embodiments of the present application, the input unit 1004 may include a graphics processing unit (GPU) 10041 and a microphone 10042. The graphics processing unit 10041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode. The display unit 1006 may include a display panel 10061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 1007 includes at least one of a touch panel 10071 and other input devices 10072. The touch panel 10071 is also called a touch screen and may include two parts: a touch detection device and a touch controller. The other input devices 10072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and on/off keys), a trackball, a mouse, and a joystick, which are not described in detail here.
The memory 1009 may be used to store software programs and various data. The memory 1009 may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, where the first storage area may store an operating system and the application programs or instructions required by at least one function (such as a sound playback function or an image playback function). In addition, the memory 1009 may include volatile memory or non-volatile memory, or both. The non-volatile memory may be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or flash memory. The volatile memory may be random access memory (Random Access Memory, RAM), static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDRSDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (Synch link DRAM, SLDRAM), or direct Rambus random access memory (Direct Rambus RAM, DRRAM). The memory 1009 in the embodiments of the present application includes, but is not limited to, these and any other suitable types of memory.
The processor 1010 may include one or more processing units. Optionally, the processor 1010 integrates an application processor and a modem processor, where the application processor mainly handles operations involving the operating system, the user interface, application programs, and the like, and the modem processor, such as a baseband processor, mainly handles wireless communication signals. It can be understood that the modem processor may alternatively not be integrated into the processor 1010.
An embodiment of the present application further provides a readable storage medium storing a program or instructions. When the program or instructions are executed by a processor, the processes of the foregoing simultaneous localization and mapping method embodiments are implemented, and the same technical effects can be achieved. To avoid repetition, details are not repeated here.
The processor is the processor in the electronic device described in the foregoing embodiments. The readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
An embodiment of the present application further provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the processes of the foregoing simultaneous localization and mapping method embodiments, achieving the same technical effects. To avoid repetition, details are not repeated here.
It should be understood that the chip mentioned in the embodiments of the present application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip.
An embodiment of the present application provides a computer program product. The program product is stored in a storage medium and is executed by at least one processor to implement the processes of the foregoing simultaneous localization and mapping method embodiments, achieving the same technical effects. To avoid repetition, details are not repeated here.
It should be noted that, in this document, the terms "comprise", "include", and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not preclude the presence of additional identical elements in the process, method, article, or apparatus that includes that element. In addition, it should be pointed out that the scope of the methods and apparatuses in the embodiments of the present application is not limited to performing functions in the order shown or discussed; functions may also be performed in a substantially simultaneous manner or in the reverse order, depending on the functions involved. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. Furthermore, features described with reference to certain examples may be combined in other examples.
From the description of the foregoing implementations, those skilled in the art can clearly understand that the methods of the foregoing embodiments may be implemented by software plus a necessary general-purpose hardware platform, and may of course also be implemented by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, may be embodied in the form of a computer software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the foregoing specific implementations, which are merely illustrative rather than restrictive. Inspired by the present application, those of ordinary skill in the art may devise many other forms without departing from the purpose of the present application and the scope protected by the claims, all of which fall within the protection of the present application.

Claims (17)

  1. A simultaneous localization and mapping method, the method comprising:
    determining initial pose information of a captured i-th frame image according to final pose information of a captured (i-1)-th frame image, where i is an integer greater than 1;
    fusing the initial pose information of the i-th frame image with a first interpolation variable to obtain final pose information of the i-th frame image, wherein the first interpolation variable is the last interpolation variable obtained before the i-th frame image is captured, the first interpolation variable is an interpolation variable between initial pose information of a first image and target pose information, the first image is an image that is a keyframe among the images captured before the i-th frame image, and the target pose information is pose information obtained by optimizing the initial pose information of the first image; and
    performing simultaneous localization and mapping based on the final pose information of the i-th frame image and the i-th frame image.
  2. The method according to claim 1, wherein before the determining initial pose information of the captured i-th frame image according to the final pose information of the captured (i-1)-th frame image, the method further comprises:
    optimizing the initial pose information of the first image based on M pieces of pose information, M sets of offset information, and the initial pose information of the first image to obtain the target pose information, wherein the M pieces of pose information are the most recently optimized pose information of M frames of images, the M frames of images are images that are keyframes among the images captured before the first image, and each set of offset information is the offset of the feature points of the first image relative to the feature points of one frame of the M frames of images; and
    determining the first interpolation variable according to a first rotation coordinate and a first displacement coordinate in the initial pose information of the first image, and a second rotation coordinate and a second displacement coordinate in the target pose information.
  3. The method according to claim 2, wherein the optimizing the initial pose information of the first image based on the M pieces of pose information, the M sets of offset information, and the initial pose information of the first image to obtain the target pose information comprises:
    determining M sets of three-dimensional position information according to the M sets of offset information, wherein the M sets of offset information correspond one-to-one to the M sets of three-dimensional position information, and each set of three-dimensional position information indicates feature points in a three-dimensional map constructed based on one frame of the M frames of images; and
    processing the M pieces of pose information, the initial pose information of the first image, and the M sets of three-dimensional position information using a preset bundle adjustment algorithm to obtain the target pose information.
  4. The method according to claim 3, wherein before the optimizing the initial pose information of the first image based on the M pieces of pose information, the M sets of offset information, and the initial pose information of the first image to obtain the target pose information, the method further comprises:
    determining the M sets of offset information according to two-dimensional position information of the feature points of the first image and two-dimensional position information of the feature points of the M frames of images.
  5. The method according to any one of claims 2 to 4, wherein the determining the first interpolation variable according to the first rotation coordinate and the first displacement coordinate in the initial pose information of the first image, and the second rotation coordinate and the second displacement coordinate in the target pose information comprises:
    determining a target rotation coordinate according to the first rotation coordinate and the second rotation coordinate;
    determining a target displacement coordinate according to the target rotation coordinate, the first displacement coordinate, and the second displacement coordinate; and
    determining the target rotation coordinate and the target displacement coordinate as the first interpolation variable.
  6. The method according to claim 1, wherein the determining initial pose information of the captured i-th frame image according to the final pose information of the captured (i-1)-th frame image comprises:
    processing the final pose information of the (i-1)-th frame image using a filtering algorithm to obtain first pose information;
    when the matching degree between the first pose information and the final pose information of the (i-1)-th frame image is less than or equal to a preset matching degree, determining the final pose information of the (i-1)-th frame image as the initial pose information of the i-th frame image; and
    when the matching degree between the first pose information and the final pose information of the (i-1)-th frame image is greater than the preset matching degree, determining the first pose information as the initial pose information of the i-th frame image.
  7. A simultaneous localization and mapping apparatus, the apparatus comprising a capture module, a determination module, a fusion module, and a processing module;
    the determination module is configured to determine initial pose information of an i-th frame image captured by the capture module according to final pose information of an (i-1)-th frame image captured by the capture module, where i is an integer greater than 1;
    the fusion module is configured to fuse the initial pose information of the i-th frame image with a first interpolation variable to obtain final pose information of the i-th frame image, wherein the first interpolation variable is the last interpolation variable obtained before the capture module captures the i-th frame image, the first interpolation variable is an interpolation variable between initial pose information of a first image and target pose information, the first image is an image that is a keyframe among the images captured by the capture module before the i-th frame image, and the target pose information is pose information obtained by optimizing the initial pose information of the first image; and
    the processing module is configured to perform simultaneous localization and mapping based on the final pose information of the i-th frame image and the i-th frame image.
  8. The apparatus according to claim 7, wherein the apparatus further comprises an optimization module;
    the optimization module is configured to, before the determination module determines the initial pose information of the i-th frame image captured by the capture module according to the final pose information of the (i-1)-th frame image captured by the capture module, optimize the initial pose information of the first image based on M pieces of pose information, M sets of offset information, and the initial pose information of the first image to obtain the target pose information, wherein the M pieces of pose information are the most recently optimized pose information of M frames of images, the M frames of images are images that are keyframes among the images captured by the capture module before the first image, and each set of offset information is the offset of the feature points of the first image relative to the feature points of one frame of the M frames of images; and
    the determination module is further configured to determine the first interpolation variable according to a first rotation coordinate and a first displacement coordinate in the initial pose information of the first image, and a second rotation coordinate and a second displacement coordinate in the target pose information.
  9. The apparatus according to claim 8, wherein the optimization module is specifically configured to: determine M sets of three-dimensional position information according to the M sets of offset information, wherein the M sets of offset information correspond one-to-one to the M sets of three-dimensional position information, and each set of three-dimensional position information indicates feature points in a three-dimensional map constructed based on one frame of the M frames of images; and process the M pieces of pose information, the initial pose information of the first image, and the M sets of three-dimensional position information using a preset bundle adjustment algorithm to obtain the target pose information.
  10. The apparatus according to claim 9, wherein the determination module is further configured to, before the optimization module optimizes the initial pose information of the first image based on the M pieces of pose information, the M sets of offset information, and the initial pose information of the first image to obtain the target pose information, determine the M sets of offset information according to two-dimensional position information of the feature points of the first image and two-dimensional position information of the feature points of the M frames of images.
  11. The apparatus according to any one of claims 8 to 10, wherein the determination module is specifically configured to: determine a target rotation coordinate according to the first rotation coordinate and the second rotation coordinate; determine a target displacement coordinate according to the target rotation coordinate, the first displacement coordinate, and the second displacement coordinate; and determine the target rotation coordinate and the target displacement coordinate as the first interpolation variable.
  12. The apparatus according to claim 7, wherein the determination module is specifically configured to: process the final pose information of the (i-1)-th frame image using a filtering algorithm to obtain first pose information; when the matching degree between the first pose information and the final pose information of the (i-1)-th frame image is less than or equal to a preset matching degree, determine the final pose information of the (i-1)-th frame image as the initial pose information of the i-th frame image; and when the matching degree between the first pose information and the final pose information of the (i-1)-th frame image is greater than the preset matching degree, determine the first pose information as the initial pose information of the i-th frame image.
  13. An electronic device, comprising a processor and a memory, wherein the memory stores a program or instructions executable on the processor, and when the program or instructions are executed by the processor, the steps of the simultaneous localization and mapping method according to any one of claims 1-6 are implemented.
  14. A readable storage medium storing a program or instructions, wherein when the program or instructions are executed by a processor, the steps of the simultaneous localization and mapping method according to any one of claims 1-6 are implemented.
  15. A computer software product, wherein the computer software product is executed by at least one processor to implement the simultaneous localization and mapping method according to any one of claims 1-6.
  16. An electronic device, wherein the electronic device is configured to execute the simultaneous localization and mapping method according to any one of claims 1-6.
  17. A chip, comprising a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the simultaneous localization and mapping method according to any one of claims 1-6.
PCT/CN2023/076247 2022-02-22 2023-02-15 Simultaneous localization and mapping method and apparatus, electronic device, and readable storage medium WO2023160445A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210163295.9 2022-02-22
CN202210163295.9A CN115205419A (en) 2022-02-22 2022-02-22 Instant positioning and map construction method and device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
WO2023160445A1 true WO2023160445A1 (en) 2023-08-31

Family

ID=83574247

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/076247 WO2023160445A1 (en) 2022-02-22 2023-02-15 Simultaneous localization and mapping method and apparatus, electronic device, and readable storage medium

Country Status (2)

Country Link
CN (1) CN115205419A (en)
WO (1) WO2023160445A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115205419A (en) * 2022-02-22 2022-10-18 维沃移动通信有限公司 Instant positioning and map construction method and device, electronic equipment and readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105527968A (en) * 2014-09-29 2016-04-27 联想(北京)有限公司 Information processing method and information processing device
US20200240793A1 (en) * 2019-01-28 2020-07-30 Qfeeltech (Beijing) Co., Ltd. Methods, apparatus, and systems for localization and mapping
CN112967340A (en) * 2021-02-07 2021-06-15 咪咕文化科技有限公司 Simultaneous positioning and map construction method and device, electronic equipment and storage medium
CN115205419A (en) * 2022-02-22 2022-10-18 维沃移动通信有限公司 Instant positioning and map construction method and device, electronic equipment and readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BARATH DANIEL; MISHKIN DMYTRO; EICHHARDT IVAN; SHIPACHEV ILIA; MATAS JIRI: "Efficient Initial Pose-graph Generation for Global SfM", 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE, 20 June 2021 (2021-06-20), pages 14541 - 14550, XP034008675, DOI: 10.1109/CVPR46437.2021.01431 *

Also Published As

Publication number Publication date
CN115205419A (en) 2022-10-18

Similar Documents

Publication Publication Date Title
US11295472B2 (en) Positioning method, positioning apparatus, positioning system, storage medium, and method for constructing offline map database
JP6824433B2 (en) Camera posture information determination method, determination device, mobile terminal and computer program
CN108447097B (en) Depth camera calibration method and device, electronic equipment and storage medium
WO2018119889A1 (en) Three-dimensional scene positioning method and device
CN107358633A (en) Join scaling method inside and outside a kind of polyphaser based on 3 points of demarcation things
Honegger et al. Real-time velocity estimation based on optical flow and disparity matching
CN114581532A (en) Multi-phase external parameter combined calibration method, device, equipment and medium
CN110880189A (en) Combined calibration method and combined calibration device thereof and electronic equipment
CN111028267B (en) Monocular vision following system and method for mobile robot
CN112489121A (en) Video fusion method, device, equipment and storage medium
WO2023160445A1 (en) Simultaneous localization and mapping method and apparatus, electronic device, and readable storage medium
CN113361365B (en) Positioning method, positioning device, positioning equipment and storage medium
CN113029128A (en) Visual navigation method and related device, mobile terminal and storage medium
Li et al. A binocular MSCKF-based visual inertial odometry system using LK optical flow
CN116051600A (en) Optimizing method and device for product detection track
CN112967340A (en) Simultaneous positioning and map construction method and device, electronic equipment and storage medium
Cheng et al. AR-based positioning for mobile devices
CN113610702B (en) Picture construction method and device, electronic equipment and storage medium
CN113838151A (en) Camera calibration method, device, equipment and medium
CN117437348A (en) Computing device and model generation method
WO2023088127A1 (en) Indoor navigation method, server, apparatus and terminal
Yii et al. Distributed visual processing for augmented reality
CN113628284B (en) Pose calibration data set generation method, device and system, electronic equipment and medium
CN115278084A (en) Image processing method, image processing device, electronic equipment and storage medium
CN115278049A (en) Shooting method and device thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23759070

Country of ref document: EP

Kind code of ref document: A1