CN112950710A - Pose determination method and apparatus, electronic device, and computer-readable storage medium
- Publication number: CN112950710A
- Application number: CN202110209686.5A
- Authority: CN (China)
- Prior art keywords: image, positioning data, processed, real, map
- Legal status: Pending
Classifications
- G: PHYSICS; G06: COMPUTING, CALCULATING OR COUNTING; G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00: Image analysis; G06T7/70: Determining position or orientation of objects or cameras; G06T7/73: using feature-based methods
- G06T11/00: 2D [Two Dimensional] image generation; G06T11/20: Drawing from basic elements, e.g. lines or circles; G06T11/206: Drawing of charts or graphs
Abstract
Embodiments of the invention provide a pose determination method and apparatus, an electronic device, and a computer-readable storage medium, relating to the field of image technology. The pose determination method includes: when visual tracking of an image to be processed fails, acquiring first positioning data of the image to be processed and second positioning data corresponding to historical images, where the first positioning data is positioning data acquired synchronously with the image to be processed; and selecting a reference image from the historical images according to the first positioning data and the second positioning data. By using the acquisition positions, a reference image that can be successfully matched with the image to be processed is found quickly, the pose information of the image to be processed is then calculated, and processing efficiency is improved.
Description
Technical Field
The invention relates to the field of image technology, and in particular to a pose determination method, a pose determination apparatus, an electronic device, and a computer-readable storage medium.
Background
The emergence of simultaneous localization and mapping (SLAM) technology enables unmanned devices (unmanned aerial vehicles, robots, and the like) to navigate autonomously in unknown environments, and has further promoted the development of such devices.
Currently, monocular visual SLAM is widely used; it determines poses and updates the map by tracking each acquired image frame. However, once tracking fails, the device must be relocalized in the map using a bag-of-words model. This bag-of-words approach typically forces the device to fly back over previously visited locations, which directly reduces the efficiency of completing the SLAM task.
Disclosure of Invention
In view of the above, the present invention provides a pose determination method, a pose determination apparatus, an electronic device, and a computer-readable storage medium.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical solutions:
in a first aspect, the present invention provides a pose determination method, including:
under the condition that visual tracking for an image to be processed fails, acquiring first positioning data of the image to be processed and second positioning data corresponding to a historical image;
selecting a reference image from the historical image according to the first positioning data and the second positioning data;
and calculating the pose information corresponding to the image to be processed according to the reference image and the image to be processed by combining a real-time map.
In an alternative embodiment, the step of selecting a reference image from the history images according to the first positioning data and the second positioning data includes:
acquiring second positioning data synchronously acquired with a historical image used for creating a real-time map;
and if the distance between the second positioning data of a frame of the historical images and the first positioning data is smaller than a preset distance, determining that historical image as the reference image.
In an alternative embodiment, the step of selecting a reference image from the history images according to the first positioning data and the second positioning data includes:
acquiring second positioning data synchronously acquired with a historical image used for creating a real-time map;
and according to the second positioning data acquired synchronously with the historical images, acquiring, in each of a plurality of directions, the historical image closest to the first positioning data corresponding to the image to be processed, to serve as the reference images.
In an optional implementation manner, in a case where the pose information is not calculated from the reference image and the image to be processed, the pose determination method further includes:
creating a new real-time map and acquiring the pose information corresponding to the image to be processed according to the image to be processed and the newly acquired image data;
acquiring a similarity transformation relation corresponding to the new real-time map; wherein the similarity transformation relation is used for converting the position data in the visual coordinate system into a world coordinate system;
and converting all the real-time maps into a world coordinate system by using the similarity transformation relation corresponding to each real-time map so as to obtain actual map data.
In an alternative embodiment, after all the real-time maps are converted into the world coordinate system, the pose determination method further includes:
judging whether different real-time maps are overlapped or not;
and if the first map and the second map which are overlapped exist, fusing the overlapped part between the first map and the second map to obtain the actual map data.
In an optional embodiment, the step of determining whether there is an overlap between the different real-time maps includes:
sequentially checking whether areas with the same world coordinate exist between two different real-time maps in a world coordinate system;
and if the areas with the same world coordinates exist, judging that the two real-time maps are overlapped.
In an optional embodiment, in the case where the pose information is calculated from the reference image and the image to be processed, the pose determination method further includes:
generating a local incremental map according to the reference image and the image to be processed;
and updating the real-time map according to the local incremental map.
In a second aspect, the present invention provides a pose determination apparatus comprising:
the first acquisition module is used for acquiring first positioning data of an image to be processed and second positioning data corresponding to a historical image under the condition that visual tracking of the image to be processed fails, wherein the first positioning data is positioning data acquired synchronously with the image to be processed;
the selection module is used for selecting a reference image from the historical image according to the first positioning data and the second positioning data;
and the calculation module is used for calculating the corresponding pose information of the image to be processed by combining a real-time map according to the reference image and the image to be processed.
In a third aspect, the present invention provides an electronic device, including a processor and a memory, where the memory stores machine executable instructions capable of being executed by the processor, and the processor can execute the machine executable instructions to implement the pose determination method according to any one of the foregoing embodiments.
In an optional implementation manner, the electronic device includes an unmanned device, the unmanned device is an aerial survey unmanned aerial vehicle with an autonomous positioning function, the aerial survey unmanned aerial vehicle collects corresponding first positioning data while collecting an image to be processed, and executes the pose determination method based on the first positioning data.
In a fourth aspect, the present invention provides a computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing the pose determination method according to any one of the preceding embodiments.
Compared with the prior art, the pose determination method provided by embodiments of the invention works as follows: when visual tracking of an image to be processed fails, a reference image is obtained from the historical images used to create the real-time map, according to the first positioning data of the image to be processed and the second positioning data of the historical images; the pose information corresponding to the image to be processed is then calculated from the reference image and the image to be processed, in combination with the real-time map. By using the distance between image acquisition positions, a reference image whose overlap rate meets the requirement is selected for matching, and tracking is completed. This quickly resolves the loss of pose tracking without flying back to re-find a trackable position, avoiding repeated work and improving operating efficiency.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 shows a schematic diagram of an electronic device provided by an embodiment of the present invention.
Fig. 2 shows a first step flowchart of the pose determination method provided by the embodiment of the present invention.
Fig. 3 illustrates one of the scenarios provided by the embodiment of the present invention for acquiring image data according to a working route and performing pose calculation.
Fig. 4 illustrates a second scenario in which image data is collected according to a working route and pose calculation is performed according to the embodiment of the present invention.
Fig. 5 is an exemplary diagram of determining a reference image.
Fig. 6 is another exemplary diagram of determining a reference image.
Fig. 7 shows a second step flowchart of the pose determination method according to the embodiment of the present invention.
Fig. 8 is a flowchart illustrating sub-steps of step S202 in fig. 7.
Fig. 9 shows a third step flowchart of the pose determination method according to the embodiment of the present invention.
Fig. 10 shows a schematic diagram of a pose determination apparatus provided by an embodiment of the present invention.
Reference numerals: 100-an electronic device; 101-a memory; 102-a communication interface; 103-a processor; 104-a bus; 400-pose determination apparatus; 401-a first obtaining module; 402-a selection module; 403-a calculation module.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It is noted that relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
Unmanned devices are widely used in many fields because they effectively reduce labor costs; examples include sweeping robots in the household field, and unmanned aerial vehicles and unmanned ground vehicles in the agricultural field. However, a prerequisite for unmanned devices to perform unmanned tasks is autonomous pose determination and navigation. In particular, achieving autonomous pose determination and navigation in unfamiliar environments is very important for unmanned devices.
In the related art, SLAM is the main technology for autonomous pose determination and navigation in unfamiliar environments: a video stream is acquired in real time and each image frame in the video is tracked; during tracking, a previously collected frame is selected as a key frame, and pose determination and map updating are performed based on the key frame. In theory, pose tracking can be completed this way for every acquired frame. However, if a frame fails to match the adjacent previously collected frame, tracking of that frame fails and its pose cannot be calculated.
Faced with such tracking loss, the related art relocalizes in the map using a bag-of-words model until relocalization succeeds. To relocalize, the camera must return to a place it has photographed before; only after a frame is successfully relocalized can tracking continue from it. Clearly, the poses of the frames captured between the loss of tracking and the successful relocalization are unavailable, and the return flight adds a great deal of unnecessary, repetitive work.
In order to solve the above problem, embodiments of the present invention provide a pose determination method, apparatus, electronic device, and computer-readable storage medium.
Referring to fig. 1, fig. 1 is a block diagram illustrating an electronic device 100 according to an embodiment of the invention.
The pose determination method and apparatus provided by the embodiments of the present invention may be applied to the electronic device 100. In some embodiments, the electronic device 100 may be a device that communicates with an unmanned device and receives the images the unmanned device returns, in order to perform the pose determination method; for example, a personal computer (PC), a server, or a distributed computer. It should be understood that the electronic device 100 is not limited to a physical device; it may also be a computer deployed on a physical device, a virtual machine built on a cloud platform, or the like, providing the same functions as such a server or virtual machine.
In some embodiments, the electronic device 100 may also be the unmanned device itself, in which case the electronic device 100 performs the pose determination method based on the images it captures. For example, the unmanned device may be an aerial survey unmanned aerial vehicle with an image acquisition module, which collects the image frames to be estimated and executes the pose determination method on them together with the corresponding map data. The unmanned aerial vehicle also has an autonomous positioning function; the positioning technology realizing it may be based on the Global Positioning System (GPS), the Global Navigation Satellite System (GLONASS), the COMPASS navigation system, the Galileo positioning system, the Quasi-Zenith Satellite System (QZSS), Wireless Fidelity (WiFi) positioning technology, the BeiDou satellite navigation and positioning system, high-precision Real-Time Kinematic (RTK) technology, and the like, or any combination thereof. One or more of the above positioning systems may be used interchangeably in this application. For ease of description, the embodiments of the present invention are mainly described with an unmanned device that collects positioning data using RTK.
The operating system of the electronic device 100 may be, but is not limited to, a Windows system, a Linux system, or the like. The electronic device 100 includes a memory 101, a communication interface 102, a processor 103, and a bus 104; the memory 101, the communication interface 102, and the processor 103 are connected via the bus 104, and the processor 103 executes executable modules, such as computer programs, stored in the memory 101. The memory 101 may include high-speed random access memory (RAM) and may also include non-volatile memory (e.g., at least one disk memory). Communication between the electronic device 100 and external devices is realized through at least one communication interface 102 (which may be wired or wireless).
The bus 104 may be an ISA bus, PCI bus, EISA bus, or the like. Only one bi-directional arrow is shown in fig. 1, but this does not indicate only one bus or one type of bus.
The memory 101 is used to store programs, such as the pose determination apparatus 400 shown in fig. 10. The pose determination apparatus 400 includes at least one software functional module that may be stored in the memory 101 in the form of software or firmware or solidified in the operating system (OS) of the electronic device 100. After receiving an execution instruction, the processor 103 executes the program to implement the pose determination method disclosed in the above embodiments of the present invention.
The processor 103 may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be carried out by integrated logic circuits of hardware or by software instructions in the processor 103. The processor 103 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
It should be understood that the structure shown in fig. 1 is only a schematic structural diagram of the electronic device 100, and the electronic device 100 may also include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1. The components shown in fig. 1 may be implemented in hardware, software, or a combination thereof.
Referring to fig. 2, fig. 2 shows a pose determination method according to an embodiment of the present invention. As shown in fig. 2, the pose determination method may include the steps of:
step S101, when the visual tracking of the image to be processed fails, acquiring first positioning data of the image to be processed and second positioning data corresponding to the historical image.
In one aspect, the image to be processed is a frame of image data acquired by the unmanned device. On the other hand, the image to be processed is also an image in which a visual tracking failure has occurred.
A visual tracking failure can be understood as a failure to match a frame of image data against the adjacent previously collected frame, so that the pose information corresponding to that frame (i.e., the image to be processed) cannot be calculated.
The first positioning data represent the position of the unmanned equipment when the image to be processed is acquired.
In the SLAM process, after the pose information corresponding to an acquired frame of image data is determined, the real-time map in the visual coordinate system of the unmanned device is created or updated based on that frame. In other words, the real-time map is constructed from image data whose pose information has been determined; therefore, in some embodiments, the historical images may be those frames, among the image data acquired by the unmanned device, whose pose information has been determined.
Of course, the image to be processed and the historical images are acquired successively by the unmanned device while it collects images along the same working path. The second positioning data represents the position of the unmanned device when a historical image was acquired.
Step S102, selecting a reference image from the historical images according to the first positioning data and the second positioning data.
Step S103, calculating the pose information corresponding to the image to be processed according to the reference image and the image to be processed, in combination with a real-time map.
In some embodiments, exploiting the fact that images with a higher overlap rate share a common view and are easier to match successfully, matched points between the image to be processed and the reference image are obtained in order to determine matched points between the image to be processed and the real-time map, from which the pose information of the image to be processed is calculated.
Implementation details of embodiments of the present invention are described below:
in some embodiments, the unmanned device may acquire images along a pre-planned working path. Although high-frequency image acquisition can reduce the possibility of losing visual tracking to some extent, it implies a large amount of image tracking and real-time map reconstruction, placing high demands on system resources and processing performance and making processing relatively time-consuming.
In the embodiment of the present invention, images may instead be captured discretely, to reduce the amount of image data processed during SLAM. Naturally, such discrete shooting still has requirements on the shooting frequency, for example ensuring that the overlap rate between two adjacent frames is not lower than a preset value. At different aerial photographing heights, different acquisition frequencies are needed to achieve the same overlap rate between adjacent frames. Accordingly, the discrete image acquisition frequency of the unmanned device can be determined from the aerial photographing height and the overlap rate required between two adjacent acquired frames, as the sketch below illustrates.
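The following is a minimal sketch of this relationship, not taken from the patent: it assumes a nadir-pointing camera with a known along-track field of view, and all function names and numeric values are illustrative.

```python
import math

def capture_spacing(height_m: float, fov_deg: float, overlap: float) -> float:
    """Along-track distance between consecutive shots that yields the
    requested overlap rate, for a nadir-pointing camera.

    height_m: aerial photographing height above ground (metres)
    fov_deg:  camera field of view along the flight direction (degrees)
    overlap:  required overlap rate between adjacent frames, e.g. 0.7
    """
    # Ground length covered by one image along the flight direction.
    footprint = 2.0 * height_m * math.tan(math.radians(fov_deg) / 2.0)
    # Advancing (1 - overlap) of a footprint keeps `overlap` of it shared.
    return (1.0 - overlap) * footprint

# At a 20 m height, a 60-degree along-track FOV, and 70% required overlap:
spacing = capture_spacing(20.0, 60.0, 0.70)   # ~6.9 m between shots
interval_s = spacing / 5.0                    # shot interval at 5 m/s ground speed
```

The acquisition frequency then follows directly from the shot spacing and the ground speed, as the last line shows.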
In some embodiments, after the unmanned device performs image acquisition according to a pre-planned operation path, the unmanned device performs instant positioning and map construction based on the acquired image data. Specifically, the following two scenarios can be classified:
in a first scenario, the drone may perform discrete image data acquisition while moving along a pre-planned work path. After the image data of the whole operation path is completely acquired, the pose information of the unmanned equipment is sequentially tracked when each frame of image data is acquired according to the acquisition sequence. Taking fig. 3 as an example, in the process of driving according to the working path, image data is sequentially collected, and after all image data are collected, each frame of image data is sequentially tracked.
In the second scenario, the unmanned device also performs discrete image data acquisition during movement along the pre-planned work path. Meanwhile, every time one frame of image data is acquired, pose estimation and map update of the unmanned aerial vehicle are performed based on the image data, for example, as shown in fig. 4.
In any of the above scenarios, the principle of implementing the pose determination method provided by the embodiment of the present invention is the same, and for convenience of description, the following description mainly takes fig. 4 as an example.
As described above, to reduce the amount of image data to be processed, the embodiment of the present invention may acquire images discretely; this also increases the possibility of visual tracking failure during simultaneous localization and mapping on the discretely acquired images. Once a visual tracking failure occurs, the above step S101 can be implemented as follows.
Step S101 may be implemented as: acquiring the second positioning data collected synchronously with the historical images used to create the real-time map, and the first positioning data collected synchronously with the image to be processed.
Understandably, while the unmanned device acquires images, RTK technology can determine the device's position coordinates in real space, that is, its positioning data, at the moment each frame is captured. For convenience of description, the positioning data acquired synchronously with the image to be processed is called the first positioning data, and the positioning data acquired synchronously with a historical image is called the second positioning data.
In one case, the positioning data may be acquired at a higher frequency than the image data, for example when a GPS sensor collects the positioning data. In this case, the acquired positioning data and image data need to be aligned in time, so that each frame of image data can be associated with the positioning data acquired at the same time.
In another case, one positioning sample may be acquired in lockstep with each frame of image data, so that the acquisition times of the image data and the positioning data agree at the hardware level, for example when RTK technology is used to collect the positioning data.
In some embodiments, the positioning data whose acquisition time point matches that of the image to be processed may be selected from the collected positioning data to serve as the first positioning data. In brief, when the unmanned device collects image data it also collects RTK data (positioning data) synchronously; consistency of acquisition times is achieved at the hardware level, so no time-alignment operation is needed when fusing the RTK data with the image data.
As described above, the above history images are images in which the poses have been determined and which have been used to construct a real-time map. For example, the other images except for the image frame a in fig. 4, or the image whose acquisition time point is before the image frame a in fig. 3.
Similarly, in some embodiments, the positioning data having the same acquisition time point can be matched from the acquired positioning data according to the acquisition time point corresponding to each frame of historical image, so as to serve as the second positioning data.
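For the first case above, where positioning data arrives at a higher rate than images, the time alignment can be as simple as a nearest-timestamp lookup. The sketch below is an illustration under stated assumptions, not the patent's method: the sorted-timestamp requirement and the 50 ms tolerance are both assumptions.

```python
from bisect import bisect_left

def match_positioning(image_ts: float, fix_ts: list[float], fixes: list[tuple],
                      tol_s: float = 0.05):
    """Return the positioning sample whose timestamp is nearest to image_ts,
    or None if no sample lies within the tolerance. fix_ts must be sorted."""
    i = bisect_left(fix_ts, image_ts)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(fix_ts)]
    if not candidates:
        return None
    best = min(candidates, key=lambda j: abs(fix_ts[j] - image_ts))
    return fixes[best] if abs(fix_ts[best] - image_ts) <= tol_s else None
```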
In some embodiments, the above step S102 can be implemented in various ways; the main ones are described below.
first implementation of step S102: and acquiring the distance between the second positioning data corresponding to each frame of historical image and the first positioning data. And if the distance between second positioning data and the first positioning data is smaller than the preset distance, determining the historical image corresponding to the second positioning data as a reference image.
The preset distance may be determined according to an overlap ratio and an aerial height required for successful matching between the images. Understandably, the farther the distance between the acquisition position points corresponding to different image data is, the lower the corresponding overlapping rate is under the same aerial photography height. Therefore, the corresponding relation between the distance between the acquisition position points and the overlapping rate of the acquired image data under different aerial photographing heights can be determined through a pre-test. Therefore, after the aerial photographing height of the unmanned equipment is determined, the corresponding preset distance can be obtained by combining the overlapping rate required by successful matching between the images.
For example, when the aerial photography height is 20 meters, and the distance between the collection position points is 1 meter, the overlap ratio between the two collected frames of image data is 80%, when the distance between the collection position points is 2 meters, the overlap ratio between the two collected frames of image data is 70%, and when the distance between the collection position points is 3 meters, the overlap ratio between the two collected frames of image data is 60%. If the overlap ratio between the different images reaches at least 70% to enable successful matching, the preset distance may be determined to be 2 meters.
Illustratively, building on fig. 4, as shown in fig. 5, a target spatial range with a radius equal to the preset distance may be determined, centered on the first positioning data of the image to be processed (i.e., image frame a). The historical images (image frames b, c, and d) whose second positioning data falls within the target spatial range are determined as reference images, as sketched below.
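A minimal sketch of this radius-based selection follows. It assumes the positioning data have already been expressed as metric 3D coordinates; the function and variable names are illustrative, not from the patent.

```python
import numpy as np

def select_references_by_radius(first_fix: np.ndarray,
                                history_fixes: np.ndarray,
                                preset_distance: float) -> np.ndarray:
    """Indices of historical images whose second positioning data lies within
    `preset_distance` of the first positioning data.

    first_fix:     (3,) position synchronized with the image to be processed
    history_fixes: (N, 3) positions synchronized with the historical images
    """
    dists = np.linalg.norm(history_fixes - first_fix, axis=1)
    return np.nonzero(dists < preset_distance)[0]
```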
The second implementation of step S102 may be: according to the second positioning data acquired synchronously with the historical images, acquiring, in each of a plurality of directions, the historical image closest to the first positioning data corresponding to the image to be processed, to serve as reference images.
Taking fig. 6 as an example, the plurality of directions may be, but not limited to, a front direction (a direction in which the work implement moves), a rear direction (a direction in which the work implement moves in an opposite direction), a left direction (a left side perpendicular to the direction in which the work implement moves), a right direction (a right side perpendicular to the direction in which the work implement moves), a front left direction (a direction between the front and the left), a rear left direction (a direction between the rear and the left), a front right direction (a direction between the front and the right), and a rear right direction (a direction between the rear and the right). In practical applications, the actual types of the above-mentioned directions can be set by the user.
In some embodiments, the historical images of the acquisition position points located in the respective directions of the image to be processed may be acquired separately. And determining the history image with the closest distance between the first positioning data corresponding to the image to be processed from the history images in each direction as a reference image. For example, in fig. 6, image frames b, c, d, and e are determined as reference images.
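The direction-based variant can be sketched as follows: bin the historical images into angular sectors around the image to be processed and keep the nearest image per sector. Eight sectors match the eight directions listed above; the particular binning scheme is an assumption made for illustration.

```python
import numpy as np

def select_references_by_direction(first_fix: np.ndarray,
                                   history_fixes: np.ndarray,
                                   n_sectors: int = 8) -> list[int]:
    """For each angular sector around the failed frame (eight sectors by
    default: front, front-right, right, ..., as in fig. 6), pick the nearest
    historical image, using the horizontal components of the positioning data."""
    offsets = history_fixes[:, :2] - first_fix[:2]
    dists = np.linalg.norm(offsets, axis=1)
    angles = np.arctan2(offsets[:, 1], offsets[:, 0])            # in (-pi, pi]
    sectors = ((angles + np.pi) / (2 * np.pi) * n_sectors).astype(int) % n_sectors
    best: dict[int, int] = {}
    for idx, (s, d) in enumerate(zip(sectors, dists)):
        if s not in best or d < dists[best[s]]:
            best[s] = idx                  # nearest image seen in this sector
    return sorted(best.values())
```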
Of course, in some specific embodiments, step S102 may also obtain, from the second positioning data of all the historical images, the target positioning data closest to the first positioning data, and use the historical image corresponding to that target positioning data as the reference image.
In some embodiments, the step S103 may include the following steps:
s103-1, determining a matching characteristic point pair between the image to be processed and the reference image through characteristic extraction and matching analysis aiming at the image to be processed and the reference image.
The matching feature point pair includes a first feature point extracted from the image to be processed and a second feature point extracted from the reference image. The matching degree between the two characteristic points corresponding to the matching characteristic point pairs exceeds the set requirement.
In some embodiments, such a SIFT algorithm may be adopted to realize the extraction of feature points in the image data (the image to be processed and the reference image).
S103-2, acquiring a target map point matched with the second feature point in the matched feature point pair from the real-time map.
S103-3, calculating pose information corresponding to the image to be processed according to the first feature point in the multiple groups of matched feature point pairs and the corresponding target map point.
In some embodiments, an initial pose may be calculated from the first feature points in the several groups of matched feature-point pairs and the corresponding target map points. For example, the initial pose can be solved with a Perspective-n-Point method (e.g., P3P). Note that P3P is a 3D-2D pose solver; that is, it solves the pose from known matched 3D points (map points) and 2D image points (feature points). Bundle adjustment (BA) is then used to minimize the reprojection error and optimize the camera pose, yielding the pose information corresponding to the image to be processed.
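As a sketch of this step under the assumption that OpenCV is used: `cv2.solvePnPRansac` runs a P3P-style solver inside RANSAC, and `cv2.solvePnPRefineLM` stands in for the full bundle-adjustment refinement the text describes. The patent does not prescribe these particular functions.

```python
import cv2
import numpy as np

def solve_pose(map_pts: np.ndarray, img_pts: np.ndarray,
               K: np.ndarray, dist: np.ndarray):
    """map_pts: (N, 3) target map points; img_pts: (N, 2) first feature points;
    K: 3x3 camera intrinsics. Returns (R, t), or None on failure."""
    obj = map_pts.astype(np.float64)
    img = img_pts.astype(np.float64)
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj, img, K, dist,
                                                 flags=cv2.SOLVEPNP_P3P)
    if not ok or inliers is None or len(inliers) < 4:
        return None                        # pose could not be computed
    # Local refinement minimizing reprojection error (stand-in for full BA).
    rvec, tvec = cv2.solvePnPRefineLM(obj[inliers[:, 0]], img[inliers[:, 0]],
                                      K, dist, rvec, tvec)
    return cv2.Rodrigues(rvec)[0], tvec
```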
In some embodiments, in the case where pose information is calculated from the reference image and the image to be processed, the pose determination method may further include:
and generating a local incremental map according to the reference image and the image to be processed. And updating the real-time map according to the local incremental map.
For example, the local incremental map may be generated as follows. First, target feature-point pairs are determined from the matched feature-point pairs between the reference image and the image to be processed; these are the pairs whose second feature point does not match any existing map point in the map data (in other words, neither feature point of a target pair corresponds to an existing map point). Second, the local incremental map is created from the target feature-point pairs. In some embodiments, triangulation may be used to generate new map points from the target feature-point pairs, thereby obtaining the local incremental map.
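A minimal sketch of the triangulation step, again assuming OpenCV; the (R, t) pose convention and the names are illustrative.

```python
import cv2
import numpy as np

def triangulate_new_points(K: np.ndarray,
                           pose_ref: tuple[np.ndarray, np.ndarray],
                           pose_cur: tuple[np.ndarray, np.ndarray],
                           pts_ref: np.ndarray, pts_cur: np.ndarray) -> np.ndarray:
    """Triangulate target feature-point pairs into new map points.

    pose_*: (R, t) world-to-camera poses of the reference image and the
            image to be processed; pts_*: (N, 2) matched pixel coordinates.
    """
    P_ref = K @ np.hstack([pose_ref[0], pose_ref[1].reshape(3, 1)])
    P_cur = K @ np.hstack([pose_cur[0], pose_cur[1].reshape(3, 1)])
    pts4d = cv2.triangulatePoints(P_ref, P_cur,
                                  pts_ref.T.astype(np.float64),
                                  pts_cur.T.astype(np.float64))
    return (pts4d[:3] / pts4d[3]).T        # (N, 3) new map points
```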
Of course, there is still a possibility that the pose information cannot be calculated according to the reference image and the image to be processed. If the pose information corresponding to the image data to be processed still cannot be calculated according to the determined reference image and the image to be processed, the following steps can be adopted as shown in fig. 7:
step S201, according to the image to be processed and the newly acquired image data, a new real-time map is created and pose information corresponding to the image to be processed is acquired.
The newly acquired image data are images whose pose information has not yet been determined and whose acquisition time points are after that of the image to be processed. Step S201 is equivalent to creating another real-time map (also called a sub-map) whose first frame is the image to be processed, and determining poses from it and the image data acquired afterwards. As long as no further visual tracking failure occurs on the images acquired after the image to be processed, the newly created real-time map is used for pose determination and map updating.
From then on, the original real-time map is no longer updated and is stored as an independent sub-map.
Step S202, acquiring a similarity transformation relation corresponding to the new real-time map.
The above-described similarity transformation relation is used to convert the position data in the visual coordinate system into the world coordinate system.
In some embodiments, the transformation between the visual coordinate system and the world coordinate system may be obtained by establishing a local coordinate system.
As shown in fig. 8, the step S202 may include the following sub-steps:
in the substep S202-1, positioning data corresponding to the first frame of image data for creating a new real-time map is obtained.
In the embodiment of the present invention, the first frame of image data for constructing the new real-time map may be an image to be processed.
Sub-step S202-2, establishing a local coordinate system with the positioning data corresponding to the first frame of image data as its origin.
That is, the first positioning data may be used as the origin to create a local coordinate system that can be converted to and from the world coordinate system, and the positioning data of the image data acquired after the image to be processed are projected into this local coordinate system. Note that the distance between each frame's coordinates in the local coordinate system and the local origin equals the distance, in the world coordinate system, between that frame's positioning data and the positioning data corresponding to the image to be processed.
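If the RTK fixes are geodetic (latitude, longitude, altitude) rather than already metric, a local tangent-plane projection anchored at the first frame gives exactly such a distance-preserving local coordinate system. The sketch below uses a flat-earth approximation; this is an assumption suited to survey-sized areas, and the geodetic input format itself is not specified by the patent.

```python
import numpy as np

EARTH_R = 6378137.0  # WGS-84 equatorial radius, metres

def rtk_to_local(origin_lla: np.ndarray, fixes_lla: np.ndarray) -> np.ndarray:
    """Project geodetic RTK fixes (lat, lon in degrees; alt in metres) into a
    local east-north-up frame anchored at the sub-map's first frame. Over a
    small survey area this preserves inter-point distances, as required above."""
    lat0 = origin_lla[0]
    d = fixes_lla - origin_lla
    east = np.radians(d[:, 1]) * EARTH_R * np.cos(np.radians(lat0))
    north = np.radians(d[:, 0]) * EARTH_R
    up = d[:, 2]
    return np.column_stack([east, north, up])
```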
And a substep S202-3 of fitting a similarity transformation relation according to the local coordinate system and the newly acquired image data.
The similarity transformation relation comprises a transformation mapping relation between the local coordinate system and the world coordinate system and a similarity transformation matrix between the visual coordinate system and the local coordinate system.
In some embodiments, the transformation mapping relationship may be determined according to the positioning data corresponding to the origin of the local coordinate system.
In some embodiments, the similarity transformation matrix may be determined as follows. Convert the positioning data of the newly acquired image data into the local coordinate system, denoting the result $t_{RTK}$. From the pose information of each newly acquired frame, obtain the coordinates of the camera center in the visual coordinate system at the moment that frame was acquired, denoted $t_{vis}$. A similarity transformation matrix, denoted $T_{v\_g}$, relates each $t_{vis}$ to the corresponding $t_{RTK}$; due to the presence of noise, there is a certain error $r$ between the two:

$$r = t_{vis} - T_{v\_g}\, t_{RTK}$$

By constructing the following optimization problem, the similarity transformation matrix $T_{v\_g}$ between the two coordinate systems can be calculated:

$$T_{v\_g} = \mathop{\arg\min}_{T_{v\_g}} \sum_{m=1}^{n} \lVert r_m \rVert^2$$

where $n$ is the total number of image frames used to build the new real-time map (counting the image to be processed and the newly acquired image data used for the map), $m$ is the index of an image frame, with different values of $m$ denoting different frames used to create the map, and $r_m$ is the error $r$ corresponding to the $m$-th frame used to create the new real-time map.
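This least-squares problem has a well-known closed-form solution (the Umeyama method) when the similarity transform is parameterized as a scale, rotation, and translation. The sketch below is one way to solve it; the patent states only the objective, so the choice of solver is an assumption.

```python
import numpy as np

def fit_similarity(t_vis: np.ndarray, t_rtk: np.ndarray):
    """Closed-form least-squares similarity transform (Umeyama, 1991) mapping
    local-frame positions t_rtk (n, 3) onto visual-frame camera centers
    t_vis (n, 3), i.e. minimizing sum ||t_vis - (s R t_rtk + t)||^2.
    Needs n >= 3 non-collinear points."""
    mu_v, mu_r = t_vis.mean(0), t_rtk.mean(0)
    Xv, Xr = t_vis - mu_v, t_rtk - mu_r
    U, S, Vt = np.linalg.svd(Xv.T @ Xr / len(t_vis))   # cross-covariance SVD
    D = np.eye(3)
    if np.linalg.det(U @ Vt) < 0:          # keep a proper rotation
        D[2, 2] = -1.0
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / (Xr ** 2).sum() * len(t_vis)
    t = mu_v - s * R @ mu_r
    return s, R, t                          # T_v_g as scale, rotation, translation
```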
The local coordinate system is established by taking positioning data corresponding to first frame image data for creating the real-time map as a starting point. Obviously, the local coordinate systems corresponding to different real-time maps are different, and thus, it is easy to deduce that the similarity transformation relations corresponding to different real-time maps are also different. Of course, the principle of calculating the similarity transformation relationships corresponding to different real-time maps is the same, and is not described herein again.
Step S203, all real-time maps are converted into a world coordinate system by using the corresponding similarity transformation relation of each real-time map so as to obtain actual map data.
When multiple real-time maps exist, the map points of each real-time map can be mapped into the world coordinate system, one map at a time, using that map's similarity transformation relation, and the real-time maps can then be merged in the world coordinate system to obtain the actual map data.
In some embodiments, each map point in the real-time map may be projected from the visual coordinate system to the corresponding local coordinate system by using the similarity transformation matrix in the similarity transformation relationship corresponding to the real-time map. And then, converting the map points in the local coordinate system into a world coordinate system by using the conversion mapping relation in the similarity transformation relation so as to obtain a real-time map in the world coordinate system.
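Chaining the two relations, a sub-map's points can be sent to the world frame as sketched below, using the (s, R, t) parameterization of $T_{v\_g}$ from the previous sketch and a pure-translation local-to-world mapping anchored at the sub-map origin; both conventions are illustrative assumptions.

```python
import numpy as np

def map_to_world(points_vis: np.ndarray, s: float, R: np.ndarray,
                 t: np.ndarray, world_origin: np.ndarray) -> np.ndarray:
    """Send map points (n, 3) from a sub-map's visual coordinate system to the
    world coordinate system: visual -> local via the inverse of T_v_g (fitted
    above as t_vis = s R t_local + t), then local -> world via the offset of
    the sub-map's origin, as in sub-step S202-2."""
    # invert t_vis = s R t_local + t  =>  t_local = (1/s) R^T (t_vis - t)
    points_local = (points_vis - t) @ R / s
    return points_local + world_origin
```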
In addition, there is a possibility that there is overlap between different real-time maps in the world coordinate system, so after all the real-time maps are converted into the world coordinate system, as shown in fig. 9, the above pose determination method further includes:
step S301, judging whether different real-time maps are overlapped under world coordinates.
In some embodiments, in a world coordinate system, whether regions with the same world coordinate exist between different two real-time maps is sequentially checked. And if the areas with the same world coordinates exist, judging that the two real-time maps are overlapped.
In other embodiments, map points in the edge regions of two different real-time maps may be matched against each other, and if more than a specified number of map points match, the two real-time maps are determined to overlap, as sketched below.
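A sketch of this overlap test, assuming SciPy's k-d tree is available; the distance tolerance and the minimum match count are illustrative parameters, not values from the patent.

```python
import numpy as np
from scipy.spatial import cKDTree

def maps_overlap(map_a: np.ndarray, map_b: np.ndarray,
                 same_point_tol: float = 0.5, min_matches: int = 50) -> bool:
    """Check whether two real-time maps overlap in the world coordinate
    system: count map points of B lying within `same_point_tol` metres of
    some point of A, mirroring the 'same world coordinate' test above."""
    tree = cKDTree(map_a)
    dists, _ = tree.query(map_b, distance_upper_bound=same_point_tol)
    return int(np.isfinite(dists).sum()) >= min_matches   # inf marks a miss
```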
In step S302, if there are a first map and a second map that overlap, the overlapping portion between the first map and the second map is fused to obtain actual map data.
In order to carry out the corresponding steps in the above embodiments and their possible variants, an implementation of the pose determination apparatus 400 is given below. Optionally, the pose determination apparatus 400 may adopt the device structure of the electronic device 100 shown in fig. 1. Further, referring to fig. 10, fig. 10 is a functional block diagram of a pose determination apparatus 400 according to an embodiment of the present invention. It should be noted that the basic principles and technical effects of the pose determination apparatus 400 provided by this embodiment are the same as those of the above embodiments; for brevity, whatever is not mentioned in this embodiment can be found in the corresponding content of the above embodiments. The pose determination apparatus 400 includes: a first obtaining module 401, a selection module 402, and a calculation module 403.
The first obtaining module 401 is configured to obtain first positioning data of an image to be processed and second positioning data corresponding to a historical image when visual tracking on the image to be processed fails, where the first positioning data is positioning data acquired synchronously with the image to be processed.
In some embodiments, the above step S101 may be performed by the first obtaining module 401.
A selecting module 402, configured to select a reference image from the history image according to the first positioning data and the second positioning data.
In some embodiments, the above step S102 may be performed by the selection module 402.
A calculating module 403, configured to calculate, according to the reference image and the to-be-processed image, pose information corresponding to the to-be-processed image in combination with the real-time map.
In some embodiments, the above step S103 may be performed by the calculation module 403.
In some embodiments, the above pose determination apparatus 400 further includes:
and the creating module is used for creating a new real-time map and acquiring the pose information corresponding to the image to be processed according to the image to be processed and the newly acquired image data.
In some embodiments, the step S201 may be performed by the creation module.
The second acquisition module is used for acquiring the similarity transformation relation corresponding to the new real-time map; the similarity transformation relation is used for converting the position data in the visual coordinate system into the world coordinate system.
In some embodiments, the step S202 may be performed by a second obtaining module.
And the mapping module is used for converting all the real-time maps into a world coordinate system by utilizing the similar transformation relation corresponding to each real-time map so as to obtain actual map data.
In some embodiments, the step S203 may be performed by a mapping module.
In some embodiments, the pose determination apparatus 400 further comprises:
and the creating module is used for generating a local incremental map according to the reference image and the image to be processed.
And the updating module is used for updating the real-time map according to the local incremental map.
Alternatively, the modules may be stored in the memory 101 shown in fig. 1 in the form of software or Firmware (Firmware) or be fixed in an Operating System (OS) of the electronic device 100, and may be executed by the processor 103 in fig. 1. Meanwhile, data, codes of programs, and the like required to execute the above modules may be stored in the memory 101.
In summary, embodiments of the present invention provide a pose determination method and apparatus, an electronic device, and a computer-readable storage medium. In the pose determination method, when visual tracking of an image to be processed fails, a reference image whose overlap rate with the image to be processed exceeds a preset value is obtained, according to the first positioning data of the image to be processed, from the historical images used to create the real-time map; the first positioning data is positioning data acquired synchronously with the image to be processed. The pose information corresponding to the image to be processed is then calculated from the reference image and the image to be processed, in combination with the real-time map. When tracking fails, no return flight is needed: the acquisition positions are used to quickly find a reference image that can be matched successfully, the pose information of the image to be processed is recalculated, and processing efficiency is improved.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (11)
1. A pose determination method, characterized by comprising:
under the condition that visual tracking for an image to be processed fails, acquiring first positioning data of the image to be processed and second positioning data corresponding to a historical image;
selecting a reference image from the historical image according to the first positioning data and the second positioning data;
and calculating the pose information corresponding to the image to be processed according to the reference image and the image to be processed by combining a real-time map.
2. The pose determination method according to claim 1, characterized in that the step of selecting a reference image from the history images according to the first positioning data and the second positioning data includes:
acquiring second positioning data synchronously acquired with a historical image used for creating a real-time map;
and if the distance between the second positioning data of a frame of the historical images and the first positioning data is smaller than a preset distance, determining that historical image as the reference image.
3. The pose determination method according to claim 1, characterized in that the step of selecting a reference image from the history images according to the first positioning data and the second positioning data includes:
acquiring second positioning data synchronously acquired with a historical image used for creating a real-time map;
and according to second positioning data acquired synchronously with the historical images, acquiring, in each of a plurality of directions, the historical image closest to the first positioning data corresponding to the image to be processed, to serve as the reference images.
4. The pose determination method according to claim 1, wherein in a case where the pose information is not calculated from the reference image and the image to be processed, the pose determination method further comprises:
creating a new real-time map and acquiring the pose information corresponding to the image to be processed according to the image to be processed and the newly acquired image data;
acquiring a similarity transformation relation corresponding to the new real-time map; wherein the similarity transformation relation is used for converting the position data in the visual coordinate system into a world coordinate system;
and converting all the real-time maps into a world coordinate system by using the similarity transformation relation corresponding to each real-time map so as to obtain actual map data.
5. The pose determination method according to claim 4, wherein after converting all the real-time maps into the world coordinate system, the pose determination method further comprises:
judging whether different real-time maps are overlapped or not;
and if the first map and the second map which are overlapped exist, fusing the overlapped part between the first map and the second map to obtain the actual map data.
6. The pose determination method according to claim 5, wherein the step of determining whether different real-time maps overlap comprises:
sequentially checking, in the world coordinate system, whether areas with the same world coordinates exist between two different real-time maps;
and if areas with the same world coordinates exist, judging that the two real-time maps overlap.
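One way to realize the claim-6 check is to discretize both maps' world-frame points onto a coarse grid and look for cells occupied by both; any shared cell is an area with the same world coordinates. A sketch of that idea, with the 5 m cell size being an illustrative assumption:

```python
import numpy as np

def find_overlap(points_a, points_b, cell_size=5.0):
    # Quantize horizontal world coordinates of each map into grid cells.
    cells_a = {tuple(c) for c in np.floor(points_a[:, :2] / cell_size).astype(int)}
    cells_b = {tuple(c) for c in np.floor(points_b[:, :2] / cell_size).astype(int)}
    # A non-empty intersection means the two real-time maps overlap; the
    # shared cells delimit the part to fuse into the actual map data.
    return cells_a & cells_b
```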
7. The pose determination method according to claim 1, wherein in a case where the pose information is calculated from the reference image and the image to be processed, the pose determination method further comprises:
generating a local incremental map according to the reference image and the image to be processed;
and updating the real-time map according to the local incremental map.
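A toy sketch of the claim-7 update, merging a local incremental map back into the real-time map once the pose has been recovered; the dict layout and the `id` field are hypothetical:

```python
def update_real_time_map(real_time_map, local_increment):
    # Append only keyframes the real-time map does not already contain,
    # then extend the map points with the newly triangulated ones.
    known_ids = {kf["id"] for kf in real_time_map["keyframes"]}
    real_time_map["keyframes"] += [kf for kf in local_increment["keyframes"]
                                   if kf["id"] not in known_ids]
    real_time_map["points"] += local_increment["points"]
    return real_time_map
```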
8. A pose determination apparatus, characterized by comprising:
a first acquisition module, configured to acquire, when visual tracking of an image to be processed fails, first positioning data of the image to be processed and second positioning data corresponding to historical images, the first positioning data being positioning data acquired synchronously with the image to be processed;
a selection module, configured to select a reference image from the historical images according to the first positioning data and the second positioning data;
and a calculation module, configured to calculate, according to the reference image and the image to be processed in combination with a real-time map, the pose information corresponding to the image to be processed.
9. An electronic device, comprising a processor and a memory, the memory storing machine-executable instructions that are executable by the processor to implement the pose determination method according to any one of claims 1 to 7.
10. The electronic device according to claim 9, wherein the electronic device comprises an unmanned device, the unmanned device being an aerial survey unmanned aerial vehicle with an autonomous positioning function; the aerial survey unmanned aerial vehicle acquires the corresponding first positioning data while acquiring the image to be processed, and executes the pose determination method based on the first positioning data.
11. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the pose determination method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110209686.5A CN112950710A (en) | 2021-02-24 | 2021-02-24 | Pose determination method and device, electronic equipment and computer readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112950710A | 2021-06-11 |
Family
ID=76246064
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110209686.5A | Pose determination method and device, electronic equipment and computer readable storage medium | 2021-02-24 | 2021-02-24 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112950710A (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170278231A1 (en) * | 2016-03-25 | 2017-09-28 | Samsung Electronics Co., Ltd. | Device for and method of determining a pose of a camera |
CN109141393A (en) * | 2018-07-02 | 2019-01-04 | 北京百度网讯科技有限公司 | Method for relocating, equipment and storage medium |
CN110533722A (en) * | 2019-08-30 | 2019-12-03 | 的卢技术有限公司 | A kind of the robot fast relocation method and system of view-based access control model dictionary |
CN112106113A (en) * | 2019-09-16 | 2020-12-18 | 深圳市大疆创新科技有限公司 | Method and device for determining pose information of image in three-dimensional reconstruction |
CN111780763A (en) * | 2020-06-30 | 2020-10-16 | 杭州海康机器人技术有限公司 | Visual positioning method and device based on visual map |
CN112197764A (en) * | 2020-12-07 | 2021-01-08 | 广州极飞科技有限公司 | Real-time pose determining method and device and electronic equipment |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113690162A (en) * | 2021-08-09 | 2021-11-23 | 深圳市华星光电半导体显示技术有限公司 | Grabbing method of alignment mark and alignment method of substrate |
CN113837246A (en) * | 2021-09-06 | 2021-12-24 | 广州极飞科技股份有限公司 | Image matching method and device and unmanned equipment |
CN114088103A (en) * | 2022-01-19 | 2022-02-25 | 腾讯科技(深圳)有限公司 | Method and device for determining vehicle positioning information |
CN114944015A (en) * | 2022-06-20 | 2022-08-26 | 商汤国际私人有限公司 | Image processing method and device, electronic equipment and storage medium |
Similar Documents
Publication | Title |
---|---|
CN107888828B (en) | Space positioning method and device, electronic device, and storage medium | |
US11313684B2 (en) | Collaborative navigation and mapping | |
US9709404B2 (en) | Iterative Kalman Smoother for robust 3D localization for vision-aided inertial navigation | |
CN112950710A (en) | Pose determination method and device, electronic equipment and computer readable storage medium | |
WO2019127445A1 (en) | Three-dimensional mapping method, apparatus and system, cloud platform, electronic device, and computer program product | |
CN109461208B (en) | Three-dimensional map processing method, device, medium and computing equipment | |
CN112197764B (en) | Real-time pose determining method and device and electronic equipment | |
JP2020067439A (en) | System and method for estimating position of moving body | |
CN111127524A (en) | Method, system and device for tracking trajectory and reconstructing three-dimensional image | |
CN111829532B (en) | Aircraft repositioning system and method | |
JP2013187862A (en) | Image data processing device, image data processing method, and program for image data processing | |
CN113048980B (en) | Pose optimization method and device, electronic equipment and storage medium | |
CN112556685B (en) | Navigation route display method and device, storage medium and electronic equipment | |
CN112284400B (en) | Vehicle positioning method and device, electronic equipment and computer readable storage medium | |
CN111768489B (en) | Indoor navigation map construction method and system | |
KR20190001086A (en) | Sliding windows based structure-less localization method using inertial and single optical sensor, recording medium and device for performing the method | |
CN112150550B (en) | Fusion positioning method and device | |
Zhao et al. | RTSfM: Real-time structure from motion for mosaicing and DSM mapping of sequential aerial images with low overlap | |
Luo et al. | Fast terrain mapping from low altitude digital imagery | |
Li-Chee-Ming et al. | UAV navigation system using line-based sensor pose estimation | |
CN117315015A (en) | Robot pose determining method and device, medium and electronic equipment | |
Andersson et al. | Simultaneous localization and mapping for vehicles using ORB-SLAM2 | |
CN113129422A (en) | Three-dimensional model construction method and device, storage medium and computer equipment | |
CN115638788A (en) | Semantic vector map construction method, computer equipment and storage medium | |
CN117132904A (en) | Real-time flight position positioning method and device, aircraft and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210611 |