CN111951158A - Recovery method and device for splicing interruption of aerial image of unmanned aerial vehicle and storage medium - Google Patents


Info

Publication number
CN111951158A
CN111951158A (application CN201910405078.4A)
Authority
CN
China
Prior art keywords
current frame
pose
orb
coordinate system
image
Prior art date
Legal status
Granted
Application number
CN201910405078.4A
Other languages
Chinese (zh)
Other versions
CN111951158B (en)
Inventor
易雨亭
李建禹
孙元栋
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikrobot Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Hikrobot Technology Co Ltd filed Critical Hangzhou Hikrobot Technology Co Ltd
Priority to CN201910405078.4A priority Critical patent/CN111951158B/en
Publication of CN111951158A publication Critical patent/CN111951158A/en
Application granted granted Critical
Publication of CN111951158B publication Critical patent/CN111951158B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4038 Image mosaicing, e.g. composing plane images from plane sub-images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras


Abstract

The application discloses a recovery method, device and storage medium for splicing interruption of aerial images of an unmanned aerial vehicle. Specifically: when the splicing of the aerial images of the unmanned aerial vehicle is interrupted, the unmanned aerial vehicle is instructed to return to the area to which the spliced image belongs; an image re-shot by the unmanned aerial vehicle in the area to which the spliced image belongs is acquired and taken as the current frame; repositioning is performed according to the current frame and the existing key frames to determine the pose of the current frame in a fitting plane coordinate system, where the fitting plane is generated by fitting the three-dimensional points corresponding to the Oriented FAST and Rotated BRIEF (ORB) feature points of the existing key frames; and the current frame and the spliced image continue to be spliced according to the pose of the current frame in the fitting plane coordinate system. By applying the technical scheme disclosed in the application, the splicing does not need to be redone in its entirety when it is interrupted; the splicing only needs to be continued, after repositioning, on the basis of what has already been spliced, so that the existing resources can be fully utilized and the efficiency of the splicing work is improved.

Description

Recovery method and device for splicing interruption of aerial image of unmanned aerial vehicle and storage medium
Technical Field
The application relates to the technical field of computer vision, in particular to a recovery method and a recovery device for splicing interruption of aerial images of an unmanned aerial vehicle and a storage medium.
Background
Conventional surveying and mapping techniques usually obtain information reflecting the shape and position of the ground by measuring its feature points and boundary lines using remote sensing, laser, ultrasound and the like. While highly accurate, these techniques are costly and take a long time from information acquisition to result generation. To address these shortcomings, an unmanned aerial vehicle is now used to carry out aerial photography, and the aerial images are spliced to generate a panoramic image.
Image splicing mainly refers to the process of splicing a group of images with partially overlapping areas into a more comprehensive panoramic image, which makes up for the limited field of view of a single image.
In the existing image splicing process, the splicing may be interrupted. The interruption can have several causes, for example frames may be lost when the unmanned aerial vehicle sends the aerial images to the ground system, or errors may accumulate excessively while the ground system performs the splicing. Whatever the cause, once the splicing is interrupted it cannot be continued. For such a situation the prior art has no effective solution; generally the only option is to splice everything again from scratch, which wastes the existing resources.
Disclosure of Invention
The application provides a recovery method for splicing interruption of aerial images of an unmanned aerial vehicle, which avoids the resource waste caused by re-splicing everything when the splicing is interrupted, so that existing resources are fully utilized and the efficiency of the splicing work is improved.
The recovery method for splicing interruption of aerial images of an unmanned aerial vehicle specifically includes:
when it is determined that the splicing of the aerial images of the unmanned aerial vehicle is interrupted, instructing the unmanned aerial vehicle to return to the area to which the spliced image belongs;
acquiring an image re-shot by the unmanned aerial vehicle in the area to which the spliced image belongs, and taking the re-shot image as the current frame;
repositioning according to the current frame and the existing key frames to determine the pose of the current frame in a fitting plane coordinate system, wherein the fitting plane is generated by fitting the three-dimensional points corresponding to the Oriented FAST and Rotated BRIEF (ORB) feature points of the existing key frames;
and continuing to splice the current frame and the spliced image according to the pose of the current frame in the fitting plane coordinate system.
The application also provides a recovery device for splicing interruption of aerial images of an unmanned aerial vehicle, which can avoid the problem of wasting resources by re-splicing everything when the splicing is interrupted, so that existing resources are fully utilized and the efficiency of the splicing work is improved. The recovery device for splicing interruption of aerial images of an unmanned aerial vehicle specifically includes:
the indication unit is used for indicating the unmanned aerial vehicle to return to the area to which the spliced image belongs when the splicing of the aerial image of the unmanned aerial vehicle is determined to be interrupted;
the acquisition unit is used for acquiring an image re-shot by the unmanned aerial vehicle in the region to which the spliced image belongs, and taking the re-shot image as a current frame;
a repositioning unit, configured to reposition according to the current frame and the existing key frames to determine the pose of the current frame in a fitting plane coordinate system, wherein the fitting plane is generated by fitting the three-dimensional points corresponding to the Oriented FAST and Rotated BRIEF (ORB) feature points of the existing key frames;
and the splicing unit is used for continuously splicing the current frame and the spliced image according to the pose of the current frame in the fitting plane coordinate system.
The application also provides a computer-readable storage medium storing computer instructions which can be executed by a processor to implement the above recovery method for splicing interruption of aerial images of an unmanned aerial vehicle.
The application also provides an electronic device, which can avoid the problem of resource waste caused by re-splicing everything when the splicing is interrupted, so that existing resources are fully utilized and the efficiency of the splicing work is improved. Specifically:
the electronic device comprises the computer-readable storage medium described above, and further comprises a processor that can execute the instructions in the computer-readable storage medium.
It can be seen from the above technical scheme that, after the image splicing is interrupted, the ground system can instruct the unmanned aerial vehicle to return to the area to which the spliced image belongs, and the unmanned aerial vehicle re-shoots an image in that area, which is taken as the current frame. The current frame is then repositioned and its pose in the fitting plane coordinate system is determined. Since the existing key frames are images captured before the interruption, the image re-captured in the spliced image area should be similar to the existing key frames, and the relationship between the current frame and the spliced image can be determined by repositioning accordingly. Thereafter, the current frame and the spliced image can continue to be spliced as before the interruption. Therefore, with the scheme of the application, the splicing does not need to be redone in its entirety when it is interrupted; it only needs to continue, after repositioning, on the already spliced basis, so that the existing resources can be fully utilized and the efficiency of the splicing work is improved.
Drawings
Fig. 1 is a flowchart of a first embodiment of the method of the present application.
Fig. 2 is a flowchart of a second embodiment of the method of the present application.
Fig. 3 is a flowchart of a method for selecting a candidate key frame according to a second embodiment of the present invention.
Fig. 4 is a flowchart of a method for determining a corresponding three-dimensional point of an ORB feature point of a current frame according to a second embodiment of the present application.
Fig. 5 is a flowchart of a method for further optimizing a current frame for the first time in the second embodiment of the present application.
Fig. 6 is a flowchart of a method for further optimizing a current frame for a second time in the second embodiment of the present application.
Fig. 7 is a flowchart of a method for further verifying the pose of the current frame in the second embodiment of the present application.
Fig. 8 is a flowchart of a method for transforming the pose of the current frame in the third embodiment of the present application.
Fig. 9 is a flowchart of a method for performing image stitching on a current frame according to a fourth embodiment of the present application.
Fig. 10 is a schematic structural diagram of a first embodiment of the apparatus of the present application.
Fig. 11 is a schematic diagram of an internal structure of a relocation unit 1002 according to a second embodiment of the present application.
Fig. 12 is a schematic structural diagram of a third embodiment of the apparatus of the present application.
Fig. 13 is a schematic diagram of the internal structure of the pose optimization unit 1005 in the third embodiment of the apparatus of the present application.
Fig. 14 is a schematic diagram of the internal structure of the pose verification unit 1006 in the third embodiment of the apparatus of the present application.
Fig. 15 is a schematic diagram of the internal structure of the pose conversion unit 1003 in the third embodiment of the apparatus of the present application.
Fig. 16 is a schematic diagram of the internal structure of a splicing unit 1004 in the third embodiment of the apparatus of the present application.
Fig. 17 is a schematic view of an internal structure of an electronic device in a fourth embodiment of the apparatus of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is further described in detail below by referring to the accompanying drawings and examples.
In practical applications, the unmanned aerial vehicle usually carries a pan-tilt camera and transmits the images captured by the pan-tilt camera to a ground system, which splices them to obtain a panoramic image. The ground system is a system that receives the aerial images of the unmanned aerial vehicle on the ground and splices them; in practice it may consist of one or more computers. The embodiments of the application are executed by the ground system to realize the recovery work after the image splicing is interrupted. In addition, in actual operation the unmanned aerial vehicle may also carry a Global Positioning System (GPS) receiver and transmit the corresponding GPS information to the ground system together with the images.
In the embodiments of the application, if the splicing is interrupted, the ground system can instruct the unmanned aerial vehicle to fly back to the region of the spliced image, where the unmanned aerial vehicle re-shoots an image that is taken as the current frame. Because the existing key frames are images shot before the interruption, a certain association should exist between the image re-shot in the spliced image area and the existing key frames, and the current frame is repositioned according to this association, i.e. the pose of the current frame is determined. The existing key frames are images that participated in the splicing process before the interruption and are therefore reliable, so the current frame pose obtained by repositioning based on the existing key frames is also reliable. Then, the current frame and the spliced image continue to be spliced according to the pose of the current frame. Therefore, when the splicing is interrupted, the embodiments of the application do not need to re-splice everything; the current frame only needs to be repositioned on the already spliced basis and the splicing then continues, so that existing resources can be fully utilized and the efficiency of the splicing work is improved.
Fig. 1 is a flowchart of a method according to a first embodiment of the present application. As shown in fig. 1, the method includes:
step 100: and when determining that the splicing of the aerial images of the unmanned aerial vehicle is interrupted, indicating the unmanned aerial vehicle to return to the area to which the spliced images belong.
This step is executed when the ground system determines that the splicing of the aerial images of the unmanned aerial vehicle is interrupted. Upon receiving the instruction from the ground system, the unmanned aerial vehicle re-shoots images in the area to which the spliced image belongs. In practical applications, the ground system can instruct the unmanned aerial vehicle to fly back to any position covered by the image spliced before the interruption; this counts as returning to the area to which the spliced image belongs.
Step 101: and acquiring an image re-shot by the unmanned aerial vehicle in the region to which the spliced image belongs, and taking the re-shot image as a current frame.
After the unmanned aerial vehicle flies back to the area where the spliced image is located, the similarity between the images it shoots and the images shot before the interruption is higher, which facilitates the subsequent repositioning. After re-shooting, the unmanned aerial vehicle transmits the captured image to the ground system just as it did before the splicing was interrupted, and in the embodiment of the application the ground system needs to continue splicing this frame.
Step 102: and repositioning according to the current frame and the existing key frame to determine the pose of the current frame under a fitting plane coordinate system, wherein the fitting plane is generated by fitting three-dimensional points corresponding to directional fast rotation (ORB) feature points of the existing key frame.
The unmanned aerial vehicle shoots continuously during aerial photography, so the change between adjacent images is small and follows a certain trend; the images that play a key role in this trend are called key frames in this application. Aerial photography means taking images with the camera mounted on the pan-tilt.
During aerial photography the unmanned aerial vehicle continuously changes its position and tilt angle, so consecutively shot images are related by rotation and translation, and each shot image has a corresponding position and attitude, i.e. the pose referred to in this step. As known to those skilled in the art, image splicing requires determining the pose of the current frame. Since the splicing has been interrupted, the pose of the re-shot current frame may not match the spliced image well, so this embodiment needs to reposition it before splicing can continue. The purpose of repositioning is to find a state in which splicing can continue, i.e. to determine the pose of the current frame so that the current frame and the spliced image can be merged into a more complete panoramic image; how the repositioning is performed is described in detail in the following embodiments.
In practical application, in the process that the unmanned aerial vehicle flies back to the region to which the spliced image belongs, the image shot at the edge of the region may be successfully relocated, or the image shot inside the region may be successfully relocated, and the location where the relocation occurs does not limit the protection range of the embodiment of the present application.
In addition, in order to reasonably splice the images independently shot by the unmanned aerial vehicle, a standard plane needs to be selected, and the shot images are firstly unified into the standard plane, so that the subsequent splicing is more convenient. Because the image shot by the unmanned aerial vehicle aims at the same actual scene, the two-dimensional pixel points in the image correspond to the three-dimensional points in the actual scene. Among the two-dimensional pixels, there are some special points that are relatively conspicuous in the image, such as contour points, bright points in darker areas, dark points in lighter areas, and the like, and these special points are called ORB feature points. Of course, these ORB feature points also correspond to three-dimensional points, and these three-dimensional points can be fit into a plane, and the fit plane can be used as a standard plane. Before the splicing work is interrupted, the acquired images are spliced under a fitting plane coordinate system, and then after the splicing work is interrupted, the unmanned aerial vehicle acquires the current frame again and needs to be converted into the fitting plane coordinate system to continue splicing with the spliced images.
Step 103: and continuing to splice the current frame according to the pose of the current frame in the fitting plane coordinate system and the spliced image.
As mentioned above, the unmanned aerial vehicle continuously changes its position and tilt angle during shooting, and a certain association exists between the current frame and the spliced image. This association is embodied in a rotation-and-translation transformation between the current frame and the existing image, and by using this transformation the current frame can be aligned with the corresponding part of the existing image, thereby realizing the splicing.
In practical applications, the real scale, i.e. the proportional size, can further be considered in the image splicing process. For example, a distance of 1 cm can be measured between two points in the panorama, but without knowing the scale it is impossible to tell what 1 cm on the panorama corresponds to in the real geographic environment. Therefore, in another preferred embodiment of the scheme, global positioning system information can be added so that the rendered panorama has a real scale. Specifically, when the unmanned aerial vehicle transmits the aerial images to the ground system, it also transmits the corresponding global positioning system information, which is given relative to the global positioning system coordinate system. Thus, when the ground system obtains the current frame image in step 101 of the first embodiment of the application, it also obtains the global positioning system information of the current frame.
In practical applications, the pose of an image aerial-photographed by the unmanned aerial vehicle may be represented in any coordinate system, for example a first keyframe camera coordinate system, which may be the camera coordinate system at the moment any particular key frame was shot, such as the camera coordinate system when the first key frame was shot or when the second key frame was shot, and so on. No matter which coordinate system the unmanned aerial vehicle uses to record the pose of the shot current frame, before the pose in the fitting plane coordinate system is obtained in step 103 of the embodiment of the application, the pose can first be converted into the global positioning system coordinate system and then converted from the global positioning system coordinate system into the fitting plane coordinate system. In this way the coordinate systems of the shot images are unified and the images have a real scale.
In order to better illustrate the scheme of the application, the following method embodiments two, three and four are used for detailed description of the repositioning, pose conversion and image splicing respectively.
Fig. 2 is a flowchart of a second embodiment of the method, and describes how to perform the relocation process, i.e., how to re-determine the pose of the current frame. According to the second embodiment of the method, the similarity between the key frames and the current frame is utilized to determine which key frames have a larger influence on the pose of the current frame. As mentioned above, because the aerial images of the unmanned aerial vehicle are continuously shot, if the similarity degree of the two frames of images is higher, the two frames of images are relatively adjacent, and the poses of the two frames of images are similar. Then, the key frames with high similarity can be used as a basis to determine the pose of the current frame, i.e. to complete the repositioning.
As shown in fig. 2, the repositioning of the current frame to determine its pose in the second method embodiment of the present application, i.e. the specific implementation of step 102, includes:
step 201: and carrying out image preprocessing on the acquired current frame.
The preprocessing described in this step is mainly to down-sample the image, so as to reduce the subsequent calculation amount. For example, the original image resolution is 3840 × 2160, and may be downsampled to 960 × 540 or 1280 × 1024. Of course, this step can be omitted if the calculation amount problem caused by the image resolution is not considered in practical application.
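As an illustration only, the following sketch shows such a preprocessing step, assuming OpenCV and NumPy are available; the file name and target resolution are placeholders rather than values prescribed by the application.

```python
import cv2

def preprocess_frame(image_path: str, target_size=(960, 540)):
    """Load the re-shot frame and downsample it to reduce the later ORB and matching cost."""
    img = cv2.imread(image_path)                      # e.g. a 3840x2160 aerial frame
    if img is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)      # ORB extraction works on grayscale
    return cv2.resize(gray, target_size, interpolation=cv2.INTER_AREA)

# current_gray = preprocess_frame("frame_000123.jpg")  # hypothetical file name
```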
Step 202: and performing similarity comparison on the current frame and each key frame in a key frame set, and taking the key frames exceeding a preset similarity threshold as candidate key frames, wherein the key frame set comprises all key frames generated in the splicing process.
As previously mentioned, the images that play a key role in the image splicing process are called key frames, and they are kept in the key frame set throughout the splicing process before the interruption. In this step, the current frame is compared for similarity with each key frame; the key frames with high similarity have a large influence on determining the pose of the current frame, so the key frames whose similarity exceeds the similarity threshold are taken as candidate key frames, which are used to determine the pose of the current frame. The similarity comparison and candidate key frame selection can be specifically implemented by steps 2021 to 2023 shown in fig. 3:
step 2021: and searching a pre-established ORB dictionary for each key frame in the key frame set to determine the same feature descriptor corresponding to the ORB feature point of the key frame and the ORB feature point of the current frame, wherein the feature descriptor is stored in the ORB dictionary.
Here, the ORB dictionary is introduced: as known to those skilled in the art, an ORB dictionary is a pre-established structure for storing ORB features. The ORB features can be extracted from ORB feature points of the image through the detection of the existing FAST algorithm, and the extracted ORB features can be represented by feature descriptors. The ORB features are clustered using feature descriptors, represented as k-ary trees with depth d. The leaf nodes of the k-ary tree are called words and are used for storing ORB feature descriptors, which are strings of several bits. The k-ary tree with depth d thus generated by the above method is commonly referred to as an ORB dictionary.
When the ORB dictionary is searched, the leaf nodes are searched layer by layer starting from the root node, and the words stored in the leaf nodes are taken as the search result. In this way, the current frame ORB feature points and key frame ORB feature points having the same ORB feature descriptors can be determined by looking up the ORB dictionary. For example: the feature descriptor of the i-th ORB feature point of the current frame is found in the ORB dictionary to be X, and the feature descriptor of the j-th ORB feature point of a certain key frame is also found to be X, so the two feature points correspond to the same feature descriptor.
Step 2022: and calculating the sum of the weights corresponding to the same feature descriptors, wherein the sum is used as the similarity of the key frame and the current frame, and the weights corresponding to the feature descriptors are set in advance.
In practical applications, different weights can be set for the feature descriptors corresponding to the ORB feature points to distinguish the importance of ORB feature point matches. For example: assume that n ORB feature points of a certain key frame and of the current frame share the same feature descriptors, where the feature descriptor index of the i-th ORB feature point of the current frame is Id_i and the feature descriptor index of the j-th ORB feature point of the key frame is Id_j, with Id_i = Id_j. The weight corresponding to the feature descriptor with index Id_i is w_Id_i and the weight corresponding to the feature descriptor with index Id_j is w_Id_j; since Id_i = Id_j, w_Id_i = w_Id_j. The sum W_word of the weights of these shared feature descriptors calculated in this step can therefore be expressed as:

W_word = Σ_i w_Id_i = Σ_j w_Id_j    (Equation 1)
After the sum of the weights corresponding to the shared feature descriptors is calculated, it can be used as the similarity between the key frame and the current frame. Naturally, the more feature descriptors the key frame and the current frame ORB feature points share, and the more important those descriptors are, the greater the similarity.
Step 2023: and in the key frame set, taking the key frames with the calculated similarity exceeding a preset similarity threshold as candidate key frames.
Through the above steps 2021 to 2023, the candidate key frames can be determined. In practical applications, the determined candidate key frames may be stored in a candidate key frame set KF_cand for subsequent use. It is precisely because these candidate key frames are very similar to the current frame that it is reasonable to base the determination of the pose of the current frame on them in the following steps.
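A minimal sketch of steps 2021 to 2023 is given below, assuming the ORB dictionary lookup has already reduced each frame to the set of word indices its descriptors fall into; the data structures and the threshold are illustrative and not from the application (a real system would typically use a k-ary vocabulary tree such as DBoW).

```python
from typing import Dict, List, Set

def similarity(frame_words: Set[int], keyframe_words: Set[int],
               word_weight: Dict[int, float]) -> float:
    """W_word: sum of the weights of the words the two frames share (Equation 1)."""
    return sum(word_weight.get(w, 0.0) for w in frame_words & keyframe_words)

def select_candidates(frame_words: Set[int],
                      keyframes: Dict[str, Set[int]],
                      word_weight: Dict[int, float],
                      sim_threshold: float) -> List[str]:
    """Keep the key frames whose similarity with the current frame exceeds the threshold."""
    return [kf_id for kf_id, kf_words in keyframes.items()
            if similarity(frame_words, kf_words, word_weight) > sim_threshold]
```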
Step 203: and matching according to the current frame and the candidate key frame to determine a three-dimensional point corresponding to the ORB characteristic point of the current frame.
As described above, the candidate key frame and the current frame are very similar and share a number of identical feature descriptors, which means that the candidate key frame ORB feature points and the current frame ORB feature points corresponding to those descriptors form feature matching pairs, and further that these key frame ORB feature points and current frame ORB feature points point to the same three-dimensional points in the photographed scene.
The ORB features can be extracted from ORB feature points through existing FAST algorithm detection, and data in the extracted ORB features includes feature descriptors. The feature distance, such as the hamming distance, between two ORB feature points can be measured by comparing the degree of difference between the two feature descriptors. If the feature distance is smaller than the preset feature distance threshold, the two ORB feature points may be considered as matching, which is a pair of feature matching pairs. Then, by comparing the ORB feature points in the two images of the candidate key frame and the current frame in step 203 completely in this way, several feature matching pairs can be obtained.
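As a sketch of this descriptor matching, assuming OpenCV's ORB implementation and a Hamming-distance brute-force matcher; the feature distance threshold of 50 is an assumed value, not one given in the application.

```python
import cv2

def match_orb(current_gray, keyframe_gray, dist_threshold=50):
    """Return (current keypoint, keyframe keypoint) pairs whose Hamming distance is below the threshold."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp_cur, des_cur = orb.detectAndCompute(current_gray, None)
    kp_kf, des_kf = orb.detectAndCompute(keyframe_gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_cur, des_kf)
    return [(kp_cur[m.queryIdx], kp_kf[m.trainIdx]) for m in matches
            if m.distance < dist_threshold]
```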
The specific method for determining the three-dimensional point corresponding to the ORB feature point of the current frame in this step can be implemented as step 2031 to step 2032 shown in fig. 4:
step 2031: and searching a pre-established ORB dictionary, determining the feature descriptors corresponding to the candidate key frame ORB feature points, and determining the current frame ORB feature points corresponding to the same feature descriptors.
Step 2032: and taking the three-dimensional point corresponding to the candidate key frame ORB characteristic point as the three-dimensional point corresponding to the current frame ORB characteristic point, wherein the candidate key frame ORB characteristic point and the current frame ORB characteristic point have the same characteristic descriptor.
That is, suppose that in a certain candidate key frame KF_cand_i of the candidate key frame set KF_cand, the ORB feature point KPT_i and a certain ORB feature point KPT_j of the current frame both correspond to the same feature descriptor X in the ORB dictionary, while the ORB feature point KPT_i of the candidate key frame is known to correspond to the three-dimensional point MPT_i; then the current frame ORB feature point KPT_j should also correspond to the three-dimensional point MPT_i.
According to step 2031 and step 2032, if there are corresponding three-dimensional points, three-dimensional points corresponding to all ORB feature points of the current frame can be determined.
Step 204: and calculating the pose of the current frame according to the three-dimensional points corresponding to the ORB feature points of the current frame, wherein the pose of the current frame is based on the pose of a first key frame camera coordinate system.
In practical applications, the camera coordinates when the first key frame is captured may be used as a reference standard, i.e. the first key frame camera coordinate system, and the subsequently captured images may be represented as images relative to the first key frame camera coordinate system. According to the setting, the current frame pose of the step is also relative to the pose of the first key frame camera coordinate system. Of course, in practical applications, the image captured by the pan/tilt head only needs to have a uniform coordinate system, and is not necessarily the first keyframe camera coordinate system.
The correspondence between a current frame ORB feature point and its corresponding three-dimensional point can be represented by Equation 2:

s · [u, v, 1]^T = K · T · [X, Y, Z, 1]^T    (Equation 2)

where [u, v, 1]^T represents the homogeneous coordinates of the current frame ORB feature point, [X, Y, Z, 1]^T represents the homogeneous coordinates of the corresponding three-dimensional point, K denotes the internal parameters of the pan-tilt camera, s denotes the scale, and T = [R | t] represents the current frame pose, which is a pose in the first keyframe camera coordinate system.
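A hedged sketch of step 204 follows: the current frame pose T = [R | t] is recovered from the 2D-3D correspondences of Equation 2, here with OpenCV's RANSAC PnP solver standing in for whatever solver the application intends, and assuming an undistorted pan-tilt camera with intrinsic matrix K.

```python
import cv2
import numpy as np

def estimate_pose(points_3d: np.ndarray, points_2d: np.ndarray, K: np.ndarray):
    """points_3d: Nx3 in the first keyframe camera frame; points_2d: Nx2 pixel coordinates."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        points_3d.astype(np.float64), points_2d.astype(np.float64),
        K, distCoeffs=None, reprojectionError=4.0)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)            # rotation vector -> 3x3 rotation matrix
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, tvec.ravel()
    return T                              # pose in the first keyframe camera coordinate system
```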
Step 205: and converting the pose of the current frame into the fitting plane coordinate system according to the conversion relation between the first key frame camera coordinate system and the fitting plane coordinate system.
The pose of the current frame can be obtained through the steps 201 to 205, and the subsequent pose conversion and image stitching process can be continued.
In practical application, the pose of the current frame can be optimized even for multiple times to reduce errors, and the pose conversion and image stitching processes are carried out after more accurate and reliable poses are obtained.
The following describes how to optimize the pose of the current frame.
Optimizing the pose for the first time:
Fig. 5 is a schematic diagram of the method for performing the first optimization on the pose of the current frame. In this method, if the calculated pose of the current frame is reliable, the projection point of the three-dimensional point corresponding to a current frame ORB feature point should coincide with that ORB feature point; the greater the pose error, the greater the distance between the projection point and the ORB feature point. The pose of the current frame can therefore be adjusted along this line of thought so that the distance between the projection points and the ORB feature points is minimized.
As shown in fig. 5, the first optimization of the pose of the current frame in the second method embodiment includes:
step 501: and projecting the three-dimensional point corresponding to the ORB characteristic point of the current frame onto the current frame to obtain a first projection point.
In practical applications, the coordinates of the projection point in this step can be represented by the following Equation 3:

s · [x_proj, y_proj, 1]^T = K · T · [X, Y, Z, 1]^T    (Equation 3)

where [x_proj, y_proj, 1]^T represents the homogeneous coordinates of the projection point, K represents the internal parameters of the pan-tilt camera, s represents the scale, T = [R | t] represents the pose of the current frame, and [X, Y, Z, 1]^T represents the homogeneous coordinates of the three-dimensional point. Typically, the current frame has hundreds or even thousands of ORB feature points, and correspondingly hundreds to thousands of three-dimensional points, all of which need to be projected onto the current frame. To distinguish them from the projection points formed by the projection operations of subsequent steps, the projection points described in this step are referred to as first projection points.
Step 502: and calculating the sum of pixel distances between all the first projection points and the image coordinates of the corresponding ORB feature points of the current frame as a first projection total error.
Since the three-dimensional points in step 501 are the three-dimensional points corresponding to the ORB feature points of the current frame, if the pose of the current frame is reasonable, the projection points of these three-dimensional points on the current frame should not be far from the corresponding ORB feature points. In practical applications, suppose the image coordinates of a certain ORB feature point of the current frame are (x, y) and the image coordinates of the projection point of the corresponding three-dimensional point are (x_proj, y_proj), and let w be the set weight. The error between the projection point and the current frame ORB feature point, i.e. the pixel distance between the two image coordinates, can then be expressed as:

e = w · [(x - x_proj), (y - y_proj)]^T    (Equation 4)

The pixel distance used here refers to the distance between two coordinate points on the pixel image and is not the same as the feature distance. Accordingly, the error between all projection points and the corresponding ORB feature points, i.e. the first projection total error E, is:

E = Σ_i e_i^T · e_i    (Equation 5)

where i denotes the i-th ORB feature point of the current frame, e_i denotes the error between the i-th ORB feature point and the corresponding projection point, and e_i^T denotes the transpose of e_i.
Step 503: and optimizing the pose of the current frame for the first time according to a nonlinear optimization algorithm to obtain the pose of the current frame optimized for the first time, so that the total error of the first projection is minimum.
As mentioned above, if the calculated pose of the current frame is reliable, the pixel distance between the projection point of its three-dimensional point and the corresponding ORB feature point should not be too large, and the total error of the first projection can be changed to the minimum by adjusting the pose of the current frame.
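The nonlinear optimization algorithm is not specified in the application; as one possible sketch of steps 501 to 503, the pose (parameterized here as an axis-angle rotation plus translation, which is my assumption) can be refined with a generic least-squares solver so that the total reprojection error of Equation 5 is minimized. The Huber loss is an added robustness choice, not part of the application.

```python
import cv2
import numpy as np
from scipy.optimize import least_squares

def refine_pose(rvec, tvec, points_3d, points_2d, K, weights=None):
    """Adjust the pose so that the weighted pixel distances between the first
    projection points and the current frame ORB feature points are minimal."""
    pts3 = np.asarray(points_3d, np.float64)
    pts2 = np.asarray(points_2d, np.float64)
    w = np.ones(len(pts2)) if weights is None else np.asarray(weights, np.float64)

    def residuals(params):
        r, t = params[:3], params[3:]
        proj, _ = cv2.projectPoints(pts3, r, t, K, None)        # first projection points
        return ((proj.reshape(-1, 2) - pts2) * w[:, None]).ravel()   # weighted errors e_i

    x0 = np.hstack([np.asarray(rvec, float).ravel(), np.asarray(tvec, float).ravel()])
    res = least_squares(residuals, x0, loss="huber", f_scale=2.0)
    return res.x[:3], res.x[3:]           # optimized rvec, tvec
```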
And (3) optimizing the pose for the second time:
in order to further reduce the error, the second optimization can be performed in the second embodiment. Fig. 6 is a schematic diagram of a method for performing second optimization on the pose of the current frame. The method adds more matching relations between ORB characteristic points and three-dimensional points to the current frame by using the candidate key frame selected in the step 202, and adjusts the pose of the current frame by using more three-dimensional points, thereby further reducing errors.
As shown in fig. 6, the method for performing the second optimization on the pose of the current frame in the embodiment of the method includes:
step 601: and projecting the three-dimensional points corresponding to the candidate key frames onto the current frame to form corresponding second projection points in the current frame.
The candidate key frames here are the candidate key frames selected in step 202 of the second method embodiment.
Step 602: and taking the ORB feature point of the current frame as the ORB feature point to be matched within the projection radius of the second projection point.
Step 603: and calculating the characteristic distance between the ORB characteristic points to be matched and the candidate key frame ORB characteristic points corresponding to the second projection point, and selecting the ORB characteristic point to be matched with the minimum characteristic distance as the selected ORB characteristic point to be matched.
In practical applications, there may be a plurality of ORB feature points within the projection radius of the projection point in the current frame, and only the ORB feature point with the minimum feature distance is selected here. The distance between two features, such as the hamming distance, can be measured by comparing the degree of difference between the two feature descriptors in step 603. It is noted that the feature distance described herein is different from the pixel distance described in step 502. The pixel distance is the distance between the image coordinates, and the feature distance is the size of the feature matching degree.
Step 604: and if the selected ORB feature point to be matched does not have a corresponding three-dimensional point and the feature distance is smaller than a set feature distance threshold value, forming a new matching pair by the selected ORB feature point to be matched and the three-dimensional point corresponding to the second projection point, so that the selected ORB feature point to be matched and the second projection point correspond to the same three-dimensional point.
As known to those skilled in the art, when the ORB feature points in the image are used to calculate the corresponding three-dimensional points, the three-dimensional points cannot be necessarily calculated, and some special cases of calculation failure may cause some ORB feature points in the image not to have corresponding three-dimensional points.
Here, if the feature distance is smaller than the preset feature distance threshold, the three-dimensional point projected from the candidate key frame onto the current frame and the ORB feature point to be matched selected in the current frame can be considered to match. For example: the ORB feature point KPT_i of a certain candidate key frame corresponds to the three-dimensional point MPT_i, and the projection point of this three-dimensional point on the current frame is p_proj_i. If the feature distance between the ORB feature point to be matched KPT_j selected within the projection radius and the candidate key frame ORB feature point KPT_i is smaller than the feature distance threshold, the current frame ORB feature point KPT_j and the three-dimensional point MPT_i are considered to correspond.
The above steps 601 to 604 are actually processes of adding more corresponding relations between ORB feature points and three-dimensional points to the current frame by using the candidate keyframes.
Step 605: and projecting all three-dimensional points corresponding to the ORB characteristic points of the current frame onto the current frame to obtain a third projection point.
In this step, the current frame is the current frame after the first optimization, which now has additional correspondences between ORB feature points and three-dimensional points, so more three-dimensional points are projected in this step. That is, all the three-dimensional points corresponding to the current frame ORB feature points in this step include: the three-dimensional points corresponding to the current frame ORB feature points in the first optimization process, and the three-dimensional points corresponding to the ORB feature points to be matched that were selected in the second optimization process.
Step 606: and calculating the sum of the pixel distances between the third projection point and the image coordinates of the corresponding ORB feature point of the current frame as a second projection total error.
The implementation of this step is similar to step 502 in the first optimization process described above, and is not described here again.
Step 607: and performing secondary optimization on the pose of the current frame optimized for the first time according to the nonlinear optimization algorithm to obtain the pose of the current frame optimized for the second time, so that the total error of the second projection is minimum.
In this embodiment, the above steps 605 to 607 are processes of performing optimization for the second time, and the method is the same as the first optimization method shown in fig. 5, except that on the basis of the first optimization, more corresponding relationships between ORB feature points and three-dimensional points are added to the current frame. More corresponding relations are used as reference bases, so that more accurate and reliable current frame poses can be obtained.
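A sketch of steps 601 to 604 is given below under the same assumptions as before (OpenCV ORB descriptors, illustrative radius and distance thresholds): the candidate key frame's three-dimensional points are projected into the once-optimized current frame, and for each second projection point the closest still-unmatched ORB feature point within the projection radius is accepted as a new matching pair. The argument layout (map points aligned with their key frame descriptors, a mask marking already-matched current frame points) is an assumption for illustration.

```python
import cv2
import numpy as np

def augment_matches(map_points, kf_descriptors, cur_kps, cur_descriptors,
                    cur_matched_mask, rvec, tvec, K,
                    radius_px=15.0, dist_threshold=50):
    """Return new (current keypoint index, map point index) pairs found by projection search."""
    proj, _ = cv2.projectPoints(np.asarray(map_points, np.float64), rvec, tvec, K, None)
    proj = proj.reshape(-1, 2)                               # second projection points
    new_pairs = []
    for mp_idx, (p, kf_des) in enumerate(zip(proj, kf_descriptors)):
        best, best_dist = -1, dist_threshold
        for kp_idx, kp in enumerate(cur_kps):
            if cur_matched_mask[kp_idx]:                     # already has a 3D point
                continue
            if np.hypot(kp.pt[0] - p[0], kp.pt[1] - p[1]) > radius_px:
                continue
            d = cv2.norm(cur_descriptors[kp_idx], kf_des, cv2.NORM_HAMMING)
            if d < best_dist:
                best, best_dist = kp_idx, d
        if best >= 0:
            new_pairs.append((best, mp_idx))                 # new 2D-3D matching pair
    return new_pairs
```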
Of course, in practical applications, if the problem of the pose error is not considered, the pose of the current frame may not be optimized, or the method of fig. 5 or fig. 6 may be used to perform the optimization only once.
Regardless of whether, or how many times, the calculated pose of the current frame is optimized, the second method embodiment can further verify the pose of the current frame after it has been calculated. Fig. 7 shows the method for further verifying the pose of the current frame in the second method embodiment. As shown in fig. 7, the method includes:
step 701: and calculating the number of the matching pairs of the ORB feature points of the current frame and the corresponding three-dimensional points.
In this step, when an ORB feature point of the current frame and a three-dimensional point form a corresponding relationship, the ORB feature point and the corresponding three-dimensional point are a matching pair. In practical application, a plurality of ORB feature points of the current frame can form matching pairs with the three-dimensional points respectively, and the number of the matching pairs can be determined.
Step 702: judging whether the number of the matching pairs is larger than a preset number threshold, if so, executing step 703; otherwise, step 704 is performed.
The pose of the current frame in the second method embodiment is calculated from the ORB feature points and their corresponding three-dimensional points. The more such correspondences, i.e. the more matching pairs there are, the more references the pose calculation of the current frame has, and the more reliable the calculated result can be considered. Therefore, in order to ensure that the pose of the current frame is accurate and reliable, a threshold on the number of ORB-feature-point-to-three-dimensional-point matching pairs can be set in advance, and the calculated pose of the current frame is considered valid only when the number of matching pairs exceeds this threshold.
Step 703: and determining that the pose of the current frame is effective, and finishing the repositioning process.
In the second embodiment of the method, the subsequent pose conversion and image stitching processes can be continuously executed after the repositioning process is finished.
Step 704: and determining that the pose of the current frame is invalid, and returning to the step of acquiring the image re-shot by the unmanned aerial vehicle in the region to which the spliced image belongs.
Returning to the step of acquiring the image re-shot by the unmanned aerial vehicle in the area to which the spliced image belongs means returning to step 101 of the first method embodiment. Of course, in practical applications the pose of the current frame may also not be verified, and the determined pose of the current frame is directly subjected to pose conversion, i.e. the process shown in fig. 7 is omitted.
Fig. 8 is a flowchart of a third embodiment of the method of the present application. Method embodiment three describes how to convert the current frame pose from the first keyframe camera coordinate system into the fitting plane coordinate system to facilitate the subsequent splicing. In this embodiment, suppose the calculated pose of the current frame is T_i^w1, where i denotes the current frame and w1 denotes the first keyframe camera coordinate system, i.e. the current frame pose is a pose in the first keyframe camera coordinate system. As described above, in order to give the rendered panorama a real scale, the global positioning system information of the image is also obtained. Then, besides being expressed as a pose in the first keyframe camera coordinate system, the pose of the current frame can also be expressed as a pose in the global positioning system coordinate system; in this embodiment the pose of the current frame in the global positioning system coordinate system is denoted T_i^w, where i denotes the current frame and w denotes the global positioning system coordinate system.
As shown in fig. 8, the method for performing posture conversion according to the third embodiment of the present method includes:
step 801: and converting the pose of the current frame into the coordinate system of the global positioning system according to the conversion relation between the camera coordinate system of the first key frame and the coordinate system of the global positioning system.
In practical applications, the received current frame pose can be recorded in SE3 form as T_i^w1, where i denotes the current frame and w1 denotes the first keyframe camera coordinate system, i.e. the current frame pose is a pose in the first keyframe camera coordinate system. At the same time, the pose in the global positioning system coordinate system is recorded as T_i^w, where i denotes the current frame and w denotes the global positioning system coordinate system. The pose T_i^w1 in the first keyframe camera coordinate system can be calculated by the methods of the above embodiments of the application, while the pose T_i^w in the global positioning system coordinate system can be obtained directly from the global positioning system information transmitted by the unmanned aerial vehicle. When enough key frames have been received, the transformation relationship between the two coordinate systems can be calculated using the following Equation 6:

T_w_w1 = argmin_T Σ_i w_i · || T_i^w - T · T_i^w1 ||^2    (Equation 6)

where T_i^w1 represents the pose in the first keyframe camera coordinate system, T_i^w represents the pose in the global positioning system coordinate system, w_i represents the weight, min represents minimization, and T_w_w1 represents the transformation between the global positioning system coordinate system and the poses in the first keyframe camera coordinate system. Equation 6 seeks the T_w_w1 that minimizes the overall error; this T_w_w1 is the transformation relation between the global positioning system coordinate system and the first keyframe camera coordinate system.
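The application does not spell out the solver for Equation 6. One common stand-in, sketched below, aligns only the key frame positions (ignoring the orientations and the weights w_i) with an Umeyama/Kabsch fit, yielding a scaled rigid transform from the first keyframe camera frame to the global positioning system frame; treat it as an illustrative simplification rather than the patented procedure.

```python
import numpy as np

def align_camera_to_gps(p_cam: np.ndarray, p_gps: np.ndarray):
    """p_cam, p_gps: Nx3 key frame positions in the two frames.
    Returns s, R, t such that p_gps ≈ s * R @ p_cam + t."""
    mu_c, mu_g = p_cam.mean(0), p_gps.mean(0)
    Xc, Xg = p_cam - mu_c, p_gps - mu_g
    U, S, Vt = np.linalg.svd(Xg.T @ Xc / len(p_cam))   # cross-covariance of the two point sets
    D = np.eye(3)
    if np.linalg.det(U @ Vt) < 0:                      # keep a right-handed rotation
        D[2, 2] = -1
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / (Xc ** 2).sum() * len(p_cam)
    t = mu_g - s * R @ mu_c
    return s, R, t
```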
Assuming that the transformation relationship between the global positioning system coordinate system and the first keyframe camera coordinate system has been determined in advance, in this step the current frame pose in the first keyframe camera coordinate system can be converted into the global positioning system coordinate system by the following Equation 7:

T_i^w = T_w_w1 · T_i^w1    (Equation 7)

where T_w_w1 is the known transformation relationship between the global positioning system coordinate system and the first keyframe camera coordinate system, and T_i^w1 is the pose of the current frame in the first keyframe camera coordinate system calculated by inter-frame tracking. The pose of the current frame can then be converted by Equation 7 into the pose T_i^w in the global positioning system coordinate system.
Step 802: and converting the pose of the current frame from the global positioning system coordinate system to the fitting plane coordinate system according to the conversion relation between the global positioning system coordinate system and the fitting plane coordinate system.
It is assumed here that, after the rotation matrix R and the translation vector t of the fitting plane coordinate system have been calculated, the fitting plane coordinate system can be expressed in SE3 form as T_p = [R, t; 0, 1]. The transformation relationship between the poses in the global positioning system coordinate system and in the fitting plane coordinate system can therefore be expressed in advance by Equation 8:

T_p_w = T_p · T_w_w1    (Equation 8)

where T_w_w1 represents the transformation between the global positioning system coordinate system and the first keyframe camera coordinate system, T_p represents the fitting plane coordinate system, and T_p_w represents the transformation relationship between the global positioning system coordinate system and the fitting plane coordinate system.
Assuming that the transformation relationship between the global positioning system coordinate system and the fitting plane coordinate system has been determined in advance according to Equation 8, in this step the current frame pose in the global positioning system coordinate system can be converted into the fitting plane coordinate system by the following Equation 9:

T_i^p = T_p_w · T_i^w    (Equation 9)

where T_p_w represents the transformation relationship between the global positioning system coordinate system and the fitting plane coordinate system, T_i^w represents the pose of the current frame in the global positioning system coordinate system, and T_i^p represents the pose of the current frame in the fitting plane coordinate system. From Equation 9, the rotation matrix R_i^p and the translation vector t_i^p of the current frame in the fitting plane coordinate system can be obtained.
The third embodiment of the method converts the calculated pose of the current frame into a fitting plane coordinate system. Because the shot images are represented by a uniform plane coordinate system, the images can be spliced conveniently.
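As a small sketch of this pose chain (Equations 7 and 9), assuming all transforms are available as 4x4 homogeneous SE3 matrices:

```python
import numpy as np

def to_fitting_plane(T_i_w1: np.ndarray, T_w_w1: np.ndarray, T_p_w: np.ndarray):
    """T_i_w1: current frame pose in the first keyframe camera frame;
    T_w_w1: first-keyframe-camera-to-GPS transform; T_p_w: GPS-to-fitting-plane transform."""
    T_i_w = T_w_w1 @ T_i_w1              # Equation 7
    T_i_p = T_p_w @ T_i_w                # Equation 9
    return T_i_p[:3, :3], T_i_p[:3, 3]   # rotation R_i^p and translation t_i^p
```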
Fig. 9 is a flowchart of a method for implementing an image stitching process according to a fourth embodiment of the present application, and as shown in fig. 9, the method includes:
step 901: and calculating the homographic transformation relation between the current frame and the spliced image.
In practical applications, because the difference between consecutively shot images of the unmanned aerial vehicle is very small, the photographed scene can be considered to lie in the same plane, which satisfies the homography transformation condition. The homography transformation can be expressed by the following Equation 10:

H = K · [r1, r2, t_i^p]    (Equation 10)

where K represents the internal parameters of the camera used by the unmanned aerial vehicle, r1 and r2 respectively represent the first and second columns of the already calculated rotation matrix R_i^p in the fitting plane coordinate system, t_i^p represents the calculated translation vector in the fitting plane coordinate system, and H represents the homography transformation relationship between the current frame and the spliced image.
Step 902: and determining the coordinates of the four corner points of the current frame in the existing image according to the homography transformation relation.
In order to splice the current frame into the existing image, the correspondence between its 4 corner points and the coordinates in the existing image needs to be determined; this relationship can be expressed by the following Equation 11:

s · (x', y', 1)^T = inv(H) · (x, y, 1)^T    (Equation 11)

where (x, y, 1) represents the homogeneous coordinates of a corner point in the current frame image, (x', y', 1) represents the homogeneous coordinates of the corner point in the existing image, H represents the homography transformation between the current frame and the existing image, inv represents the matrix inversion function, and s represents the scale. After the coordinates of the 4 corner points in the existing image are determined, the following steps can be used for splicing.
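A sketch of steps 901 and 902 under the above notation follows; the inv(H) direction is taken from Equation 11, and NumPy is assumed.

```python
import numpy as np

def corners_in_mosaic(K: np.ndarray, R_i_p: np.ndarray, t_i_p: np.ndarray,
                      width: int, height: int) -> np.ndarray:
    """Build H = K·[r1 r2 t] (Equation 10) and map the current frame's four corners
    into the spliced image (Equation 11, up to the scale s)."""
    H = K @ np.column_stack([R_i_p[:, 0], R_i_p[:, 1], t_i_p])
    corners = np.array([[0, 0, 1], [width - 1, 0, 1],
                        [width - 1, height - 1, 1], [0, height - 1, 1]], float).T
    mapped = np.linalg.inv(H) @ corners
    return (mapped[:2] / mapped[2]).T    # 4x2 corner coordinates in the existing image
```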
Step 903: and determining the pixel value of the extension part after splicing from the spliced image according to the homography transformation relation between the current frame and the spliced image.
Since the four corner coordinates were determined in the stitched image in step 902, the region they enclose is the portion to be stitched. Its pixels can be filled directly with the pixel values at the corresponding coordinates of the current frame, or filled by interpolation. For example, for a coordinate point in the extended portion of the existing image, the corresponding point in the current frame can be calculated with equation 12:

$s \, (x, y, 1)^T = H \, (x', y', 1)^T$  (Equation 12)

Equation 12 follows directly from equation 11; as before, (x, y, 1) represents the homogeneous coordinate of a point in the current frame image, (x', y', 1) represents the homogeneous coordinate of the corresponding point in the existing image, H represents the homography between the current frame and the existing image, and s represents the scale. That is, when the pixel value of a coordinate point in the extended portion of the existing image needs to be filled, the corresponding point in the current frame is determined using equation 12, and the pixel values of the four pixels around that point are weighted and averaged to obtain the pixel value to be filled.
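The backward mapping and weighted (bilinear) averaging described above can be sketched as follows; the region argument and array layout are illustrative assumptions:

```python
import numpy as np

def fill_extension(mosaic, current, H, region):
    """Equation 12 (sketch): for each integer coordinate (x', y') in the extension
    region of the mosaic, map it back into the current frame with H and fill the
    pixel with a bilinear (weighted) average of the four neighbouring pixels."""
    h, w = current.shape[:2]
    for xp, yp in region:                         # region: iterable of mosaic coordinates
        p = H @ np.array([xp, yp, 1.0])
        x, y = p[0] / p[2], p[1] / p[2]           # divide by the scale s
        x0, y0 = int(np.floor(x)), int(np.floor(y))
        if x0 < 0 or y0 < 0 or x0 + 1 >= w or y0 + 1 >= h:
            continue                              # mapped point falls outside the current frame
        dx, dy = x - x0, y - y0
        top = (1 - dx) * current[y0, x0] + dx * current[y0, x0 + 1]
        bottom = (1 - dx) * current[y0 + 1, x0] + dx * current[y0 + 1, x0 + 1]
        mosaic[yp, xp] = (1 - dy) * top + dy * bottom
    return mosaic
```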
Thus, through the fourth embodiment of the present application, the current frame is stitched into the existing image to form a larger image, on the basis of which stitching can continue. In practical application, since part of the stitched image may overlap the current frame image, a pixel fusion method, such as the Laplacian of Gaussian pyramid method, may be used to fuse the pixels in the overlapping region so that the seam is not obvious and a smoother image is obtained.
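For reference, a compact sketch of Laplacian-pyramid blending of two aligned, 3-channel images is shown below; it assumes image dimensions divisible by 2**levels and a mask that is 1.0 where the first image should dominate, and it illustrates the general technique rather than the fusion code of this application:

```python
import cv2
import numpy as np

def laplacian_blend(img_a, img_b, mask, levels=4):
    """Blend two aligned BGR images with Laplacian pyramids so that the seam in the
    overlap region is not obvious. mask is a float image, 1.0 where img_a dominates."""
    ga = [img_a.astype(np.float32)]
    gb = [img_b.astype(np.float32)]
    gm = [mask.astype(np.float32)]
    for _ in range(levels):                      # Gaussian pyramids of images and mask
        ga.append(cv2.pyrDown(ga[-1]))
        gb.append(cv2.pyrDown(gb[-1]))
        gm.append(cv2.pyrDown(gm[-1]))
    # Laplacian pyramids: each level minus the upsampled next level, plus the coarsest level
    la = [ga[i] - cv2.pyrUp(ga[i + 1], dstsize=ga[i].shape[1::-1]) for i in range(levels)] + [ga[levels]]
    lb = [gb[i] - cv2.pyrUp(gb[i + 1], dstsize=gb[i].shape[1::-1]) for i in range(levels)] + [gb[levels]]
    # Blend every level with the corresponding mask level, then collapse the pyramid
    blended = [gm[i][..., None] * la[i] + (1.0 - gm[i][..., None]) * lb[i] for i in range(levels + 1)]
    out = blended[-1]
    for i in range(levels - 1, -1, -1):
        out = cv2.pyrUp(out, dstsize=blended[i].shape[1::-1]) + blended[i]
    return np.clip(out, 0, 255).astype(np.uint8)
```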
The present application further provides a recovery apparatus for splicing interruption of aerial images of an unmanned aerial vehicle. Fig. 10 is a schematic structural diagram of the first apparatus embodiment of the present application. As shown in fig. 10, the apparatus includes: an indicating unit 1000, an acquiring unit 1001, a repositioning and converting unit X1, and a stitching unit 1004. The repositioning and converting unit X1 includes a repositioning unit 1002 and a pose converting unit 1003. Specifically:
and the indicating unit 1000 is used for indicating the unmanned aerial vehicle to return to the area to which the spliced image belongs when the splicing interruption of the aerial image of the unmanned aerial vehicle is determined.
An obtaining unit 1001, configured to obtain an image that is newly captured by the unmanned aerial vehicle in an area to which the spliced image belongs, and use the newly captured image as a current frame.
The repositioning and converting unit X1 is used for repositioning according to the current frame and the existing key frames to determine the pose of the current frame in a fitting plane coordinate system, the fitting plane being generated by fitting the three-dimensional points corresponding to the oriented fast rotation ORB feature points of the existing key frames. The repositioning and converting unit X1 may include:

The repositioning unit 1002 is used for repositioning according to the current frame and the existing key frames to determine the pose of the current frame, the pose of the current frame being expressed in the camera coordinate system of the first key frame, where the key frames are those determined in the image stitching process.

The pose converting unit 1003 is used for converting the pose of the current frame into the fitting plane coordinate system according to the conversion relationship between the first key frame camera coordinate system and the fitting plane coordinate system, the fitting plane being generated by fitting the three-dimensional points corresponding to the oriented fast rotation ORB feature points of the existing key frames.
And the splicing unit 1004 is configured to calculate a transformation relationship between the current frame and the spliced image according to the pose of the current frame in the fitting plane coordinate system, and continue to splice the current frame and the spliced image according to the transformation relationship.
Fig. 11 is a schematic diagram of an internal structure of a relocation unit 1002 according to a second embodiment of the present application. As shown in FIG. 11, relocation unit 1002 includes: the device comprises a preprocessing unit A1, a candidate key frame determining unit A2, a current frame three-dimensional point determining unit A3 and a current frame pose calculating unit A4. Wherein:
and a preprocessing unit a1 for performing image preprocessing on the acquired current frame.
A candidate key frame determining unit a2, configured to perform similarity comparison between the current frame and each key frame in the key frame set, and use the key frame exceeding a preset similarity threshold as a candidate key frame, where the key frame set includes all key frames generated in the splicing process.
And the current frame three-dimensional point determining unit A3 is used for matching according to the current frame and the candidate key frame to determine the three-dimensional point corresponding to the current frame ORB characteristic point.
The current frame pose calculating unit A4 is used for calculating the pose of the current frame according to the three-dimensional points corresponding to the current frame ORB feature points, the pose of the current frame being expressed in the camera coordinate system of the first key frame.
The candidate key frame determining unit A2 may be implemented as follows: first, for each key frame in the key frame set, a pre-established ORB dictionary is searched to determine the same feature descriptors shared by the ORB feature points of that key frame and the ORB feature points of the current frame, the feature descriptors being stored in the ORB dictionary; then, the sum of the weights corresponding to these shared feature descriptors is calculated as the similarity between the key frame and the current frame, the weights corresponding to the feature descriptors being set in advance; finally, within the key frame set, the key frames whose calculated similarity exceeds a preset similarity threshold are taken as candidate key frames.
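A minimal sketch of this scoring scheme is given below; the data layout (frames as collections of dictionary word ids, a word-to-weight table) and all names are illustrative assumptions, since practical systems usually rely on a DBoW-style vocabulary:

```python
def candidate_keyframes(current_words, keyframe_words, word_weight, sim_threshold):
    """Score each key frame by the summed weights of the ORB dictionary words
    (feature descriptors) it shares with the current frame; key frames whose
    similarity exceeds the preset threshold become candidates."""
    current_set = set(current_words)
    candidates = []
    for kf_id, words in keyframe_words.items():
        shared = current_set & set(words)                 # descriptors common to both frames
        similarity = sum(word_weight[w] for w in shared)  # weights are set in advance
        if similarity > sim_threshold:
            candidates.append((kf_id, similarity))
    return sorted(candidates, key=lambda c: c[1], reverse=True)
```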
The current frame three-dimensional point determination unit a3 may be implemented as follows: searching a pre-established ORB dictionary, determining feature descriptors corresponding to the candidate key frame ORB feature points, and determining current frame ORB feature points corresponding to the same feature descriptors; and taking the three-dimensional point corresponding to the candidate key frame ORB characteristic point as the three-dimensional point corresponding to the current frame ORB characteristic point, wherein the candidate key frame ORB characteristic point and the current frame ORB characteristic point have the same characteristic descriptor.
Here, the related concepts of the ORB dictionary can be referred to in the method section, and are not described in detail here. Through the implementation in the second embodiment of the apparatus, the pose of the current frame can be obtained, and then the implementation of the pose converting unit 1003 and the image stitching unit 1004 continues.
Fig. 12 is a schematic diagram of the internal structure of the third apparatus embodiment of the present application. As shown in fig. 12, in addition to the indicating unit 1000, the acquiring unit 1001, the repositioning unit 1002, the pose converting unit 1003 and the stitching unit 1004, a pose optimizing unit 1005 and a pose verifying unit 1006 are added between the repositioning unit 1002 and the pose converting unit 1003. That is, the pose of the current frame is optimized, possibly more than once, to reduce errors and obtain a more accurate and reliable pose, and pose conversion and image stitching are performed after verification.
Fig. 13 is a schematic diagram of the internal structure of the pose optimizing unit 1005 in the third apparatus embodiment of the present application. As shown in fig. 13, the pose optimizing unit 1005 includes a first optimization unit B1 and a second optimization unit B2. Wherein:
a first optimization unit B1, configured to perform a first optimization on the pose of the current frame, that is: projecting a three-dimensional point corresponding to an ORB characteristic point of a current frame onto the current frame to obtain a first projection point; calculating the sum of pixel distances between all the first projection points and the image coordinates of the corresponding ORB feature points of the current frame to serve as a first projection total error; and optimizing the pose of the current frame for the first time according to a nonlinear optimization algorithm to obtain the pose of the current frame optimized for the first time, so that the total error of the first projection is minimum.
A second optimization unit B2, configured to perform a second optimization on the pose of the current frame, that is: projecting the three-dimensional points corresponding to the candidate key frames onto the current frame to form corresponding second projection points in the current frame; taking the current frame ORB feature points within the projection radius of a second projection point as ORB feature points to be matched; calculating the feature distances between the ORB feature points to be matched and the candidate key frame ORB feature point corresponding to the second projection point, and selecting the ORB feature point to be matched with the smallest feature distance as the selected ORB feature point to be matched; if the selected ORB feature point to be matched has no corresponding three-dimensional point and its feature distance is smaller than a set feature distance threshold, forming a new matching pair from the selected ORB feature point to be matched and the three-dimensional point corresponding to the second projection point, so that the selected ORB feature point to be matched and the second projection point correspond to the same three-dimensional point; projecting all three-dimensional points corresponding to the current frame ORB feature points onto the current frame to obtain third projection points; calculating the sum of the pixel distances between the third projection points and the image coordinates of the corresponding current frame ORB feature points as a second projection total error; and optimizing the pose of the current frame a second time according to the nonlinear optimization algorithm so that the second projection total error is minimized, to obtain the pose of the current frame optimized for the second time.
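Both optimizations minimize the sum of pixel distances between projected three-dimensional points and the image coordinates of their matched ORB feature points. A sketch of such a refinement using an axis-angle plus translation parameterization and a Levenberg-Marquardt least-squares solver is shown below; it is an illustration under these assumptions, not the solver used by the present application:

```python
import numpy as np
import cv2
from scipy.optimize import least_squares

def refine_pose(points_3d, points_2d, K, rvec0, tvec0):
    """Refine a camera pose by minimizing the total reprojection error between the
    3D points matched to the frame's ORB feature points and their image coordinates.
    points_3d: Nx3 array, points_2d: Nx2 array, K: 3x3 intrinsic matrix."""
    def residuals(params):
        rvec, tvec = params[:3], params[3:]
        R, _ = cv2.Rodrigues(rvec)                     # axis-angle -> rotation matrix
        cam = R @ points_3d.T + tvec.reshape(3, 1)     # 3xN points in the camera frame
        proj = K @ cam
        proj = (proj[:2] / proj[2]).T                  # Nx2 projected pixel coordinates
        return (proj - points_2d).ravel()              # per-point pixel errors to minimize
    x0 = np.hstack([np.asarray(rvec0).ravel(), np.asarray(tvec0).ravel()])
    result = least_squares(residuals, x0, method="lm") # Levenberg-Marquardt
    return result.x[:3], result.x[3:]                  # refined rvec, tvec
```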
Fig. 14 is a schematic diagram of the internal structure of the pose verifying unit 1006 in the third apparatus embodiment of the present application. As shown in fig. 14, the pose verifying unit 1006 includes a matching pair calculation unit C1 and a judging unit C2. Wherein:
and a matching pair calculation unit C1, configured to calculate the number of matching pairs between the current frame ORB feature point and the corresponding three-dimensional point.
A judging unit C2, for judging whether the number of the matching pairs is larger than a preset number threshold, if so, determining that the pose of the current frame is valid and ending the repositioning process; otherwise, the pose of the current frame is determined to be invalid and the acquiring unit 1001 is restarted.
Fig. 15 is a schematic diagram of the internal structure of the pose converting unit 1003 in the third apparatus embodiment of the present application. As shown in fig. 15, the pose converting unit 1003 includes: a first pose converting unit D1 and a second pose converting unit D2. Wherein:
and the first pose converting unit D1 is used for converting the pose of the current frame into the coordinate system of the global positioning system according to the conversion relation between the camera coordinate system of the first key frame and the coordinate system of the global positioning system.
And the second pose converting unit D2 is used for converting the pose of the current frame from the global positioning system coordinate system to the fitting plane coordinate system according to the conversion relation between the global positioning system coordinate system and the fitting plane coordinate system.
Through these logical units, the pose of the current frame can be converted from the first key frame camera coordinate system into the fitting plane coordinate system to facilitate the subsequent stitching work. For the implementation of each logical unit of the pose converting unit 1003, reference may be made to the detailed description of the third embodiment of the method.
Fig. 16 is a schematic diagram of the internal structure of the stitching unit 1004 in the third apparatus embodiment of the present application. At this point the pose of the current frame has been converted into the fitting plane coordinate system, so the images captured by the unmanned aerial vehicle before the interruption and those re-captured after it are all expressed in a single, uniform plane coordinate system and can be stitched conveniently. As shown in fig. 16, the stitching unit 1004 includes a homography transformation calculating unit E1, a corner coordinate calculating unit E2, and a stitching executing unit E3.
Wherein:
and the homography transformation calculation unit E1 is used for calculating the homography transformation relation between the current frame and the finished spliced image.
And the corner point coordinate calculation unit E2 is used for determining the coordinates of the four corner points of the current frame in the finished spliced image according to the homographic transformation relation.
And the splicing execution unit E3 is used for determining the pixel value of the extension part after splicing from the spliced image according to the homographic transformation relation between the current frame and the spliced image.
Therefore, the current frame repositioned by the unmanned aerial vehicle can be spliced to the spliced image to form a larger panoramic image, and the splicing work is continuously completed. In practical application, since a part of the spliced image may overlap with the current frame image, a pixel fusion method, such as a laplacian of gaussian pyramid method, may be used to fuse pixels in the overlapping region, so that the spliced portion is not obvious, and a smoother image is obtained.
Embodiments of the present application also provide a computer-readable storage medium storing instructions, which when executed by a processor, cause the processor to perform the steps of the unmanned aerial vehicle aerial image stitching method as described above. In practice, the computer readable medium may be RAM, ROM, EPROM, magnetic disk, optical disk, etc., and is not intended to limit the scope of protection of this application.
The method steps described herein may be implemented in hardware, for example, logic gates, switches, Application Specific Integrated Circuits (ASICs), programmable logic controllers, embedded microcontrollers, etc., in addition to data processing programs. Such hardware capable of implementing the methods described herein may also constitute the present application.
The embodiment of the application further provides an electronic device, which can be a computer or a server, wherein the recovery device for splicing and interrupting the aerial images of the unmanned aerial vehicle of the embodiment of the application can be integrated. Fig. 17 shows an electronic device according to a fourth embodiment of the apparatus of the present application.
The electronic device may include one or more processors R1 of the processing core, one or more computer-readable storage media R2. The electronic device may further include a power supply R3, an input-output unit R4. Those skilled in the art will appreciate that fig. 17 is not limiting of electronic devices and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
Wherein:
the processor R1 is a control section of the electronic apparatus, connects the respective sections by various interfaces and lines, and performs various functions and processes data by running or executing a software program stored in the computer-readable storage medium R2, thereby completing the image stitching work.
The computer-readable storage medium R2 may be used to store software programs, i.e. programs involved in the above-described unmanned aerial vehicle aerial image stitching method.
The processor R1 executes various functional applications and data processing by executing software programs stored in the computer-readable storage medium R2. The computer-readable storage medium R2 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function (such as an image playback function, etc.), and the like; the storage data area can store data and the like (such as images shot by the unmanned aerial vehicle) used according to the needs of the electronic equipment. Further, the computer-readable storage medium R2 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the computer-readable storage medium R2 may also include a memory controller to provide the processor R1 access to the computer-readable storage medium R2.
The electronic device further comprises a power supply R3 for supplying power to each component. Preferably, the power supply R3 can be logically connected with the processor R1 through a power management system, so that functions such as charging, discharging and power consumption management are handled through the power management system. The power supply R3 may also include one or more of a direct-current or alternating-current power source, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
The electronic device may also include an input-output unit R4, which may be used, for example, to receive entered numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control, and to display information entered by or provided to the user as well as various graphical user interfaces of the electronic device, which may be composed of graphics, text, icons, video, and any combination thereof.
By applying the embodiments of the present application, when the splicing of aerial images of the unmanned aerial vehicle is interrupted, the unmanned aerial vehicle can be instructed to fly back to the area of the spliced image and capture images again, so that a current frame is obtained anew. Because it depicts the same scene, the re-captured image has a definite association with the existing key frames from the previous splicing; this association is exploited to reposition the current frame and determine its pose, so that splicing continues and the interruption is recovered from. Therefore, when splicing is interrupted, the whole splicing does not need to be redone; it is only necessary to reposition the current frame on the basis of what has already been spliced and then continue splicing, which makes full use of existing resources and improves the efficiency of the splicing work.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the scope of protection of the present application.

Claims (12)

1. A recovery method for splicing interruption of aerial images of an unmanned aerial vehicle is characterized by comprising the following steps:
when the splicing of aerial images of the unmanned aerial vehicle is determined to be interrupted, indicating the unmanned aerial vehicle to return to the area to which the spliced images belong;
acquiring an image re-shot by the unmanned aerial vehicle in the region to which the spliced image belongs, and taking the re-shot image as a current frame;
repositioning according to the current frame and the existing key frame to determine the pose of the current frame under a fitting plane coordinate system, wherein the fitting plane is generated by fitting three-dimensional points corresponding to directional fast rotation ORB characteristic points of the existing key frame;
and continuously splicing the current frame and the spliced image according to the pose of the current frame in a fitting plane coordinate system.
2. The method of claim 1, wherein the method of repositioning to determine the pose of the current frame in the fitted plane coordinate system based on the current frame and an existing keyframe comprises:
performing similarity comparison on the current frame and each key frame in the key frame set to obtain similarity, and taking the key frames exceeding a preset similarity threshold as candidate key frames;
matching according to the current frame and the candidate key frame to determine a three-dimensional point corresponding to the ORB characteristic point of the current frame;
calculating the pose of the current frame according to the three-dimensional points corresponding to the ORB feature points of the current frame, wherein the pose of the current frame is based on the pose of a first key frame camera coordinate system;
and converting the pose of the current frame into the fitting plane coordinate system according to the conversion relation between the first key frame camera coordinate system and the fitting plane coordinate system.
3. The method according to claim 2, wherein the comparing the similarity between the current frame and each key frame in the key frame set to obtain the similarity, and the method for using the key frame exceeding a preset similarity threshold as the candidate key frame comprises:
for each key frame in the key frame set, searching a pre-established ORB dictionary to determine the same feature descriptors corresponding to the ORB feature points of the key frame and the ORB feature points of the current frame, wherein the feature descriptors are stored in the ORB dictionary;
calculating the sum of the weights corresponding to the same feature descriptors to be used as the similarity of the key frame and the current frame, wherein the weights corresponding to the feature descriptors are set in advance;
and in the key frame set, taking the key frames with the calculated similarity exceeding a preset similarity threshold as candidate key frames.
4. The method of claim 2, wherein the step of matching the current frame and the candidate keyframes to determine the three-dimensional points corresponding to the ORB feature points of the current frame comprises:
searching a pre-established ORB dictionary, determining feature descriptors corresponding to the candidate key frame ORB feature points, and determining current frame ORB feature points corresponding to the same feature descriptors;
and taking the three-dimensional points corresponding to the candidate key frame ORB characteristic points as the three-dimensional points corresponding to the current frame ORB characteristic points.
5. The method of claim 2, wherein after calculating the pose of the current frame according to the three-dimensional points corresponding to the ORB feature points of the current frame, the method of repositioning according to the current frame and the existing key frame to determine the pose of the current frame in the fitted plane coordinate system further comprises a first optimization process, and the first optimization process comprises:
projecting the three-dimensional point corresponding to the ORB characteristic point of the current frame onto the current frame to obtain a first projection point;
calculating the sum of pixel distances between all the first projection points and the image coordinates of the corresponding ORB feature points of the current frame to serve as a first projection total error;
and optimizing the pose of the current frame for the first time according to a nonlinear optimization algorithm to obtain the pose of the current frame optimized for the first time, so that the total error of the first projection is minimum.
6. The method of claim 5, wherein after the first optimization process, the method of repositioning based on the current frame and the existing keyframes to determine the pose of the current frame in the fitted plane coordinate system further comprises a second optimization process comprising:
projecting the three-dimensional points corresponding to the candidate key frames onto the current frame to form corresponding second projection points in the current frame;
taking the ORB feature point of the current frame as an ORB feature point to be matched within the projection radius of the second projection point;
calculating the characteristic distance between the ORB characteristic points to be matched and candidate key frame ORB characteristic points corresponding to the second projection point, and selecting the ORB characteristic point to be matched with the minimum characteristic distance as the selected ORB characteristic point to be matched;
if the selected ORB feature point to be matched does not have a corresponding three-dimensional point and the feature distance of the selected ORB feature point to be matched is smaller than a set feature distance threshold value, forming a new matching pair by the selected ORB feature point to be matched and the three-dimensional point corresponding to the second projection point, wherein the selected ORB feature point to be matched and the second projection point correspond to the same three-dimensional point;
projecting all three-dimensional points corresponding to the ORB characteristic points of the current frame onto the current frame to obtain third projection points; all three-dimensional points corresponding to the current frame ORB characteristic point comprise a three-dimensional point corresponding to the current frame ORB characteristic point in the first optimization process and a three-dimensional point corresponding to the ORB characteristic point to be matched selected in the second optimization process;
calculating the sum of pixel distances between the third projection point and the image coordinates of the corresponding ORB feature point of the current frame to serve as a second projection total error;
and performing secondary optimization on the pose of the current frame according to the nonlinear optimization algorithm to obtain the pose of the current frame optimized for the second time, so that the total error of the second projection is minimum.
7. The method of claim 2, wherein after calculating the pose of the current frame according to the three-dimensional points corresponding to the ORB feature points of the current frame, the step of repositioning according to the current frame and the existing key frame to determine the pose of the current frame in the fitting plane coordinate system further comprises:
calculating the number of the matching pairs of the ORB feature points of the current frame and the corresponding three-dimensional points;
judging whether the number of the matching pairs is larger than a preset number threshold, if so, determining that the pose of the current frame is effective, and continuing to execute the step of converting the pose of the current frame to the fitting plane coordinate system; and if not, determining that the pose of the current frame is invalid, and returning to the step of acquiring the image re-shot by the unmanned aerial vehicle in the region to which the spliced image belongs.
8. The method of claim 1, wherein when obtaining the current frame after the splicing of the aerial images of the unmanned aerial vehicle is interrupted, the method further comprises: acquiring global positioning system information of the current frame, wherein the global positioning system information is information under a global positioning system coordinate system when the unmanned aerial vehicle takes a photo by plane;
the method for converting the pose of the current frame into the fitting plane coordinate system according to the conversion relation between the first key frame camera coordinate system and the fitting plane coordinate system comprises the following steps:
converting the pose of the current frame into the pose under a global positioning system coordinate system according to the conversion relation between a first key frame camera coordinate system and the global positioning system coordinate system;
and converting the pose of the current frame under the global positioning system coordinate system into the pose under the fitting plane coordinate system according to the conversion relation between the global positioning system coordinate system and the fitting plane coordinate system.
9. The method of claim 1, wherein the step of continuing to stitch the current frame and the stitched image according to the pose of the current frame in the fitting plane coordinate system comprises:
calculating homography transformation relation between the current frame and the spliced image;
determining coordinates of four corner points of the current frame in the spliced image according to the homographic transformation relation;
and determining the pixel value of the extension part after splicing from the spliced image according to the homography transformation relation between the current frame and the spliced image.
10. A recovery apparatus for splicing interruption of aerial images of an unmanned aerial vehicle, characterized in that the apparatus comprises:
the indication unit is used for indicating the unmanned aerial vehicle to return to the area to which the spliced image belongs when the splicing of the aerial image of the unmanned aerial vehicle is determined to be interrupted;
the acquisition unit is used for acquiring an image re-shot by the unmanned aerial vehicle in the region to which the spliced image belongs, and taking the re-shot image as a current frame;
the repositioning and converting unit is used for repositioning according to the current frame and the existing key frame to determine the pose of the current frame under a fitting plane coordinate system, and the fitting plane is generated by fitting three-dimensional points corresponding to directional fast rotation ORB characteristic points of the existing key frame;
and the splicing unit is used for continuously splicing the current frame and the spliced image according to the pose of the current frame in a fitting plane coordinate system.
11. A computer readable storage medium storing computer instructions, wherein the instructions when executed by a processor implement the method for recovering from splicing interruption of aerial images taken by a drone of any one of claims 1 to 9.
12. An electronic device comprising the computer-readable storage medium of claim 11, and further comprising a processor capable of executing the instructions stored in the computer-readable storage medium.
CN201910405078.4A 2019-05-16 2019-05-16 Unmanned aerial vehicle aerial image splicing interruption recovery method, device and storage medium Active CN111951158B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910405078.4A CN111951158B (en) 2019-05-16 2019-05-16 Unmanned aerial vehicle aerial image splicing interruption recovery method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910405078.4A CN111951158B (en) 2019-05-16 2019-05-16 Unmanned aerial vehicle aerial image splicing interruption recovery method, device and storage medium

Publications (2)

Publication Number Publication Date
CN111951158A true CN111951158A (en) 2020-11-17
CN111951158B CN111951158B (en) 2024-04-12

Family

ID=73335529

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910405078.4A Active CN111951158B (en) 2019-05-16 2019-05-16 Unmanned aerial vehicle aerial image splicing interruption recovery method, device and storage medium

Country Status (1)

Country Link
CN (1) CN111951158B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112365406A (en) * 2021-01-13 2021-02-12 芯视界(北京)科技有限公司 Image processing method, device and readable storage medium
CN116109807A (en) * 2023-04-11 2023-05-12 深圳市其域创新科技有限公司 Panoramic SLAM method, device, computing equipment and storage medium
WO2023082922A1 (en) * 2021-11-15 2023-05-19 北京有竹居网络技术有限公司 Object positioning method and device in discontinuous observation condition, and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060239571A1 (en) * 2005-03-29 2006-10-26 Shenzhen Mindray Bio-Medical Electronics Co., Ltd. Method of volume-panorama imaging processing
CN102201115A (en) * 2011-04-07 2011-09-28 湖南天幕智能科技有限公司 Real-time panoramic image stitching method of aerial videos shot by unmanned plane

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060239571A1 (en) * 2005-03-29 2006-10-26 Shenzhen Mindray Bio-Medical Electronics Co., Ltd. Method of volume-panorama imaging processing
CN102201115A (en) * 2011-04-07 2011-09-28 湖南天幕智能科技有限公司 Real-time panoramic image stitching method of aerial videos shot by unmanned plane

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张楚东; 陆建峰: "Road aerial image stitching algorithm based on sparse optical flow", Ship Electronic Engineering, no. 08 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112365406A (en) * 2021-01-13 2021-02-12 芯视界(北京)科技有限公司 Image processing method, device and readable storage medium
WO2023082922A1 (en) * 2021-11-15 2023-05-19 北京有竹居网络技术有限公司 Object positioning method and device in discontinuous observation condition, and storage medium
CN116109807A (en) * 2023-04-11 2023-05-12 深圳市其域创新科技有限公司 Panoramic SLAM method, device, computing equipment and storage medium
CN116109807B (en) * 2023-04-11 2023-06-09 深圳市其域创新科技有限公司 Panoramic SLAM method, device, computing equipment and storage medium

Also Published As

Publication number Publication date
CN111951158B (en) 2024-04-12

Similar Documents

Publication Publication Date Title
CN110533587B (en) SLAM method based on visual priori information and map restoration
CN111968129B (en) Instant positioning and map construction system and method with semantic perception
WO2022121640A1 (en) Robot relocalization method and apparatus, and robot and readable storage medium
CN111951201B (en) Unmanned aerial vehicle aerial image splicing method, device and storage medium
CN111707281B (en) SLAM system based on luminosity information and ORB characteristics
CN111311684B (en) Method and equipment for initializing SLAM
CN111445526A (en) Estimation method and estimation device for pose between image frames and storage medium
US9299161B2 (en) Method and device for head tracking and computer-readable recording medium
CN108447090B (en) Object posture estimation method and device and electronic equipment
CN111951158B (en) Unmanned aerial vehicle aerial image splicing interruption recovery method, device and storage medium
CN102834845A (en) Method and arrangement for multi-camera calibration
Tang et al. ESTHER: Joint camera self-calibration and automatic radial distortion correction from tracking of walking humans
WO2015085779A1 (en) Method and system for calibrating surveillance cameras
CN110648363A (en) Camera posture determining method and device, storage medium and electronic equipment
WO2009082719A1 (en) Invariant visual scene and object recognition
CN108776976A (en) A kind of while positioning and the method, system and storage medium for building figure
CN112418288A (en) GMS and motion detection-based dynamic vision SLAM method
CN110490222A (en) A kind of semi-direct vision positioning method based on low performance processor device
CN117115784A (en) Vehicle detection method and device for target data fusion
Yan et al. PLPF‐VSLAM: An indoor visual SLAM with adaptive fusion of point‐line‐plane features
CN110855601B (en) AR/VR scene map acquisition method
Zhang et al. Dense 3d mapping for indoor environment based on feature-point slam method
Shao A Monocular SLAM System Based on the ORB Features
CN110009683B (en) Real-time on-plane object detection method based on MaskRCNN
Wang et al. Improved visual odometry based on ssd algorithm in dynamic environment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 310051 room 304, B / F, building 2, 399 Danfeng Road, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Hangzhou Hikvision Robot Co.,Ltd.

Address before: 310052 5 / F, building 1, building 2, no.700 Dongliu Road, Binjiang District, Hangzhou City, Zhejiang Province

Applicant before: HANGZHOU HIKROBOT TECHNOLOGY Co.,Ltd.

CB02 Change of applicant information
TA01 Transfer of patent application right

Effective date of registration: 20230630

Address after: No.555, Qianmo Road, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Hangzhou Hikvision Digital Technology Co.,Ltd.

Address before: 310051 room 304, B / F, building 2, 399 Danfeng Road, Binjiang District, Hangzhou City, Zhejiang Province

Applicant before: Hangzhou Hikvision Robot Co.,Ltd.

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant