CN117252921A - Aviation thermal infrared image positioning method, device and equipment based on view synthesis - Google Patents
- Publication number
- CN117252921A (application number CN202310720630.5A)
- Authority
- CN
- China
- Prior art keywords
- image
- point
- query
- dimensional
- query image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
- G06T2207/10012—Stereo images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10048—Infrared image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The application relates to an aviation thermal infrared image positioning method, device and equipment based on view synthesis. The method comprises the following steps: constructing a query image set, a reference map set and a ground control point set; the query image set comprises a plurality of query images; calculating the initial pose of the query image by using prior information acquired by the equipment sensor, and performing rendering synthesis on a pre-constructed three-dimensional reference model according to the initial pose and rendering software to obtain a reference image and a depth image of the reference image; and carrying out feature point matching on the query image and the reference image according to a deep learning algorithm to construct a depth-based 2D-3D corresponding relation, and solving the motion between a three-dimensional point set and a two-dimensional point set with the 2D-3D corresponding relation in the query image by utilizing an n-point perspective algorithm to obtain a final pose. The method can improve the positioning accuracy of the thermal infrared image.
Description
Technical Field
The present disclosure relates to the field of image positioning technologies, and in particular, to an aviation thermal infrared image positioning method, device and equipment based on view synthesis.
Background
Global navigation satellite systems (GNSS), such as the BeiDou Navigation Satellite System (BDS) and the Global Positioning System (GPS), use satellites, ground control systems and terminal receivers to provide positioning services for military and civilian applications. However, in various situations GNSS positioning can become inaccurate, with errors ranging from a few meters to tens or hundreds of meters. Visual positioning, with its simple structure, low cost, rich information content and strong anti-interference capability, has good military and civilian prospects and is a good complement to GNSS. However, most existing visual positioning algorithms are based on visible-light images, require adequate illumination, and can hardly support military operations or civilian applications that span weather, seasons, day and night, such as night rescue. The temperature information captured by thermal imaging makes thermal cameras more robust to environmental conditions and gives them a longer imaging distance. Compared with visible-light images, thermal images are less affected by glare and weather and can clearly perceive objects even in darkness, fog, smoke and other degraded environments. Furthermore, thermal infrared cameras have been miniaturized in recent years, and their cost is no longer prohibitive. This makes it feasible to use thermal images to assist the 6-degree-of-freedom positioning of an unmanned aerial vehicle or an airborne aircraft.
However, currently available public thermal infrared positioning datasets either lack annotations or focus on the relative visual localization of indoor scenes. This data scarcity severely hampers the development of the field, leaving current thermal infrared image positioning accuracy low.
Disclosure of Invention
Accordingly, it is necessary to provide an aviation thermal infrared image positioning method, device and equipment based on view synthesis, which can improve the thermal infrared image positioning accuracy.
An aviation thermal infrared image positioning method based on view synthesis, the method comprises the following steps:
constructing a query image set, a reference map set and a ground control point set; the query image set comprises a plurality of query images;
calculating the initial pose of the query image by using prior information acquired by the equipment sensor, and performing rendering synthesis on a pre-constructed three-dimensional reference model according to the initial pose and rendering software to obtain a reference image and a depth image of the reference image;
and carrying out feature point matching on the query image and the reference image according to a deep learning algorithm to construct a depth-based 2D-3D corresponding relation, and solving the motion between a three-dimensional point set and a two-dimensional point set with the 2D-3D corresponding relation in the query image by utilizing an n-point perspective algorithm to obtain a final pose.
In one embodiment, the process of constructing a set of query images, a set of reference maps, and a set of ground control points includes:
planning all flight paths into grids, and constructing the reference map set automatically by using the flight control system of the DJI M300;
manually operating the flight platform to navigate freely in the scene, and constructing the query image set from images captured under times and weather conditions different from those of the reference map set, using two shooting strategies: continuous shooting at 2-second intervals and random manual shooting;
8 object edges in the reference map scene are selected as ground control points to construct a ground control point set.
In one embodiment, calculating an initial pose of a query image using prior information acquired by a device sensor includes:
calculating the initial pose of the query image using prior information acquired by the device sensors as

T = [ R, t ; 0ᵀ, 1 ]

wherein R is the rotation matrix representing the orientation of the query image, t represents the position coordinates of the query image in three-dimensional space, and 0ᵀ is the zero row that completes the bottom row of the homogeneous transformation matrix.
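As a minimal sketch of assembling this initial pose matrix from sensor priors, the following assumes the inertial sensors supply Euler angles (a Z-Y-X convention is assumed here; the patent does not specify one) and the position's z-value comes from the barometer:

```python
import numpy as np

def initial_pose(yaw, pitch, roll, position):
    """Build the 4x4 homogeneous initial pose T = [R, t; 0^T, 1] from
    sensor priors: angles (radians) from the inertial sensors and a
    position whose z-value comes from the barometer."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    R = Rz @ Ry @ Rx                 # Z-Y-X Euler convention (an assumption)
    T = np.eye(4)
    T[:3, :3] = R                    # orientation of the query image
    T[:3, 3] = position              # t: position in three-dimensional space
    return T                         # bottom row of np.eye(4) is already [0 0 0 1]
```

The resulting matrix is what the rendering step below would consume as the virtual camera pose.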
In one embodiment, the pre-built three-dimensional reference model includes a visible model, a thermal model, a super-resolution thermal model, and a geometrically refined thermal model; the process for constructing the three-dimensional reference model comprises the following steps:
and acquiring visible light and thermal imaging images to perform thermal imaging image fusion on the super-resolution of the thermal imaging images and the visible light to obtain a visible model, a thermal model, a super-resolution thermal model and a geometric refinement thermal model.
In one embodiment, performing feature point matching on a query image and a reference image according to a deep learning algorithm to construct a depth-based 2D-3D correspondence, includes:
back-projecting the depth of a matching point p_r in the reference image into the reference map to obtain the coordinates P_3D of the corresponding 3D point in the world coordinate system;
pairing the coordinates P_3D with the corresponding matching point p_q in the query image, thereby completing the 2D-3D correspondence.
In one embodiment, back projecting the depth of the matching point in the reference image into the reference map to obtain coordinates of the corresponding 3D point in the world coordinate system includes:
back-projecting the depth of the matching point in the reference image into the reference map to obtain the coordinates of the corresponding 3D point in the world coordinate system as

P̃_3D = T_r · d · K⁻¹ · p̃_r

wherein T_r is the above-mentioned reference pose, P̃_3D is the homogeneous form of the three-dimensional coordinate point, p̃_r is the homogeneous form of the matching point in the reference image, d is the depth value corresponding to the matching point in the reference image, and K is the camera intrinsic matrix.
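The back-projection above can be sketched as follows; this is a standard pinhole-camera computation, not necessarily the patent's exact implementation, and the intrinsic values in the usage note are illustrative:

```python
import numpy as np

def backproject(p_r, d, K, T_r):
    """Back-project a matching point p_r = (u, v) in the rendered reference
    image, with depth d taken from the depth map, into world coordinates:
    P_3D = T_r * (d * K^-1 * p~_r)  (homogeneous form)."""
    p_h = np.array([p_r[0], p_r[1], 1.0])   # homogeneous pixel coordinates p~_r
    P_cam = d * np.linalg.inv(K) @ p_h      # 3D point in the camera frame
    P_w = T_r @ np.append(P_cam, 1.0)       # to world frame via the reference pose
    return P_w[:3]
```

For example, with K = [[500, 0, 320], [0, 500, 240], [0, 0, 1]] and an identity reference pose, the principal point (320, 240) at depth 10 back-projects to (0, 0, 10) on the camera's optical axis.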
In one embodiment, the method for solving the motion between the three-dimensional point set and the two-dimensional point set with the 2D-3D corresponding relation in the query image by using an n-point perspective algorithm to obtain a final pose comprises the following steps:
solving the motion between the three-dimensional point set and the two-dimensional point set having the 2D-3D correspondence in the query image by using the n-point perspective algorithm to obtain the probability density function

p(X | y) ∝ exp( −(1/2) Σᵢ ‖fᵢ(y)‖² )

where p(X|y) represents the probability density of the 3D point cloud X projected onto the image plane given the camera pose y, fᵢ(y) represents the error between the projected position of the ith feature point on the image plane and the coordinates of the corresponding 3D point cloud in the camera coordinate system, and Σᵢ ‖fᵢ(y)‖² is the sum of the squared errors of all feature points;
maximizing the probability density function (equivalently, minimizing the sum of squared errors) to obtain the final pose.
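One classical way to solve such a perspective-n-point problem from 2D-3D correspondences is the Direct Linear Transform (DLT); the sketch below is a generic DLT solver, not necessarily the specific PnP formulation used in the patent, and assumes at least six non-coplanar correspondences and noise-free hypothetical data in the usage example:

```python
import numpy as np

def pnp_dlt(pts3d, pts2d, K):
    """Estimate camera pose from n >= 6 2D-3D correspondences via the
    Direct Linear Transform. Returns R, t such that pixels are the
    projection K (R X + t) of the world points X."""
    pts3d = np.asarray(pts3d, float)
    pts2d = np.asarray(pts2d, float)
    # normalize pixels with the intrinsics so we solve for [R | t] directly
    pix_h = np.column_stack([pts2d, np.ones(len(pts2d))])
    norm = (np.linalg.inv(K) @ pix_h.T).T
    A = []
    for (X, Y, Z), (x, y, _) in zip(pts3d, norm):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -x*X, -x*Y, -x*Z, -x])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -y*X, -y*Y, -y*Z, -y])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    P = Vt[-1].reshape(3, 4)                     # solution up to scale and sign
    # fix the overall sign so points end up in front of the camera
    if (P[2] @ np.append(pts3d[0], 1.0)) < 0:
        P = -P
    lam = np.mean(np.linalg.svd(P[:, :3])[1])    # scale of the solution
    P /= lam
    U, _, Vt2 = np.linalg.svd(P[:, :3])
    R = U @ Vt2                                  # nearest orthonormal rotation
    t = P[:, 3]
    return R, t
```

In practice a solver like this would only initialize a nonlinear refinement that minimizes the sum of squared reprojection errors Σᵢ ‖fᵢ(y)‖², matching the probabilistic formulation above.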
An aerial thermal infrared image positioning device based on view synthesis, the device comprising:
the data set construction module is used for constructing a query image set, a reference map set and a ground control point set; the query image set comprises a plurality of query images;
the rendering synthesis module is used for calculating the initial pose of the query image by using the prior information acquired by the equipment sensor, and performing rendering synthesis on a pre-constructed three-dimensional reference model according to the initial pose and rendering software to obtain a reference image and a depth image of the reference image;
and the feature point matching and pose solving module is used for carrying out feature point matching on the query image and the reference image according to the deep learning algorithm to construct a depth-based 2D-3D corresponding relation, and solving the motion between the three-dimensional point set and the two-dimensional point set with the 2D-3D corresponding relation in the query image by utilizing the n-point perspective algorithm to obtain the final pose.
A computer device comprising a memory storing a computer program and a processor which when executing the computer program performs the steps of:
constructing a query image set, a reference map set and a ground control point set; the query image set comprises a plurality of query images;
calculating the initial pose of the query image by using prior information acquired by the equipment sensor, and performing rendering synthesis on a pre-constructed three-dimensional reference model according to the initial pose and rendering software to obtain a reference image and a depth image of the reference image;
and carrying out feature point matching on the query image and the reference image according to a deep learning algorithm to construct a depth-based 2D-3D corresponding relation, and solving the motion between a three-dimensional point set and a two-dimensional point set with the 2D-3D corresponding relation in the query image by utilizing an n-point perspective algorithm to obtain a final pose.
According to the method, the device and the equipment for positioning the aviation thermal infrared image based on view synthesis, four three-dimensional reference models are constructed to serve as an aviation thermal image positioning data set. The data set provides the accurate pose of each query image, records information from multiple sensors, covers different weather conditions to better simulate real conditions, and provides visible images along with the 6-DoF pose of every image, which improves the positioning accuracy of subsequent aviation thermal infrared image positioning. A reference view is rendered according to the initial pose with rendering software, and a depth map of the reference image is generated; the depth map contains the depth value of every pixel in the reference image, so that a 2D-3D correspondence can be built during subsequent pose solving, improving positioning efficiency. Finally, a depth-based 2D-3D correspondence is constructed by matching feature points of the query image and the reference image with a deep learning algorithm, and the motion between the three-dimensional point set and the two-dimensional point set having the 2D-3D correspondence in the query image is solved with an n-point perspective algorithm to obtain the final pose, which can greatly improve the positioning accuracy of aviation thermal infrared images.
Drawings
FIG. 1 is a flow diagram of an aerial thermal infrared image positioning method based on view synthesis in one embodiment;
FIG. 2 is a schematic diagram of a frame of an aerospace thermal infrared image positioning device based on view synthesis in one embodiment;
FIG. 3 is an internal block diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
In one embodiment, as shown in fig. 1, there is provided an aviation thermal infrared image positioning method based on view synthesis, including the steps of:
102, constructing a query image set, a reference map set and a ground control point set; the query image set includes a plurality of query images.
According to the method, all flight paths are planned into grids, and the flight control system of the DJI M300 is used to automate flight and ensure full and uniform coverage of the measurement area when constructing the reference map set. All query images are acquired with a DJI H20T camera. Considering the influence of temperature and weather on thermal imaging, query and test images are collected under different conditions; to enhance the diversity and practicality of the data, the flight platform is manually operated during query image acquisition so as to navigate freely in the scene. The query image set is constructed using two shooting strategies: continuous shooting at 2-second intervals and random manual shooting. In photogrammetry, ground control points are usually required to align the model geographically to the real world more accurately. Since the infrared camera cannot capture color and texture information, 8 object edges in the scene are selected as ground control points, and their geographic coordinates are measured with a handheld RTK device to construct the ground control point set.
And 104, calculating the initial pose of the query image by using prior information acquired by the equipment sensor, and rendering and synthesizing a pre-constructed three-dimensional reference model according to the initial pose and rendering software to obtain a reference image and a depth image of the reference image.
Different types of unmanned aerial vehicles typically carry various built-in sensors, such as inertial sensors (e.g., accelerometers, gyroscopes, gravimeters, compasses). Given an initial position and velocity, the current position and velocity are continuously updated by integrating the information from these sensors. However, the error of inertial sensors grows over time, so the measured angles can only be used as prior information to determine the initial pose for the subsequently rendered viewpoint. The on-board GPS is often unstable, and its altitude error tends to be large; therefore, the present application uses a barometer to obtain the initial altitude as the z-value.
Various three-dimensional reference models are constructed from the acquired visible-light and thermal imaging images. These models fall into two types: visible-light models and thermal models. Because of the lower resolution of thermal images, the geometric accuracy of thermal three-dimensional models is lower. To improve the accuracy, the present application experiments with two methods, namely thermal image super-resolution and fusion of the thermal texture with the visible-light geometric model, to obtain four three-dimensional reference models: a visible model, a thermal model, a super-resolution thermal model and a geometrically refined thermal model. A textured mesh model is built using modern 3D reconstruction techniques and aligned to the real geographic world through built-in RTK measurements and GCPs.
Visible reference model: the visible-light reference model is reconstructed from the high-overlap, high-resolution visible-light aerial images acquired above. This model serves as the benchmark of the dataset and has a Ground Sample Distance (GSD) of 1 cm.
Thermal reference model: thermal images are collected for three-dimensional reference model construction. This avoids the cross-modal matching problem on the one hand and provides a contrast to the visible model on the other; it also provides aligned multimodal data for subsequent studies. However, due to limitations of the acquisition equipment, the resolution of the obtained thermal images is low, resulting in a GSD of only 10 cm for a 3D model constructed under the same settings. Thus, the present application improves the accuracy of the thermal model in the following two ways:
super-resolution thermal model: before three-dimensional reconstruction, the spatial resolution of the thermal three-dimensional map is improved by an image super-resolution technique. Therefore, the super-resolution thermal image network is adopted to amplify the original thermal image, and then the super-resolution thermal model is constructed.
Geometrically refined thermal model: another way to improve the spatial resolution of the thermal model is to adjust its geometric model directly, since the geometric model determines the spatial resolution. In the texture-mapping stage, all models are geographically aligned using RTK and GCPs, and the geometric model in the thermal model is replaced by the corresponding geometric model extracted from the visible model, a high-precision texture-free model.
The four three-dimensional reference models are used as an aviation thermal image positioning data set to provide accurate gestures of query images and record information from a plurality of sensors. Furthermore, the data set covers different weather conditions to better simulate real conditions. This is the first dataset that enables 6-degree-of-freedom localization of aerial thermal images to be studied.
The pre-built three-dimensional reference model performs positioning-oriented viewpoint synthesis according to different scene requirements. Various off-the-shelf renderers can render images from a 3D mesh; the open-source 3D rendering software Blender has the advantages of high rendering quality and an extensible API. Therefore, according to the initial pose obtained above, Blender is used to render the reference view and generate the depth map of the reference image; the depth map contains the depth value of every pixel in the reference image and is used to establish the 2D-3D correspondence in subsequent pose solving.
And 106, carrying out feature point matching on the query image and the reference image according to a deep learning algorithm to construct a depth-based 2D-3D corresponding relation, and solving the motion between the three-dimensional point set and the two-dimensional point set with the 2D-3D corresponding relation in the query image by utilizing an n-point perspective algorithm to obtain the final pose.
Given the feature point matches between the query image and the synthesized reference image, a 2D-3D correspondence between the query image and the reference image can be established. First, the depth of a matching point p_r in the rendered image is back-projected into the reference map to obtain the coordinates P_3D of the corresponding 3D point in the world coordinate system; P_3D is then paired with the matching point p_q in the query image, completing the 2D-3D correspondence. The motion between the three-dimensional point set and the two-dimensional point set having this correspondence is then solved with a PnP algorithm to obtain the final pose.
According to the aviation thermal infrared image positioning method based on view synthesis, four three-dimensional reference models are constructed to serve as an aviation thermal image positioning data set. The data set provides the accurate pose of each query image, records information from multiple sensors, covers different weather conditions to better simulate real conditions, and provides visible images along with the 6-DoF pose of every image, improving the positioning accuracy of subsequent aviation thermal infrared image positioning. The reference views are rendered according to the initial pose with rendering software, and the depth map of the reference image is generated; the depth map contains the depth value of every pixel in the reference image, so the 2D-3D correspondence can be conveniently established in subsequent pose solving, improving positioning efficiency. Finally, the depth-based 2D-3D correspondence is constructed by matching feature points of the query image and the reference image, and the final pose is solved with an n-point perspective algorithm, which can greatly improve the positioning accuracy of aviation thermal infrared images.
In one embodiment, the process of constructing a set of query images, a set of reference maps, and a set of ground control points includes:
planning all flight paths into grids, and constructing the reference map set automatically by using the flight control system of the DJI M300;
manually operating the flight platform to navigate freely in the scene, and constructing the query image set from images captured under times and weather conditions different from those of the reference map set, using two shooting strategies: continuous shooting at 2-second intervals and random manual shooting;
8 object edges in the reference map scene are selected as ground control points to construct a ground control point set.
In one embodiment, calculating an initial pose of a query image using prior information acquired by a device sensor includes:
calculating the initial pose of the query image using prior information acquired by the device sensors as

T = [ R, t ; 0ᵀ, 1 ]

wherein R is the rotation matrix representing the orientation of the query image, t represents the position coordinates of the query image in three-dimensional space, and 0ᵀ is the zero row that completes the bottom row of the homogeneous transformation matrix.
In one embodiment, the pre-built three-dimensional reference model includes a visible model, a thermal model, a super-resolution thermal model, and a geometrically refined thermal model; the process for constructing the three-dimensional reference model comprises the following steps:
and acquiring visible light and thermal imaging images to perform thermal imaging image fusion on the super-resolution of the thermal imaging images and the visible light to obtain a visible model, a thermal model, a super-resolution thermal model and a geometric refinement thermal model.
In one embodiment, performing feature point matching on a query image and a reference image according to a deep learning algorithm to construct a depth-based 2D-3D correspondence, includes:
back-projecting the depth of a matching point p_r in the reference image into the reference map to obtain the coordinates P_3D of the corresponding 3D point in the world coordinate system;
pairing the coordinates P_3D with the corresponding matching point p_q in the query image, thereby completing the 2D-3D correspondence.
In one embodiment, back projecting the depth of the matching point in the reference image into the reference map to obtain coordinates of the corresponding 3D point in the world coordinate system includes:
back-projecting the depth of the matching point in the reference image into the reference map to obtain the coordinates of the corresponding 3D point in the world coordinate system as

P̃_3D = T_r · d · K⁻¹ · p̃_r

wherein T_r is the above-mentioned reference pose, P̃_3D is the homogeneous form of the three-dimensional coordinate point, p̃_r is the homogeneous form of the matching point in the reference image, d is the depth value corresponding to the matching point in the reference image, and K is the camera intrinsic matrix.
In one embodiment, the method for solving the motion between the three-dimensional point set and the two-dimensional point set with the 2D-3D corresponding relation in the query image by using an n-point perspective algorithm to obtain a final pose comprises the following steps:
solving the motion between the three-dimensional point set and the two-dimensional point set having the 2D-3D correspondence in the query image by using the n-point perspective algorithm to obtain the probability density function

p(X | y) ∝ exp( −(1/2) Σᵢ ‖fᵢ(y)‖² )

where p(X|y) represents the probability density of the 3D point cloud X projected onto the image plane given the camera pose y, fᵢ(y) represents the error between the projected position of the ith feature point on the image plane and the coordinates of the corresponding 3D point cloud in the camera coordinate system, and Σᵢ ‖fᵢ(y)‖² is the sum of the squared errors of all feature points;
maximizing the probability density function (equivalently, minimizing the sum of squared errors) to obtain the final pose.
It should be understood that, although the steps in the flowchart of fig. 1 are shown in sequence as indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in fig. 1 may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are likewise not necessarily performed in sequence, and may be performed in turn or alternately with at least a portion of other steps or of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 2, there is provided an aviation thermal infrared image localization apparatus based on view synthesis, including: a dataset construction module 202, a rendering synthesis module 204, and a feature point matching and pose solving module 206, wherein:
a data set construction module 202 for constructing a query image set, a reference map set, and a ground control point set; the query image set comprises a plurality of query images;
the rendering synthesis module 204 is configured to calculate an initial pose of the query image according to prior information acquired by the device sensor, and perform rendering synthesis on a three-dimensional reference model constructed in advance according to the initial pose and rendering software to obtain a reference image and a depth image of the reference image;
the feature point matching and pose solving module 206 is configured to perform feature point matching on the query image and the reference image according to a deep learning algorithm to construct a depth-based 2D-3D correspondence, and solve the motion between the three-dimensional point set and the two-dimensional point set with the 2D-3D correspondence in the query image by using an n-point perspective algorithm to obtain a final pose.
In one embodiment, the process by which the data set construction module 202 constructs the query image set, the reference map set, and the ground control point set includes:
planning all flight paths into grids, and automatically constructing the reference map set by using the flight control system of the DJI M300;
manually operating the flight platform to navigate freely in the scene, and constructing the query image set from images captured at times and under weather conditions different from those of the reference map set, using two shooting strategies: continuous shooting at 2-second intervals and random manual shooting;
selecting 8 object edges in the reference map scene as ground control points to construct the ground control point set.
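The grid flight-path planning mentioned above can be sketched as generating a serpentine (lawnmower) pattern of waypoints over a rectangular survey area. The area bounds and spacing are illustrative assumptions, not values from the patent.

```python
# Hypothetical sketch of "planning all flight paths into grids": a
# boustrophedon grid of waypoints over a rectangular survey area.
def grid_waypoints(x_min, x_max, y_min, y_max, spacing):
    """Return a serpentine list of (x, y) waypoints covering the area."""
    waypoints = []
    y = y_min
    row = 0
    while y <= y_max:
        # Alternate sweep direction each row to minimize turning overhead.
        xs = [x_min, x_max] if row % 2 == 0 else [x_max, x_min]
        waypoints.extend((x, y) for x in xs)   # fly one full row
        y += spacing
        row += 1
    return waypoints

path = grid_waypoints(0, 100, 0, 30, 10)       # 4 rows, 2 endpoints each
```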
In one embodiment, the rendering synthesis module 204 is further configured to calculate the initial pose of the query image using prior information acquired by the device sensor, including:
calculating the initial pose of the query image from the prior information acquired by the device sensor as

\(T = \begin{bmatrix} R & t \\ 0^{T} & 1 \end{bmatrix}\)

wherein \(R\) is the rotation matrix representing the orientation of the query image, \(t\) represents the position coordinate of the query image in three-dimensional space, and \(0^{T}\) denotes the zero row completing the homogeneous matrix.
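Assembling this homogeneous pose matrix from sensor priors can be sketched as follows. The yaw-only rotation (e.g., from a compass heading) and the GNSS-style position are simplifying illustrative assumptions.

```python
# Hypothetical sketch of assembling the initial pose T = [[R, t], [0^T, 1]]
# from sensor priors: a yaw angle and a position vector (both illustrative).
import numpy as np

def initial_pose(yaw_rad, position):
    """Build the 4x4 homogeneous pose matrix from orientation and position."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])            # rotation about the z-axis
    T = np.eye(4)
    T[:3, :3] = R                              # orientation block R
    T[:3, 3] = position                        # translation t
    return T                                   # bottom row stays [0, 0, 0, 1]

T0 = initial_pose(0.0, np.array([10.0, 20.0, 100.0]))
```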
In one embodiment, the feature point matching and pose solving module 206 is further configured to perform feature point matching on the query image and the reference image according to a deep learning algorithm to construct a depth-based 2D-3D correspondence, including:
back-projecting the depth of a matching point \(p_r\) in the reference image into the reference map to obtain the coordinate \(P_{3D}\) of the corresponding 3D point in the world coordinate system;

pairing the coordinate \(P_{3D}\) with the corresponding matching point \(p_q\) in the query image, thereby completing the 2D-3D correspondence.
In one embodiment, the feature point matching and pose solving module 206 is further configured to back-project the depth of the matching point in the reference image into the reference map to obtain coordinates of the corresponding 3D point in the world coordinate system, including:
back-projecting the depth of the matching point in the reference image into the reference map, the coordinates of the corresponding 3D point in the world coordinate system are obtained as

\(\tilde{P}_{3D} = T_r \begin{bmatrix} d\,K^{-1}\tilde{p}_r \\ 1 \end{bmatrix}\)

wherein \(T_r\) is the above-described reference pose, \(\tilde{P}_{3D}\) is the homogeneous-coordinate form of the three-dimensional point \(P_{3D}\), \(\tilde{p}_r\) is the homogeneous pixel coordinate of the matching point \(p_r\), \(d\) is the depth value corresponding to the matching point in the reference image, and \(K\) is the camera intrinsic matrix.
In one embodiment, the feature point matching and pose solving module 206 is further configured to solve, by using an n-point perspective algorithm, the motion between the three-dimensional point set and the two-dimensional point set having the 2D-3D correspondence in the query image to obtain the final pose, including:
solving the motion between the three-dimensional point set and the two-dimensional point set having the 2D-3D correspondence in the query image by using an n-point perspective algorithm to obtain the probability density function

\(p(X \mid y) \propto \exp\!\left(-\sum_{i=1}^{n} f_i(y)^2\right)\)

where \(p(X \mid y)\) represents the probability density of the 3D point cloud \(X\) projected onto the image plane given the camera pose \(y\), \(f_i(y)\) represents the error between the projected position of the \(i\)-th feature point on the image plane and the coordinates of the corresponding 3D point in the camera coordinate system, and \(\sum_{i=1}^{n} f_i(y)^2\) is the sum of squared errors over all feature points;
and minimizing the sum of squared errors \(\sum_{i=1}^{n} f_i(y)^2\) (equivalently, maximizing the probability density function) to obtain the final pose.
For specific limitations of the view-synthesis-based aerial thermal infrared image positioning apparatus, reference may be made to the limitations of the view-synthesis-based aerial thermal infrared image positioning method above, which are not repeated here. The modules in the above apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in hardware in, or independent of, a processor of the computer device, or stored in software form in a memory of the computer device, so that the processor may call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and whose internal structure may be as shown in fig. 3. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements the view-synthesis-based aviation thermal infrared image localization method. The display screen of the computer device may be a liquid crystal display or an electronic ink display; the input device may be a touch layer covering the display screen, keys, a trackball, or a touchpad provided on the housing of the computer device, or an external keyboard, touchpad, mouse, or the like.
It will be appreciated by those skilled in the art that the structure shown in fig. 3 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-volatile computer-readable storage medium; when executed, the program may include the steps of the method embodiments described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), and direct Rambus dynamic RAM (DRDRAM), among others.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination contains no contradiction, it should be considered to be within the scope of this description.
The above embodiments merely represent several implementations of the present application; their description is relatively specific and detailed, but is not to be construed as limiting the scope of the invention. It should be noted that those of ordinary skill in the art could make various modifications and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Accordingly, the protection scope of the present application shall be determined by the appended claims.
Claims (9)
1. An aviation thermal infrared image positioning method based on view synthesis, which is characterized by comprising the following steps:
constructing a query image set, a reference map set and a ground control point set; the query image set comprises a plurality of query images;
calculating the initial pose of the query image by using prior information acquired by the equipment sensor, and performing rendering synthesis on a pre-constructed three-dimensional reference model according to the initial pose and rendering software to obtain a reference image and a depth image of the reference image;
and carrying out feature point matching on the query image and the reference image according to a deep learning algorithm to construct a depth-based 2D-3D corresponding relation, and solving the motion between a three-dimensional point set and a two-dimensional point set with the 2D-3D corresponding relation in the query image by utilizing an n-point perspective algorithm to obtain a final pose.
2. The method of claim 1, wherein constructing the set of query images, the set of reference maps, and the set of ground control points comprises:
planning all flight paths into grids, and automatically constructing the reference map set by using the flight control system of the DJI M300;
manually operating the flight platform to navigate freely in the scene, and constructing the query image set from images captured at times and under weather conditions different from those of the reference map set, using two shooting strategies: continuous shooting at 2-second intervals and random manual shooting;
selecting 8 object edges in the reference map scene as ground control points to construct the ground control point set.
3. The method of claim 1, wherein calculating the initial pose of the query image using prior information acquired by a device sensor comprises:
calculating the initial pose of the query image from the prior information acquired by the device sensor as

\(T = \begin{bmatrix} R & t \\ 0^{T} & 1 \end{bmatrix}\)

wherein \(R\) is the rotation matrix representing the orientation of the query image, \(t\) represents the position coordinate of the query image in three-dimensional space, and \(0^{T}\) denotes the zero row completing the homogeneous matrix.
4. A method according to any one of claims 1 to 3, wherein the pre-built three-dimensional reference model comprises a visible model, a thermal model, a super-resolution thermal model and a geometrically refined thermal model; the process of constructing the three-dimensional reference model comprises the following steps:
acquiring visible-light and thermal images, performing super-resolution on the thermal images, and fusing them with the visible-light images to obtain the visible model, the thermal model, the super-resolution thermal model, and the geometrically refined thermal model.
5. The method of claim 1, wherein performing feature point matching on the query image and the reference image according to a deep learning algorithm to construct a depth-based 2D-3D correspondence, comprises:
back-projecting the depth of the matching point \(p_r\) in the reference image into the reference map to obtain the coordinate \(P_{3D}\) of the corresponding 3D point in the world coordinate system;

pairing the coordinate \(P_{3D}\) with the corresponding matching point \(p_q\) in the query image, thereby completing the 2D-3D correspondence.
6. The method of claim 5, wherein back projecting the depth of the matching point in the reference image into the reference map, to obtain coordinates of the corresponding 3D point in a world coordinate system, comprises:
back-projecting the depth of the matching point in the reference image into the reference map, the coordinates of the corresponding 3D point in the world coordinate system are obtained as

\(\tilde{P}_{3D} = T_r \begin{bmatrix} d\,K^{-1}\tilde{p}_r \\ 1 \end{bmatrix}\)

wherein \(T_r\) is the above-described reference pose, \(\tilde{P}_{3D}\) is the homogeneous-coordinate form of the three-dimensional point \(P_{3D}\), \(\tilde{p}_r\) is the homogeneous pixel coordinate of the matching point \(p_r\), \(d\) is the depth value corresponding to the matching point in the reference image, and \(K\) is the camera intrinsic matrix.
7. The method of claim 1, wherein solving the motion between the three-dimensional point set and the two-dimensional point set having the 2D-3D correspondence in the query image using an n-point perspective algorithm to obtain a final pose comprises:
solving the motion between the three-dimensional point set and the two-dimensional point set having the 2D-3D correspondence in the query image by using an n-point perspective algorithm to obtain the probability density function

\(p(X \mid y) \propto \exp\!\left(-\sum_{i=1}^{n} f_i(y)^2\right)\)

where \(p(X \mid y)\) represents the probability density of the 3D point cloud \(X\) projected onto the image plane given the camera pose \(y\), \(f_i(y)\) represents the error between the projected position of the \(i\)-th feature point on the image plane and the coordinates of the corresponding 3D point in the camera coordinate system, and \(\sum_{i=1}^{n} f_i(y)^2\) is the sum of squared errors over all feature points;
and minimizing the sum of squared errors \(\sum_{i=1}^{n} f_i(y)^2\) (equivalently, maximizing the probability density function) to obtain the final pose.
8. An aviation thermal infrared image positioning device based on view synthesis, characterized in that the device comprises:
the data set construction module is used for constructing a query image set, a reference map set and a ground control point set; the query image set comprises a plurality of query images;
the rendering synthesis module is used for calculating the initial pose of the query image by using prior information acquired by the equipment sensor, and performing rendering synthesis on a three-dimensional reference model constructed in advance according to the initial pose and rendering software to obtain a reference image and a depth image of the reference image;
and the feature point matching and pose solving module is used for carrying out feature point matching on the query image and the reference image according to a deep learning algorithm to construct a depth-based 2D-3D corresponding relation, and solving the motion between a three-dimensional point set and a two-dimensional point set with the 2D-3D corresponding relation in the query image by utilizing an n-point perspective algorithm to obtain the final pose.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310720630.5A CN117252921A (en) | 2023-06-16 | 2023-06-16 | Aviation thermal infrared image positioning method, device and equipment based on view synthesis |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117252921A true CN117252921A (en) | 2023-12-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||