CN115731277A

CN115731277A - Image alignment method and device, storage medium and electronic equipment

Info

Publication number: CN115731277A
Application number: CN202110989389.7A
Authority: CN
Inventors: 叶培楚; 刘鹏
Original assignee: Guangzhou Xaircraft Technology Co Ltd
Current assignee: Guangzhou Xaircraft Technology Co Ltd
Priority date: 2021-08-26
Filing date: 2021-08-26
Publication date: 2023-03-03

Abstract

The application provides an image alignment method and device, a storage medium and electronic equipment, and relates to the technical field of image processing. The image alignment method comprises the following steps: determining, based on depth data corresponding to terrain data of a target scene, a camera depth map corresponding to each of a plurality of cameras in an aircraft; and projecting, based on the camera depth map corresponding to each of the plurality of cameras, the images acquired by the plurality of cameras onto the image plane of a virtual camera corresponding to the plurality of cameras, so as to determine the aligned images corresponding to the images acquired by each camera. Because the method incorporates the depth data corresponding to the terrain data of the target scene, it significantly improves alignment accuracy and widens the range of application compared with multi-camera alignment methods based on the height-consistency assumption, and it provides a precondition for subsequent analysis of farmland operation information using the plurality of cameras.

Description

Image alignment method and device, storage medium and electronic equipment

Technical Field

The present application relates to the field of image processing technologies, and in particular, to an image alignment method and apparatus, a storage medium, and an electronic device.

Background

In agricultural scenes, farmland operation information such as crop growth and plant diseases and insect pests is generally analyzed using images acquired by a plurality of cameras on an aircraft, and this analysis presupposes that the images acquired by the cameras are aligned. Existing multi-camera image alignment schemes generally assume that the terrain relief is small enough to be approximated as a plane, that is, that the heights of the cameras relative to the ground are fixed and uniform. When the terrain is not flat, this assumption causes large alignment errors, so the existing schemes cannot be applied to many practical situations.

Disclosure of Invention

The present application is proposed to solve the above-mentioned technical problems. The embodiment of the application provides an image alignment method and device, a storage medium and electronic equipment.

In a first aspect, an embodiment of the present application provides an image alignment method, including: determining a camera depth map corresponding to each of a plurality of cameras in the aircraft based on depth data corresponding to terrain data of the target scene; and projecting the images acquired by the cameras to the image planes of the virtual cameras corresponding to the cameras based on the camera depth maps corresponding to the cameras, and determining the aligned images corresponding to the images acquired by the cameras.

With reference to the first aspect, in certain implementations of the first aspect, determining a camera depth map corresponding to each of a plurality of cameras in the aircraft based on depth data corresponding to terrain data of the target scene includes: determining depth information of image planes corresponding to the cameras in the terrain data based on the depth data and the relative transformation relation between the camera coordinate system corresponding to each of the cameras and the world coordinate system; and determining a camera depth map corresponding to each of the plurality of cameras based on the depth information of the image plane corresponding to each of the plurality of cameras in the terrain data.

With reference to the first aspect, in certain implementations of the first aspect, determining depth information of image planes corresponding to the multiple cameras in the terrain data based on the depth data and a relative transformation relationship between a camera coordinate system corresponding to each of the multiple cameras and a world coordinate system includes: determining a camera center coordinate corresponding to each of the plurality of cameras based on the depth data and a relative transformation relation between a camera coordinate system corresponding to each of the plurality of cameras and a world coordinate system; determining depth information of camera center coordinates corresponding to the plurality of cameras in the terrain data based on the camera center coordinates corresponding to the plurality of cameras and the depth data; and determining the depth information of the image plane corresponding to each of the plurality of cameras in the topographic data based on the depth information of the camera center coordinate corresponding to each of the plurality of cameras in the topographic data.

With reference to the first aspect, in certain implementations of the first aspect, determining depth information of image planes corresponding to the plurality of cameras in the terrain data based on depth information of camera center coordinates corresponding to the plurality of cameras in the terrain data includes: for each camera of the multiple cameras, determining depth information corresponding to each camera coordinate in the topographic data based on depth information and coordinate offset information corresponding to a camera center coordinate in the topographic data, wherein the coordinate offset information comprises offsets of other camera coordinates corresponding to each camera relative to the camera center coordinate; and determining depth information corresponding to the image plane of each camera in the terrain data based on the depth information corresponding to the coordinates of each camera in the terrain data.

With reference to the first aspect, in certain implementations of the first aspect, before determining the camera depth map corresponding to each of the plurality of cameras in the aircraft based on the depth data corresponding to the terrain data of the target scene, the method further includes: determining three-dimensional reconstruction information corresponding to a target scene; and generating terrain data of the target scene based on the three-dimensional reconstruction information.

With reference to the first aspect, in certain implementations of the first aspect, determining a camera depth map corresponding to each of a plurality of cameras in the aircraft based on depth data corresponding to terrain data of the target scene includes: determining depth data corresponding to terrain data of a target scene based on depth data acquired by a depth sensor in an aircraft; generating a depth map corresponding to a depth sensor based on depth data corresponding to topographic data of a target scene; and determining a camera depth map corresponding to each of the plurality of cameras based on the depth map corresponding to the depth sensor, the relative transformation relation between the depth sensor and the plurality of cameras, and the intrinsic parameter information corresponding to each of the plurality of cameras.

With reference to the first aspect, in certain implementations of the first aspect, projecting images acquired by each of the plurality of cameras onto an image plane of a virtual camera corresponding to the plurality of cameras based on a camera depth map corresponding to each of the plurality of cameras, and determining an aligned image corresponding to the images acquired by each of the plurality of cameras includes: determining a virtual viewpoint of a virtual camera corresponding to the plurality of cameras based on the respective position information of the plurality of cameras; determining a virtual relative transformation relation corresponding to each of the plurality of cameras based on a camera relative transformation relation between the virtual viewpoint and the plurality of cameras, wherein the virtual relative transformation relation is a relative transformation relation between each camera and the virtual viewpoint; based on a camera depth map corresponding to each of the plurality of cameras, the virtual relative transformation relation, the camera intrinsic parameter information corresponding to the virtual camera, and the camera intrinsic parameter information corresponding to each of the plurality of cameras, projecting coordinate points of an image coordinate system corresponding to images acquired by each of the plurality of cameras onto an image plane of the virtual camera, and determining an alignment image.

In a second aspect, an embodiment of the present application provides an image alignment apparatus, including: the first determining module is used for determining a camera depth map corresponding to each of a plurality of cameras in the aircraft based on depth data corresponding to topographic data of the target scene; and the second determining module is used for projecting the images acquired by the cameras to the image planes of the virtual cameras corresponding to the cameras based on the camera depth maps corresponding to the cameras, and determining the aligned images corresponding to the images acquired by the cameras.

In a third aspect, an embodiment of the present application provides a computer-readable storage medium, where the storage medium stores a computer program, and the computer program is configured to execute the method mentioned in the first aspect.

In a fourth aspect, an embodiment of the present application provides an electronic device, including: a processor; a memory for storing processor-executable instructions; the processor is configured to perform the method mentioned in the first aspect above.

According to the image alignment method and device, the storage medium and the electronic equipment provided by the embodiments of the present application, the camera depth map corresponding to each of the plurality of cameras in the aircraft is determined based on the depth data corresponding to the terrain data of the target scene; then, based on the camera depth maps, the images acquired by the plurality of cameras are projected onto the image plane of the virtual camera corresponding to the plurality of cameras, thereby determining the aligned images corresponding to the images acquired by each camera. Because the method incorporates the depth data corresponding to the terrain data of the target scene, it significantly improves alignment accuracy and widens the range of application compared with multi-camera alignment methods based on the height-consistency assumption, and it provides a precondition for subsequent analysis of farmland operation information using the plurality of cameras.

Drawings

The above and other objects, features and advantages of the present application will become more apparent by describing in more detail embodiments of the present application with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the principles of the application. In the drawings, like reference numbers generally represent like parts or steps.

Fig. 1 is a schematic view of a scenario applicable to the embodiment of the present application.

Fig. 2 is a schematic view of another scenario applicable to the embodiment of the present application.

Fig. 3 is a flowchart illustrating an image alignment method according to an exemplary embodiment of the present application.

Fig. 4 is a flowchart illustrating an image alignment method according to another exemplary embodiment of the present application.

Fig. 5 is a schematic flowchart illustrating a process of determining depth information of an image plane corresponding to each of a plurality of cameras in topographic data according to an exemplary embodiment of the present disclosure.

Fig. 6 is a schematic flowchart illustrating a process of determining depth information of image planes corresponding to multiple cameras in topographic data according to another exemplary embodiment of the present application.

Fig. 7 is a flowchart illustrating an image alignment method according to another exemplary embodiment of the present application.

Fig. 8 is a schematic flowchart illustrating a process of determining a camera depth map corresponding to each of a plurality of cameras in an aircraft according to an exemplary embodiment of the present application.

Fig. 9 is a schematic diagram illustrating a relative transformation structure of a depth sensor and a multispectral camera according to an exemplary embodiment of the present disclosure.

Fig. 10 is a schematic flowchart illustrating a process of determining an alignment image corresponding to an image acquired by each of a plurality of cameras according to an exemplary embodiment of the present application.

Fig. 11 is a schematic structural diagram of an image alignment apparatus according to an exemplary embodiment of the present application.

Fig. 12 is a schematic structural diagram of a first determining module according to an exemplary embodiment of the present application.

Fig. 13 is a schematic structural diagram of a depth information determining unit according to an exemplary embodiment of the present application.

Fig. 14 is a schematic structural diagram of a third determining subunit according to an exemplary embodiment of the present application.

Fig. 15 is a schematic structural diagram of an image alignment apparatus according to another exemplary embodiment of the present application.

Fig. 16 is a schematic structural diagram of a first determining module according to another exemplary embodiment of the present application.

Fig. 17 is a schematic structural diagram of a second determining module according to an exemplary embodiment of the present application.

Fig. 18 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

Fig. 1 is a schematic view of a scenario applicable to the embodiment of the present application. As shown in fig. 1, a scenario to which the embodiment of the present application is applied is an aircraft scenario. Specifically, the scene includes an aircraft 2 loaded with an image capturing device 20 and a server 1 connected to the image capturing device 20.

The image acquisition device 20 comprises a plurality of cameras, which are used for acquiring images corresponding to a target scene. The server 1 is used for determining a camera depth map corresponding to each of the plurality of cameras in the aircraft based on depth data corresponding to terrain data of the target scene, and then projecting the images acquired by each of the plurality of cameras onto the image plane of a virtual camera corresponding to the plurality of cameras based on the camera depth maps, so as to determine the aligned image corresponding to the image acquired by each camera. That is, the scene implements an image alignment method. A plurality of aircraft can share one server: the server can receive data uploaded by different aircraft, and updating the server updates the processing for all of the aircraft, which saves resources.

It should be noted that the present application is also applicable to another scenario. Fig. 2 is a schematic view of another scenario applicable to the embodiment of the present application. Specifically, the scene includes an aircraft 2, where the aircraft 2 includes an image acquisition module 201 and a calculation module 202, and a communication connection relationship exists between the image acquisition module 201 and the calculation module 202.

Specifically, the image capturing module 201 in the aircraft 2 includes a plurality of cameras, the plurality of cameras are configured to capture images corresponding to the target scene, and the calculating module 202 in the aircraft 2 is configured to determine a camera depth map corresponding to each of the plurality of cameras in the aircraft based on depth data corresponding to the terrain data of the target scene, and then project the images captured by each of the plurality of cameras onto image planes of virtual cameras corresponding to the plurality of cameras based on the camera depth maps corresponding to each of the plurality of cameras, so as to determine aligned images corresponding to the images captured by each of the plurality of cameras. That is, the scene implements an image alignment method. Compared with the scene shown in fig. 1, the scene does not need to perform data transmission operation with a related device such as a server, and therefore the scene can ensure the real-time performance of the image alignment method.

Exemplary method

Fig. 3 is a flowchart illustrating an image alignment method according to an exemplary embodiment of the present application. As shown in fig. 3, an image alignment method provided in an embodiment of the present application includes the following steps.

Step 100, determining a camera depth map corresponding to each of a plurality of cameras in the aircraft based on depth data corresponding to topographic data of the target scene.

Illustratively, the plurality of cameras mentioned in step 100 may be binocular cameras or multi-view cameras on board the aircraft, or may be multispectral cameras on board the aircraft. The aircraft may be an unmanned aerial vehicle or other flight equipment, which is not specifically limited in the present application.

For example, the camera depth map may be determined from depth values of pixels on the image plane to which the camera corresponds.

The topographical data is illustratively used to characterize topographical surface relief conditions in the target scene, i.e., data having elevation information.

Step 200, projecting images acquired by the plurality of cameras to image planes of virtual cameras corresponding to the plurality of cameras based on camera depth maps corresponding to the plurality of cameras, and determining alignment images corresponding to the images acquired by the plurality of cameras.

Illustratively, the images acquired by each of the plurality of cameras in step 200 include, but are not limited to, two frames of the target scene captured simultaneously by a binocular camera on board the aircraft, multiple frames of the target scene captured simultaneously by a multi-view camera on board the aircraft, or multiple frames of the target scene in different spectral bands captured simultaneously by a multispectral camera on board the aircraft. This is not particularly limited in the embodiments of the present application.

Specifically, a virtual camera is constructed by adopting a virtual photography method, and images acquired by the cameras are re-projected into an ideal virtual image by utilizing the relative position relation among the cameras, so that the images of the cameras are aligned.
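The re-projection described above can be illustrated with a minimal sketch. It assumes an undistorted pinhole model and a known rigid transform from the source camera to the virtual camera; the function name `reproject_pixel`, the intrinsics layout `(fx, fy, cx, cy)` and the numeric values are hypothetical and are not taken from the application.

```python
def reproject_pixel(u, v, depth, k_src, k_virt, rot, trans):
    """Map pixel (u, v) of a source camera into a virtual camera.

    Assumptions (not from the application): pinhole cameras without lens
    distortion; k_src and k_virt are (fx, fy, cx, cy) tuples; rot is a 3x3
    rotation (nested lists) and trans a 3-vector that take source-camera
    coordinates to virtual-camera coordinates; depth is the value stored
    in the source camera depth map at (u, v).
    """
    fx, fy, cx, cy = k_src
    # Back-project the pixel to a 3D point in the source camera frame.
    point = ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)
    # Move the point into the virtual camera frame.
    q = [sum(rot[i][j] * point[j] for j in range(3)) + trans[i] for i in range(3)]
    # Project with the virtual camera intrinsics.
    fxv, fyv, cxv, cyv = k_virt
    return (fxv * q[0] / q[2] + cxv, fyv * q[1] / q[2] + cyv)


# Hypothetical set-up: identical intrinsics, no rotation, 0.1 m offset along x.
K = (500.0, 500.0, 320.0, 240.0)
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
near = reproject_pixel(320, 240, 10.0, K, K, IDENTITY, (0.1, 0.0, 0.0))
far = reproject_pixel(320, 240, 20.0, K, K, IDENTITY, (0.1, 0.0, 0.0))
```

In this set-up the same pixel shifts by about 5 pixels when its depth is 10 and by about 2.5 pixels when its depth is 20, which is why a single assumed height cannot align the images over uneven terrain.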

According to the image alignment method provided by the embodiment of the application, the camera depth map corresponding to each of the plurality of cameras in the aircraft is determined based on the depth data corresponding to the terrain data of the target scene, and the images acquired by the plurality of cameras are then projected onto the image plane of the virtual camera corresponding to the plurality of cameras based on the camera depth maps, thereby determining the aligned images corresponding to the images acquired by each camera. Because the images of the cameras are aligned using known depth data corresponding to the terrain data of the target scene, the large alignment errors caused by assuming a consistent height are avoided, and the image alignment accuracy is effectively improved. In addition, the embodiment of the application has the advantage of a wide application range.

In particular, the image alignment method provided by the embodiment of the application can be applied to agricultural unmanned aerial vehicle operation in farmland scenes, where the flying height of the unmanned aerial vehicle is usually low, and especially to low-altitude scenes and non-flat operation scenes such as mountain land. Compared with multispectral camera alignment methods based on the height-consistency assumption, the alignment accuracy is significantly improved, which provides a precondition for subsequently applying the multispectral camera to the analysis of farmland operation information.

It can be appreciated that the multispectral camera is used to provide multiband spectral data for agricultural remote sensing. The multispectral camera comprises a plurality of independent imagers, each provided with a specific optical filter so that each imager receives light in a different wavelength range. A farmland scene is photographed by the multispectral camera to obtain farmland images in different spectral bands such as red, green, blue, red edge and near infrared, and the spectral images collected by the cameras of the different spectral bands are aligned to obtain the aligned images of the multispectral camera.

Fig. 4 is a schematic flowchart illustrating an image alignment method according to another exemplary embodiment of the present application. The embodiment shown in fig. 4 of the present application is extended based on the embodiment shown in fig. 3 of the present application, and the differences between the embodiment shown in fig. 4 and the embodiment shown in fig. 3 are emphasized below, and the descriptions of the same parts are omitted.

As shown in fig. 4, in the image alignment method provided in the embodiment of the present application, a camera depth map corresponding to each of a plurality of cameras in an aircraft is determined based on depth data corresponding to terrain data of a target scene (step 100), which includes the following steps.

Step 101, determining depth information of image planes corresponding to the plurality of cameras in terrain data based on the depth data and the relative transformation relation between the camera coordinate system corresponding to the plurality of cameras and the world coordinate system.

Specifically, a real-time kinematic (RTK) module on board the unmanned aerial vehicle, referred to below as the real-time differential positioning module, is used to obtain the spatial position of the target scene in the world coordinate system, that is, the world coordinates and elevation of the spatial position corresponding to the target scene. Once the positions of the plurality of cameras and of the real-time differential positioning module are determined, the relative transformation relation between each of the plurality of cameras and the real-time differential positioning module can be determined.

In an embodiment, the plurality of cameras and the real-time differential positioning module are fixed to the unmanned aerial vehicle during installation. Because they are rigidly connected, the positional relation between the plurality of cameras and the real-time differential positioning module is fixed, and the relative transformation relation between each of the plurality of cameras and the real-time differential positioning module can be determined by an extrinsic parameter calibration method.

For example, given the known positional relation between camera C₁ of the plurality of cameras and the real-time differential positioning module, the relative transformation relation between camera C₁ and the real-time differential positioning module is determined by an extrinsic parameter calibration method. This is the relative transformation relation between the coordinate system of camera C₁ and the world coordinate system. Similarly, the relative transformation relations between the other cameras of the plurality of cameras and the real-time differential positioning module can be obtained.

Step 102, determining a camera depth map corresponding to each of the plurality of cameras based on depth information of an image plane corresponding to each of the plurality of cameras in the terrain data.

Specifically, the point coordinates on the image plane corresponding to each of the plurality of cameras are planar two-dimensional coordinates. The terrain data may be regarded as a map containing three-dimensional coordinates, in which each pixel records the elevation, longitude and latitude of the terrain. When the coordinate points on the image plane of a camera are mapped into the terrain data, each coordinate point corresponds to one piece of elevation information. By mapping the image planes corresponding to the cameras into the terrain data and querying depth values according to the longitude and latitude information of the terrain data, the depth information corresponding to the point coordinates on each image plane can be determined, and the camera depth map corresponding to each of the plurality of cameras is thereby determined.
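As a rough illustration of such a depth value query, the sketch below looks up the elevation of a regular terrain grid at a given position and converts it to a depth below the camera. It makes several assumptions that the application does not state: the terrain data is a regular grid indexed in a local planar coordinate system rather than directly by longitude and latitude, the nearest cell is used, and depth is taken as camera altitude minus terrain elevation. The names `lookup_elevation` and `depth_below_camera` are hypothetical.

```python
def lookup_elevation(dem, origin, cell_size, x, y):
    """Nearest-cell elevation lookup in a regular terrain grid.

    dem: list of rows of elevation values (assumed layout).
    origin: (x0, y0) planar coordinate of dem[0][0] (assumed).
    cell_size: grid spacing in the same units as x and y.
    """
    col = int(round((x - origin[0]) / cell_size))
    row = int(round((y - origin[1]) / cell_size))
    # Clamp to the grid so positions on the border still return a value.
    row = min(max(row, 0), len(dem) - 1)
    col = min(max(col, 0), len(dem[0]) - 1)
    return dem[row][col]


def depth_below_camera(dem, origin, cell_size, cam_xyz):
    """Depth under a camera, assumed here to be altitude minus elevation."""
    x, y, z = cam_xyz
    return z - lookup_elevation(dem, origin, cell_size, x, y)


# Hypothetical 2 x 2 terrain grid with 1-unit cells.
DEM = [[10.0, 12.0],
       [14.0, 20.0]]
elevation = lookup_elevation(DEM, (0.0, 0.0), 1.0, 1.2, 0.9)
depth = depth_below_camera(DEM, (0.0, 0.0), 1.0, (1.2, 0.9, 50.0))
```

A production system would more likely interpolate between cells and convert longitude and latitude to grid indices with a proper geographic transform; nearest-cell lookup is used here only to keep the sketch short.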

According to the image alignment method provided by the embodiment of the application, the depth information of the image plane corresponding to each of the plurality of cameras in the topographic data is determined based on the depth data and the relative transformation relation between the camera coordinate system corresponding to each of the plurality of cameras and the world coordinate system, and the purpose of determining the camera depth map corresponding to each of the plurality of cameras is achieved based on the depth information of the image plane corresponding to each of the plurality of cameras in the topographic data, so that a precondition is provided for subsequent image alignment.

Fig. 5 is a schematic flowchart illustrating a process of determining depth information of image planes corresponding to multiple cameras in topographic data according to an exemplary embodiment of the present application. The embodiment shown in fig. 5 of the present application is extended based on the embodiment shown in fig. 4 of the present application, and the differences between the embodiment shown in fig. 5 and the embodiment shown in fig. 4 are mainly described below, and the description of the same parts is omitted.

As shown in fig. 5, in the image alignment method provided in the embodiment of the present application, based on the depth data and the relative transformation relationship between the camera coordinate system and the world coordinate system corresponding to each of the plurality of cameras, depth information of an image plane corresponding to each of the plurality of cameras in the terrain data is determined (step 101), which includes the following steps.

Step 1011, determining the camera center coordinates corresponding to the plurality of cameras respectively based on the depth data and the relative transformation relation between the camera coordinate system corresponding to the plurality of cameras respectively and the world coordinate system.

Step 1012, determining depth information of the camera center coordinates corresponding to each of the plurality of cameras in the terrain data based on the camera center coordinates corresponding to each of the plurality of cameras and the depth data.

Step 1013, determining depth information of an image plane corresponding to each of the plurality of cameras in the terrain data based on the depth information of the camera center coordinates corresponding to each of the plurality of cameras in the terrain data.

For example, assume that the spatial position corresponding to the positioning information collected by the real-time differential positioning module is p_G = (x_G, y_G, z_G). Based on the calibrated relative transformation relation between camera C₁ and the real-time differential positioning module, the camera coordinate center of camera C₁ among the plurality of cameras is determined by applying this transformation to p_G.

The camera coordinate center corresponding to camera C₁ is then mapped into the terrain data of the target scene and a depth value lookup is performed, which determines the depth information of the camera coordinate center. After the depth information of the camera coordinate center is determined, the depth information of the image plane of camera C₁ in the terrain data is further determined, from which the camera depth map of camera C₁ is derived.

The camera depth maps of the remaining cameras can be obtained in the same way.
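The computation of the camera coordinate center of camera C₁ from p_G can be sketched as follows. The sketch assumes that the calibrated relative transformation relation is expressed as a 4 × 4 homogeneous matrix that maps the position of the real-time differential positioning module to the camera center in the same coordinate frame; the name `T_C1_G`, the helper `apply_rigid` and all numeric values are hypothetical.

```python
def apply_rigid(transform, point):
    """Apply a 4x4 homogeneous rigid transform (nested lists) to a 3D point."""
    return tuple(
        sum(transform[i][j] * point[j] for j in range(3)) + transform[i][3]
        for i in range(3)
    )


# Hypothetical calibration result: no rotation, camera C1 mounted 0.05 m
# along x and 0.10 m below the positioning module.
T_C1_G = [
    [1.0, 0.0, 0.0, 0.05],
    [0.0, 1.0, 0.0, 0.00],
    [0.0, 0.0, 1.0, -0.10],
    [0.0, 0.0, 0.0, 1.00],
]

# Hypothetical position p_G = (x_G, y_G, z_G) reported by the module.
p_G = (100.0, 200.0, 30.0)
camera_center_c1 = apply_rigid(T_C1_G, p_G)
```

The resulting center is then the position at which the terrain data is queried for depth; repeating the call with each camera's own calibration matrix gives the centers of the remaining cameras.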

According to the image alignment method provided by the embodiment of the application, the camera center coordinates corresponding to the multiple cameras are determined based on the depth data and the relative transformation relation between the camera coordinate systems corresponding to the multiple cameras and the world coordinate system, then the depth information of the camera center coordinates corresponding to the multiple cameras in the terrain data is determined based on the camera center coordinates corresponding to the multiple cameras and the depth data, finally the purpose of determining the depth information of the image planes corresponding to the multiple cameras in the terrain data is achieved based on the depth information of the camera center coordinates corresponding to the multiple cameras in the terrain data, and the subsequent determination of the camera depth maps corresponding to the multiple cameras is facilitated.

Fig. 6 is a schematic flowchart illustrating a process of determining depth information of an image plane corresponding to each of a plurality of cameras in topographic data according to another exemplary embodiment of the present application. The embodiment shown in fig. 6 of the present application is extended based on the embodiment shown in fig. 5 of the present application, and the differences between the embodiment shown in fig. 6 and the embodiment shown in fig. 5 are emphasized below, and the descriptions of the same parts are omitted.

As shown in fig. 6, in the image alignment method provided in the embodiment of the present application, based on depth information of the camera center coordinates corresponding to each of the plurality of cameras in the topographic data, depth information of the image plane corresponding to each of the plurality of cameras in the topographic data is determined (step 1013), including the following steps. It will be appreciated that the following steps need to be performed for each of the plurality of cameras.

Step 10130, based on the depth information and coordinate offset information corresponding to the camera center coordinate in the terrain data, the depth information corresponding to each camera coordinate in the terrain data is determined. The coordinate offset information includes offsets of other camera coordinates corresponding to each camera relative to the camera center coordinate.

Step 10131, determining depth information corresponding to the image plane of each camera in the terrain data based on depth information corresponding to the respective camera coordinates corresponding to each camera in the terrain data.

Specifically, the camera center corresponds to the center position of the picture taken by the camera, so the depth information of the camera center coordinate can be determined by looking up the corresponding position in the terrain data. Once the depth information of the camera center coordinate is determined, the depth information of the other camera coordinates can be determined from the offsets between the other camera coordinates on the image plane corresponding to camera C₁ and the camera center coordinate. The depth information of all camera coordinates on the image plane corresponding to camera C₁ then gives the depth information of that image plane in the terrain data. Similarly, the depth information in the terrain data of the image planes of the other cameras in the plurality of cameras can be determined.

For example, suppose the image plane of the camera is 640 × 480 pixels, so that the pixel coordinate of the image center is (320, 240). Once the depth information of the image center in the terrain data has been determined from its longitude and latitude, the depth information of the other pixels in the terrain data can be determined from their offsets relative to the image center, which yields the depth information of the whole image on the camera image plane.

According to the image alignment method provided by the embodiment of the application, the depth information corresponding to the camera coordinate in the topographic data is determined based on the depth information and the coordinate offset information corresponding to the camera center coordinate in the topographic data, and the depth information corresponding to the image plane of each camera in the topographic data is determined based on the depth information corresponding to the camera coordinate in the topographic data, so that the purpose of determining the depth information corresponding to the image plane of each camera in the topographic data is achieved, the scene depth information corresponding to the image plane of the two-dimensional camera is recovered, and a precondition is provided for subsequently determining the camera depth map.
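The offset-based lookup of steps 10130 and 10131 can be sketched as follows. This is a minimal illustration rather than the implementation of the application: it assumes a downward-looking camera, a constant ground sampling distance `gsd_m` (metres per pixel), a camera altitude `camera_alt_m`, image axes aligned with the ground axes, and a hypothetical callable `terrain_height(x, y)` that returns the terrain elevation at a local ground position. None of these names come from the application.

```python
from typing import Callable, List, Tuple


def image_plane_depth(
    width: int,
    height: int,
    center_xy: Tuple[float, float],
    camera_alt_m: float,
    gsd_m: float,
    terrain_height: Callable[[float, float], float],
) -> List[List[float]]:
    """Return a height x width grid of depths (camera altitude minus terrain height)."""
    cu, cv = width // 2, height // 2   # image centre pixel, e.g. (320, 240)
    cx, cy = center_xy                 # ground position below the image centre
    depth = []
    for v in range(height):
        row = []
        for u in range(width):
            # Pixel offset from the image centre, converted to a ground offset.
            gx = cx + (u - cu) * gsd_m
            gy = cy + (v - cv) * gsd_m
            row.append(camera_alt_m - terrain_height(gx, gy))
        depth.append(row)
    return depth


# Hypothetical terrain: a slope that rises 0.1 m per metre along x.
def slope(x: float, y: float) -> float:
    return 0.1 * x


depth_map = image_plane_depth(640, 480, (0.0, 0.0), 30.0, 0.05, slope)
```

On this sloped terrain the depth at the image center equals the camera altitude, and it decreases toward the side where the terrain rises, which is the variation that a single-plane assumption would ignore.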

Fig. 7 is a flowchart illustrating an image alignment method according to another exemplary embodiment of the present application. The embodiment shown in fig. 7 of the present application is extended based on the embodiment shown in fig. 3 of the present application, and the differences between the embodiment shown in fig. 7 and the embodiment shown in fig. 3 are mainly described below, and the description of the same parts is omitted.

As shown in fig. 7, in the image alignment method provided in the embodiment of the present application, before determining a camera depth map corresponding to each of a plurality of cameras in an aircraft based on depth data corresponding to terrain data of a target scene (step 100), the following steps are included.

Step 80, determining three-dimensional reconstruction information corresponding to the target scene.

Illustratively, the three-dimensional reconstruction information mentioned in step 80 may be obtained by reconstructing the three-dimensional environment of the target scene using a monocular or binocular camera carried by the unmanned aerial vehicle, in combination with the positioning information acquired by a real-time differential positioning module.

Step 90, generating terrain data of the target scene based on the three-dimensional reconstruction information.

Specifically, the terrain data generated by the three-dimensional dense reconstruction result is used as a source of the depth value, and the depth value of each point in the terrain data can be queried by adopting latitude and longitude.

In one embodiment, the flight height of the working unmanned aerial vehicle in a farmland scene is low, so the photographed scene cannot be assumed to lie in a single plane. For mountainous regions, terraces and similar scenes in particular, the height-consistency assumption does not hold. The unmanned aerial vehicle carries a monocular camera and a real-time differential positioning module. While the unmanned aerial vehicle photographs the farmland scene, the real-time differential positioning module records the positioning information of each shooting position. After the images are captured, three-dimensional dense reconstruction of the farmland scene is performed from the captured images and the corresponding positioning information. The terrain data generated from the dense reconstruction result is used as the source of depth values, so that the depth value of each point in the terrain data can be queried by longitude and latitude.

According to the image alignment method provided by the embodiment of the application, the three-dimensional reconstruction information corresponding to the target scene is determined, and the terrain data of the target scene is generated based on the three-dimensional reconstruction information, so that the depth value of each point in the terrain data can be queried by longitude and latitude. With the depth data corresponding to the target scene known, the images of the cameras are aligned, and the alignment accuracy is significantly improved compared with a multi-camera alignment method based on the height-consistency assumption.

Fig. 8 is a schematic flowchart illustrating a process of determining a camera depth map corresponding to each of a plurality of cameras in an aircraft according to an exemplary embodiment of the present application. The embodiment shown in fig. 8 of the present application is extended based on the embodiment shown in fig. 3 of the present application, and the differences between the embodiment shown in fig. 8 and the embodiment shown in fig. 3 are emphasized below, and the descriptions of the same parts are omitted.

As shown in fig. 8, in the image alignment method provided in the embodiment of the present application, a camera depth map corresponding to each of a plurality of cameras in an aircraft is determined based on depth data corresponding to topographic data of a target scene (step 100), which includes the following steps.

Step 103, determining depth data corresponding to the terrain data of the target scene based on the depth data acquired by the depth sensor in the aircraft.

Illustratively, the depth sensor mentioned in step 103 may be a Kinect sensor or a Realsense sensor. The type of depth sensor is not specifically limited by the present application, as long as depth data can be collected.

Step 104, generating a depth map corresponding to the depth sensor based on the depth data corresponding to the terrain data of the target scene.

Illustratively, the depth map mentioned in step 104 contains the depth data of the target scene acquired by the depth sensor. Each pixel of the depth map corresponds to one depth value of the terrain data.

Step 105, determining a camera depth map corresponding to each of the plurality of cameras based on the depth map corresponding to the depth sensor, the relative transformation relationship between the depth sensor and the plurality of cameras, and the intrinsic parameter information corresponding to each of the plurality of cameras.

In one embodiment, the agricultural unmanned aerial vehicle carries a multispectral camera and a depth sensor for farmland operation. Assume that the multispectral camera M consists of four band cameras: camera C₁, camera C₂, camera C₃ and camera C₄. The relative positions of the multispectral camera and the depth sensor D are fixed, so the relative transformation relationships between the depth sensor D and the cameras can be determined from the calibration result.

Fig. 9 is a schematic diagram illustrating the relative transformation between a depth sensor and a multispectral camera according to an exemplary embodiment of the present application. As shown in Fig. 9, a relative transformation relationship exists between the depth sensor D and camera C₁ in the multispectral camera M. Similarly, the relative transformation relationships between the depth sensor D and cameras C₂, C₃ and C₄ can be determined.

Because the flight height of the unmanned aerial vehicle in a farmland scene is low, the photographed scene cannot be assumed to lie in a single plane. For mountainous regions, terraces and similar scenes in particular, the height-consistency assumption does not hold. Depth data corresponding to the terrain data of the target scene is therefore obtained in real time by the depth sensor D, and a depth map I_D corresponding to the depth sensor D is generated. For any pixel p_i = (u_i, v_i, z_i) of the depth map I_D, given that the intrinsic parameter matrix of the camera corresponding to I_D is K_D, the spatial point P_i^D = (x_i, y_i, z_i) in the coordinate system of the depth sensor D can be determined based on formula (1), which back-projects the pixel using K_D and the depth value z_i.

Given the relative transformation relationship between the depth sensor D and camera C₁ in the multispectral camera M, the spatial point P_i^D is transformed from the coordinate system of the depth sensor D into the coordinate system of camera C₁ based on formula (2), which yields the corresponding spatial point in the coordinate system of camera C₁.

Given the intrinsic parameter matrix of camera C₁, the spatial point in the coordinate system of camera C₁ is projected onto the image plane of camera C₁ based on formula (3), which yields the corresponding coordinate point on the image plane of camera C₁.

By projecting all pixels of the depth map I_D constructed by the depth sensor D onto the image plane of camera C₁, the camera depth map corresponding to camera C₁ can be constructed.

In the same way, by propagating the relative pose transformations, camera depth maps are constructed for the image planes of cameras C₂, C₃ and C₄ in the multispectral camera M.
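Formulas (1) to (3) describe back-projection, rigid transformation and projection. Under the standard pinhole camera model they can be written as in the sketch below. The symbols f_x, f_y, c_x, c_y, T_{C_1 D}, K_{C_1}, P_i^{C_1} and (u_i^{C_1}, v_i^{C_1}, z_i^{C_1}) are notation assumed here for illustration and may differ from the notation of the application.

```latex
% (1) Back-project pixel (u_i, v_i) with depth z_i into the depth-sensor frame
P_i^{D} = \begin{pmatrix} x_i \\ y_i \\ z_i \end{pmatrix}
        = z_i \, K_D^{-1} \begin{pmatrix} u_i \\ v_i \\ 1 \end{pmatrix},
\qquad
K_D = \begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix}

% (2) Transform the point from the depth-sensor frame to the frame of camera C_1
\begin{pmatrix} P_i^{C_1} \\ 1 \end{pmatrix}
  = T_{C_1 D} \begin{pmatrix} P_i^{D} \\ 1 \end{pmatrix}

% (3) Project the point onto the image plane of camera C_1
z_i^{C_1} \begin{pmatrix} u_i^{C_1} \\ v_i^{C_1} \\ 1 \end{pmatrix}
  = K_{C_1} \, P_i^{C_1}
```

Here T_{C_1 D} is a 4 × 4 homogeneous matrix, and z_i^{C_1}, the third component of P_i^{C_1}, is the value stored in the camera depth map of C₁ at pixel (u_i^{C_1}, v_i^{C_1}).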

The image alignment method provided by the embodiment of the application determines depth data corresponding to terrain data of a target scene based on depth data acquired by a depth sensor in an aircraft, then generates a depth map corresponding to the depth sensor based on the depth data corresponding to the terrain data of the target scene, and finally determines a camera depth map corresponding to a plurality of cameras based on the depth map corresponding to the depth sensor, a relative transformation relation between the depth sensor and the plurality of cameras and internal parameter information corresponding to the plurality of cameras, so that the subsequent image alignment of the plurality of cameras is facilitated.
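The construction of a camera depth map in steps 103 to 105 can be sketched in plain Python as follows. This is an illustrative sketch under a pinhole model without lens distortion. The function name `reproject_depth`, the `(fx, fy, cx, cy)` intrinsics tuples and the nearest-surface rule for pixels hit more than once are assumptions, not details given in the application.

```python
def reproject_depth(depth_d, K_d, T_c_d, K_c, out_w, out_h):
    """Build a camera depth map from a depth-sensor depth map.

    depth_d : rows of depths z_i (0 or None marks an invalid pixel)
    K_d, K_c: (fx, fy, cx, cy) intrinsics of the depth sensor and the camera
    T_c_d   : 4x4 nested list, depth-sensor coordinates -> camera coordinates
    """
    fx_d, fy_d, cx_d, cy_d = K_d
    fx_c, fy_c, cx_c, cy_c = K_c
    out = [[None] * out_w for _ in range(out_h)]
    for v, row in enumerate(depth_d):
        for u, z in enumerate(row):
            if not z:
                continue
            # Back-project the pixel into the depth-sensor coordinate system.
            p = ((u - cx_d) * z / fx_d, (v - cy_d) * z / fy_d, z, 1.0)
            # Apply the rigid transform into the camera coordinate system.
            x, y, zc = (sum(T_c_d[r][k] * p[k] for k in range(4)) for r in range(3))
            if zc <= 0:
                continue  # the point lies behind the camera
            # Project onto the camera image plane and round to a pixel.
            uc = int(round(fx_c * x / zc + cx_c))
            vc = int(round(fy_c * y / zc + cy_c))
            if 0 <= uc < out_w and 0 <= vc < out_h:
                # Keep the nearest surface when several points hit one pixel.
                if out[vc][uc] is None or zc < out[vc][uc]:
                    out[vc][uc] = zc
    return out
```

Pixels of the output that receive no projected point stay `None`. A practical pipeline would typically fill such holes by interpolation, which the application does not discuss.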

Fig. 10 is a schematic flowchart illustrating a process of determining an alignment image corresponding to an image acquired by each of a plurality of cameras according to an exemplary embodiment of the present application. The embodiment shown in fig. 10 of the present application is extended based on the embodiment shown in fig. 3 of the present application, and the differences between the embodiment shown in fig. 10 and the embodiment shown in fig. 3 are emphasized below, and the descriptions of the same parts are omitted.

As shown in fig. 10, in the image alignment method provided in the embodiment of the present application, images acquired by a plurality of cameras are projected onto image planes of virtual cameras corresponding to the plurality of cameras based on camera depth maps corresponding to the plurality of cameras, and an alignment image corresponding to the image acquired by each of the plurality of cameras is determined (step 200), including the following steps.

Step 201, based on the position information of each of the plurality of cameras, determining the virtual viewpoints of the virtual cameras corresponding to the plurality of cameras.

Illustratively, the plurality of cameras corresponds to one virtual camera, and the virtual camera has a virtual viewpoint. The virtual viewpoint may be located at the center position of the plurality of cameras, or at another reference point corresponding to the plurality of cameras. For example, the virtual viewpoint of the virtual camera corresponding to the multispectral camera may be located at the center of the multispectral camera. The present application does not specifically limit this.

Step 202, determining a virtual relative transformation relation corresponding to each of the plurality of cameras based on a camera relative transformation relation between the virtual viewpoint and the plurality of cameras, wherein the virtual relative transformation relation is a relative transformation relation between the cameras and the virtual viewpoint.

Specifically, the respective position information of the plurality of cameras and the camera relative transformation relationship between the plurality of cameras are known, and the relative transformation relationship between each of the plurality of cameras and the virtual viewpoint is determined based on the assumed virtual viewpoint of the virtual camera.
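Steps 201 and 202 can be sketched as follows. This is an illustration only: it assumes that each known relative transformation is a 4 × 4 rigid transform from the coordinate system of camera C₁ to that of camera k, that the virtual viewpoint is the centroid of the camera centers, and that the virtual camera keeps the orientation of camera C₁. The function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(t):
    """Invert a 4x4 rigid transform [R | p] as [R^T | -R^T p]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [-sum(r_t[i][k] * t[k][3] for k in range(3)) for i in range(3)]
    return [r_t[i] + [p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def virtual_transforms(t_from_c1):
    """Return, for each camera, the transform camera -> virtual viewpoint V.

    t_from_c1[k] maps C1 coordinates to the coordinates of camera k
    (the entry for C1 itself is the identity).
    """
    to_c1 = [rigid_inverse(t) for t in t_from_c1]           # camera k -> C1
    centres = [[t[i][3] for i in range(3)] for t in to_c1]  # camera centres in C1
    n = len(centres)
    v = [sum(c[i] for c in centres) / n for i in range(3)]  # centroid = V
    # Assumption: V keeps the orientation of C1, so C1 -> V is a pure translation.
    c1_to_v = [[1.0, 0.0, 0.0, -v[0]],
               [0.0, 1.0, 0.0, -v[1]],
               [0.0, 0.0, 1.0, -v[2]],
               [0.0, 0.0, 0.0, 1.0]]
    # Chain camera k -> C1 -> V by matrix multiplication.
    return [matmul(c1_to_v, t) for t in to_c1]
```

Because homogeneous transforms compose by multiplication, any other reference point could be used as the virtual viewpoint by changing only `c1_to_v`.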

Step 203, projecting coordinate points of an image coordinate system corresponding to images acquired by the cameras to an image plane of the virtual camera based on the camera depth map corresponding to the cameras, the virtual relative transformation relation, the camera intrinsic parameter information corresponding to the virtual camera and the camera intrinsic parameter information corresponding to the cameras, and determining an alignment image.

For example, the positions of the four cameras in the multispectral camera M are fixed, and the relative transformations between them are known: the transformation from camera C₁ to camera C₂, the transformation from camera C₁ to camera C₃, and the transformation from camera C₁ to camera C₄. Since each transformation matrix T is a homogeneous matrix, relative transformations can be chained by multiplication, which gives the relative transformations from camera C₂ to camera C₃, from camera C₂ to camera C₄, and from camera C₃ to camera C₄.

Since the relative camera transformation relationships between the four cameras are known, the virtual viewpoint V is assumed to be at the center of the multispectral camera, and the intrinsic parameter matrix of the virtual camera is set to K_V. The relative transformation from camera C₁ to the virtual viewpoint V is then determined. The relative transformations from cameras C₂, C₃ and C₄ to the virtual viewpoint V are determined in the same way.

The overlapping part of the image captured by camera C₁ is determined from the overlap of the images captured by the four cameras. Suppose any coordinate point p_i = (u_i, v_i, 1) in the overlapping part of the image captured by camera C₁ is projected onto the image plane of the virtual camera C_V. The coordinates of p_i on the image plane of the virtual camera C_V are obtained according to formula (4).

In formula (4), z_i is the depth value of the coordinate point p_i. Similarly, all points in all cameras are projected onto the image plane of the virtual camera, so that the information of the four bands is gathered in the same virtual camera.
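The chaining of relative transformations and the projection of formula (4) can be written out as in the sketch below. The notation is assumed here for illustration and may differ from that of the application: T_{C_k C_1} denotes the homogeneous transformation from camera C₁ to camera C_k, T_{V C_1} the transformation from camera C₁ to the virtual viewpoint V, K_{C_1} the intrinsic parameter matrix of camera C₁, and (u_i^V, v_i^V, z_i^V) the resulting pixel coordinates and depth in the virtual camera.

```latex
% Chaining of the known relative transformations
T_{C_3 C_2} = T_{C_3 C_1} \, T_{C_2 C_1}^{-1}, \qquad
T_{C_4 C_2} = T_{C_4 C_1} \, T_{C_2 C_1}^{-1}, \qquad
T_{C_4 C_3} = T_{C_4 C_1} \, T_{C_3 C_1}^{-1}

% (4) Projection of p_i = (u_i, v_i, 1)^T with depth z_i onto the virtual camera
z_i^{V} \begin{pmatrix} u_i^{V} \\ v_i^{V} \\ 1 \end{pmatrix}
  = K_V \begin{pmatrix} I_3 & 0 \end{pmatrix}
    T_{V C_1} \begin{pmatrix} z_i \, K_{C_1}^{-1} \, p_i \\ 1 \end{pmatrix}
```

The depth z_i is read from the camera depth map of C₁ at p_i, which is why the camera depth maps must be constructed before this projection.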

According to the image alignment method provided by the embodiment of the application, firstly, the virtual viewpoints of the virtual cameras corresponding to the cameras are determined based on the respective position information of the cameras; then, determining a virtual relative transformation relation corresponding to each of the plurality of cameras based on the camera relative transformation relation between the virtual viewpoint and the plurality of cameras; and finally, projecting coordinate points of an image coordinate system corresponding to the images acquired by the cameras to an image plane of the virtual camera based on the camera depth maps corresponding to the cameras, the virtual relative transformation relation, the camera intrinsic parameter information corresponding to the virtual camera and the camera intrinsic parameter information corresponding to the cameras, so as to determine an aligned image. The virtual photography method is adopted, the images acquired by the cameras are projected into the virtual camera by utilizing the mutual position relation among the cameras, the images acquired by the cameras are aligned, and the alignment error is controlled at a sub-pixel level.
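The projection of step 203 for a single coordinate point can be sketched as follows. The sketch assumes a pinhole model without distortion. The function name `project_to_virtual` and the `(fx, fy, cx, cy)` intrinsics tuples are hypothetical, and the numbers in the example are made up to show that the same scene point seen by two cameras lands on the same virtual pixel.

```python
def project_to_virtual(u, v, z, K_c, T_v_c, K_v):
    """Map pixel (u, v) with depth z from a camera to the virtual camera.

    K_c, K_v: (fx, fy, cx, cy) intrinsics of the camera and the virtual camera
    T_v_c   : 4x4 nested list, camera coordinates -> virtual-viewpoint coordinates
    Returns the sub-pixel coordinates (u_v, v_v) and the depth seen from V.
    """
    fx, fy, cx, cy = K_c
    # Back-project the pixel into the camera coordinate system.
    point = ((u - cx) * z / fx, (v - cy) * z / fy, z, 1.0)
    # Transform the point into the virtual-viewpoint coordinate system.
    x, y, zv = (sum(T_v_c[r][k] * point[k] for k in range(4)) for r in range(3))
    if zv <= 0:
        raise ValueError("point lies behind the virtual camera")
    # Project onto the image plane of the virtual camera.
    fxv, fyv, cxv, cyv = K_v
    return fxv * x / zv + cxv, fyv * y / zv + cyv, zv


def translation(tx, ty, tz):
    """4x4 homogeneous transform that is a pure translation."""
    return [[1.0, 0.0, 0.0, tx], [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz], [0.0, 0.0, 0.0, 1.0]]


K = (1000.0, 1000.0, 320.0, 240.0)
# Two cameras 0.05 m to either side of V observe the scene point (0.3, 0.2, 10) in V.
from_a = project_to_virtual(355.0, 260.0, 10.0, K, translation(-0.05, 0.0, 0.0), K)
from_b = project_to_virtual(345.0, 260.0, 10.0, K, translation(0.05, 0.0, 0.0), K)
```

Both calls return coordinates near (350, 260), so the two band images agree at that point once projected. With a wrong depth value the two results would diverge, which is the alignment error the depth maps are meant to remove.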

Exemplary devices

Method embodiments of the present application are described in detail above with reference to fig. 1 to 10, and apparatus embodiments of the present application are described in detail below with reference to fig. 11 to 18. It is to be understood that the description of the method embodiments corresponds to the description of the apparatus embodiments, and therefore reference may be made to the method embodiments above for parts which are not described in detail.

Fig. 11 is a schematic structural diagram of an image alignment apparatus according to an exemplary embodiment of the present application. As shown in fig. 11, the image alignment apparatus provided in the embodiment of the present application includes a first determination module 300 and a second determination module 400. The first determination module 300 is configured to determine a camera depth map corresponding to each of a plurality of cameras in the aircraft based on depth data corresponding to terrain data of the target scene. The second determining module 400 is configured to project the images captured by the plurality of cameras to the image planes of the virtual cameras corresponding to the plurality of cameras based on the camera depth maps corresponding to the plurality of cameras, and determine the alignment images corresponding to the images captured by the plurality of cameras.

Fig. 12 is a schematic structural diagram of a first determining module according to an exemplary embodiment of the present application. As shown in fig. 12, in the image alignment apparatus provided in the embodiment of the present application, the first determining module 300 includes a depth information determining unit 301 and a camera depth map determining unit 302. The depth information determining unit 301 is configured to determine depth information of an image plane corresponding to each of the plurality of cameras in the terrain data based on the depth data and a relative transformation relationship between a camera coordinate system corresponding to each of the plurality of cameras and a world coordinate system. The camera depth map determination unit 302 is configured to determine a camera depth map corresponding to each of the plurality of cameras based on depth information of an image plane corresponding to each of the plurality of cameras in the terrain data.

Fig. 13 is a schematic structural diagram of a depth information determining unit according to an exemplary embodiment of the present application. As shown in fig. 13, in the image alignment apparatus provided in the embodiment of the present application, the depth information determining unit 301 includes a first determining subunit 3011, a second determining subunit 3012, and a third determining subunit 3013. The first determination subunit 3011 is configured to determine, based on the depth data and the relative transformation relationship between the camera coordinate system and the world coordinate system corresponding to each of the plurality of cameras, camera center coordinates corresponding to each of the plurality of cameras. The second determining subunit 3012 is configured to determine depth information of the camera center coordinates corresponding to each of the plurality of cameras in the topographic data based on the camera center coordinates corresponding to each of the plurality of cameras and the depth data. The third determining subunit 3013 is configured to determine, based on depth information of camera center coordinates in the topographic data corresponding to each of the plurality of cameras, depth information of an image plane corresponding to each of the plurality of cameras in the topographic data.

Fig. 14 is a schematic structural diagram of a third determining subunit according to an exemplary embodiment of the present application. As shown in fig. 14, in the image alignment apparatus provided in the embodiment of the present application, the third determining subunit 3013 includes, for each of the plurality of cameras, a fourth determining subunit 30130 and a fifth determining subunit 30131. The fourth determining subunit 30130 is configured to determine depth information corresponding to each camera coordinate in the terrain data based on depth information and coordinate offset information corresponding to the camera center coordinate in the terrain data, where the coordinate offset information includes offsets of the other camera coordinates corresponding to each camera with respect to the camera center coordinate. The fifth determining subunit 30131 is configured to determine depth information, in the terrain data, of the image plane of each camera based on depth information, in the terrain data, of the respective camera coordinates corresponding to each camera.

Fig. 15 is a schematic structural diagram of an image alignment apparatus according to another exemplary embodiment of the present application. As shown in fig. 15, in the image alignment apparatus provided in the embodiment of the present application, the image alignment apparatus further includes a three-dimensional reconstruction information determining module 500 and a topographic data determining module 600. The three-dimensional reconstruction information determining module 500 is configured to determine three-dimensional reconstruction information corresponding to a target scene. The terrain data determination module 600 is configured to generate terrain data of the target scene based on the three-dimensional reconstruction information.

Fig. 16 is a schematic structural diagram of a first determining module according to another exemplary embodiment of the present application. As shown in fig. 16, in the image alignment apparatus provided in the embodiment of the present application, the first determining module 300 includes a depth data determining unit 303, a sixth determining unit 304, and a seventh determining unit 305. The depth data determination unit 303 is configured to determine depth data corresponding to terrain data of the target scene based on depth data acquired by a depth sensor in the aircraft. The sixth determination unit 304 is configured to generate a depth map corresponding to the depth sensor based on depth data corresponding to terrain data of the target scene. The seventh determining unit 305 is configured to determine a camera depth map corresponding to each of the plurality of cameras based on the depth map corresponding to the depth sensor, the relative transformation relationship between the depth sensor and the plurality of cameras, and the intrinsic parameter information corresponding to each of the plurality of cameras.

Fig. 17 is a schematic structural diagram of a second determining module according to an exemplary embodiment of the present application. As shown in fig. 17, in the image alignment apparatus provided in the embodiment of the present application, the second determining module 400 includes a virtual viewpoint determining unit 401, a virtual relative transformation relation determining unit 402, and an aligned image determining unit 403. The virtual viewpoint determining unit 401 is configured to determine a virtual viewpoint of a virtual camera corresponding to the plurality of cameras based on the position information of each of the plurality of cameras. The virtual relative transformation relation determining unit 402 is configured to determine a virtual relative transformation relation corresponding to each of the plurality of cameras based on a camera relative transformation relation between the virtual viewpoint and the plurality of cameras, where the virtual relative transformation relation is a relative transformation relation between each camera and the virtual viewpoint. The aligned image determining unit 403 is configured to project coordinate points of an image coordinate system corresponding to images acquired by the plurality of cameras onto an image plane of the virtual camera based on the camera depth map corresponding to each of the plurality of cameras, the virtual relative transformation relation, the in-camera parameter information corresponding to the virtual camera, and the in-camera parameter information corresponding to each of the plurality of cameras, and determine an aligned image.

Exemplary electronic device

Next, an electronic apparatus according to an embodiment of the present application is described with reference to fig. 18. Fig. 18 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application.

As shown in fig. 18, the electronic device 70 includes one or more processors 701 and a memory 702.

The processor 701 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 70 to perform desired functions.

The memory 702 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read-Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 701 to implement the image alignment methods of the various embodiments of the present application described above and/or other desired functions. Various contents, such as the images respectively captured by the plurality of cameras, may also be stored in the computer-readable storage medium.

In one example, the electronic device 70 may further include: an input device 703 and an output device 704, which are interconnected by a bus system and/or other form of connection mechanism (not shown).

The input device 703 may include, for example, a keyboard, a mouse, and the like.

The output device 704 may output various information including integrity information of the determined target structure and the like to the outside. The output devices 704 may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, among others.

Of course, for the sake of simplicity, only some of the components related to the present application in the electronic device 70 are shown in fig. 18, and components such as buses, input/output interfaces, and the like are omitted. In addition, electronic device 70 may include any other suitable components depending on the particular application.

Exemplary computer-readable storage medium

In addition to the above-described methods and apparatus, embodiments of the present application may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the image alignment method according to various embodiments of the present application described above in this specification.

The program code of the computer program product, for carrying out operations according to embodiments of the present application, may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.

Furthermore, embodiments of the present application may also be a computer-readable storage medium having stored thereon computer program instructions, which, when executed by a processor, cause the processor to perform the steps in the image alignment method according to various embodiments of the present application described above in this specification.

The computer readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The foregoing describes the general principles of the present application in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present application are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present application. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the foregoing disclosure is not intended to be exhaustive or to limit the disclosure to the precise details disclosed.

The block diagrams of devices, apparatuses and systems referred to in this application are given only as illustrative examples and are not intended to require or imply that the connections, arrangements and configurations must be made in the manner shown in the block diagrams. These devices, apparatuses and systems may be connected, arranged and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The words "or" and "and" as used herein mean, and are used interchangeably with, the phrase "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".

It should also be noted that in the devices, apparatuses, and methods of the present application, the components or steps may be decomposed and/or recombined. These decompositions and/or recombinations are to be considered as equivalents of the present application.

The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the application to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims

1. An image alignment method, comprising:

determining a camera depth map corresponding to each of a plurality of cameras in an aircraft based on depth data corresponding to terrain data of a target scene; and

projecting, based on the camera depth maps corresponding to the plurality of cameras, images acquired by the plurality of cameras to image planes of virtual cameras corresponding to the plurality of cameras, and determining aligned images corresponding to the images acquired by the plurality of cameras.

2. The image alignment method of claim 1, wherein determining a camera depth map corresponding to each of a plurality of cameras in the aircraft based on depth data corresponding to terrain data of the target scene comprises:

determining depth information of image planes corresponding to the cameras in the terrain data based on the depth data and relative transformation relations between camera coordinate systems corresponding to the cameras and a world coordinate system;

determining a camera depth map corresponding to each of the plurality of cameras based on depth information of an image plane corresponding to each of the plurality of cameras in the terrain data.

3. The image alignment method according to claim 2, wherein the determining depth information of the image plane corresponding to each of the plurality of cameras in the terrain data based on the depth data and a relative transformation relationship between a camera coordinate system corresponding to each of the plurality of cameras and a world coordinate system comprises:

determining a camera center coordinate corresponding to each of the plurality of cameras based on the depth data and a relative transformation relation between a camera coordinate system corresponding to each of the plurality of cameras and the world coordinate system;

determining depth information of the camera center coordinates corresponding to each of the plurality of cameras in the terrain data based on the camera center coordinates corresponding to each of the plurality of cameras and the depth data;

and determining depth information of an image plane corresponding to each of the plurality of cameras in the terrain data based on the depth information of the camera center coordinate corresponding to each of the plurality of cameras in the terrain data.

4. The image alignment method according to claim 3, wherein the determining the depth information of the image plane corresponding to each of the plurality of cameras in the terrain data based on the depth information of the camera center coordinate corresponding to each of the plurality of cameras in the terrain data comprises:

for each of the plurality of cameras, determining a plurality of camera parameters,

determining depth information corresponding to each camera coordinate in the terrain data based on depth information and coordinate offset information corresponding to the camera center coordinate in the terrain data, wherein the coordinate offset information comprises offsets of the other camera coordinates corresponding to each camera relative to the camera center coordinate;

and determining depth information corresponding to the image plane of each camera in the terrain data based on the depth information corresponding to the coordinates of each camera in the terrain data.
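One possible reading of the steps recited in claims 2 to 4 can be sketched as follows. This is a minimal illustration under strong assumptions that are not stated in the application: the terrain data is a regular elevation grid, the camera looks straight down with its image axes aligned to the grid axes, and the ground offset of each pixel is approximated from the depth at the camera center. The names `terrain_elevation` and `nadir_depth_map`, and the nearest-cell lookup, are hypothetical.

```python
from typing import List, Tuple

Intrinsics = Tuple[float, float, float, float]  # fx, fy, cx, cy (pinhole model)


def terrain_elevation(grid: List[List[float]], x: float, y: float,
                      cell: float = 1.0) -> float:
    """Nearest-cell elevation lookup; the grid origin is assumed at (0, 0)."""
    rows, cols = len(grid), len(grid[0])
    col = min(max(round(x / cell), 0), cols - 1)  # clamp to the grid extent
    row = min(max(round(y / cell), 0), rows - 1)
    return grid[row][col]


def nadir_depth_map(grid: List[List[float]],
                    cam_xyz: Tuple[float, float, float],
                    k: Intrinsics, width: int, height: int,
                    cell: float = 1.0) -> List[List[float]]:
    """Per-pixel depth for a downward-looking camera (assumed geometry)."""
    fx, fy, cx, cy = k
    x_c, y_c, z_c = cam_xyz
    # Step 1: depth at the camera center coordinate.
    center_depth = z_c - terrain_elevation(grid, x_c, y_c, cell)
    depth = []
    for v in range(height):
        row = []
        for u in range(width):
            # Step 2: offset of this pixel relative to the center, converted
            # to a ground offset using the center depth (approximation).
            dx = (u - cx) / fx * center_depth
            dy = (v - cy) / fy * center_depth
            # Step 3: depth of this pixel from the terrain elevation there.
            row.append(z_c - terrain_elevation(grid, x_c + dx, y_c + dy, cell))
        depth.append(row)
    return depth
```

The resulting nested list plays the role of the camera depth map. A real system would interpolate the terrain and account for camera tilt instead of assuming a nadir view.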

5. The image alignment method of claim 1, further comprising, prior to determining a camera depth map corresponding to each of the plurality of cameras in the aircraft based on depth data corresponding to the terrain data of the target scene:

determining three-dimensional reconstruction information corresponding to the target scene;

and generating terrain data of the target scene based on the three-dimensional reconstruction information.

6. The image alignment method of claim 1, wherein determining a camera depth map corresponding to each of a plurality of cameras in the aircraft based on depth data corresponding to terrain data of the target scene comprises:

determining depth data corresponding to terrain data of the target scene based on depth data acquired by a depth sensor in the aircraft;

generating a depth map corresponding to the depth sensor based on depth data corresponding to terrain data of the target scene;

and determining a camera depth map corresponding to each of the plurality of cameras based on the depth map corresponding to the depth sensor, the relative transformation relation between the depth sensor and the plurality of cameras and the intrinsic parameter information corresponding to each of the plurality of cameras.
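The last step of claim 6 can be illustrated with a short sketch that re-projects a depth map from the depth sensor into one camera. It assumes pinhole intrinsics without distortion for both devices, a rigid transform (`rot`, `t`) from the depth-sensor frame to the camera frame, and simple forward splatting with a z-buffer. The function name `camera_depth_map` and the use of `None` for missing depth are hypothetical choices, not taken from the application.

```python
from typing import List, Optional, Sequence, Tuple

Intrinsics = Tuple[float, float, float, float]  # fx, fy, cx, cy
DepthMap = List[List[Optional[float]]]


def camera_depth_map(sensor_depth: DepthMap, k_sensor: Intrinsics,
                     rot: Sequence[Sequence[float]], t: Sequence[float],
                     k_cam: Intrinsics, width: int, height: int) -> DepthMap:
    """Re-project a depth-sensor depth map into a camera of size width x height."""
    fx, fy, cx, cy = k_sensor
    fxc, fyc, cxc, cyc = k_cam
    out: DepthMap = [[None] * width for _ in range(height)]
    for v, row in enumerate(sensor_depth):
        for u, d in enumerate(row):
            if d is None or d <= 0:
                continue  # no valid measurement at this sensor pixel
            # Back-project the sensor pixel to a 3D point in the sensor frame.
            p = ((u - cx) / fx * d, (v - cy) / fy * d, d)
            # Move the point into the camera frame: q = R * p + t.
            q = [sum(rot[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
            if q[2] <= 0:
                continue  # point lies behind the camera
            # Project with the camera intrinsics and snap to the nearest pixel.
            uc = round(fxc * q[0] / q[2] + cxc)
            vc = round(fyc * q[1] / q[2] + cyc)
            if 0 <= uc < width and 0 <= vc < height:
                # Z-buffer: keep the closest surface when points collide.
                if out[vc][uc] is None or q[2] < out[vc][uc]:
                    out[vc][uc] = q[2]
    return out
```

Forward splatting of this kind can leave holes where no sensor pixel lands; filling them (for example by interpolation) is outside the scope of this sketch.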

7. The image alignment method according to any one of claims 1 to 6, wherein the projecting the images acquired by the plurality of cameras to the image planes of the virtual cameras corresponding to the plurality of cameras based on the camera depth maps corresponding to the plurality of cameras, and determining the alignment images corresponding to the images acquired by the plurality of cameras comprises:

determining a virtual viewpoint of a virtual camera corresponding to the plurality of cameras based on the respective position information of the plurality of cameras;

determining a virtual relative transformation relation corresponding to each of the plurality of cameras based on a camera relative transformation relation between the virtual viewpoint and the plurality of cameras, wherein the virtual relative transformation relation is a relative transformation relation between each camera and the virtual viewpoint;

and projecting coordinate points of an image coordinate system corresponding to the images acquired by the cameras to an image plane of the virtual camera based on the camera depth maps corresponding to the cameras, the virtual relative transformation relation, the camera internal parameter information corresponding to the virtual camera and the camera internal parameter information corresponding to the cameras, and determining the aligned images.
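The projection recited in claim 7 can be sketched for a single pixel as follows. The sketch assumes a pinhole model without distortion and takes the mean of the camera positions as the virtual viewpoint; that choice, along with the names `virtual_viewpoint`, `backproject`, `transform`, `project`, and `warp_pixel`, is an assumption for illustration and is not specified by the claim.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Intrinsics = Tuple[float, float, float, float]  # fx, fy, cx, cy

IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


def virtual_viewpoint(centers: Sequence[Vec3]) -> Vec3:
    """Assumed choice: the virtual viewpoint is the mean camera position."""
    n = len(centers)
    return (sum(c[0] for c in centers) / n,
            sum(c[1] for c in centers) / n,
            sum(c[2] for c in centers) / n)


def backproject(u: float, v: float, depth: float, k: Intrinsics) -> Vec3:
    """Lift pixel (u, v) with the given depth to a 3D point in the camera frame."""
    fx, fy, cx, cy = k
    return ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)


def transform(rot: Sequence[Sequence[float]], t: Sequence[float], p: Vec3) -> Vec3:
    """Apply the camera-to-virtual-camera transform: R * p + t."""
    x, y, z = (sum(rot[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
    return (x, y, z)


def project(p: Vec3, k: Intrinsics) -> Tuple[float, float]:
    """Project a 3D point onto an image plane with intrinsics k."""
    fx, fy, cx, cy = k
    x, y, z = p
    return (fx * x / z + cx, fy * y / z + cy)


def warp_pixel(u: float, v: float, depth: float, k_cam: Intrinsics,
               rot: Sequence[Sequence[float]], t: Sequence[float],
               k_virt: Intrinsics) -> Tuple[float, float]:
    """Map a camera pixel to its location in the virtual camera image."""
    return project(transform(rot, t, backproject(u, v, depth, k_cam)), k_virt)
```

For example, with two cameras at (0, 0, 0) and (0.2, 0, 0) sharing the same orientation, the assumed virtual viewpoint is (0.1, 0, 0), so the first camera maps to the virtual camera with an identity rotation and translation (-0.1, 0, 0). Calling `warp_pixel(320.0, 240.0, 10.0, k, IDENTITY, (-0.1, 0.0, 0.0), k)` with `k = (500.0, 500.0, 320.0, 240.0)` moves the pixel about 5 pixels to the left in the virtual image. Applying the same mapping to every pixel of every camera, using each camera's depth map, yields the aligned images.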

8. An image alignment apparatus, comprising:

a first determining module configured to determine a camera depth map corresponding to each of a plurality of cameras in an aircraft based on depth data corresponding to terrain data of a target scene; and

a second determining module configured to project, based on the camera depth maps corresponding to the plurality of cameras, images acquired by the plurality of cameras to image planes of virtual cameras corresponding to the plurality of cameras, and to determine aligned images corresponding to the images acquired by the plurality of cameras.

9. A computer-readable storage medium, characterized in that the storage medium stores a computer program for executing the image alignment method of any one of claims 1 to 7.

10. An electronic device, characterized in that the electronic device comprises:

a processor;

a memory for storing instructions executable by the processor;

wherein the processor is configured to perform the image alignment method of any one of claims 1 to 7.