CN117830879A - Indoor-oriented distributed unmanned aerial vehicle cluster positioning and mapping method

Info

Publication number
CN117830879A
Authority
CN
China
Prior art keywords
unmanned aerial
aerial vehicle
image acquisition
matching
indoor
Prior art date
Legal status
Granted
Application number
CN202410012938.9A
Other languages
Chinese (zh)
Other versions
CN117830879B (en)
Inventor
钟毅
易雪婷
鲁仁全
杨立鑫
刘畅
徐雍
Current Assignee
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN202410012938.9A priority Critical patent/CN117830879B/en
Publication of CN117830879A publication Critical patent/CN117830879A/en
Application granted granted Critical
Publication of CN117830879B publication Critical patent/CN117830879B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/17 Terrestrial scenes taken from planes or by drones
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Abstract

The invention relates to the technical field of unmanned aerial vehicles, and in particular to an indoor-oriented distributed unmanned aerial vehicle cluster positioning and mapping method comprising the following steps: S1, image acquisition of the indoor environment is carried out by each single unmanned aerial vehicle in the distributed unmanned aerial vehicle cluster, feature matching is performed during the image acquisition process by a preset deep-learning-based feature point extraction and matching method, and the single unmanned aerial vehicle image acquisition data are obtained from the feature matching result; S2, the single unmanned aerial vehicle image acquisition data acquired by the plurality of unmanned aerial vehicles in the distributed unmanned aerial vehicle cluster are mutually calibrated; and S3, the single unmanned aerial vehicle image acquisition data acquired by the plurality of unmanned aerial vehicles are matched and fused based on a preset trajectory estimation method to obtain a global pose graph. The method improves the convergence rate of global pose calculation in the unmanned aerial vehicle cluster and balances the communication and computation overhead within the cluster.

Description

Indoor-oriented distributed unmanned aerial vehicle cluster positioning and mapping method
Technical Field
The invention relates to the technical field of unmanned aerial vehicles, in particular to an indoor-oriented distributed unmanned aerial vehicle cluster positioning and mapping method.
Background
When executing complicated or large-scale tasks, a distributed unmanned aerial vehicle cluster, as a multi-UAV system whose members work autonomously and cooperatively in a distributed fashion, improves not only task execution efficiency but also fault tolerance and reliability compared with a single unmanned aerial vehicle: even if some unmanned aerial vehicles break down or lose signal, the remaining ones can still complete the task. Ensuring that the distributed unmanned aerial vehicle cluster can fly autonomously, avoid obstacles during task execution, and achieve real-time positioning and mapping is therefore of great significance. In recent years researchers have matured the real-time positioning and mapping technology of a single autonomous system, but many technical gaps remain in achieving efficient and robust positioning for a distributed unmanned aerial vehicle cluster.
During task execution, unmanned aerial vehicle clusters mainly depend on the global navigation satellite system for global positioning. However, because global navigation satellite signals are poor in indoor environments, the cluster cannot be accurately positioned in application scenarios such as large warehouses, stadiums and underground garages; an indoor-oriented distributed unmanned aerial vehicle cluster therefore needs to perform positioning and mapping independently of the global navigation satellite system.
In practical application scenarios, owing to the low cost, light weight and low power consumption of the combined vision-inertial sensor suite, a vision-inertial odometer is usually adopted on a small single-UAV system for simultaneous localization and mapping. However, when a single vision-inertial odometer is used directly on an unmanned aerial vehicle, state estimation drifts severely because of the low accuracy of feature point matching; meanwhile, the different drifts of the individual unmanned aerial vehicles cause the cluster to produce different position estimates at the same physical location, so global consistency is poor. Ensuring global consistency of a distributed unmanned aerial vehicle cluster when the global navigation satellite system is unavailable therefore remains a very challenging research topic.
At present, the vision-inertial odometer has been widely applied on single unmanned aerial vehicles, but examples of its application to distributed unmanned aerial vehicle clusters are still relatively few, and when the currently adopted vision-inertial odometer systems handle multi-UAV coordination, real-time performance is poor because signal transmission among the unmanned aerial vehicles is limited.
Disclosure of Invention
The invention aims to solve the problem in the prior art that, when an unmanned aerial vehicle performs mapping with a vision-inertial odometer system, real-time performance and consistency are poor owing to factors such as limited signal transmission.
In order to solve the technical problems, the invention provides an indoor-oriented distributed unmanned aerial vehicle cluster positioning and mapping method, which comprises the following steps:
S1, image acquisition of the indoor environment is carried out by each single unmanned aerial vehicle in the distributed unmanned aerial vehicle cluster, feature matching is performed during the image acquisition process by a preset deep-learning-based feature point extraction and matching method, and the single unmanned aerial vehicle image acquisition data are obtained from the feature matching result;
s2, mutually calibrating the single unmanned aerial vehicle image acquisition data acquired by a plurality of unmanned aerial vehicles in the distributed unmanned aerial vehicle cluster;
and S3, matching and fusing the single unmanned aerial vehicle image acquisition data acquired by the plurality of unmanned aerial vehicles in the distributed unmanned aerial vehicle cluster based on a preset trajectory estimation method to obtain a global pose graph.
Furthermore, the preset feature point extraction and matching method based on deep learning specifically comprises the following steps:
constructing a global descriptor for the single unmanned aerial vehicle image acquisition data by using a preset neural network model;
performing feature detection on the single unmanned aerial vehicle image acquisition data using the SuperPoint algorithm and extracting local descriptors from the data, thereby obtaining the feature points of each frame image of the single unmanned aerial vehicle image acquisition data;
determining key frames among the acquired frame images according to the number of feature points and the number of co-visible points;
when a new key frame is determined, matching different key frames and matching the feature points in the key frames that match successfully.
Furthermore, the preset neural network model is a neural network model obtained through knowledge distillation with NetVLAD as the teacher network and MobileNetVLAD as the student network.
Furthermore, when a new key frame is determined, a K-nearest-neighbor feature matching algorithm is used in matching different key frames and matching the feature points in the successfully matched key frames.
Still further, step S2 comprises the sub-steps of:
preprocessing and jointly initializing the camera and the inertial measurement unit of a single unmanned aerial vehicle, and combining them into a tightly coupled vision-inertial odometer by means of a sliding-window optimization method;
introducing an ultra-wideband sensor, estimating the position of the anchor point in the ultra-wideband measurements from the data obtained by the vision-inertial odometer, and performing joint optimization to obtain a local pose graph;
globally optimizing the local pose graph obtained by each single unmanned aerial vehicle through its own loop detection, and using the result as the measured values acquired by that unmanned aerial vehicle;
based on a preset closed-loop detection method, computing, from the measured values acquired at the same place by the initialized cameras and inertial measurement units of different unmanned aerial vehicles, the relative measured values that constitute the closed-loop edges of the pose graph.
Further, the preset closed loop detection method specifically comprises the following steps:
a first unmanned aerial vehicle transmits a key frame containing feature points, captured at a specific position, to the distributed unmanned aerial vehicle cluster;
a second unmanned aerial vehicle performs a full-image search based on the K-nearest-neighbor feature matching algorithm and obtains detection data at the specific position that match the key frame transmitted by the first unmanned aerial vehicle;
and the first unmanned aerial vehicle calculates the relative pose between the unmanned aerial vehicles at the specific position based on the detection data.
Still further, step S3 includes the steps of:
constructing a pose graph optimization model based on the keyframes, the relative measured values and the single unmanned aerial vehicle image acquisition data acquired by each unmanned aerial vehicle in the distributed unmanned aerial vehicle cluster;
solving the pose graph optimization model by adopting a preset two-stage solving method, wherein:
the first stage, solving the rotation matrices of the pose graph optimization model based on the ARock asynchronous distributed optimization algorithm;
the second stage, solving the perturbation optimization of the pose graph optimization model based on the ARock asynchronous distributed optimization algorithm;
and outputting a solving result of the pose graph optimization model as the global pose graph.
Further, step S3 is performed when the communication intensity between the unmanned aerial vehicles in the distributed unmanned aerial vehicle cluster is greater than a preset signal threshold.
The method has the advantage that it optimizes the processes of feature point extraction, multi-sensor data fusion and global pose estimation in the distributed unmanned aerial vehicle cluster: feature point extraction on a single unmanned aerial vehicle is accelerated; mutual calibration based on data from multiple sensors improves the robustness and accuracy of the unmanned aerial vehicle system; and two-stage optimization in global pose estimation improves the convergence speed with which the distributed system fuses the different pose graphs while balancing the cost of data communication and computation.
Drawings
Fig. 1 is a block flow diagram of steps of a method for locating and mapping an indoor-oriented distributed unmanned aerial vehicle cluster provided by an embodiment of the invention;
FIG. 2 is a schematic diagram of a training process of a preset neural network model according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of a SuperPoint matching algorithm provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of sensor data fusion provided by an embodiment of the present invention;
FIG. 5 is a factor graph of the ultra wideband residual error problem provided by an embodiment of the present invention;
fig. 6 is a schematic flow chart of a preset closed loop detection method according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, fig. 1 is a block flow diagram of steps of a method for indoor-oriented distributed unmanned aerial vehicle cluster positioning and mapping, which includes the following steps:
s1, carrying out feature matching on the image acquisition process based on a preset feature point extraction and matching method of deep learning, and obtaining single unmanned aerial vehicle image acquisition data according to a feature matching result.
The preset feature point extraction and matching method based on deep learning specifically comprises the following steps:
constructing a global descriptor for the single unmanned aerial vehicle image acquisition data by using a preset neural network model;
performing feature detection on the single unmanned aerial vehicle image acquisition data using the SuperPoint algorithm and extracting local descriptors from the data, thereby obtaining the feature points of each frame image of the single unmanned aerial vehicle image acquisition data;
determining key frames among the acquired frame images according to the number of feature points and the number of co-visible points;
when a new key frame is determined, matching different key frames and matching the feature points in the key frames that match successfully.
Specifically, in the embodiment of the invention, to address the low matching speed and low accuracy in the key frame extraction and matching process of vision-inertial simultaneous localization and mapping, the preset neural network model is trained by knowledge distillation to construct the global descriptor. As shown in FIG. 2, the preset neural network model is a lighter-weight yet accurate neural network obtained by knowledge distillation with NetVLAD as the teacher network and MobileNetVLAD as the student network. The global descriptor is constructed by training the narrower and deeper student network (MobileNetVLAD, corresponding to the NetVLAD teacher network) under the already-trained NetVLAD teacher, minimizing the mean-square-error loss between the target global descriptor predicted by NetVLAD and the descriptor produced by the student.
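As an illustration of the distillation objective described above, the following sketch assumes hypothetical `NetVLAD` and `MobileNetVLAD` model classes with matching output dimensionality; only the frozen-teacher MSE loss follows from the text.

```python
# Minimal distillation step: the frozen NetVLAD teacher supplies target
# global descriptors and the MobileNetVLAD student is trained to match
# them under a mean-square-error loss.
import torch
import torch.nn as nn

def distill_step(teacher, student, images, optimizer):
    teacher.eval()
    with torch.no_grad():            # the teacher is frozen
        target = teacher(images)     # target global descriptors
    pred = student(images)           # student's global descriptors
    loss = nn.functional.mse_loss(pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```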
The embodiment of the invention uses SuperPoint for feature point detection and local descriptor extraction; the flow is shown in FIG. 3. Specifically, SuperPoint is a matching algorithm whose model has one encoder and two decoders, and the input is a single-channel image of size H×W×1. The two decoders share the encoder, whose main functions are dimensionality reduction and feature extraction, outputting a feature map of reduced spatial resolution. The feature point decoder consists mainly of convolution layers, a normalized exponential (softmax) function and a reshaping function, and outputs a dense H×W×1 map in which each value represents the probability that the corresponding pixel is a feature point. The descriptor decoder consists of convolution layers, a bicubic interpolation module and an L2 normalization layer, and outputs dense 256-dimensional descriptors. Finally, principal component analysis (PCA) is used to reduce the dimensionality of the descriptors, which speeds up feature point matching.
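A minimal sketch of the final PCA step, assuming the 256-dimensional SuperPoint descriptors are available as a NumPy array; the target dimension of 64 is an illustrative assumption, not a value given in the patent.

```python
# Reduce SuperPoint descriptors with PCA so descriptor matching is cheaper;
# re-normalize so inner products stay comparable after reduction.
import numpy as np
from sklearn.decomposition import PCA

descriptors = np.random.rand(5000, 256).astype(np.float32)  # stand-in data
pca = PCA(n_components=64)
reduced = pca.fit_transform(descriptors)                    # 5000 x 64
reduced /= np.linalg.norm(reduced, axis=1, keepdims=True)   # L2-normalize
```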
In a specific implementation, for a single unmanned aerial vehicle in the cluster, the maximum number of landmark points each of its cameras can track is set to N_max. On each camera view, the SuperPoint algorithm is first used to extract sparse feature points and local descriptors; assuming the number successfully extracted is N_sp, these feature points are tracked between frames with a nearest-neighbor algorithm. A Shi-Tomasi corner detector is then used to extract N_max − N_sp additional landmark points, which are tracked by optical flow. Only the feature points extracted by SuperPoint are used for closed-loop detection and multi-UAV matching; the optical-flow landmark points are used solely for ego-motion estimation.
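The hybrid front end can be sketched as below; `superpoint_extract` is a hypothetical stand-in for the SuperPoint network, and N_max and the detector parameters are assumed values.

```python
# SuperPoint supplies up to N_max landmarks; Shi-Tomasi corners tracked by
# Lucas-Kanade optical flow fill the remaining N_max - N_sp slots.
import cv2
import numpy as np

N_MAX = 200  # maximum landmarks per camera view (assumed value)

def superpoint_extract(gray):
    """Placeholder for the SuperPoint network: (keypoints, descriptors)."""
    return np.empty((0, 2), np.float32), np.empty((0, 256), np.float32)

def extract_features(gray, prev_gray=None, prev_pts=None):
    sp_pts, sp_desc = superpoint_extract(gray)
    n_extra = N_MAX - len(sp_pts)
    corners = None
    if n_extra > 0:
        corners = cv2.goodFeaturesToTrack(gray, maxCorners=n_extra,
                                          qualityLevel=0.01, minDistance=10)
    tracked = None
    if prev_gray is not None and prev_pts is not None:
        # Optical-flow landmarks serve ego-motion estimation only; they are
        # never used for closed-loop detection or multi-UAV matching.
        tracked, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray,
                                                      prev_pts, None)
    return sp_pts, sp_desc, corners, tracked
```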
After feature point tracking is completed, whether the current frame is a key frame can be determined from the number of feature points and the number of co-visible points. In the embodiment of the invention, if the number of feature points tracked in the current frame is below a set threshold, or the number of those feature points shared with the last key frame is below a set threshold, the current frame is made a key frame. This keeps key frames neither too dense nor too sparse and guarantees that enough local map points are generated.
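A sketch of this key frame test; the two thresholds are assumed values rather than the patent's.

```python
MIN_TRACKED = 80  # minimum features tracked in the current frame (assumed)
MIN_SHARED = 40   # minimum features shared with the last key frame (assumed)

def is_keyframe(n_tracked, n_shared_with_last_kf):
    # A frame is promoted when tracking weakens or co-visibility with the
    # previous key frame drops, keeping key frames neither dense nor sparse.
    return n_tracked < MIN_TRACKED or n_shared_with_last_kf < MIN_SHARED
```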
To realize sparse feature matching within the unmanned aerial vehicle cluster, when a new key frame arrives, the sliding window is searched for the most recent key frame matching it; two key frames are considered matched when the inner product of their global descriptors exceeds a threshold.
Specifically, when a new key frame is determined, the K-nearest-neighbor feature matching algorithm is used in the step of matching different key frames and matching the feature points in the successfully matched key frames.
The specific steps of K-nearest-neighbor feature matching are as follows:
first, the k = 2 points most similar to the feature point are selected during matching, and the matching distances d_1 and d_2 between each of them and the feature point are calculated; then the ratio of these two matching distances is computed, and if

d_1 / d_2 < λ

where λ is a preset ratio threshold, the point most similar to the feature point is considered the correct matching point, and the successfully matched feature points are regarded as projections of the same landmark point.
S2, mutually calibrating the single unmanned aerial vehicle image acquisition data acquired by the plurality of unmanned aerial vehicles in the distributed unmanned aerial vehicle cluster.
Because global navigation satellite signals are poor indoors and cannot provide accurate global positioning, the embodiment of the invention uses ultra-wideband wireless communication technology for short-range positioning and constructs a vision-inertial-ultra-wideband multi-sensor fusion method, exploiting the complementary properties of the sensors so that they correct one another for a better result. Specifically, referring to the data fusion process of FIG. 4, step S2 comprises the following sub-steps:
preprocessing and jointly initializing the camera and the inertial measurement unit of a single unmanned aerial vehicle, and combining them into a tightly coupled vision-inertial odometer by means of a sliding-window optimization method;
introducing an ultra-wideband sensor, estimating the position of the anchor point in the ultra-wideband measurements from the data obtained by the vision-inertial odometer, and performing joint optimization to obtain a local pose graph;
globally optimizing the local pose graph obtained by each single unmanned aerial vehicle through its own loop detection, and using the result as the measured values acquired by that unmanned aerial vehicle;
based on a preset closed-loop detection method, computing, from the measured values acquired at the same place by the initialized cameras and inertial measurement units of different unmanned aerial vehicles, the relative measured values that constitute the closed-loop edges of the pose graph.
In the implementation these steps proceed as follows:
The camera and the inertial measurement unit are first preprocessed and vision-inertial initialization is performed; the readings of the inertial measurement unit at the current moment are then integrated over time to obtain its state at the next moment, including the pose, motion and noise terms of the inertial measurement unit. The state at the next moment (i.e. time t_{i+1}) is propagated recursively according to:

p_{t_{i+1}} = p_{t_i} + v_{t_i} Δt − ½ g Δt² + R_{t_i} α̂_{t_i t_{i+1}}
v_{t_{i+1}} = v_{t_i} − g Δt + R_{t_i} β̂_{t_i t_{i+1}}
q_{t_{i+1}} = q_{t_i} ⊗ γ̂_{t_i t_{i+1}}

where α̂, β̂ and γ̂ are the pre-integration quantities of the inertial measurement unit, and p_{t_i}, v_{t_i} and q_{t_i} are the position, velocity and attitude at time t_i. The position change measured by the inertial measurement unit in the world coordinate system from time t_k to time t_j is predicted as Δp̂^W_{t_k t_j}.
To handle the inherent time offset arising because the data rate of the ultra-wideband sensor differs from that of the camera (i.e. the ultra-wideband residual), the embodiment of the invention uses a distance-centred method to associate each distance datum with the position of the same time stamp; the factor graph is shown in FIG. 5. FIG. 5(a) is an example of the time-stamp relationship of camera, inertial navigation and ultra-wideband sampling, where t_k denotes the sampling time of the k-th key frame, t_c the sampling time of the c-th camera image, and t_j the time of the j-th ultra-wideband sample; FIG. 5(b) expresses the time-stamp correspondence of FIG. 5(a) as a distance-centred factor graph.
The ultra-wideband residual function thus constructed is:

r^{UWB}_{t_j} = d_j − ‖^W a_p − p̂^W_{t_j}‖

where r^{UWB}_{t_j} represents the ultra-wideband residual at time t_j, d_j represents the distance datum obtained at time t_j, ^W a_p represents the ultra-wideband anchor point position, and p̂^W_{t_j} represents the position information at time t_j obtained at the inertial navigation rate.
To estimate the ultra-wideband anchor point position ^W a_p in the world frame using the short-term, accurate vision-inertial odometry data, the embodiment of the invention minimizes the following cost function:

^W a_p = argmin Σ_{(d_j, p̂^W_{t_j}) ∈ D} ρ( d_j − ‖^W a_p − p̂^W_{t_j}‖ )

where the summand is the ultra-wideband residual, ρ(·) is a Huber kernel function for removing outliers, and D is the set of the distance data d_j obtained by the vision-inertial odometer at all ultra-wideband sampling times together with the corresponding predicted position information p̂^W_{t_j}, i.e. D = {(d_j, p̂^W_{t_j})}. Here p̂^W_{t_j} can be calculated from the position change Δp̂^W_{t_k t_j} predicted by the inertial measurement unit in the world coordinate system from time t_k to time t_j and from the velocity v^W_{t_k} obtained at time t_k at the inertial navigation rate:

p̂^W_{t_j} = p^W_{t_k} + v^W_{t_k} (t_j − t_k) + Δp̂^W_{t_k t_j}

where t_j is an ultra-wideband sampling time and t_k is the sampling time of the k-th key frame, namely the key frame before t_j and closest to t_j in time.

When the ^W a_p obtained by minimizing the above function varies between successive estimates by less than a certain threshold, the ultra-wideband anchor point position is considered fixed.
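The anchor estimation can be sketched as a robust least-squares problem, with SciPy's built-in Huber loss standing in for the kernel ρ(·).

```python
# Solve argmin over the anchor position of the Huber-robustified UWB
# residuals d_j - ||a - p_hat_j|| collected over the set D.
import numpy as np
from scipy.optimize import least_squares

def estimate_anchor(distances, positions, a0=np.zeros(3)):
    """distances: (J,) UWB ranges d_j; positions: (J,3) predictions p_hat_j."""
    def residuals(a):
        return distances - np.linalg.norm(positions - a, axis=1)
    sol = least_squares(residuals, a0, loss='huber')
    return sol.x   # estimated anchor position in the world frame
```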
Once the ultra-wideband anchor point is fixed, subsequent distance measurements are tightly coupled with the vision-inertial data to obtain an accurate odometer with reduced accumulated drift based on joint optimization over the key frames. The total residual function of the system is therefore:
E_VIR(χ) = E_VI(χ) + E_R(χ);
where χ represents the set of all key frames in the sliding window at the current moment and E_VI(χ) is the residual produced by the vision-inertial odometer system during sliding-window optimization:

E_VI(χ) = E_p + Σ_{l ∈ C_{n,m}} ρ(‖r_C(l, χ)‖²) + Σ_k ‖r_B(k, χ)‖²

where r_C is the visual residual, r_B is the inertial navigation residual, E_p is the marginalization residual, C_{n,m} is the set of landmark points observed simultaneously in the n-th and m-th key frames, ‖·‖ represents the Euclidean norm of a vector, and ρ(·) is the Huber kernel function for removing outliers.
E_R(χ) is the ultra-wideband residual:

E_R(χ) = γ_r Σ_{(d_j, Δp̂_j) ∈ D_{k,k+1}} ρ( r^{UWB}_{t_j} )

where D_{k,k+1} is the set of the distance data d_j between two key frames t_k and t_{k+1} in the sliding window and the corresponding predicted position changes Δp̂_j, and γ_r serves as a weight factor: by presetting γ_r, the influence of the ultra-wideband residual on the optimization can be adjusted.
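How the total cost E_VIR is assembled can be sketched as follows; the residual arrays are stand-ins for the visual, inertial, marginalization and ultra-wideband terms defined above.

```python
# E_VIR = E_VI + E_R over the residuals of the current sliding window.
import numpy as np

GAMMA_R = 0.1  # UWB weight factor gamma_r (assumed value)

def huber(r, delta=1.0):
    """Huber kernel rho(.) applied elementwise to residual magnitudes."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * a**2, delta * (a - 0.5 * delta))

def total_cost(visual_res, imu_res, marg_res, uwb_res):
    e_vi = np.sum(np.asarray(marg_res)**2)            # marginalization E_p
    e_vi += np.sum(huber(np.asarray(visual_res)))     # robustified visual term
    e_vi += np.sum(np.asarray(imu_res)**2)            # inertial term
    e_r = GAMMA_R * np.sum(huber(np.asarray(uwb_res)))
    return e_vi + e_r
```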
In the embodiment of the invention, when one unmanned aerial vehicle encounters another unmanned aerial vehicle, or passes through the same place that another has already visited, a relative measured value between the two unmanned aerial vehicles is generated, and map merging among the unmanned aerial vehicles can be performed based on these relative measured values. The invention handles this flow with the preset closed-loop detection method, which specifically comprises:
a first unmanned aerial vehicle transmits a key frame containing feature points, captured at a specific position, to the distributed unmanned aerial vehicle cluster;
a second unmanned aerial vehicle performs a full-image search based on the K-nearest-neighbor feature matching algorithm and obtains detection data at the specific position that match the key frame transmitted by the first unmanned aerial vehicle;
and the first unmanned aerial vehicle calculates the relative pose between the unmanned aerial vehicles at the specific position based on the detection data.
A schematic flow of the preset closed-loop detection method is shown in FIG. 6. Illustratively, unmanned aerial vehicle i transmits a compact key frame containing the NetVLAD global descriptor into the unmanned aerial vehicle cluster; after receiving this information, unmanned aerial vehicle j performs loop detection. It first performs a full-image search with Faiss and then performs multi-UAV feature point matching with the K-nearest-neighbor matching method; if the matching succeeds, unmanned aerial vehicle j sends back the complete key frame containing landmark information, from which unmanned aerial vehicle i extracts the relative pose with the PnP method and removes outliers.
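A sketch of this pipeline, with Faiss for the global-descriptor search and OpenCV's PnP-RANSAC for relative-pose extraction; descriptor, landmark and intrinsics inputs are placeholders, and the score threshold is an assumed value.

```python
# Full-image search over NetVLAD descriptors, then relative pose by PnP with
# RANSAC outlier rejection once feature matching has succeeded.
import cv2
import faiss
import numpy as np

def detect_loop(query_desc, db_descs, db_ids, threshold=0.8):
    index = faiss.IndexFlatIP(db_descs.shape[1])       # inner-product index
    index.add(np.ascontiguousarray(db_descs, dtype=np.float32))
    q = np.ascontiguousarray(query_desc[None, :], dtype=np.float32)
    score, idx = index.search(q, 1)
    return db_ids[idx[0, 0]] if score[0, 0] > threshold else None

def relative_pose(landmarks_3d, keypoints_2d, K):
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        landmarks_3d, keypoints_2d, K, None)           # None: no distortion
    return (rvec, tvec) if ok else None
```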
S3, matching and fusing the single unmanned aerial vehicle image acquisition data acquired by the plurality of unmanned aerial vehicles in the distributed unmanned aerial vehicle cluster based on a preset trajectory estimation method to obtain a global pose graph.
When communication within the unmanned aerial vehicle cluster is good, globally consistent trajectory estimation for the cluster can be achieved by letting every unmanned aerial vehicle share as much information as possible. For indoor environments, the embodiment of the invention designs the preset signal threshold to judge whether the communication between unmanned aerial vehicles is good. Step S3 comprises the following steps:
constructing a pose graph optimization model based on the keyframes, the relative measured values and the single unmanned aerial vehicle image acquisition data acquired by each unmanned aerial vehicle in the distributed unmanned aerial vehicle cluster;
solving the pose graph optimization model by adopting a preset two-stage solving method, wherein:
the first stage, solving the rotation matrices of the pose graph optimization model based on the ARock asynchronous distributed optimization algorithm;
the second stage, solving the perturbation optimization of the pose graph optimization model based on the ARock asynchronous distributed optimization algorithm;
and outputting a solving result of the pose graph optimization model as the global pose graph.
Specifically, the pose graph optimization problem constructed by the embodiment of the invention is shown in the following formula:

min_x Σ_{((i,t),(j,s)) ∈ ε} ( ‖R_i^t Δp̃ − (p_j^s − p_i^t)‖² + ‖R_i^t ΔR̃ − R_j^s‖_F² ),  subject to R_i^t ∈ SO(3)

where ε is the set of all edges, including the closed-loop edges (generated by the closed-loop detection of step S2) and the ego-motion edges (generated by each single unmanned aerial vehicle's vision-inertial-ultra-wideband odometer, i.e. the acquired single unmanned aerial vehicle image acquisition data); SO(3) denotes the Lie group whose constraint describes rotations of three-dimensional space; R_i^t is a rotation matrix and p_i^t a translation vector; ΔR̃ and Δp̃ are the relative rotation and translation measurements attached to an edge; z is the global state; and x is the complete state of the pose graph:

x = [x_1, x_2, …, x_N],  x_i = [R_i^1, p_i^1, …, R_i^M, p_i^M]

where N is the number of unmanned aerial vehicles in the cluster and M is the number of key frames of each unmanned aerial vehicle.
Rewriting the pose graph optimization problem in distributed form gives:

min_{x_i} Σ_{((i,t),(j,s)) ∈ ε_i} ( ‖R_i^t Δp̃ − (p^{j,s}(z) − p_i^t)‖² + ‖R_i^t ΔR̃ − R^{j,s}(z)‖_F² )

where ε_i is the set of all edges of the i-th unmanned aerial vehicle, R^{i,t}(z) denotes the projection of the rotation matrix of the i-th unmanned aerial vehicle at time t from the global state z to the local state, and p^{i,t}(z) denotes the projection of the corresponding translation vector from the global state to the local state.
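A single edge residual of this formulation might be evaluated as in the following sketch; expressing the rotation error in the tangent space is an implementation choice, not something the patent specifies.

```python
# Residual of one pose-graph edge: mismatch between the measured relative
# pose (R_meas, p_meas) and the current estimates of the two poses.
import numpy as np
from scipy.spatial.transform import Rotation

def edge_residual(R_i, p_i, R_j, p_j, R_meas, p_meas):
    r_rot = Rotation.from_matrix(R_meas.T @ R_i.T @ R_j).as_rotvec()
    r_pos = R_i.T @ (p_j - p_i) - p_meas
    return np.concatenate([r_rot, r_pos])
```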
The embodiment of the invention solves the pose graph optimization problem with a two-stage method: in the first stage the rotations of the pose graph are initialized to avoid local minima, and in the second stage the pose graph is refined and optimized. Finally, the result of pose graph optimization is combined with the result of the vision-inertial-ultra-wideband odometer to obtain the globally consistent fused trajectory of each unmanned aerial vehicle, i.e. the global pose graph.
Specifically, the asynchronous distributed optimization algorithm ARock is used in the first stage to solve the pose graph optimization problem, so that during map construction each unmanned aerial vehicle only needs to use the latest variable copies received from its remote peers, without running synchronously in lock-step. This makes the unmanned aerial vehicle cluster less sensitive to communication delay and also improves the convergence speed.
The nonlinearity of the pose graph problem derives from the rotation matrices; when the rotations are properly initialized, the pose graph optimization problem is close to a linear least-squares problem and convenient to solve. Rotation initialization aims to find a solution, or an approximate solution, of the rotation part of the pose graph optimization problem.
the embodiment of the invention uses a chord relaxation method to solve the rotation initialization problem, in the algorithm, firstly, the problem is solved by relaxing the SO (3) constraint:
wherein the method comprises the steps ofIs c i The set of all rotations in ∈,>is a rotation matrix of the drone i at time t,is a vertical priori, added because the roll angle and pitch angle of the vision-inertial odometer are considerable, which can be used as a priori condition for enhanced initialization, +.>Is a matrix->Third line, ++>g=[0,0,1] T ,/>The rotation matrix of the unmanned aerial vehicle i is obtained by measuring an odometer at the moment t. In the implementation process, the rotation initialization problem can be effectively solved by using a linear solver as a linear least square problem.
The rotation matrices are then recovered by projecting the relaxed solution R̂ back onto the rotation group, R = argmin_{R′ ∈ SO(3)} ‖R′ − R̂‖_F, where ‖·‖_F is the Frobenius norm.
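This projection is the orthogonal Procrustes solution and can be sketched as:

```python
# Project a relaxed 3x3 solution back to the nearest rotation matrix in the
# Frobenius norm via SVD, enforcing det(R) = +1.
import numpy as np

def project_to_so3(R_hat):
    U, _, Vt = np.linalg.svd(R_hat)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])
    return U @ D @ Vt
```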
After initialization is complete, in the second stage the perturbation problem of distributed pose graph optimization is constructed with the initialized rotations, each pose being parameterized as R = R̂ Exp(δθ) and p = p̂ + δp, and the edge residuals of the distributed problem above being minimized over the perturbation states:

min_{δx_i} Σ_{((i,t),(j,s)) ∈ ε_i} ‖r_{it,js}( x̂_i ⊞ δx_i , ẑ ⊞ δz* )‖²

where δx_i is the perturbation state of x_i, δz* is the global perturbation state, and Exp(·) maps the Lie algebra to the Lie group.
It can be seen from the above formula that the perturbation problem remains nonlinear, so the accuracy loss caused by linearization is avoided. The ARock algorithm can again be applied to solve the perturbation problem, guaranteeing that the method stays asynchronous. Throughout the operation of the unmanned aerial vehicles, the ARock algorithm iterates at a fixed frequency and information is added to it incrementally, which guarantees convergence, improves the convergence speed, and at the same time reduces the communication overhead within the cluster. Clearly, once the pose graph data are clearly expressed and the single unmanned aerial vehicle image acquisition data gathered by the individual unmanned aerial vehicles have been matched with one another, combining the two yields the global pose graph.
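A toy sketch of an ARock-style asynchronous block update; the fixed-point operator S and the step size are illustrative assumptions, not the patent's actual solver.

```python
# Each UAV repeatedly applies a damped fixed-point step to its own block
# using a snapshot of the shared variables that may contain stale values.
import numpy as np

ETA = 0.5  # relaxation step size (assumed value)

def arock_step(x, i, S):
    x_hat = x.copy()                          # possibly inconsistent snapshot
    x[i] = x[i] - ETA * (x_hat[i] - S(x_hat)[i])
    return x
```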
The method has the advantage that it optimizes the processes of feature point extraction, multi-sensor data fusion and global pose estimation in the distributed unmanned aerial vehicle cluster: feature point extraction on a single unmanned aerial vehicle is accelerated; mutual calibration based on data from multiple sensors improves the robustness and accuracy of the unmanned aerial vehicle system; and two-stage optimization in global pose estimation improves the convergence speed with which the distributed system fuses the different pose graphs while balancing the cost of data communication and computation.
Those skilled in the art will appreciate that implementing all or part of the above-described methods in accordance with the embodiments may be accomplished by way of a computer program stored on a computer readable storage medium, which when executed may comprise the steps of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM) or the like.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present invention.
While the embodiments of the present invention have been illustrated and described in connection with the drawings, what is presently considered to be the most practical and preferred embodiments of the invention, it is to be understood that the invention is not limited to the disclosed embodiments, but on the contrary, is intended to cover various equivalent modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (8)

1. An indoor-oriented distributed unmanned aerial vehicle cluster positioning and mapping method, characterized by comprising the following steps:
S1, image acquisition of the indoor environment is carried out by each single unmanned aerial vehicle in the distributed unmanned aerial vehicle cluster, feature matching is performed during the image acquisition process by a preset deep-learning-based feature point extraction and matching method, and the single unmanned aerial vehicle image acquisition data are obtained from the feature matching result;
s2, mutually calibrating the single unmanned aerial vehicle image acquisition data acquired by a plurality of unmanned aerial vehicles in the distributed unmanned aerial vehicle cluster;
and S3, matching and fusing the single unmanned aerial vehicle image acquisition data acquired by the plurality of unmanned aerial vehicles in the distributed unmanned aerial vehicle cluster based on a preset trajectory estimation method to obtain a global pose graph.
2. The indoor-oriented distributed unmanned aerial vehicle cluster positioning and mapping method according to claim 1, wherein the deep learning-based preset feature point extraction and matching method specifically comprises:
constructing a global descriptor for the single unmanned aerial vehicle image acquisition data by using a preset neural network model;
performing feature detection on the single unmanned aerial vehicle image acquisition data using the SuperPoint algorithm and extracting local descriptors from the data, thereby obtaining the feature points of each frame image of the single unmanned aerial vehicle image acquisition data;
determining key frames among the acquired frame images according to the number of feature points and the number of co-visible points;
when a new key frame is determined, matching different key frames and matching the feature points in the key frames that match successfully.
3. The indoor-oriented distributed unmanned aerial vehicle cluster positioning and mapping method according to claim 2, wherein the preset neural network model is a neural network model obtained through knowledge distillation with NetVLAD as the teacher network and MobileNetVLAD as the student network.
4. The indoor-oriented distributed unmanned aerial vehicle cluster positioning and mapping method of claim 2, wherein, when a new key frame is determined, a K-nearest-neighbor feature matching algorithm is used in the step of matching different key frames and matching the feature points in the successfully matched key frames.
5. The indoor-oriented distributed unmanned aerial vehicle cluster locating and mapping method of claim 2, wherein step S2 comprises the sub-steps of:
preprocessing and jointly initializing the camera and the inertial measurement unit of a single unmanned aerial vehicle, and combining them into a tightly coupled vision-inertial odometer by means of a sliding-window optimization method;
introducing an ultra-wideband sensor, estimating the position of the anchor point in the ultra-wideband measurements from the data obtained by the vision-inertial odometer, and performing joint optimization to obtain a local pose graph;
globally optimizing the local pose graph obtained by each single unmanned aerial vehicle through its own loop detection, and using the result as the measured values acquired by that unmanned aerial vehicle;
based on a preset closed-loop detection method, computing, from the measured values acquired at the same place by the initialized cameras and inertial measurement units of different unmanned aerial vehicles, the relative measured values that constitute the closed-loop edges of the pose graph.
6. The indoor-oriented distributed unmanned aerial vehicle cluster positioning and mapping method according to claim 5, wherein the preset closed loop detection method specifically comprises:
a first unmanned aerial vehicle transmits a key frame containing feature points, captured at a specific position, to the distributed unmanned aerial vehicle cluster;
a second unmanned aerial vehicle performs a full-image search based on the K-nearest-neighbor feature matching algorithm and obtains detection data at the specific position that match the key frame transmitted by the first unmanned aerial vehicle;
and the first unmanned aerial vehicle calculates the relative pose between the unmanned aerial vehicles at the specific position based on the detection data.
7. The indoor-oriented distributed unmanned aerial vehicle cluster locating and mapping method of claim 5, wherein step S3 comprises the steps of:
constructing a pose graph optimization model based on the keyframes, the relative measured values and the single unmanned aerial vehicle image acquisition data acquired by each unmanned aerial vehicle in the distributed unmanned aerial vehicle cluster;
solving the pose graph optimization model by adopting a preset two-stage solving method, wherein:
the first stage, solving the rotation matrices of the pose graph optimization model based on the ARock asynchronous distributed optimization algorithm;
the second stage, solving the perturbation optimization of the pose graph optimization model based on the ARock asynchronous distributed optimization algorithm;
and outputting a solving result of the pose graph optimization model as the global pose graph.
8. The method for locating and mapping indoor-oriented distributed unmanned aerial vehicle clusters according to claim 5, wherein step S3 is performed when the communication intensity between each unmanned aerial vehicle in the distributed unmanned aerial vehicle clusters is greater than a preset signal threshold.
CN202410012938.9A 2024-01-02 2024-01-02 Indoor-oriented distributed unmanned aerial vehicle cluster positioning and mapping method Active CN117830879B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410012938.9A CN117830879B (en) 2024-01-02 2024-01-02 Indoor-oriented distributed unmanned aerial vehicle cluster positioning and mapping method

Publications (2)

Publication Number Publication Date
CN117830879A true CN117830879A (en) 2024-04-05
CN117830879B CN117830879B (en) 2024-06-14

Family

ID=90504031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410012938.9A Active CN117830879B (en) 2024-01-02 2024-01-02 Indoor-oriented distributed unmanned aerial vehicle cluster positioning and mapping method

Country Status (1)

Country Link
CN (1) CN117830879B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110068335A (en) * 2019-04-23 2019-07-30 中国人民解放军国防科技大学 Unmanned aerial vehicle cluster real-time positioning method and system under GPS rejection environment
CN113625774A (en) * 2021-09-10 2021-11-09 天津大学 Multi-unmanned aerial vehicle cooperative positioning system and method for local map matching and end-to-end distance measurement
CN116295342A (en) * 2023-03-15 2023-06-23 中国民航大学 Multi-sensing state estimator for aircraft survey

Also Published As

Publication number Publication date
CN117830879B (en) 2024-06-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant