CN114842220B - Unmanned aerial vehicle visual positioning method based on multi-source image matching - Google Patents

Unmanned aerial vehicle visual positioning method based on multi-source image matching

Info

Publication number
CN114842220B
CN114842220B (application CN202210321285.3A; published as CN114842220A)
Authority
CN
China
Prior art keywords
aerial vehicle
unmanned aerial
image
convolution layer
positioning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210321285.3A
Other languages
Chinese (zh)
Other versions
CN114842220A (en)
Inventor
袁媛 (Yuan Yuan)
刘赶超 (Liu Ganchao)
李超 (Li Chao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN202210321285.3A priority Critical patent/CN114842220B/en
Publication of CN114842220A publication Critical patent/CN114842220A/en
Application granted granted Critical
Publication of CN114842220B publication Critical patent/CN114842220B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/42Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761Proximity, similarity or dissimilarity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/803Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of input or preprocessed data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/13Satellite images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/17Terrestrial scenes taken from planes or by drones
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10032Satellite or aerial image; Remote sensing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention provides an unmanned aerial vehicle visual positioning method based on multi-source image matching. Firstly, a feature extraction network is trained using existing multi-source matching images and real unmanned aerial vehicle scene images respectively; then, feature extraction is performed on the unmanned aerial vehicle image with the trained network, and the position of the unmanned aerial vehicle is estimated using position point information, so that the positioning search range is reduced; then, features are extracted from the satellite images within the estimated position range, and feature matching between the unmanned aerial vehicle image and the satellite images is performed using a similarity measure to obtain the unmanned aerial vehicle positioning result. The method can better solve the problem of heterogeneous matching between satellite images and unmanned aerial vehicle images, can be used in various application scenarios, has a small computational cost, and can better meet the real-time positioning requirement of the unmanned aerial vehicle platform.

Description

Unmanned aerial vehicle visual positioning method based on multi-source image matching
Technical Field
The invention belongs to the technical field of multi-source remote sensing matching, and particularly relates to an unmanned aerial vehicle visual positioning method based on multi-source image matching.
Background
Unmanned aerial vehicle positioning is usually realized by satellite navigation, but as a passive signal-receiving mode, the navigation signal is easily interfered with in special scenes. When the signal is lost, the accumulated error of the inertial measurement unit grows larger and larger over time. Computer vision processes and analyzes visual information through a computer system, realizing functions such as target detection, recognition, tracking and positioning, and has strong anti-interference capability. Therefore, unmanned aerial vehicle positioning based on visual matching can well solve the unmanned aerial vehicle positioning problem under satellite-denied conditions.
Unmanned aerial vehicle visual positioning methods fall roughly into three categories: map-free positioning methods (e.g., visual odometry), map-building positioning methods (e.g., simultaneous localization and mapping, SLAM), and map-based positioning methods (e.g., image matching). The three visual positioning methods have their own advantages, disadvantages and application ranges: map-building and map-free methods only require cameras mounted on UAVs (Unmanned Aerial Vehicles), but estimation errors of inter-frame motion accumulate severely; image-matching-based methods require an additional pre-recorded geo-referenced image library, but they can obtain the absolute position of the UAV without accumulating errors.
Image matching methods are mainly divided into traditional methods and deep learning methods. Traditional methods extract features with manually designed descriptors to realize remote sensing image matching, and mainly seek correspondences between local features (regions, lines and points) through descriptor similarity and/or spatial geometric relations. The use of locally significant features allows such methods to run quickly and be robust to noise, complex geometric deformations and significant radiometric differences. However, with the popularity of higher-resolution and larger-size data, these methods cannot meet the requirements of more correspondences, higher accuracy and more flexible applications. With the release of large labeled datasets, deep learning methods, in particular Convolutional Neural Networks (CNNs), have achieved very good results in the field of image matching. The main advantage of a CNN is that it can automatically learn features favorable for image matching under the guidance of labeled data. In contrast to manually designed descriptors, deep-learning-based features contain not only low-level spatial information but also high-level semantic information. Owing to this strong capability of automatic feature extraction, deep learning methods can obtain higher matching accuracy.
Although the unmanned aerial vehicle visual positioning method based on image matching has significant advantages, some problems still need to be solved: firstly, the imaging conditions of the geo-referenced image and the unmanned aerial vehicle image are different, so multi-source image matching suffers from the heterogeneity problem; in addition, a small amount of annotated data is difficult to adapt to various application scenarios; finally, the unmanned aerial vehicle platform also places strict requirements on real-time performance.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides an unmanned aerial vehicle visual positioning method based on multi-source image matching. Firstly, a feature extraction network is trained using existing multi-source matching images and real unmanned aerial vehicle scene images respectively; then, feature extraction is performed on the unmanned aerial vehicle image with the trained network, and the position of the unmanned aerial vehicle is estimated using position point information, so that the positioning search range is reduced; then, features are extracted from the satellite images within the estimated position range, and feature matching between the unmanned aerial vehicle image and the satellite images is performed using a similarity measure to obtain the unmanned aerial vehicle positioning result. The method can better solve the problem of heterogeneous matching between satellite images and unmanned aerial vehicle images, can be used in various application scenarios, has a small computational cost, and can better meet the real-time positioning requirement of the unmanned aerial vehicle platform.
The unmanned aerial vehicle visual positioning method based on multi-source image matching is characterized by comprising the following steps:
Step 1: training a twin network for feature extraction by adopting satellite and virtual unmanned aerial vehicle matching images acquired on Google Earth Pro, and storing the network parameters; retraining the network by using the tagged real-scene unmanned aerial vehicle takeoff position images to obtain a trained feature extraction network which is suitable for the real unmanned aerial vehicle working scene; the twin network for feature extraction comprises a convolution layer 1, a maximum pooling layer 1, a convolution layer 2, a maximum pooling layer 2, a convolution layer 3, a convolution layer 4, a convolution layer 5, a maximum pooling layer 3, a convolution layer 6 and a convolution layer 7 which are sequentially connected, wherein the convolution kernel size of the convolution layer 1 is 7×7×24 with step size 1, the convolution kernel size of the convolution layer 2 is 5×5×24 with step size 1, the convolution kernel sizes of the convolution layer 3 and the convolution layer 4 are 3×3×96 with step size 1, the convolution kernel size of the convolution layer 5 is 3×3×64 with step size 1, the convolution kernel size of the convolution layer 6 is 3×3×128, the convolution kernel size of the convolution layer 7 is 8×8×128 with step size 1, the kernel sizes of the maximum pooling layer 1 and the maximum pooling layer 2 are 3×3, and the kernel size of the maximum pooling layer 3 is 6×6;
Step 2: uniformly cutting the unmanned aerial vehicle image into three parts in the transverse and longitudinal directions respectively to form a nine-grid, downsampling each cell to 128×128, extracting features by utilizing the network trained in step 1, and letting the extracted 9 features jointly form the description I'_k of the whole unmanned aerial vehicle image;
Step 3: taking the n most recent positioning position points of the unmanned aerial vehicle at the current moment to carry out position estimation, respectively calculating the flight speed of the unmanned aerial vehicle in the n-1 position intervals in the longitude and latitude directions according to the longitude and latitude of each position point and the flight duration between every two position points, and recording the flight speed sequence over the n-1 position intervals in the longitude direction as V_long and that in the latitude direction as V_lat; the motion trail of the unmanned aerial vehicle is regarded as uniform motion, the speed means of the sequences V_long and V_lat obey the t distribution, and the speed range in the longitude or latitude direction is $[\bar{V}_* - t_{\alpha/2}(n-2)\sqrt{S_*^2/(n-1)},\ \bar{V}_* + t_{\alpha/2}(n-2)\sqrt{S_*^2/(n-1)}]$, wherein $\bar{V}_*$ and $S_*^2$ represent the mean and variance of the speed in the longitude or latitude direction respectively, α is the confidence parameter of the t distribution, and $t_{\alpha/2}(n-2)$ is the two-sided quantile of the t distribution for the n-1 sampling intervals at confidence level 1-α, obtained by looking up a t-distribution table; the value of n ranges from 3 to 6 and α is taken as 0.005; the first n positioning position points at the initial moment are initialized with the take-off position;
multiplying the speed range by the time difference between the current moment and the flight moment of the nth position point to obtain the displacement range over this period; if the obtained displacement range is smaller than 20 m, it is kept at 20 m; the displacement range is then added to the coordinates of the nth position point to obtain the current position range of the unmanned aerial vehicle;
Step 4: cropping the satellite image within the position range obtained in step 3 every 10 m, uniformly cutting each crop into three parts in the transverse and longitudinal directions to form a nine-grid, downsampling each cell to 128×128, extracting features by utilizing the network trained in step 1, and putting the 9 features of each satellite crop together with its position label into a library to be matched;
Step 5: traversing the library to be matched, calculating the L2 norm between the 9 features of each satellite image and the features at the corresponding nine-grid positions of the unmanned aerial vehicle image obtained in step 2, and taking the sum of the 9 L2 norm values as the similarity measure of the whole image, so that each satellite image in the library to be matched obtains a similarity measure with respect to the unmanned aerial vehicle image; the position label of the satellite image with the smallest similarity measure is taken as the current unmanned aerial vehicle positioning result; if the difference between the two smallest similarity measures is larger than a threshold β, the current positioning result is considered reliable and the current unmanned aerial vehicle positioning result is taken as a positioning position point; the threshold β is taken as 0.25;
step 6: and repeatedly executing the steps 2 to 5 until the visual navigation flight task is finished.
The beneficial effects of the invention are as follows: because the virtual dataset image and the real scene image are adopted to train the network respectively, the network can be better used for extracting global high-order semantic features, and the problem of heterogeneity among the multi-source images is solved; by adopting the position estimation method, the matching search range can be effectively reduced, the positioning accuracy is improved, and the real-time performance of the system is ensured.
Drawings
FIG. 1 is a flow chart of a method for visual positioning of an unmanned aerial vehicle based on multi-source image matching;
FIG. 2 is a flow chart of a position estimation of the present invention;
FIG. 3 is a result image of visual positioning of an unmanned aerial vehicle on a campus of Northwestern Polytechnical University using the method of the present invention.
Detailed Description
The invention will be further illustrated with reference to the following figures and examples, which include but are not limited to the following examples.
As shown in fig. 1, the invention provides an unmanned aerial vehicle visual positioning method based on multi-source image matching, which comprises the following specific implementation processes:
step 1: training a twin network for feature extraction by adopting satellite and virtual unmanned aerial vehicle matching images acquired on Google Earth Pro, and storing network parameters; retraining the network by using the tagged real-scene unmanned aerial vehicle takeoff position image to obtain a trained feature extraction network which is suitable for the real unmanned aerial vehicle working scene; the specific structure of the twin network for feature extraction is shown in table 1, wherein Conv represents a convolutional layer name, pooling represents a Pooling layer name, C represents a convolutional operation, and MP represents a maximum Pooling operation.
TABLE 1
Layer      Type   Output size    Kernel size   Stride
Conv1      C      128×128×24     7×7×24        1
Pooling1   MP     64×64×24       3×3           2
Conv2      C      64×64×64       5×5×24        1
Pooling2   MP     32×32×64       3×3           2
Conv3      C      32×32×96       3×3×96        1
Conv4      C      32×32×96       3×3×96        1
Conv5      C      32×32×64       3×3×64        1
Pooling3   MP     8×8×64         6×6           4
Conv6      C      8×8×128        3×3×128       1
Conv7      C      1×1×128        8×8×128       1
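As an illustration of the network described above, the following is a minimal PyTorch sketch of one branch of the feature-extraction twin network following Table 1. The input channel count, padding, activation functions and class name are assumptions not given in the patent; output channel counts follow the table's output-size column.

```python
import torch
import torch.nn as nn

class FeatureBranch(nn.Module):
    """One weight-shared branch of the feature-extraction twin network (Table 1).

    Assumptions: 3-channel input, ReLU activations, and padding chosen so that
    the intermediate sizes match the table (the patent does not specify these).
    """
    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 24, 7, stride=1, padding=3),  # Conv1 -> 128x128x24
            nn.ReLU(inplace=True),
            nn.MaxPool2d(3, stride=2, padding=1),                # Pooling1 -> 64x64x24
            nn.Conv2d(24, 64, 5, stride=1, padding=2),           # Conv2 -> 64x64x64
            nn.ReLU(inplace=True),
            nn.MaxPool2d(3, stride=2, padding=1),                # Pooling2 -> 32x32x64
            nn.Conv2d(64, 96, 3, stride=1, padding=1),           # Conv3 -> 32x32x96
            nn.ReLU(inplace=True),
            nn.Conv2d(96, 96, 3, stride=1, padding=1),           # Conv4 -> 32x32x96
            nn.ReLU(inplace=True),
            nn.Conv2d(96, 64, 3, stride=1, padding=1),           # Conv5 -> 32x32x64
            nn.ReLU(inplace=True),
            nn.MaxPool2d(6, stride=4, padding=1),                # Pooling3 -> 8x8x64
            nn.Conv2d(64, 128, 3, stride=1, padding=1),          # Conv6 -> 8x8x128
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, 8, stride=1),                    # Conv7 -> 1x1x128
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.features(x).flatten(1)  # (B, 128) descriptor per patch

if __name__ == "__main__":
    branch = FeatureBranch()
    uav_patch = torch.randn(1, 3, 128, 128)
    sat_patch = torch.randn(1, 3, 128, 128)
    # The twin applies the same branch (shared weights) to both inputs.
    print(branch(uav_patch).shape, branch(sat_patch).shape)  # (1, 128) each
```

Note that Table 1 lists a 5×5×24 kernel for Conv2 but a 64-channel output; the sketch follows the output column here.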
Step 2: uniformly cutting the unmanned aerial vehicle image into three parts in the transverse and longitudinal directions respectively to form a nine-grid, downsampling each cell to 128×128, extracting features by utilizing the network trained in step 1, and letting the extracted 9 features jointly form the description I'_k of the whole unmanned aerial vehicle image.
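A possible sketch of the nine-grid description in step 2, assuming the branch sketched above and a (C, H, W) image tensor (the function name and interpolation mode are illustrative, not from the patent):

```python
import torch
import torch.nn.functional as F

def nine_grid_features(image: torch.Tensor, branch: torch.nn.Module) -> torch.Tensor:
    """Split an image (C, H, W) into a 3x3 grid, downsample each cell to
    128x128 and extract one 128-D feature per cell, giving the (9, 128)
    description I'_k of the whole image."""
    c, h, w = image.shape
    cells = []
    for i in range(3):
        for j in range(3):
            cell = image[:, i * h // 3:(i + 1) * h // 3,
                            j * w // 3:(j + 1) * w // 3]
            cells.append(F.interpolate(cell.unsqueeze(0), size=(128, 128),
                                       mode="bilinear", align_corners=False))
    with torch.no_grad():
        return branch(torch.cat(cells, dim=0))  # (9, 128)
```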
Step 3: taking the n most recent trusted positioning position points of the unmanned aerial vehicle at the current moment to perform position estimation (by default the position information of the area around the take-off position is known, so the earliest n position points can be directly initialized at the take-off position; the n trusted position points are then continuously and iteratively updated through matching during flight), respectively calculating the flight speed of the unmanned aerial vehicle in the n-1 position intervals in the longitude and latitude directions according to the longitude and latitude of each position point and the flight duration between every two position points, and recording the flight speed sequence over the n-1 position intervals in the longitude direction as V_long and that in the latitude direction as V_lat; the motion trail of the unmanned aerial vehicle is regarded as uniform motion, the speed means of the sequences V_long and V_lat obey the t distribution, and the speed range in the longitude or latitude direction is $[\bar{V}_* - t_{\alpha/2}(n-2)\sqrt{S_*^2/(n-1)},\ \bar{V}_* + t_{\alpha/2}(n-2)\sqrt{S_*^2/(n-1)}]$, wherein $\bar{V}_*$ and $S_*^2$ represent the mean and variance of the speed in the longitude or latitude direction respectively, α is the confidence parameter of the t distribution, and $t_{\alpha/2}(n-2)$ is the two-sided quantile of the t distribution for the n-1 sampling intervals at confidence level 1-α, obtained by looking up a t-distribution table; the value of n ranges from 3 to 6 and α is taken as 0.005;
multiplying the time difference between the current moment and the flight moment of the nth position point by the speed range to obtain the displacement range over this period; if the obtained displacement range is smaller than 20 m, it is kept at 20 m; the displacement range is then added to the coordinates of the nth position point to obtain the current position range of the unmanned aerial vehicle;
the above-described position estimation process is shown in fig. 2.
Step 4: cropping the satellite image within the position range obtained in step 3 every 10 m, uniformly cutting each crop into three parts in the transverse and longitudinal directions to form a nine-grid, downsampling each cell to 128×128, extracting features by utilizing the network trained in step 1, and putting the 9 features of each satellite crop together with its position label into a library to be matched.
Step 5: traversing the library to be matched, calculating the L2 norm between the 9 features of each satellite image and the features at the corresponding nine-grid positions of the unmanned aerial vehicle image obtained in step 2, and taking the sum of the 9 L2 norm values as the similarity measure of the whole image, so that each satellite image in the library to be matched obtains a similarity measure with respect to the unmanned aerial vehicle image; the position label of the satellite image with the smallest similarity measure is taken as the current unmanned aerial vehicle positioning result. If the difference between the two smallest similarity measures is greater than the threshold β, the current matching result is considered highly reliable and is appended to the tail of the sequence to update the n trusted positioning points in step 3; the threshold β is taken as 0.25.
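The matching in step 5 might look like the following sketch; the library layout (a list of feature/label pairs from step 4) is an assumed data structure.

```python
import torch

def match_uav_to_library(uav_feats: torch.Tensor, library, beta: float = 0.25):
    """uav_feats: (9, 128) nine-grid description of the UAV image.
    library: iterable of (sat_feats (9, 128), position_label) pairs.
    Returns (best_position, reliable) per the beta-gap reliability test."""
    scored = []
    for sat_feats, pos in library:
        # Sum of the 9 per-cell L2 distances = similarity measure of the image.
        scored.append((torch.norm(uav_feats - sat_feats, dim=1).sum().item(), pos))
    scored.sort(key=lambda item: item[0])
    best_score, best_pos = scored[0]
    second_score = scored[1][0] if len(scored) > 1 else float("inf")
    return best_pos, (second_score - best_score) > beta
```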
Step 6: and repeatedly executing the steps 2 to 5 until the visual navigation flight task is finished.
To verify the effectiveness of the method of the invention, simulation experiments were carried out on a machine with an Intel Xeon E5-2680 v4 2.40 GHz CPU, 64 GB of memory, an NVIDIA RTX 3090 graphics card and the Ubuntu 16.04 operating system, in a PyTorch 1.7.1 and Python 3.8.5 environment. For the virtual data, the central part of Los Angeles was constructed as the experimental area using the 3D modeling function of Google Earth Pro software, covering 4942 × 3408 m². The area contains typical urban landscape with many houses, streets and vehicles, as well as open suburban terrain. Simulated unmanned aerial vehicle images were obtained from this model. For the converted aerial images and the cropped satellite images, the position of each image is annotated with its coordinates, so that matching pairs of the two heterogeneous images are generated, yielding 10000 virtual samples. At the same time, real unmanned aerial vehicle images were obtained by aerial photography and manually annotated to generate the corresponding matched satellite images; each scene of the real data contains 600 sample pairs. To simulate the actual application scenario, a long-distance unmanned aerial vehicle video was also annotated: duration 4 minutes, flight distance 2 km, flight height 170 m, resolution 1920×1080 at 30 frames per second.
Stochastic Gradient Descent (SGD) is used as the optimizer to optimize the network parameters, with the following settings: learning rate 0.01, momentum 0.9, and 50 epochs in total. In the virtual-data training stage, the learning rate is reduced by a factor of 10 every 20 epochs; in the fine-tuning stage, 30 images are taken, data augmentation is performed by rotation, and the learning rate is kept unchanged.
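For reference, the training setup described above corresponds roughly to the following PyTorch configuration; the placeholder model, loss function and data pipeline are omitted or assumed, since the text does not specify them.

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 24, 7)  # placeholder for the twin branch sketched earlier
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Virtual-data stage: 50 epochs, learning rate divided by 10 every 20 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.1)
for epoch in range(50):
    # ... one training pass over the virtual matching pairs would go here ...
    scheduler.step()

# Fine-tuning stage: 30 real take-off images with rotation augmentation,
# same SGD settings, learning rate kept constant (no scheduler).
```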
Table 2 shows the matching accuracy obtained after training the network with the virtual data made with Google Earth Pro and with the real aerial data, respectively. It can be seen that after training on the virtual data the network already has a preliminary heterogeneous matching capability, but the effect of directly transferring to the real data is low, only 36.4%; after the network model is fine-tuned on the real data, the matching effect improves remarkably, and the performance on the real data rises greatly to 61.5%. Table 3 shows the results after adding the nine-grid cutting process to the images. It can be seen that the test accuracy of each stage improves remarkably after the nine-grid is added. Fig. 3 shows a result image of visual positioning with the method on a campus of Northwestern Polytechnical University, where gray solid points represent the actual flight positions and white hollow points represent the visual positioning results at those positions. It can be seen that the visual positioning results lie substantially around the true positions, showing good positioning accuracy.
TABLE 2
                        Virtual data training   Virtual data training + real data fine-tuning
Virtual data testing    66.3%                   48.7%
Real data testing       36.4%                   61.5%
TABLE 3
                     Test after virtual data training   Test after virtual data training + real data fine-tuning
Without nine-grid    66.3%                              61.5%
With nine-grid       75.4%                              69.7%

Claims (1)

1. The unmanned aerial vehicle visual positioning method based on multi-source image matching is characterized by comprising the following steps:
Step 1: training a twin network for feature extraction by adopting satellite and virtual unmanned aerial vehicle matching images acquired on Google Earth Pro, and storing the network parameters; retraining the network by using the tagged real-scene unmanned aerial vehicle takeoff position images to obtain a trained feature extraction network which is suitable for the real unmanned aerial vehicle working scene; the twin network for feature extraction comprises a convolution layer 1, a maximum pooling layer 1, a convolution layer 2, a maximum pooling layer 2, a convolution layer 3, a convolution layer 4, a convolution layer 5, a maximum pooling layer 3, a convolution layer 6 and a convolution layer 7 which are sequentially connected, wherein the convolution kernel size of the convolution layer 1 is 7×7×24 with step size 1, the convolution kernel size of the convolution layer 2 is 5×5×24 with step size 1, the convolution kernel sizes of the convolution layer 3 and the convolution layer 4 are 3×3×96 with step size 1, the convolution kernel size of the convolution layer 5 is 3×3×64 with step size 1, the convolution kernel size of the convolution layer 6 is 3×3×128, the convolution kernel size of the convolution layer 7 is 8×8×128 with step size 1, the kernel sizes of the maximum pooling layer 1 and the maximum pooling layer 2 are 3×3, and the kernel size of the maximum pooling layer 3 is 6×6;
Step 2: uniformly cutting the unmanned aerial vehicle image into three parts in the transverse and longitudinal directions respectively to form a nine-grid, downsampling each cell to 128×128, extracting features by utilizing the network trained in step 1, and letting the extracted 9 features jointly form the description I'_k of the whole unmanned aerial vehicle image;
Step 3: taking the n most recent positioning position points of the unmanned aerial vehicle at the current moment to carry out position estimation, respectively calculating the flight speed of the unmanned aerial vehicle in the n-1 position intervals in the longitude and latitude directions according to the longitude and latitude of each position point and the flight duration between every two position points, and recording the flight speed sequence over the n-1 position intervals in the longitude direction as V_long and that in the latitude direction as V_lat; the motion trail of the unmanned aerial vehicle is regarded as uniform motion, the speed means of the sequences V_long and V_lat obey the t distribution, and the speed range in the longitude or latitude direction is $[\bar{V}_* - t_{\alpha/2}(n-2)\sqrt{S_*^2/(n-1)},\ \bar{V}_* + t_{\alpha/2}(n-2)\sqrt{S_*^2/(n-1)}]$, wherein $\bar{V}_*$ and $S_*^2$ represent the mean and variance of the speed in the longitude or latitude direction respectively, α is the confidence parameter of the t distribution, and $t_{\alpha/2}(n-2)$ is the two-sided quantile of the t distribution for the n-1 sampling intervals at confidence level 1-α, obtained by looking up a t-distribution table; the value of n ranges from 3 to 6 and α is taken as 0.005; the first n positioning position points at the initial moment are initialized with the take-off position;
multiplying the speed range by the time difference between the current moment and the flight moment of the nth position point to obtain the displacement range over this period; if the obtained displacement range is smaller than 20 m, it is kept at 20 m; the displacement range is then added to the coordinates of the nth position point to obtain the current position range of the unmanned aerial vehicle;
Step 4: cropping the satellite image within the position range obtained in step 3 every 10 m, uniformly cutting each crop into three parts in the transverse and longitudinal directions to form a nine-grid, downsampling each cell to 128×128, extracting features by utilizing the network trained in step 1, and putting the 9 features of each satellite crop together with its position label into a library to be matched;
Step 5: traversing the library to be matched, calculating the L2 norm between the 9 features of each satellite image and the features at the corresponding nine-grid positions of the unmanned aerial vehicle image obtained in step 2, and taking the sum of the 9 L2 norm values as the similarity measure of the whole image, so that each satellite image in the library to be matched obtains a similarity measure with respect to the unmanned aerial vehicle image; the position label of the satellite image with the smallest similarity measure is taken as the current unmanned aerial vehicle positioning result; if the difference between the two smallest similarity measures is larger than a threshold β, the current positioning result is considered reliable and the current unmanned aerial vehicle positioning result is taken as a positioning position point; the threshold β is taken as 0.25;
step 6: and repeatedly executing the steps 2 to 5 until the visual navigation flight task is finished.
CN202210321285.3A 2022-03-24 2022-03-24 Unmanned aerial vehicle visual positioning method based on multi-source image matching Active CN114842220B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210321285.3A CN114842220B (en) 2022-03-24 2022-03-24 Unmanned aerial vehicle visual positioning method based on multi-source image matching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210321285.3A CN114842220B (en) 2022-03-24 2022-03-24 Unmanned aerial vehicle visual positioning method based on multi-source image matching

Publications (2)

Publication Number Publication Date
CN114842220A CN114842220A (en) 2022-08-02
CN114842220B true CN114842220B (en) 2024-02-27

Family

ID=82563377

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210321285.3A Active CN114842220B (en) 2022-03-24 2022-03-24 Unmanned aerial vehicle visual positioning method based on multi-source image matching

Country Status (1)

Country Link
CN (1) CN114842220B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020181685A1 (en) * 2019-03-12 2020-09-17 南京邮电大学 Vehicle-mounted video target detection method based on deep learning
CN113239952A (en) * 2021-03-30 2021-08-10 西北工业大学 Aerial image geographical positioning method based on spatial scale attention mechanism and vector map
CN113361508A (en) * 2021-08-11 2021-09-07 四川省人工智能研究院(宜宾) Cross-view-angle geographic positioning method based on unmanned aerial vehicle-satellite
WO2022022695A1 (en) * 2020-07-31 2022-02-03 华为技术有限公司 Image recognition method and apparatus

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020181685A1 (en) * 2019-03-12 2020-09-17 南京邮电大学 Vehicle-mounted video target detection method based on deep learning
WO2022022695A1 (en) * 2020-07-31 2022-02-03 华为技术有限公司 Image recognition method and apparatus
CN113239952A (en) * 2021-03-30 2021-08-10 西北工业大学 Aerial image geographical positioning method based on spatial scale attention mechanism and vector map
CN113361508A (en) * 2021-08-11 2021-09-07 四川省人工智能研究院(宜宾) Cross-view-angle geographic positioning method based on unmanned aerial vehicle-satellite

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Research on UAV target localization method based on POS and image matching; Zhang Yan; Li Jianzeng; Li Deliang; Du Yulong; Journal of Ordnance Engineering College; 2015-02-15 (01); full text *
Convolutional neural network method for registration of UAV images and satellite images; Lan Chaozhen; Shi Qunshan; Cui Zhixiang; Qin Jianqi; Xu Qing; Journal of Geomatics Science and Technology; 2020-02-15 (01); full text *

Also Published As

Publication number Publication date
CN114842220A (en) 2022-08-02

Similar Documents

Publication Publication Date Title
KR102273559B1 (en) Method, apparatus, and computer readable storage medium for updating electronic map
CN111862126B (en) Non-cooperative target relative pose estimation method combining deep learning and geometric algorithm
CN107024216B (en) Intelligent vehicle fusion positioning system and method introducing panoramic map
WO2019153245A1 (en) Systems and methods for deep localization and segmentation with 3d semantic map
GB2568286A (en) Method of computer vision based localisation and navigation and system for performing the same
US11430199B2 (en) Feature recognition assisted super-resolution method
CN113593017A (en) Method, device and equipment for constructing surface three-dimensional model of strip mine and storage medium
CN109033245B (en) Mobile robot vision-radar image cross-modal retrieval method
CN113052106B (en) Airplane take-off and landing runway identification method based on PSPNet network
AU2020375559B2 (en) Systems and methods for generating annotations of structured, static objects in aerial imagery using geometric transfer learning and probabilistic localization
CN113516664A (en) Visual SLAM method based on semantic segmentation dynamic points
CN111611918B (en) Traffic flow data set acquisition and construction method based on aerial data and deep learning
CN114564545A (en) System and method for extracting ship experience course based on AIS historical data
CN111812978B (en) Cooperative SLAM method and system for multiple unmanned aerial vehicles
CN115496900A (en) Sparse fusion-based online carbon semantic map construction method
JP2020153956A (en) Mobile location estimation system and mobile location method
CN114842220B (en) Unmanned aerial vehicle visual positioning method based on multi-source image matching
CN113227713A (en) Method and system for generating environment model for positioning
CN115187614A (en) Real-time simultaneous positioning and mapping method based on STDC semantic segmentation network
CN110826432B (en) Power transmission line identification method based on aviation picture
CN112836586A (en) Intersection information determination method, system and device
Sikdar et al. Unconstrained Vision Guided UAV Based Safe Helicopter Landing
Sun et al. Accurate deep direct geo-localization from ground imagery and phone-grade gps
CN115994934B (en) Data time alignment method and device and domain controller
Astudillo et al. Mono-LSDE: Lightweight Semantic-CNN for Depth Estimation from Monocular Aerial Images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant