CN115619959B - Comprehensive environment three-dimensional modeling method for extracting key frames based on videos acquired by unmanned aerial vehicle - Google Patents

Comprehensive environment three-dimensional modeling method for extracting key frames based on videos acquired by unmanned aerial vehicle

Info

Publication number
CN115619959B
CN115619959B (application CN202211631541.5A)
Authority
CN
China
Prior art keywords
key frame
flight
key
representing
cost
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211631541.5A
Other languages
Chinese (zh)
Other versions
CN115619959A (en)
Inventor
朱义勇
梁亢聘
李金玖
袁渊
袁现旺
慕昊润
张洪碧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN202211631541.5A priority Critical patent/CN115619959B/en
Publication of CN115619959A publication Critical patent/CN115619959A/en
Application granted granted Critical
Publication of CN115619959B publication Critical patent/CN115619959B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05Geographic models
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B64AIRCRAFT; AVIATION; COSMONAUTICS
    • B64CAEROPLANES; HELICOPTERS
    • B64C39/00Aircraft not otherwise provided for
    • B64C39/02Aircraft not otherwise provided for characterised by special use
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761Proximity, similarity or dissimilarity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/17Terrestrial scenes taken from planes or by drones
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00Indexing scheme for image data processing or generation, in general
    • G06T2200/08Indexing scheme for image data processing or generation, in general involving all processing steps from image acquisition to 3D model generation
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The comprehensive environment three-dimensional modeling method comprises: obtaining an optimal route covering the area to be modeled from the optimization result of a comprehensive preference function that considers constraint conditions such as various costs, and flying the unmanned aerial vehicle along the optimal route to acquire comprehensive environment video data; measuring the image frames of the comprehensive environment video data with the Bhattacharyya distance, by the method of calculating image-frame similarity, to extract a key frame set; extracting several groups of adjacent key frame combinations from the key frame set, computing their degree of overlap with an improved SIFT algorithm, and re-extracting and supplementing frames for combinations that do not meet the accuracy requirement, to obtain a screened and supplemented key frame set; and embedding geographic information data into the screened and supplemented key frame set and preprocessing it, to construct a comprehensive environment three-dimensional model of the area to be modeled. The method ensures stable and efficient unmanned aerial vehicle operation and reduces the operators' burden, and it delivers a terrain three-dimensional model carrying geographic information data, making it convenient and practical.

Description

Comprehensive environment three-dimensional modeling method for extracting key frames based on videos acquired by unmanned aerial vehicle
Technical Field
The invention relates to the technical field of unmanned aerial vehicle remote sensing, in particular to a comprehensive environment three-dimensional modeling method for extracting key frames based on videos acquired by an unmanned aerial vehicle.
Background
It is common practice in many industry sectors to use drones to take pictures or videos to view a particular area. For portability, most current operation scenes use micro unmanned aerial vehicles carrying a single lens. A single lens can only acquire a two-dimensional plane image or a two-dimensional video, so viewing the terrain comprehensively and assessing the situation later requires repeatedly switching among many pictures or continuously replaying video, which causes problems such as long time consumption, inconvenient operation and difficult plotting.
Given that a single lens can only obtain a two-dimensional plane image or a two-dimensional video, three-dimensional modeling can overcome this inconvenience. However, the solutions for constructing three-dimensional models that are popular in the market have the following defects. First, the operation flow is complicated: image control points need to be selected and marked in advance during field work, and the whole flow is not fully suited to scenes with different environments. Second, in unmanned aerial vehicle flight operations the acquired data consist mainly of photos; if the photos are too few, information may be missed or lost, and if they are too many, subsequent three-dimensional modeling is greatly burdened, and the trade-off between the two is difficult to determine. Third, field operations take a long time: the flight speed of the unmanned aerial vehicle is limited by factors such as the optical flow phenomenon and sensor sensitivity, a low-altitude, low-speed unmanned aerial vehicle has difficulty working stably under certain severe conditions, and long airborne times are sometimes not permitted by meteorological conditions, battery capacity and scene sensitivity.
Disclosure of Invention
In order to overcome at least one of the technical defects of the prior art mentioned above, the invention provides a comprehensive environment three-dimensional modeling method for extracting key frames based on videos acquired by an unmanned aerial vehicle, which speeds up generation of the three-dimensional model, updates the three-dimensional model as quickly as possible, and dispels the information fog in application scenes such as mountain forests and urban environments, facilitating operators' decision making.
In order to achieve the purpose, the invention provides a comprehensive environment three-dimensional modeling method for extracting key frames based on videos collected by an unmanned aerial vehicle, which comprises the following steps:
acquiring an optimal route covering the area to be modeled based on the optimization result of a comprehensive preference function considering one or more constraints among route length cost, dead time cost, special operation risk zone cost and no-fly zone cost, and flying the unmanned aerial vehicle along the optimal route to acquire comprehensive environment video data;
measuring the image frames of the comprehensive environment video data with the Bhattacharyya distance, by the method of calculating image-frame similarity, so as to extract a key frame set;
extracting several groups of adjacent key frame combinations from the key frame set, computing their degree of overlap with an improved SIFT algorithm, and re-extracting and supplementing frames for frame combinations that do not meet the accuracy requirement, to obtain a screened and supplemented key frame set;
and embedding geographic information data into the screened and supplemented key frame set and preprocessing it, so as to construct a comprehensive environment three-dimensional model of the area to be modeled.
Further, the expression of the comprehensive preference function is:

F = Σ_{i=1..n} Σ_{j=1..m} λ_j · c_{ij}

where F represents the comprehensive preference value of the route planning; m represents the number of constraints; n represents the number of flight legs of the unmanned aerial vehicle; c_{ij} represents the cost of the j-th constraint during flight of the i-th leg; and λ_j represents the preference coefficient, i.e. the preference given to the j-th constraint in a particular route planning.
Further, the expression of the route length cost is:

c_{i1} = l_i / L

where c_{i1} represents the route length cost during flight of the i-th leg; l_i represents the path length of the i-th leg; and L represents the straight-line distance of the entire route, covering all legs, in the most ideal case.
Further, the expression of the dead time cost is:

c_{i2} = t_i / T

where c_{i2} represents the dead time cost during flight of the i-th leg; t_i represents the time taken to fly the i-th leg; and T represents the time needed by the unmanned aerial vehicle to fly the straight-line route at constant speed in the most ideal case.
Further, the expression of the special operation risk zone cost is:

c_{i3} = (R_3 / r_{i3})² · (l_i / L)  if r_{i3} < R_3;  c_{i3} = 0  otherwise

where c_{i3} represents the special operation risk zone cost during flight of the i-th leg; R_3 represents the threat radius of the risk zone; r_{i3} represents the shortest distance between the path of the i-th leg and the center of the risk zone; l_i represents the path length of the i-th leg; and L represents the straight-line distance of the entire route, covering all legs, in the most ideal case.
Further, the expression of the no-fly zone cost is:

c_{i4} = ∞  if r_{i4} ≤ R_4;  c_{i4} = 0  if r_{i4} > R_4

where c_{i4} represents the no-fly zone cost during flight of the i-th leg; r_{i4} represents the shortest distance between the path of the i-th leg and the control point of the no-fly zone; and R_4 represents the no-fly radius of the no-fly zone.
Further, measuring the image frames of the comprehensive environment video data with the Bhattacharyya distance, by the method of calculating image-frame similarity, to extract the key frame set specifically includes:

extracting one image frame as a candidate key frame every preset number of image frames;

calculating, by the formula

D_B(p, q) = −ln( BC(p, q) ),  BC(p, q) = Σ_x √( p(x) · q(x) )

the Bhattacharyya distance between the extracted candidate key frame and the previous key frame, where the first key frame is initialized with the first acquired image frame; p and q represent the two image frames; p(x) and q(x) represent the gray values of the two image frames at x; D_B(p, q) represents the Bhattacharyya distance of the two image frames; and BC(p, q) is the Bhattacharyya coefficient;

judging whether the Bhattacharyya coefficient BC(p, q) exceeds a preset Bhattacharyya coefficient threshold; if it does not, the candidate key frame is judged to be a new key frame;

continuing to extract one image frame as the next candidate key frame every preset number of image frames, judging whether it is a new key frame by the above Bhattacharyya coefficient threshold judgment, and repeating in sequence until all image frames in the comprehensive environment video data have been traversed and judged for one round, to obtain the key frame set.
Further, the improved SIFT algorithm comprises:

when determining the orientation, setting the Gaussian weighting parameter to a preset multiple of the key point scale, then computing the histogram of n orientations to obtain the histogram statistics h_1, h_2, …, h_n; taking the largest of them, h_max, as the orientation of the key point, ignoring the selection of an auxiliary orientation, and recording the histogram sequence so it can be reused in the feature vector generation calculation.
Further, embedding geographic information data into the screened and supplemented key frame set specifically includes:

a program extracts the geographic information data stored in the ROM of the unmanned aerial vehicle; the key frames in the screened and supplemented key frame set are then matched one-to-one with the geographic information data by comparing the key frames' timestamps, and the geographic information data are added to the exif metadata of each key frame using the API provided by exiv2.
Further, the preprocessing specifically comprises:

changing the color space of the key frames in the screened and supplemented key frame set, converting the RGB space into the HSV space, and performing histogram equalization on the value (gray) component V;

calculating the probability of each gray level by the formula

p(r_k) = n_k / n,  k = 0, 1, …, L−1

where n represents the total number of pixels, r_k represents the k-th gray level, and n_k represents the number of pixels at the k-th gray level;

transforming all pixels by the transformation formula

s_k = T(r_k) = Σ_{j=0..k} p(r_j)

where s_k represents the transformed pixel value.
In general, compared with the prior art, the above technical solution contemplated by the present invention can achieve the following beneficial effects:
(1) In application, the unmanned aerial vehicle of the method acquires data by shooting video, so a higher flight speed can be used to sweep over the surveyed region; this effectively reduces the unmanned aerial vehicle's dead time and operating time, keeps it within the safe working time allowed by its limited battery capacity, and reduces the operators' burden and the demands on their piloting skills. The improved workflow abandons highly specialized steps such as marking image control points, and recording video places relatively low demands on the route, which simplifies operator training and keeps the complexity of the whole workflow low. The method obtains the optimal route covering the area to be modeled from the optimization result of a comprehensive preference function considering one or more constraints among route length cost, dead time cost, special operation risk zone cost and no-fly zone cost, so the unmanned aerial vehicle can collect video data of the area to be modeled comprehensively while the route remains most economical.
(2) The method measures the image frames of the comprehensive environment video data with the Bhattacharyya distance, by the method of calculating image-frame similarity in the video stream, and takes the extracted key frame set as the data source for model synthesis; this greatly reduces the data volume, greatly relieves the computational pressure of model synthesis, and makes dynamic updating of the synthesized environment three-dimensional model possible as application requirements evolve.
(3) The method extracts several groups of adjacent key frame combinations from the key frame set, computes their degree of overlap with an improved SIFT algorithm, and re-extracts and supplements frames for frame combinations that do not meet the accuracy requirement, thereby obtaining the screened and supplemented key frame set; this performs one round of inspection and quality evaluation on the key frame set and improves the accuracy of key frame extraction.
(4) The method embeds geographic information data into the screened key frame set. Compared with traditional information carriers, the three-dimensional model constructed by the method has the distinct advantages of carrying accurate geographic position information and supporting calculations directly on the model, which improves the model's secondary development potential.
(5) The method preprocesses the screened key frames with an image enhancement algorithm based on histogram equalization, enhancing image contrast and definition; experiments prove that the number of matched key points increases significantly, which helps improve the precision and efficiency of subsequent modeling.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart of a comprehensive environment three-dimensional modeling method for extracting key frames based on videos collected by an unmanned aerial vehicle according to an embodiment of the present invention;
FIG. 2 is a schematic view of a route that overall uses a "well"-shaped (grid) route supplemented with surrounding routes according to an embodiment of the present invention; in FIG. 2, 1 and 2 are no-fly zones, 3 and 4 are risk zones, 5 and 6 are dead-time-limited zones, the long "well"-shaped chain indicated by 7 is the "well"-shaped route, and the several surrounding short chains indicated by 8 are the surrounding routes;
FIG. 3 is a schematic view of a "well" pattern of the flight path provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of a surrounding route according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The terms "comprises" or "comprising," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements but may alternatively include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The comprehensive environment three-dimensional modeling method for extracting key frames based on videos collected by an unmanned aerial vehicle mainly involves the following design aspects. First, an optimized data acquisition flow for the unmanned aerial vehicle: data acquisition is primarily by video shooting, while the unmanned aerial vehicle's dead time is reduced as much as possible by raising the flight speed, simplifying the route, and similar methods. Second, an algorithm for key frame identification and extraction, used to keep the extracted key frame set as small as possible and reduce computation time while still meeting the overlap requirement of modeling. Third, a sampling inspection method for the frame set: the key frame set generated by the extraction algorithm needs one round of inspection and checking, and missing image information is supplemented promptly when a problem is found. Fourth, embedding geographic information into the key frame set, which ensures the generated model carries accurate geographic information data, guarantees the model's functionality and extensibility, and makes subsequent secondary development possible. More specific schemes are set forth below.
As shown in fig. 1, in an embodiment, a method for three-dimensional modeling of an integrated environment based on extraction of keyframes from videos captured by an unmanned aerial vehicle mainly includes the following steps:
step 1, obtaining an optimal air route covering an area needing modeling based on an optimization result of a comprehensive preference function considering one or more constraint conditions including air route length cost, dead time cost, special operation risk area cost and no-fly area cost, and enabling an unmanned aerial vehicle to sail along the optimal air route to obtain comprehensive environment video data.
This step optimizes the comprehensive environment data acquisition process. When the unmanned aerial vehicle operates, the sensitivity of each factor changes with the comprehensive environment, so operators must obtain as much high-value information as possible within the unmanned aerial vehicle's limited dead time. The goal of the optimization is to plan a route covering the area to be modeled under constraints such as environment, battery capacity and dead time, comprehensively considered; this is a typical multi-objective optimization problem.
First, the constraints mentioned above, which have different physical meanings, are converted into dimensionless satisfaction indicators of the same order of magnitude, which serve as parameters of a global preference function of the following form:

F = Σ_{i=1..n} Σ_{j=1..m} λ_j · c_{ij}    (1)

where F is the comprehensive preference value of the route planning; m is the number of constraints; n is the number of flight legs of the unmanned aerial vehicle; c_{ij} is the cost of the j-th constraint during flight of the i-th leg; and λ_j is a preference coefficient set to adapt to various application scenarios, representing the preference given to the j-th constraint in a particular route planning task. This form improves on the original approach of averaging the constraint preference functions and taking the logarithm; with the added preference coefficients, the result of the formula can be adjusted to match the emphasis of the operational requirement. Experiments show that the following 4 constraints need to be considered in route planning: route length cost, dead time cost, special operation risk zone cost, and no-fly zone cost.
The first is the route length cost. The route length is the length of the actual flight track from takeoff to return. In the ideal case a straight-line path is undoubtedly the most economical and practical solution, but, constrained by various factors, the route inevitably bends and folds, so this cost is the ratio of the bent route to the straight-line distance, computed as:

c_{i1} = l_i / L    (2)

where c_{i1} is the route length cost during flight of the i-th leg; l_i is the path length of the i-th leg; and L is the straight-line distance of the entire route, covering all legs, in the most ideal case.
The second is the dead time cost. The dead time of a small unmanned aerial vehicle is limited, and improving data acquisition efficiency as much as possible under the constraint of limited operating time is an important research direction. Based on this consideration, this part of the cost function is computed as:

c_{i2} = t_i / T    (3)

where c_{i2} is the dead time cost during flight of the i-th leg; t_i is the dead time (the time taken to fly) of the i-th leg; and T is the time needed by the unmanned aerial vehicle to fly the straight-line route at constant speed in the most ideal case; the preferred optimal constant cruising speed is 20 km/h.
The third is the risk cost in various special operations (the special operation risk zone cost). In some special applications this problem must be considered; the ground threat is taken to be inversely proportional to the square of the distance of the unmanned aerial vehicle from the center of the risk zone, so this part of the cost is computed as:

c_{i3} = (R_3 / r_{i3})² · (l_i / L),  if r_{i3} < R_3    (4)

c_{i3} = 0,  if r_{i3} ≥ R_3    (5)

where c_{i3} is the special operation risk zone cost during flight of the i-th leg; R_3 is the threat radius of the risk zone; r_{i3} is the shortest distance between the path of the i-th leg and the center of the risk zone; l_i is the path length of the i-th leg; and L is the straight-line distance of the entire route, covering all legs, in the most ideal case.
The fourth is the cost of terrain and pre-designated no-fly zones (the no-fly zone cost). Limited by the performance of a micro unmanned aerial vehicle, its flight altitude and speed are affected by factors such as air pressure, humidity and wind speed, so locally extreme regions of the natural environment are unsuitable for micro unmanned aerial vehicle operation. Meanwhile, owing to factors such as pre-made plans, some regions are designated no-fly zones, which the unmanned aerial vehicle must not enter. Considering these factors together, this part of the cost is computed as:

c_{i4} = ∞,  if r_{i4} ≤ R_4;  c_{i4} = 0,  if r_{i4} > R_4    (6)

where c_{i4} is the no-fly zone cost during flight of the i-th leg; R_4 is the no-fly radius of the no-fly zone or of a region whose terrain environment is so severe that the unmanned aerial vehicle should not fly there; and r_{i4} is the shortest distance between the path of the i-th leg and the control point of the no-fly zone.
In formula (1), the design of λ follows the operators' preference and pre-made plans, and can be adjusted adaptively to the performance of the unmanned aerial vehicle, the environment, and other factors. As a suggested design principle for this embodiment, the constraints can be divided into six levels according to their importance, from extremely important down to not of concern, each level corresponding to a different preference coefficient, with larger coefficients assigned to the more important constraints (7). Given the geographic environment and the degree of importance placed on each factor, the operator can weigh the environment of the area to be mapped and make trade-offs among the factors.
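For illustration only, the following minimal Python sketch shows how the four per-leg costs and the comprehensive preference value of formula (1) could be combined. The function names, the piecewise forms of the risk and no-fly costs, and the coefficient values are assumptions of this sketch, not a definitive implementation of the patented method.

import math

def leg_costs(l_i, t_i, L, T, r3, R3, r4, R4):
    """Return the four per-leg costs c_i1..c_i4 (assumed forms, formulas (2)-(6))."""
    c1 = l_i / L                                            # route length cost, (2)
    c2 = t_i / T                                            # dead time cost, (3)
    c3 = (R3 / r3) ** 2 * (l_i / L) if r3 < R3 else 0.0     # risk zone cost, (4)/(5)
    c4 = math.inf if r4 <= R4 else 0.0                      # no-fly zone: hard constraint, (6)
    return [c1, c2, c3, c4]

def preference(legs, lam):
    """Comprehensive preference value F = sum_i sum_j lam_j * c_ij, formula (1)."""
    return sum(lam[j] * c[j] for c in legs for j in range(len(lam)))

# Example: two legs, preference coefficients chosen by the operator (assumed values)
lam = [1.0, 0.8, 0.4, 1.0]
legs = [leg_costs(1200, 240, 2000, 180, 500, 300, 900, 400),
        leg_costs(1500, 300, 2000, 180, 250, 300, 1200, 400)]
print(preference(legs, lam))

A candidate route that crosses a no-fly zone thus receives an infinite preference value and is rejected outright, while the other three costs trade off smoothly under the operator-chosen coefficients.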
Through the above, the comprehensive environment can be segmented to form a multi-level, optional, highly operable route planning base map. On this basis, considering the overlap requirement of three-dimensional modeling, two route modes, the surrounding route and the "well"-shaped route, are combined to perform global coverage reconnaissance of the area to be surveyed. The overall route is shown in FIG. 2: first the "well"-shaped route covers the whole area to be surveyed (its schematic is shown in FIG. 3); if a region unsuitable for flight is met, the route simply bypasses it; then the surrounding route (its schematic is shown in FIG. 4; the surrounding track is an arc centered on a waypoint of the "well"-shaped route with a certain distance as radius) performs supplementary shooting of that region. Although some accuracy is sacrificed, this route planning method can in fact meet basic modeling requirements, and it is highly operable, especially in time-critical scenarios.
Step 2: measure the image frames of the comprehensive environment video data with the Bhattacharyya distance, by the method of calculating image-frame similarity, so as to extract the key frame set.
This step identifies and extracts the key frames in the video stream information. The comprehensive environment video data acquired by the method inevitably contain excessive redundant information; performing the three-dimensional modeling operation directly on them would put excessive pressure on the processor, hence a key frame acquisition flow that eliminates the redundant data and keeps only the data supporting modeling.
This step is implemented with a key frame extraction algorithm based on video sampling and inter-image similarity. Two factors are considered together. First, the video stream is collected by a 30 FPS to 60 FPS camera, and comparing frames one by one would make the algorithm too inefficient, wasting too much time in the extraction step, so sampling is used to compress the computation in advance; experiments verify that, with reasonably chosen parameters, sampling does not affect the final result. Second, key frame selection must satisfy the overlap requirement of three-dimensional modeling, but computing image overlap directly is expensive, so this step approximates it by roughly estimating the overlap through image similarity calculation.
Experiments show that measuring two images with the Bhattacharyya distance fits the overlap calculation requirement well; this measure mainly reflects the similarity of the images' pixel-value distributions and can suppress the influence of small noise and optical flow on the images. It is computed as:

D_B(p, q) = −ln( BC(p, q) )    (8)

where the Bhattacharyya coefficient BC(p, q) is computed as:

BC(p, q) = Σ_x √( p(x) · q(x) )    (9)

where D_B(p, q) is the Bhattacharyya distance of the two image frames; BC(p, q) is the Bhattacharyya coefficient; p and q represent the two image frames; and p(x) and q(x) represent the gray values of the two image frames at x.

As formula (9) shows, a larger Bhattacharyya coefficient means greater overlap between the two samples; when the Bhattacharyya distance between an image frame and the previous key frame reaches a certain threshold (equivalently, when the Bhattacharyya coefficient of the two image frames does not exceed a preset threshold), that frame is determined to be the next key frame. The whole extraction process is as follows:
(1) Every 10 frames, one image frame is extracted as a candidate key frame.

(2) The Bhattacharyya distance between the extracted candidate key frame and the previous key frame is calculated by formula (8), where the first key frame is initialized with the first acquired image frame.

(3) A Bhattacharyya coefficient threshold θ related to the Bhattacharyya distance is set; if BC(p, q) ≤ θ, a new key frame is determined.

(4) Steps (1) to (3) are repeated until the whole video stream has been traversed.
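As an illustrative sketch only (not the patent's code), the Python/OpenCV loop below follows this process; the 10-frame sampling interval matches step (1), while the histogram size and threshold are assumed values. Note that OpenCV's cv2.HISTCMP_BHATTACHARYYA returns the Hellinger distance √(1 − BC) rather than −ln BC, so the threshold must be interpreted on that scale: a large distance means a small coefficient, i.e. low overlap.

import cv2

def extract_keyframes(video_path, step=10, dist_thresh=0.35):
    """Sample every `step` frames; keep a frame as a new key frame when its
    histogram distance to the last key frame exceeds `dist_thresh`."""
    cap = cv2.VideoCapture(video_path)
    keyframes, last_hist = [], None
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            hist = cv2.calcHist([gray], [0], None, [64], [0, 256])
            cv2.normalize(hist, hist, 1.0, 0.0, cv2.NORM_L1)
            if last_hist is None:
                keyframes.append(frame)   # first frame initializes the key frame set
                last_hist = hist
            else:
                # Hellinger distance sqrt(1 - BC); grows as the overlap shrinks
                d = cv2.compareHist(last_hist, hist, cv2.HISTCMP_BHATTACHARYYA)
                if d > dist_thresh:       # low overlap -> judged a new key frame
                    keyframes.append(frame)
                    last_hist = hist
        idx += 1
    cap.release()
    return keyframes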
Step 3: extract several groups of adjacent key frame combinations from the key frame set, compute their degree of overlap with the improved SIFT algorithm, and re-extract and supplement frames for frame combinations that do not meet the accuracy requirement, to obtain the screened and supplemented key frame set.
This step screens and supplements the key frames. The key frame set extracted by the image similarity method inevitably contains accuracy errors, so the set must be checked and evaluated once. To improve accuracy, this step extracts several groups of adjacent key frame combinations from the key frame set and then computes their degree of overlap with the SIFT algorithm; frame combinations that do not meet the requirement are re-extracted and re-supplemented.
The key to this step is applying and improving the traditional SIFT algorithm. The traditional SIFT algorithm has high complexity, long image processing time and low overall efficiency, so it needs a certain degree of simplification; the RANSAC algorithm can then compensate for the precision lost by the simplification.
The traditional SIFT algorithm mainly comprises the following five steps: first, building the scale space, generating a difference-of-Gaussian scale space by convolving the image with difference-of-Gaussian kernels of different scales; second, detecting extrema in the scale space, precisely localizing the extremum points, and removing key points with low contrast and unstable edge response points; third, assigning key point orientations, determining each key point's orientation from the gradient orientation distribution of its neighborhood pixels so the key points are rotation invariant; fourth, generating feature point descriptors, describing each key point with 4 × 4 = 16 seed points to finally form a 128-dimensional feature vector; fifth, feature matching, computing Euclidean distances from the feature vectors generated in the fourth step and finding matching feature points by comparison. Experimental analysis shows that generating the feature point descriptors in the fourth step and matching the features in the fifth step take the longest, so these two key links need to be optimized.
First, when determining a key point's orientation, any direction holding 80% of the main peak's energy is taken as the key point's auxiliary orientation; but since the feature point coordinates of the auxiliary orientation and the main orientation are the same, this inflates the number of matches, so the auxiliary orientation should be abandoned. Second, in the feature vector generation process, the key point neighborhood histogram is generated twice (once for orientation assignment and once for feature vector generation), which can be optimized.
To solve these two problems: first, when determining the orientation, the Gaussian weighting parameter is set to 1.5 times the key point scale σ; then the histogram of n orientations is computed to obtain the histogram statistics h_1, h_2, …, h_n, and the largest of them, h_max, is taken as the orientation of the key point, the selection of an auxiliary orientation being ignored; the histogram sequence is recorded and applied to the feature vector generation calculation. With this optimization, fewer key points are generated, but the retained key points can still be matched and an accurate overlap result for the images can be obtained.
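Purely as an illustration of this modified orientation assignment, the numpy sketch below computes a single dominant orientation from a Gaussian-weighted gradient histogram. The 36-bin histogram, the patch handling and the helper names are assumptions of this sketch; a real implementation would sit inside a complete SIFT pipeline.

import numpy as np

def dominant_orientation(mag, ang, sigma, n_bins=36):
    """Assign a single key point orientation from a gradient patch.

    mag, ang: gradient magnitude and angle (radians) in a square patch
              centred on the key point; sigma: key point scale.
    Returns (orientation, histogram); no auxiliary orientation is kept."""
    h, w = mag.shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    # Gaussian weighting with parameter 1.5 * sigma (the modified setting)
    g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * (1.5 * sigma) ** 2))
    bins = ((ang % (2 * np.pi)) / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.bincount(bins.ravel(), weights=(mag * g).ravel(), minlength=n_bins)
    # Take only the largest bin; the 80%-of-peak auxiliary orientation is dropped
    k = int(np.argmax(hist))
    return 2 * np.pi * (k + 0.5) / n_bins, hist  # histogram reused downstream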
Step 4: embed geographic information data into the screened and supplemented key frame set and preprocess it, so as to construct the comprehensive environment three-dimensional model of the area to be modeled.
For a three-dimensional model, carrying geographic position information is a crucial feature, but the key frames extracted by the method lack the related data in their exif metadata, so the synthesized three-dimensional model would not support later function expansion.
To solve this problem, a program can extract the geographic information data stored in the ROM of the unmanned aerial vehicle, match the extracted key frames one-to-one with the geographic information data by comparing the key frames' timestamps, and add the data to the exif metadata of each frame using the API provided by exiv2, which improves the model's secondary development potential.
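The patent names the exiv2 API; as a hedged Python illustration, the sketch below uses the piexif library instead (a substitution, not the tool named in the patent), and the DMS conversion and file names are assumptions of this sketch.

import piexif

def to_dms(deg):
    """Decimal degrees -> EXIF rational ((d,1),(m,1),(s*100,100))."""
    d = int(deg)
    m = int((deg - d) * 60)
    s = round(((deg - d) * 60 - m) * 60 * 100)
    return ((d, 1), (m, 1), (s, 100))

def embed_gps(jpeg_path, lat, lon, alt_m):
    """Write latitude/longitude/altitude into the key frame's EXIF GPS IFD."""
    exif = piexif.load(jpeg_path)
    exif["GPS"] = {
        piexif.GPSIFD.GPSLatitudeRef: b"N" if lat >= 0 else b"S",
        piexif.GPSIFD.GPSLatitude: to_dms(abs(lat)),
        piexif.GPSIFD.GPSLongitudeRef: b"E" if lon >= 0 else b"W",
        piexif.GPSIFD.GPSLongitude: to_dms(abs(lon)),
        piexif.GPSIFD.GPSAltitude: (int(alt_m * 100), 100),
    }
    piexif.insert(piexif.dump(exif), jpeg_path)

# e.g. embed_gps("keyframe_0001.jpg", 28.2282, 112.9388, 120.0)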
During aerial photography the unmanned aerial vehicle is affected by factors such as sun angle and environmental cover, so the acquired images differ in gray scale and brightness, which adversely affects image feature detection. In this embodiment, an image enhancement algorithm based on histogram equalization preprocesses the screened key frames; the algorithm's steps are as follows:
(1) Change the color space of the key frames in the screened and supplemented key frame set, converting the RGB space into the HSV space, and perform histogram equalization on the value (gray) component V.
(2) Calculate the probability of each gray level by the formula:

p(r_k) = n_k / n,  k = 0, 1, …, L−1    (10)

where n represents the total number of pixels, r_k represents the k-th gray level, and n_k represents the number of pixels at the k-th gray level.
(3) Transform all pixels by the transformation formula:

s_k = T(r_k) = Σ_{j=0..k} p(r_j)    (11)

where s_k represents the transformed pixel value.
After preprocessing by this method, the contrast and definition of the image frames are enhanced; experiments prove that the number of matched key points increases significantly, which helps improve the precision and efficiency of subsequent modeling.
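As a minimal illustration, the OpenCV sketch below performs this V-channel equalization; cv2.equalizeHist internally applies the cumulative transform of formulas (10) and (11), and the file names in the usage line are placeholders.

import cv2

def enhance_keyframe(img_bgr):
    """Equalize the V (value/gray) channel in HSV space, per formulas (10)-(11)."""
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv)
    v_eq = cv2.equalizeHist(v)  # histogram equalization of the V component
    return cv2.cvtColor(cv2.merge((h, s, v_eq)), cv2.COLOR_HSV2BGR)

# e.g. cv2.imwrite("frame_eq.jpg", enhance_keyframe(cv2.imread("frame.jpg")))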
Once the data are sufficiently prepared, the model can be constructed with mature tools such as Photoscan and ContextCapture; the OSGB format is generally chosen for the generated three-dimensional model, so that it adapts to various display platforms and favors extensible development.
Through the above four steps, the scheme achieves an effect similar to traditional oblique photography modeling while greatly reducing the amount of data processed; it meets the timeliness requirements of various operating environments while keeping the model intuitive, three-dimensional and vivid, and it also offers some inspiration and reference for traditional oblique photography.
The technical route of the method is a brand-new research direction arising from the intersection of surveying and mapping with disciplines such as unmanned aerial vehicles and image processing; its application goal is to provide comprehensive environment information services for terrain-based situation assessment and operation planning. In an emergency, it can serve as a situation display platform for a specific area; in normal times, it can provide information support for relevant business departments to view the terrain in all directions and to think, discuss and deduce. It vividly and stereoscopically presents the real geographic environment to the personnel concerned, and provides functions such as plotting and dynamic demonstration, thereby accelerating the decision-making process.
In conclusion, the invention optimizes the data acquisition strategy on the basis of the traditional oblique photography workflow, replacing photo shooting with video recording. This improvement lowers the demands the whole workflow places on route planning, reduces the unmanned aerial vehicle's dead time, suits the many grassroots departments and individuals whose unmanned aerial vehicles are small with limited battery capacity, and, by acquiring information as video, greatly reduces the technical requirements on operators; the whole solution is widely adaptable and practical.
The invention can quickly and efficiently identify and extract the key frames in a video stream. A full comprehensive environment video contains excessive redundant data, so the designed key frame identification algorithm is the key to overall efficiency: it removes most redundant information while meeting the modeling overlap requirement, providing data support for later modeling. From the user's perspective, this scheme makes full use of the simple video function of a micro unmanned aerial vehicle (good portability, takeoff weight about 1.5 kg) and extracts key frames from the video it collects to realize three-dimensional modeling of the comprehensive environment, aiming to overcome the shortcomings of directly using drone-collected pictures or video to view terrain in all directions and assess the situation; it strengthens the operators' interactive experience with the terrain environment and improves decision-making efficiency.
The invention checks the extracted key frame set: the set is checked by sampling an appropriate range, accurate overlap calculation is performed on the extracted samples, and if an image frame that fails the accuracy requirement is found, the missing part is supplemented.
The invention embeds geographic information data into the screened key frame set. Compared with traditional information carriers, the constructed three-dimensional model has the distinct advantages of carrying accurate geographic position information and supporting calculation work directly on the model. However, the directly extracted image frames lack geographic information, so a dedicated auxiliary program is needed to embed the calibrated geographic information data to meet various application requirements.
The above description is merely an exemplary embodiment of the present disclosure, and the scope of the present disclosure is not limited thereto. That is, all equivalent changes and modifications made in accordance with the teachings of the present disclosure are intended to be included within the scope of the present disclosure. Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
All possible combinations of the technical features in the above embodiments may not be described for the sake of brevity, but should be considered as being within the scope of the present disclosure as long as there is no contradiction between the combinations of the technical features.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (9)

1. A comprehensive environment three-dimensional modeling method for extracting key frames based on videos collected by an unmanned aerial vehicle, characterized by comprising the following steps:
acquiring an optimal route covering the area to be modeled based on the optimization result of a comprehensive preference function considering one or more constraints among route length cost, dead time cost, special operation risk zone cost and no-fly zone cost, and flying the unmanned aerial vehicle along the optimal route to acquire comprehensive environment video data;
measuring the image frames of the comprehensive environment video data with the Bhattacharyya distance, by the method of calculating image-frame similarity, so as to extract a key frame set;
extracting several groups of adjacent key frame combinations from the key frame set, computing their degree of overlap with an improved SIFT algorithm, and re-extracting and supplementing frames for frame combinations that do not meet the accuracy requirement, to obtain a screened and supplemented key frame set;
embedding geographic information data into the screened and supplemented key frame set and preprocessing it, so as to construct a comprehensive environment three-dimensional model of the area to be modeled;
wherein the improved SIFT algorithm comprises:

when determining the orientation, setting the Gaussian weighting parameter to a preset multiple of the key point scale, then computing the histogram of n orientations to obtain the histogram statistics h_1, h_2, …, h_n; taking the largest of them, h_max, as the orientation of the key point, ignoring the selection of an auxiliary orientation, and recording the histogram sequence so it can be reused in the feature vector generation calculation.
2. The modeling method of claim 1, wherein the comprehensive preference function has the expression:

F = Σ_{i=1..n} Σ_{j=1..m} λ_j · c_{ij}

where F represents the comprehensive preference value of the route planning; m represents the number of constraints; n represents the number of flight legs of the unmanned aerial vehicle; c_{ij} represents the cost of the j-th constraint during flight of the i-th leg; and λ_j represents the preference coefficient, i.e. the preference given to the j-th constraint in a particular route planning.
3. A modeling method according to claim 2, in which the route length cost is expressed as:

c_{i1} = l_i / L

where c_{i1} represents the route length cost during flight of the i-th leg; l_i represents the path length of the i-th leg; and L represents the straight-line distance of the entire route, covering all legs, in the most ideal case.
4. A modeling method according to claim 2, wherein the dead time cost is expressed as:

c_{i2} = t_i / T

where c_{i2} represents the dead time cost during flight of the i-th leg; t_i represents the time taken to fly the i-th leg; and T represents the time needed by the unmanned aerial vehicle to fly the straight-line route at constant speed in the most ideal case.
5. The modeling method of claim 2, wherein the special operation risk zone cost is expressed as:

c_{i3} = (R_3 / r_{i3})² · (l_i / L)  if r_{i3} < R_3;  c_{i3} = 0  otherwise

where c_{i3} represents the special operation risk zone cost during flight of the i-th leg; R_3 represents the threat radius of the risk zone; r_{i3} represents the shortest distance between the path of the i-th leg and the center of the risk zone; l_i represents the path length of the i-th leg; and L represents the straight-line distance of the entire route, covering all legs, in the most ideal case.
6. A modeling method in accordance with claim 2, wherein the no-fly zone cost is expressed as:

c_{i4} = ∞  if r_{i4} ≤ R_4;  c_{i4} = 0  if r_{i4} > R_4

where c_{i4} represents the no-fly zone cost during flight of the i-th leg; r_{i4} represents the shortest distance between the path of the i-th leg and the control point of the no-fly zone; and R_4 represents the no-fly radius of the no-fly zone.
7. The modeling method of claim 1, wherein measuring the image frames of the comprehensive environment video data with the Bhattacharyya distance, by the method of calculating image-frame similarity, to extract the key frame set specifically comprises:

extracting one image frame as a candidate key frame every preset number of image frames;

calculating, by the formula

D_B(p, q) = −ln( BC(p, q) ),  BC(p, q) = Σ_x √( p(x) · q(x) )

the Bhattacharyya distance between the extracted candidate key frame and the previous key frame, where the first key frame is initialized with the first acquired image frame; p and q represent the two image frames; p(x) and q(x) represent the gray values of the two image frames at x; D_B(p, q) represents the Bhattacharyya distance of the two image frames; and BC(p, q) is the Bhattacharyya coefficient;

judging whether the Bhattacharyya coefficient BC(p, q) exceeds a preset Bhattacharyya coefficient threshold; if it does not, the candidate key frame is judged to be a new key frame;

continuing to extract one image frame as the next candidate key frame every preset number of image frames, judging whether it is a new key frame by the above Bhattacharyya coefficient threshold judgment, and repeating in sequence until all image frames in the comprehensive environment video data have been traversed and judged for one round, to obtain the key frame set.
8. The modeling method of claim 1, wherein embedding geographic information data into the screened and supplemented key frame set specifically comprises:

a program extracts the geographic information data stored in the ROM of the unmanned aerial vehicle; the key frames in the screened and supplemented key frame set are then matched one-to-one with the geographic information data by comparing the key frames' timestamps, and the geographic information data are added to the exif metadata of each key frame using the API provided by exiv2.
9. The modeling method of claim 1, wherein the preprocessing specifically comprises:

changing the color space of the key frames in the screened and supplemented key frame set, converting the RGB space into the HSV space, and performing histogram equalization on the value (gray) component V;

calculating the probability of each gray level by the formula

p(r_k) = n_k / n,  k = 0, 1, …, L−1

where n represents the total number of pixels, r_k represents the k-th gray level, and n_k represents the number of pixels at the k-th gray level;

transforming all pixels by the transformation formula

s_k = T(r_k) = Σ_{j=0..k} p(r_j)

where s_k represents the transformed pixel value.
CN202211631541.5A 2022-12-19 2022-12-19 Comprehensive environment three-dimensional modeling method for extracting key frames based on videos acquired by unmanned aerial vehicle Active CN115619959B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211631541.5A CN115619959B (en) 2022-12-19 2022-12-19 Comprehensive environment three-dimensional modeling method for extracting key frames based on videos acquired by unmanned aerial vehicle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211631541.5A CN115619959B (en) 2022-12-19 2022-12-19 Comprehensive environment three-dimensional modeling method for extracting key frames based on videos acquired by unmanned aerial vehicle

Publications (2)

Publication Number Publication Date
CN115619959A CN115619959A (en) 2023-01-17
CN115619959B true CN115619959B (en) 2023-04-07

Family

ID=84880255

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211631541.5A Active CN115619959B (en) 2022-12-19 2022-12-19 Comprehensive environment three-dimensional modeling method for extracting key frames based on videos acquired by unmanned aerial vehicle

Country Status (1)

Country Link
CN (1) CN115619959B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101719144B (en) * 2009-11-04 2013-04-24 中国科学院声学研究所 Method for segmenting and indexing scenes by combining captions and video image information
CN104463962B (en) * 2014-12-09 2017-02-22 合肥工业大学 Three-dimensional scene reconstruction method based on GPS information video
CN104899861B (en) * 2015-04-01 2017-10-27 华北电力大学(保定) The automatic searching method of key frame in a kind of intravascular ultrasound video
ES2848851T3 (en) * 2018-04-30 2021-08-12 Tata Consultancy Services Ltd Procedure and system for the construction of images based on the stitching of frames in an indoor environment

Also Published As

Publication number Publication date
CN115619959A (en) 2023-01-17

Similar Documents

Publication Publication Date Title
Wegner et al. Cataloging public objects using aerial and street-level images-urban trees
CN110956651B (en) Terrain semantic perception method based on fusion of vision and vibrotactile sense
Lin et al. Use of UAV oblique imaging for the detection of individual trees in residential environments
CN111540048B (en) Fine live-action three-dimensional modeling method based on space-ground fusion
CN109520500B (en) Accurate positioning and street view library acquisition method based on terminal shooting image matching
CN109883401B (en) Method and system for measuring visual field of city mountain watching
CN106373088B (en) The quick joining method of low Duplication aerial image is tilted greatly
KR102200299B1 (en) A system implementing management solution of road facility based on 3D-VR multi-sensor system and a method thereof
Barazzetti et al. True-orthophoto generation from UAV images: Implementation of a combined photogrammetric and computer vision approach
WO2004095374A1 (en) Video object recognition device and recognition method, video annotation giving device and giving method, and program
CN115439424A (en) Intelligent detection method for aerial video image of unmanned aerial vehicle
CN108320304A (en) A kind of automatic edit methods and system of unmanned plane video media
Poterek et al. Deep learning for automatic colorization of legacy grayscale aerial photographs
CN112991487A (en) System for multithreading real-time construction of orthoimage semantic map
CN113340312A (en) AR indoor live-action navigation method and system
CN116030194A (en) Air-ground collaborative live-action three-dimensional modeling optimization method based on target detection avoidance
Shin et al. True orthoimage generation using airborne lidar data with generative adversarial network-based deep learning model
CN117315146B (en) Reconstruction method and storage method of three-dimensional model based on trans-scale multi-source data
CN109883400A (en) Fixed station Automatic Targets and space-location method based on YOLO-SITCOL
CN115619959B (en) Comprehensive environment three-dimensional modeling method for extracting key frames based on videos acquired by unmanned aerial vehicle
Maurer et al. Automated inspection of power line corridors to measure vegetation undercut using UAV-based images
CN115937673B (en) Geographic element rapid change discovery method based on mobile terminal photo
CN116229001A (en) Urban three-dimensional digital map generation method and system based on spatial entropy
CN113362265B (en) Low-cost rapid geographical splicing method for orthographic images of unmanned aerial vehicle
KR102587445B1 (en) 3d mapping method with time series information using drone

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant