CN115330594A - Target rapid identification and calibration method based on unmanned aerial vehicle oblique photography 3D model
- Publication number: CN115330594A (application CN202210850871.7A)
- Authority: CN (China)
- Prior art keywords: picture, splicing, image, unmanned aerial vehicle
- Prior art date: 2022-07-19
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T3/4038 — Image mosaicing, e.g. composing plane images from plane sub-images
- G01C11/02 — Picture taking arrangements specially adapted for photogrammetry or photographic surveying, e.g. controlling overlapping of pictures
- G01C11/04 — Interpretation of pictures
- G06N3/02 — Neural networks
- G06N3/08 — Learning methods
- G06T17/00 — Three dimensional [3D] modelling, e.g. data description of 3D objects
- G06T3/08 — Projecting images onto non-planar surfaces, e.g. geodetic screens
- G06T3/14 — Transformations for image registration, e.g. adjusting or mapping for alignment of images
- G06T5/50 — Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06T7/70 — Determining position or orientation of objects or cameras
- G06T7/80 — Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
- G06V10/25 — Determination of region of interest [ROI] or a volume of interest [VOI]
- G06V10/26 — Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
- G06V10/462 — Salient features, e.g. scale invariant feature transforms [SIFT]
- G06V10/82 — Image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G06V20/17 — Terrestrial scenes taken from planes or by drones
- G06T2207/10004 — Still image; photographic image
- G06T2207/10012 — Stereo images
- G06T2207/20081 — Training; learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/20092 — Interactive image processing based on input by user
- G06T2207/20104 — Interactive definition of region of interest [ROI]
- G06V2201/07 — Target detection
Abstract
The invention relates to the field of information technology and discloses a target rapid identification and calibration method based on an unmanned aerial vehicle (UAV) oblique photography 3D model, comprising the following steps: step one, 2D picture preprocessing; step two, 2D picture stitching; step three, artificial intelligence recognition; step four, 2D/3D coordinate conversion; step five, 3D visual identification. Performing image recognition directly on the raw photographs suffers from repeated recognition and low recognition speed. Instead, the method processes the high-precision optical-center coordinates carried by each picture to obtain the picture track, takes the 4 nearest adjacent pictures in the track as the data source for picture stitching (preparing the stitching data for image stitching), and uses the three-dimensional coordinate information directly to choose the stitching order and stitching position, avoiding one-by-one feature point acquisition and matching during image stitching.
Description
Technical Field
The invention relates to the field of information technology, and in particular to a target rapid identification and calibration method based on an unmanned aerial vehicle oblique photography 3D model.
Background
With the maturing of unmanned aerial vehicle (UAV) and artificial intelligence technologies, the visual information that can be extracted from UAV photographs is increasingly rich. In the traditional workflow, a large number of two-dimensional photos shot by a UAV are turned into a geographic-information three-dimensional model with high-precision three-dimensional coordinate information by aerial-triangulation photogrammetric modeling, enabling measurement of points, lines, surfaces and volumes on the three-dimensional model. UAV photography can thus generate 3D scene graphs of large-scale spaces. One important class of applications requires identifying targets in the scene graph and calibrating their position and geometric information.
Deep learning can intelligently recognize objects such as vehicles, houses, trees and people in the 2D pictures shot by a UAV. Although this technology is mature, a single-frame 2D picture carries only local information of the scene graph, and the target position and geometric information in it cannot be calibrated directly, so the visualization effect is poor and industrial requirements are not met.
Identifying and calibrating scene-graph targets directly in 3D model data is also difficult. A 3D model is usually described by 3D point cloud data; although a 3D point cloud model covers the global scene graph, model reconstruction during modeling loses original image information to varying degrees, which degrades recognition. Moreover, the data volume to be processed is large and consumes substantial computing power, so the time efficiency cannot meet industrial application requirements.
A hybrid mechanism is therefore needed: use the accurate recognition capability available on 2D pictures, and fuse the coordinate and geometric dimension information onto the 3D model, to achieve rapid identification and calibration of targets on the 3D global scene graph.
Disclosure of Invention
Technical problem to be solved
Addressing the shortcomings of the prior art, the invention provides a target rapid identification and calibration method based on a UAV oblique photography 3D model. It solves the problems of rapid target identification, merging, stitching and statistics over a large number of heavily overlapping 2D pictures; it performs 2D/3D coordinate conversion rapidly, quickly calibrates the position and geometric information of targets identified in 2D pictures, and displays them visually in the 3D model.
(II) technical scheme
To achieve this purpose, the invention provides the following technical scheme: a target rapid identification and calibration method based on a UAV oblique photography 3D model, comprising the following steps:
step one: 2D picture preprocessing;
step two: 2D picture stitching;
step three: artificial intelligence recognition;
step four: 2D/3D coordinate conversion;
step five: 3D visual identification.
Preferably, the three-dimensional coordinate information carried by the 2D pictures shot by the UAV is combined with the overlap degree to stitch the 2D pictures rapidly, reducing the amount of picture data to be computed; an artificial intelligence recognition method then recognizes the picture contents of interest; the coordinate information of the 2D recognition result is mapped through a coordinate conversion system to obtain 3D coordinate values; and the recognized object is marked on the 3D model.
Preferably, the specific operation flow of step one, 2D picture preprocessing, is as follows. When the UAV shoots fixed-focus pictures for photogrammetry, the flight area is set in advance, the flight route is planned, and the flight overlap degree is set; an overlap above roughly 70% generally yields good three-dimensional modeling results. Meanwhile, high-precision three-dimensional coordinates at the moment of exposure are collected using an RTK or PPK high-precision BeiDou/GPS positioning system, the coordinate system generally being WGS84. Performing image recognition directly on the raw photographs suffers from repeated recognition and low recognition speed, which the following steps avoid.
Preferably, the specific operation flow of step two, 2D picture stitching, is as follows. The 4 stitching pictures obtained by calculation from the high-precision three-dimensional coordinates are stitched and synthesized; the processing pipeline is: input the pictures to be merged, calculate the overlap area, extract feature points, register the 2D images, apply projection transformation, compute the stitching, fuse the images, and generate the output image.
Preferably, the specific operation flow of step three, artificial intelligence recognition, is as follows. A deep learning algorithm recognizes the stitched 2D picture; the recognized content mainly includes trees, farmland, utility poles, ponds, cars, houses/foundations, people, green belts, sidewalks, etc.;
the deep learning data set adopts the existing public data sets such as: ADE20K, urban Drone Dataset (UDD), stanford Drone Dataset and the like are subjected to pre-training and coarse adjustment, and then the picture shot by an unmanned aerial vehicle is used for fine adjustment after enhancement;
the image processing and acquisition panorama segmentation algorithm is an open-source SegFormer segmentation algorithm realized in a transform and multi-layer perceptron MLP decoding mode.
Preferably, the specific operation flow of step four, 2D/3D coordinate conversion, is as follows. A rapid conversion model is established between the pixel coordinates produced by artificial-intelligence image recognition and the three-dimensional coordinates of the physical world; the conversion chain is: pixel coordinates → image coordinates → camera coordinates → WGS84 coordinates.
(III) advantageous effects
Compared with the prior art, the invention provides a target rapid identification and calibration method based on an unmanned aerial vehicle oblique photography 3D model, which has the following beneficial effects:
1. according to the target rapid identification and calibration method based on the unmanned aerial vehicle oblique photography 3D model, 1, image splicing and identification calculation data volume are reduced, calculation identification speed is improved, and 2, position calibration and rapid visual display with good effect from 2D identification to 3D display are achieved.
2. Performing image recognition directly on the raw photographs suffers from repeated recognition and low recognition speed. The method instead processes the high-precision optical-center coordinates carried by each picture to obtain the picture track, takes the 4 nearest adjacent pictures in the track as the data source for picture stitching (preparing the stitching data for image stitching), and uses the three-dimensional coordinate information directly to choose the stitching order and stitching position, avoiding one-by-one feature point acquisition and matching during image stitching.
3. Image feature points are matched through the feature descriptors of the pictures, finding the same feature points extracted from different pictures. Feature points are extracted with the SURF/ORB modules integrated in OpenCV, or the SIFT algorithm can extract 128-dimensional descriptor vectors as feature points for stitching and merging; whether two picture feature points match is judged by the Euclidean distance between descriptors, making the matching more accurate.
4. A RANSAC algorithm solves the homography matrix for image registration. If image stitching accumulates errors, bundle adjustment can be used for joint optimization, refining the parameters of multiple cameras and yielding more accurate image positions.
Drawings
FIG. 1 is a flow chart of a method of identification and calibration of the present invention;
FIG. 2 is a process diagram of the 2D picture stitching process of the present invention;
FIG. 3 is a network architecture diagram of the SE-YOLOv5 algorithm of the present invention;
FIG. 4 is a diagram of the 2D/3D coordinate transformation process of the present invention;
FIG. 5 is a data chart of an embodiment of the present invention;
FIG. 6 is a graph of discrete image point values versus their position in an image in accordance with the present invention;
FIG. 7 is a graph of the coordinate relationship of a camera according to the present invention;
fig. 8 is a rigid transformation diagram of the coordinates from camera coordinates to WGS84 coordinates according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Referring to FIGS. 1-4, a target rapid identification and calibration method based on an unmanned aerial vehicle oblique photography 3D model comprises the following steps. Step one: 2D picture preprocessing. When the UAV shoots fixed-focus pictures for photogrammetry, the flight area is set in advance, the flight route is planned, and the flight overlap degree is set; an overlap above roughly 70% generally yields good three-dimensional modeling results. Meanwhile, high-precision three-dimensional coordinates at the moment of exposure are collected using an RTK (real-time kinematic) or PPK (post-processed kinematic) high-precision BeiDou/GPS positioning system, the coordinate system generally being WGS84. Because image recognition performed directly on the raw photographs suffers from repeated recognition and low recognition speed, the high-precision optical-center coordinates carried by the pictures are processed into a picture track, and the 4 nearest adjacent pictures in the track serve as the data source for picture stitching, as sketched below.
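The patent does not spell out the neighbour selection in code; a minimal Python sketch of the idea, with the photo-record layout ('lat', 'lon', 'alt' keys) assumed purely for illustration, could look like this:

```python
import math

def nearest_four(photos, idx):
    """Pick the 4 pictures whose optical-center coordinates lie closest
    to photo `idx`, using the high-precision track carried by the photos.

    `photos` is a list of dicts with keys 'lat', 'lon', 'alt' (WGS84);
    this layout is an assumption for the sketch, not from the patent.
    """
    ref = photos[idx]

    def ground_dist(p):
        # Small-area approximation: treat lat/lon differences as planar
        # offsets in metres (sufficient for ranking neighbours).
        dlat = (p['lat'] - ref['lat']) * 111_320.0
        dlon = (p['lon'] - ref['lon']) * 111_320.0 * math.cos(math.radians(ref['lat']))
        return math.hypot(dlat, dlon)

    others = [i for i in range(len(photos)) if i != idx]
    others.sort(key=lambda i: ground_dist(photos[i]))
    return others[:4]  # indices of the 4 nearest pictures in the track
```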
Step two: 2D picture stitching. The 4 stitching pictures obtained by calculation from the high-precision three-dimensional coordinates are stitched and synthesized; the processing flow is shown in FIG. 2. Using the image overlap set in the UAV flight plan, the neighbours of the picture being synthesized are computed from the shooting times and the high-precision three-dimensional shooting positions, and the stitching area is determined. The computing resources consumed by subsequent synthesis must be balanced against stitching precision; moreover, wind speed, air pressure and similar factors cause camera shake in flight, so the overlap is not 100% accurate. Therefore, when the stitching area is computed from the overlap degree, part of the overlap area is kept as margin, and the stitch cannot be cropped strictly at the overlap set in the flight plan.
Image feature point extraction matches pictures through their feature descriptors, finding the same feature points extracted from different pictures. Feature points are extracted with the SURF/ORB modules integrated in OpenCV, or the SIFT algorithm can extract 128-dimensional descriptor vectors as feature points for stitching and merging; whether two picture feature points match is judged by the Euclidean distance between descriptors. A minimal sketch follows.
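As an illustrative sketch only (the patent does not fix the exact calls), SIFT extraction and Euclidean-distance matching with OpenCV could look like this, assuming a recent opencv-python build where `cv2.SIFT_create` is available:

```python
import cv2

def match_features(img1, img2):
    """Extract SIFT keypoints (128-dim descriptors) and match them by
    Euclidean distance, as described above; SURF/ORB could be swapped in."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    # Brute-force matcher with the L2 (Euclidean) norm.
    bf = cv2.BFMatcher(cv2.NORM_L2)
    candidates = bf.knnMatch(des1, des2, k=2)   # two nearest neighbours each

    # Lowe's ratio test keeps only unambiguous matches (0.75 is the
    # conventional threshold, not a value from the patent).
    good = [m for m, n in candidates if m.distance < 0.75 * n.distance]
    return kp1, kp2, good
```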
2D image registration derives the relative position of the matched images from the obtained matching pairs. Because the flight height cannot be kept absolutely constant, the relative position transformation between the two stitched images must be computed; at least 4 matching points are needed, no three of which are collinear. A RANSAC algorithm solves the homography matrix for image registration (see the sketch below). If image stitching accumulates errors, bundle adjustment can be used for joint optimization, refining the parameters of multiple cameras and yielding more accurate image positions.
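A sketch of the RANSAC homography step with OpenCV (bundle adjustment omitted; the 3-pixel reprojection threshold is a conventional choice, not a value from the patent):

```python
import numpy as np
import cv2

def estimate_homography(kp1, kp2, matches):
    """Solve the registration homography with RANSAC from >= 4 matched
    pairs (no three collinear), as described above."""
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H, inliers  # 3x3 homography and the RANSAC inlier mask
```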
For image projection, to keep the stitched images consistent, the mapping transformation algorithms in OpenCV produce planar, cylindrical, spherical, fisheye and cube projections, with the mapping scale set to the camera focal length.
For stitching computation, once the seam position of two adjacent images is obtained, a fusion algorithm is applied to the pixels near the seam, while pixels in the overlap area far from the seam are taken from the image on their own side; this effectively removes misalignment and artifacts between the images.
Image fusion uses the feathering fusion algorithm in OpenCV, which weights each position near the seam according to its distance from the seam and fuses by weighted averaging, or the Laplacian fusion algorithm, which decomposes the images into components of different frequencies and fuses them per frequency band. A sketch of the feathering idea follows.
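A minimal numpy sketch of feathering (in practice OpenCV's own FeatherBlender or MultiBandBlender would be used; the distance-transform weighting here is an assumption about how the weights are built, not the patent's code):

```python
import numpy as np
import cv2

def feather_blend(img1, img2, mask1, mask2):
    """Weight each pixel by its distance from its image border (masks are
    uint8, 255 inside the warped image), then average the two images;
    pixels far from the seam are dominated by their own image."""
    w1 = cv2.distanceTransform(mask1, cv2.DIST_L2, 3).astype(np.float32)
    w2 = cv2.distanceTransform(mask2, cv2.DIST_L2, 3).astype(np.float32)
    total = w1 + w2
    total[total == 0] = 1.0                      # avoid division by zero
    w1 = (w1 / total)[..., None]                 # (H, W, 1) weight maps
    w2 = (w2 / total)[..., None]
    return (img1 * w1 + img2 * w2).astype(img1.dtype)
```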
Step three: artificial intelligence recognition
UAV aerial images have a wide dynamic range, complex backgrounds and complex ambient light, so traditional computer vision algorithms struggle to detect robustly. A deep learning algorithm therefore recognizes the stitched 2D picture; the recognized content mainly includes trees, farmland, utility poles, ponds, cars, houses/foundations, people, green belts, sidewalks, etc.
The deep learning model is pre-trained and coarsely tuned on existing public datasets such as ADE20K, Urban Drone Dataset (UDD) and Stanford Drone Dataset, and then fine-tuned on augmented pictures shot by the UAV.
The panoptic segmentation algorithm used for image processing is the open-source SegFormer algorithm, implemented with a Transformer encoder and a multilayer perceptron (MLP) decoder, as sketched below.
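For illustration, running an ADE20K-pretrained SegFormer checkpoint with the Hugging Face transformers library might look like the following; the checkpoint name and input file are assumptions, and the patent's own fine-tuned weights would take their place:

```python
import torch
from PIL import Image
from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation

# Public ADE20K-pretrained checkpoint; fine-tuning on augmented UAV
# imagery, as described above, would start from weights like these.
ckpt = "nvidia/segformer-b0-finetuned-ade-512-512"
processor = SegformerImageProcessor.from_pretrained(ckpt)
model = SegformerForSemanticSegmentation.from_pretrained(ckpt)

image = Image.open("uav_tile.jpg")              # hypothetical input tile
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits             # (1, num_labels, H/4, W/4)
pred = logits.argmax(dim=1)[0]                  # per-pixel class ids
```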
The image target recognition and detection algorithm takes YOLOv5 as the baseline and improves it into the SE-YOLOv5 target detection algorithm, whose network architecture is shown in FIG. 3.
SE-YOLOv5 is the target detection algorithm formed by introducing a Squeeze-and-Excitation (SE) channel attention mechanism into YOLOv5, improving precision over plain YOLOv5. The overall framework uses Darknet as the Backbone, a feature pyramid network (FPN) as the Neck, and convolutions as the Head, with the Focus module, cross-stage partial network (CSP), spatial pyramid pooling (SPP) and path aggregation network (PANet) inside to strengthen information flow between layers, channels and stages and to reduce the number of model parameters, finally achieving fast and robust detection and recognition. The SE mechanism is sketched below.
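A generic PyTorch sketch of the SE channel-attention block named above (the patent's exact SE-YOLOv5 integration points are not given; reduction ratio 16 is the common default, not a value from the patent):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel attention: global-average-pool to a
    channel descriptor, pass it through a bottleneck MLP, and rescale the
    feature map channel by channel."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # squeeze
        self.fc = nn.Sequential(                     # excitation
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                 # channel reweighting
```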
Step four: 2D/3D coordinate conversion. Objects on the earth have three-dimensional coordinates in a three-dimensional world coordinate system, commonly WGS84. At the moment of shooting, the UAV provides a high-precision three-dimensional coordinate as the optical-center position of the camera; the camera has its own camera coordinate system; light passing through the lens defines an image coordinate system; and the captured picture has a pixel coordinate system, which is what artificial-intelligence image recognition operates on. A rapid conversion model must therefore be established between the pixel coordinates and the three-dimensional coordinates of the physical world; the conversion process is shown in FIG. 4.
Pixel coordinates relate the discrete image point values of the digitized picture stored in the computer to their positions in the image; they are usually denoted by a u-v coordinate system whose origin is the upper-left corner of the image. Image coordinates are the position coordinates of the photographed object in the photo, denoted x-y; the relationship is shown in FIG. 6:
where (u0, v0) are the pixel coordinates of the image-coordinate origin (constants of the transformation) and dx, dy are the physical pixel sizes; correspondingly:

x = (u - u0)dx,  y = (v - v0)dy
The camera coordinates are three-dimensional coordinates with the camera optical center as the origin; the z-axis points in front of the camera, i.e. perpendicular to the imaging plane. The coordinate relationship is shown in FIG. 7:
f is the focal length of the camera, and the conversion relationship is:

x = f·Xc/Zc,  y = f·Yc/Zc
Inverting likewise gives:

Xc = x·Zc/f,  Yc = y·Zc/f
the conversion from camera coordinates to WGS84 coordinates is rigid transformation, and the coordinate system may be rotated or translated, and the coordinate relationship is shown in fig. 8:
where (X, Y, Z) denotes the WGS84 coordinate system. If the camera coordinate system is rotated only around the X-axis by an angle θ, then:

Xc = X,  Yc = Y·cosθ + Z·sinθ,  Zc = Z·cosθ - Y·sinθ
the same method can obtain a formula of rotating a certain angle around the Y axis and the Z axis, and the camera coordinate and the WGS84 coordinate conversion formula are obtained through synthesis.
After coordinate transformation of the 2D pixel coordinates of a recognized result, the 3D coordinate values are obtained directly, and the target is frame-selected and displayed on the 3D model, realizing high-precision recognition and high-quality visual display. A sketch of the whole chain follows.
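A numpy sketch chaining the formulas above from pixel coordinates to world coordinates; the function layout is illustrative, R and t denote the camera-to-world rotation and the optical-center translation, and Zc, the depth along the optical axis, must come from elsewhere (e.g. the 3D model):

```python
import numpy as np

def pixel_to_world(u, v, Zc, f, dx, dy, u0, v0, R, t):
    """pixel -> image -> camera -> world, per the formulas above."""
    x = (u - u0) * dx                 # pixel to image coordinates
    y = (v - v0) * dy
    Xc = x * Zc / f                   # image to camera coordinates
    Yc = y * Zc / f
    cam = np.array([Xc, Yc, Zc])
    # Rigid transform (rotation + translation) to the WGS84-aligned frame.
    return R @ cam + t
```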
Examples
Taking as an example 500 photos shot frame by frame during a UAV flight, processed with the same artificial intelligence recognition algorithm on an identically configured computer: the photos are 5472 × 3648 pixels, horizontal and vertical resolution 72 dpi, bit depth 24, resolution unit 2, color representation sRGB, camera aperture f/3.2, total data size 4.10 GB (4,412,289,847 bytes), with an overlap degree of 80%; the details are shown in FIG. 5.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (6)
1. A target rapid identification and calibration method based on an unmanned aerial vehicle oblique photography 3D model is characterized in that: the method comprises the following steps:
step one: 2D picture preprocessing;
step two: 2D picture stitching;
step three: artificial intelligence recognition;
step four: 2D/3D coordinate conversion;
step five: 3D visual identification.
2. The target rapid identification and calibration method based on an unmanned aerial vehicle oblique photography 3D model according to claim 1, characterized in that: the three-dimensional coordinate information carried by the 2D pictures shot by the UAV is combined with the overlap degree to stitch the 2D pictures rapidly, reducing the amount of picture data to be computed; an artificial intelligence recognition method then recognizes the picture contents of interest; the coordinate information of the 2D recognition result is mapped through a coordinate conversion system to obtain 3D coordinate values; and the recognized object is marked on the 3D model.
3. The target rapid identification and calibration method based on an unmanned aerial vehicle oblique photography 3D model according to claim 1, characterized in that: the specific operation flow of step one, 2D picture preprocessing, is as follows: when the UAV shoots fixed-focus pictures for photogrammetry, the flight area is set in advance, the flight route is planned, and the flight overlap degree is set, an overlap above roughly 70% generally yielding good three-dimensional modeling results; meanwhile, high-precision three-dimensional coordinates at the moment of exposure are collected using an RTK or PPK high-precision BeiDou/GPS positioning system, the coordinate system generally being WGS84; because image recognition performed directly on the raw photographs suffers from repeated recognition and low recognition speed, the high-precision optical-center coordinates carried by the pictures are processed into a picture track, the 4 nearest adjacent pictures in the track serve as the data source for picture stitching, preparing the stitching data for image stitching, and the three-dimensional coordinate information directly selects the stitching order and stitching position, avoiding one-by-one feature point acquisition and matching during image stitching.
4. The target rapid identification and calibration method based on an unmanned aerial vehicle oblique photography 3D model according to claim 1, characterized in that: the specific operation flow of step two, 2D picture stitching, is as follows: the 4 stitching pictures obtained by calculation from the high-precision three-dimensional coordinates are stitched and synthesized, the processing pipeline being: input the pictures to be merged, calculate the overlap area, extract feature points, register the 2D images, apply projection transformation, compute the stitching, fuse the images, and generate the output image.
5. The target rapid identification and calibration method based on an unmanned aerial vehicle oblique photography 3D model according to claim 1, characterized in that: the specific operation flow of step three, artificial intelligence recognition, is as follows: a deep learning algorithm recognizes the stitched 2D picture, the recognized content mainly including trees, farmland, utility poles, ponds, cars, houses/foundations, people, green belts, sidewalks, etc.;
the deep learning model is pre-trained and coarsely tuned on existing public datasets such as ADE20K, Urban Drone Dataset (UDD) and Stanford Drone Dataset, and then fine-tuned on augmented pictures shot by the UAV;
the panoptic segmentation algorithm used for image processing is the open-source SegFormer algorithm, implemented with a Transformer encoder and a multilayer perceptron (MLP) decoder.
6. The target rapid identification and calibration method based on an unmanned aerial vehicle oblique photography 3D model according to claim 1, characterized in that: the specific operation flow of step four, 2D/3D coordinate conversion, is as follows: a rapid conversion model is established between the pixel coordinates produced by artificial-intelligence image recognition and the three-dimensional coordinates of the physical world, the conversion chain being: pixel coordinates → image coordinates → camera coordinates → WGS84 coordinates.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210850871.7A | 2022-07-19 | 2022-07-19 | Target rapid identification and calibration method based on unmanned aerial vehicle oblique photography 3D model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210850871.7A | 2022-07-19 | 2022-07-19 | Target rapid identification and calibration method based on unmanned aerial vehicle oblique photography 3D model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115330594A | 2022-11-11 |
Family
ID=83916898
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210850871.7A (pending) | Target rapid identification and calibration method based on unmanned aerial vehicle oblique photography 3D model | 2022-07-19 | 2022-07-19 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115330594A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116468889A (en) * | 2023-04-04 | 2023-07-21 | 中国航天员科研训练中心 | Panorama segmentation method and system based on multi-branch feature extraction |
CN116468889B (en) * | 2023-04-04 | 2023-11-07 | 中国航天员科研训练中心 | Panorama segmentation method and system based on multi-branch feature extraction |
CN116664828A (en) * | 2023-04-15 | 2023-08-29 | 北京中科航星科技有限公司 | Intelligent equipment image information processing system and method |
CN116664828B (en) * | 2023-04-15 | 2023-12-15 | 北京中科航星科技有限公司 | Intelligent equipment image information processing system and method |
CN116189115A (en) * | 2023-04-24 | 2023-05-30 | 青岛创新奇智科技集团股份有限公司 | Vehicle type recognition method, electronic device and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | | |
SE01 | Entry into force of request for substantive examination | | |