CN117593350A - Binocular stereo matching method and system for unmanned aerial vehicle power transmission line detection - Google Patents

Binocular stereo matching method and system for unmanned aerial vehicle power transmission line detection Download PDF

Info

Publication number
CN117593350A
Authority
CN
China
Prior art keywords
image
aerial vehicle
unmanned aerial
matching
depth
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410070619.3A
Other languages
Chinese (zh)
Inventor
巢建树
朱程
赖佳华
吴晓亮
安德钰
李霆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Quanzhou Institute of Equipment Manufacturing
Original Assignee
Quanzhou Institute of Equipment Manufacturing
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Quanzhou Institute of Equipment Manufacturing filed Critical Quanzhou Institute of Equipment Manufacturing
Priority to CN202410070619.3A priority Critical patent/CN117593350A/en
Publication of CN117593350A publication Critical patent/CN117593350A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/17Terrestrial scenes taken from planes or by drones
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30181Earth observation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Remote Sensing (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a binocular stereo matching method and system for unmanned aerial vehicle power transmission line detection, and relates to the technical field of image processing. In the method, a binocular camera acquires the image input, and a depth estimation algorithm model is integrated into the on-board chip of the unmanned aerial vehicle. This design combines a conventional unmanned aerial vehicle with a deep learning model and enables real-time inference on binocular images. An Nvidia Jetson chip is further integrated as a vision accelerator to improve inference efficiency. By exploiting the hardware accelerator, the unmanned aerial vehicle can perform deep learning inference more efficiently, and recognition accuracy is improved in particular when processing small objects. Together, these structural innovations ultimately realize the obstacle avoidance function of the unmanned aerial vehicle.

Description

Binocular stereo matching method and system for unmanned aerial vehicle power transmission line detection
Technical Field
The invention discloses a binocular stereo matching method and system for unmanned aerial vehicle power transmission line detection, and relates to the technical field of image processing.
Background
Conventional image processing methods have difficulty matching in texture-less regions and in scenes containing fine structures. The lack of distinctive features in texture-less regions makes it difficult for traditional matching methods to extract valid information, and tiny structures may be ignored or mismatched by the algorithm, reducing matching accuracy. These problems degrade the performance of traditional stereo matching methods in complex scenes and limit their practical utility in applications such as unmanned aerial vehicle obstacle avoidance.
Conventional methods also perform poorly when handling occlusion and mismatched regions at image boundaries. Occlusion makes it difficult for the algorithm to identify the exact shape and position of the occluded object, which affects the performance of the obstacle avoidance system. At the same time, the difficulty of processing unmatched regions leaves the algorithm without a global understanding of occluding objects in complex scenes. This limits the robustness and practicality of conventional approaches in unmanned aerial vehicle obstacle avoidance.
It is also difficult for conventional methods to accurately estimate the depth of elongated objects such as power lines. Because power lines typically have a small cross-section, conventional methods may not capture enough feature information to make accurate depth estimates. In an unmanned aerial vehicle obstacle avoidance system, this results in insufficient depth perception of targets such as power transmission lines and degrades the obstacle avoidance effect. Improving the accuracy of depth estimation for elongated objects therefore becomes a key challenge in optimizing obstacle avoidance systems.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a binocular stereo matching method and system for unmanned aerial vehicle power transmission line detection. The adopted technical scheme is as follows:
In a first aspect, a binocular stereo matching method for unmanned aerial vehicle power transmission line detection comprises:
S1, acquiring images through a binocular camera configured on the unmanned aerial vehicle, and transmitting the images to a stereo matching network configured on the unmanned aerial vehicle;
S2, performing stereo matching through the stereo matching network on the acquired images to obtain a disparity map;
S3, according to the disparity map, calculating the depth value of each pixel through an algorithm model to fill the missing parts of the depth information, removing abnormal values, and outputting a depth image;
S4, sending the depth image as the prediction result to the edge device, and performing real-time analysis through an Nvidia Jetson chip;
and S5, tracking the detected power transmission lines according to the depth image, detecting potential power transmission lines, and dynamically adjusting the route of the unmanned aerial vehicle.
In some implementations, in S2, the stereo matching includes feature extraction, matching cost computation, cost aggregation, and disparity estimation.
In some implementations, S2 specifically includes:
S21, extracting features through a CNN network;
S22, performing feature enhancement through an attention TAC module on the extracted features, and obtaining geometric feature information by constructing a geometric cost volume;
S23, according to the self-similarity of the features, propagating predictions to unmatched regions by updating the disparity with an SA module.
In some implementations, S3 further includes:
performing format conversion on the algorithm model, and constructing a communication network between the unmanned aerial vehicle and the stereo matching network.
In a second aspect, an embodiment of the present invention provides a binocular stereo matching system for unmanned aerial vehicle power transmission line detection, the system comprising:
the image acquisition unit, used for acquiring images through the binocular camera configured on the unmanned aerial vehicle and transmitting the images to the stereo matching network configured on the unmanned aerial vehicle;
the image matching unit, used for performing stereo matching through the stereo matching network on the acquired images to obtain a disparity map;
the image processing unit, used for calculating the depth value of each pixel through an algorithm model according to the disparity map to fill the missing parts of the depth information, removing abnormal values, and outputting a depth image;
the image analysis unit, used for sending the depth image as the prediction result to the edge device and performing real-time analysis through an Nvidia Jetson chip;
and the route adjustment unit, used for tracking the detected power transmission lines according to the depth image, detecting potential power transmission lines, and dynamically adjusting the route of the unmanned aerial vehicle.
In some implementations, in the image matching unit, the stereo matching includes feature extraction, matching cost computation, cost aggregation, and disparity estimation.
In some implementations, the image matching unit specifically includes:
the feature extraction subunit is used for extracting features through a CNN network;
the feature enhancement subunit, used for performing feature enhancement through the attention TAC module on the extracted features and obtaining geometric feature information by constructing a geometric cost volume;
and the disparity propagation subunit, used for propagating predictions to unmatched regions by updating the disparity with the SA module according to the self-similarity of the features.
In some implementations, the image processing unit further includes:
and the format conversion subunit, used for performing format conversion on the algorithm model and constructing a communication network between the unmanned aerial vehicle and the stereo matching network.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory and a processor, where the memory is configured to store one or more computer instructions, and the one or more computer instructions, when executed by the processor, implement the method according to the first aspect.
In a fourth aspect, embodiments of the present invention provide a computer storage medium having a computer program stored therein, which when executed by a processor, is adapted to carry out the method according to the first aspect.
One or more embodiments of the present invention can provide at least the following advantages:
A channel attention TAC module is provided, which selectively aggregates cross-view feature similarity information and improves sensitivity to details. A lightweight three-dimensional regularization network and a geometric encoding volume module are provided, which help capture the global geometric structure and process global information more comprehensively than traditional methods. A Self-Attention (SA) module is introduced, which improves the robustness of the algorithm to occlusion, texture-less regions and weakly textured regions by measuring the self-similarity of the features;
The invention obtains the image input through the binocular camera and integrates the depth estimation algorithm model into the built-in chip of the unmanned aerial vehicle. This design combines a conventional unmanned aerial vehicle with a deep learning model and enables real-time inference on binocular images. An Nvidia Jetson chip is further integrated as a vision accelerator to improve inference efficiency. By exploiting the hardware accelerator, the unmanned aerial vehicle can perform deep learning inference more efficiently, and recognition accuracy is improved in particular when processing small objects. Together, these structural innovations ultimately realize the obstacle avoidance function of the unmanned aerial vehicle.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings that are needed in the embodiments or the description of the prior art will be briefly described below, it will be obvious that the drawings in the following description are some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort to a person skilled in the art.
Fig. 1 is a flowchart of a binocular stereo matching method for unmanned aerial vehicle power line detection provided by an embodiment of the invention;
fig. 2 is a block diagram of a binocular stereo matching system for unmanned aerial vehicle power line detection provided by an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a stereo matching network according to an embodiment of the present invention;
fig. 4 is a schematic diagram of the operation of the attention module according to the embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present invention.
Embodiment one:
Fig. 1 shows a flowchart of a binocular stereo matching method for unmanned aerial vehicle power transmission line detection. As shown in fig. 1, the binocular stereo matching method for unmanned aerial vehicle power transmission line detection provided in this embodiment comprises:
S1, acquiring images through a binocular camera configured on the unmanned aerial vehicle, and transmitting the images to a stereo matching network configured on the unmanned aerial vehicle;
S2, performing stereo matching through the stereo matching network on the acquired images to obtain a disparity map;
S3, according to the disparity map, calculating the depth value of each pixel through an algorithm model to fill the missing parts of the depth information, removing abnormal values, and outputting a depth image;
S4, sending the depth image as the prediction result to the edge device, and performing real-time analysis through an Nvidia Jetson chip;
and S5, tracking the detected power transmission lines according to the depth image, detecting potential power transmission lines, and dynamically adjusting the route of the unmanned aerial vehicle.
Aiming at the problem that an unmanned aerial vehicle may crash because it cannot detect tiny objects such as power lines during flight, the invention provides a binocular stereo matching method for unmanned aerial vehicle power transmission line detection. A binocular depth estimation algorithm based on stereo matching performs depth estimation of power transmission lines on the flight path of the unmanned aerial vehicle and determines whether potential obstacles exist. Once the depth estimation algorithm predicts an obstacle such as a power transmission line, the system reacts immediately and re-plans the flight route of the unmanned aerial vehicle, thereby achieving the goal of obstacle avoidance and early warning.
First, according to S1, the unmanned aerial vehicle is configured with a binocular camera, and the corresponding parameter setting and calibration are performed to ensure optimal image quality. Through a preset flight route or manual control, images are captured in real time and transmitted to the stereo matching network embedded in the unmanned aerial vehicle for processing and analysis.
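For illustration, the calibration and rectification step is not detailed in the patent; a minimal OpenCV sketch of producing the rectified (corrected) image pair consumed by the stereo matching network might look as follows. The intrinsics K1, K2, distortion vectors D1, D2, the inter-camera rotation R and translation T, and all numeric values are assumed placeholders, not parameters from the patent.

```python
import cv2
import numpy as np

# Assumed calibration results (illustrative shapes and values only).
K1 = K2 = np.array([[800.0, 0.0, 640.0], [0.0, 800.0, 360.0], [0.0, 0.0, 1.0]])
D1 = D2 = np.zeros(5)                    # distortion coefficients
R = np.eye(3)                            # rotation between the two cameras
T = np.array([[-0.12], [0.0], [0.0]])    # 12 cm baseline, in metres
image_size = (1280, 720)                 # (width, height)

# Compute rectification transforms so that epipolar lines become horizontal.
R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, D1, K2, D2, image_size, R, T)

map1x, map1y = cv2.initUndistortRectifyMap(K1, D1, R1, P1, image_size, cv2.CV_32FC1)
map2x, map2y = cv2.initUndistortRectifyMap(K2, D2, R2, P2, image_size, cv2.CV_32FC1)

def rectify_pair(left_bgr, right_bgr):
    """Remap a raw stereo pair onto the rectified image planes."""
    left_rect = cv2.remap(left_bgr, map1x, map1y, cv2.INTER_LINEAR)
    right_rect = cv2.remap(right_bgr, map2x, map2y, cv2.INTER_LINEAR)
    return left_rect, right_rect
```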
Next, according to S2, stereo matching is performed by the stereo matching network on the acquired images to obtain a disparity map. The stereo matching comprises feature extraction, matching cost computation, cost aggregation and disparity estimation, and specifically comprises the following steps:
S21, extracting features through a CNN network;
S22, performing feature enhancement through the attention TAC module on the extracted features, and obtaining geometric feature information by constructing a geometric cost volume;
S23, according to the self-similarity of the features, propagating predictions to unmatched regions by updating the disparity with the SA module.
Taking the binocular images captured by the unmanned aerial vehicle as input, the stereo matching network proposed in S2 performs feature extraction, matching cost computation, cost aggregation and disparity estimation: feature extraction uses a CNN network, the attention TAC module performs feature enhancement, geometric feature information is obtained by constructing the proposed geometric cost volume, and the disparity d0 is then updated with the SA module. The SA module propagates high-quality predictions to unmatched regions by measuring the self-similarity of the features, which enhances robustness to occlusion, texture-less regions and weakly textured regions. Finally, the method can be further improved by introducing an additional refinement step to achieve a different balance between speed and accuracy.
Next, according to S3, the depth value of each pixel is calculated by the algorithm model from the disparity map, the missing parts of the depth information are filled, abnormal values are removed, and a depth image is output.
Specifically, the depth value of each pixel is calculated from the disparity map output by the stereo matching network. After this processing, the missing parts of the depth information are filled, abnormal values are removed, and the depth image is output.
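As a concrete illustration of this step, the depth of each pixel follows from its disparity via Z = f·B/d, with focal length f in pixels and baseline B in metres. The sketch below uses assumed camera parameters and thresholds, and a simple median-filter hole filling stands in for whatever post-processing the algorithm model actually applies.

```python
import numpy as np
import cv2

def disparity_to_depth(disparity, focal_px=800.0, baseline_m=0.12,
                       min_depth=0.5, max_depth=80.0):
    """Convert a disparity map (pixels) to a depth map (metres).

    Z = f * B / d; invalid or out-of-range values are treated as missing and
    filled from neighbouring pixels with a median filter (illustrative only).
    """
    disparity = disparity.astype(np.float32)
    depth = np.zeros_like(disparity, dtype=np.float32)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]

    # Remove outliers outside the plausible working range of the UAV.
    depth[(depth < min_depth) | (depth > max_depth)] = 0.0

    # Fill small holes: replace zeros with the local median of nearby depths.
    filled = cv2.medianBlur(depth, 5)
    depth = np.where(depth == 0.0, filled, depth)
    return depth
```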
Next, according to S4, the estimation result is sent to the edge device for real-time analysis on the Nvidia Jetson chip. The environment is divided into a passable region and an obstacle region, and the distance between the power transmission line and the unmanned aerial vehicle is detected.
Finally, according to S5, the detected power transmission lines are tracked, so that the unmanned aerial vehicle can dynamically adjust its route and detect potential power transmission lines. According to the real-time analysis result, the flight altitude, speed and heading of the unmanned aerial vehicle are adjusted, detection and early warning are carried out, and a new route is planned.
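What "dividing the environment into a passable region and an obstacle region" and the subsequent route adjustment might look like in code is sketched below. The safety distance and the climb-only avoidance policy are illustrative assumptions and are not the planner described by the patent.

```python
import numpy as np

SAFE_DISTANCE_M = 8.0  # assumed safety margin around power lines

def analyse_depth(depth, safe_distance=SAFE_DISTANCE_M):
    """Split the view into passable / obstacle regions and return the distance
    to the closest obstacle (a simplified stand-in for the edge-side analysis)."""
    obstacle_mask = (depth > 0) & (depth < safe_distance)
    nearest = float(depth[obstacle_mask].min()) if obstacle_mask.any() else float("inf")
    return obstacle_mask, nearest

def adjust_route(nearest_obstacle_m, climb_step_m=5.0):
    """Toy avoidance policy: climb when an obstacle enters the safety margin."""
    if nearest_obstacle_m < SAFE_DISTANCE_M:
        return {"action": "climb", "delta_altitude_m": climb_step_m}
    return {"action": "continue"}

# Example: mask, nearest = analyse_depth(np.full((480, 640), 6.0, np.float32))
#          command = adjust_route(nearest)   # -> {"action": "climb", ...}
```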
As shown in the schematic diagram of the stereo matching network provided in fig. 3, the method of the invention is embedded in the unmanned aerial vehicle obstacle avoidance system. Based on the idea of aggregating local and global feature information, the stereo matching network mainly comprises two key components: a Transformer-based feature enhancement module and a geometry encoding module. For a given pair of rectified images Il (left) and Ir (right), dense features F1 and F2, downsampled by a factor of 8, are extracted by a convolutional neural network (CNN) feature extractor, where H, W and D represent height, width and feature dimension, respectively. These extracted shallow features are then fed into the Transformer Cross Attention (TCA) module for feature enhancement in order to construct a cost volume. Finally, the cost volume is fed stage by stage into a lightweight 3D regularization network, and unmatched regions are handled by a Self-Attention (SA) module to obtain the final disparity map. This procedure helps to improve the accuracy and robustness of stereo matching.
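To make the data flow of fig. 3 easier to follow, a deliberately simplified PyTorch sketch is given below. It keeps only a generic backbone / correlation cost volume / 3D regularization / soft-argmin skeleton, omits the TCA and SA modules, and uses assumed layer sizes; it is not the patent's actual network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyStereoNet(nn.Module):
    """Greatly simplified stand-in for the network of fig. 3."""

    def __init__(self, feat_dim=32, max_disp=192):
        super().__init__()
        self.max_disp8 = max_disp // 8            # candidate disparities at 1/8 resolution
        self.extractor = nn.Sequential(           # three stride-2 convs -> 1/8 resolution
            nn.Conv2d(3, feat_dim, 5, stride=2, padding=2), nn.ReLU(inplace=True),
            nn.Conv2d(feat_dim, feat_dim, 5, stride=2, padding=2), nn.ReLU(inplace=True),
            nn.Conv2d(feat_dim, feat_dim, 5, stride=2, padding=2), nn.ReLU(inplace=True),
        )
        self.regularizer = nn.Sequential(          # lightweight 3D aggregation
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(8, 8, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(8, 1, 3, padding=1),
        )

    def correlation_volume(self, f1, f2):
        """Correlation between left features and right features shifted by d pixels."""
        b, c, h, w = f1.shape
        volume = f1.new_zeros(b, self.max_disp8, h, w)
        for d in range(self.max_disp8):
            if d == 0:
                volume[:, d] = (f1 * f2).mean(dim=1)
            else:
                volume[:, d, :, d:] = (f1[:, :, :, d:] * f2[:, :, :, :-d]).mean(dim=1)
        return volume

    def forward(self, left, right):
        f1, f2 = self.extractor(left), self.extractor(right)
        cost = self.correlation_volume(f1, f2).unsqueeze(1)    # [B, 1, D, H/8, W/8]
        cost = self.regularizer(cost).squeeze(1)               # [B, D, H/8, W/8]
        # Soft "argmax" over correlation (equivalent to soft argmin over cost = -correlation).
        prob = F.softmax(cost, dim=1)
        disp_values = torch.arange(self.max_disp8, device=left.device,
                                   dtype=left.dtype).view(1, -1, 1, 1)
        disp8 = (prob * disp_values).sum(dim=1, keepdim=True)  # [B, 1, H/8, W/8]
        # Upsample to input resolution, scaling the disparity values by 8.
        return 8 * F.interpolate(disp8, scale_factor=8, mode="bilinear", align_corners=False)

# Example usage on a dummy rectified pair (sizes divisible by 8):
# net = TinyStereoNet()
# disp = net(torch.rand(1, 3, 384, 640), torch.rand(1, 3, 384, 640))  # [1, 1, 384, 640]
```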
Inspired by Restormer, and as shown in the working schematic of the attention module provided in fig. 4, the Transformer Cross Attention (TCA) module performs feature enhancement, selectively aggregating information based on cross-view feature similarity using a sliding window. This enables the Transformer block to effectively fuse the low-level image features of the encoder with the high-level features of the decoder, helping to preserve fine structural and texture details in the image. A one-dimensional horizontal cross attention is tailored to the rectified images, aiming to capture long-range pixel dependencies while retaining as much high-frequency fine structural information as possible. This design not only accelerates the learning process of the model but also enhances its ability to model complex scenes.
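A rough illustration of the one-dimensional horizontal cross-attention idea (not the TCA module itself, which additionally uses sliding windows and channel attention) is sketched below: after rectification, matching pixels share the same row, so each row of the left feature map attends only to the corresponding row of the right feature map. The dimensions and the use of nn.MultiheadAttention are assumptions for illustration.

```python
import torch
import torch.nn as nn

class HorizontalCrossAttention(nn.Module):
    """Row-wise cross attention between rectified left and right feature maps."""

    def __init__(self, dim=32, heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, left_feat, right_feat):
        # left_feat, right_feat: [B, C, H, W]
        b, c, h, w = left_feat.shape
        q = left_feat.permute(0, 2, 3, 1).reshape(b * h, w, c)    # one sequence per image row
        kv = right_feat.permute(0, 2, 3, 1).reshape(b * h, w, c)
        out, _ = self.attn(self.norm(q), self.norm(kv), self.norm(kv))
        out = (q + out).reshape(b, h, w, c).permute(0, 3, 1, 2)   # residual connection
        return out

# Example: enhanced = HorizontalCrossAttention()(torch.rand(2, 32, 48, 80),
#                                                torch.rand(2, 32, 48, 80))
```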
To achieve accurate modeling of sharp edges and better handle weakly textured regions, it is critical to preserve high-frequency information in the channels. Maintaining high resolution throughout the network is the intuitive approach, but because of its extremely high computational cost, the image is instead downsampled to 1/8 of the original size by pixel unshuffling, which expands the channels without losing any high-frequency information. Specifically, an input of shape [C, H×r, W×r] is reshaped into [C×r², H, W] after pixel unshuffling.
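The reshaping described here corresponds to the standard pixel-unshuffle operation; a minimal check of the shape bookkeeping is shown below (the tensor sizes are arbitrary, r = 8 matches the 1/8-resolution features described above).

```python
import torch
import torch.nn as nn

# Pixel unshuffle trades spatial resolution for channels without discarding
# information: [B, C, H*r, W*r] -> [B, C*r*r, H, W].
r = 8
unshuffle = nn.PixelUnshuffle(downscale_factor=r)

x = torch.rand(1, 3, 256, 320)   # [B, C, H*r, W*r]
y = unshuffle(x)
print(y.shape)                   # torch.Size([1, 192, 32, 40]) = [B, C*r^2, H, W]
```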
A correlation cost volume C_corr is constructed from the left and right features F1 and F2 enhanced by the attention TAC module. However, this cost volume is based solely on feature correlation and lacks the ability to capture global geometry. To solve this problem, a lightweight three-dimensional regularization network is used to process C_corr and obtain a geometric encoding volume.
The three-dimensional regularization network R is based on a lightweight three-dimensional UNet comprising three downsampling blocks and three upsampling blocks, each downsampling block consisting of two 3x3x3 three-dimensional convolutions. Inspired by the CoEx work, the cost volume channels are excited with weights calculated from the left feature F1 for cost aggregation. In the three-dimensional regularization network, this guided cost volume excitation operation effectively infers and propagates scene geometry information to yield the geometric encoding volume. In addition, all pairwise correlations between the corresponding left and right features are computed to obtain a global feature correlation.
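The guided cost-volume excitation can be sketched as follows; the channel counts and the single 1x1 convolution used to produce the guidance weights are illustrative assumptions rather than the patent's exact design.

```python
import torch
import torch.nn as nn

class GuidedCostExcitation(nn.Module):
    """Weights derived from the left-image features F1 modulate every disparity
    slice of the cost volume so that scene context guides aggregation (CoEx-style)."""

    def __init__(self, feat_dim=32, cost_dim=8):
        super().__init__()
        self.to_weight = nn.Conv2d(feat_dim, cost_dim, kernel_size=1)

    def forward(self, cost, left_feat):
        # cost:      [B, C, D, H, W]  cost volume being aggregated
        # left_feat: [B, F, H, W]     left-image features F1
        w = torch.sigmoid(self.to_weight(left_feat))   # [B, C, H, W]
        return cost * w.unsqueeze(2)                   # broadcast over disparity dimension D

# Example: GuidedCostExcitation()(torch.rand(1, 8, 24, 48, 80),
#                                 torch.rand(1, 32, 48, 80))
```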
The initial disparity d0 is regressed from the geometric encoding volume using the soft argmin method: the preset candidate disparity indices are weighted by a softmax over the geometric encoding volume and summed. Further, the SA module is used to update the disparity d0. This module propagates high-quality predictions to unmatched regions by measuring the self-similarity of the features, which enhances robustness to occlusion, texture-less regions and weakly textured regions. The method of the invention can achieve competitive performance while remaining simple and efficient; by introducing an additional refinement step, the method can be further improved to achieve a different balance between speed and accuracy.
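For reference, the standard soft-argmin regression referenced here can be written as follows, where $c_d(p)$ denotes the aggregated cost at pixel $p$ for candidate disparity $d$ and $D_{\max}$ is the number of preset disparity candidates (the negated cost assumes that a lower cost indicates a better match):

$$d_0(p) \;=\; \sum_{d=0}^{D_{\max}-1} d \cdot \frac{\exp\bigl(-c_d(p)\bigr)}{\sum_{d'=0}^{D_{\max}-1} \exp\bigl(-c_{d'}(p)\bigr)}$$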
Embodiment two:
Fig. 2 shows a block diagram of a binocular stereo matching system for unmanned aerial vehicle power transmission line detection. As shown in fig. 2, the binocular stereo matching system for unmanned aerial vehicle power transmission line detection provided in this embodiment comprises:
the image acquisition unit, used for acquiring images through the binocular camera configured on the unmanned aerial vehicle and transmitting the images to the stereo matching network configured on the unmanned aerial vehicle;
the image matching unit, used for performing stereo matching through the stereo matching network on the acquired images to obtain a disparity map;
the image processing unit, used for calculating the depth value of each pixel through an algorithm model according to the disparity map to fill the missing parts of the depth information, removing abnormal values, and outputting a depth image;
the image analysis unit, used for sending the depth image as the prediction result to the edge device and performing real-time analysis through an Nvidia Jetson chip;
and the route adjustment unit, used for tracking the detected power transmission lines according to the depth image, detecting potential power transmission lines, and dynamically adjusting the route of the unmanned aerial vehicle.
Specifically, in the image matching unit, the stereo matching includes feature extraction, matching cost computation, cost aggregation and disparity estimation.
Specifically, the image matching unit includes:
the feature extraction subunit, used for extracting features through a CNN network;
the feature enhancement subunit, used for performing feature enhancement through the attention TAC module on the extracted features and obtaining geometric feature information by constructing a geometric cost volume;
and the disparity propagation subunit, used for propagating predictions to unmatched regions by updating the disparity with the SA module according to the self-similarity of the features.
Specifically, the image processing unit further includes:
and the format conversion subunit, used for performing format conversion on the algorithm model and constructing a communication network between the unmanned aerial vehicle and the stereo matching network, as sketched in the example below.
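The format conversion of the algorithm model is not detailed in the patent; a common route for Jetson deployment is PyTorch → ONNX → TensorRT. The sketch below shows the ONNX export step with a placeholder model; the model class, input resolution, file names and the trtexec conversion hint are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class PlaceholderStereoNet(nn.Module):
    """Stands in for the trained stereo matching model (illustrative only)."""
    def forward(self, left, right):
        return (left - right).mean(dim=1, keepdim=True)   # dummy "disparity"

net = PlaceholderStereoNet().eval()
dummy_left = torch.rand(1, 3, 384, 640)
dummy_right = torch.rand(1, 3, 384, 640)

# PyTorch -> ONNX; the ONNX file can then be built into a TensorRT engine on
# the Jetson, e.g. `trtexec --onnx=stereo_net.onnx --saveEngine=stereo_net.trt`.
torch.onnx.export(net, (dummy_left, dummy_right), "stereo_net.onnx",
                  input_names=["left", "right"], output_names=["disparity"],
                  opset_version=16)
```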
Embodiment III:
This embodiment also provides an electronic device, including a memory and a processor, where the memory is configured to store one or more computer instructions, and the one or more computer instructions, when executed by the processor, implement the method of the first embodiment;
in practical applications, the processor may be an application specific integrated circuit (Application Specific Integrated Circuit, abbreviated as ASIC), a digital signal processor (Digital Signal Processor, abbreviated as DSP), a digital signal processing device (Digital Signal Processing Device, abbreviated as DSPD), a programmable logic device (Programmable Logic Device, abbreviated as PLD), a field programmable gate array (Field Programmable Gate Array, abbreviated as FPGA), a controller, a microcontroller (Microcontroller Unit, MCU), a microprocessor or other electronic component implementation for executing the method in the above embodiment.
The method implemented by this embodiment is as described in embodiment one.
Embodiment four:
This embodiment also provides a computer storage medium having a computer program stored therein which, when executed by one or more processors, implements the method of the first embodiment;
the computer readable storage medium may be implemented by any type or combination of volatile or nonvolatile Memory devices, such as static random access Memory (Static Random Access Memory, SRAM for short), electrically erasable programmable Read-Only Memory (Electrically Erasable Programmable Read-Only Memory, EPROM for short), programmable Read-Only Memory (Programmable Read-Only Memory, PROM for short), read-Only Memory (ROM for short), magnetic Memory, flash Memory, magnetic disk, or optical disk.
The method implemented by this embodiment is as described in embodiment one.
In the several embodiments provided in the embodiments of the present invention, it should be understood that the disclosed system and method may be implemented in other manners. The system and method embodiments described above are merely illustrative.
It should be noted that, in this document, the terms "first," "second," and the like in the description and the claims of the present application and the above drawings are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Although the embodiments of the present invention are described above, the embodiments are only used for facilitating understanding of the present invention, and are not intended to limit the present invention. Any person skilled in the art can make any modification and variation in form and detail without departing from the spirit and scope of the present disclosure, but the scope of the present disclosure is still subject to the scope of the appended claims.

Claims (10)

1. A binocular stereo matching method for unmanned aerial vehicle power transmission line detection, the method comprising:
S1, acquiring images through a binocular camera configured on an unmanned aerial vehicle, and transmitting the images to a stereo matching network configured on the unmanned aerial vehicle;
S2, performing stereo matching through the stereo matching network on the acquired images to obtain a disparity map;
S3, according to the disparity map, calculating the depth value of each pixel through an algorithm model to fill the missing parts of the depth information, removing abnormal values, and outputting a depth image;
S4, sending the depth image as the prediction result to an edge device, and performing real-time analysis through an Nvidia Jetson chip;
and S5, tracking the detected power transmission lines according to the depth image, detecting potential power transmission lines, and dynamically adjusting the route of the unmanned aerial vehicle.
2. The method of claim 1, wherein in S2, the stereo matching comprises feature extraction, matching cost computation, cost aggregation, and disparity estimation.
3. The method according to claim 2, wherein S2 specifically comprises:
S21, extracting features through a CNN network;
S22, performing feature enhancement through an attention TAC module on the extracted features, and obtaining geometric feature information by constructing a geometric cost volume;
S23, according to the self-similarity of the features, propagating predictions to unmatched regions by updating the disparity with an SA module.
4. The method according to claim 3, wherein S3 further comprises:
performing format conversion on the algorithm model, and constructing a communication network between the unmanned aerial vehicle and the stereo matching network.
5. A binocular stereo matching system for unmanned aerial vehicle power transmission line detection, the system comprising:
an image acquisition unit, used for acquiring images through a binocular camera configured on the unmanned aerial vehicle and transmitting the images to a stereo matching network configured on the unmanned aerial vehicle;
an image matching unit, used for performing stereo matching through the stereo matching network on the acquired images to obtain a disparity map;
an image processing unit, used for calculating the depth value of each pixel through an algorithm model according to the disparity map to fill the missing parts of the depth information, removing abnormal values, and outputting a depth image;
an image analysis unit, used for sending the depth image as the prediction result to an edge device and performing real-time analysis through an Nvidia Jetson chip;
and a route adjustment unit, used for tracking the detected power transmission lines according to the depth image, detecting potential power transmission lines, and dynamically adjusting the route of the unmanned aerial vehicle.
6. The system of claim 5, wherein in the image matching unit, the stereo matching includes feature extraction, matching cost computation, cost aggregation, and disparity estimation.
7. The system according to claim 6, wherein the image matching unit specifically comprises:
a feature extraction subunit, used for extracting features through a CNN network;
a feature enhancement subunit, used for performing feature enhancement through the attention TAC module on the extracted features and obtaining geometric feature information by constructing a geometric cost volume;
and a disparity propagation subunit, used for propagating predictions to unmatched regions by updating the disparity with the SA module according to the self-similarity of the features.
8. The system of claim 7, wherein the image processing unit further comprises:
a format conversion subunit, used for performing format conversion on the algorithm model and constructing a communication network between the unmanned aerial vehicle and the stereo matching network.
9. An electronic device comprising a memory and a processor, the memory configured to store one or more computer instructions, wherein the one or more computer instructions when executed by the processor implement the method of any of claims 1-4.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when executed by a processor, is adapted to carry out the method according to any of the preceding claims 1-4.
CN202410070619.3A 2024-01-18 2024-01-18 Binocular stereo matching method and system for unmanned aerial vehicle power transmission line detection Pending CN117593350A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410070619.3A CN117593350A (en) 2024-01-18 2024-01-18 Binocular stereo matching method and system for unmanned aerial vehicle power transmission line detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410070619.3A CN117593350A (en) 2024-01-18 2024-01-18 Binocular stereo matching method and system for unmanned aerial vehicle power transmission line detection

Publications (1)

Publication Number Publication Date
CN117593350A true CN117593350A (en) 2024-02-23

Family

ID=89915398

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410070619.3A Pending CN117593350A (en) 2024-01-18 2024-01-18 Binocular stereo matching method and system for unmanned aerial vehicle power transmission line detection

Country Status (1)

Country Link
CN (1) CN117593350A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287964A (en) * 2019-06-13 2019-09-27 浙江大华技术股份有限公司 A kind of solid matching method and device
CN113642486A (en) * 2021-08-18 2021-11-12 国网江苏省电力有限公司泰州供电分公司 Unmanned aerial vehicle distribution network inspection method with airborne front-end identification model
CN113763562A (en) * 2021-08-31 2021-12-07 哈尔滨工业大学(威海) Binocular vision-based facade feature detection and facade feature processing method
WO2022089077A1 (en) * 2020-10-28 2022-05-05 西安交通大学 Real-time binocular stereo matching method based on adaptive candidate parallax prediction network
CN114565842A (en) * 2022-02-21 2022-05-31 西安电子科技大学 Unmanned aerial vehicle real-time target detection method and system based on Nvidia Jetson embedded hardware
CN115170638A (en) * 2022-07-13 2022-10-11 东北林业大学 Binocular vision stereo matching network system and construction method thereof
CN116029996A (en) * 2022-12-27 2023-04-28 天津云圣智能科技有限责任公司 Stereo matching method and device and electronic equipment

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287964A (en) * 2019-06-13 2019-09-27 浙江大华技术股份有限公司 A kind of solid matching method and device
WO2022089077A1 (en) * 2020-10-28 2022-05-05 西安交通大学 Real-time binocular stereo matching method based on adaptive candidate parallax prediction network
CN113642486A (en) * 2021-08-18 2021-11-12 国网江苏省电力有限公司泰州供电分公司 Unmanned aerial vehicle distribution network inspection method with airborne front-end identification model
CN113763562A (en) * 2021-08-31 2021-12-07 哈尔滨工业大学(威海) Binocular vision-based facade feature detection and facade feature processing method
CN114565842A (en) * 2022-02-21 2022-05-31 西安电子科技大学 Unmanned aerial vehicle real-time target detection method and system based on Nvidia Jetson embedded hardware
CN115170638A (en) * 2022-07-13 2022-10-11 东北林业大学 Binocular vision stereo matching network system and construction method thereof
CN116029996A (en) * 2022-12-27 2023-04-28 天津云圣智能科技有限责任公司 Stereo matching method and device and electronic equipment

Similar Documents

Publication Publication Date Title
CN109034018B (en) Low-altitude small unmanned aerial vehicle obstacle sensing method based on binocular vision
CN109919993B (en) Parallax map acquisition method, device and equipment and control system
CN113902897A (en) Training of target detection model, target detection method, device, equipment and medium
CN113313763B (en) Monocular camera pose optimization method and device based on neural network
CN113264066A (en) Obstacle trajectory prediction method and device, automatic driving vehicle and road side equipment
CN114445265A (en) Equal-rectangular projection stereo matching two-stage depth estimation machine learning algorithm and spherical distortion layer
CN116222577B (en) Closed loop detection method, training method, system, electronic equipment and storage medium
CN113378693A (en) Target generation detection system and method and device for detecting target
CN115511759A (en) Point cloud image depth completion method based on cascade feature interaction
CN116246119A (en) 3D target detection method, electronic device and storage medium
CN116310673A (en) Three-dimensional target detection method based on fusion of point cloud and image features
CN116311091A (en) Vehicle counting method based on pyramid density perception attention network
Ke et al. Deep multi-view depth estimation with predicted uncertainty
CN113792598B (en) Vehicle-mounted camera-based vehicle collision prediction system and method
CN114842340A (en) Robot binocular stereoscopic vision obstacle sensing method and system
Parmehr et al. Automatic registration of optical imagery with 3d lidar data using local combined mutual information
CN112528763B (en) Target detection method, electronic equipment and computer storage medium
CN106570889A (en) Detecting method for weak target in infrared video
CN112132950B (en) Three-dimensional point cloud scene updating method based on crowdsourcing image
CN113112547A (en) Robot, repositioning method thereof, positioning device and storage medium
CN117496312A (en) Three-dimensional multi-target detection method based on multi-mode fusion algorithm
CN112950786A (en) Vehicle three-dimensional reconstruction method based on neural network
CN117593350A (en) Binocular stereo matching method and system for unmanned aerial vehicle power transmission line detection
CN115035551B (en) Three-dimensional human body posture estimation method, device, equipment and storage medium
CN114998630B (en) Ground-to-air image registration method from coarse to fine

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination