CN115205712A - Unmanned aerial vehicle inspection image quick duplicate checking method and system based on twin network

Unmanned aerial vehicle inspection image quick duplicate checking method and system based on twin network

Info

Publication number
CN115205712A
Authority
CN
China
Prior art keywords
image
data
aerial vehicle
unmanned aerial
inspection image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210362383.1A
Other languages
Chinese (zh)
Inventor
李强
吴文炤
赵峰
王卫卫
薛濛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Information and Telecommunication Co Ltd
Anhui Jiyuan Software Co Ltd
Original Assignee
State Grid Information and Telecommunication Co Ltd
Anhui Jiyuan Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Information and Telecommunication Co Ltd, Anhui Jiyuan Software Co Ltd filed Critical State Grid Information and Telecommunication Co Ltd
Priority to CN202210362383.1A priority Critical patent/CN115205712A/en
Publication of CN115205712A publication Critical patent/CN115205712A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/17Terrestrial scenes taken from planes or by drones
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761Proximity, similarity or dissimilarity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Abstract

The application relates to the technical field of equipment maintenance and provides a twin-network-based method and system for quick duplicate checking of unmanned aerial vehicle inspection images. The method comprises: acquiring the inspection image data of each unmanned aerial vehicle inspection; classifying the inspection image data using a ViT-based approach to obtain sub data sets of the inspection image data under each classification background; and performing similarity calculation on the sub data sets under each classification background using a twin neural network, then comparing the result with a threshold to obtain the repeated inspection image data. Image pairs with similar shooting times are obtained through time correlation, which avoids the calculation of irrelevant images, reduces the amount of data processed by the twin network, and improves calculation efficiency. Compared with other twin networks, the twin network fused with ResNet extracts more accurate feature vectors, which facilitates the similarity judgment of the twin network, and duplicate querying of images is realized based on the twin network technology.

Description

Unmanned aerial vehicle inspection image rapid duplicate checking method and system based on twin network
Technical Field
The application relates to the technical field of equipment maintenance, and in particular to a method and a system for quick duplicate checking of unmanned aerial vehicle inspection images based on a ViT-twin cascade network.
Background
Power line inspection is an important guarantee of stable and continuous power transmission. Traditional transmission line inspection relies on a large number of professionals performing on-site inspection, which leads to high inspection cost and low inspection efficiency. With the deep application of unmanned aerial vehicles and helicopters in transmission line inspection, the pressure of manual inspection has been greatly reduced, but the large amount of data produced by inspection still needs to be examined manually. Combining artificial intelligence technology with power inspection can greatly improve detection efficiency. However, unmanned aerial vehicle and helicopter inspection produces a large amount of data, and the influx of inspection data brings two problems: (1) the massive data puts great pressure on data transmission and storage; (2) training an artificial intelligence model requires a large amount of labeled data, and the image data of each inspection contains a large amount of repeated data, which leads to repeated labeling and increases the labeling cost.
At present, there is already a large body of research on data deduplication technology, which mainly covers two aspects: first, traditional deduplication methods, including whole-file detection and block-level duplicate detection; second, deduplication methods based on deep learning.
Traditional deduplication methods usually identify whether data is repeated by comparing data at multiple levels, such as the file level and the block level. These methods first mine the repeated data in a file through hash algorithms such as MD4, MD5, and SHA-1. The detection process involves very complicated computation, which reduces the detection speed.
Deduplication methods based on deep learning first extract global features through wavelet transform, Gabor transform, or similar techniques, or obtain image features directly through a CNN; a real-valued matrix of the image is then obtained through feature hashing, and finally data duplicate checking is realized through binary feature comparison. Such methods combine machine learning with feature engineering and are complex to implement.
The twin network, also known as the Siamese network, was first proposed by LeCun to verify whether the signature on a check is consistent with the signature on file at the bank. With the development of the technology, Hinton used the Siamese architecture in 2010 to build a model for face recognition verification and obtained good experimental results. In 2015, Sergey Zagoruyko et al. improved the Siamese network for image similarity calculation, further increasing system performance. However, the twin networks proposed in these works compare two pictures at a time; if they are used directly for similarity analysis of a large number of transmission images, efficiency drops greatly, so they are not suitable for transmission inspection scenes.
Disclosure of Invention
Aiming at the problem that removing repeated data from power transmission inspection image data is complex to implement, the main purpose of the application is to provide a method, an apparatus, a device, and a computer-readable storage medium for quick duplicate checking of unmanned aerial vehicle inspection images based on a ViT-twin cascade network.
In order to achieve this purpose, the application provides a twin-network-based method for quick duplicate checking of unmanned aerial vehicle inspection images, which comprises the following steps:
acquiring the inspection image data of each unmanned aerial vehicle inspection;
classifying the inspection image data using a ViT-based approach to obtain a sub data set of the inspection image data under each classification background;
and performing similarity calculation on the sub data sets under each classification background based on the twin neural network, and comparing the result with a threshold to obtain the repeated inspection image data.
Optionally, when the inspection image data is acquired, the method further comprises obtaining the image shooting time from the attributes of the unmanned aerial vehicle inspection images, and excluding inspection images with unrelated shooting times from the similarity calculation.
Optionally, before performing similarity calculation on the sub data sets under each classification background, the method further comprises obtaining image pairs with time correlation from the attributes of the unmanned aerial vehicle inspection images.
Optionally, performing similarity calculation on the sub data sets under each classification background based on the twin neural network comprises:
fusing the twin neural network with a ResNet101 network and performing feature extraction on an image pair to obtain low-dimensional feature vectors;
feeding the feature vectors into the loss function of the twin network for similarity calculation to obtain the image pair similarity;
and feeding the image pair similarity into a decision layer and outputting the final similarity judgment result.
Optionally, the sub data sets of the inspection image data under each classification background are obtained, where the classification backgrounds include a tower type, an infrastructure type, and an image type; when the inspection image data is classified using the ViT-based approach, it is divided into a tower-type sub data set, an infrastructure-type sub data set, and an image-type sub data set.
Optionally, performing similarity calculation on the sub data sets under each classification background based on the twin neural network comprises:
performing spatial mapping on the image pair through feature extraction to obtain low-dimensional data vectors;
and then performing similarity calculation on the data vectors and outputting the similarity, or 0 and 1.
Optionally, the inspection image data is classified using the ViT-based approach: a Transformer architecture operating on a sequence of image blocks is used to complete the classification task of the inspection image data. The standard Transformer is applied to the image classification task by splitting each image in the inspection image data into blocks, using the linear embedding sequence of the image blocks as the input of the Transformer, and training the image classification model in a supervised manner.
Optionally, training the image classification model in a supervised manner involves image block embedding, learnable class embedding, position embedding, and an encoder that extracts the features corresponding to the learnable class embedding vector for image classification;
the image block embedding reconstructs the three-dimensional image data into a two-dimensional block sequence, applies a linear transformation to each image block, and then reduces its dimension, so as to convert the image data into the standard input format of the Transformer architecture;
the position embedding is used to preserve the position information between the input image blocks.
Optionally, the encoder is configured to extract the features corresponding to the learnable class embedding vector for image classification after the class vector, the image block embeddings, and the position encoding are integrated into one input embedding vector.
In addition, to achieve the above purpose, the application also provides a twin-network-based system for quick duplicate checking of unmanned aerial vehicle inspection images, which comprises:
a data acquisition module, configured to acquire the inspection image data of each unmanned aerial vehicle inspection; a data classification module, configured to classify the inspection image data using a ViT-based approach to obtain a sub data set of the inspection image data under each classification background; and a data duplicate checking module, configured to perform similarity calculation on the sub data sets under each classification background based on the twin neural network and compare the result with a threshold to obtain the repeated inspection image data.
In addition, in order to achieve the above object, the present application further provides a twin network-based unmanned aerial vehicle inspection image fast duplicate checking device, where the twin network-based unmanned aerial vehicle inspection image fast duplicate checking device includes a processor, a memory, and a twin network-based unmanned aerial vehicle inspection image fast duplicate checking program stored in the memory and executable by the processor, where when the processor executes the twin network-based unmanned aerial vehicle inspection image fast duplicate checking program, the steps of the twin network-based unmanned aerial vehicle inspection image fast duplicate checking method as described above are implemented.
In addition, in order to achieve the above object, the present application also provides a computer readable storage medium, on which a twin network based unmanned aerial vehicle inspection image fast duplicate checking program is stored, where when the twin network based unmanned aerial vehicle inspection image fast duplicate checking program is executed by a processor, the steps of the twin network based unmanned aerial vehicle inspection image fast duplicate checking method as described above are implemented.
The application provides a twin-network-based method and system for quick duplicate checking of unmanned aerial vehicle inspection images. The method classifies the inspection images using a ViT-based approach to obtain image sub data sets under each background; similarity analysis is then performed on the sub data using the Siamese Network technique to obtain the similarity analysis results; finally, the effectiveness and feasibility of the deduplication technique in a power transmission scene are verified through experiments.
Moreover, the shooting time information is obtained from the original unmanned aerial vehicle inspection image attributes, image data with time correlation is screened out, and duplicate querying of images is then realized using the twin-network technique. Obtaining image pairs with similar shooting times through time correlation avoids the calculation of irrelevant images, reduces the amount of data processed by the twin network, and improves calculation efficiency. Compared with other twin networks, the twin network fused with ResNet extracts more accurate feature vectors, which facilitates the similarity judgment of the twin network.
These and other aspects of the present application will be more readily apparent from the following description of the embodiments. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application. In the drawings:
FIG. 1 is a flow chart of the unmanned aerial vehicle inspection image quick duplicate checking method based on a twin network;
FIG. 2 is a schematic diagram of an unmanned aerial vehicle inspection image library and time attributes in the unmanned aerial vehicle inspection image fast duplicate checking method based on the twin network;
FIG. 3 is a flow chart of similarity calculation in the unmanned aerial vehicle inspection image rapid duplicate checking method based on the twin network;
FIG. 4 is a schematic diagram of an image similarity calculation result in the unmanned aerial vehicle inspection image rapid duplicate checking method based on the twin network;
FIG. 5 is a schematic diagram of the Siamese network structure fused with ResNet50 in the unmanned aerial vehicle inspection image fast duplicate checking method based on the twin network;
FIG. 6 is a system block diagram of the unmanned aerial vehicle inspection image fast duplicate checking system based on the twin network;
the objectives, features, and advantages of the present application will be further described with reference to the accompanying drawings.
Detailed Description
The present application is further described with reference to the accompanying drawings and the detailed description, and it should be noted that, in the present application, the embodiments or technical features described below may be arbitrarily combined to form a new embodiment without conflict.
It should be understood that the specific embodiments described herein are merely illustrative of and not restrictive on the broad application.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, of the embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The flow diagrams depicted in the figures are merely illustrative and do not necessarily include all of the elements and operations/steps, nor do they necessarily have to be performed in the order depicted. For example, some operations/steps may be decomposed, combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
It is to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments and features of the embodiments described below can be combined with each other without conflict.
The embodiment of the application provides a method and a system for quick duplicate checking of unmanned aerial vehicle inspection images based on a twin network. The inspection images are classified through ViT to obtain image sub data sets under each background; similarity analysis is then performed on the sub data using the Siamese Network technique to obtain the similarity analysis results; finally, the effectiveness and feasibility of the deduplication technique in a power transmission scene are verified through experiments.
In some embodiments, the twin network-based unmanned aerial vehicle inspection image fast duplicate checking method may be applied to a twin network-based unmanned aerial vehicle inspection image fast duplicate checking device, which may be a device with display and processing functions, such as a PC, a portable computer, a mobile terminal, and the like, although not limited thereto.
Referring to fig. 1, fig. 1 is a schematic flow diagram of a first embodiment of a twin network-based unmanned aerial vehicle inspection image quick duplicate checking method according to the present application. In the embodiment of the application, the unmanned aerial vehicle inspection image rapid duplicate checking method based on the twin network comprises the following steps S10-S30:
and S10, acquiring polling image data of the unmanned aerial vehicle for each polling.
In some embodiments, when the inspection image data is acquired, the image shooting time is obtained from the attributes of the unmanned aerial vehicle inspection images, and inspection images with unrelated shooting times are excluded from the similarity calculation.
Step S20: classifying the inspection image data using a ViT-based approach to obtain a sub data set of the inspection image data under each classification background.
Step S30: performing similarity calculation on the sub data sets under each classification background based on the twin neural network, and comparing the result with a threshold to obtain the repeated inspection image data.
In some embodiments, before performing similarity calculation on the sub data sets under each classification background, the method further includes obtaining image pairs with time correlation from the attributes of the unmanned aerial vehicle inspection images.
As shown in fig. 2, the inspection image library stores the inspection image data of each batch under its "year/month/day" date; the images in each batch carry their shooting time in "year/month/day hour:minute:second" format, and the shooting times are queried and compared to obtain image pairs with high correlation, as in the sketch below.
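As a minimal illustrative sketch of this time-correlation screening (the use of the EXIF DateTimeOriginal tag, the 60-second window, and the function names are assumptions for illustration and are not specified in the application), the pairing step could look like the following:

```python
from datetime import datetime
from itertools import combinations
from PIL import Image
from PIL.ExifTags import TAGS

def shooting_time(path):
    """Read the capture time from the image attributes (assumed EXIF tag: DateTimeOriginal)."""
    exif = Image.open(path)._getexif() or {}
    for tag_id, value in exif.items():
        if TAGS.get(tag_id) == "DateTimeOriginal":
            return datetime.strptime(value, "%Y:%m:%d %H:%M:%S")
    return None

def time_correlated_pairs(image_paths, max_gap_seconds=60):
    """Keep only image pairs whose shooting times differ by at most max_gap_seconds."""
    stamped = [(p, shooting_time(p)) for p in image_paths]
    stamped = [(p, t) for p, t in stamped if t is not None]
    pairs = []
    for (p1, t1), (p2, t2) in combinations(stamped, 2):
        if abs((t1 - t2).total_seconds()) <= max_gap_seconds:
            pairs.append((p1, p2))
    return pairs
```

Only the pairs returned by such a screening step would then be handed to the twin network, which is what reduces the amount of data the network has to compare.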
In the twin-network-based method for quick duplicate checking of unmanned aerial vehicle inspection images, the shooting time information is obtained from the original unmanned aerial vehicle inspection image attributes, image data with time correlation is screened out, and duplicate querying of images is then realized through the twin-network technique. Obtaining image pairs with similar shooting times through time correlation avoids the calculation of irrelevant images, reduces the amount of data processed by the twin network, and improves calculation efficiency. Compared with other twin networks, the twin network fused with ResNet extracts more accurate feature vectors, which facilitates the similarity judgment of the twin network.
Based on the embodiment shown in fig. 1, in some embodiments of the present application, referring to fig. 3, the similarity calculation on the sub data sets under each classification background based on the twin neural network in step S30 includes steps S301 to S303:
Step S301: fusing a ResNet101 network with the twin neural network and performing feature extraction on an image pair to obtain low-dimensional feature vectors;
Step S302: feeding the feature vectors into the loss function of the twin network for similarity calculation to obtain the image pair similarity;
Step S303: feeding the image pair similarity into a decision layer and outputting the final similarity judgment result.
In the embodiment of the application, the two images of each pair to be compared are sent into the ResNet network for feature extraction, yielding low-dimensional feature matrices; the feature matrices are then fed into the loss function of the twin network for similarity calculation to obtain a similarity measure; finally, the similarity measure is sent to the decision layer for similarity judgment. The decision layer may set a threshold as needed and judges the image pair as similar if the similarity measure exceeds the threshold. The complete image duplicate checking process is shown in fig. 4, taking a vibration damper as an example.
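A minimal PyTorch sketch of this feature-extraction, similarity-measurement, and decision pipeline follows; the ResNet50 backbone, the 128-dimensional embedding, the Euclidean distance measure, and the 0.5 threshold are illustrative assumptions rather than the exact configuration of the application (which also mentions ResNet101 as the fused backbone). Note that with a distance measure a smaller value means a more similar pair, so the decision here inverts the "exceeds the threshold" wording used above.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class SiameseResNet(nn.Module):
    """Two weight-sharing ResNet branches followed by a distance-based similarity measure."""
    def __init__(self, embed_dim=128):
        super().__init__()
        backbone = models.resnet50(weights=None)  # ResNet101 could be swapped in the same way
        backbone.fc = nn.Linear(backbone.fc.in_features, embed_dim)
        self.backbone = backbone

    def forward_once(self, x):
        return self.backbone(x)                   # low-dimensional feature vector

    def forward(self, img1, img2):
        f1, f2 = self.forward_once(img1), self.forward_once(img2)
        return nn.functional.pairwise_distance(f1, f2)  # Euclidean distance per pair

def is_duplicate(model, img1, img2, threshold=0.5):
    """Decision layer for a single image pair: judged repeated if the distance is below the threshold."""
    model.eval()
    with torch.no_grad():
        dist = model(img1, img2)
    return (dist < threshold).item()
```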
In some embodiments, the inspection image data is classified using the ViT-based approach: a Transformer architecture operating on a sequence of image blocks is used to complete the classification task of the inspection image data. The standard Transformer is applied to the image classification task by splitting each image in the inspection image data into blocks, using the linear embedding sequence of the image blocks as the input of the Transformer, and training the image classification model in a supervised manner.
In the embodiment of the application, when classifying the inspection image data based on ViT, it is noted that the Transformer architecture is a common network component in existing natural language processing tasks, but its application in the field of computer vision is still limited. The pure Transformer applied directly to sequences of image patches in the embodiment of the present application can therefore efficiently complete the image classification task. When pre-trained on a large amount of data and transferred to multiple medium and small image recognition benchmarks, the Vision Transformer achieves higher image classification accuracy than traditional convolutional neural networks while requiring fewer training resources.
Inspired by the scaling success of the Transformer in natural language processing tasks, the standard Transformer is applied directly to the image classification task. The image is split into blocks (patches), the linear embedding sequence of the image blocks is used as the input of the Transformer, and the image classification model is trained in a supervised manner.
In some embodiments, training the image classification model in a supervised manner involves image block embedding, learnable class embedding, position embedding, and an encoder that extracts the features corresponding to the learnable class embedding vector for image classification.
Specifically, the image block embedding reconstructs the three-dimensional image data into a two-dimensional block sequence; each image block is then linearly transformed and reduced in dimension, which converts the image data into the standard input format required by the Transformer architecture.
In the image block embedding, to satisfy the data input format of a standard Transformer architecture, the three-dimensional image data $x \in \mathbb{R}^{H \times W \times C}$ is reconstructed into the two-dimensional block sequence $x_p \in \mathbb{R}^{N \times (P^2 \cdot C)}$, where $(H, W)$ is the resolution of the image data, $C$ is the number of channels of the image data, $P^2$ is the resolution of each image block, and the number of image blocks $N = HW / P^2$ is the effective input sequence length of the Transformer architecture.
The essence of the image block embedding is that each image block $x_p^i \in \mathbb{R}^{P^2 \cdot C}$ is passed through one linear transformation to obtain its embedding $x_p^i E \in \mathbb{R}^{D}$, i.e. the dimension is reduced from $P^2 \cdot C$ to $D$, realizing the conversion of the image data into the standard input format of the Transformer architecture.
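The reshaping and linear projection described above can be sketched as follows; the patch size of 16, three input channels, and embedding dimension D = 768 are illustrative assumptions:

```python
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    """Reshape an image of shape (B, C, H, W) into N = HW / P^2 flattened blocks
    of dimension P^2 * C, then project each block to dimension D."""
    def __init__(self, patch_size=16, in_channels=3, embed_dim=768):
        super().__init__()
        self.p = patch_size
        self.proj = nn.Linear(patch_size * patch_size * in_channels, embed_dim)

    def forward(self, x):
        b, c, h, w = x.shape
        p = self.p
        # (B, C, H, W) -> (B, C, H/P, W/P, P, P) -> (B, N, P*P*C)
        x = x.unfold(2, p, p).unfold(3, p, p)
        x = x.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * p * p)
        return self.proj(x)  # (B, N, D)

# e.g. a 224x224 RGB image with P = 16 yields N = 196 tokens of dimension 768
tokens = PatchEmbedding()(torch.randn(1, 3, 224, 224))
print(tokens.shape)  # torch.Size([1, 196, 768])
```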
Specifically, in the learnable class embedding, a learnable embedding $x_{class}$ is prepended to the image block embedding sequence ($z_0^0 = x_{class}$); its state at the output of the Transformer encoder, $z_L^0$, serves as the image representation. In the pre-training and parameter fine-tuning stages, a classification head is attached to $z_L^0$ and then used for image classification.
Specifically, the position embedding $E_{pos} \in \mathbb{R}^{(N+1) \times D}$ is used to retain the position information between the input image blocks and is likewise an important part of the image block embedding. If the model is not given the position information of the image blocks, it has to learn the layout from the semantics of the image blocks alone, which easily increases the learning cost of the image classification model.
Specifically, the encoder is configured to extract the features corresponding to the learnable class embedding vector for image classification after the class vector, the image block embeddings, and the position encoding are integrated into one input embedding vector.
In the encoder, after the class vector, the image block embeddings, and the position encoding are integrated into one input embedding vector, it can be fed into the Transformer encoder. The input passes forward through a Transformer encoder built by serially stacking Transformer encoder blocks, and finally the features corresponding to the learnable class embedding vector are extracted for image classification. The overall forward computation is as follows:
The embedding input vector $z_0$ is constructed from the image block embeddings $x_p^i E$, the class vector $x_{class}$, and the position encoding $E_{pos}$:
$$z_0 = [x_{class};\; x_p^1 E;\; x_p^2 E;\; \ldots;\; x_p^N E] + E_{pos}, \qquad E \in \mathbb{R}^{(P^2 \cdot C) \times D},\; E_{pos} \in \mathbb{R}^{(N+1) \times D}$$
The MSA block, consisting of multi-head self-attention, layer normalization, and a skip connection (Layer Norm & Add), is repeated $L$ times; the $l$-th output $z'_l$ is computed as:
$$z'_l = \mathrm{MSA}(\mathrm{LN}(z_{l-1})) + z_{l-1}, \qquad l \in 1, \ldots, L$$
The feed-forward network (MLP), layer normalization, and skip connection (Layer Norm & Add) are likewise repeated $L$ times; the $l$-th output $z_l$ is computed as:
$$z_l = \mathrm{MLP}(\mathrm{LN}(z'_l)) + z'_l, \qquad l \in 1, \ldots, L$$
The image representation $y$ is output through layer normalization (Layer Norm) and the classification head (MLP or FC):
$$y = \mathrm{LN}(z_L^0)$$
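The forward computation above can be sketched in PyTorch as follows; the depth L = 12, 12 attention heads, and the three-class head (tower / infrastructure / image type) are illustrative assumptions rather than the configuration claimed in the application:

```python
import torch
import torch.nn as nn

class ViTEncoderBlock(nn.Module):
    """One encoder block: z' = MSA(LN(z)) + z, then z = MLP(LN(z')) + z'."""
    def __init__(self, dim=768, heads=12, mlp_dim=3072):
        super().__init__()
        self.ln1 = nn.LayerNorm(dim)
        self.msa = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ln2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, mlp_dim), nn.GELU(), nn.Linear(mlp_dim, dim))

    def forward(self, z):
        h = self.ln1(z)
        z = self.msa(h, h, h, need_weights=False)[0] + z
        return self.mlp(self.ln2(z)) + z

class ViTClassifier(nn.Module):
    """Class token + position embedding + L encoder blocks + LN + classification head."""
    def __init__(self, num_patches=196, dim=768, depth=12, num_classes=3):
        super().__init__()
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, dim))
        self.blocks = nn.ModuleList(ViTEncoderBlock(dim) for _ in range(depth))
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Linear(dim, num_classes)  # e.g. tower / infrastructure / image type

    def forward(self, patch_tokens):             # patch_tokens: (B, N, D) from the patch embedding
        b = patch_tokens.size(0)
        cls = self.cls_token.expand(b, -1, -1)
        z = torch.cat([cls, patch_tokens], dim=1) + self.pos_embed  # z_0
        for blk in self.blocks:
            z = blk(z)
        y = self.norm(z[:, 0])                   # y = LN(z_L^0)
        return self.head(y)
```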
the ViT-based method is adopted to process the power transmission image data in batches, and the data are classified into a pole tower type, an infrastructure type and an image type.
In some embodiments, performing similarity calculation on the sub data sets under each classification background based on the twin neural network comprises:
performing spatial mapping on the image pair through feature extraction to obtain low-dimensional data vectors;
and then performing similarity calculation on the data vectors and outputting the similarity, or 0 and 1.
In the embodiment of the application, for the similarity analysis, the Siamese network, also called the twin neural network, is introduced into the power transmission inspection field to check duplicates among similar pictures and reduce the amount of repeated data.
The Siamese network is a two-branch structure formed by two weight-sharing sub-networks. The paired image data are input into the branch networks, and after a series of steps such as convolution, activation, and pooling, the branch outputs are sent to the decision network at the top layer for similarity calculation. The branches of the Siamese network can be viewed as descriptor computation modules, and the top network can be viewed as a similarity function. This network has developed into two basic structures: the pseudo-Siamese network and the 2-channel network.
On the basis of a Pseudo-sierse network and a 2-channel network, the ResNet50 is fused for feature extraction, and the Sierse network structure of the ResNet50 is shown in figure 5.
In the model training process, the pre-processed images are separately sent into the ResNet50 network for feature extraction to obtain a dimension-reduced data set, which is then sent into the contrastive loss network to compute the loss function. The model is trained in a strongly supervised manner, with a loss term plus an $L_2$ norm regularization term as the learning objective function:
$$\min_{w} \; \frac{\lambda}{2}\,\lVert w \rVert_2^2 \;+\; \sum_{i=1}^{N} \max\!\left(0,\; 1 - y_i\, o_i^{net}\right)$$
In the above formula, the first part is the regularization term, for which the $L_2$ regularization term is adopted, and the second part is the error loss part, where $w$ denotes the weights of the neural network, $o_i^{net}$ is the output neuron for the $i$-th training image, and $y_i$ is the corresponding label (with $-1$ and $1$ indicating that the input pictures do not match and match, respectively).
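A minimal sketch of this learning objective is given below; the hinge form of the error term and the value of λ are assumptions inferred from the description of the -1/+1 labels and the L2 regularization term above:

```python
import torch

def siamese_objective(outputs, labels, weights, lam=5e-4):
    """Hinge-style error loss plus an L2 norm regularization term.

    outputs: network outputs o_i for each training image pair
    labels:  y_i in {-1, +1} (-1 = pictures do not match, +1 = pictures match)
    weights: iterable of model parameters w
    """
    error = torch.clamp(1.0 - labels * outputs, min=0.0).sum()
    l2 = sum((w ** 2).sum() for w in weights)
    return 0.5 * lam * l2 + error
```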
In the present application, the model training process is as follows:
(1) Data preparation. The training data set is divided into positive and negative sample data pairs.
(2) Feature extraction. Features are extracted for each scene through the ResNet50 network to obtain a low-dimensional data set.
(3) Network training.
The neural network is trained on the positive and negative samples respectively; the trained model can then be used for similarity analysis of power transmission component images and serves as a reference for the deduplication processing.
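The three steps above could be wired together roughly as shown below; the optimizer, batch size, and learning rate are illustrative assumptions, and `SiameseResNet` and `siamese_objective` refer to the earlier sketches in this description (the distance output is negated so that larger values indicate a match, consistent with the -1/+1 labels):

```python
import torch
from torch.utils.data import DataLoader

def train(model, pair_dataset, epochs=10, lr=1e-4, device="cuda"):
    """pair_dataset yields (img1, img2, label) with label = +1 (match) or -1 (mismatch)."""
    model.to(device).train()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loader = DataLoader(pair_dataset, batch_size=16, shuffle=True)
    for epoch in range(epochs):
        total = 0.0
        for img1, img2, label in loader:
            img1, img2, label = img1.to(device), img2.to(device), label.to(device)
            output = -model(img1, img2)  # negate the pair distance so larger = more similar
            loss = siamese_objective(output, label, model.parameters())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            total += loss.item()
        print(f"epoch {epoch}: loss {total / len(loader):.4f}")
```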
In the application, the model training is aimed at verifying the validity and accuracy of the ViT-based image classification method on power transmission image data: an image classification experiment is designed to test the image classification accuracy under the Transformer architecture and the traditional ResNet architecture and the difference between them, and a further experiment is designed to test the power transmission image data cleaning effect of the ViT-based method.
In summary, the twin-network-based method for quick duplicate checking of unmanned aerial vehicle inspection images classifies the inspection images using a ViT-based approach to obtain image sub data sets under each background; similarity analysis is then performed on the sub data using the Siamese Network technique to obtain the similarity analysis results; finally, the effectiveness and feasibility of the deduplication technique in a power transmission scene are verified through experiments.
In addition, the embodiment of the application also provides a twin network-based unmanned aerial vehicle inspection image fast duplicate checking system.
Referring to fig. 6, fig. 6 is a functional module schematic diagram of a first embodiment of the twin-network-based unmanned aerial vehicle inspection image fast duplicate checking system. In the embodiment of the application, the twin-network-based unmanned aerial vehicle inspection image fast duplicate checking system includes:
the data acquisition module 10, configured to acquire the inspection image data of each unmanned aerial vehicle inspection;
the data classification module 20, configured to classify the inspection image data using a ViT-based approach to obtain a sub data set of the inspection image data under each classification background; and
the data duplicate checking module 30, configured to perform similarity calculation on the sub data sets under each classification background based on the twin neural network and compare the result with a threshold to obtain the repeated inspection image data.
Each module in the unmanned aerial vehicle inspection image rapid duplicate checking system based on the twin network corresponds to each step in the unmanned aerial vehicle inspection image rapid duplicate checking method based on the twin network, and the functions and the implementation process are not repeated here.
The unmanned aerial vehicle inspection image rapid duplicate checking method and device based on the twin network can be realized in the form of a computer program, and the computer program can run on unmanned aerial vehicle inspection image rapid duplicate checking equipment based on the twin network.
The unmanned aerial vehicle inspection image fast duplicate checking device based on the twin network comprises a processor and a memory which are connected through a system bus, wherein the memory can comprise a nonvolatile storage medium and an internal memory.
The processor is used for providing calculation and control capacity and supporting the operation of the whole unmanned aerial vehicle inspection image quick duplicate checking device based on the twin network.
The internal memory provides an environment for running a computer program in the nonvolatile storage medium, and the computer program can enable the processor to execute any one of the unmanned aerial vehicle inspection image quick duplicate checking methods based on the twin network when being executed by the processor.
It should be understood that the processor may be a Central Processing Unit (CPU), or another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The processor is used for running a computer program stored in the memory to realize each embodiment of the unmanned aerial vehicle inspection image quick duplicate checking method based on the twin network, and details are not repeated here.
In addition, the embodiment of the application also provides a computer readable storage medium.
The computer readable storage medium stores a twin network-based unmanned aerial vehicle inspection image fast duplicate checking program, and when the twin network-based unmanned aerial vehicle inspection image fast duplicate checking program is executed by a processor, the steps of the twin network-based unmanned aerial vehicle inspection image fast duplicate checking method are realized.
The method for implementing the unmanned aerial vehicle inspection image rapid duplicate checking program based on the twin network when executed can refer to each embodiment of the unmanned aerial vehicle inspection image rapid duplicate checking method based on the twin network, and is not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element preceded by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present application are merely for description, and do not represent the advantages and disadvantages of the embodiments.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.
The application provides a twin-network-based method and system for quick duplicate checking of unmanned aerial vehicle inspection images. The method classifies the inspection images using a ViT-based approach to obtain image sub data sets under each background; similarity analysis is then performed on the sub data using the Siamese Network technique to obtain the similarity analysis results; finally, the effectiveness and feasibility of the deduplication technique in a power transmission scene are verified through experiments.
Moreover, the shooting time information is obtained from the original unmanned aerial vehicle inspection image attributes, image data with time correlation is screened out, and duplicate querying of images is then realized using the twin-network technique. Obtaining image pairs with similar shooting times through time correlation avoids the calculation of irrelevant images, reduces the amount of data processed by the twin network, and improves calculation efficiency. Compared with other twin networks, the twin network fused with ResNet extracts more accurate feature vectors, which facilitates the similarity judgment of the twin network.
The above description is only a preferred embodiment of the present application, and not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings of the present application, or which are directly or indirectly applied to other related technical fields, are included in the scope of the present application.

Claims (10)

1. A twin-network-based method for quick duplicate checking of unmanned aerial vehicle inspection images, characterized by comprising the following steps:
acquiring the inspection image data of each unmanned aerial vehicle inspection;
classifying the inspection image data using a ViT-based approach to obtain a sub data set of the inspection image data under each classification background;
and performing similarity calculation on the sub data sets under each classification background based on the twin neural network, and comparing the result with a threshold to obtain the repeated inspection image data.
2. The twin-network-based method for quick duplicate checking of unmanned aerial vehicle inspection images according to claim 1, characterized in that, when the inspection image data is acquired, the method further comprises obtaining the image shooting time from the attributes of the unmanned aerial vehicle inspection images and excluding inspection images with unrelated shooting times from the similarity calculation.
3. The twin-network-based method for quick duplicate checking of unmanned aerial vehicle inspection images according to claim 2, characterized in that, before the similarity calculation on the sub data sets under each classification background, the method further comprises obtaining image pairs with time correlation from the attributes of the unmanned aerial vehicle inspection images.
4. The twin-network-based method for quick duplicate checking of unmanned aerial vehicle inspection images according to claim 3, characterized in that performing similarity calculation on the sub data sets under each classification background based on the twin neural network comprises:
fusing the twin neural network with a ResNet101 network and performing feature extraction on an image pair to obtain low-dimensional feature vectors;
feeding the feature vectors into the loss function of the twin network for similarity calculation to obtain the image pair similarity;
and feeding the image pair similarity into a decision layer and outputting the final similarity judgment result.
5. The twin-network-based method for quick duplicate checking of unmanned aerial vehicle inspection images according to claim 1, characterized in that the sub data sets of the inspection image data under each classification background are obtained, where the classification backgrounds include a tower type, an infrastructure type, and an image type; when the inspection image data is classified using the ViT-based approach, it is divided into a tower-type sub data set, an infrastructure-type sub data set, and an image-type sub data set.
6. The twin-network-based method for quick duplicate checking of unmanned aerial vehicle inspection images according to any one of claims 1 to 5, characterized in that performing similarity calculation on the sub data sets under each classification background based on the twin neural network comprises:
performing spatial mapping on the image pair through feature extraction to obtain low-dimensional data vectors;
and then performing similarity calculation on the data vectors and outputting the similarity, or 0 and 1.
7. The twin-network-based method for quick duplicate checking of unmanned aerial vehicle inspection images according to claim 1, characterized in that the inspection image data is classified using the ViT-based approach: a Transformer architecture operating on a sequence of image blocks is used to complete the classification task of the inspection image data; the standard Transformer is applied to the image classification task by splitting each image in the inspection image data into blocks, using the linear embedding sequence of the image blocks as the input of the Transformer, and training the image classification model in a supervised manner.
8. The twin-network-based method for quick duplicate checking of unmanned aerial vehicle inspection images according to claim 7, characterized in that training the image classification model in a supervised manner involves image block embedding, learnable class embedding, position embedding, and an encoder that extracts the features corresponding to the learnable class embedding vector for image classification;
the image block embedding reconstructs the three-dimensional image data into a two-dimensional block sequence, applies a linear transformation to each image block, and then reduces its dimension, so as to convert the image data into the standard input format of the Transformer architecture;
the position embedding is used to preserve the position information between the input image blocks.
9. The twin-network-based method for quick duplicate checking of unmanned aerial vehicle inspection images according to claim 8, characterized in that the encoder is configured to extract the features corresponding to the learnable class embedding vector for image classification after the class vector, the image block embeddings, and the position encoding are integrated into one input embedding vector.
10. A twin-network-based system for quick duplicate checking of unmanned aerial vehicle inspection images, characterized by comprising:
a data acquisition module, configured to acquire the inspection image data of each unmanned aerial vehicle inspection;
a data classification module, configured to classify the inspection image data using a ViT-based approach to obtain a sub data set of the inspection image data under each classification background; and
a data duplicate checking module, configured to perform similarity calculation on the sub data sets under each classification background based on the twin neural network and compare the result with a threshold to obtain the repeated inspection image data.
CN202210362383.1A 2022-04-07 2022-04-07 Unmanned aerial vehicle inspection image quick duplicate checking method and system based on twin network Pending CN115205712A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210362383.1A CN115205712A (en) 2022-04-07 2022-04-07 Unmanned aerial vehicle inspection image quick duplicate checking method and system based on twin network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210362383.1A CN115205712A (en) 2022-04-07 2022-04-07 Unmanned aerial vehicle inspection image quick duplicate checking method and system based on twin network

Publications (1)

Publication Number Publication Date
CN115205712A true CN115205712A (en) 2022-10-18

Family

ID=83574948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210362383.1A Pending CN115205712A (en) 2022-04-07 2022-04-07 Unmanned aerial vehicle inspection image quick duplicate checking method and system based on twin network

Country Status (1)

Country Link
CN (1) CN115205712A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115689928A (en) * 2022-10-31 2023-02-03 国网电力空间技术有限公司 Method and system for removing duplicate of transmission tower inspection image under visible light
CN115689928B (en) * 2022-10-31 2023-11-28 国网电力空间技术有限公司 Method and system for removing duplication of transmission tower inspection images under visible light

Similar Documents

Publication Publication Date Title
CN111797893B (en) Neural network training method, image classification system and related equipment
WO2021139191A1 (en) Method for data labeling and apparatus for data labeling
CN113221687B (en) Training method of pressing plate state recognition model and pressing plate state recognition method
CN113887661B (en) Image set classification method and system based on representation learning reconstruction residual analysis
CN112633459A (en) Method for training neural network, data processing method and related device
CN114170516B (en) Vehicle weight recognition method and device based on roadside perception and electronic equipment
CN114239560A (en) Three-dimensional image classification method, device, equipment and computer-readable storage medium
CN113011568A (en) Model training method, data processing method and equipment
CN111209974A (en) Tensor decomposition-based heterogeneous big data core feature extraction method and system
US20230055263A1 (en) Stratification in non-classified heterogeneous object labels
CN112364893A (en) Semi-supervised zero-sample image classification method based on data enhancement
CN115205712A (en) Unmanned aerial vehicle inspection image quick duplicate checking method and system based on twin network
CN114373099A (en) Three-dimensional point cloud classification method based on sparse graph convolution
Xu et al. Cow face recognition for a small sample based on Siamese DB Capsule Network
CN113762326A (en) Data identification method, device and equipment and readable storage medium
CN113704534A (en) Image processing method and device and computer equipment
WO2022162427A1 (en) Annotation-efficient image anomaly detection
Zingaro et al. Multimodal side-tuning for document classification
CN116796288A (en) Industrial document-oriented multi-mode information extraction method and system
CN116506210A (en) Network intrusion detection method and system based on flow characteristic fusion
CN113392191B (en) Text matching method and device based on multi-dimensional semantic joint learning
CN114372532A (en) Method, device, equipment, medium and product for determining label marking quality
CN117591813B (en) Complex equipment fault diagnosis method and system based on multidimensional features
CN117370679B (en) Method and device for verifying false messages of multi-mode bidirectional implication social network
CN114422199B (en) CMS (content management system) identification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination