CN116433647A - Insulator image quality evaluation method and system based on multitask learning - Google Patents
Insulator image quality evaluation method and system based on multitask learning
- Publication number
- CN116433647A CN202310469429.4A
- Authority
- CN
- China
- Prior art keywords
- task
- insulator
- image
- distortion
- quality
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30108—Industrial image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30168—Image quality inspection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The invention discloses an insulator image quality evaluation method and system based on multi-task learning, relating to the technical field of insulator image evaluation. The method comprises: collecting high-definition insulator images, adding graded distortion, and labeling quality with an advanced objective evaluation algorithm to construct an insulator distortion image data set; fusing fine-grained features of different layers through multi-scale decoupled feature extraction and encoding them with an encoder; inputting the resulting encoded features into the task interaction block of a task-aware transformer decoder to capture the interaction between tasks; decoding the perceptual features of each task; and evaluating the insulator distortion images with MLP networks. The invention enables screening and preprocessing of distorted insulator components in insulator fault diagnosis applications and helps improve the accuracy of insulator fault diagnosis algorithms for power transmission lines.
Description
Technical Field
The invention relates to the technical field of insulator image quality evaluation, in particular to an insulator image quality evaluation method and system based on multitask learning.
Background
Power transmission lines contain large numbers of widely distributed insulators. Because insulators remain in the field for long periods, they are exposed to high voltage and complex climatic conditions and readily develop defects, cracks and similar damage. Identifying insulator components in aerial images efficiently and accurately is important for diagnosing transmission line faults in time. However, camera-side factors such as aging and damage, together with external influences such as shaking and complex environments during shooting, distort and degrade the captured insulator images, which seriously reduces the accuracy of insulator identification and defect detection. A high-performance quality evaluation algorithm is therefore needed to monitor and evaluate the quality of the acquired insulator images.
Subjective image quality evaluation is time-consuming and labor-intensive and easily causes visual fatigue. Objective image quality evaluation uses a computer to score large numbers of distorted images, which saves manpower and material resources and allows images to be judged rapidly. A full-reference image quality evaluation algorithm requires the high-definition image: it compares statistical characteristics of the distorted image and the high-definition image, computes their difference or similarity, and gives a score. Semi-reference image quality assessment scores the image by comparing partial features of the high-definition image and the distorted image. In practical applications, however, the corresponding high-definition image is often unavailable, so no-reference image quality assessment (NR-IQA) algorithms, which do not require a high-definition image, are widely used.
Insulator images acquired by unmanned aerial vehicles are prone to degradation caused by bad weather, camera movement and similar conditions, and no high-definition image is available for comparison in practice, so only no-reference image quality evaluation algorithms can be used, which makes evaluation more difficult. Published image quality evaluation algorithms perform well on natural scene images, but once transferred to insulator images their results are inconsistent with human opinion, so an image quality evaluation algorithm designed for insulator images is needed. At present there is no quality evaluation data set for insulator images, and creating one would require a large number of images with human opinion scores, at a huge cost in manpower and time. Current no-reference quality evaluation algorithms, which are generally based on convolutional neural networks (CNNs), usually learn the mapping between images and subjective scores directly, and thus attempt to solve a complex regression problem in one step; moreover, the global features they use are insufficient to capture complex distortions and cannot account for multi-scale distortion patterns. To ease the difficulty of complex end-to-end training, some studies divide image quality evaluation into several subtasks and use the other subtasks to assist the quality evaluation task. However, these multi-task learning methods focus mainly on shared features and lack the ability to model interaction between tasks.
The prior art therefore lacks a reliable method for evaluating insulator image quality. The scientific problem addressed by the invention is how to better mine the distortion characteristics of insulator images and provide a no-reference image quality evaluation method that fully captures them. In addition, how to evaluate quality and grade the distortion of insulator images without manual scoring is an urgent problem in creating an insulator image quality evaluation data set.
Disclosure of Invention
To solve the above problems, the invention provides an insulator image quality evaluation method and system based on multi-task learning. An insulator image quality evaluation data set is constructed, fine-grained feature fusion is added to enhance multi-scale distortion information, and a deformable mixer encoder module and a task-aware transformer decoder module are introduced to highlight the informative regions relevant to the different tasks and improve the interaction capability between tasks, thereby completing the quality evaluation of insulator distortion images.
In order to achieve the technical purpose, the application provides an insulator image quality evaluation method based on multi-task learning, which comprises the following steps:
collecting high-definition insulator images, adding graded distortion, and labeling quality with an objective evaluation algorithm to construct an insulator distortion image data set;
based on the insulator distortion image data set, fusing fine-grained features of different layers through multi-scale decoupled feature extraction, encoding them with an encoder, and inputting the resulting encoded features into the task interaction block of a task-aware transformer decoder;
and capturing the interaction between tasks with the task interaction block, decoding the perceptual features of each task, and evaluating the insulator distortion image with MLP networks.
Preferably, in the process of labeling quality, the distorted high-definition insulator images are labeled by the MDSI algorithm.
Preferably, in the process of labeling quality, the labeled insulator distortion images are subjected to a Pearson linear correlation coefficient test using the GMSD algorithm, and erroneous labels are removed.
Preferably, in the process of labeling quality, the range of the scalar quality score is divided into discrete subintervals according to distortion grade, and each subinterval is taken as one quality grade, which serves as the quality grade label.
Preferably, before encoding by the encoder, the channels of each layer's feature map of the ResNet50 network are divided into 4 groups; the features of the first group are passed down directly, the features of the second to fourth groups undergo feature extraction by 3×3 convolution, the groups are concatenated to restore the channel count and passed down, and a 1×1 convolution then fuses the information of the 4 groups again to achieve multi-scale feature extraction.
Preferably, in the encoding process by the encoder, the encoder is a deformable mixer encoder, and comprises a linear layer for performing image feature dimension reduction, a channel sensing module for performing channel mixing through standard point-by-point convolution, a first GELU activation+BatchNorm module, a spatial sensing module for performing spatial context aggregation to obtain corresponding offset of a reference point, a second GELU activation+BatchNorm module, a residual module and a Reshape module for flattening image features, which are sequentially arranged, wherein the residual module respectively connects the outputs of the first GELU activation+BatchNorm module and the second GELU activation+BatchNorm module.
Preferably, during decoding by the task-aware transformer decoder, two encoded features are obtained from two deformable mixer encoders;
the two encoded features are input into the task-aware transformer decoder for decoding, wherein the task interaction block consists of a multi-head self-attention module (MHSA) and a small multi-layer perceptron (sMLP). The MHSA projects the features and constructs a self-attention strategy comprising query, key and value matrices; the sMLP, which consists of a linear layer and a LayerNorm, generates the interaction features of the quality scoring task and the quality grading task according to the self-attention strategy.
Preferably, in the process of obtaining the perceptual features, based on the interaction features, the perceptual features of the decoding task are obtained by parallel application of LayerNorm through two task query blocks.
Preferably, in the process of quality evaluation, two mutually independent MLP networks perform quality score prediction and distortion grade prediction on the insulator distortion image based on the perceptual features, and the image is evaluated according to the prediction results. The loss function uses a dynamic weight coefficient to determine the loss contribution of the two tasks: it focuses on learning the distortion grading task in the early stage of training, imitating an easy-to-hard learning pattern. The quality scores and distortion grades obtained for the individual image patches are averaged to give the quality score and distortion grade of the whole image, completing the quality evaluation of the insulator distortion image.
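The dynamic weighting and patch averaging described above can be sketched as follows. The text does not give the weighting schedule, so the linear ramp in `dynamic_weight`, the function names, and the rounding of the averaged grade are all assumptions made purely for illustration.

```python
def dynamic_weight(epoch, total_epochs):
    """Assumed linear schedule: 0 at the first epoch, 1 at the last epoch."""
    w = epoch / max(total_epochs - 1, 1)
    return min(max(w, 0.0), 1.0)


def combined_loss(grade_loss, score_loss, epoch, total_epochs):
    """Weight the grading (classification) loss heavily early in training
    and shift toward the score (regression) loss later: easy to hard."""
    w = dynamic_weight(epoch, total_epochs)
    return (1.0 - w) * grade_loss + w * score_loss


def aggregate_patches(patch_scores, patch_grades):
    """Average per-patch predictions into one result for the whole image.
    Rounding the mean grade to an integer label is an assumption."""
    score = sum(patch_scores) / len(patch_scores)
    grade = round(sum(patch_grades) / len(patch_grades))
    return score, grade
```

With this assumed schedule, the first epoch optimizes only the grading loss and the last epoch only the score loss; any other monotone schedule would express the same idea.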
The invention also provides an insulator image quality evaluation system based on multitask learning, which comprises:
the data acquisition module is used for acquiring the high-definition insulator image;
the data processing module is used for adding graded distortion to the high-definition insulator images and labeling quality with an objective evaluation algorithm to construct an insulator distortion image data set, fusing fine-grained features of different layers through multi-scale decoupled feature extraction, and then encoding them with an encoder to obtain encoded features;
the evaluation module is used for inputting the coding features into a task interaction block of the task perception transformer decoder, capturing task interaction of each task, decoding the perception features of each task, and evaluating the insulator distortion image according to the MLP network.
The invention discloses the following technical effects:
the invention quickly constructs an insulator image quality evaluation data set without relying on subjective scores;
the invention adopts multi-scale decoupled feature extraction to fully aggregate multi-scale distortion information; the added deformable mixer encoder module and task-aware transformer decoder module handle the multi-task learning problem and facilitate full mining of the shared features and the respective perceptual features of the distortion grade prediction task and the quality score prediction task.
The invention carries out quality evaluation and distortion classification on the acquired insulator sub-images at the same time, thereby realizing screening and preprocessing of distorted insulator components in insulator fault diagnosis application and playing an important role in improving the accuracy of an insulator fault diagnosis algorithm in a power transmission line.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of an insulator image quality evaluation method based on multitask learning according to the invention;
FIG. 2 is a network structure diagram of an insulator image quality evaluation method based on multi-task learning according to the invention;
FIG. 3 is a schematic diagram of a multi-scale decoupling process according to the present invention;
FIG. 4 is a schematic diagram of a deformable mixer encoder according to the present invention;
FIG. 5 is a schematic diagram of a task interaction block according to the present invention;
FIG. 6 is a schematic diagram of a task query block according to the present invention.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as provided in the accompanying drawings, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application.
Example 1: As shown in figs. 1-6, the invention constructs an insulator distortion image data set for the first time, provides an insulator image quality evaluation method based on multi-task learning, and introduces a quality grading prediction task to help optimize the regression task. The invention uses multi-scale decoupling to extract fine-grained features, fuses multi-scale distortion features, and introduces a deformable mixer encoder module and a task-aware transformer decoder module to improve multi-task learning capability, thereby addressing the lack of a reliable insulator image quality evaluation method in the prior art. The specific implementation is as follows:
Insulator images collected by unmanned aerial vehicles are degraded by bad weather, camera movement and similar conditions, so a high-performance quality monitoring algorithm is needed to monitor and evaluate the quality of the collected images. Previously published algorithms cannot account for multi-scale distortion patterns, and the global features they capture are insufficient for extracting complex distortions. To avoid directly solving a complex regression problem, some researchers divide the quality assessment task into multiple subtasks; however, these approaches focus mainly on shared features and lack interaction between tasks. The invention first constructs an insulator distortion image data set, then uses multi-scale decoupling for fine-grained feature extraction and fusion so that distortion information at each scale is extracted as accurately as possible, and introduces an effective deformable mixer encoder module and a task-aware transformer decoder to improve the interaction between the quality score prediction task and the distortion grade prediction task. A flowchart of the insulator image quality evaluation method based on multi-task learning is shown in fig. 1.
First, score labels are assigned using the high-definition insulator images and an objective full-reference image quality evaluation algorithm, MDSI. To remove erroneous labels, another advanced full-reference image quality evaluation algorithm, GMSD, is used to carry out a Pearson linear correlation coefficient (PLCC) test on distorted images of the same content, yielding a large insulator distortion image data set that does not depend on subjective scores. The insulator distortion images are then divided into a training set and a test set, and the quality scores are divided into distortion grades. The network model is trained with the distorted images in the training set and their score labels and distortion grade labels. After sufficient training, the distorted images in the test set are input into the model for testing to obtain the corresponding distortion grades and quality scores. The network structure of the insulator image quality evaluation method based on multi-task learning is shown in fig. 2.
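The label-cleaning and grading steps can be illustrated with a minimal sketch. It assumes the MDSI and GMSD scores for one image content are already available as plain lists; the acceptance threshold of 0.9, the equal-width grade bins, and the function names are hypothetical choices, not values given in the text.

```python
import math


def plcc(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def labels_are_consistent(mdsi_scores, gmsd_scores, threshold=0.9):
    """Keep the distorted images of one content only if the two
    full-reference metrics agree (threshold is an assumed value)."""
    return plcc(mdsi_scores, gmsd_scores) >= threshold


def score_to_grade(score, lo, hi, num_grades):
    """Map a scalar score in [lo, hi] to a grade index, assuming equal-width bins."""
    width = (hi - lo) / num_grades
    idx = int((score - lo) / width)
    return min(max(idx, 0), num_grades - 1)
```

For example, `labels_are_consistent([0.1, 0.2, 0.3, 0.4], [0.05, 0.1, 0.16, 0.2])` accepts the group because both metrics rise together, while a group whose GMSD scores oscillate against the MDSI scores would be rejected.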
The insulator image quality evaluation method based on multi-task learning mainly comprises a multi-scale feature extraction network, an encoder, a decoder and a multi-task prediction network.
(1) Multiscale feature extraction network:
In the quality evaluation of insulator sub-images, distortions of different degrees need to be extracted. Distorted images often hide many features at different scales, and the residual images contain important information related to image quality. In view of this, the invention proposes the multi-scale feature extraction network shown in fig. 2, which extracts four layers of distortion features from the ResNet50 base network, downsamples them to the same size, and then concatenates and fuses them. The downsampling is implemented by a 3×3 convolution, which is multi-scale decoupled for finer-grained multi-scale feature extraction, as shown in fig. 3.
First, the channels are divided into groups, with the number of groups (scale) set to 4. The features of the first group are passed down directly; the features of the second group undergo feature extraction by a 3×3 convolution, which changes their receptive field, and each later group has a larger receptive field. Finally, the features of all groups are concatenated and restored, and a 1×1 convolution fuses the channel information again, so that multi-scale features are extracted within the same layer.
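The grouping scheme can be sketched in NumPy as below. Two things are assumptions rather than statements from the text: each later group is given the previous group's output as extra input (a Res2Net-style cascade, which is one way the later groups would acquire larger receptive fields), and a single shared 3×3 kernel stands in for the learned convolutions.

```python
import numpy as np


def conv3x3_same(x, kernel):
    """Apply one shared 3x3 kernel to every channel with zero padding.
    x has shape (C, H, W); kernel has shape (3, 3)."""
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x, dtype=float)
    for di in range(3):
        for dj in range(3):
            out += kernel[di, dj] * padded[:, di:di + h, dj:dj + w]
    return out


def multi_scale_decouple(x, kernel, mix):
    """x: (C, H, W) with C divisible by 4; mix: (C, C) weights of the 1x1 fusion."""
    groups = np.split(x, 4, axis=0)
    outputs = [groups[0]]              # group 1 is passed down unchanged
    prev = None
    for g in groups[1:]:
        inp = g if prev is None else g + prev   # assumed cascade between groups
        prev = conv3x3_same(inp, kernel)
        outputs.append(prev)
    stacked = np.concatenate(outputs, axis=0)   # restore the original channel count
    # A 1x1 convolution is a per-pixel linear mix of channels.
    return np.einsum("oc,chw->ohw", mix, stacked)
```

Feeding a single-pixel impulse through this sketch with an identity `mix` shows the effect: groups 2, 3 and 4 respond over 3×3, 5×5 and 7×7 neighborhoods respectively.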
(2) An encoder:
to improve the feature extraction capability of the network, the invention introduces two independent deformable mixer encoders, outputting deformable features of two prediction tasks.
The deformable mixer encoder adaptively provides more efficient receptive fields and sampling spatial locations for each task for spatially aware deformable spatial features and channel aware location features to generate deformed features for task 1, for example, as shown in fig. 4.
First, image feature X ε R is processed through the linear layer H/4×W/4×C′ The channel dimension of (2) decreases from C to moreSmall dimension C'. The channel perception module allows communication between different channels, mixed by standard point-by-point convolution (convolution kernel 1×1); followed by GELU activation and battnorm; the spatial awareness module can model the spatial context aggregation, the image features are fed to the convolution operator to learn the corresponding offsets Δ (i, j) for all reference points, a process that can be written as:
W 2 is a deformable weight, delta (i,j) Is a learnable offset.
Followed by GELU activation, battnorm, and residual ligation. Reshape operation will feature X ε R H/4×W/4×C ' flattening to sequence R N×C′ (N=H/4×W/4)。
The outputs of the deformable mixer encoders are the task-specific features of the two tasks and serve as input to the subsequent decoder.
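Two of the encoder operations are simple enough to sketch directly: the pointwise (1×1) convolution used for channel mixing, which applies the same linear map to the channel vector at every pixel, and the final reshape that flattens an h × w × c map into a sequence of h·w tokens. The sketch below uses plain nested lists; the function names and the toy weight matrix are illustrative assumptions, and the deformable sampling step is not modelled.

```python
def pointwise_conv(feature_map, weight):
    """1x1 convolution: apply one c_in -> c_out linear map at every pixel.

    feature_map: h x w x c_in nested list; weight: c_out x c_in nested list.
    """
    return [
        [[sum(w * x for w, x in zip(w_row, pixel)) for w_row in weight] for pixel in row]
        for row in feature_map
    ]


def flatten_feature_map(feature_map):
    """Flatten an h x w x c map into an (h*w) x c sequence in row-major order."""
    return [list(pixel) for row in feature_map for pixel in row]


# Toy 2 x 2 map with 2 channels; each pixel stores its own (row, col).
toy = [[[i, j] for j in range(2)] for i in range(2)]
swap_channels = [[0, 1], [1, 0]]  # hypothetical weight that swaps the two channels
seq = flatten_feature_map(pointwise_conv(toy, swap_channels))
print(seq)  # [[0, 0], [1, 0], [0, 1], [1, 1]]
```

The spatial layout is untouched by the pointwise convolution, which is why a separate spatial-aware module is needed to aggregate context across positions.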
(3) A decoder:
To improve multi-task interaction, the invention introduces a task-aware Transformer decoder.
The task-aware Transformer decoder comprises one task interaction block and two task query blocks, which capture the task interaction features and produce the corresponding predictions, respectively.
The task interaction block consists of a multi-head self-attention (MHSA) module and a small multi-layer perceptron (sMLP), and captures the interaction between the tasks through the attention mechanism, as shown in Fig. 5.
First, the two deformed output features from the deformable mixer encoders are concatenated,
where $X_f \in \mathbb{R}^{2N \times C'}$ is the fused feature.
To achieve efficient task interaction, the features are first projected into the query (Q), key (K), and value (V) of dimension $d_k$, and a self-attention strategy is then constructed:
$Q = \mathrm{LN}(X_f), \quad K = \mathrm{LN}(X_f), \quad V = \mathrm{LN}(X_f)$
$X'_f = \mathrm{MHSA}(Q, K, V)$
where $Q \in \mathbb{R}^{2N \times C'}$, $K \in \mathbb{R}^{2N \times C'}$, and $V \in \mathbb{R}^{2N \times C'}$ are the query, key, and value matrices, respectively. Self-attention is computed from Q, K, and V as follows:
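For reference, the standard scaled dot-product form of self-attention is given below. That this is the form intended here is an assumption based on the dimension $d_k$ introduced above; in the multi-head case it is applied per head and the head outputs are concatenated.

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
```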
Finally, the sMLP generates the interaction features of the quality scoring task and the distortion grading task, as shown in the following formula.
The task query block decodes the perceptual features of task 1 and task 2, respectively, from the predicted task interaction features. Fig. 6 shows the generation of the perceptual feature of task 1 as an example.
First, LayerNorm is applied in parallel to generate the query Q, key K, and value V: the deformed feature serves as the task query Q, and the task interaction feature serves as the key K and value V of the MHSA. A self-attention strategy is then constructed:
The task-aware feature is then reshaped from $\mathbb{R}^{N \times C'}$ ($N = H/4 \times W/4$) by a reshape operation and combined through a residual connection:
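The task query step can be sketched as a single-head cross-attention in which the queries come from one task's deformed feature and the keys and values come from the shared interaction feature. This is a simplified illustration, not the patented module: it omits the learned projections, uses one head instead of several, adds the residual to the query-side input (an assumption), and all function names are hypothetical.

```python
import math


def layer_norm(rows, eps=1e-5):
    """Normalize each row to zero mean and unit variance (no learned scale or shift)."""
    out = []
    for row in rows:
        mean = sum(row) / len(row)
        var = sum((x - mean) ** 2 for x in row) / len(row)
        out.append([(x - mean) / math.sqrt(var + eps) for x in row])
    return out


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def task_query(task_feat, interaction_feat):
    """Single-head cross-attention with a residual connection.

    task_feat:        N x C' deformed feature of one task (source of queries)
    interaction_feat: M x C' task interaction feature (source of keys and values)
    """
    q = layer_norm(task_feat)
    k = v = layer_norm(interaction_feat)
    d_k = len(q[0])
    out = []
    for q_row, x_row in zip(q, task_feat):
        scores = [sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(d_k) for k_row in k]
        weights = softmax(scores)
        attended = [sum(w * v_row[c] for w, v_row in zip(weights, v)) for c in range(d_k)]
        out.append([x + a for x, a in zip(x_row, attended)])  # residual connection
    return out


task = [[4.0, 5.0, 6.0], [1.0, 0.0, 2.0]]
interaction = [[1.0, 2.0, 3.0]] * 4
decoded = task_query(task, interaction)
print(len(decoded), len(decoded[0]))  # 2 3
```

The output keeps the shape of the task feature, so each task query block returns one token per spatial position regardless of how many tokens the interaction feature contains.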
(4) Quality score regression and distortion level classification network:
A quality level prediction task, simplified from the more complex quality regression task, is introduced to help optimize the regression task. Specifically, the range of the scalar quality score is divided into discrete sub-intervals, and each sub-interval is treated as one quality level that serves as the label for the distortion level prediction task. The network has two parts, an image quality score prediction module and an image distortion level prediction module, each implemented as a simple multi-layer perceptron (MLP).
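The conversion from a scalar score to a level label can be sketched as a simple binning step. The score range [0, 1], the equal-width intervals, and the function name `score_to_level` are assumptions made for illustration; the text only states that the score range is divided into discrete sub-intervals.

```python
def score_to_level(score, low=0.0, high=1.0, num_levels=5):
    """Map a scalar quality score to a level index in [0, num_levels - 1].

    Assumption: the range [low, high] is split into equal-width sub-intervals.
    """
    if not low <= score <= high:
        raise ValueError("score outside the expected range")
    width = (high - low) / num_levels
    # min() keeps score == high inside the top interval
    return min(int((score - low) / width), num_levels - 1)


print([score_to_level(s) for s in (0.05, 0.5, 0.95, 1.0)])  # [0, 2, 4, 4]
```

Because the level label is derived from the score label, the classification task is a coarser view of the same target, which is what lets it guide the regression task.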
The insulator image quality evaluation method based on multi-task learning provided by the invention comprises the following steps:
Step 1: Five different types of distortion are added to the high-definition insulator images, each at five distortion levels, and quality labels are assigned with an objective evaluation algorithm. The range of the scalar quality score is divided into discrete sub-intervals, and each sub-interval is treated as one quality level and used as the quality level label. The distorted insulator images are then preprocessed, with each distorted image split into several 224×224 patches.
Step 2: Features are extracted in the multi-scale decoupled manner, and the fine-grained features of the different layers are fused into new features that serve as input to the deformable mixer encoders. The outputs of the deformable mixer encoders are fed to the task interaction block of the task-aware Transformer decoder, which captures the interaction between the tasks. The interaction features are then fed to the task query blocks, which decode the perceptual features of each task.
Step 3: Two independent MLP networks perform quality score prediction and distortion level prediction on the perceptual features of each task obtained from the improved network. The loss function uses a dynamic weight coefficient to determine the loss contribution of the two tasks, so that the model focuses on learning the distortion rating task in the initial stage of training, imitating an easy-to-difficult learning pattern. The image quality scores and distortion levels obtained for all patches are averaged to give the quality score and distortion level of the whole image, which completes the quality evaluation of the distorted insulator image.
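The loss weighting and patch aggregation in the third step can be sketched as follows. The text does not specify the form of the dynamic weight coefficient, so the linear schedule in `task_weights` is purely hypothetical: it starts with all weight on the distortion level task and ends with equal weights. The function names are likewise illustrative.

```python
def task_weights(epoch, total_epochs):
    """Return (w_score, w_level) for the current epoch.

    Hypothetical linear schedule: the level (classification) task dominates
    early in training, and the two weights reach 0.5 each by the end.
    """
    progress = min(max(epoch / total_epochs, 0.0), 1.0)
    w_score = 0.5 * progress
    return w_score, 1.0 - w_score


def total_loss(score_loss, level_loss, epoch, total_epochs):
    """Weighted sum of the regression loss and the classification loss."""
    w_score, w_level = task_weights(epoch, total_epochs)
    return w_score * score_loss + w_level * level_loss


def aggregate_patches(patch_scores, patch_levels):
    """Average per-patch predictions into one image-level score and level."""
    n = len(patch_scores)
    return sum(patch_scores) / n, sum(patch_levels) / n


print(task_weights(0, 10))   # (0.0, 1.0)
print(task_weights(10, 10))  # (0.5, 0.5)
print(aggregate_patches([0.25, 0.75], [1, 3]))  # (0.5, 2.0)
```

Whether the averaged level is rounded to the nearest integer level afterwards is not stated in the text, so the sketch returns the plain mean.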
The invention has the following characteristics:
(1) The invention constructs an insulator distorted-image dataset for the first time. High-definition insulator images are subjected to Gaussian filtering with different degrees of blur, white noise of different intensities, JPEG2000 and JPEG compression at different levels, and motion blur of different degrees, simulating the distortions that arise when insulator components are captured in real aerial images. The resulting insulator image quality evaluation dataset addresses the lack of a public dataset for insulator image quality evaluation.
(2) The invention dispenses with subjective scoring. A high-performance full-reference image quality evaluation algorithm scores each distorted insulator image, and a second high-performance full-reference image quality evaluation algorithm is used to screen the labels and distorted images, so that the labels of the insulator image quality evaluation dataset can be constructed quickly.
(3) The invention constructs an effective fine-grained feature extraction and fusion module that takes the influence of distortions at different scales into account and aggregates multi-scale distortion information, addressing the failure of most algorithms to fully consider multi-scale distortion patterns.
(4) The invention also introduces a deformable mixer encoder module that captures more of the informative regions associated with each task, and a task-aware Transformer decoder that attends to the task awareness of each task. This alleviates the lack of global modeling in CNNs and improves the perception and interaction capabilities of the quality scoring task and the distortion grading task.
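The label screening in feature (2) relies on the agreement between two full-reference metrics, which can be measured with the Pearson linear correlation coefficient (PLCC). The sketch below computes the PLCC and applies a hypothetical screening rule: while the correlation is below a threshold, drop the sample whose removal raises it the most. The rule, the threshold `min_plcc`, and the function names are assumptions; the text states only that labels are screened with a second algorithm.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def screen_labels(primary, secondary, min_plcc=0.95):
    """Return the indices of samples kept after screening (hypothetical rule)."""
    keep = list(range(len(primary)))

    def plcc_of(indices):
        return pearson([primary[i] for i in indices], [secondary[i] for i in indices])

    while len(keep) > 3 and plcc_of(keep) < min_plcc:
        # drop the sample whose removal gives the highest remaining correlation
        worst = max(keep, key=lambda d: plcc_of([i for i in keep if i != d]))
        keep.remove(worst)
    return keep


labels = [0.1, 0.2, 0.3, 0.4, 0.5]   # scores from the labelling metric
checks = [0.1, 0.2, 0.9, 0.4, 0.5]   # scores from the checking metric; index 2 disagrees
print(screen_labels(labels, checks))  # [0, 1, 3, 4]
```

In practice the two lists would hold the per-image scores of the labelling metric and the checking metric, and the removed indices would identify the distorted images to discard.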
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In the description of the present invention, it should be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present invention, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.
Claims (10)
1. The insulator image quality evaluation method based on multitask learning is characterized by comprising the following steps of:
collecting high-definition insulator images, adding distortion with grading, and marking quality labels by using an objective evaluation algorithm to construct an insulator distortion image data set;
based on the insulator distortion image dataset, fusing fine-grained features of different layers through multi-scale decoupled feature extraction, encoding through an encoder, and inputting the obtained encoding features into a task interaction block of a task-aware Transformer decoder;
and capturing task interaction of each task based on the task interaction block, decoding the perception characteristic of each task, and evaluating the insulator distortion image according to an MLP network.
2. The method for evaluating the image quality of the insulator based on the multi-task learning according to claim 1, wherein the method comprises the following steps:
and in the process of quality label labeling, the quality label labeling is carried out on the high-definition insulator image added with the distortion through an MDSI algorithm.
3. The method for evaluating the image quality of the insulator based on the multi-task learning according to claim 2, wherein the method comprises the following steps:
and in the process of quality label labeling, the Pearson linear correlation coefficient is checked through a GMSD algorithm for the high-definition insulator images labeled with quality labels, and erroneous labels are removed.
4. A multitask learning-based insulator image quality evaluation method according to claim 3, characterized in that:
in the process of quality label marking, the range of scalar quality scores is divided into discrete subintervals according to distortion grades, and each subinterval is made to be a quality grade and used as a quality grade label.
5. The method for evaluating the image quality of the insulator based on the multi-task learning according to claim 4, wherein:
before encoding by an encoder, dividing the channels of each layer of features of the ResNet50 network into 4 groups, passing the features of the first group of channels on unchanged, extracting the features of the second to fourth groups of channels by 3×3 convolution, concatenating and restoring the groups, passing the result on, and fusing the information of the 4 groups of channels again by 1×1 convolution, thereby realizing multi-scale feature extraction.
6. The method for evaluating the image quality of the insulator based on the multi-task learning according to claim 5, wherein the method comprises the following steps:
in the process of encoding by an encoder, the encoder is a deformable mixer encoder comprising a linear layer for reducing the dimension of the image features, a channel-aware module for channel mixing through standard pointwise convolution, a first GELU activation + BatchNorm module, a spatial-aware module for performing spatial context aggregation to obtain the offsets of the reference points, a second GELU activation + BatchNorm module, a residual module, and a reshape module for flattening the image features, wherein the residual module connects the outputs of the first GELU activation + BatchNorm module and the second GELU activation + BatchNorm module.
7. The method for evaluating the image quality of the insulator based on the multi-task learning according to claim 6, wherein:
acquiring two encoding features through two deformable mixer encoders in the process of decoding through the task-aware Transformer decoder;
inputting the two encoding features to the task-aware Transformer decoder for decoding, wherein the task interaction block consists of a multi-head self-attention module MHSA and a small multi-layer perceptron sMLP; the multi-head self-attention module MHSA is used for projecting the features and constructing a self-attention strategy comprising query, key, and value matrices; the small multi-layer perceptron sMLP is used for generating the interaction features of the quality scoring task and the distortion grading task according to the self-attention strategy, and the small multi-layer perceptron sMLP consists of a linear layer and a LayerNorm layer.
8. The method for evaluating the image quality of the insulator based on the multi-task learning according to claim 7, wherein:
in the process of obtaining the perceptual features, based on the interaction features, LayerNorm is applied in parallel through two task query blocks to decode the perceptual features of each task.
9. The method for evaluating the image quality of the insulator based on the multi-task learning according to claim 8, wherein the method comprises the following steps:
in the quality evaluation process, based on the perception characteristics, quality score prediction and distortion grade prediction are carried out on the insulator distortion image according to the two mutually independent MLP networks, and quality evaluation is carried out on the insulator distortion image according to a prediction result, wherein a loss function adopts a dynamic weight coefficient to determine loss contribution of two tasks, the learning of the distortion rating task is focused in the early training stage, the learning rule from easy to difficult is simulated, the quality score and the distortion grade of the obtained image of each block are averaged to obtain the quality score and the distortion grade of the final whole image, and further the quality evaluation of the insulator distortion image is completed.
10. An insulator image quality evaluation system based on multi-task learning, comprising:
the data acquisition module is used for acquiring the high-definition insulator image;
the data processing module is used for marking quality labels by adding distortion with grading according to the high-definition insulator images and utilizing an objective evaluation algorithm to construct an insulator distortion image data set, fusing fine-grained characteristics of different layers in a multi-scale decoupling characteristic extraction mode, and then encoding by an encoder to obtain encoding characteristics;
the evaluation module is used for inputting the coding features into a task interaction block of a task perception transformer decoder, capturing task interaction of each task, decoding the perception features of each task, and evaluating the insulator distortion image according to an MLP network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310469429.4A CN116433647A (en) | 2023-04-27 | 2023-04-27 | Insulator image quality evaluation method and system based on multitask learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116433647A (en) | 2023-07-14 |
Family
ID=87087219
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310469429.4A Pending CN116433647A (en) | 2023-04-27 | 2023-04-27 | Insulator image quality evaluation method and system based on multitask learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116433647A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117710370A (en) * | 2024-02-05 | 2024-03-15 | 江西财经大学 | Method and system for evaluating blind quality of true distortion panoramic image driven by multiple tasks |
CN117710370B (en) * | 2024-02-05 | 2024-05-10 | 江西财经大学 | Method and system for evaluating blind quality of true distortion panoramic image driven by multiple tasks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||