CN116543295A - Lightweight underwater target detection method and system based on degradation image enhancement - Google Patents
- Publication number: CN116543295A (application CN202310366420.0A)
- Authority: CN (China)
- Prior art keywords: underwater, image, target detection, network, URPC
- Legal status: Pending
Classifications
- G06V20/05 — Underwater scenes
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/048 — Activation functions
- G06N3/08 — Learning methods
- G06T5/50 — Image enhancement or restoration by the use of more than one image
- G06T5/90
- G06V10/52 — Scale-space analysis, e.g. wavelet analysis
- G06V10/764 — Recognition using classification
- G06V10/766 — Recognition using regression
- G06V10/82 — Recognition using neural networks
- G06T2207/20052 — Discrete cosine transform [DCT]
- G06T2207/20221 — Image fusion; image merging
- G06V2201/07 — Target detection
Abstract
The invention discloses a lightweight underwater target detection method and system based on degraded image enhancement, relating to the technical field of deep-learning image processing. The method receives an image enhancement dataset UIE and an underwater target detection dataset URPC, preprocesses them, and divides them into a training set, a verification set and a test set; the preprocessed UIE underwater images are fed into a pre-established underwater imaging model for enhancement to obtain clear underwater images, and the URPC images are fed into the same model to obtain enhanced images; the network weights that produce the best enhancement effect are saved as a weight file; finally, the weight file, training set, verification set and test set are input into a pre-established lightweight underwater target detection model, which outputs images containing underwater target detection boxes, identifies and marks the targets inside the boxes, and computes the average precision.
Description
Technical Field
The invention relates to the technical field of deep-learning image processing, and in particular to a lightweight underwater target detection method and system based on degraded image enhancement.
Background
Ocean exploration cannot be separated from underwater target detection, which has important value in resource development, seabed fishing, ecological protection and military operations. However, owing to the complex underwater environment, manual or semi-manual detection suffers from high cost and low safety, posing great challenges to underwater tasks. With the vigorous development of deep learning, vision-based target detection has gradually become a research hot spot: it is widely used in underwater target recognition tasks, plays an important role in resource development, underwater monitoring and ecological protection, and provides powerful support for underwater detection tasks.
The complex underwater environment greatly hinders the target detection task. Light is selectively attenuated under water; that is, its propagation is wavelength-dependent. Red light decays fastest, followed by green and then blue, so collected underwater images always present a blue-green background and exhibit color cast. At the same time, suspended particles in the water scatter light, blurring image details. These problems degrade image quality and directly affect the accuracy of the subsequent detection task.
In recent years, deep learning has developed rapidly, and target detection technology has made great progress and is widely applied across scenes. However, current networks have complex structures and huge parameter counts, which is unfavorable for real-time detection. Present target detection algorithms fall into two classes. One-stage detectors, such as SSD and the YOLO series, directly predict category and position on the feature map and are characterized by high speed. Two-stage algorithms based on region proposals and classifiers, such as R-CNN and Faster R-CNN, first generate region candidate boxes and then classify each candidate with a convolutional neural network; they achieve high accuracy but are slower.
Disclosure of Invention
In order to overcome the above-mentioned shortcomings of the background art, the present invention aims to provide a lightweight underwater target detection method and system based on degraded image enhancement.
the aim of the invention can be achieved by the following technical scheme: a method for detecting a lightweight underwater target based on degradation image enhancement comprises the following steps:
receiving an image enhancement data set UIE and an underwater target detection data set URPC, preprocessing the UIE and the URPC, and dividing the preprocessed UIE and the preprocessed URPC into a training set, a verification set and a test set;
embedding a pre-established underwater imaging model into a UWCNN-SD network; inputting the underwater images of the UIE dataset into the UWCNN-SD network for training, where the network is trained on pairs of original underwater images and real (clear) images; saving the trained weights, loading them back into the UWCNN-SD network, and inputting the URPC dataset into the trained network to finally obtain clear underwater images;
inputting the clear underwater image, the training set, the verification set and the test set into a pre-established lightweight underwater target detection model, finally outputting an image containing an underwater target detection frame, identifying and marking the target in the underwater target detection frame, and calculating average precision to obtain a detection result.
Optionally, the UIE dataset comprises 950 pairs of underwater images and real images, where the real images are clear underwater images without color cast.
Optionally, the URPC dataset contains 7600 pictures and corresponding label files, where each label file comprises the bounding boxes, their position information, and the ground-truth category of each box's content.
Optionally, the underwater imaging model performs a discrete cosine transform on the underwater image to separate it into a high-frequency part and a low-frequency part; a CNN and a loss function are then built and the underwater imaging model is embedded into the network, so that color cast can be removed from the low-frequency part and texture details highlighted in the high-frequency part; finally the two parts are fused to output a clear underwater image.
Optionally, the underwater imaging model is:

I_λ(x) = J_λ(x) · t_λ(x) + A_λ · (1 − t_λ(x))

where I_λ(x) is the captured underwater image, J_λ(x) the clear image, t_λ(x) the transmittance, A_λ the global background light, and λ indexes the RGB channels.

Transforming the model in the manner of the atmospheric scattering model gives:

J_λ(x) = K_λ(x) · I_λ(x) − K_λ(x) + 1

where t_λ(x) and A_λ have been combined into the single variable K_λ(x). A discrete cosine transform separates the underwater image into a low-frequency and a high-frequency component:

I_λ(x) = I_λ^LF(x) + I_λ^HF(x)
J_λ^LF(x) = K_λ(x) · I_λ^LF(x) − K_λ(x) + 1
J_λ^HF(x) = K_λ(x) · I_λ^HF(x) − K_λ(x) + 1

where LF denotes the low-frequency component and HF the high-frequency component. A CNN is constructed, the underwater images of the UIE dataset are input for training, the parameter K_λ(x) is learned, and K_λ(x) is substituted back into the underwater imaging model to solve inversely for the clear underwater image.
Optionally, the CNN training process is as follows:

the network is trained by minimizing a loss function; the network parameters are initialized from a Gaussian distribution and optimized with an Adam optimizer; the learned weights are saved and loaded into a test file, and the underwater images of the URPC dataset are input into the test file to obtain enhanced underwater target detection images. The loss function is the SSIM loss:

L_SSIM = 1 − SSIM(J_λ, I_λ),  SSIM = ((2·μ_J·μ_I + C_1)(2·σ_JI + C_2)) / ((μ_J² + μ_I² + C_1)(σ_J² + σ_I² + C_2))

where μ and σ denote the mean and standard deviation of the gray images J_λ(x) and I_λ(x), σ_JI denotes their covariance, C_1 = (K_1·L)², C_2 = (K_2·L)², with K_1 = 0.01, K_2 = 0.03, L = 1.
Optionally, the lightweight underwater target detection model replaces the feature extraction network in YOLOV5 with a GhostNet lightweight module to extract three feature maps of different sizes, adds a CA attention mechanism in the neck, and then inputs the three feature maps of different scales into three classification-regression layers for prediction.
Optionally, the CA attention mechanism comprises the following three operations:

Information embedding operation: for a given input feature map, global average pooling is performed along the horizontal and vertical directions to obtain two embedded information feature maps. In the horizontal direction an H×1 pooling kernel is used, and the H×W×C input features yield an H×1×C information feature map:

z_c^h(h) = (1/W) · Σ_{0≤i<W} x_c(h, i)

In the vertical (Y) direction a 1×W pooling kernel is used, and the H×W×C input features yield a 1×W×C information feature map:

z_c^w(w) = (1/H) · Σ_{0≤j<H} x_c(j, w)

Attention generation operation: the two information feature maps z_c^h and z_c^w generated in the previous step are concatenated along the spatial dimension and passed through a 1×1 convolution and an activation function. The result is then sliced along the spatial dimension into two separate feature maps f^h and f^w, each of which undergoes a transformation and an activation function to obtain the two attention vectors g^h and g^w:

g^h = σ(F_h(f^h))
g^w = σ(F_w(f^w))

Feature map correction operation: the previous operation yields two attention vectors g^h ∈ C×H×1 and g^w ∈ C×1×W; they are broadcast to the C×H×W dimension and multiplied element-wise, via a residual operation, with the input feature map x_c to obtain the final attention feature.
Optionally, the average precision is computed as follows:

accuracy is measured with the precision p and recall r statistics. Precision is the ratio of true positives tp to all predicted positives, p = tp / (tp + fp), and represents the proportion of correct predictions among the predicted results; recall is the ratio of true positives to all actual positives, r = tp / (tp + fn), and represents the proportion of all targets that are correctly predicted. The average precision AP is the mean of the precisions obtained over all possible values of the recall:

AP = ∫₀¹ p(r) dr
a degraded image enhancement based lightweight underwater target detection system, comprising:
An image processing module: used for receiving the image enhancement dataset UIE and the underwater target detection dataset URPC, preprocessing them, and dividing the preprocessed UIE and URPC into a training set, a verification set and a test set;
an image enhancement module: embedding a pre-established underwater imaging model into a UWCNN-SD network, inputting the underwater image in the preprocessed UIE data set into the UWCNN-SD network for training, storing the trained weight through an original underwater image and a real image training network, loading the weight into the UWCNN-SD network, and inputting the URPC data set into the trained UWCNN-SD network to obtain a clear underwater image;
An image generation module: used for inputting the clear underwater images together with the training, verification and test sets into a pre-established lightweight underwater target detection model, finally outputting images containing underwater target detection boxes, identifying and marking the targets in the boxes, and calculating the average precision.
The invention has the beneficial effects that:
the invention utilizes the image enhancement technology and the target detection technology of the forefront edge, and realizes the practicability of the front edge technology. Aiming at the difficulty of underwater target detection, the invention firstly applies UWCNN-SD algorithm to enhance the degraded underwater image, eliminates color deviation caused by light attenuation, then improves the model based on the YOLOV5 model, replaces the feature extraction network of the model with GhostNet to reduce parameters and calculated amount, improves reasoning speed, introduces CA attention mechanism and enhances the extraction of the features. Finally, the underwater target detection precision is higher, the speed is faster, and the generalization capability is good.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below; it will be obvious that those skilled in the art can obtain other drawings from these drawings without inventive effort.
FIG. 1 is an overall flow chart of the present invention;
FIG. 2 is a network structure diagram of the UWCNN-SD algorithm used in the present invention;
FIG. 3 is a network structure diagram of the improved YOLOV5 model of the present invention;
FIG. 4 is a flow chart of a CA attention mechanism used in the present invention;
FIG. 5 is a detection effect graph of YOLOV5;
fig. 6 is an effect detection diagram of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, a method for detecting a lightweight underwater target based on degraded image enhancement includes the steps of:
s1, acquiring an image enhancement dataset UIE and an underwater target detection dataset URPC, wherein the UIE dataset comprises 960 pairs of underwater images and corresponding clear images, the URPC dataset comprises 7600 pictures and labels, the label format of the URPC dataset is converted from xml to txt by PASCALVOC, and the dataset is divided into a training set, a verification set and a test set according to the ratio of 7:2:1. The URPC data set is divided into an image and a label, the image comprises four targets of sea urchins, starfish, scallops and sea cucumbers, and the label comprises target category information and position information.
S2, unify the image size of the dataset from step S1 to 640×640×3 and enhance the underwater images with the UWCNN-SD algorithm. The UWCNN-SD algorithm is based on the underwater imaging model:

I_λ(x) = J_λ(x) · t_λ(x) + A_λ · (1 − t_λ(x))

where I_λ(x) is the captured underwater image, J_λ(x) the clear image, t_λ(x) the transmittance, A_λ the global background light, and λ indexes the RGB channels.

Transforming the model in the manner of the atmospheric scattering model gives:

J_λ(x) = K_λ(x) · I_λ(x) − K_λ(x) + 1

where t_λ(x) and A_λ have been combined into the single variable K_λ(x). A discrete cosine transform separates the underwater image into a low-frequency and a high-frequency component:

I_λ(x) = I_λ^LF(x) + I_λ^HF(x)
J_λ^LF(x) = K_λ(x) · I_λ^LF(x) − K_λ(x) + 1
J_λ^HF(x) = K_λ(x) · I_λ^HF(x) − K_λ(x) + 1

where LF denotes the low-frequency component and HF the high-frequency component. A CNN is constructed, the underwater images of the UIE dataset are input for training, the parameter K_λ(x) is learned, and K_λ(x) is substituted back into the underwater imaging model to solve inversely for the clear underwater image.
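Recovering the clear image from the learned K_λ(x) is a pointwise evaluation of J = K·I − K + 1. A minimal pure-Python sketch for a single channel (`recover_clear` and the toy values are illustrative; real code would operate on tensors):

```python
def recover_clear(I, K):
    """Invert the underwater imaging model per pixel: J = K*I - K + 1."""
    H, W = len(I), len(I[0])
    return [[K[y][x] * I[y][x] - K[y][x] + 1.0 for x in range(W)]
            for y in range(H)]

# Toy 2x2 channel: K = 1 everywhere corresponds to no degradation,
# so the recovered image matches the input (up to float rounding).
I = [[0.2, 0.4], [0.6, 0.8]]
K1 = [[1.0, 1.0], [1.0, 1.0]]
J = recover_clear(I, K1)
```

The same inversion is applied separately to the low- and high-frequency components before they are fused.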
The network is trained by minimizing the loss function. The network parameters are first initialized from a Gaussian distribution and are optimized with an Adam optimizer. During training the learning rate is set to 0.0001, the batch size to 16, and the number of training epochs to 30. Finally, the learned weights are saved and loaded into a test file, and the underwater images of the URPC dataset are input into the test file to obtain the enhanced underwater target detection images. The loss function is as follows:
L_SSIM = 1 − SSIM(J_λ, I_λ),  SSIM = ((2·μ_J·μ_I + C_1)(2·σ_JI + C_2)) / ((μ_J² + μ_I² + C_1)(σ_J² + σ_I² + C_2))

where μ and σ denote the mean and standard deviation of the gray images J_λ(x) and I_λ(x), σ_JI denotes their covariance, C_1 = (K_1·L)², C_2 = (K_2·L)², with K_1 = 0.01, K_2 = 0.03, L = 1.
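With these constants the loss is the SSIM loss. A pure-Python sketch over a single global window (practical implementations compute SSIM over sliding local windows; `ssim_loss` is an illustrative helper, not the patent's code):

```python
def ssim_loss(a, b, K1=0.01, K2=0.03, L=1.0):
    """1 - SSIM between two equal-length grayscale pixel sequences."""
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((x - mu_b) ** 2 for x in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2
    ssim = ((2 * mu_a * mu_b + C1) * (2 * cov + C2)) / (
        (mu_a ** 2 + mu_b ** 2 + C1) * (var_a + var_b + C2))
    return 1.0 - ssim

img = [0.1, 0.5, 0.9, 0.3]
print(ssim_loss(img, img))                        # identical images -> loss ~ 0
print(ssim_loss(img, [0.9, 0.1, 0.2, 0.8]))      # dissimilar images -> positive loss
```

Minimizing this loss drives the enhanced image toward the structure of the clear reference image rather than just matching pixel values.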
S3, construct the lightweight underwater target detection network: replace the feature extraction network in YOLOV5 with a GhostNet lightweight module and extract three feature maps of different sizes, namely 80×80×40, 40×40×112 and 20×20×160; add a CA attention mechanism in the neck; then input the three feature maps of different scales into three classification-regression layers for prediction;
the GhosNet module firstly performs feature extraction by using fewer convolution check input feature graphs, and then performs linear transformation on the feature graphs phi_i of all channels by using Depth-wise con-volumtion to obtain a Ghost feature graph. And finally, compressing the Ghost feature map and the feature map to generate a final feature map. The module consists of two stacked Ghost parts, the front of which is the extension of Ghost-BottleNeck, the main effect being to increase the number of channels and thus the dimension of the feature map. The latter part is to reduce the dimension of the feature map, to ensure consistency with the input, and finally to connect the two parts together by means of a jump connection. Meanwhile, the ReLu activation function is used in the front part to ensure the gradient disappearance phenomenon in the backward propagation process of the data. The ReLu activation function is not used later, because the distribution of the data of the next layer and the previous layer is different after the activation function is used, so that the difference of data input is continuously adapted, and the training speed of the network is reduced.
The CA attention mechanism comprises the following three operations:

Information embedding operation (Coordinate Information Embedding): for a given input feature map, global average pooling is performed along the horizontal and vertical directions to obtain two embedded information feature maps (as shown in FIG. 3). In the horizontal direction an H×1 pooling kernel is used, and the H×W×C input features yield an H×1×C information feature map:

z_c^h(h) = (1/W) · Σ_{0≤i<W} x_c(h, i)

In the vertical (Y) direction a 1×W pooling kernel is used, and the H×W×C input features yield a 1×W×C information feature map:

z_c^w(w) = (1/H) · Σ_{0≤j<H} x_c(j, w)

Attention generation operation: the two information feature maps z_c^h and z_c^w generated in the previous step are concatenated along the spatial dimension and passed through a 1×1 convolution and an activation function. The result is then sliced along the spatial dimension into two separate feature maps f^h and f^w, each of which undergoes a transformation and an activation function to obtain the two attention vectors g^h and g^w:

g^h = σ(F_h(f^h))
g^w = σ(F_w(f^w))

Feature map correction operation: the previous operation yields two attention vectors g^h ∈ C×H×1 and g^w ∈ C×1×W; they are broadcast to the C×H×W dimension and multiplied element-wise, via a residual operation, with the input feature map x_c to obtain the final attention feature.
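The two directional poolings of the information-embedding step can be sketched for a single channel as follows (`directional_pool` is an illustrative helper; x is one channel as an H×W list of lists):

```python
def directional_pool(x):
    """Coordinate-attention embedding for one channel:
    z_h[h] averages row h over the width, z_w[w] averages column w over the height."""
    H, W = len(x), len(x[0])
    z_h = [sum(row) / W for row in x]                             # H x 1 pooling
    z_w = [sum(x[i][j] for i in range(H)) / H for j in range(W)]  # 1 x W pooling
    return z_h, z_w

x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
z_h, z_w = directional_pool(x)
print(z_h)  # -> [2.0, 5.0]
print(z_w)  # -> [2.5, 3.5, 4.5]
```

Unlike a single global average pool, each 1-D pooling retains position along one axis, which is what lets the generated attention vectors encode coordinate information.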
S4, input the images of the training and verification sets enhanced in step S2 into the lightweight target detection model; the training set trains the model, the verification set gives timely feedback on training progress, and the test set checks the final detection performance. The input image size is 640×640×3, the training batch size is 32, the number of training epochs is 300, the IOU threshold is 0.45 and the initial learning rate is 0.01; the learning rate is updated with a cosine annealing schedule to accelerate training. The trained weight file is saved as "best.pt" and loaded into the model; the test set images are then input into the model, which outputs pictures with target bounding boxes and target information, and the average precision is calculated.
The average precision is measured with the precision p and recall r statistics. Precision is the ratio of true positives (tp) to all predicted positives (tp + fp), p = tp / (tp + fp), and represents the proportion of correct predictions among the predicted results. Recall is the ratio of true positives to all actual positives (tp + fn), r = tp / (tp + fn), and represents the proportion of all targets that are correctly predicted. The average precision AP is the mean of the precisions obtained over all possible values of the recall:

AP = ∫₀¹ p(r) dr
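The precision/recall bookkeeping behind AP can be sketched as follows (a simplified all-point-interpolation version; `average_precision` is an illustrative helper, not the URPC evaluation code):

```python
def average_precision(hits, n_gt):
    """hits: per-detection True/False flags, already sorted by confidence
    (True = the detection matches a ground-truth box). n_gt: number of
    ground-truth boxes. Returns all-point-interpolated AP."""
    precisions, recalls = [], []
    tp = 0
    for i, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / i)      # p = tp / (tp + fp)
        recalls.append(tp / n_gt)      # r = tp / (tp + fn)
    # interpolate: at each recall level use the max precision to its right
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_r)         # area under the p-r curve
        prev_r = r
    return ap

print(average_precision([True, True, False, True], n_gt=4))  # 0.6875
```

The mAP reported in the experiments is this quantity averaged over the four target classes.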
the simulation experiment uses an underwater image from a URPC2020 official data set, wherein the data set is from a national underwater robot major optical image game, comprises 7600 underwater real optical images with marks, comprises color cast, weak contrast and the like caused by various illumination conditions, and has four targets of sea urchins, starfish, scallops and sea cucumbers, and complex detection scenes such as target overlapping and shielding. The Python3.7 and Pytorch frameworks were used to implement on a server with 2 NVIDIARTX2080TI GPUs (11 GB memory).
The data set is divided into a training set, a verification set and a test set, and the dividing ratio is 7:2:1.
Table 1 compares the underwater target detection accuracy, model size and detection speed of YOLOV5 and the method of the invention on the test images.
Table 1. Comparison of underwater target detection precision and model parameters in the simulation experiment
Quantitative analysis of Table 1 shows that the method can effectively detect underwater targets and has remarkable advantages over the YOLOV5L algorithm: the average precision is improved by 3.2% and the model size is reduced by 53.1%.
A degraded image enhancement based lightweight underwater target detection system, comprising:
An image processing module: used for receiving the image enhancement dataset UIE and the underwater target detection dataset URPC, preprocessing them, and dividing the preprocessed UIE and URPC into a training set, a verification set and a test set;
An image enhancement module: used for inputting the preprocessed UIE underwater images into a pre-established underwater imaging model for enhancement to obtain clear underwater images, and for inputting the URPC images into the pre-established underwater imaging model to obtain enhanced images;
A storage module: used for saving the network weights that produce the best enhancement effect as a weight file;
an image generation module: the method is used for inputting the weight file, the training set, the verification set and the test set into a pre-established lightweight underwater target detection model, finally outputting an image containing an underwater target detection frame, identifying and marking targets in the underwater target detection frame, and calculating average precision.
Based on the same inventive concept, the present invention also provides a computer apparatus comprising one or more processors and a memory for storing one or more computer programs; each program includes program instructions, and the processor is configured to execute the program instructions stored in the memory. The processor may be a central processing unit (CPU), or another general-purpose processor such as a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. As the computational and control core of the terminal, the processor loads and executes one or more instructions within a computer storage medium to implement the method described above.
It should be further noted that, based on the same inventive concept, the present invention also provides a computer storage medium having a computer program stored thereon which, when executed by a processor, performs the above method. The storage medium may take the form of any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In the description of the present specification, the descriptions of the terms "one embodiment," "example," "specific example," and the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present disclosure. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The foregoing has shown and described the basic principles, principal features, and advantages of the present disclosure. It will be understood by those skilled in the art that the present disclosure is not limited to the embodiments described above; the foregoing embodiments and description merely illustrate the principles of the disclosure, and various changes and modifications may be made without departing from the spirit and scope of the disclosure, which is defined by the appended claims.
Claims (10)
1. A lightweight underwater target detection method based on degradation image enhancement, characterized by comprising the following steps:
receiving an image enhancement dataset UIE and an underwater target detection dataset URPC, preprocessing both, and dividing the preprocessed UIE and URPC into a training set, a validation set, and a test set;
embedding a pre-established underwater imaging model into a UWCNN-SD network, inputting the underwater images of the preprocessed UIE dataset into the UWCNN-SD network for training, training the network on pairs of original underwater images and reference images, storing the trained weights, loading the weights into the UWCNN-SD network, and inputting the URPC dataset into the trained UWCNN-SD network to obtain clear underwater images;
inputting the clear underwater images and the training, validation, and test sets into a pre-established lightweight underwater target detection model, outputting images containing underwater target detection boxes, identifying and labeling the targets inside the boxes, and computing the average precision to obtain the detection result.
2. The degradation image enhancement based lightweight underwater target detection method of claim 1, wherein the UIE dataset comprises 950 pairs of underwater images and reference images, the reference images being clear underwater images without color cast.
3. The degradation image enhancement based lightweight underwater target detection method of claim 1, wherein the URPC dataset comprises 7600 pictures and corresponding label files, each label file containing label boxes, the position information of each box, and the true category of the box contents.
4. The degradation image enhancement based lightweight underwater target detection method of claim 1, wherein the underwater imaging model applies a discrete cosine transform to the underwater image to separate it into a high-frequency part and a low-frequency part; a CNN network and a loss function are then built and the underwater imaging model is embedded into the network to eliminate color deviation in the low-frequency part and highlight texture details in the high-frequency part; finally, the two parts are fused to output a clear underwater image.
5. The degradation image enhancement based lightweight underwater target detection method of claim 4, wherein the underwater imaging model is as follows:
$$I_\lambda(x) = J_\lambda(x)\,t_\lambda(x) + A_\lambda\bigl(1 - t_\lambda(x)\bigr)$$
where $I_\lambda(x)$ is the captured underwater image, $J_\lambda(x)$ is the clear image to be recovered, $t_\lambda(x)$ is the transmittance, $A_\lambda$ is the global background light, and $\lambda$ indexes the RGB channels;
converting the atmospheric scattering model:
$$J_\lambda(x) = K_\lambda(x)\,I_\lambda(x) - K_\lambda(x) + 1$$
where $t_\lambda(x)$ and $A_\lambda$ are combined into the single variable $K_\lambda(x)$; the underwater image is then separated by a discrete cosine transform into a high-frequency component and a low-frequency component:
$$I_\lambda(x) = I_\lambda^{LF}(x) + I_\lambda^{HF}(x)$$
$$J_\lambda^{LF}(x) = K_\lambda(x)\,I_\lambda^{LF}(x) - K_\lambda(x) + 1$$
$$J_\lambda^{HF}(x) = K_\lambda(x)\,I_\lambda^{HF}(x) - K_\lambda(x) + 1$$
where LF denotes the low-frequency component and HF the high-frequency component; a CNN network is constructed, the underwater images in the UIE dataset are input into the network for training to learn the parameter $K_\lambda(x)$, and $K_\lambda(x)$ is substituted into the underwater imaging model to solve inversely for the clear underwater image.
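Once $K_\lambda(x)$ has been learned, the inverse solve of claim 5 is a pointwise operation. A minimal sketch (the function name is hypothetical, and `K` stands for the network's predicted $K_\lambda(x)$):

```python
import numpy as np

def recover_clear_image(I, K):
    """Invert the converted model J = K*I - K + 1 per pixel and channel."""
    J = K * I - K + 1.0
    return np.clip(J, 0.0, 1.0)  # keep the result in a valid intensity range
```

Note that, by the formula itself, K = 1 leaves the image unchanged (J = I), while smaller K pushes the output toward the background-light value of 1.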
6. The degradation image enhancement-based lightweight underwater target detection method of claim 5, wherein the CNN network training process is as follows:
the network is trained by minimizing a loss function; the network parameters are initialized with a Gaussian distribution and optimized with the Adam optimizer; the learned weights are stored and loaded into a test file, and the underwater images in the URPC dataset are input into the test file to obtain enhanced underwater target detection images; the loss function is as follows:
$$\mathcal{L} = 1 - \mathrm{SSIM}\bigl(J_\lambda(x), I_\lambda(x)\bigr), \qquad \mathrm{SSIM}(J, I) = \frac{(2\mu_J\mu_I + C_1)(2\sigma_{JI} + C_2)}{(\mu_J^2 + \mu_I^2 + C_1)(\sigma_J^2 + \sigma_I^2 + C_2)}$$
where $\mu$ and $\sigma$ denote the mean and standard deviation of the grayscale images $J_\lambda(x)$ and $I_\lambda(x)$, $\sigma_{JI}$ denotes their covariance, $C_1 = (K_1 + L)^2$ and $C_2 = (K_2 + L)^2$, with $K_1 = 0.01$, $K_2 = 0.03$, and $L = 1$.
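The constants in claim 6 match the standard structural-similarity (SSIM) measure; under that reading, a minimal NumPy sketch of the loss (function names are assumptions, not from the patent):

```python
import numpy as np

def ssim(J, I, K1=0.01, K2=0.03, L=1.0):
    """Global SSIM between two grayscale images, claim-6 constants."""
    C1, C2 = (K1 + L) ** 2, (K2 + L) ** 2   # constants as written in claim 6
    mu_j, mu_i = J.mean(), I.mean()
    var_j, var_i = J.var(), I.var()
    cov = ((J - mu_j) * (I - mu_i)).mean()  # covariance of the two images
    return ((2 * mu_j * mu_i + C1) * (2 * cov + C2)) / \
           ((mu_j ** 2 + mu_i ** 2 + C1) * (var_j + var_i + C2))

def ssim_loss(J, I):
    """Loss minimized during training: identical images give zero loss."""
    return 1.0 - ssim(J, I)
```

This global variant computes one SSIM value over the whole image; practical implementations usually average SSIM over local windows instead.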
7. The degradation image enhancement based lightweight underwater target detection method of claim 1, wherein the lightweight underwater target detection model replaces the feature extraction network in YOLOv5 with GhostNet lightweight modules to extract three feature maps of different sizes, adds a CA attention mechanism to the neck, and then inputs the three feature maps of different scales into three classification-regression layers for prediction.
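The GhostNet idea referenced in claim 7 is to produce part of the output channels with an ordinary convolution and the rest with cheap per-channel operations. A structural sketch on a (C, H, W) array, where the callables `primary_conv` and `cheap_op` are placeholders for the real learned layers:

```python
import numpy as np

def ghost_module(x, primary_conv, cheap_op):
    """GhostNet-style module: intrinsic features from a normal convolution,
    'ghost' features from a cheap linear transform of those features."""
    primary = primary_conv(x)                        # intrinsic feature maps
    ghost = cheap_op(primary)                        # cheap per-channel transforms
    return np.concatenate([primary, ghost], axis=0)  # stack along channels

# Toy usage: the 'convolutions' are stand-ins, not learned layers.
x = np.ones((4, 8, 8))
out = ghost_module(x, lambda t: t[:2], lambda t: 0.5 * t)
```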
8. The degradation image enhancement based lightweight underwater target detection method of claim 7, wherein the CA attention mechanism comprises the following three operations:
information embedding operation: for a given input feature map, global average pooling is applied along the horizontal and vertical directions of the feature map to obtain two embedded information feature maps; the horizontal (X) direction uses an H×1 pooling kernel, so an H×W×C input feature yields an H×1×C information feature map:
$$z_c^{h}(h) = \frac{1}{W}\sum_{0 \le i < W} x_c(h, i)$$
the vertical (Y) direction uses a 1×W pooling kernel, so the H×W×C input feature yields a 1×W×C information feature map:
$$z_c^{w}(w) = \frac{1}{H}\sum_{0 \le j < H} x_c(j, w)$$
attention generation operation: the two information feature maps $z_c^{h}$ and $z_c^{w}$ generated in the previous step are concatenated along the spatial dimension and passed through a 1×1 convolution and an activation function:
$$f = \delta\bigl(F_1([z^{h}, z^{w}])\bigr)$$
then $f$ is sliced along the spatial dimension into two separate feature maps, each of which is transformed and activated to obtain the two attention vectors $g^{h}$ and $g^{w}$:
$$g^{h} = \sigma\bigl(F_h(f^{h})\bigr)$$
$$g^{w} = \sigma\bigl(F_w(f^{w})\bigr)$$
feature map correction operation: the two attention vectors $g^{h} \in \mathbb{R}^{C \times H \times 1}$ and $g^{w} \in \mathbb{R}^{C \times 1 \times W}$ obtained above are broadcast to the C×H×W dimension and multiplied position-wise with the input feature map $x_c$ in a residual manner to obtain the final attention feature.
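The three CA operations above can be sketched on a single (C, H, W) array. The learned 1×1 convolutions are replaced by optional callables (identity by default), and the concatenate/slice step is folded into two independent branches for brevity, so this is a structural sketch rather than the trained module:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def coordinate_attention(x, Fh=None, Fw=None):
    """Coordinate-attention sketch on x of shape (C, H, W)."""
    # information embedding: global average pooling along each spatial axis
    zh = x.mean(axis=2, keepdims=True)   # (C, H, 1): pool over the width
    zw = x.mean(axis=1, keepdims=True)   # (C, 1, W): pool over the height
    # attention generation: transform + activation (Fh, Fw stand in for 1x1 convs)
    gh = sigmoid(Fh(zh) if Fh else zh)   # attention vector g^h, values in (0, 1)
    gw = sigmoid(Fw(zw) if Fw else zw)   # attention vector g^w, values in (0, 1)
    # feature map correction: broadcast to (C, H, W) and reweight the input
    return x * gh * gw
```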
9. The degradation image enhancement based lightweight underwater target detection method of claim 1, wherein the average precision is calculated as follows:
precision p and recall r statistics are used to measure accuracy: precision is the ratio of true positives tp to all predicted positives tp + fp, i.e., the proportion of predictions that are real targets, and recall is the ratio of true positives tp to all actual positives tp + fn, i.e., the proportion of all targets that are detected; the average precision AP is the mean of the precisions obtained at all possible recall values:
$$p = \frac{tp}{tp + fp}, \qquad r = \frac{tp}{tp + fn}, \qquad AP = \int_0^1 p(r)\,dr$$
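A minimal all-point AP computation matching claim 9: detections are sorted by confidence and precision is integrated over recall (the function name and argument layout are assumptions for illustration):

```python
def average_precision(scores, is_true_positive, n_ground_truth):
    """Area under the precision-recall curve for one class."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])  # by confidence
    tp = fp = 0
    ap = prev_recall = 0.0
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)          # p = tp / (tp + fp)
        recall = tp / n_ground_truth        # r = tp / (tp + fn)
        ap += precision * (recall - prev_recall)  # accumulate area under p(r)
        prev_recall = recall
    return ap
```

A detector that ranks every true target above every false alarm attains AP = 1.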
10. A lightweight underwater target detection system based on degradation image enhancement, comprising:
an image processing module, configured to receive an image enhancement dataset UIE and an underwater target detection dataset URPC, preprocess both, and divide the preprocessed UIE and URPC into a training set, a validation set, and a test set;
an image enhancement module, configured to embed a pre-established underwater imaging model into a UWCNN-SD network, input the underwater images of the preprocessed UIE dataset into the UWCNN-SD network for training, train the network on pairs of original underwater images and reference images, store the trained weights, load the weights into the UWCNN-SD network, and input the URPC dataset into the trained UWCNN-SD network to obtain clear underwater images;
an image generation module, configured to input the clear underwater images and the training, validation, and test sets into a pre-established lightweight underwater target detection model, output images containing underwater target detection boxes, identify and label the targets inside the boxes, and compute the average precision.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310366420.0A CN116543295A (en) | 2023-04-07 | 2023-04-07 | Lightweight underwater target detection method and system based on degradation image enhancement |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116543295A true CN116543295A (en) | 2023-08-04 |
Family
ID=87449698
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310366420.0A Pending CN116543295A (en) | 2023-04-07 | 2023-04-07 | Lightweight underwater target detection method and system based on degradation image enhancement |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116543295A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116912675A (en) * | 2023-09-13 | 2023-10-20 | Jilin University | Underwater target detection method and system based on feature migration |
CN116912675B (en) * | 2023-09-13 | 2023-11-28 | Jilin University | Underwater target detection method and system based on feature migration |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110188765B (en) | Image semantic segmentation model generation method, device, equipment and storage medium | |
CN109086811B (en) | Multi-label image classification method and device and electronic equipment | |
CN113033537B (en) | Method, apparatus, device, medium and program product for training a model | |
CN112801169B (en) | Camouflage target detection method, system, device and storage medium based on improved YOLO algorithm | |
CN112634209A (en) | Product defect detection method and device | |
CN113569667B (en) | Inland ship target identification method and system based on lightweight neural network model | |
CN110390340B (en) | Feature coding model, training method and detection method of visual relation detection model | |
CN113469088B (en) | SAR image ship target detection method and system under passive interference scene | |
CN110135505B (en) | Image classification method and device, computer equipment and computer readable storage medium | |
CN111914843B (en) | Character detection method, system, equipment and storage medium | |
US20200065664A1 (en) | System and method of measuring the robustness of a deep neural network | |
CN113052834A (en) | Pipeline defect detection method based on convolution neural network multi-scale features | |
CN111539456B (en) | Target identification method and device | |
CN114612832A (en) | Real-time gesture detection method and device | |
CN116543295A (en) | Lightweight underwater target detection method and system based on degradation image enhancement | |
CN111368634B (en) | Human head detection method, system and storage medium based on neural network | |
CN115546171A (en) | Shadow detection method and device based on attention shadow boundary and feature correction | |
CN114429208A (en) | Model compression method, device, equipment and medium based on residual structure pruning | |
CN116310850B (en) | Remote sensing image target detection method based on improved RetinaNet | |
CN116311004B (en) | Video moving target detection method based on sparse optical flow extraction | |
CN112614108A (en) | Method and device for detecting nodules in thyroid ultrasound image based on deep learning | |
CN114998222A (en) | Automobile differential shell surface detection method, electronic equipment and medium | |
CN113076819A (en) | Fruit identification method and device under homochromatic background and fruit picking robot | |
CN112541915A (en) | Efficient cloth defect detection method, system and equipment for high-resolution images | |
CN111369508A (en) | Defect detection method and system for metal three-dimensional lattice structure |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||