CN114092793A - End-to-end biological target detection method suitable for complex underwater environment

End-to-end biological target detection method suitable for complex underwater environment

Info

Publication number
CN114092793A
CN114092793A (application CN202111342981.4A)
Authority
CN
China
Prior art keywords
underwater
network
image
convolution
output
Prior art date
Legal status
Granted
Application number
CN202111342981.4A
Other languages
Chinese (zh)
Other versions
CN114092793B (en)
Inventor
方笑海
章学挺
潘勉
于海滨
吕帅帅
彭时林
史剑光
Current Assignee
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date
Filing date
Publication date
Application filed by Hangzhou Dianzi University
Priority to CN202111342981.4A
Publication of CN114092793A
Application granted
Publication of CN114092793B
Status: Active


Classifications

    • G06F18/2415: Classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate (G: Physics; G06F: Electric digital data processing; G06F18/00: Pattern recognition)
    • G06N3/045: Combinations of networks (G06N: Computing arrangements based on specific computational models; G06N3/02: Neural networks; G06N3/04: Architecture, e.g. interconnection topology)
    • G06N3/047: Probabilistic or stochastic networks
    • G06N3/048: Activation functions
    • G06N3/08: Learning methods
    • G06T5/40: Image enhancement or restoration using histogram techniques (G06T: Image data processing or generation, in general)

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an end-to-end biological target detection method suitable for a complex underwater environment. The method comprises the following steps: S1, capture the underwater data set with an underwater robot, divide it into a training set and a test set, unify the size of the underwater images by up-sampling or down-sampling, and then normalize them; S2, select underwater images with poor imaging quality from the existing underwater data set and enhance them by a histogram equalization method to form the data set for the enhancement network; S3, train the underwater image enhancement network, taking the poor underwater images as the input of the enhancement network and the enhanced images as ground truth; S4, extract features of the network-enhanced underwater training images with a fully convolutional network, then perform target recognition and classification on the underwater feature maps with a one-stage detection network to obtain a trained model; S5, feed the processed underwater test set into the trained model for testing.

Description

End-to-end biological target detection method suitable for complex underwater environment
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to an end-to-end biological target detection method suitable for a complex underwater environment.
Background
With the development of computer vision and image processing technology, applying image processing methods to improve underwater image quality, so as to meet the requirements of the human visual system and of machine recognition, has gradually become a research hotspot. With the development of artificial intelligence, deep learning methods are gradually being applied to underwater target recognition. However, because underwater illumination is uneven, underwater images suffer from color distortion, underexposure and similar problems, and traditional deep learning target detection methods lack sufficient capacity to handle underwater targets. In underwater environments, enhancement of low-quality images is therefore essential for computer vision.
For underwater image enhancement, conventional image processing methods include color correction and contrast enhancement algorithms; white balance methods, gray-world theory and gray-edge theory are typical color correction approaches. However, the results these methods produce are not satisfactory for underwater vision, and the task of target detection in underwater images remains to be studied further.
Disclosure of Invention
In view of these technical problems, the invention provides an end-to-end biological target detection method suitable for a complex underwater environment: the data are preprocessed to improve the accuracy of the classifier, the underwater image is enhanced by a generative network, features are extracted by a deep neural network, and targets are finally recognized and classified.
In order to solve the technical problems, the invention adopts the following technical scheme:
S1, capturing an underwater data set with an underwater robot and dividing it into a training set and a test set, wherein the underwater data set comprises the underwater targets sea cucumber, sea urchin, scallop and starfish, 20% of the data set is used as the test set and 80% as the training set; the underwater images are unified in size by up-sampling or down-sampling and then normalized;
S2, selecting underwater images with poor imaging quality from the existing underwater data set and enhancing them by a histogram equalization method to form the data set for the enhancement network;
S3, training the underwater image enhancement network, taking the poor underwater images as the input of the enhancement network and the enhanced images as ground truth;
S4, extracting features of the network-enhanced underwater training images with a fully convolutional network, and then performing target recognition and classification on the underwater feature maps with a one-stage detection network to obtain a trained model;
S5, feeding the processed underwater test set into the trained model for testing.
Preferably, S1 further includes:
S101, let $x_i$ be an image pixel value, and let $\min(x_i)$ and $\max(x_i)$ be the minimum and maximum pixel values respectively. The normalized underwater image is:

$$\hat{x}_i = \frac{x_i - \min(x_i)}{\max(x_i) - \min(x_i)}$$
preferably, S2 further includes:
Ground truth for the poor images, obtained by histogram equalization, is used to train the enhancement network. First, count the number of pixels $n_k$ at each gray level k in the image, with k in the range [0, L-1]; the initial probability density of the image histogram is $p(r_k) = n_k / n$. The transformation function is then:

$$s_k = T(r_k) = \sum_{j=0}^{k} p(r_j) = \sum_{j=0}^{k} \frac{n_j}{n}$$

The equalized probability density $p(s_k)$ is obtained through the transformation function, which is applied to the actual images to obtain paired underwater data sets.
Preferably, S3 further includes:
S301, a generative adversarial network is used for image enhancement. The poor-quality underwater image X is input to the generator network; the convolution module of each layer comprises three processes: convolution, batch normalization and ReLU. The input X passes through N convolution kernels of kernel_size 3 × 3 to obtain the outputs $\{F_i\}_{i=1}^{N}$, where N is the total number of channels and i indexes the i-th channel. The extracted features are:

$$F_i = W_i \otimes X$$

where $W_i$ is the i-th convolution kernel and $\otimes$ denotes the convolution operation. There are 5 convolutional layers in total, and the output of the third convolutional layer is superimposed on the output of the fifth convolutional layer;
S302, the data output by the convolutional layers are processed further. To make the model converge easily and the network training more stable, batch normalization is added after the convolution, computing the mean and variance of the data in each batch. Suppose a mini-batch contains $N_m$ samples; the convolution outputs are $\{F_n\}_{n=1}^{N_m}$, where $F_n$ is the convolution output of the n-th sample. Within each mini-batch, the data in $\{F_n\}_{n=1}^{N_m}$ are batch-normalized to obtain $\{\hat{F}_n\}_{n=1}^{N_m}$, expressed as:

$$\hat{F}_n(k,l) = \alpha_k \frac{F_n(k,l) - E[F(k,\cdot)]}{\sqrt{\mathrm{Var}[F(k,\cdot)] + \epsilon}} + \beta_k$$

where $F_n(k,l)$ is the l-th element of the k-th channel of the convolutional output before batch normalization, $\hat{F}_n(k,l)$ is the batch-normalized value, $\alpha_k$ and $\beta_k$ are trainable parameters for the k-th channel, $\epsilon$ is a very small number that prevents the divisor from being 0, $E(\cdot)$ denotes the averaging operation and $\mathrm{Var}(\cdot)$ the variance operation;
S303, the activation function ReLU then non-linearly activates each element of $\hat{F}_n$ to obtain $\tilde{F}_n$; for an input $\hat{F}_n(k,l)$, the corresponding output after the ReLU is:

$$\tilde{F}_n(k,l) = \max\left(0,\, \hat{F}_n(k,l)\right)$$
S304, the image produced by the generator network is input to the adversarial network to judge whether the generator's output achieves the desired enhancement; the discriminator consists of 3 simple convolutional layers, each again comprising convolution, batch normalization and ReLU;
S305, to ensure the result performs well both visually and quantitatively, the loss function consists of two parts, the adversarial loss $L_1$ and the feature loss $L_2$. The adversarial loss drives the generator to produce better output. Let D denote the discriminator network, and let $x_r$ and $x_f$ be samples from the real distribution and the generated (fake) distribution respectively; the adversarial loss is:

$$L_1 = \mathbb{E}_{x_r}\left[\log D(x_r)\right] + \mathbb{E}_{x_f}\left[\log\left(1 - D(x_f)\right)\right]$$

The feature loss is the Euclidean distance between the features extracted by the convolutional layers of VGG16 from the input image and from the generated image respectively, and it reduces the instability of the generator network. Let $I_L$ denote the color-cast input, $G(I_L)$ the output of the generator network, $\phi_i$ the feature map obtained from the feature extraction network, with i indexing its i-th pooled feature map, and $W_i, H_i$ the dimensions of the extracted feature map. The feature loss is:

$$L_2 = \sum_i \frac{1}{W_i H_i} \left\| \phi_i(I_L) - \phi_i(G(I_L)) \right\|_2^2$$
preferably, S4 further includes:
S401, the feature extraction layer uses a ResNet50 module. The ResNet50 structure first performs a convolution operation on the input and then contains 4 residual blocks, for a total of 50 convolution operations. Each residual block has a skip connection that alleviates gradient vanishing or explosion. If the input of the residual block is X and the output of the residual branch is H(X), the block output is:

Y = H(X) + X
S402, low-level features carry little semantic information but locate targets accurately, while high-level features are semantically rich but locate targets only coarsely. A top-down path is first adopted to propagate the strong high-level semantic features; a bottom-up path is then added to supplement the feature maps and propagate the strong low-level localization features.
Preferably, S5 further includes:
S501, the detection module mainly comprises two sub-networks, a classification sub-network and a bounding-box regression sub-network. For each anchor, the classification sub-network predicts at every spatial position the probability that a target exists and its class probabilities. The sub-network is a simple fully convolutional module composed of four convolutional layers, and its parameters are shared across all feature maps of different scales; classification is finally performed with a sigmoid. Let $p_i$ be the network's predicted probability that the current i-th anchor is a target and $\hat{p}_i$ the probability that the i-th anchor is labeled as a target; the classification loss function is:

$$L_{cls} = -\sum_i \left[\hat{p}_i \log p_i + (1 - \hat{p}_i)\log(1 - p_i)\right]$$
S502, in parallel with the target classification sub-network, another simple fully convolutional network regresses the offset of each anchor box toward the nearby ground truth. The target classification sub-network and the bounding-box regression sub-network have the same structure but do not share parameters. Let $t_i$ be the predicted offset of the box relative to the i-th positive anchor and $\hat{t}_i$ the offset of the ground truth relative to that anchor; the bounding-box regression loss is:

$$L_{reg} = \sum_i \mathrm{smooth}_{L1}\left(t_i - \hat{t}_i\right)$$
drawings
Fig. 1 is a flowchart of the overall algorithm steps of an end-to-end biological target detection method suitable for a complex underwater environment according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a flow chart of steps of an end-to-end biological target detection method suitable for a complex underwater environment according to an embodiment of the present invention is shown, which includes the following steps:
S1, capturing an underwater data set with an underwater robot and dividing it into a training set and a test set, wherein the underwater data set comprises the underwater targets sea cucumber, sea urchin, scallop and starfish, 20% of the data set is used as the test set and 80% as the training set; because the underwater images differ in size they cannot be fed directly into a network, so their sizes are unified by up-sampling or down-sampling and they are then normalized.
S2, since paired underwater data sets for training an image enhancement network cannot be obtained in actual engineering, underwater images with poor imaging quality are selected from the existing underwater data set and enhanced by a histogram equalization method to form the data set for the enhancement network;
S3, training the underwater image enhancement network, taking the poor underwater images as the input of the enhancement network and the enhanced images as ground truth;
S4, extracting features of the network-enhanced underwater training images with a fully convolutional network, and then performing target recognition and classification on the underwater feature maps with a one-stage detection network to obtain a trained model;
S5, feeding the processed underwater test set into the trained model for testing.
In a specific application example, S1 further includes:
The images are normalized. The goal of normalization is to find a mapping under which features of different dimensions become numerically comparable, which can greatly improve the accuracy of the classifier. Let $x_i$ be an image pixel value, and let $\min(x_i)$ and $\max(x_i)$ be the minimum and maximum pixel values respectively. The normalized underwater image is:

$$\hat{x}_i = \frac{x_i - \min(x_i)}{\max(x_i) - \min(x_i)}$$
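As an illustration, a minimal sketch of this preprocessing (resizing to a common size, then min-max normalization); the 512 × 512 target size and OpenCV resizing are assumptions of the example, not values fixed by the method:

```python
import cv2
import numpy as np

def preprocess(img: np.ndarray, size: int = 512) -> np.ndarray:
    """S1: unify image size by up-/down-sampling, then min-max normalize."""
    img = cv2.resize(img, (size, size), interpolation=cv2.INTER_LINEAR)
    x = img.astype(np.float32)
    # x_hat = (x - min(x)) / (max(x) - min(x)); the epsilon guards a constant image
    return (x - x.min()) / (x.max() - x.min() + 1e-8)
```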
in a specific application example, S2 further includes:
Paired underwater data sets cannot be obtained in actual engineering, so ground truth produced from the poor images by histogram equalization is used to train the enhancement network. First, count the number of pixels $n_k$ at each gray level k in the image, with k in the range [0, L-1]; the initial probability density of the image histogram is $p(r_k) = n_k / n$. The transformation function is then:

$$s_k = T(r_k) = \sum_{j=0}^{k} p(r_j) = \sum_{j=0}^{k} \frac{n_j}{n}$$

The equalized probability density $p(s_k)$ is obtained through the transformation function, which is applied to the actual images to obtain paired underwater data sets.
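For illustration, a sketch of the transformation above for an 8-bit grayscale image (L = 256); applying it channel-wise to color images is an assumption of the example:

```python
import numpy as np

def equalize_hist(gray: np.ndarray, L: int = 256) -> np.ndarray:
    """Histogram equalization: s_k = T(r_k) = sum_{j<=k} n_j / n."""
    n_k, _ = np.histogram(gray, bins=L, range=(0, L))  # pixels per gray level, n_k
    p_r = n_k / gray.size                              # initial density p(r_k)
    cdf = np.cumsum(p_r)                               # transformation function T(r_k)
    lut = np.round((L - 1) * cdf).astype(np.uint8)     # map old levels to equalized ones
    return lut[gray]                                   # gray must be uint8
```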
In a specific application example, S3 further includes:
S301, image enhancement is performed with a generative adversarial network. The poor-quality underwater image X is input to the generator network; the convolution module of each layer comprises three processes: convolution, batch normalization and ReLU. The input X passes through N convolution kernels of kernel_size 3 × 3 to obtain the outputs $\{F_i\}_{i=1}^{N}$, where N is the total number of channels and i indexes the i-th channel. The extracted features are:

$$F_i = W_i \otimes X$$

where $W_i$ is the i-th convolution kernel and $\otimes$ denotes the convolution operation. There are 5 convolutional layers in total, and the output of the third convolutional layer is superimposed on the output of the fifth convolutional layer;
S302, the data output by the convolutional layers are processed further. To make the model converge easily and the network training more stable, batch normalization is added after the convolution, computing the mean and variance of the data in each batch. Suppose a mini-batch contains $N_m$ samples; the convolution outputs are $\{F_n\}_{n=1}^{N_m}$, where $F_n$ is the convolution output of the n-th sample. Within each mini-batch, the data in $\{F_n\}_{n=1}^{N_m}$ are batch-normalized to obtain $\{\hat{F}_n\}_{n=1}^{N_m}$, expressed as:

$$\hat{F}_n(k,l) = \alpha_k \frac{F_n(k,l) - E[F(k,\cdot)]}{\sqrt{\mathrm{Var}[F(k,\cdot)] + \epsilon}} + \beta_k$$

where $F_n(k,l)$ is the l-th element of the k-th channel of the convolutional output before batch normalization, $\hat{F}_n(k,l)$ is the batch-normalized value, $\alpha_k$ and $\beta_k$ are trainable parameters for the k-th channel, $\epsilon$ is a very small number that prevents the divisor from being 0, $E(\cdot)$ denotes the averaging operation and $\mathrm{Var}(\cdot)$ the variance operation;
S303, the activation function ReLU then non-linearly activates each element of $\hat{F}_n$ to obtain $\tilde{F}_n$; for an input $\hat{F}_n(k,l)$, the corresponding output after the ReLU is:

$$\tilde{F}_n(k,l) = \max\left(0,\, \hat{F}_n(k,l)\right)$$
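A minimal PyTorch sketch of the generator stages in S301 to S303: five Conv-BN-ReLU layers with the stated skip connection adding the third layer's output to the fifth. The channel width of 64 and the final RGB projection layer are assumptions of the sketch, not specified by the method:

```python
import torch
import torch.nn as nn

def conv_bn_relu(c_in: int, c_out: int) -> nn.Sequential:
    """One generator stage: 3x3 convolution -> batch normalization -> ReLU."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
        nn.BatchNorm2d(c_out),  # per-channel trainable alpha_k (weight) and beta_k (bias)
        nn.ReLU(inplace=True),
    )

class Generator(nn.Module):
    """Five Conv-BN-ReLU layers; layer-3 output is superimposed on layer-5 output."""
    def __init__(self, width: int = 64):
        super().__init__()
        self.l1 = conv_bn_relu(3, width)
        self.l2 = conv_bn_relu(width, width)
        self.l3 = conv_bn_relu(width, width)
        self.l4 = conv_bn_relu(width, width)
        self.l5 = conv_bn_relu(width, width)
        self.out = nn.Conv2d(width, 3, kernel_size=3, padding=1)  # back to an RGB image

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f3 = self.l3(self.l2(self.l1(x)))
        f5 = self.l5(self.l4(f3))
        return self.out(f3 + f5)  # skip connection: superimpose layers 3 and 5
```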
S304, the image produced by the generator network is input to the adversarial network to judge whether the generator's output achieves the desired enhancement. The discriminator consists of 3 simple convolutional layers, each again comprising convolution, batch normalization and ReLU.
S305, to ensure the result performs well both visually and quantitatively, the loss function consists of two parts, the adversarial loss $L_1$ and the feature loss $L_2$. The adversarial loss drives the generator to produce better output. Let D denote the discriminator network, and let $x_r$ and $x_f$ be samples from the real distribution and the generated (fake) distribution respectively. The adversarial loss is then:

$$L_1 = \mathbb{E}_{x_r}\left[\log D(x_r)\right] + \mathbb{E}_{x_f}\left[\log\left(1 - D(x_f)\right)\right]$$

The feature loss is the Euclidean distance between the features extracted by the convolutional layers of VGG16 from the input image and from the generated image respectively, and it reduces the instability of the generator network. Let $I_L$ denote the color-cast input, $G(I_L)$ the output of the generator network, $\phi_i$ the feature map obtained from the feature extraction network, with i indexing its i-th pooled feature map, and $W_i, H_i$ the dimensions of the extracted feature map. The feature loss is then:

$$L_2 = \sum_i \frac{1}{W_i H_i} \left\| \phi_i(I_L) - \phi_i(G(I_L)) \right\|_2^2$$
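A sketch of the two-part loss in S305, assuming the standard GAN formulation for $L_1$ (written as binary cross-entropy on the discriminator logits) and a single mid-level feature map from torchvision's pretrained VGG16 for $L_2$; the layer cut-off is an assumption of the example:

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

# Frozen VGG16 feature extractor phi (first conv blocks only; cut-off is illustrative)
phi = vgg16(weights="IMAGENET1K_V1").features[:16].eval()
for p in phi.parameters():
    p.requires_grad_(False)

def adversarial_loss(d_real: torch.Tensor, d_fake: torch.Tensor) -> torch.Tensor:
    """Discriminator objective: minimizing this BCE is equivalent to maximizing
    E[log D(x_r)] + E[log(1 - D(x_f))] from the L1 term above."""
    return (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
            + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))

def feature_loss(i_l: torch.Tensor, g_out: torch.Tensor) -> torch.Tensor:
    """L2: squared Euclidean distance between VGG16 features of the color-cast
    input I_L and the generated image G(I_L), normalized by W_i * H_i."""
    f_in, f_out = phi(i_l), phi(g_out)
    _, _, h, w = f_out.shape
    return torch.sum((f_in - f_out) ** 2) / (w * h)
```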
in a specific application example, S4 further includes:
S401, the feature extraction layer uses a ResNet50 module. The ResNet50 structure first performs a convolution operation on the input and then contains 4 residual blocks, for a total of 50 convolution operations. Each residual block has a skip connection that alleviates gradient vanishing or explosion. Assuming the input to the residual block is X and the output of the residual branch is H(X), the block output is:

Y = H(X) + X
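A sketch of the skip connection Y = H(X) + X; this simplified block uses two 3 × 3 convolutions for H, whereas ResNet50 proper uses bottleneck blocks (1 × 1, 3 × 3, 1 × 1), so the body shown is an illustrative assumption:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Y = H(X) + X: the identity path lets gradients bypass H, easing vanishing/explosion."""
    def __init__(self, channels: int):
        super().__init__()
        self.h = nn.Sequential(  # residual branch H
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.relu(self.h(x) + x)
```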
S402, low-level features carry little semantic information but locate targets accurately, while high-level features are semantically rich but locate targets only coarsely. A top-down path is first adopted to propagate the strong high-level semantic features; a bottom-up path is then added to supplement the feature maps and propagate the strong low-level localization features.
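A sketch of the top-down plus bottom-up fusion described in S402, in the style of a feature pyramid with an added bottom-up path; the C3/C4/C5 channel counts (512/1024/2048, typical of ResNet50 stages) and the 256-channel pyramid width are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoWayPyramid(nn.Module):
    """Top-down path spreads high-level semantics; bottom-up path spreads low-level localization."""
    def __init__(self, in_channels=(512, 1024, 2048), width: int = 256):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, width, 1) for c in in_channels)
        self.down = nn.ModuleList(nn.Conv2d(width, width, 3, stride=2, padding=1)
                                  for _ in in_channels[:-1])

    def forward(self, c3, c4, c5):
        # Top-down: upsample higher levels and add them into lower ones.
        p5 = self.lateral[2](c5)
        p4 = self.lateral[1](c4) + F.interpolate(p5, scale_factor=2, mode="nearest")
        p3 = self.lateral[0](c3) + F.interpolate(p4, scale_factor=2, mode="nearest")
        # Bottom-up: downsample lower levels and add them back into higher ones.
        n3 = p3
        n4 = p4 + self.down[0](n3)
        n5 = p5 + self.down[1](n4)
        return n3, n4, n5
```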
In a specific application example, S5 further includes:
S501, the detection module mainly comprises two sub-networks: a classification sub-network and a bounding-box regression sub-network. For each anchor, the classification sub-network predicts at every spatial position the probability that a target exists and its class probabilities. The sub-network is a simple fully convolutional module composed of four convolutional layers, and its parameters are shared across all feature maps of different scales. Classification is finally performed with a sigmoid. Let $p_i$ be the network's predicted probability that the current i-th anchor is a target and $\hat{p}_i$ the probability that the i-th anchor is labeled as a target; the classification loss function is:

$$L_{cls} = -\sum_i \left[\hat{p}_i \log p_i + (1 - \hat{p}_i)\log(1 - p_i)\right]$$
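A sketch of the classification sub-network and loss: four shared 3 × 3 convolutional layers followed by a sigmoid-activated prediction layer, with binary cross-entropy as written above. The anchor count of 9 per position is an assumption of the sketch; the 4 classes follow the sea cucumber, sea urchin, scallop and starfish targets of S1:

```python
import torch
import torch.nn as nn

def make_cls_subnet(width: int = 256, num_anchors: int = 9, num_classes: int = 4) -> nn.Sequential:
    """Fully convolutional classification head, shared across all pyramid levels."""
    layers = []
    for _ in range(4):  # four fully convolutional layers
        layers += [nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True)]
    layers.append(nn.Conv2d(width, num_anchors * num_classes, 3, padding=1))  # logits
    return nn.Sequential(*layers)

# L_cls = -sum_i [p_hat_i log p_i + (1 - p_hat_i) log(1 - p_i)]; the sigmoid is
# folded into BCEWithLogitsLoss for numerical stability.
cls_loss = nn.BCEWithLogitsLoss(reduction="sum")
```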
S502, in parallel with the target classification sub-network, another simple fully convolutional network regresses the offset of each anchor box toward the nearby ground truth. The target classification sub-network and the bounding-box regression sub-network have the same structure but do not share parameters. Let $t_i$ be the predicted offset of the box relative to the i-th positive anchor and $\hat{t}_i$ the offset of the ground truth relative to that anchor; the bounding-box regression loss is:

$$L_{reg} = \sum_i \mathrm{smooth}_{L1}\left(t_i - \hat{t}_i\right)$$
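A sketch of the regression term, assuming the common smooth L1 form, smooth_L1(x) = 0.5x² for |x| < 1 and |x| - 0.5 otherwise, applied only to positive anchors:

```python
import torch
import torch.nn.functional as F

def box_regression_loss(t_pred: torch.Tensor, t_true: torch.Tensor,
                        positive: torch.Tensor) -> torch.Tensor:
    """L_reg = sum_i smooth_L1(t_i - t_hat_i) over positive anchors.

    t_pred, t_true: (num_anchors, 4) offsets; positive: boolean mask of positive anchors.
    """
    return F.smooth_l1_loss(t_pred[positive], t_true[positive], reduction="sum")
```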
it is to be understood that the exemplary embodiments described herein are illustrative and not restrictive. Although one or more embodiments of the present invention have been described with reference to the accompanying drawings, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims.

Claims (6)

1. An end-to-end biological target detection method suitable for a complex underwater environment is characterized by comprising the following steps:
S1, capturing an underwater data set with an underwater robot and dividing it into a training set and a test set, wherein the underwater data set comprises the underwater targets sea cucumber, sea urchin, scallop and starfish, 20% of the data set is used as the test set and 80% as the training set; the underwater images are unified in size by up-sampling or down-sampling and then normalized;
S2, selecting underwater images with poor imaging quality from the existing underwater data set and enhancing them by a histogram equalization method to form the data set for the enhancement network;
S3, training the underwater image enhancement network, taking the poor underwater images as the input of the enhancement network and the enhanced images as ground truth;
S4, extracting features of the network-enhanced underwater training images with a fully convolutional network, and then performing target recognition and classification on the underwater feature maps with a one-stage detection network to obtain a trained model;
S5, feeding the processed underwater test set into the trained model for testing.
2. The method for end-to-end biological target detection applicable to complex underwater environments of claim 1, wherein S1 further comprises:
Let $x_i$ be an image pixel value, and let $\min(x_i)$ and $\max(x_i)$ be the minimum and maximum pixel values respectively; the normalized underwater image is:

$$\hat{x}_i = \frac{x_i - \min(x_i)}{\max(x_i) - \min(x_i)}$$
3. the method for end-to-end biological target detection applicable to complex underwater environments of claim 1, wherein S2 further comprises:
Counting the number of pixels $n_k$ at each gray level k in the image, with k in the range [0, L-1]; the initial probability density of the image histogram is $p(r_k) = n_k / n$, and the transformation function is:

$$s_k = T(r_k) = \sum_{j=0}^{k} p(r_j) = \sum_{j=0}^{k} \frac{n_j}{n}$$

The equalized probability density $p(s_k)$ is obtained through the transformation function, which is applied to the actual images to obtain paired underwater data sets.
4. The method for end-to-end biological target detection in a complex underwater environment as claimed in claim 1, wherein S3 further comprises:
S301, using a generative adversarial network for image enhancement, the poor-quality underwater image X is input to the generator network; the convolution module of each layer comprises three processes, convolution, batch normalization and ReLU, and the input X passes through N convolution kernels of kernel_size 3 × 3 to obtain the outputs $\{F_i\}_{i=1}^{N}$, where N is the total number of channels and i indexes the i-th channel; the extracted features are:

$$F_i = W_i \otimes X$$

where $W_i$ is the i-th convolution kernel and $\otimes$ denotes the convolution operation; there are 5 convolutional layers in total, and the output of the third convolutional layer is superimposed on the output of the fifth convolutional layer;
S302, further processing the data output by the convolutional layers: to make the model converge easily and the network training more stable, batch normalization is added after the convolution, computing the mean and variance of the data in each batch; supposing a mini-batch contains $N_m$ samples, the convolution outputs are $\{F_n\}_{n=1}^{N_m}$, where $F_n$ is the convolution output of the n-th sample, and within each mini-batch the data in $\{F_n\}_{n=1}^{N_m}$ are batch-normalized to obtain $\{\hat{F}_n\}_{n=1}^{N_m}$, expressed as:

$$\hat{F}_n(k,l) = \alpha_k \frac{F_n(k,l) - E[F(k,\cdot)]}{\sqrt{\mathrm{Var}[F(k,\cdot)] + \epsilon}} + \beta_k$$

where $F_n(k,l)$ is the l-th element of the k-th channel of the convolutional output before batch normalization, $\hat{F}_n(k,l)$ is the batch-normalized value, $\alpha_k$ and $\beta_k$ are trainable parameters for the k-th channel, $\epsilon$ is a very small number that prevents the divisor from being 0, $E(\cdot)$ denotes the averaging operation and $\mathrm{Var}(\cdot)$ the variance operation;
S303, the activation function ReLU then non-linearly activates each element of $\hat{F}_n$ to obtain $\tilde{F}_n$; for an input $\hat{F}_n(k,l)$, the corresponding output after the ReLU is:

$$\tilde{F}_n(k,l) = \max\left(0,\, \hat{F}_n(k,l)\right)$$
S304, inputting the image produced by the generator network into the adversarial network to judge whether the generator's output achieves the desired enhancement, wherein the discriminator consists of 3 simple convolutional layers, each again comprising convolution, batch normalization and ReLU;
S305, to ensure the result performs well both visually and quantitatively, the loss function consists of two parts, the adversarial loss $L_1$ and the feature loss $L_2$; the adversarial loss drives the generator to produce better output; letting D denote the discriminator network and $x_r$ and $x_f$ be samples from the real distribution and the generated (fake) distribution respectively, the adversarial loss is:

$$L_1 = \mathbb{E}_{x_r}\left[\log D(x_r)\right] + \mathbb{E}_{x_f}\left[\log\left(1 - D(x_f)\right)\right]$$

The feature loss is the Euclidean distance between the features extracted by the convolutional layers of VGG16 from the input image and from the generated image respectively; letting $I_L$ denote the color-cast input, $G(I_L)$ the output of the generator network, $\phi_i$ the feature map obtained from the feature extraction network, with i indexing its i-th pooled feature map, and $W_i, H_i$ the dimensions of the extracted feature map, the feature loss is:

$$L_2 = \sum_i \frac{1}{W_i H_i} \left\| \phi_i(I_L) - \phi_i(G(I_L)) \right\|_2^2$$
5. the method for end-to-end biological target detection in a complex underwater environment as claimed in claim 1, wherein S4 further comprises:
S401, the feature extraction layer uses a ResNet50 module; the ResNet50 structure first performs a convolution operation on the input and then contains 4 residual blocks, for a total of 50 convolution operations, each residual block having a skip connection that alleviates gradient vanishing or explosion; if the input of the residual block is X and the output of the residual branch is H(X), the block output is:

Y = H(X) + X
S402, low-level features carry little semantic information but locate targets accurately, while high-level features are semantically rich but locate targets only coarsely; a top-down path is first adopted to propagate the strong high-level semantic features, and a bottom-up path is then added to supplement the feature maps and propagate the strong low-level localization features.
6. The method for end-to-end biological target detection in a complex underwater environment as claimed in claim 1, wherein S5 further comprises:
S501, the detection module mainly comprises two sub-networks, a classification sub-network and a bounding-box regression sub-network; the classification sub-network is a simple fully convolutional module composed of four convolutional layers, its parameters are shared across all feature maps of different scales, and classification is finally performed with a sigmoid; letting $p_i$ be the network's predicted probability that the current i-th anchor is a target and $\hat{p}_i$ the probability that the i-th anchor is labeled as a target, the classification loss function is:

$$L_{cls} = -\sum_i \left[\hat{p}_i \log p_i + (1 - \hat{p}_i)\log(1 - p_i)\right]$$
S502, in parallel with the target classification sub-network, another simple fully convolutional network regresses the offset of each anchor box toward the nearby ground truth; the target classification sub-network and the bounding-box regression sub-network have the same structure but do not share parameters; letting $t_i$ be the predicted offset of the box relative to the i-th positive anchor and $\hat{t}_i$ the offset of the ground truth relative to that anchor, the bounding-box regression loss is:

$$L_{reg} = \sum_i \mathrm{smooth}_{L1}\left(t_i - \hat{t}_i\right)$$
CN202111342981.4A 2021-11-12 2021-11-12 End-to-end biological target detection method suitable for complex underwater environment Active CN114092793B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111342981.4A CN114092793B (en) 2021-11-12 2021-11-12 End-to-end biological target detection method suitable for complex underwater environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111342981.4A CN114092793B (en) 2021-11-12 2021-11-12 End-to-end biological target detection method suitable for complex underwater environment

Publications (2)

Publication Number Publication Date
CN114092793A 2022-02-25
CN114092793B CN114092793B (en) 2024-05-17

Family

Family ID: 80300549

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111342981.4A Active CN114092793B (en) 2021-11-12 2021-11-12 End-to-end biological target detection method suitable for complex underwater environment

Country Status (1)

Country Link
CN (1) CN114092793B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109543585A (en) * 2018-11-16 2019-03-29 西北工业大学 Underwater optics object detection and recognition method based on convolutional neural networks
CN111209952A (en) * 2020-01-03 2020-05-29 西安工业大学 Underwater target detection method based on improved SSD and transfer learning
CN111723823A (en) * 2020-06-24 2020-09-29 河南科技学院 Underwater target detection method based on third-party transfer learning
CN112417980A (en) * 2020-10-27 2021-02-26 南京邮电大学 Single-stage underwater biological target detection method based on feature enhancement and refinement
CN112767279A (en) * 2021-02-01 2021-05-07 福州大学 Underwater image enhancement method for generating countermeasure network based on discrete wavelet integration

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XU YAN; SUN MEISHUANG: "Underwater image enhancement method based on convolutional neural networks", Journal of Jilin University (Engineering and Technology Edition), no. 06, 26 March 2018 (2018-03-26) *
JIA ZHENQING; LIU XUEFENG: "Marine animal target detection based on YOLO and image enhancement", Electronic Measurement Technology, no. 14, 23 July 2020 (2020-07-23) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114387190A (en) * 2022-03-23 2022-04-22 山东省计算中心(国家超级计算济南中心) Adaptive image enhancement method and system based on complex environment
CN115880574A (en) * 2023-03-02 2023-03-31 吉林大学 Underwater optical image lightweight target identification method, equipment and medium
CN115984269A (en) * 2023-03-20 2023-04-18 湖南长理尚洋科技有限公司 Non-invasive local water ecological safety detection method and system

Also Published As

Publication number Publication date
CN114092793B (en) 2024-05-17

Similar Documents

Publication Publication Date Title
CN111639692B (en) Shadow detection method based on attention mechanism
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN109583342B (en) Human face living body detection method based on transfer learning
CN107133943B (en) A kind of visible detection method of stockbridge damper defects detection
CN114092793A (en) End-to-end biological target detection method suitable for complex underwater environment
CN111832443B (en) Construction method and application of construction violation detection model
CN109376591B (en) Ship target detection method for deep learning feature and visual feature combined training
CN110796009A (en) Method and system for detecting marine vessel based on multi-scale convolution neural network model
CN109685765B (en) X-ray film pneumonia result prediction device based on convolutional neural network
CN112818969A (en) Knowledge distillation-based face pose estimation method and system
CN113361645B (en) Target detection model construction method and system based on meta learning and knowledge memory
CN115035371B (en) Well wall crack identification method based on multi-scale feature fusion neural network
CN111242026A (en) Remote sensing image target detection method based on spatial hierarchy perception module and metric learning
CN116452810A (en) Multi-level semantic segmentation method and device, electronic equipment and storage medium
CN115223032A (en) Aquatic organism identification and matching method based on image processing and neural network fusion
CN113627240B (en) Unmanned aerial vehicle tree species identification method based on improved SSD learning model
CN116844114A (en) Helmet detection method and device based on YOLOv7-WFD model
CN114821356B (en) Optical remote sensing target detection method for accurate positioning
Meng et al. A Novel Steganography Algorithm Based on Instance Segmentation.
CN111950586B (en) Target detection method for introducing bidirectional attention
CN114581769A (en) Method for identifying houses under construction based on unsupervised clustering
CN114463628A (en) Deep learning remote sensing image ship target identification method based on threshold value constraint
CN117809169B (en) Small-sample underwater sonar image classification method and model building method thereof
CN111476129A (en) Soil impurity detection method based on deep learning
Kaur et al. Deep learning with invariant feature based species classification in underwater environments

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant