CN114092793A - End-to-end biological target detection method suitable for complex underwater environment - Google Patents
- Publication number
- CN114092793A (application CN202111342981.4A)
- Authority
- CN
- China
- Prior art keywords
- underwater
- network
- image
- convolution
- output
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F18/2415 — Pattern recognition; classification techniques based on parametric or probabilistic models
- G06N3/045 — Neural networks; combinations of networks
- G06N3/047 — Neural networks; probabilistic or stochastic networks
- G06N3/048 — Neural networks; activation functions
- G06N3/08 — Neural networks; learning methods
- G06T5/40 — Image enhancement or restoration using histogram techniques
Abstract
The invention discloses an end-to-end biological target detection method suitable for a complex underwater environment. The method comprises the following steps: S1, capturing the underwater data set with an underwater robot, dividing it into a training set and a testing set, unifying the size of the underwater images through up-sampling or down-sampling, and then normalizing; S2, selecting underwater images with poor imaging quality from the existing underwater data set and enhancing them by histogram equalization to form the data set for the enhancement network; S3, training the underwater image enhancement network, taking the poor underwater images as the input of the enhancement network and the enhanced images as ground truth; S4, extracting features from the network-enhanced underwater training images with a fully convolutional network, then performing target recognition and classification on the underwater image feature maps with a one-stage detection network to obtain a trained model; and S5, feeding the processed underwater test set into the trained model for testing.
Description
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to an end-to-end biological target detection method suitable for a complex underwater environment.
Background
With the development of computer vision and image processing technology, the application of image processing methods to improve underwater image quality to meet the requirements of human vision system and machine recognition has gradually become a hotspot. With the development of artificial intelligence, the deep learning method is gradually applied to underwater target recognition. However, due to the influence of uneven underwater light, the underwater image has the problems of color distortion, underexposure and the like, and the traditional deep learning target detection method lacks sufficient capacity for processing the underwater target. In underwater environments, enhancement of low quality images is essential for computer vision.
For underwater image enhancement, conventional image processing approaches include color correction algorithms and contrast enhancement algorithms; white balance methods, gray world theory and gray edge theory are typical color correction methods. However, the processing results of these methods are not satisfactory for underwater vision, and the task of target detection in underwater images remains to be further studied.
Disclosure of Invention
In view of the above technical problems, the invention provides an end-to-end biological target detection method suitable for a complex underwater environment, which preprocesses the data to improve the accuracy of the classifier, enhances the underwater image through a generative network, extracts features through a deep neural network, and finally performs recognition and classification.
In order to solve the technical problems, the invention adopts the following technical scheme:
S1, capturing an underwater data set with an underwater robot and dividing it into a training set and a testing set, wherein the underwater data set contains the underwater targets sea cucumber, sea urchin, scallop and starfish; 20% of the data set is used as the testing set and 80% as the training set; the underwater images are unified in size through up-sampling or down-sampling, and then normalization is carried out;
s2, selecting an underwater image with poor imaging quality from the existing underwater data set, and enhancing the image by a histogram equalization method to form a data set of an enhanced network;
s3, training the underwater image enhancement network by taking the poor underwater image as the input of the enhancement network and the enhanced image as a true value;
s4, extracting the features of the underwater training set image after network enhancement by using a full convolution network, and then performing target recognition and classification on the feature map of the underwater image by using a one-stage detection network to obtain a trained model;
and S5, sending the processed underwater test set into the trained model for testing.
Preferably, S1 further includes:
S101, let x_i be the value of an image pixel, and let min(x_i) and max(x_i) denote the minimum and maximum pixel values of the image, respectively. The normalized underwater image is:

x̂_i = (x_i − min(x_i)) / (max(x_i) − min(x_i))
preferably, S2 further includes:
The ground-truth counterparts of the poor-quality images, obtained by histogram equalization, are used to train the enhancement network. First, the number of pixels n_k of each gray level k in the image is counted, with k in the range [0, L−1]; the initial probability density function of the image histogram is p(r_k). The transformation function is then:

s_k = T(r_k) = Σ_{j=0}^{k} p(r_j)
The equalized probability density function p(s_k) is obtained through the transformation function, which is then applied to real images to obtain paired underwater data sets.
Preferably, S3 further includes:
S301, performing image enhancement with a generative adversarial network: the poor-quality underwater image X is input into the generator network, each layer of which is a convolution module comprising three operations, convolution, batch normalization and ReLU. The input X is passed through N convolution kernels with kernel_size 3 × 3 to obtain outputs F_i, where N is the total number of channels and i indexes the i-th channel. The extracted features are:

F_i = W_i ⊗ X + b_i, i = 1, …, N

where ⊗ denotes the convolution operation; there are 5 convolutional layers in total, and the output of the third convolutional layer is superimposed on the output of the fifth convolutional layer;
S302, the data after the convolutional layer needs further processing. To make the model easy to converge and the training process more stable, batch normalization is added after each convolution, using the mean and variance of the data in each batch. Assume a mini-batch contains N_m samples and define the output as F = {F_1, …, F_{N_m}}, where F_n is the convolution output of the n-th sample. Within each mini-batch, the data in F are batch-normalized to obtain F̂_n(k, l), expressed as:

F̂_n(k, l) = α_k · (F_n(k, l) − E[F(k, l)]) / √(Var[F(k, l)] + ε) + β_k

where F_n(k, l) is the l-th element of the k-th channel of the convolution output of sample n before batch normalization, F̂_n(k, l) is the batch-normalized data, α_k and β_k are the trainable parameters of the k-th channel, ε is a very small number preventing division by zero, E(·) is the averaging operation, and Var(·) is the variance operation;
S303, the activation function ReLU is then applied element-wise to F̂_n to obtain the non-linearly activated output F̃_n. For an input F̂_n(k, l), the corresponding output after the ReLU is:

F̃_n(k, l) = max(0, F̂_n(k, l))
S304, the image produced by the generator network is input into the adversarial (discriminator) network to judge whether the generator output achieves the enhancement goal; the discriminator consists of 3 simple convolutional layers, each again comprising convolution, batch normalization and ReLU;
S305, to ensure that the result performs well both visually and quantitatively, the loss function consists of two parts: an adversarial loss L_1 and a feature loss L_2. The adversarial loss drives the generator towards better-performing outputs. Let D denote the discriminator network, and let x_r and x_f be samples from the true and generated (fake) distributions, respectively; the adversarial loss is:

L_1 = E_{x_r}[log D(x_r)] + E_{x_f}[log(1 − D(x_f))]
The feature loss is the Euclidean distance between the feature data extracted when the input image and the generated image are each passed through the convolutional layers of VGG16, and it reduces the instability of the generator network. Let I_L denote the color-cast input, G(I_L) the output of the generator network, φ_i the feature map obtained from the feature extraction network (i indexing its i-th pooled feature map), and W_i, H_i the dimensions of the extracted feature map. The feature loss is:

L_2 = Σ_i (1 / (W_i H_i)) ‖φ_i(I_L) − φ_i(G(I_L))‖²_2
preferably, S4 further includes:
S401, the feature extraction layer uses a Resnet50 module. The Resnet50 network first performs a convolution operation on the input and then contains 4 residual blocks, for 50 convolution operations in total; each residual block has a skip connection to alleviate gradient vanishing or explosion. If the input of the residual block is X and the residual mapping learned by the block is H, the output is:
Y=H(X)+X
S402, low-level features carry less semantic information but locate the target accurately, whereas high-level features are semantically rich but locate the target only coarsely. A top-down path is therefore adopted first to propagate the strong high-level semantic features, and a bottom-up path is then added to supplement the feature maps and propagate the strong low-level localization features.
Preferably, S5 further includes:
S501, the detection module mainly comprises two sub-networks, a classification sub-network and a bounding-box regression sub-network. For each anchor, the classification sub-network predicts the probability of target presence and the class probability at each spatial position. The sub-network is a simple fully convolutional module composed of four fully convolutional layers, and its parameters are shared across all feature maps of different scales; finally, classification is performed using a sigmoid. Let p_i be the probability, determined by the network, that the current i-th anchor is a target, and p̂_i the probability that the i-th anchor is labeled as a target; the classification loss function is:

L_cls = −Σ_i [p̂_i log(p_i) + (1 − p̂_i) log(1 − p_i)]
S502, in parallel with the target classification sub-network, another simple fully convolutional network regresses the offset of each anchor box towards a nearby ground truth; the target classification sub-network and the bounding-box regression sub-network share the same structure but use different parameters. Let t_i be the predicted offset of the box relative to the anchor for a positive sample, and t̂_i the offset of the ground truth relative to the anchor; the bounding-box regression loss is:

L_reg = Σ_i smooth_L1(t_i − t̂_i), where smooth_L1(x) = 0.5x² if |x| < 1 and |x| − 0.5 otherwise
drawings
Fig. 1 is a flowchart of the overall algorithm steps of an end-to-end biological target detection method suitable for a complex underwater environment according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a flow chart of steps of an end-to-end biological target detection method suitable for a complex underwater environment according to an embodiment of the present invention is shown, which includes the following steps:
S1, capturing an underwater data set with an underwater robot and dividing it into a training set and a testing set, wherein the underwater data set contains the underwater targets sea cucumber, sea urchin, scallop and starfish; 20% of the data set is used as the testing set and 80% as the training set. Because the underwater images differ in size, they cannot enter the network directly; their sizes are unified through up-sampling or down-sampling, and normalization is then performed.
S2, since paired underwater data sets for training an image enhancement network cannot be obtained in practical engineering, underwater images with poor imaging quality are selected from the existing underwater data set and enhanced by histogram equalization to form the data set for the enhancement network;
s3, training the underwater image enhancement network by taking the poor underwater image as the input of the enhancement network and the enhanced image as a true value;
s4, extracting the features of the underwater training set image after network enhancement by using a full convolution network, and then performing target recognition and classification on the feature map of the underwater image by using a one-stage detection network to obtain a trained model;
and S5, sending the processed underwater test set into the trained model for testing.
In a specific application example, S1 further includes:
The images are normalized. The goal of normalization is to find a mapping such that, afterwards, features of different dimensions are numerically comparable, which can greatly improve the accuracy of the classifier. Let x_i be the value of an image pixel, and let min(x_i) and max(x_i) denote the minimum and maximum pixel values of the image, respectively. The normalized underwater image is:

x̂_i = (x_i − min(x_i)) / (max(x_i) − min(x_i))
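The min-max normalization above can be sketched as follows (a minimal NumPy version; the function name `normalize_image` is illustrative, and the up-/down-sampling step is omitted):

```python
import numpy as np

def normalize_image(img):
    """Min-max normalize pixel values to [0, 1]:
    x_hat = (x - min(x)) / (max(x) - min(x))."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo)
```

Applied to an 8-bit image, this maps the darkest pixel to 0.0 and the brightest to 1.0 before the image enters the network.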
in a specific application example, S2 further includes:
Paired underwater data sets cannot be obtained in practical engineering, so ground-truth images obtained from the poor images by histogram equalization are used to train the enhancement network. First, the number of pixels n_k of each gray level k in the image is counted, with k in the range [0, L−1]; the initial probability density function of the image histogram is p(r_k). The transformation function is then:

s_k = T(r_k) = Σ_{j=0}^{k} p(r_j)
The equalized probability density function p(s_k) is obtained through the transformation function, which is then applied to real images to obtain paired underwater data sets.
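The equalization step can be sketched with the cumulative transform s_k = Σ_{j≤k} p(r_j) (a minimal NumPy version for 8-bit single-channel images; production code would typically use a library routine such as OpenCV's `equalizeHist`):

```python
import numpy as np

def equalize_histogram(img, levels=256):
    """Histogram equalization: count pixels per gray level, form the
    probability density p(r_k), accumulate it into the transform s_k,
    and remap every pixel through s_k scaled back to [0, levels-1]."""
    counts = np.bincount(img.ravel(), minlength=levels)  # n_k
    p = counts / img.size                                # p(r_k)
    cdf = np.cumsum(p)                                   # s_k = T(r_k)
    return np.round(cdf[img] * (levels - 1)).astype(np.uint8)
```

The equalized output serves as the pseudo ground truth paired with the original poor-quality image.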
In a specific application example, S3 further includes:
S301, image enhancement is performed with a generative adversarial network. The poor-quality underwater image X is input into the generator network, each layer of which is a convolution module comprising three operations: convolution, batch normalization and ReLU. The input X is passed through N convolution kernels with kernel_size 3 × 3 to obtain outputs F_i, where N is the total number of channels and i indexes the i-th channel. The extracted features are:

F_i = W_i ⊗ X + b_i, i = 1, …, N

where ⊗ denotes the convolution operation; there are 5 convolutional layers in total, and the output of the third convolutional layer is superimposed on the output of the fifth convolutional layer;
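A naive sketch of this per-kernel feature extraction (single-channel input, bias omitted, "same" zero padding; note that deep-learning frameworks implement ⊗ as cross-correlation, which the loop below also does):

```python
import numpy as np

def conv2d_same(x, kernels):
    """Apply N 3x3 kernels to a 2-D image x, producing N feature maps
    F_i of the same spatial size as x (zero padding of width 1)."""
    pad = np.pad(x, 1)
    h, w = x.shape
    out = np.zeros((len(kernels), h, w))
    for i, k in enumerate(kernels):
        for r in range(h):
            for c in range(w):
                out[i, r, c] = np.sum(pad[r:r + 3, c:c + 3] * k)
    return out
```

With an identity kernel (a single 1 at the center), the output feature map reproduces the input, which is a quick sanity check for the padding arithmetic.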
S302, the data after the convolutional layer needs further processing. To make the model easy to converge and the training process more stable, batch normalization is added after each convolution, using the mean and variance of the data in each batch. Assume a mini-batch contains N_m samples and define the output as F = {F_1, …, F_{N_m}}, where F_n is the convolution output of the n-th sample. Within each mini-batch, the data in F are batch-normalized to obtain F̂_n(k, l), expressed as:

F̂_n(k, l) = α_k · (F_n(k, l) − E[F(k, l)]) / √(Var[F(k, l)] + ε) + β_k

where F_n(k, l) is the l-th element of the k-th channel of the convolution output of sample n before batch normalization, F̂_n(k, l) is the batch-normalized data, α_k and β_k are the trainable parameters of the k-th channel, ε is a very small number preventing division by zero, E(·) is the averaging operation, and Var(·) is the variance operation;
S303, the activation function ReLU is then applied element-wise to F̂_n to obtain the non-linearly activated output F̃_n. For an input F̂_n(k, l), the corresponding output after the ReLU is:

F̃_n(k, l) = max(0, F̂_n(k, l))
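Steps S302 and S303 together can be sketched in NumPy as follows (training-mode batch statistics only; the running-statistics bookkeeping used at inference time is omitted):

```python
import numpy as np

def batch_norm(F, alpha, beta, eps=1e-5):
    """Batch-normalize conv outputs F of shape (N_m, K, L): per channel k,
    subtract the batch mean E[F(k,l)], divide by sqrt(Var[F(k,l)] + eps),
    then scale by alpha_k and shift by beta_k."""
    mean = F.mean(axis=(0, 2), keepdims=True)
    var = F.var(axis=(0, 2), keepdims=True)
    F_hat = (F - mean) / np.sqrt(var + eps)
    return alpha[None, :, None] * F_hat + beta[None, :, None]

def relu(x):
    """Element-wise ReLU non-linearity: max(0, x)."""
    return np.maximum(0.0, x)
```

With alpha = 1 and beta = 0, each channel of the normalized output has (approximately) zero mean and unit variance over the mini-batch, which is what stabilizes training.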
S304, the image produced by the generator network is input into the adversarial (discriminator) network to judge whether the generator output achieves the enhancement goal. The discriminator consists of 3 simple convolutional layers, each again comprising convolution, batch normalization and ReLU.
S305, to ensure that the result performs well both visually and quantitatively, the loss function consists of two parts: an adversarial loss L_1 and a feature loss L_2. The adversarial loss drives the generator towards better-performing outputs. Let D denote the discriminator network, and let x_r and x_f be samples from the true and generated (fake) distributions, respectively; the adversarial loss is:

L_1 = E_{x_r}[log D(x_r)] + E_{x_f}[log(1 − D(x_f))]
The feature loss is the Euclidean distance between the feature data extracted when the input image and the generated image are each passed through the convolutional layers of VGG16, and it reduces the instability of the generator network. Let I_L denote the color-cast input, G(I_L) the output of the generator network, φ_i the feature map obtained from the feature extraction network (i indexing its i-th pooled feature map), and W_i, H_i the dimensions of the extracted feature map. The feature loss is:

L_2 = Σ_i (1 / (W_i H_i)) ‖φ_i(I_L) − φ_i(G(I_L))‖²_2
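The two loss terms can be sketched as follows. This is a minimal illustration, not the patent's implementation: the VGG16 feature extractor is replaced by precomputed feature arrays passed in as arguments, and the discriminator outputs are assumed to be probabilities in (0, 1):

```python
import numpy as np

def adversarial_loss(d_real, d_fake):
    """Standard GAN objective over discriminator outputs:
    L1 = E[log D(x_r)] + E[log(1 - D(x_f))]."""
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

def feature_loss(feats_in, feats_gen, widths, heights):
    """Perceptual-style feature loss: sum over layers i of the squared
    Euclidean distance between feature maps, scaled by 1 / (W_i * H_i)."""
    total = 0.0
    for f_a, f_b, w, h in zip(feats_in, feats_gen, widths, heights):
        total += np.sum((f_a - f_b) ** 2) / (w * h)
    return total
```

When the generated features match the reference features exactly, the feature loss is zero; the discriminator terms push D(x_r) towards 1 and D(x_f) towards 0.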
in a specific application example, S4 further includes:
S401, the feature extraction layer uses a Resnet50 module. The Resnet50 network first performs a convolution operation on the input and then contains 4 residual blocks, for 50 convolution operations in total. Each residual block has a skip connection to alleviate gradient vanishing or explosion. If the input of the residual block is X and the residual mapping learned by the block is H, the output is:
Y=H(X)+X
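The identity shortcut Y = H(X) + X can be sketched as follows (convolutions simplified to matrix multiplications so the skeleton stays self-contained; the weight arguments are illustrative):

```python
import numpy as np

def residual_block(x, weight1, weight2):
    """Skeleton of a residual block: the residual mapping H is two linear
    transforms with a ReLU between them, and the input x is added back
    through the identity shortcut, giving Y = H(X) + X."""
    h = np.maximum(0.0, x @ weight1)  # first transform + ReLU
    h = h @ weight2                   # second transform (residual mapping H)
    return h + x                      # identity shortcut
```

The shortcut is what keeps gradients flowing: even if H collapses to zero (e.g. zero weights), the block still passes x through unchanged.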
S402, low-level features carry less semantic information but locate the target accurately, whereas high-level features are semantically rich but locate the target only coarsely. A top-down path is therefore adopted first to propagate the strong high-level semantic features, and a bottom-up path is then added to supplement the feature maps and propagate the strong low-level localization features.
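The two fusion passes can be sketched on a toy pyramid. This is only a structural illustration under strong simplifying assumptions: the patent does not specify the fusion operator, so plain addition on equally sized maps stands in for the resizing and 1×1 convolutions a real FPN/PANet-style neck would use:

```python
import numpy as np

def fuse_pyramid(features):
    """features: list of same-shape maps ordered low- to high-level.
    Top-down pass: high-level semantics flow to lower levels.
    Bottom-up pass: low-level localization flows back up."""
    td = list(features)
    for i in range(len(td) - 2, -1, -1):   # top-down
        td[i] = td[i] + td[i + 1]
    bu = list(td)
    for i in range(1, len(bu)):            # bottom-up
        bu[i] = bu[i] + bu[i - 1]
    return bu
```

Every output level thus mixes semantic content from above with localization content from below, matching the intent of S402.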
In a specific application example, S5 further includes:
S501, the detection module mainly comprises two sub-networks: a classification sub-network and a bounding-box regression sub-network. For each anchor, the classification sub-network predicts the probability of target presence and the class probability at each spatial position. The sub-network is a simple fully convolutional module composed of four fully convolutional layers, and its parameters are shared across all feature maps of different scales. Finally, classification is performed using a sigmoid. Let p_i be the probability, determined by the network, that the current i-th anchor is a target, and p̂_i the probability that the i-th anchor is labeled as a target; the classification loss function is:

L_cls = −Σ_i [p̂_i log(p_i) + (1 − p̂_i) log(1 − p_i)]
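The per-anchor classification loss can be sketched as binary cross-entropy over sigmoid outputs (averaged over anchors here for numerical convenience; the clipping constant `eps` is an implementation detail, not from the patent):

```python
import numpy as np

def classification_loss(p, p_hat, eps=1e-7):
    """Binary cross-entropy over anchors: p is the sigmoid output
    probability per anchor, p_hat the 0/1 target label; p is clipped
    away from 0 and 1 so the logarithms stay finite."""
    p = np.clip(p, eps, 1.0 - eps)
    return -np.mean(p_hat * np.log(p) + (1.0 - p_hat) * np.log(1.0 - p))
```

Confident, correct predictions drive the loss towards zero, while uncertain predictions (p near 0.5) are penalized.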
S502, in parallel with the target classification sub-network, another simple fully convolutional network regresses the offset of each anchor box towards a nearby ground truth. The target classification sub-network and the bounding-box regression sub-network share the same structure but use different parameters. Let t_i be the predicted offset of the box relative to the anchor for a positive sample, and t̂_i the offset of the ground truth relative to the anchor; the bounding-box regression loss is:

L_reg = Σ_i smooth_L1(t_i − t̂_i), where smooth_L1(x) = 0.5x² if |x| < 1 and |x| − 0.5 otherwise
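A sketch of the box-regression penalty. The patent does not name the exact loss, so the smooth-L1 penalty commonly paired with anchor offsets in one-stage detectors is assumed here:

```python
import numpy as np

def smooth_l1(t, t_hat):
    """Smooth-L1 loss on anchor offsets: quadratic for small residuals
    (|d| < 1), linear for large ones, summed over positive samples."""
    d = np.abs(t - t_hat)
    per_elem = np.where(d < 1.0, 0.5 * d ** 2, d - 0.5)
    return per_elem.sum()
```

The quadratic region gives smooth gradients near the optimum, while the linear region keeps outlier anchors from dominating the update.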
it is to be understood that the exemplary embodiments described herein are illustrative and not restrictive. Although one or more embodiments of the present invention have been described with reference to the accompanying drawings, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims.
Claims (6)
1. An end-to-end biological target detection method suitable for a complex underwater environment is characterized by comprising the following steps:
S1, capturing an underwater data set with an underwater robot and dividing it into a training set and a testing set, wherein the underwater data set contains the underwater targets sea cucumber, sea urchin, scallop and starfish; 20% of the data set is used as the testing set and 80% as the training set; the underwater images are unified in size through up-sampling or down-sampling, and then normalization is carried out;
s2, selecting an underwater image with poor imaging quality from the existing underwater data set, and enhancing the image by a histogram equalization method to form a data set of an enhanced network;
s3, training the underwater image enhancement network by taking the poor underwater image as the input of the enhancement network and the enhanced image as a true value;
s4, extracting the features of the underwater training set image after network enhancement by using a full convolution network, and then performing target recognition and classification on the feature map of the underwater image by using a one-stage detection network to obtain a trained model;
and S5, sending the processed underwater test set into the trained model for testing.
2. The method for end-to-end biological target detection applicable to complex underwater environments of claim 1, wherein S1 further comprises:
let x_i be the value of an image pixel, and let min(x_i) and max(x_i) denote the minimum and maximum pixel values of the image, respectively; the normalized underwater image is:

x̂_i = (x_i − min(x_i)) / (max(x_i) − min(x_i))
3. the method for end-to-end biological target detection applicable to complex underwater environments of claim 1, wherein S2 further comprises:
counting the number of pixels n_k of each gray level k in the image, with k in the range [0, L−1]; the initial probability density function of the image histogram is p(r_k); the transformation function is then:

s_k = T(r_k) = Σ_{j=0}^{k} p(r_j)
the equalized probability density function p(s_k) is obtained through the transformation function, which is then applied to real images to obtain paired underwater data sets.
4. The method for end-to-end biological target detection in a complex underwater environment as claimed in claim 1, wherein S3 further comprises:
S301, performing image enhancement with a generative adversarial network: the poor-quality underwater image X is input into the generator network; the convolution module of each layer comprises three operations, convolution, batch normalization and ReLU; the input X is passed through N convolution kernels with kernel_size 3 × 3 to obtain outputs F_i, where N is the total number of channels and i indexes the i-th channel, and the extracted features are:

F_i = W_i ⊗ X + b_i, i = 1, …, N

where ⊗ denotes the convolution operation; there are 5 convolutional layers in total, and the output of the third convolutional layer is superimposed on the output of the fifth convolutional layer;
S302, further processing the data after the convolutional layer: to make the model easy to converge and the training process more stable, batch normalization is added after each convolution, using the mean and variance of the data in each batch; assume a mini-batch contains N_m samples and define the output as F = {F_1, …, F_{N_m}}, where F_n is the convolution output of the n-th sample; within each mini-batch, the data in F are batch-normalized to obtain F̂_n(k, l), expressed as:

F̂_n(k, l) = α_k · (F_n(k, l) − E[F(k, l)]) / √(Var[F(k, l)] + ε) + β_k

where F_n(k, l) is the l-th element of the k-th channel of the convolution output of sample n before batch normalization, F̂_n(k, l) is the batch-normalized data, α_k and β_k are the trainable parameters of the k-th channel, ε is a very small number preventing division by zero, E(·) is the averaging operation, and Var(·) is the variance operation;
S303, the activation function ReLU is then applied element-wise to F̂_n to obtain the non-linearly activated output F̃_n; for an input F̂_n(k, l), the corresponding output after the ReLU is:

F̃_n(k, l) = max(0, F̂_n(k, l))
S304, the image produced by the generator network is input into the adversarial (discriminator) network to judge whether the generator output achieves the enhancement goal, wherein the discriminator consists of 3 simple convolutional layers, each again comprising convolution, batch normalization and ReLU;
S305, to ensure that the result performs well both visually and quantitatively, the loss function consists of two parts: an adversarial loss L_1 and a feature loss L_2; the adversarial loss drives the generator towards better-performing outputs; let D denote the discriminator network, and let x_r and x_f be samples from the true and generated (fake) distributions, respectively; the adversarial loss is:

L_1 = E_{x_r}[log D(x_r)] + E_{x_f}[log(1 − D(x_f))]
the feature loss is the Euclidean distance between the feature data extracted when the input image and the generated image are each passed through the convolutional layers of VGG16; let I_L denote the color-cast input, G(I_L) the output of the generator network, φ_i the feature map obtained from the feature extraction network, i indexing its i-th pooled feature map, and W_i, H_i the dimensions of the extracted feature map; the feature loss is:

L_2 = Σ_i (1 / (W_i H_i)) ‖φ_i(I_L) − φ_i(G(I_L))‖²_2
5. the method for end-to-end biological target detection in a complex underwater environment as claimed in claim 1, wherein S4 further comprises:
S401, the feature extraction layer uses a Resnet50 module; the Resnet50 network first performs a convolution operation on the input and then contains 4 residual blocks, for 50 convolution operations in total; each residual block has a skip connection to alleviate gradient vanishing or explosion; if the input of the residual block is X and the residual mapping learned by the block is H, the output is:
Y=H(X)+X
S402, low-level features carry less semantic information but locate the target accurately, whereas high-level features are semantically rich but locate the target only coarsely; a top-down path is therefore adopted first to propagate the strong high-level semantic features, and a bottom-up path is then added to supplement the feature maps and propagate the strong low-level localization features.
6. The method for end-to-end biological target detection in a complex underwater environment as claimed in claim 1, wherein S5 further comprises:
S501, the detection module mainly comprises two sub-networks. The classification sub-network is a simple fully convolutional module consisting of four fully convolutional layers; its parameters are shared across the feature maps of all scales, and classification is finally performed with a sigmoid. Let pi be the probability, predicted by the network, that the current i-th anchor is a target, and pi* the label indicating whether the i-th anchor is a target. The classification loss function is:

Lcls = -(1/N) · Σi [ pi* log(pi) + (1 - pi*) log(1 - pi) ]
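The patent's classification-loss equation is not reproduced in this extraction; the binary cross-entropy that naturally pairs with the sigmoid classifier described above can be sketched as follows (the averaging over N anchors and the epsilon guard are assumptions).

```python
import math

def classification_loss(p, p_star, eps=1e-12):
    """Binary cross-entropy over anchors (assumed form):
    L_cls = -(1/N) * sum_i [ p*_i log p_i + (1 - p*_i) log(1 - p_i) ]
    p: predicted target probabilities per anchor, p_star: 0/1 anchor labels."""
    n = len(p)
    total = 0.0
    for pi, ti in zip(p, p_star):
        total += ti * math.log(pi + eps) + (1 - ti) * math.log(1 - pi + eps)
    return -total / n
```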
S502, in parallel with the target classification sub-network, we use another simple fully convolutional network to regress the offset of each anchor box to a nearby ground truth. The target classification sub-network and the bounding-box regression sub-network have the same structure but different parameters. Let ti be the predicted offset of the box relative to the anchor for a positive sample, and ti* the offset of the ground truth relative to the anchor. The bounding-box regression loss is:

Lreg = Σi smoothL1(ti - ti*)
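The regression-loss equation is likewise missing from this extraction; the smooth L1 loss conventionally used for anchor-box offset regression can be sketched as below. The 4-element (dx, dy, dw, dh) offset layout and the threshold of 1 are standard conventions assumed here, not confirmed by the source.

```python
def smooth_l1(x):
    """Smooth L1: quadratic near zero, linear for |x| >= 1
    (less sensitive to outlier offsets than plain L2)."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5

def regression_loss(t, t_star):
    """Sum smooth L1 over positive anchors and the four box offsets.
    t: predicted offsets per anchor, t_star: ground-truth offsets per anchor."""
    return sum(smooth_l1(a - b) for ti, ti_s in zip(t, t_star)
               for a, b in zip(ti, ti_s))
```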
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111342981.4A CN114092793B (en) | 2021-11-12 | 2021-11-12 | End-to-end biological target detection method suitable for complex underwater environment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114092793A true CN114092793A (en) | 2022-02-25 |
CN114092793B CN114092793B (en) | 2024-05-17 |
Family
ID=80300549
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111342981.4A Active CN114092793B (en) | 2021-11-12 | 2021-11-12 | End-to-end biological target detection method suitable for complex underwater environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114092793B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114387190A (en) * | 2022-03-23 | 2022-04-22 | 山东省计算中心(国家超级计算济南中心) | Adaptive image enhancement method and system based on complex environment |
CN115880574A (en) * | 2023-03-02 | 2023-03-31 | 吉林大学 | Underwater optical image lightweight target identification method, equipment and medium |
CN115984269A (en) * | 2023-03-20 | 2023-04-18 | 湖南长理尚洋科技有限公司 | Non-invasive local water ecological safety detection method and system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109543585A (en) * | 2018-11-16 | 2019-03-29 | 西北工业大学 | Underwater optics object detection and recognition method based on convolutional neural networks |
CN111209952A (en) * | 2020-01-03 | 2020-05-29 | 西安工业大学 | Underwater target detection method based on improved SSD and transfer learning |
CN111723823A (en) * | 2020-06-24 | 2020-09-29 | 河南科技学院 | Underwater target detection method based on third-party transfer learning |
CN112417980A (en) * | 2020-10-27 | 2021-02-26 | 南京邮电大学 | Single-stage underwater biological target detection method based on feature enhancement and refinement |
CN112767279A (en) * | 2021-02-01 | 2021-05-07 | 福州大学 | Underwater image enhancement method for generating countermeasure network based on discrete wavelet integration |
Non-Patent Citations (2)
Title |
---|
XU YAN; SUN MEISHUANG: "Underwater image enhancement method based on convolutional neural networks", Journal of Jilin University (Engineering and Technology Edition), no. 06, 26 March 2018 (2018-03-26) * |
JIA ZHENQING; LIU XUEFENG: "Marine animal target detection based on YOLO and image enhancement", Electronic Measurement Technology, no. 14, 23 July 2020 (2020-07-23) * |
Also Published As
Publication number | Publication date |
---|---|
CN114092793B (en) | 2024-05-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111639692B (en) | Shadow detection method based on attention mechanism | |
CN110443143B (en) | Multi-branch convolutional neural network fused remote sensing image scene classification method | |
CN109583342B (en) | Human face living body detection method based on transfer learning | |
CN107133943B (en) | A kind of visible detection method of stockbridge damper defects detection | |
CN114092793A (en) | End-to-end biological target detection method suitable for complex underwater environment | |
CN111832443B (en) | Construction method and application of construction violation detection model | |
CN109376591B (en) | Ship target detection method for deep learning feature and visual feature combined training | |
CN110796009A (en) | Method and system for detecting marine vessel based on multi-scale convolution neural network model | |
CN109685765B (en) | X-ray film pneumonia result prediction device based on convolutional neural network | |
CN112818969A (en) | Knowledge distillation-based face pose estimation method and system | |
CN113361645B (en) | Target detection model construction method and system based on meta learning and knowledge memory | |
CN115035371B (en) | Well wall crack identification method based on multi-scale feature fusion neural network | |
CN111242026A (en) | Remote sensing image target detection method based on spatial hierarchy perception module and metric learning | |
CN116452810A (en) | Multi-level semantic segmentation method and device, electronic equipment and storage medium | |
CN115223032A (en) | Aquatic organism identification and matching method based on image processing and neural network fusion | |
CN113627240B (en) | Unmanned aerial vehicle tree species identification method based on improved SSD learning model | |
CN116844114A (en) | Helmet detection method and device based on YOLOv7-WFD model | |
CN114821356B (en) | Optical remote sensing target detection method for accurate positioning | |
Meng et al. | A Novel Steganography Algorithm Based on Instance Segmentation. | |
CN111950586B (en) | Target detection method for introducing bidirectional attention | |
CN114581769A (en) | Method for identifying houses under construction based on unsupervised clustering | |
CN114463628A (en) | Deep learning remote sensing image ship target identification method based on threshold value constraint | |
CN117809169B (en) | Small-sample underwater sonar image classification method and model building method thereof | |
CN111476129A (en) | Soil impurity detection method based on deep learning | |
Kaur et al. | Deep learning with invariant feature based species classification in underwater environments |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||