CN113420759A - Anti-occlusion and multi-scale dead fish identification system and method based on deep learning - Google Patents

Publication number
CN113420759A
Authority
CN
China
Prior art keywords
dead fish
feature
scale
features
occlusion
Prior art date
Legal status
Granted
Application number
CN202110653176.7A
Other languages
Chinese (zh)
Other versions
CN113420759B (en)
Inventor
杨明东
张先奎
陈静
杨勇
周红坤
杨飞
Current Assignee
No 750 Test Field of China Shipbuilding Industry Corp
Original Assignee
No 750 Test Field of China Shipbuilding Industry Corp
Priority date
Filing date
Publication date
Application filed by No 750 Test Field of China Shipbuilding Industry Corp
Priority to CN202110653176.7A
Publication of CN113420759A
Application granted
Publication of CN113420759B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/60 Rotation of a whole image or part thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/40 Image enhancement or restoration by the use of histogram techniques
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/80 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in fisheries management
    • Y02A40/81 Aquaculture, e.g. of fish

Abstract

The invention discloses an anti-occlusion and multi-scale dead fish identification system and method based on deep learning. Aiming at the difficulties of dead fish identification, such as complex background, large scale change and occlusion, a Faster RCNN identification model is improved to suit the dead fish identification scene of a complex underwater environment. First, contrast-limited adaptive histogram equalization is applied to the image to be identified, enhancing the local contrast of the target; second, a multi-scale feature enhancement module is designed to solve dead fish identification over a large scale range, and an attention-based anti-occlusion module is designed to highlight the target region and suppress background interference such as noise; finally, a rotated rectangular box is used to represent the dead fish target, greatly improving identification accuracy in dense scenes.

Description

Anti-occlusion and multi-scale dead fish identification system and method based on deep learning
Technical Field
The invention relates to the technical field of image recognition applications, in particular to underwater dead fish recognition, and specifically to an anti-occlusion and multi-scale dead fish recognition system and method based on deep learning.
Background
In the fish farming process, fish deaths inevitably occur because bacteria and parasites are present in the culture water, or because excessive stocking density reduces water mobility and leaves the oxygen concentration insufficient. A fish first sinks to the bottom after death; once its internal organs ferment and generate gas, it floats to the surface. During this process, the movement of the dead fish inside the cultivation cage brings it into contact with live fish, or the live fish feed on it, spreading pathogens. To prevent this problem, a dead fish identification method is urgently needed that gives timely information such as the position and number of dead fish before they float, providing a basis for their early cleaning and collection.
Traditional dead fish identification usually relies on manual intervention, which offers a low level of intelligence and consumes considerable manpower and material resources. With the rapid development of artificial intelligence and deep learning, image target recognition has found ever wider application, and classic recognition models such as Faster RCNN and YOLO have appeared; however, these methods were designed for above-water environments. One dead fish identification method and early-warning system based on a deep convolutional neural network directly transfers the original Faster RCNN model to underwater dead fish identification; another method and system for monitoring cultured fish based on image recognition directly applies the YOLO model to fish recognition. Such direct application of existing models neither considers nor specifically solves the following four practical problems, leading to low identification accuracy: (1) underwater images generally exhibit low contrast and low brightness, differ greatly from above-water images, and cannot be handled by direct transfer; (2) after dead fish sink to the bottom, intra-class occlusion occurs between dead fish or between dead and live fish, while interference from aquatic plants causes inter-class occlusion, so the extracted identification features are easily contaminated; (3) owing to differences in species, size and so on, the scale of dead fish varies over a wide range; (4) Faster RCNN and similar methods all use horizontal rectangular boxes to represent the target position, yet because dead fish bodies are generally distributed in arbitrary orientations, horizontal boxes contain a large amount of useless background information, and in dense scenes the boxes overlap and are easily eliminated in the post-processing stage.
Disclosure of Invention
To remedy the deficiencies of the prior art, through research, development and design, the applicant provides a dead fish identification method based on deep learning: a multi-scale feature enhancement module is designed to effectively solve multi-scale dead fish identification; an attention-based anti-occlusion module is further designed to effectively highlight the foreground and suppress background interference; and by predicting a rotated rectangular box representing the dead fish position, a better identification effect is obtained in dense scenes.
Specifically, the invention is realized as follows: an anti-occlusion and multi-scale dead fish identification system based on deep learning, comprising:
a dead fish identification data set module, used for acquiring and storing images shot underwater and constructing a dead fish identification training set, verification set and test set;
an image feature extraction module, used for preprocessing the underwater images and extracting the low-level edge and high-level abstract feature maps of each image;
a multi-scale feature enhancement module, used for improving the multi-scale expression capability of the features by connecting a plurality of feature pyramids in series, generating candidate boxes that represent foreground targets, and adaptively extracting and fusing the features of region candidate boxes of different scales;
an anti-occlusion module, used for learning the mask of the foreground target through an attention mechanism and fusing it with the candidate-box features, suppressing background interference to obtain anti-occlusion features;
and a dead fish target identification module, used for taking the candidate-box features as the starting point and, combining the anti-occlusion features with fully connected layers, regressing the rotated rectangular box that represents the dead fish position and performing identification classification to complete dead fish target identification.
Meanwhile, based on the system, the invention also discloses an anti-occlusion and multi-scale dead fish identification method based on deep learning:
step S1, acquire underwater images and establish a dead fish identification data set for training and testing;
step S2, perform image preprocessing, input the image into the image feature extraction module, and extract the low-level edge and high-level abstract feature maps of the image;
step S3, the multi-scale feature enhancement module improves the multi-scale expression capability of the features by connecting a plurality of feature pyramids in series, obtaining multi-scale features;
step S4, generate horizontal rectangular region candidate boxes from the multi-scale features, and apply an adaptive ROIAlign algorithm to adaptively extract and fuse the features of region candidate boxes of different scales;
step S5, learn the mask of the foreground target from the region candidate-box features through an attention mechanism, and fuse the mask with the candidate-box features to suppress background interference;
and step S6, taking the horizontal rectangular region candidate box as the starting point and combining the features fused by the anti-occlusion branch with fully connected layers, regress the rotated rectangular box that represents the dead fish position and classify it, completing dead fish target identification.
The working principle and beneficial effects of the invention are introduced as follows. An underwater camera shoots underwater images, from which a dead fish identification data set is made for training and testing; at the same time an image preprocessing module is designed to provide high-quality, clear images to be identified for the subsequent steps. The image is input into a ResNet feature extraction module to extract the low-level edge and high-level abstract feature maps. A multi-scale feature enhancement module then improves the multi-scale expression capability of the features by connecting a plurality of feature pyramids in series, alleviating the difficulty that the scale of dead fish varies over a wide range. The multi-scale features are input into an RPN network, which generates horizontal rectangular region candidate boxes (ROIs) representing foreground targets (dead or live fish), and an adaptive ROIAlign algorithm adaptively extracts and fuses candidate-box features of different scales. The region candidate-box features are input into the anti-occlusion branch, which learns the mask of the foreground target through an attention mechanism and fuses it with the candidate-box features, suppressing background interference and achieving anti-occlusion. Finally, taking the generated horizontal candidate box as the starting point and combining the features fused by the anti-occlusion branch with fully connected layers, the rotated rectangular box representing the dead fish position is regressed and the dead fish classified, completing dead fish target identification.
The image feature extraction module may be a ResNet, VGG, MobileNet or EfficientNet series network, selected according to the real-time processing requirements of the system. Candidate-box generation comprises an RPN network and an adaptive ROIAlign algorithm unit; the candidate boxes output by the RPN are first sorted by confidence so that low-confidence boxes are removed before the subsequent NMS operation, improving the post-processing speed of the algorithm.
Aiming at the difficulties of dead fish identification (complex background, large scale change and occlusion), the Faster RCNN identification model is improved to suit the dead fish identification scene of a complex underwater environment. Before identification, the system and method apply a contrast-limited adaptive histogram equalization algorithm as image preprocessing, improving image detail and local contrast and thereby identification accuracy. The multi-scale feature enhancement module extracts rich multi-scale information by connecting a plurality of feature pyramids in series, and designs cross-stage connection units between the pyramids to reuse features and counter the vanishing gradients caused by network deepening; pyramid layers with the same stride are finally accumulated to output the enhanced feature pyramid. Adaptive ROIAlign differs from standard ROIAlign, which extracts features from a single pyramid layer only, by adaptively pooling and fusing the features of different feature layers, so that small and large targets alike share low-level and high-level information. The anti-occlusion module combines an attention mechanism with the annotated rectangular boxes to learn a mask of the candidate region in a weakly supervised manner; the mask emphasizes the visible region of the target and suppresses interference such as noise from the occluded region, so that the extracted features attend more to the foreground target, are more discriminative, and resist occlusion more strongly. The rotated rectangular box localizes the dead fish target and can frame dead fish positions without overlap in dense scenes, while at the same time eliminating the interference of many similar background regions and improving training robustness.
Drawings
FIG. 1 is a diagram of a dead fish identification model;
FIG. 2 is a schematic diagram of a data set annotation;
FIG. 3 is a schematic diagram of the multi-scale feature enhancement module of FIG. 1;
FIG. 4 is a schematic diagram of the feature pyramid construction shown in FIG. 3;
FIG. 5 is a schematic diagram of the cross-phase connection unit of FIG. 3;
FIG. 6 is a schematic diagram of the adaptive ROIAlign of FIG. 1;
FIG. 7 is a schematic view of the anti-occlusion module of FIG. 1;
FIG. 8 illustrates the rotated rectangular box representation.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. It should be understood that the description is intended to be exemplary only, and is not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
Example 1: the invention provides an anti-occlusion and multi-scale dead fish identification method based on deep learning, as shown in the flowchart of FIG. 1, comprising the following steps:
step 1, shooting an underwater image by using an underwater camera, and making a dead fish identification data set for training and testing. Meanwhile, an image preprocessing module is designed to provide high-quality and clear images to be identified for subsequent steps.
Step 2, input the image into a ResNet-50 feature extraction module and extract the low-level edge and high-level abstract feature maps of the image.
Step 3, after the feature maps are output in step 2, a multi-scale feature enhancement module improves the multi-scale expression capability of the features by connecting a plurality of feature pyramids in series, alleviating the difficulty that the scale of dead fish varies over a wide range.
Step 4, input the multi-scale features output in step 3 into an RPN network, generate horizontal rectangular region candidate boxes (ROIs) representing foreground targets (dead or live fish), and apply an adaptive ROIAlign algorithm to adaptively extract and fuse the candidate-box features of different scales.
Step 5, input the region candidate-box features into the anti-occlusion branch, learn the mask of the foreground target through an attention mechanism, and fuse the mask with the candidate-box features to suppress background interference and achieve anti-occlusion.
Step 6, taking the horizontal candidate box generated in step 4 as the starting point and combining the features fused by the anti-occlusion branch with fully connected layers, regress the rotated rectangular box representing the dead fish position, classify the dead fish, and complete dead fish target identification.
The step 1 comprises the following steps:
step 1-1, shooting a large number of underwater images by using an underwater camera, and dividing a data set into a training set, a testing set and a verification set according to the ratio of 6:3:1 after dead fish/live fish target labeling. Further, as shown in fig. 2, the labeling is first performed by using a solid line rotation rectangle box in the figure, which is represented by four vertices (x1, y1, x2, y2, x3, y3, x4, y4) according to the specific position and size of the dead fish. Further, the horizontal rectangular box (x0, y0, xM, yM) is used for training of the RPN network, and is generated by:
x0 = min(x1, x4), y0 = min(y1, y4), xM = max(x2, x3), yM = max(y2, y3)
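A minimal sketch of this conversion, taking the min/max over all four vertices, which coincides with the formula above under the assumed vertex ordering (the function name is illustrative):

```python
def horizontal_box(vertices):
    """Axis-aligned horizontal box (x0, y0, xM, yM) enclosing the four
    vertices of a rotated annotation box."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return min(xs), min(ys), max(xs), max(ys)

# a rotated box annotated by its four vertices (x1, y1) ... (x4, y4)
box = horizontal_box([(1, 2), (5, 0), (7, 4), (3, 6)])
```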
Step 1-2, training data preprocessing: the training image is first normalized, then randomly scaled, an image block of a certain size is randomly cropped, and preprocessing is completed with random horizontal flipping and random contrast-limited adaptive histogram equalization for subsequent training.
Step 1-3, test data preprocessing: the image is first normalized, then contrast-limited adaptive histogram equalization is applied to complete preprocessing.
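The contrast-limited equalization step can be illustrated with a simplified, single-tile NumPy sketch; real CLAHE applies this per tile and blends the lookup tables bilinearly, and in practice a library routine such as OpenCV's `createCLAHE` would be used. All names and parameter values here are illustrative:

```python
import numpy as np

def clipped_equalize(img, clip_limit=0.02, nbins=256):
    """Simplified contrast-limited histogram equalization on one tile.

    The histogram is clipped at `clip_limit` and the clipped mass is
    redistributed uniformly before building the equalization LUT.
    """
    hist, _ = np.histogram(img, bins=nbins, range=(0, 256))
    hist = hist.astype(float) / img.size
    excess = np.clip(hist - clip_limit, 0, None).sum()
    hist = np.minimum(hist, clip_limit) + excess / nbins  # redistribute clipped mass
    cdf = np.cumsum(hist)
    lut = np.round(255 * cdf / cdf[-1]).astype(np.uint8)
    return lut[img]

img = (np.arange(64, dtype=np.uint8).reshape(8, 8) * 3) % 256
out = clipped_equalize(img)
```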
In step 2, a ResNet-50 network performs feature extraction on the input image data, yielding the five feature maps Res1 to Res5 shown in fig. 1, with channel counts 128, 256, 512, 1024 and 2048 and strides 2, 4, 8, 16 and 32, respectively.
The multi-scale feature enhancement module in step 3 is shown in fig. 3, and includes the following steps:
step 3-1, generating a characteristic pyramid: the generation method is as shown in fig. 4, firstly, input features of Res2, Res3, Res4 and Res5 are convolved by 1 × 1 in a lateral connection, the number of channels is reduced to 256, and bilinear interpolation is adopted to perform 2 times of upsampling on high-level low-resolution features; then, pixel-by-pixel addition is carried out on the up-sampled features and the down-channel features; and finally, fusing and adding the feature maps by using 3 x3 convolution to obtain the current reinforced features, and repeating the steps for 3 times to generate a feature pyramid consisting of 4 layers of features.
Step 3-2, generating a multi-stage feature pyramid: the output of the pyramid generated in the previous stage serves as the input of the next-stage pyramid, which is constructed in the manner of step 3-1; the multi-stage feature pyramid is generated by this serial connection.
Step 3-3, the multi-stage feature pyramids are linked by cross-stage connection units, shown schematically in fig. 5. A 1×1 convolution first reduces the channel dimension of the previous-stage features, which are then downsampled and accumulated with the current-stage features to form the new features. The spatial attention mask generation unit, which generates training mask labels in a weakly supervised manner from the annotated rectangular boxes, is described in step 5-2.
Step 3-4, the multi-scale feature generation process is shown in fig. 3: the n stage feature pyramids are denoted {P1, P2, …, Pn}, each composed of 4 feature levels {C1, C2, C3, C4} with strides 4, 8, 16 and 32, respectively; the feature levels with the same stride in each pyramid are accumulated pixel by pixel to form the multi-scale feature outputs {F1, F2, F3, F4}. In general, the larger n is set, the stronger the processing performance; considering the computational cost, n may be set to 2, and a larger n may be set if processing accuracy is to be improved.
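The pixel-by-pixel accumulation of same-stride levels across stages can be sketched as follows, with n = 2 and constant feature maps used purely to make the accumulation visible; all shapes are illustrative:

```python
import numpy as np

# two chained pyramids (n = 2), each with 4 levels at strides 4, 8, 16, 32
shapes = [(256, 64 // s, 64 // s) for s in (4, 8, 16, 32)]
P1 = [np.full(sh, 1.0) for sh in shapes]
P2 = [np.full(sh, 2.0) for sh in shapes]

# pixel-wise accumulation of same-stride levels across stages -> {F1..F4}
F = [p1 + p2 for p1, p2 in zip(P1, P2)]
```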
The step 4 is to generate a region level candidate frame and extract features, and a specific process thereof is shown in fig. 6, and includes the following steps:
and 4-1, generating horizontal candidate boxes by using the RPN network, wherein the candidate boxes may contain dead fish or live fish. The method is different from the method of directly filtering redundant candidate frames by using an NMS mode in fast RCNN, but ranks the confidence degrees of the candidate frames firstly, and takes 15000 as the top, then carries out subsequent NMS operation, and reserves 2000 candidate frames, and takes 5000 as the top to carry out NMS and reserves 200 frames in the test stage. The other links of the RPN are consistent with the Faster RCNN.
Step 4-2, Faster RCNN assigns each candidate box to one of {F1, F2, F3, F4} according to its size, then extracts the corresponding features with an ROIAlign module using a 7×7 pooling kernel.
Step 4-3, unlike Faster RCNN, which transforms the features through a fully connected layer, the invention processes them with a 3×3 convolution in order to preserve the spatial information of the features.
Step 4-4, the processed features are fused by taking the element-wise maximum, and the final output feature dimension is 7 × 7 × 256.
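Steps 4-2 to 4-4 can be sketched as follows; nearest-neighbour sampling stands in for ROIAlign's bilinear sampling, and the 3×3 convolution of step 4-3 is omitted. All shapes and the example box are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
strides = (4, 8, 16, 32)
# multi-scale features F1..F4 for a 64x64 input, 256 channels each
feats = [rng.standard_normal((256, 64 // s, 64 // s)) for s in strides]

def roi_sample(feat, box, stride, out=7):
    """Nearest-neighbour stand-in for ROIAlign with an out x out kernel."""
    x0, y0, x1, y1 = (v / stride for v in box)
    xi = np.clip(np.linspace(x0, x1, out).astype(int), 0, feat.shape[2] - 1)
    yi = np.clip(np.linspace(y0, y1, out).astype(int), 0, feat.shape[1] - 1)
    return feat[:, yi][:, :, xi]

box = (8, 8, 40, 48)  # one candidate box in image coordinates
# pool the same ROI from every level, then fuse by element-wise maximum
pooled = [roi_sample(f, box, s) for f, s in zip(feats, strides)]
fused = np.maximum.reduce(pooled)
```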
Step 5 is the construction of the anti-occlusion branch; the specific process is shown in fig. 7:
Step 5-1, spatial attention mask generation, test phase. The features output in step 4 are passed through two 3×3 convolutions, each followed by ReLU nonlinear activation; a 1×1 convolution then converts the features into a 1-channel feature map, and finally a Sigmoid function maps the output values to between 0 and 1.
Step 5-2, spatial attention mask generation, training phase. To train the mask, the mask label is computed in a weakly supervised manner: pixels inside the annotated rectangular box are labeled 1 and pixels outside it are labeled 0. In addition, the invention supervises the mask training process with a binary cross-entropy loss function; with mask labels ci and predicted mask values pi, the mask generation loss Lmask is:
Lmask = −(1/N) Σi [ci·log(pi) + (1 − ci)·log(1 − pi)]
where N represents the number of pixels, i.e. N = 7 × 7 = 49.
Step 5-3, the generated mask is used to weight the input features, specifically by multiplying them pixel by pixel, highlighting the unoccluded target part and suppressing interference from other non-target regions, making the features more discriminative and achieving anti-occlusion.
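Steps 5-1 to 5-3 can be sketched together in NumPy; the convolution stack is replaced by a simple channel mean purely to produce a 1-channel response map, and the label rectangle and shapes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

feat = rng.standard_normal((256, 7, 7))  # fused candidate-box features

# stand-in for the two 3x3 conv + ReLU layers and the 1x1 conv:
# collapse channels to a single-channel response map
logits = feat.mean(axis=0)
mask = sigmoid(logits)                   # attention mask, values in (0, 1)

# weak-supervision label: 1 inside the annotated box, 0 outside
label = np.zeros((7, 7))
label[2:6, 1:5] = 1.0

# binary cross-entropy mask loss over N = 49 pixels
eps = 1e-7
p = np.clip(mask, eps, 1 - eps)
L_mask = -np.mean(label * np.log(p) + (1 - label) * np.log(1 - p))

# anti-occlusion weighting: pixel-wise product of mask and features
weighted = feat * mask[None]
```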
Step 6 takes the horizontal candidate box generated in step 4 as the starting point and performs category prediction and rotated rectangular box regression for the dead fish target, as follows:
Step 6-1, dead fish target category prediction. In the invention there are 3 target categories: dead fish, live fish and background. During training, category prediction adopts a cross-entropy loss function; during testing, after softmax probability normalization and an argmax function, the category and its classification confidence are predicted.
Step 6-2, rotated rectangular box regression for the dead fish. The rotated rectangular box, illustrated in fig. 8, is represented in the form (x, y, w, h, θ), where (x, y) is the center of the box, (w, h) are its width and height, and θ is the acute angle between one side of the box and the x axis; that side is defined as w, so θ ∈ [−π/2, 0). The starting point of the rotated-box regression is the horizontal candidate box predicted by the RPN, encoded as:
tx = (x − x0)/w0, ty = (y − y0)/h0
tw = log(w/w0), th = log(h/h0), tθ = θ − θ0
where (x0, y0, w0, h0, θ0) is the horizontal candidate box, and the predicted and true offsets of the rotated box are denoted t and t* respectively; the parameters (y, w, h, θ) are treated analogously to x. The loss between the predicted values ti and the true values ti* is computed with the smooth-L1 function:
Lreg = Σ smoothL1(ti − ti*), i ∈ {x, y, w, h, θ}
smoothL1(d) = 0.5·d², if |d| < 1; |d| − 0.5, otherwise
At test time, the predicted values are decoded by the inverse rule:
x = tx·w0 + x0, y = ty·h0 + y0, w = w0·exp(tw), h = h0·exp(th), θ = tθ + θ0
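The regression targets and their decoding can be sketched with the standard rotated-box offset encoding; taking θ0 = −π/2 for the horizontal candidate box is an assumption here, not a value fixed by the text:

```python
import math

THETA0 = -math.pi / 2  # assumed angle of the horizontal candidate box

def encode(box, anchor):
    """Offsets t of a rotated box (x, y, w, h, theta) w.r.t. an anchor."""
    x, y, w, h, t = box
    xa, ya, wa, ha, ta = anchor
    return ((x - xa) / wa, (y - ya) / ha,
            math.log(w / wa), math.log(h / ha), t - ta)

def decode(offsets, anchor):
    """Inverse rule: recover the rotated box from offsets and anchor."""
    tx, ty, tw, th, tt = offsets
    xa, ya, wa, ha, ta = anchor
    return (tx * wa + xa, ty * ha + ya,
            wa * math.exp(tw), ha * math.exp(th), tt + ta)

anchor = (50.0, 60.0, 40.0, 20.0, THETA0)  # horizontal candidate box
box = (55.0, 58.0, 36.0, 18.0, -0.3)       # ground-truth rotated box
rebuilt = decode(encode(box, anchor), anchor)
```

Encode and decode are exact inverses, so the round trip reproduces the original box up to floating-point error.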
and 6-3, obtaining a detection result of the rotating frame after decoding, and removing the redundant detection frame by using the rotating NMS to obtain a final dead fish identification result.
It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explaining the principles of the invention and are not to be construed as limiting the invention. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention should be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundaries of the appended claims or the equivalents of such scope and boundaries.

Claims (10)

1. An anti-occlusion and multi-scale dead fish recognition system based on deep learning is characterized by comprising:
a dead fish identification data set module, used for acquiring and storing images shot underwater and constructing a dead fish identification training set, verification set and test set;
an image feature extraction module, used for preprocessing the underwater images and extracting the low-level edge and high-level abstract feature maps of the images;
a multi-scale feature enhancement module, used for improving the multi-scale expression capability of the features by connecting a plurality of feature pyramids in series, generating candidate boxes representing foreground targets, and adaptively extracting and fusing the features of region candidate boxes of different scales;
an anti-occlusion module, used for learning the mask of the foreground target through an attention mechanism, fusing the mask with the candidate-box features, and suppressing background interference to obtain anti-occlusion features;
and a dead fish target identification module, used for taking the candidate-box features as the starting point and, combining the anti-occlusion features with fully connected layers, regressing the rotated rectangular box representing the dead fish position and performing identification classification to complete dead fish target identification.
2. The anti-occlusion and multi-scale dead fish identification system of claim 1, wherein: the dead fish identification data set module comprises data of three categories, dead fish, live fish and background, divided into three sets: a training set, a verification set and a test set; the image preprocessing module further applies a contrast-limited adaptive histogram equalization algorithm for image preprocessing, improving image detail and local contrast; the image feature extraction module comprises a ResNet, VGG, MobileNet or EfficientNet series network, selected according to the real-time processing requirements of the system.
3. The anti-occlusion and multi-scale dead fish identification system of claim 1, wherein: a single feature pyramid is generated as follows: firstly, a 1×1 convolution is applied through lateral connections to the output features of the image feature extraction module, reducing the number of channels to 256, and the high-level low-resolution features are up-sampled by a factor of 2 using bilinear interpolation; then, the up-sampled features and the channel-reduced features are added pixel by pixel; finally, a 3×3 convolution fuses the added feature maps to obtain the current enhanced features. These steps are repeated 3 times to generate a single feature pyramid consisting of 4 feature layers.
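One fusion step of the pyramid described in this claim might be sketched as follows (NumPy stand-ins, all names illustrative: the 1×1 lateral convolution is per-pixel channel mixing, nearest-neighbour upsampling stands in for the bilinear interpolation, and the trailing 3×3 smoothing convolution is omitted):

```python
import numpy as np

def lateral_1x1(feat, w):
    """1x1 convolution = per-pixel channel mixing. feat: (C, H, W), w: (256, C)."""
    c, h, ww = feat.shape
    return (w @ feat.reshape(c, -1)).reshape(256, h, ww)

def upsample2x(feat):
    """2x spatial upsampling (nearest-neighbour stands in for bilinear here)."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def fuse_level(top_down, lateral_feat, w_lateral):
    """One top-down pyramid step: reduce the lateral feature to 256 channels,
    upsample the coarser map by 2x, and add pixel by pixel."""
    return upsample2x(top_down) + lateral_1x1(lateral_feat, w_lateral)
```

Repeating this step 3 times down the backbone outputs, as the claim describes, yields the 4-layer pyramid.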
4. The anti-occlusion and multi-scale dead fish identification system of claim 3, wherein: the multi-stage feature pyramid is formed by connecting a plurality of feature pyramids in series, with the output of the feature pyramid generated at the previous stage serving as the input of the feature pyramid at the next stage; the earlier pyramids express shallow features, the later pyramids express deep features, and each feature pyramid contains rich multi-scale information;
the cross-stage connecting unit transmits the features of the previous stage to the next stage, so that the current features can fully reuse the prior knowledge accumulated earlier, enhancing the feature expression capability;
the multi-scale features are generated as follows: assuming n multi-stage feature pyramids, denoted {P1, P2, …, Pn}, each pyramid consists of 4 feature layers {C1, C2, C3, C4} with strides 4, 8, 16 and 32 respectively; the feature layers with the same stride in each pyramid are accumulated pixel by pixel, outputting the multi-scale features {F1, F2, F3, F4}; where n is set to 2 or a larger value.
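The pixel-wise accumulation of same-stride layers across the n pyramids is straightforward; a minimal NumPy sketch, assuming each pyramid is given as a list of 4 same-shaped arrays (strides 4, 8, 16, 32):

```python
import numpy as np

def accumulate_pyramids(pyramids):
    """pyramids: list of n pyramids, each a list of feature arrays
    ordered by stride. Same-stride layers are summed pixel by pixel,
    producing the multi-scale outputs {F1, ..., F4}."""
    n_levels = len(pyramids[0])
    return [sum(p[k] for p in pyramids) for k in range(n_levels)]
```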
5. The anti-occlusion and multi-scale dead fish identification system of claim 1, wherein: the candidate frame generation comprises an RPN network and an adaptive ROIAlign algorithm unit; firstly, the candidate frames output by the RPN are sorted by confidence so that low-confidence frames are removed before the subsequent NMS operation, improving the post-processing speed of the algorithm; the adaptive ROIAlign algorithm unit adaptively pools and fuses the features of different feature layers: all generated candidate frames are first mapped onto the multi-scale feature layers {F1, F2, F3, F4}, ROIAlign is then performed with 7×7 pooling kernels, and the resulting mapped features are fused.
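The confidence filtering followed by greedy NMS described in this claim can be sketched in plain NumPy on horizontal boxes (thresholds are illustrative):

```python
import numpy as np

def nms(boxes, scores, score_thresh=0.05, iou_thresh=0.5):
    """Drop low-confidence boxes first, then greedy non-maximum suppression.
    boxes: (N, 4) float array as (x1, y1, x2, y2). Returns kept indices."""
    idx = np.nonzero(scores >= score_thresh)[0]          # confidence filter
    order = idx[np.argsort(-scores[idx])]                # sort by confidence
    kept = []
    while order.size:
        i = order[0]
        kept.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(xx2 - xx1, 0) * np.maximum(yy2 - yy1, 0)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thresh]                  # suppress overlaps
    return kept
```

Pre-filtering by confidence shrinks the candidate list before the quadratic NMS loop, which is the speed-up the claim refers to.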
6. The anti-occlusion and multi-scale dead fish identification system of claim 1, wherein: the anti-occlusion module comprises a spatial attention mask generation unit and a feature weighting unit; the spatial attention mask generation unit takes the extracted 7×7×256 candidate-box features as input, maps the feature map to 7×7×1 through two 3×3 convolutions and one 1×1 convolution, and finally outputs a foreground target probability map with values in [0, 1] after a Sigmoid activation function; the feature weighting unit multiplies the probability map and the original input features pixel by pixel and outputs the result as a branch;
and the spatial attention mask generation unit generates training mask labels in a weakly supervised manner from the annotated rectangular boxes.
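The feature weighting unit reduces to a pixel-wise product between the sigmoid probability map and the input features; a minimal sketch, assuming 7×7×256 candidate-box features in channel-first layout (the conv stack producing the logits is not shown):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_weight(roi_feat, mask_logits):
    """roi_feat: (256, 7, 7) candidate-box features; mask_logits: (1, 7, 7)
    output of the mask conv stack. The sigmoid yields a [0, 1] foreground
    probability map that re-weights the features pixel by pixel, suppressing
    occluding background."""
    prob = sigmoid(mask_logits)   # (1, 7, 7), values in [0, 1]
    return roi_feat * prob        # broadcast over the 256 channels
```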
7. The anti-occlusion and multi-scale dead fish identification system of claim 1, wherein: the rotated rectangular frame is represented in the form (x, y, w, h, θ), wherein (x, y) is the center point of the rotated rectangular frame, (w, h) are its width and height, and θ is the angle between the x axis and the side of the rotated rectangle forming an acute angle with it, that side being defined as w; under this convention θ ∈ [−π/2, 0).
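Under this angle convention, an arbitrary (w, h, θ) triple can be normalized into θ ∈ [−π/2, 0) by shifting θ in 90° steps and swapping width and height at each step; a small illustrative helper (the function name is not from the patent):

```python
import math

def canonical_rbox(x, y, w, h, theta):
    """Normalize a rotated box (x, y, w, h, theta) so that theta, measured
    from the x axis to the side labelled w, lies in [-pi/2, 0). Each 90-degree
    shift of theta exchanges the roles of w and h."""
    while theta >= 0:
        theta -= math.pi / 2
        w, h = h, w
    while theta < -math.pi / 2:
        theta += math.pi / 2
        w, h = h, w
    return x, y, w, h, theta
```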
8. The anti-occlusion and multi-scale dead fish identification system of claim 7, wherein: the rotated rectangular frame is regressed taking the horizontal candidate frame as the starting point, with the following encoding:
t_x = (x − x_0) / w_0,  t_y = (y − y_0) / h_0,  t_w = log(w / w_0),  t_h = log(h / h_0),  t_θ = θ − θ_0
t′_x = (x′ − x_0) / w_0,  t′_y = (y′ − y_0) / h_0,  t′_w = log(w′ / w_0),  t′_h = log(h′ / h_0),  t′_θ = θ′ − θ_0
wherein (x_0, y_0, w_0, h_0, θ_0) is the horizontal candidate box with θ_0 = −π/2; x and x′ are the predicted value and the true value of the rotated box respectively, and the other parameters (y, w, h, θ) have the same meaning; the smooth-L1 loss is calculated between the predicted value t and the true value t′.
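A NumPy sketch of this encoding and the smooth-L1 loss, following the standard rotated-anchor form consistent with the parameters named in the claim (an assumption about the exact formulas, since the published text renders them only as images):

```python
import numpy as np

def encode_rbox(box, anchor):
    """box, anchor: tuples (x, y, w, h, theta). The anchor is the horizontal
    candidate with theta0 = -pi/2. Returns regression targets
    (tx, ty, tw, th, t_theta): centers normalized by anchor size, sizes in
    log-ratio, angle as an offset."""
    x, y, w, h, t = box
    xa, ya, wa, ha, ta = anchor
    return np.array([(x - xa) / wa,
                     (y - ya) / ha,
                     np.log(w / wa),
                     np.log(h / ha),
                     t - ta])

def smooth_l1(pred, target, beta=1.0):
    """Smooth-L1: quadratic for small residuals, linear for large ones."""
    d = np.abs(pred - target)
    return np.where(d < beta, 0.5 * d * d / beta, d - 0.5 * beta).sum()
```

Both the predicted and the ground-truth rotated box are encoded against the same horizontal candidate, and the loss compares the two target vectors.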
9. An anti-occlusion and multi-scale dead fish identification method based on deep learning is characterized by comprising the following steps:
step S1, acquiring underwater images and establishing a dead fish identification data set for training and testing;
step S2, performing image preprocessing, inputting the image into the image feature extraction module, and extracting low-level edge features and high-level abstract feature maps of the image;
step S3, the multi-scale feature enhancement module improves the multi-scale expression capability of the features by connecting a plurality of feature pyramids in series, obtaining multi-scale features;
step S4, generating horizontal rectangular region candidate boxes from the multi-scale features, and designing an adaptive ROIAlign algorithm to adaptively extract and fuse the features of region candidate boxes of different scales;
step S5, learning the mask of the foreground target from the region candidate frame features through an attention mechanism, and fusing the mask with the candidate frame features to suppress background interference;
and step S6, taking the horizontal rectangular region candidate frame as the initial starting point, combining the features fused by the anti-occlusion branch with fully connected layers, regressing the rotated rectangular frame representing the dead fish position, and classifying dead fish to complete dead fish target identification.
10. The anti-occlusion and multi-scale dead fish identification method of claim 9, wherein: in step S2, a contrast-limited adaptive histogram equalization algorithm is designed to perform image preprocessing, improving image detail information and local contrast;
in step S3, pyramid layers with the same stride are accumulated to output an enhanced feature pyramid;
in step S4, the ROIAlign algorithm adaptively pools and fuses the features of different feature layers, so that both small and large targets can share low-level and high-level information.
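ROIAlign itself is bilinear sampling at regularly spaced points inside each output bin; a single-channel, one-sample-per-bin sketch (the adaptive variant of the claim would run this over every level {F1, …, F4} and fuse the pooled features):

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Bilinearly interpolate feat (H, W) at the continuous location (y, x)."""
    h, w = feat.shape
    y0f, x0f = int(np.floor(y)), int(np.floor(x))
    ly, lx = y - y0f, x - x0f
    y0 = min(max(y0f, 0), h - 1); y1 = min(max(y0f + 1, 0), h - 1)
    x0 = min(max(x0f, 0), w - 1); x1 = min(max(x0f + 1, 0), w - 1)
    return (feat[y0, x0] * (1 - ly) * (1 - lx) + feat[y0, x1] * (1 - ly) * lx
            + feat[y1, x0] * ly * (1 - lx) + feat[y1, x1] * ly * lx)

def roi_align(feat, box, out_size=7):
    """box = (x1, y1, x2, y2) in feature-map coordinates; samples one
    bilinear point at the center of each of the out_size x out_size bins.
    (Production implementations sample several points per bin and average.)"""
    x1, y1, x2, y2 = box
    bh, bw = (y2 - y1) / out_size, (x2 - x1) / out_size
    out = np.empty((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            out[i, j] = bilinear_sample(feat, y1 + (i + 0.5) * bh,
                                        x1 + (j + 0.5) * bw)
    return out
```

Because sampling is continuous, no coordinate quantization occurs; this is what lets small targets pooled from high-resolution layers and large targets pooled from coarse layers share the same 7×7 output grid.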
CN202110653176.7A 2021-06-11 2021-06-11 Anti-occlusion and multi-scale dead fish identification system and method based on deep learning Active CN113420759B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110653176.7A CN113420759B (en) 2021-06-11 2021-06-11 Anti-occlusion and multi-scale dead fish identification system and method based on deep learning

Publications (2)

Publication Number Publication Date
CN113420759A true CN113420759A (en) 2021-09-21
CN113420759B CN113420759B (en) 2023-04-18

Family

ID=77788414

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110653176.7A Active CN113420759B (en) 2021-06-11 2021-06-11 Anti-occlusion and multi-scale dead fish identification system and method based on deep learning

Country Status (1)

Country Link
CN (1) CN113420759B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109543679A (en) * 2018-11-16 2019-03-29 南京师范大学 A kind of dead fish recognition methods and early warning system based on depth convolutional neural networks
CN109583343A (en) * 2018-11-21 2019-04-05 荆门博谦信息科技有限公司 A kind of fish image processing system and method
US20200394413A1 (en) * 2019-06-17 2020-12-17 The Regents of the University of California, Oakland, CA Athlete style recognition system and method
CN112766274A (en) * 2021-02-01 2021-05-07 长沙市盛唐科技有限公司 Water gauge image water level automatic reading method and system based on Mask RCNN algorithm
CN112926652A (en) * 2021-02-25 2021-06-08 青岛科技大学 Fish fine-grained image identification method based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHUANG YU等: ""Segmentation and measurement scheme for fish morphological features based on Mask R-CNN"", 《INFORMATION PROCESSING IN AGRICULTURE》 *
LINGCAI ZENG等: ""Underwater target detection based on Faster R-CNN and adversarial occlusion network"", 《ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115049952A (en) * 2022-04-24 2022-09-13 南京农业大学 Juvenile fish limb identification method based on multi-scale cascade perception deep learning network
CN115049952B (en) * 2022-04-24 2023-04-07 南京农业大学 Juvenile fish limb identification method based on multi-scale cascade perception deep learning network
CN117115688A (en) * 2023-08-17 2023-11-24 广东海洋大学 Dead fish identification and counting system and method based on deep learning under low-brightness environment

Also Published As

Publication number Publication date
CN113420759B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
CN115797931B (en) Remote sensing image semantic segmentation method and device based on double-branch feature fusion
CN112232349B (en) Model training method, image segmentation method and device
CN109934200B (en) RGB color remote sensing image cloud detection method and system based on improved M-Net
Isikdogan et al. Seeing through the clouds with deepwatermap
CN113378906B (en) Unsupervised domain adaptive remote sensing image semantic segmentation method with feature self-adaptive alignment
CN111797712B (en) Remote sensing image cloud and cloud shadow detection method based on multi-scale feature fusion network
WO2023000159A1 (en) Semi-supervised classification method, apparatus and device for high-resolution remote sensing image, and medium
CN109558806A (en) The detection method and system of high score Remote Sensing Imagery Change
CN113420759B (en) Anti-occlusion and multi-scale dead fish identification system and method based on deep learning
CN110751075A (en) Remote sensing image culture pond detection method based on example segmentation
CN112560865B (en) Semantic segmentation method for point cloud under outdoor large scene
CN113888547A (en) Non-supervision domain self-adaptive remote sensing road semantic segmentation method based on GAN network
CN116258719A (en) Flotation foam image segmentation method and device based on multi-mode data fusion
CN113255837A (en) Improved CenterNet network-based target detection method in industrial environment
CN114220126A (en) Target detection system and acquisition method
CN114943876A (en) Cloud and cloud shadow detection method and device for multi-level semantic fusion and storage medium
CN115410081A (en) Multi-scale aggregated cloud and cloud shadow identification method, system, equipment and storage medium
CN115471746A (en) Ship target identification detection method based on deep learning
CN115713632A (en) Feature extraction method and device based on multi-scale attention mechanism
CN115424059A (en) Remote sensing land use classification method based on pixel level comparison learning
CN116778470A (en) Object recognition and object recognition model training method, device, equipment and medium
CN116246138A (en) Infrared-visible light image target level fusion method based on full convolution neural network
CN113192018B (en) Water-cooled wall surface defect video identification method based on fast segmentation convolutional neural network
CN112560719B (en) High-resolution image water body extraction method based on multi-scale convolution-multi-core pooling
CN111860668A (en) Point cloud identification method of deep convolution network for original 3D point cloud processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant