CN113076871A - Fish shoal automatic detection method based on target shielding compensation - Google Patents

Fish shoal automatic detection method based on target shielding compensation

Info

Publication number
CN113076871A
Authority
CN
China
Prior art keywords
feature
feature map
fish
image
candidate
Prior art date
Legal status
Granted
Application number
CN202110354428.6A
Other languages
Chinese (zh)
Other versions
CN113076871B (en)
Inventor
丁泉龙
杨伟健
曹燕
王一歌
韦岗
Current Assignee
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202110354428.6A
Publication of CN113076871A
Application granted
Publication of CN113076871B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A - TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00 - Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/80 - Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in fisheries management
    • Y02A40/81 - Aquaculture, e.g. of fish

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a fish school automatic detection method based on target occlusion compensation, which comprises the following steps: a camera carried on a multi-rotor unmanned aerial vehicle collects fish school images, which are then labeled and expanded; feature extraction is performed, in which a dual-branch feature extraction network performs multi-level, shallow-to-deep feature extraction on an input fish school image to obtain five feature maps; feature fusion is performed, in which an improved semantic embedding branch fuses the semantic information of each deep feature map into the shallower feature map one level above it, and the detail information of the four-fold downsampled feature map is fused into the eight-fold downsampled feature map; fish targets are then predicted from three feature maps to obtain candidate boxes, overlapping candidate boxes are processed with an improved DIoU_NMS non-maximum suppression algorithm, and the fish school detection result is output. The invention improves the recall rate of fish school detection when fish gather and occlude one another, and thereby improves the average precision of fish school detection.

Description

Fish shoal automatic detection method based on target shielding compensation
Technical Field
The invention relates to the technical field of image target detection, in particular to a fish school automatic detection method based on target shielding compensation.
Background
Modern fish farming depends on systematic management, and fish school detection is of great practical significance for industrialized aquaculture: it can determine whether fish are present and how large they are, and hence whether stocking and feeding are appropriate.
Fish school detection can use either sonar images or optical images. The sonar method applies the ultrasonic principle: a sonar system acquires underwater sonar images of the fish school, from which fish targets are then detected; in a real underwater scene, however, sonar is easily disturbed by other objects. With the development and improvement of underwater photography, optical methods have become practical. An optical method first acquires an optical image of the fish school and then detects and marks the fish with a target detection method. Target detection, a branch of image processing, finds all objects of specified categories in a picture and marks their positions with rectangular boxes. Manually marking fish schools is expensive and inefficient, so for the automation and informatization of the fish farming industry it is important to study automatic fish school detection under the actual underwater conditions of a farm.
With the continuous development of computer technology, using deep learning to automatically detect fish in underwater optical images reduces the time spent finding and marking fish, saving workers' time and improving working efficiency.
The YOLOv4 target detection algorithm is a deep learning algorithm that balances detection speed and detection precision and is widely applied in the field of image target detection. YOLOv4 first feeds a data set into the network for training and saves the trained model weight file; with the saved weights, a test image can then be processed to generate prediction boxes where targets may exist, together with a confidence score for each box. Because it performs well in both speed and precision, the algorithm is suitable for automatic fish school detection and can produce a detection result quickly after a fish school image is captured.
However, real underwater scenes are complex, and captured fish images often contain fish that gather and occlude one another. If the YOLOv4 algorithm is applied directly, it detects occluded targets poorly, misses detections, and yields a relatively low recall rate for fish targets. An underwater fish detection method with a high recall rate is therefore desirable.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provides an automatic fish school detection method based on target occlusion compensation.
The purpose of the invention can be achieved by adopting the following technical scheme:
a fish school automatic detection method based on target shielding compensation comprises the following steps:
S1, collecting fish school images in a pond environment with a camera carried on a multi-rotor unmanned aerial vehicle, and labeling and expanding the collected fish school images;
The underwater fish school images can be acquired by flying the multi-rotor unmanned aerial vehicle to a water area of interest, landing it on the water surface, and then capturing optical image data of the cultured fish school with the onboard camera.
S2, inputting the fish image into a double branch feature extraction network to perform multilevel feature extraction from shallow to deep, wherein the double branch feature extraction network is called as a double branch feature extraction network because a lightweight original information feature extraction network parallel to CSPDarknet53 is added on the basis of a trunk feature extraction network CSPDarknet53 of a YOLOv4 algorithm; after multi-stage feature extraction is carried out by a double-branch feature extraction network, five feature maps are obtained, and the five feature maps are respectively a two-time down-sampling feature map FA1Fourfold down-sampling feature map FA2Eight-fold down-sampling feature map FA3Sixteen-fold down-sampling feature map FA4Thirty-two times downsampling feature map FA5Resolutions of 1/2, 1/4, 1/8, 1/16, 1/32 of the input fish school image, respectively;
S3, using the improved semantic embedding branch (MSEB) to fuse the semantic information of the feature map F_A5 obtained in step S2 into the feature map F_A4, obtaining the feature map F_AM4 at 1/16 of the resolution of the input fish school image; and fusing the semantic information of the feature map F_A4 obtained in step S2 into the feature map F_A3, obtaining the feature map F_AM3 at 1/8 of the resolution of the input fish school image;
S4, fusing, by convolutional downsampling, the detail information of the four-fold downsampled feature map F_A2 obtained in step S2 into the eight-fold downsampled feature map F_AM3 obtained in step S3, obtaining the feature map F_AMC3 at 1/8 of the resolution of the input fish school image;
S5, feeding the feature map F_A5 obtained in step S2, the feature map F_AM4 obtained in step S3 and the feature map F_AMC3 obtained in step S4 into the feature pyramid structure of the YOLOv4 algorithm for feature fusion, obtaining three feature maps F_B3, F_B4 and F_B5; then, after convolution processing, using F_B3, F_B4 and F_B5 to predict fish targets, obtaining overlapping candidate boxes and their prediction confidence scores;
S6, processing the overlapping candidate boxes with the improved DIoU_NMS non-maximum suppression algorithm to obtain prediction boxes with prediction confidence scores, and drawing the prediction boxes on the corresponding picture as the fish school detection result.
Further, in step S1, the fish targets in each collected fish school image are labeled one by one with the labelImg image annotation software; after labeling, each image has an xml tag file containing the annotation information, and the collected images together with their tag files form the original data set. The original data set is then expanded with data enhancement, including vertical flipping, horizontal flipping, brightness changes, added random Gaussian white noise, filtering and affine transformation, to form the final data set and improve the robustness of the network model to environmental changes.
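As an illustrative sketch only (the patent names the transforms but not an implementation), the data-expansion step could be realized with OpenCV; all parameter values below (brightness offset, noise strength, rotation angle) are assumptions, and the xml box annotations would have to be transformed alongside the geometric augmentations:

```python
import cv2
import numpy as np

def augment(image: np.ndarray) -> list[np.ndarray]:
    """Return augmented copies of one fish-school image (boxes handled separately)."""
    out = []
    out.append(cv2.flip(image, 0))                              # vertical flip
    out.append(cv2.flip(image, 1))                              # horizontal flip
    out.append(cv2.convertScaleAbs(image, alpha=1.0, beta=30))  # brightness change
    noise = np.random.normal(0, 10, image.shape).astype(np.float32)
    out.append(np.clip(image.astype(np.float32) + noise, 0, 255).astype(np.uint8))
    out.append(cv2.GaussianBlur(image, (5, 5), 0))              # filtering
    h, w = image.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2, h / 2), 10, 1.0)        # small affine transform
    out.append(cv2.warpAffine(image, m, (w, h)))
    return out
```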
Further, in step S2, the fish school image is input into the dual-branch feature extraction network for multi-level feature extraction from shallow to deep, so that the original features of the input image are extracted and retained more fully, compensating for the loss of fish features when fish occlude one another. The dual-branch feature extraction proceeds as follows:
The backbone feature extraction network CSPDarknet53 comprises one CBM unit and five cross-stage partial network CSPx units. A CBM unit consists of a Convolution layer with stride 1 and a 3 x 3 kernel, a Batch Normalization layer and a Mish activation layer. A CSPx unit fuses several CBM units with x Res unit residual units, each Res unit consisting of a CBM unit with a 1 x 1 kernel, a CBM unit with a 3 x 3 kernel and a residual structure; the two feature maps are spliced along the channel dimension by a Concatenate fusion operation, which expands the channel dimension of the spliced feature map. The Convolution channel counts of the five CSPx units are 64, 128, 256, 512 and 1024 in turn, and each CSPx unit downsamples by a factor of two. The feature maps produced by the five CSPx units are F_C1, F_C2, F_C3, F_C4 and F_C5, with resolutions of 1/2, 1/4, 1/8, 1/16 and 1/32 of the input fish school image, respectively;
The lightweight original-information feature extraction network comprises five CM units, each consisting of a Convolution layer with stride 2 and a 3 x 3 kernel and a MaxPool layer with pooling stride 1 and a 3 x 3 pooling kernel; the stride-2 convolution performs one two-fold downsampling, and the number of convolution channels of each CM unit equals that of the corresponding CSPx unit in the backbone network CSPDarknet53. The feature maps produced by the five CM units are F_L1, F_L2, F_L3, F_L4 and F_L5, with resolutions of 1/2, 1/4, 1/8, 1/16 and 1/32 of the input fish school image, respectively;
Then, during the shallow-to-deep multi-level feature extraction, the feature map F_Li extracted by the lightweight original-information network and the feature map F_Ci extracted by the CSPDarknet53 network (i = 1, 2, 3, 4, 5) undergo an Add fusion operation, i.e., the corresponding pixel values of the two feature maps are added, yielding the finally extracted feature maps F_A1, F_A2, F_A3, F_A4 and F_A5, with resolutions of 1/2, 1/4, 1/8, 1/16 and 1/32 of the input fish school image, respectively.
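A minimal PyTorch sketch of one stage of this dual-branch extraction follows; the CSPx block is stubbed by a single stride-2 convolution and the channel sizes are taken from the text, so this illustrates the CM unit and the Add fusion rather than the full patented network:

```python
import torch
import torch.nn as nn

class CMUnit(nn.Module):
    """CM unit: 3 x 3 convolution with stride 2, then 3 x 3 max pooling with stride 1."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, 3, stride=2, padding=1)
        self.pool = nn.MaxPool2d(3, stride=1, padding=1)

    def forward(self, x):
        return self.pool(self.conv(x))

class DualBranchStage(nn.Module):
    """One of five stages: backbone output F_Ci plus lightweight output F_Li."""
    def __init__(self, backbone_stage, c_in, c_out):
        super().__init__()
        self.backbone = backbone_stage        # stand-in for a CSPx block
        self.cm = CMUnit(c_in, c_out)

    def forward(self, x_backbone, x_light):
        f_c = self.backbone(x_backbone)       # F_Ci
        f_l = self.cm(x_light)                # F_Li
        f_a = f_c + f_l                       # Add fusion: pixel-wise sum -> F_Ai
        return f_c, f_l, f_a                  # the two branches continue unfused

# stand-in CSPx: a single stride-2 convolution with the stage's channel width
stage1 = DualBranchStage(nn.Conv2d(3, 64, 3, stride=2, padding=1), 3, 64)
x = torch.randn(1, 3, 416, 416)
f_c1, f_l1, f_a1 = stage1(x, x)               # f_a1: (1, 64, 208, 208)
```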
Furthermore, in the shallow-to-deep multi-level extraction, shallow feature maps carry rich detail information but lack semantic information, so targets cannot be identified and detected well from them; conversely, deep layers extract semantic information well but lose much detail and cannot predict position effectively. Therefore, in step S3, the improved semantic embedding branch fuses the semantic information of a deep feature map into the shallower feature map one level above it, compensating for the shallow map's lack of semantic information and thereby improving the recall rate of fish targets during detection. The fusion with the improved semantic embedding branch proceeds as follows:
First, the deep feature map F_A5 obtained in step S2 is processed by a convolution layer with a 1 x 1 kernel and a convolution layer with a 3 x 3 kernel to fuse features of different scales, passed through a Sigmoid function, upsampled two-fold by nearest-neighbor interpolation, and multiplied pixel-wise with the shallow feature map F_A4 obtained in step S2, yielding the feature map F_AM4 at 1/16 of the resolution of the input fish school image; the semantic information of the deep feature map F_A5 is thus fused into the shallow feature map F_A4, compensating for the insufficient semantic information of F_A4;
Then the improved semantic embedding branch is likewise used to fuse the semantic information of the feature map F_A4 obtained in step S2 into the shallow feature map F_A3, yielding the feature map F_AM3 at 1/8 of the resolution of the input fish school image and compensating for the insufficient semantic information of F_A3;
The Sigmoid function used in the improved semantic embedding branch has the form

Sigmoid(i) = 1 / (1 + e^(-i))    formula (1)

where i is the input and e is the natural constant.
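The MSEB wiring described above can be sketched in PyTorch as follows, assuming the 1 x 1 and 3 x 3 convolutions act in sequence on the deep map (fig. 3 fixes the exact wiring, which this sketch only approximates):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MSEB(nn.Module):
    """Semantic gate from a deep map, applied multiplicatively to a shallow map."""
    def __init__(self, c_deep, c_shallow):
        super().__init__()
        self.conv1 = nn.Conv2d(c_deep, c_shallow, 1)                # 1 x 1 conv
        self.conv3 = nn.Conv2d(c_shallow, c_shallow, 3, padding=1)  # 3 x 3 conv

    def forward(self, f_deep, f_shallow):
        g = torch.sigmoid(self.conv3(self.conv1(f_deep)))           # formula (1)
        g = F.interpolate(g, scale_factor=2.0, mode="nearest")      # two-fold upsampling
        return f_shallow * g                                        # pixel-wise product

mseb = MSEB(c_deep=1024, c_shallow=512)
f_a5 = torch.randn(1, 1024, 13, 13)   # 1/32-resolution map for a 416 x 416 input
f_a4 = torch.randn(1, 512, 26, 26)    # 1/16-resolution map
f_am4 = mseb(f_a5, f_a4)              # semantics of F_A5 embedded into F_A4
```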
Further, in step S4, the detail information of the four-fold downsampled feature map is fused into the eight-fold downsampled feature map by convolutional downsampling, making full use of that detail information to improve the localization of fish edge contours when fish occlude one another. The fusion proceeds as follows:
First, the four-fold downsampled feature map F_A2 obtained in step S2 is processed by a CBL unit, where a CBL unit consists of a Convolution layer with stride 1 and a 3 x 3 kernel, a Batch Normalization layer and a LeakyReLU activation layer; it is then downsampled two-fold by a Convolution layer with stride 2 and a 3 x 3 kernel, and fused by a Concatenate operation with the feature map F_AM3 obtained in step S3 after CBL processing, obtaining the feature map F_AMC3 at 1/8 of the resolution of the input fish school image, thereby exploiting the detail information of the four-fold downsampled feature map F_A2.
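A hedged PyTorch sketch of this detail-compensation fusion follows; the channel counts (128 for F_A2, 256 for F_AM3) are assumptions consistent with the backbone widths given earlier:

```python
import torch
import torch.nn as nn

def cbl(c_in, c_out):
    """CBL unit: 3 x 3 convolution (stride 1) + Batch Normalization + LeakyReLU."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, stride=1, padding=1, bias=False),
        nn.BatchNorm2d(c_out),
        nn.LeakyReLU(0.1, inplace=True),
    )

class DetailFusion(nn.Module):
    def __init__(self, c_a2=128, c_am3=256):
        super().__init__()
        self.pre = cbl(c_a2, c_a2)                                  # CBL on F_A2
        self.down = nn.Conv2d(c_a2, c_a2, 3, stride=2, padding=1)   # two-fold downsampling
        self.post = cbl(c_am3, c_am3)                               # CBL on F_AM3

    def forward(self, f_a2, f_am3):
        d = self.down(self.pre(f_a2))               # 1/4 resolution -> 1/8
        return torch.cat([d, self.post(f_am3)], 1)  # Concatenate fusion -> F_AMC3

fuse = DetailFusion()
f_amc3 = fuse(torch.randn(1, 128, 104, 104), torch.randn(1, 256, 52, 52))
```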
Further, the step S5 process is as follows:
First, the feature map F_A5 obtained in step S2, the feature map F_AM4 obtained in step S3 and the feature map F_AMC3 obtained in step S4 are fed into the feature pyramid structure of the YOLOv4 algorithm for feature fusion, obtaining three feature maps F_B3, F_B4 and F_B5. The feature pyramid structure of YOLOv4 comprises a spatial pyramid pooling layer (SPP) and a path aggregation network (PANet): in the SPP structure, the feature map F_A5, after three CBL units, passes through four maximum pooling layers with 1 x 1, 5 x 5, 9 x 9 and 13 x 13 pooling kernels whose outputs undergo a Concatenate fusion operation; the PANet structure repeatedly fuses features along bottom-up and top-down paths. The three feature maps F_B3, F_B4 and F_B5 are then each processed by a CBL unit and a convolution layer with a 1 x 1 kernel, obtaining three prediction feature maps of different sizes, Prediction1, Prediction2 and Prediction3, with resolutions of 1/8, 1/16 and 1/32 of the input fish school image; these three prediction feature maps are used to predict fish targets, obtaining overlapping candidate boxes and their prediction confidence scores.
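The SPP block described here admits a direct sketch; pooling strides of 1 with same-padding keep the spatial size so the four branches can be concatenated:

```python
import torch
import torch.nn as nn

class SPP(nn.Module):
    """Four parallel max-pool branches (1, 5, 9, 13 kernels), concatenated on channels."""
    def __init__(self):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.MaxPool2d(k, stride=1, padding=k // 2) for k in (1, 5, 9, 13)
        )

    def forward(self, x):  # x: F_A5 after three CBL units
        return torch.cat([p(x) for p in self.pools], dim=1)  # channels x4

spp = SPP()
y = spp(torch.randn(1, 512, 13, 13))   # -> (1, 2048, 13, 13)
```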
Further, in step S6, the overlapping candidate boxes are processed with the improved DIoU_NMS non-maximum suppression algorithm, which compensates for missed detections of occluded targets and further improves the recall rate of occluded fish. The specific procedure is as follows:
S601, traversing all candidate boxes in an image, judging the prediction confidence score of each in turn, keeping the candidate boxes whose scores exceed the confidence threshold together with their scores, and deleting the candidate boxes whose scores fall below it;
S602, selecting the candidate box M with the highest prediction confidence score among the remaining candidate boxes, then traversing the other candidate boxes B_i in turn and computing their distance intersection-over-union Distance-IoU (DIoU for short) with M. If the DIoU between some candidate box B_i and M is not lower than a given threshold ε, the two boxes are considered highly overlapped; directly deleting B_i, as the original DIoU_NMS algorithm does, easily causes missed detections when gathered fish occlude one another, so the improved DIoU_NMS algorithm instead reduces the prediction confidence score of B_i, and then moves the candidate box M into the final prediction box set G. The score-reduction criterion is:
S'_i = S_i, if DIoU(M, B_i) < ε; S'_i = S_i * (1 - DIoU(M, B_i)), if DIoU(M, B_i) >= ε    formula (2)

where M is the candidate box with the highest current prediction confidence score, B_i is another candidate box being traversed, ρ(M, B_i) is the distance between the center points of M and B_i, c is the diagonal length of the minimum bounding rectangle containing M and B_i, DIoU(M, B_i) is the distance intersection-over-union of M and B_i, ε is the given DIoU threshold, S_i is the prediction confidence score of candidate box B_i, and S'_i is its reduced prediction confidence score;
and S603, repeatedly executing the step S602 until all the candidate frames are processed, and drawing the final prediction frame set G on the corresponding picture as an output result to obtain a fish school detection result.
Further, in step S602, DIoU adds to the intersection-over-union IoU a penalty factor that accounts for the distance between the center points of the two candidate boxes; DIoU(M, B_i) is calculated as follows:
DIoU(M, B_i) = IoU(M, B_i) - ρ²(M, B_i) / c²    formula (3)

where M is the candidate box with the highest current prediction confidence score, B_i is another candidate box being traversed, ρ(M, B_i) is the distance between the center points of M and B_i, c is the diagonal length of the minimum bounding rectangle containing M and B_i, and IoU(M, B_i) is the ratio of the intersection to the union of M and B_i.
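Steps S601-S603 and formulas (2)-(3) can be sketched in NumPy as below; the (x1, y1, x2, y2) box format, the threshold values and the linear score decay of the reconstructed formula (2) are all assumptions:

```python
import numpy as np

def diou(m, boxes):
    """DIoU between box m and an (N, 4) array of boxes, all in (x1, y1, x2, y2) form."""
    x1 = np.maximum(m[0], boxes[:, 0]); y1 = np.maximum(m[1], boxes[:, 1])
    x2 = np.minimum(m[2], boxes[:, 2]); y2 = np.minimum(m[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_m = (m[2] - m[0]) * (m[3] - m[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    iou = inter / (area_m + areas - inter)
    # rho^2: squared centre distance; c^2: squared diagonal of the enclosing rectangle
    rho2 = (((m[0] + m[2]) - (boxes[:, 0] + boxes[:, 2])) ** 2
            + ((m[1] + m[3]) - (boxes[:, 1] + boxes[:, 3])) ** 2) / 4.0
    cx1 = np.minimum(m[0], boxes[:, 0]); cy1 = np.minimum(m[1], boxes[:, 1])
    cx2 = np.maximum(m[2], boxes[:, 2]); cy2 = np.maximum(m[3], boxes[:, 3])
    c2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2
    return iou - rho2 / c2                          # formula (3)

def improved_diou_nms(boxes, scores, conf_thr=0.25, eps=0.45):
    keep = scores > conf_thr                        # S601: confidence filtering
    boxes, scores = boxes[keep], scores[keep].astype(float)
    final_boxes, final_scores = [], []
    while scores.size:
        i = int(np.argmax(scores))                  # S602: box M with the top score
        m, s = boxes[i], scores[i]
        final_boxes.append(m); final_scores.append(s)
        boxes = np.delete(boxes, i, axis=0)
        scores = np.delete(scores, i)
        if scores.size:
            d = diou(m, boxes)
            decay = d >= eps
            scores[decay] *= 1.0 - d[decay]         # formula (2): reduce, don't delete
    # S603: boxes whose decayed score fell below the threshold are dropped at the end
    return [(b, s) for b, s in zip(final_boxes, final_scores) if s > conf_thr]
```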
Compared with the prior art, the invention has the following advantages and effects:
(1) In image feature extraction, the dual-branch feature extraction network extracts features of the input fish school image, compensating for the loss of fish features under occlusion and extracting the original features of the fish more fully.
(2) The invention uses the improved semantic embedding branch MSEB to fuse the semantic information of deep feature maps into the feature maps one level above, compensating for the insufficient semantic information of shallow feature maps and further improving the recall rate of fish targets.
(3) The invention fuses the detail information of the four-fold downsampled feature map into the eight-fold downsampled feature map, fully exploiting that detail to capture fish edge-contour information, so the edge contours of occluded fish are localized more accurately.
(4) The invention processes overlapping candidate boxes with the improved DIoU_NMS non-maximum suppression algorithm, compensating for missed detections of occluded targets; deleting duplicate candidate boxes is balanced against missing true boxes, further improving the recall rate of occluded fish.
Drawings
FIG. 1 is a flow chart of a fish shoal automatic detection method based on target occlusion compensation disclosed by the invention;
fig. 2 is a network structure diagram of a fish school automatic detection method based on target occlusion compensation in an embodiment of the present invention, where Concat represents Concatenate fusion operation;
fig. 3 is a block diagram of an improved semantic embedding branch MSEB in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
This embodiment, following the flow chart of fig. 1 and the network structure of fig. 2, provides a fish school automatic detection method based on target occlusion compensation for automatically detecting underwater fish school targets. The specific flow is as follows:
S1, flying the multi-rotor unmanned aerial vehicle to a water area of interest and landing it on the water surface, then capturing image data of the cultured fish school with the onboard camera facing forward, with an interval of 5 seconds between shots and an original image resolution of 1920 x 1080; the collected fish school images are then labeled and expanded to obtain the training data set;
S2, inputting the fish school image into a dual-branch feature extraction network for multi-level feature extraction from shallow to deep; the network is called a dual-branch feature extraction network because a lightweight original-information feature extraction network parallel to the backbone feature extraction network CSPDarknet53 is added on the basis of the backbone feature extraction network CSPDarknet53 of the YOLOv4 algorithm. After multi-level feature extraction, five feature maps are obtained: the two-fold downsampled feature map F_A1, four-fold downsampled feature map F_A2, eight-fold downsampled feature map F_A3, sixteen-fold downsampled feature map F_A4 and thirty-two-fold downsampled feature map F_A5, with resolutions of 1/2, 1/4, 1/8, 1/16 and 1/32 of the input fish school image, respectively;
S3, using the improved semantic embedding branch (MSEB) to fuse the semantic information of the feature map F_A5 obtained in step S2 into the feature map F_A4, obtaining the feature map F_AM4 at 1/16 of the resolution of the input fish school image; and fusing the semantic information of the feature map F_A4 obtained in step S2 into the feature map F_A3, obtaining the feature map F_AM3 at 1/8 of the resolution of the input fish school image;
S4, fusing, by convolutional downsampling, the detail information of the four-fold downsampled feature map F_A2 obtained in step S2 into the eight-fold downsampled feature map F_AM3 obtained in step S3, obtaining the feature map F_AMC3 at 1/8 of the resolution of the input fish school image;
S5, feeding the feature map F_A5 obtained in step S2, the feature map F_AM4 obtained in step S3 and the feature map F_AMC3 obtained in step S4 into the feature pyramid structure of the YOLOv4 algorithm for feature fusion, obtaining three feature maps F_B3, F_B4 and F_B5; then, after convolution processing, using F_B3, F_B4 and F_B5 to predict fish targets, obtaining overlapping candidate boxes and their prediction confidence scores;
S6, processing the overlapping candidate boxes with the improved DIoU_NMS non-maximum suppression algorithm to obtain prediction boxes with prediction confidence scores, and drawing the prediction boxes on the corresponding picture as the fish school detection result.
In this embodiment, step S1 manually labels the fish bodies in the collected fish images one by one with rectangular boxes using the labelImg labeling software, producing the corresponding xml tag files that record the coordinates and category of each target; the collected fish images and their tag files are then expanded with data enhancement, including vertical flipping, horizontal flipping, brightness changes, added random Gaussian white noise, filtering and affine transformation, to form the final data set and improve the robustness of the network model to environmental changes.
In this embodiment, in step S2, the fish school image is input into the dual-branch feature extraction network for multi-level feature extraction from shallow to deep; 208 in fig. 2 shows the specific structure of the dual-branch network, in which a lightweight original-information feature extraction network parallel to CSPDarknet53 is added to the backbone feature extraction network CSPDarknet53 of the YOLOv4 algorithm. The structure is described as follows:
The backbone feature extraction network CSPDarknet53 comprises one CBM unit and five cross-stage partial network CSPx units. A CBM unit consists of a Convolution layer with stride 1 and a 3 x 3 kernel, a Batch Normalization layer and a Mish activation layer; 201 in fig. 2 shows the structure of a CBM unit. A CSPx unit fuses several CBM units with x Res unit residual units; 204 in fig. 2 shows the structure of a CSPx unit. Each Res unit in a CSPx unit consists of a CBM unit with a 1 x 1 kernel, a CBM unit with a 3 x 3 kernel and a residual structure; 203 in fig. 2 shows the structure of a Res unit. The two feature maps are spliced along the channel dimension by a Concatenate fusion operation, which expands the channel dimension. The convolution channel counts of the five CSPx units are 64, 128, 256, 512 and 1024 in turn, and each CSPx unit downsamples by a factor of two. The feature maps produced by the five CSPx units are F_C1, F_C2, F_C3, F_C4 and F_C5, with resolutions of 1/2, 1/4, 1/8, 1/16 and 1/32 of the input fish school image, respectively;
The lightweight original-information feature extraction network comprises five CM units, each consisting of a Convolution layer with stride 2 and a 3 x 3 kernel and a MaxPool layer with pooling stride 1 and a 3 x 3 pooling kernel; 205 in fig. 2 shows the structure of a CM unit. The stride-2 convolution performs one two-fold downsampling, and the number of convolution channels of each CM unit equals that of the corresponding CSPx unit in the backbone network CSPDarknet53. The feature maps produced by the five CM units are F_L1, F_L2, F_L3, F_L4 and F_L5, with resolutions of 1/2, 1/4, 1/8, 1/16 and 1/32 of the input fish school image, respectively;
Then, during the shallow-to-deep multi-level feature extraction, the feature map F_Li (i = 1, 2, 3, 4, 5) extracted by the lightweight original-information network and the corresponding feature map F_Ci (i = 1, 2, 3, 4, 5) extracted by the CSPDarknet53 network undergo an Add fusion operation, i.e., the corresponding pixel values of the two feature maps are added, yielding the finally extracted feature maps F_A1, F_A2, F_A3, F_A4 and F_A5, with resolutions of 1/2, 1/4, 1/8, 1/16 and 1/32 of the input fish school image, respectively.
In this embodiment, in step S3, the improved semantic embedding branch MSEB fuses the semantic information of the deep feature maps into the shallower feature maps one level above; fig. 3 shows the specific structure of the MSEB. The fusion with the MSEB proceeds as follows: first, the deep feature map F_A5 obtained in step S2 is processed by a convolution layer with a 1 x 1 kernel and a convolution layer with a 3 x 3 kernel to fuse features of different scales, passed through a Sigmoid function, upsampled two-fold by nearest-neighbor interpolation, and multiplied pixel-wise with the shallow feature map F_A4 obtained in step S2, yielding the feature map F_AM4 at 1/16 of the resolution of the input fish school image; the semantic information of the deep feature map F_A5 is thus fused into the shallow feature map F_A4, compensating for the insufficient semantic information of F_A4;
Then the MSEB is likewise used to fuse the semantic information of the feature map F_A4 obtained in step S2 into the shallow feature map F_A3, yielding the feature map F_AM3 at 1/8 of the resolution of the input fish school image and compensating for the insufficient semantic information of F_A3;
The Sigmoid function used in the improved semantic embedding branch MSEB has the form

Sigmoid(i) = 1 / (1 + e^(-i))    formula (1)

where i is the input and e is the natural constant.
In this embodiment, the implementation process of step S4 is as follows:
First, the four-fold downsampled feature map F_A2 obtained in step S2 is processed by a CBL unit, where a CBL unit consists of a Convolution layer with stride 1 and a 3 x 3 kernel, a Batch Normalization layer and a LeakyReLU activation layer; 202 in fig. 2 shows the structure of a CBL unit. The result is then downsampled two-fold by a Convolution layer with stride 2 and a 3 x 3 kernel, and fused by a Concatenate operation with the feature map F_AM3 obtained in step S3 after CBL processing, obtaining the feature map F_AMC3 at 1/8 of the resolution of the input fish school image. The detail information of the four-fold downsampled feature map F_A2 is thus used to fully capture fish edge-contour information and compensate the localization of fish edge contours under fish school occlusion.
In this embodiment, the implementation process of step S5 is as follows:
First, the feature map F_A5 obtained in step S2, the feature map F_AM4 obtained in step S3 and the feature map F_AMC3 obtained in step S4 are fed into the feature pyramid structure of the YOLOv4 algorithm for feature fusion, obtaining three feature maps F_B3, F_B4 and F_B5. The feature pyramid structure of YOLOv4 comprises a spatial pyramid pooling layer (SPP) and a path aggregation network (PANet). In the SPP structure, the feature map F_A5, after three CBL units, passes through four maximum pooling layers with 1 x 1, 5 x 5, 9 x 9 and 13 x 13 pooling kernels whose outputs undergo a Concatenate fusion operation; 206 in fig. 2 shows the structure of the SPP. The PANet structure repeatedly fuses features along bottom-up and top-down paths; 207 in fig. 2 shows the structure of the PANet. The three feature maps F_B3, F_B4 and F_B5 are then each processed by a CBL unit and a convolution layer with a 1 x 1 kernel, obtaining three prediction feature maps of different sizes, Prediction1, Prediction2 and Prediction3, with resolutions of 1/8, 1/16 and 1/32 of the input fish school image; these three prediction feature maps are used to predict fish targets, obtaining overlapping candidate boxes and their prediction confidence scores.
In this embodiment, the implementation process of step S6 is as follows:
S601, traversing all candidate boxes in an image, judging the prediction confidence score of each in turn, keeping the candidate boxes whose scores exceed the confidence threshold together with their scores, and deleting the candidate boxes whose scores fall below it;
S602, selecting the candidate box M with the highest prediction confidence score among the remaining candidate boxes, then traversing the other candidate boxes B_i in turn and computing their distance intersection-over-union Distance-IoU (DIoU for short) with M. If the DIoU between some candidate box B_i and M is not lower than a given threshold ε, the two boxes are considered highly overlapped; candidate box B_i is not deleted directly, but its prediction confidence score is reduced, and candidate box M is then moved into the final prediction box set G. The score-reduction criterion is:
S'_i = S_i, if DIoU(M, B_i) < ε; S'_i = S_i * (1 - DIoU(M, B_i)), if DIoU(M, B_i) >= ε    formula (2)

where M is the candidate box with the highest current prediction confidence score, B_i is another candidate box being traversed, ρ(M, B_i) is the distance between the center points of M and B_i, c is the diagonal length of the minimum bounding rectangle containing M and B_i, DIoU(M, B_i) is the distance intersection-over-union of M and B_i, ε is the given DIoU threshold, S_i is the prediction confidence score of candidate box B_i, and S'_i is its reduced prediction confidence score;
and S603, repeatedly executing the step S602 until all the candidate frames are processed, and drawing the final prediction frame set G on the corresponding picture as an output result to obtain a fish school detection result.
The DIoU in step S602 adds to the intersection-over-union IoU a penalty factor that accounts for the distance between the center points of the two candidate boxes; it is calculated as follows:
DIoU(M, B_i) = IoU(M, B_i) - ρ²(M, B_i) / c²    formula (3)

where M is the candidate box with the highest current prediction confidence score, B_i is another candidate box being traversed, ρ(M, B_i) is the distance between the center points of M and B_i, c is the diagonal length of the minimum bounding rectangle containing M and B_i, and IoU(M, B_i) is the ratio of the intersection to the union of M and B_i.
In this embodiment, the prediction boxes must be adjusted continuously during training to approach the real boxes of the targets to be detected, so before training a K-means clustering algorithm is run on the fish school image data set to obtain 9 prior boxes of different sizes suited to the collected data; the three prediction feature maps Prediction1, Prediction2 and Prediction3 are each assigned 3 prior boxes of different sizes. The K-means clustering algorithm measures how close two boxes are using the intersection-over-union IoU as the index; the distance between two boxes is computed as:
distance(box, center) = 1 - IoU(box, center)    formula (4)
where box is the candidate box under consideration, center is the cluster-center box, and IoU(box, center) is the intersection-over-union of the two.
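A sketch of this anchor clustering, using the 1 - IoU distance of formula (4) over (width, height) pairs with the stated 9 clusters, could read as follows; the iteration count and initialization are assumptions:

```python
import numpy as np

def iou_wh(boxes, centroids):
    """IoU between (w, h) boxes and (w, h) centroids, both anchored at the origin."""
    w = np.minimum(boxes[:, None, 0], centroids[None, :, 0])
    h = np.minimum(boxes[:, None, 1], centroids[None, :, 1])
    inter = w * h
    union = (boxes[:, 0] * boxes[:, 1])[:, None] \
          + (centroids[:, 0] * centroids[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        # formula (4): distance(box, center) = 1 - IoU(box, center)
        assign = np.argmin(1.0 - iou_wh(boxes, centroids), axis=1)
        new = np.array([boxes[assign == j].mean(axis=0) if np.any(assign == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids[np.argsort(centroids[:, 0] * centroids[:, 1])]  # sorted by area
```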
In this embodiment, training uses an initial learning rate of 0.0002 and 45 training epochs; 8 images are randomly selected for each batch, and an Adam optimizer accelerates convergence of the network model. To reduce GPU memory overhead, each training image is resized to 416 x 416.
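These settings correspond roughly to the following PyTorch training skeleton; the model, dataset and loss are stand-ins (the real ones are the modified YOLOv4 network, the labeled fish school data set and the loss of formula (5) below), so only the hyperparameters are taken from the text:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

model = torch.nn.Conv2d(3, 16, 3, padding=1)   # stand-in for the detection network
dataset = TensorDataset(torch.randn(32, 3, 416, 416),      # images resized to 416 x 416
                        torch.randn(32, 16, 416, 416))     # stand-in targets
loader = DataLoader(dataset, batch_size=8, shuffle=True)   # batch size 8
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)  # initial lr 0.0002

for epoch in range(45):                                    # 45 epochs
    for images, targets in loader:
        loss = torch.nn.functional.mse_loss(model(images), targets)  # stand-in loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```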
In this embodiment, the loss function loss consists of three parts: the regression-box prediction error L_loc, the confidence error L_conf and the classification error L_cls:

loss = L_loc + L_conf + L_cls    formula (5)

with

L_loc = Σ_{i=0..S²} Σ_{j=0..M} 1_ij^obj · [1 - IoU(P, T) + ρ²(P_ctr, T_ctr)/d² + α·v]

v = (4/π²) · (arctan(w_gt/h_gt) - arctan(w/h))²    formula (6)

L_conf = - Σ_{i=0..S²} Σ_{j=0..M} 1_ij^obj · [C*_i·log(C_i) + (1 - C*_i)·log(1 - C_i)] - λ_noobj · Σ_{i=0..S²} Σ_{j=0..M} 1_ij^noobj · [C*_i·log(C_i) + (1 - C*_i)·log(1 - C_i)]

L_cls = - Σ_{i=0..S²} 1_i^obj · Σ_{c∈classes} [P*_i(c)·log(P_i(c)) + (1 - P*_i(c))·log(1 - P_i(c))]

The specific calculation of v in formula (5) is given by formula (6). IoU(P, T) is the intersection-over-union of the prediction box P and the real box T; ρ(P_ctr, T_ctr) is the distance between their center points; d is the diagonal length of the minimum bounding rectangle containing the prediction box and the real box; w_gt and h_gt are the width and height of the real box, and w and h the width and height of the prediction box; α = v / (1 - IoU(P, T) + v). The image is divided into S x S grids and M is the number of prior boxes (anchors) generated per grid; 1_ij^obj indicates that the prediction box contains an object to be detected, and 1_ij^noobj that it does not; C_i is the prediction confidence of the corresponding prior box and C*_i the actual confidence; λ_noobj is a set weight coefficient; c is the category of the object to be detected; P*_i(c) is the actual probability that the target in the corresponding grid belongs to category c, and P_i(c) the predicted probability.
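The localization term of the reconstructed formula (5), the CIoU loss, can be sketched for a single box pair as follows; this is a standard CIoU implementation under the definitions above, not code from the patent:

```python
import math

def ciou_loss(p, t):
    """CIoU localization loss for prediction box p and real box t, (x1, y1, x2, y2)."""
    ix1, iy1 = max(p[0], t[0]), max(p[1], t[1])
    ix2, iy2 = min(p[2], t[2]), min(p[3], t[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (p[2] - p[0]) * (p[3] - p[1])
    area_t = (t[2] - t[0]) * (t[3] - t[1])
    iou = inter / (area_p + area_t - inter)
    # centre-point distance over the enclosing-box diagonal
    rho2 = ((p[0] + p[2] - t[0] - t[2]) ** 2 + (p[1] + p[3] - t[1] - t[3]) ** 2) / 4
    cx1, cy1 = min(p[0], t[0]), min(p[1], t[1])
    cx2, cy2 = max(p[2], t[2]), max(p[3], t[3])
    d2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2
    w, h = p[2] - p[0], p[3] - p[1]
    wgt, hgt = t[2] - t[0], t[3] - t[1]
    v = 4 / math.pi ** 2 * (math.atan(wgt / hgt) - math.atan(w / h)) ** 2  # formula (6)
    alpha = v / (1 - iou + v)
    return 1 - iou + rho2 / d2 + alpha * v

print(ciou_loss((0, 0, 10, 10), (1, 1, 9, 9)))   # small loss for well-aligned boxes
```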
In this embodiment, after the relevant parameters are set, the network is trained on the fish school data set; the loss curve obtained during training falls quickly at first and finally converges. The trained fish school target detection model weights are saved; with the saved weight file, a test fish school image can then be input and fish targets detected: prediction boxes are generated where targets may exist, a prediction confidence score is given for each, and the image is output with its prediction boxes and scores.
The above embodiment is a preferred embodiment of the present invention, but the present invention is not limited to it; any change, modification, substitution, combination or simplification that does not depart from the spirit and principle of the present invention is an equivalent replacement and falls within the protection scope of the present invention.

Claims (8)

1. A fish school automatic detection method based on target shielding compensation is characterized by comprising the following steps:
S1, collecting fish school images in a pond environment with a camera carried on a multi-rotor unmanned aerial vehicle, and labeling and expanding the collected fish school images;
S2, inputting the fish school image into a dual-branch feature extraction network for multi-level feature extraction from shallow to deep, the dual-branch feature extraction network being formed by adding, to the backbone feature extraction network CSPDarknet53 of the YOLOv4 algorithm, a lightweight original-information feature extraction network parallel to CSPDarknet53; after multi-level feature extraction, five feature maps are obtained: the two-fold downsampled feature map F_A1, four-fold downsampled feature map F_A2, eight-fold downsampled feature map F_A3, sixteen-fold downsampled feature map F_A4 and thirty-two-fold downsampled feature map F_A5, with resolutions of 1/2, 1/4, 1/8, 1/16 and 1/32 of the input fish school image, respectively;
S3, using the improved semantic embedding branch to fuse the semantic information of the feature map F_A5 obtained in step S2 into the feature map F_A4, obtaining the feature map F_AM4 at 1/16 of the resolution of the input fish school image; and fusing the semantic information of the feature map F_A4 obtained in step S2 into the feature map F_A3, obtaining the feature map F_AM3 at 1/8 of the resolution of the input fish school image;
S4, fusing, by convolutional downsampling, the detail information of the four-fold downsampled feature map F_A2 obtained in step S2 into the eight-fold downsampled feature map F_AM3 obtained in step S3, obtaining the feature map F_AMC3 at 1/8 of the resolution of the input fish school image;
S5, feeding the feature map F_A5 obtained in step S2, the feature map F_AM4 obtained in step S3 and the feature map F_AMC3 obtained in step S4 into the feature pyramid structure of the YOLOv4 algorithm for feature fusion, obtaining three feature maps F_B3, F_B4 and F_B5; then, after convolution processing, using F_B3, F_B4 and F_B5 to predict fish targets, obtaining overlapping candidate boxes and their prediction confidence scores;
S6, processing the overlapping candidate boxes with the improved DIoU_NMS non-maximum suppression algorithm to obtain prediction boxes with prediction confidence scores, and drawing the prediction boxes on the corresponding picture as the fish school detection result.
2. The method according to claim 1, wherein in step S1, labelImg image labeling software is used to label the fish targets in each collected fish image one by one; after labeling, each image has an xml tag file containing the labeling information, and the collected fish images and their corresponding tag files construct an original data set; the original data set is then expanded with data enhancement, including vertical flipping, horizontal flipping, brightness change, random Gaussian white noise addition, filtering and affine transformation, to form a final data set.
3. The method of claim 1, wherein the backbone feature extraction network CSPDarknet53 comprises one CBM unit and five cross-stage partial network CSPx units; a CBM unit consists of a Convolution layer with stride 1 and a 3 x 3 kernel, a Batch Normalization layer and a Mish activation layer; a CSPx unit fuses several CBM units with x Res unit residual units, each Res unit consisting of a CBM unit with a 1 x 1 kernel, a CBM unit with a 3 x 3 kernel and a residual structure; the two feature maps are spliced along the channel dimension by a Concatenate fusion operation, which expands the channel dimension of the spliced feature map; the Convolution channel counts of the five CSPx units are 64, 128, 256, 512 and 1024 in turn, and each CSPx unit downsamples by a factor of two; the feature maps produced by the five CSPx units are F_C1, F_C2, F_C3, F_C4 and F_C5, with resolutions of 1/2, 1/4, 1/8, 1/16 and 1/32 of the input fish school image, respectively;
the lightweight original-information feature extraction network comprises five CM units, each consisting of a Convolution layer with stride 2 and a 3 x 3 kernel and a MaxPool layer with pooling stride 1 and a 3 x 3 pooling kernel; the stride-2 convolution performs one two-fold downsampling, and the number of convolution channels of each CM unit equals that of the corresponding CSPx unit in the backbone network CSPDarknet53; the feature maps produced by the five CM units are F_L1, F_L2, F_L3, F_L4 and F_L5, with resolutions of 1/2, 1/4, 1/8, 1/16 and 1/32 of the input fish school image, respectively;
then, during the shallow-to-deep multi-level feature extraction, the feature map F_Li extracted by the lightweight original-information network and the feature map F_Ci extracted by the CSPDarknet53 network (i = 1, 2, 3, 4, 5) undergo an Add fusion operation, i.e., the corresponding pixel values of the two feature maps are added, yielding the finally extracted feature maps F_A1, F_A2, F_A3, F_A4 and F_A5, with resolutions of 1/2, 1/4, 1/8, 1/16 and 1/32 of the input fish school image, respectively.
4. The method for automatically detecting fish school based on target occlusion compensation as claimed in claim 3, wherein the fusion process using the improved semantic embedding branch in step S3 is as follows:
first, the deep feature map F_A5 obtained in step S2 is processed by a convolution layer with a 1 x 1 kernel and a convolution layer with a 3 x 3 kernel to fuse features of different scales, passed through a Sigmoid function, upsampled two-fold by nearest-neighbor interpolation, and multiplied pixel-wise with the shallow feature map F_A4 obtained in step S2, yielding the feature map F_AM4 at 1/16 of the resolution of the input fish school image, so that the semantic information of the deep feature map F_A5 is fused into the shallow feature map F_A4;
then the improved semantic embedding branch is likewise used to fuse the semantic information of the feature map F_A4 obtained in step S2 into the shallow feature map F_A3, obtaining the feature map F_AM3 at 1/8 of the resolution of the input fish school image;
the Sigmoid function used in the improved semantic embedding branch has the form

Sigmoid(i) = 1 / (1 + e^(-i))

where i is the input and e is the natural constant.
5. The method for automatically detecting fish school based on target occlusion compensation as claimed in claim 3, wherein said step S4 is as follows:
first, the four-fold downsampled feature map F_A2 obtained in step S2 is processed by a CBL unit, where a CBL unit consists of a Convolution layer with stride 1 and a 3 x 3 kernel, a Batch Normalization layer and a LeakyReLU activation layer; it is then downsampled two-fold by a Convolution layer with stride 2 and a 3 x 3 kernel, and fused by a Concatenate operation with the feature map F_AM3 obtained in step S3 after CBL processing, obtaining the feature map F_AMC3 at 1/8 of the resolution of the input fish school image.
6. The method for automatically detecting fish school based on target occlusion compensation as claimed in claim 5, wherein said step S5 is as follows:
first, the feature map F_A5 obtained in step S2, the feature map F_AM4 obtained in step S3 and the feature map F_AMC3 obtained in step S4 are fed into the feature pyramid structure of the YOLOv4 algorithm for feature fusion, obtaining three feature maps F_B3, F_B4 and F_B5, wherein the feature pyramid structure of the YOLOv4 algorithm comprises a spatial pyramid pooling layer and a path aggregation network: in the spatial pyramid pooling structure, the feature map F_A5, after three CBL units, passes through four maximum pooling layers with 1 x 1, 5 x 5, 9 x 9 and 13 x 13 pooling kernels whose outputs undergo a Concatenate fusion operation, and the path aggregation network repeatedly fuses features along bottom-up and top-down paths; the three feature maps F_B3, F_B4 and F_B5 are then each processed by a CBL unit and a convolution layer with a 1 x 1 kernel, obtaining three prediction feature maps of different sizes, Prediction1, Prediction2 and Prediction3, with resolutions of 1/8, 1/16 and 1/32 of the input fish school image; these three prediction feature maps are used to predict fish targets, obtaining overlapping candidate boxes and their prediction confidence scores.
7. The method for automatically detecting fish school based on target occlusion compensation as claimed in claim 1, wherein the process of step S6 is as follows:
S601, traversing all candidate boxes in an image, judging the prediction confidence score of each in turn, keeping the candidate boxes whose scores exceed the confidence threshold together with their scores, and deleting the candidate boxes whose scores fall below it;
S602, selecting the candidate box M with the highest prediction confidence score among the remaining candidate boxes, then traversing the other candidate boxes B_i in turn and computing their distance intersection-over-union Distance-IoU (DIoU for short) with M; if the DIoU between some candidate box B_i and M is not lower than a given threshold ε, the two boxes are considered highly overlapped; candidate box B_i is not deleted directly, but its prediction confidence score is reduced, and candidate box M is then moved into the final prediction box set G; the score-reduction criterion is:
S'_i = S_i, if DIoU(M, B_i) < ε; S'_i = S_i * (1 - DIoU(M, B_i)), if DIoU(M, B_i) >= ε

where M is the candidate box with the highest current prediction confidence score, B_i is another candidate box being traversed, ρ(M, B_i) is the distance between the center points of M and B_i, c is the diagonal length of the minimum bounding rectangle containing M and B_i, DIoU(M, B_i) is the distance intersection-over-union of M and B_i, ε is the given DIoU threshold, S_i is the prediction confidence score of candidate box B_i, and S'_i is its reduced prediction confidence score;
and S603, repeatedly executing the step S602 until all the candidate frames are processed, and drawing the final prediction frame set G on the corresponding picture as an output result to obtain a fish school detection result.
8. The method as claimed in claim 7, wherein the DIoU in step S602 adds to the intersection-over-union IoU a penalty factor that accounts for the distance between the center points of the two candidate boxes; DIoU(M, B_i) is calculated as follows:
DIoU(M, B_i) = IoU(M, B_i) - ρ²(M, B_i) / c²

where M is the candidate box with the highest current prediction confidence score, B_i is another candidate box being traversed, ρ(M, B_i) is the distance between the center points of M and B_i, c is the diagonal length of the minimum bounding rectangle containing M and B_i, and IoU(M, B_i) is the ratio of the intersection to the union of M and B_i.
CN202110354428.6A 2021-04-01 2021-04-01 Fish shoal automatic detection method based on target shielding compensation Active CN113076871B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110354428.6A CN113076871B (en) 2021-04-01 2021-04-01 Fish shoal automatic detection method based on target shielding compensation

Publications (2)

Publication Number Publication Date
CN113076871A (en) 2021-07-06
CN113076871B (en) 2022-10-21

Family

ID=76614401

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110354428.6A Active CN113076871B (en) 2021-04-01 2021-04-01 Fish shoal automatic detection method based on target shielding compensation

Country Status (1)

Country Link
CN (1) CN113076871B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101566691A (en) * 2009-05-11 2009-10-28 华南理工大学 Method and system for tracking and positioning underwater target
CN111310622A (en) * 2020-02-05 2020-06-19 西北工业大学 Fish swarm target identification method for intelligent operation of underwater robot
CN111652118A (en) * 2020-05-29 2020-09-11 大连海事大学 Marine product autonomous grabbing guiding method based on underwater target neighbor distribution
CN111738139A (en) * 2020-06-19 2020-10-02 中国水产科学研究院渔业机械仪器研究所 Cultured fish monitoring method and system based on image recognition
CN112084866A (en) * 2020-08-07 2020-12-15 浙江工业大学 Target detection method based on improved YOLO v4 algorithm
CN112001339A (en) * 2020-08-27 2020-11-27 杭州电子科技大学 Pedestrian social distance real-time monitoring method based on YOLO v4
CN112308040A (en) * 2020-11-26 2021-02-02 山东捷讯通信技术有限公司 River sewage outlet detection method and system based on high-definition images
CN112465803A (en) * 2020-12-11 2021-03-09 桂林慧谷人工智能产业技术研究院 Underwater sea cucumber detection method combining image enhancement

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
TAO LIU: "Multi-class fish stock statistics technology based on object classification and tracking algorithm", Ecological Informatics 63 (2021) 101240 *
LI QINGZHONG et al.: "Real-time detection of underwater fish targets based on improved YOLO and transfer learning", Pattern Recognition and Artificial Intelligence *
SHEN JUNYU: "Research on fish school detection methods based on deep learning", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113435396B (en) * 2021-07-13 2022-05-20 大连海洋大学 Underwater fish school detection method based on image self-adaptive noise resistance
CN113435396A (en) * 2021-07-13 2021-09-24 大连海洋大学 Underwater fish school detection method based on image self-adaptive noise resistance
CN113887608A (en) * 2021-09-28 2022-01-04 北京三快在线科技有限公司 Model training method, image detection method and device
CN113610070A (en) * 2021-10-11 2021-11-05 中国地质环境监测院(自然资源部地质灾害技术指导中心) Landslide disaster identification method based on multi-source data fusion
CN114387510A (en) * 2021-12-22 2022-04-22 广东工业大学 Bird identification method and device for power transmission line and storage medium
CN114419364A (en) * 2021-12-24 2022-04-29 华南农业大学 Intelligent fish sorting method and system based on deep feature fusion
CN114419568A (en) * 2022-01-18 2022-04-29 东北大学 Multi-view pedestrian detection method based on feature fusion
CN114898105A (en) * 2022-03-04 2022-08-12 武汉理工大学 Infrared target detection method under complex scene
CN114898105B (en) * 2022-03-04 2024-04-19 武汉理工大学 Infrared target detection method under complex scene
US11790640B1 (en) 2022-06-22 2023-10-17 Ludong University Method for detecting densely occluded fish based on YOLOv5 network
CN114863263A (en) * 2022-07-07 2022-08-05 鲁东大学 Snakehead detection method for intra-class shielding based on cross-scale hierarchical feature fusion
CN114863263B (en) * 2022-07-07 2022-09-13 鲁东大学 Snakehead fish detection method for blocking in class based on cross-scale hierarchical feature fusion
US11694428B1 (en) 2023-07-04 Method for detecting Ophiocephalus argus cantor under intra-class occlusion based on cross-scale layered feature fusion
CN117409368A (en) * 2023-10-31 2024-01-16 大连海洋大学 Real-time analysis method for shoal gathering behavior and shoal starvation behavior based on density distribution

Also Published As

Publication number Publication date
CN113076871B (en) 2022-10-21

Similar Documents

Publication Publication Date Title
CN113076871B (en) Fish shoal automatic detection method based on target shielding compensation
CN111027493B (en) Pedestrian detection method based on deep learning multi-network soft fusion
CN108537824B (en) Feature map enhanced network structure optimization method based on alternating deconvolution and convolution
CN111898432B (en) Pedestrian detection system and method based on improved YOLOv3 algorithm
CN111401293B (en) Gesture recognition method based on Head lightweight Mask scanning R-CNN
CN112418330A (en) Improved SSD (solid State drive) -based high-precision detection method for small target object
CN111311611B (en) Real-time three-dimensional large-scene multi-object instance segmentation method
CN115331087A (en) Remote sensing image change detection method and system fusing regional semantics and pixel characteristics
WO2021077947A1 (en) Image processing method, apparatus and device, and storage medium
CN113516664A (en) Visual SLAM method based on semantic segmentation dynamic points
CN113850136A (en) Yolov5 and BCNN-based vehicle orientation identification method and system
CN110852327A (en) Image processing method, image processing device, electronic equipment and storage medium
CN113537085A (en) Ship target detection method based on two-time transfer learning and data augmentation
CN114565675A (en) Method for removing dynamic feature points at front end of visual SLAM
CN110659601A (en) Depth full convolution network remote sensing image dense vehicle detection method based on central point
CN114519819B (en) Remote sensing image target detection method based on global context awareness
CN116645592A (en) Crack detection method based on image processing and storage medium
CN115527050A (en) Image feature matching method, computer device and readable storage medium
CN113487610B (en) Herpes image recognition method and device, computer equipment and storage medium
CN114463624A (en) Method and device for detecting illegal buildings applied to city management supervision
CN113887649A (en) Target detection method based on fusion of deep-layer features and shallow-layer features
CN113177956A (en) Semantic segmentation method for unmanned aerial vehicle remote sensing image
CN113160117A (en) Three-dimensional point cloud target detection method under automatic driving scene
CN116363532A (en) Unmanned aerial vehicle image traffic target detection method based on attention mechanism and re-parameterization
Xie et al. Pedestrian detection and location algorithm based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant