CN112598713A - Offshore submarine fish detection and tracking statistical method based on deep learning - Google Patents
- Publication number
- CN112598713A (application CN202110232509.9A)
- Authority
- CN
- China
- Prior art keywords
- fish
- tracking
- detection
- box
- branch
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T7/246 — Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- G06T3/4007 — Interpolation-based scaling, e.g. bilinear interpolation
- G06T5/40 — Image enhancement or restoration by the use of histogram techniques
- G06T5/90
- G06T7/11 — Region-based segmentation
- G06T7/277 — Analysis of motion involving stochastic approaches, e.g. using Kalman filters
- G06T2207/10016 — Video; image sequence
- G06T2207/20081 — Training; learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/20112 — Image segmentation details
- G06T2207/20132 — Image cropping
Abstract
The invention discloses a deep-learning-based method for detecting, tracking and counting near-shore fish. An input underwater real-time video is processed by the basic neural network YOLOv5 to extract features; a tracking branch outputs the position and class of each fish together with the ID number of each tracked fish, and a detection branch corrects the output of the tracking branch to form the final output, yielding the position, class and number of the fish in each picture. The invention matches the results between frames using particle filtering and the KM algorithm, thereby matching the serial numbers of the fish throughout the video.
Description
Technical Field
The invention relates to the field of seabed exploration and detection, in particular to a near-shore seabed fish detection, tracking and statistics method based on deep learning.
Background
The ocean holds very abundant biological resources; coastal countries are therefore actively developing marine ranches, particularly fishery-aquaculture marine ranches. The Food and Agriculture Organization of the United Nations records that in 2016 marine ranches produced 28.7 million tons of edible fish globally (worth 67.4 billion US dollars), accounting for 49.5 percent of total aquaculture output that year. Currently, offshore fishing is over-developed and the aquaculture industry is approaching saturation; marine ranching is therefore considered an important approach to addressing the decline in fishery resources. However, marine ranch operations also face problems (e.g., over-fishing and ecosystem imbalance). By strengthening the monitoring of underwater biological resources, fishing time and intensity can be adjusted to changes in those resources, thereby mitigating these problems. For marine ranches, real-time monitoring of the number of organisms can form the basis of a protection strategy for scientific fishery management and sustainable fish production. In addition, fish resource statistics help researchers understand the abundance of species, and these statistics can be analyzed together with local sea conditions to determine the conditions suitable for the survival of each species. The technology therefore has important practical significance.
In the last decade, several tracking and detection methods have been introduced in the field of fishery management. Among detection algorithms, the traditional research approach extracts fine features of underwater targets by fusing multi-sensor and multi-feature information. For example, Ishibashi et al. used optical sensors to acquire specific images of underwater targets, and Saini and Biswas detected targets by edge detection with adaptive thresholds. The current mainstream method is to capture objects with an underwater camera and extract features with a deep learning algorithm. Deep learning algorithms such as Faster-RCNN and ResNet have been applied to underwater biometric identification, for example sea cucumber identification (Xia et al., 2018) and fish detection (CN202010003815.0). The main problem with such detection algorithms is that they cannot determine whether the fish in two frames are the same animal; a tracking model is therefore required. Among tracking algorithms, traditional filtering methods such as particle filtering, optical flow and object segmentation dominate, but they are mainly tested under controlled conditions such as limited laboratory environments. For example, Chuang tracked fish using object segmentation and object-height block stereo matching; this method divides a fish into several parts for matching and ignores the fish's overall characteristics. Sun proposed a consistent fish-tracking strategy for underwater surveillance systems with multiple static cameras and overlapping fields of view, capturing fish with a speeded-up robust features technique and a centroid-coordinate mapping technique; however, this method cannot identify the species of fish. Romero-Ferrero proposed an automated method to track all individuals in small or large populations of unmarked animals. Their algorithm achieves high accuracy for populations below 100 individuals, but it must be run in an ideal laboratory environment. Meng-Che proposed a fish segmentation and tracking algorithm that overcomes low contrast and ensures accurate segmentation of fish shape boundaries by applying histogram back-projection to a double local-threshold image. With this method, however, sudden movements of the fish can cause tracking failures, and the algorithm is too complex for real-time tracking.
In recent years, several methods for tracking the abundance of fishes and automatically counting fish populations by using a machine vision technology are proposed. For example, Song et al (2020) propose an automatic fish counting method based on a hybrid neural network model to achieve real-time, accurate, objective, and lossless fish population counting in ocean salmon farming. The method adopts a multi-row convolutional neural network as a front end to capture characteristic information of different receptive fields. Meanwhile, the back end adopts a wider and deeper expanded convolutional neural network to reduce the loss of space structure information in the network transmission process. Finally, a hybrid neural network model is constructed. However, the main limitation of this method is that fish are regarded as particles and the type of fish cannot be classified. Marini et al (2018) developed a content-based image analysis method based on genetic programming. However, crowded scenes limit the efficiency of identification when a large number of fish are gathered in front of the camera. When these aggregates are particularly dense, individuals often overlap each other, which increases the false negative rate.
Disclosure of Invention
Most of the above machine-vision-based fish detection and tracking methods do not address the problematic scenes and practical difficulties of the harsh environment of a real marine ranch. In particular, they do not solve multi-class multi-target real-time tracking, nor recognition under high underwater turbidity. Prior art based on traditional image processing has high algorithmic complexity and hence high time complexity, making it difficult to meet real-time requirements while keeping high accuracy. In the latest deep-learning techniques, target detection and target tracking are not integrated: they must run as separate stages, incurring additional time and storage overhead for intermediate results, and the complex underwater environment leads to high misrecognition rates.
In order to solve the defects of the prior art, the invention provides the following technical scheme:
a near-shore submarine fish detection and tracking statistical method based on deep learning comprises the following steps:
step 1, obtaining fish image information and preprocessing the fish image information;
step 2, inputting the preprocessed images into an FDT neural network structure, extracting features through the basic neural network YOLOv5, and feeding the features into a detection branch and a tracking branch respectively, the detection branch correcting the output of the tracking branch;
step 3, obtaining the final output result, and carrying out online tracking on the final output result;
and 4, matching results between frames using particle filtering and the KM algorithm according to the online tracking result, so as to match the serial numbers of the fish, and associating the serial numbers with data of the multi-class fish schools according to the identified and tracked fish.
Further, in step 1, obtaining fish image information and preprocessing it specifically includes: acquiring the collected underwater real-time video, extracting a fish image data set from the video, performing contrast processing on the images in the data set, and scaling the contrast-processed images from their original size of 1920 × 1080 to a specified size.
Further, scaling from the original size to the specified size specifically includes: calculating a scaling factor using the longest edge of the image as the reference edge, scaling the whole image to 608 × 342 by bilinear interpolation, and then zero-padding the top and bottom edges of the image to obtain an image of the specified size 608 × 608.
Further, performing contrast processing on the images in the fish image data set specifically includes: color-compensating the histograms of the RGB channels and then processing the compensated image with contrast-limited adaptive histogram equalization (CLAHE).
Further, performing color compensation on the histogram of the RGB channel specifically includes:
divide the image into its three channels R, G and B, calculate the average value of each channel, denoted b_avg, g_avg and r_avg, and take the minimum of the three averages as the color-shift correction parameter: value = min{b_avg, g_avg, r_avg};
the average value of each channel is calculated by the following formula (shown for the B channel; g_avg and r_avg are computed analogously):

    b_avg = (1 / (m × n)) × Σ_{i=0}^{n−1} Σ_{j=0}^{m−1} B(i, j)

where the image has n rows and m columns, the index i ranges from 0 to n−1 and the index j ranges from 0 to m−1;
if the corrected value is less than 0, the channel value at position (i, j) is set to 0; otherwise, the pixel value is corrected using the channel average and the correction parameter value.
Further, performing online tracking on the final output result specifically includes: extracting posterior boxes by non-maximum suppression based on the heat map scores; determining the positions of key points whose heat map scores are greater than a threshold, and calculating the corresponding posterior boxes from the estimated offsets and predicted box sizes; and implementing box linking using an online tracking algorithm.
Further, implementing box linking using the online tracking algorithm specifically includes: initializing a set of tracking trajectories from the detection boxes in the first frame, and setting a threshold; using particle filtering to predict the location of each tracking trajectory in the current frame and, in subsequent frames, linking boxes to the tracking trajectory set according to the Re-ID features and the IoU measure when the distance between the boxes is within the threshold.
Further, the training process of the FDT neural network structure is as follows: the training sample tensor of the original reconstructed image and the training tensor of the target ground-truth image are input into the FDT neural network structure, which is trained iteratively until the loss function output by the network falls below a set threshold.
Further, the loss function is calculated by the following formula:

    loss = L_box + L_cls + L_id

where L_box represents the posterior-box loss, L_cls represents the class loss and L_id represents the Re-ID loss.

The posterior-box loss follows a CIoU-style formulation:

    L_box = 1 − IoU + ρ²(b_gt, b) / c² + α·v

where b_gt is the center-point coordinate of the ground-truth posterior box, b is the center-point coordinate of the predicted bounding box, IoU is the intersection-over-union of the ground-truth posterior box area and the predicted prior box area, ρ(b_gt, b) is the distance between the two center points, c is the diagonal length of the minimum closure area containing both the ground-truth posterior box and the predicted prior box, and α and v are two influencing factors. IoU, v and α are calculated as follows:

    IoU = |S_gt ∩ S| / |S_gt ∪ S|
    v = (4 / π²) · (arctan(w_gt / h_gt) − arctan(w / h))²
    α = v / ((1 − IoU) + v)

where w_gt and h_gt are the width and height of the ground-truth posterior box, w and h are the width and height of the predicted prior box, S_gt denotes the area of the ground-truth posterior box and S denotes the area of the predicted prior box;

when the j-th prior box of the i-th grid cell is responsible for a ground-truth object, the classification loss is computed for the posterior box generated by that prior box; the indicator I_ij^obj = 1 when the object is present, otherwise I_ij^obj = 0;

for the Re-ID loss, identity embedding is treated as a classification task: all object instances with the same identity are treated as one class. An identity feature vector is extracted at position (i, j) and a mapping to a class distribution vector p = {p(k), k ∈ [1, K]} is learned; the class label of the ground-truth posterior box is expressed as the one-hot vector L^i(k). The Re-ID loss is then

    L_id = − Σ_{i=1}^{N} Σ_{k=1}^{K} L^i(k) · log(p(k))

where K is the number of identity classes and N is the number of ground-truth posterior boxes.
The software for detecting, tracking and counting underwater submarine fish comprises the following steps: collecting real-time video; extracting features from the video with the basic neural network (YOLOv5); inputting the features into a detection branch and a tracking branch respectively, where the features are modeled and undergo nonlinear transformation, and the detection branch outputs the position and class of the fish in each picture, wherein: 1) the tracking branch outputs the position and class of each fish and the ID (number) of each tracked fish; 2) the detection branch corrects the output of the tracking branch to form the final output; the online tracking part, i.e. particle filtering and the KM algorithm, matches the results between frames to match the numbers of the fish in the video.
The invention provides an end-to-end neural network framework that outputs results directly as video is input, together with an image enhancement algorithm at the input end that significantly improves recognition accuracy.
For the FDT algorithm, when a segment of underwater real-time video is input, features such as fish texture, shape and size are extracted by the basic neural network (YOLOv5). The features are then input into a detection branch and a tracking branch, in which they are modeled and undergo nonlinear transformation; the detection branch outputs the position and class of the fish in each picture, and the tracking branch outputs the position and class of each fish together with the ID (number) of each tracked fish. Finally, the detection branch corrects the output of the tracking branch to form the final output, giving the position, class and number of the fish in each picture. The online tracking part, i.e. particle filtering and the KM algorithm, then matches the results between frames so as to match the numbers of the fish in the video.
In the process, detection and multi-target tracking algorithms are fused into a framework, so that tracking statistics of multi-class fish schools can be realized, an end-to-end unified neural network architecture is adopted, online processing can be realized, and a statistical result is output while a video is input.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art from these drawings without creative effort.
Fig. 1 is a comparison of an original image and the image after preprocessing according to an embodiment of the present invention;
FIG. 2 is a diagram of an FDT algorithm architecture provided by an embodiment of the present invention;
FIG. 3 provides a process diagram of a training and testing phase according to an embodiment of the present invention;
fig. 4 is a schematic diagram of the OceanEye software main interface according to the embodiment of the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a near-shore submarine fish detection and tracking statistical method based on deep learning, which is applied to fish population statistics.
The fish population statistical technique mainly comprises two parts, namely accurately identifying underwater fishes, and matching the fishes identified in each frame to form a tracking track.
The first part is the object recognition task. In the field of computer vision, many target recognition algorithms have reached the accuracy of human-eye recognition, but they focus on object recognition on land. Recognition accuracy in complex underwater environments is low: light is refracted and reflected during underwater transmission, illumination in turbid water is uneven, light of different wavelengths attenuates at different rates, and color shift occurs in the underwater environment. For these reasons, images captured underwater suffer quality degradation such as low contrast, color distortion and blurred texture, making marine animals difficult to distinguish.
The second part is the multi-target tracking task: matching the objects identified in successive frames and determining each object's ID throughout the video. Current multi-target tracking algorithms, however, focus mainly on single-class multi-target tracking. Little work addresses multi-class multi-target tracking; the mainstream approach is a two-stage identify-then-track algorithm that cannot meet real-time requirements, while existing end-to-end algorithms that identify and track simultaneously have low accuracy. Furthermore, many fish are similar in size and appearance, making them difficult to distinguish by texture and size, and many experiments are required to create a usable model; moreover, fish swim irregularly in all directions, so fish deformation and occlusion occur frequently.
The invention provides a near-shore submarine fish detection, tracking and statistics method based on deep learning, which comprises the following steps:
in this step, the picture processed in step 1 is input into an FDT neural network structure, and features in the picture, such as fish texture, fish shape, fish size, etc., are extracted after passing through a basic neural network YOLOv 5; the FDT neural network structure is shown in fig. 2.
In step 1, obtaining fish image information, and preprocessing the fish image information, specifically including:
acquiring the collected underwater real-time video, extracting a fish image data set from the video, performing contrast processing on the images in the data set, and scaling the contrast-processed images from their original size of 1920 × 1080 to a specified size;
the scaling from the original size to the specified size specifically comprises:
calculating a scaling factor using the longest edge of the image as the reference edge, scaling the whole image to 608 × 342 by bilinear interpolation, and then zero-padding the top and bottom edges of the image to obtain an image of the specified size 608 × 608;
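For illustration, the scale-and-pad step described above can be sketched in Python as follows; the function name and return layout are illustrative and not taken from the patent:

```python
def letterbox_shape(orig_w: int, orig_h: int, target: int = 608):
    """Compute the scaled size and zero-padding described in the text:
    scale by the longest edge, then pad the short side up to `target`."""
    scale = target / max(orig_w, orig_h)            # scaling factor from the longest edge
    new_w = round(orig_w * scale)
    new_h = round(orig_h * scale)
    pad = target - min(new_w, new_h)                # total zero padding on the short side
    return (new_w, new_h), (pad // 2, pad - pad // 2)

# A 1920x1080 frame scales to 608x342, then 133 rows of zeros are added
# above and below to reach 608x608.
size, (top, bottom) = letterbox_shape(1920, 1080)
```

The actual bilinear resampling would be done by an image library; this helper only derives the geometry that the patent specifies.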
performing contrast processing on the images in the fish image dataset, specifically comprising: the histogram of the RGB channel is color compensated and then the compensated image is CLAHE-processed with contrast limited adaptive histogram equalization.
The contrast processing effectively addresses underwater color distortion, low contrast and blurred detail; the contrast-limited adaptive histogram equalization (CLAHE) algorithm then improves image contrast, resolving the color-shift and contrast problems and improving recognition accuracy. The images before and after enhancement are shown in fig. 1.
Performing color compensation on the histogram of the RGB channel, specifically comprising:
dividing the image into its three channels R, G and B, calculating the average value of each channel, denoted b_avg, g_avg and r_avg, and taking the minimum of the three averages as the color-shift correction parameter: value = min{b_avg, g_avg, r_avg};
the average value of each channel is calculated by the following formula (shown for the B channel; g_avg and r_avg are computed analogously):

    b_avg = (1 / (m × n)) × Σ_{i=0}^{n−1} Σ_{j=0}^{m−1} B(i, j)

where the image has n rows and m columns, the index i ranges from 0 to n−1 and the index j ranges from 0 to m−1;
if the corrected value is less than 0, the channel value at position (i, j) is set to 0; otherwise, the pixel value is corrected using the channel average and the correction parameter value.
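The exact correction formula is not given in the text; the NumPy sketch below assumes each channel is shifted by the difference between its own average and the correction parameter `value`, with negative results clipped to 0 as stated:

```python
import numpy as np

def color_compensate(img: np.ndarray) -> np.ndarray:
    """Sketch of the described color-shift compensation. ASSUMPTION:
    each channel is shifted toward the smallest channel mean; the
    patent only states that the channel average and the correction
    parameter `value` are used, and that negatives are clipped to 0."""
    img_f = img.astype(np.float64)
    means = img_f.reshape(-1, 3).mean(axis=0)     # per-channel averages (b_avg, g_avg, r_avg)
    value = means.min()                           # correction parameter: min of the three averages
    corrected = img_f - (means - value)           # assumed correction step
    return np.clip(corrected, 0, 255).astype(np.uint8)

frame = np.full((4, 4, 3), (200, 120, 80), dtype=np.uint8)
out = color_compensate(frame)
```

After compensation all three channels share the same mean, removing the color cast before CLAHE is applied.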
wherein the detection branch and the tracking branch share the same basic feature-extraction network to reduce computation. The first output of the detection branch has size 13 × 13 × 45: the first value, 13, is the height of the feature map; the second value, 13, is its width; and the third value, 45, comprises, for each channel, the x and y coordinates of the predicted center point, the width and height of the posterior box, the confidence, and the probabilities of the 9 fish classes. Similarly, the second output has size 26 × 26 × 45 and the third 52 × 52 × 45. The tracking branch uses a one-shot MOT method with four outputs: the center point, the posterior box position, the center offset and the Re-ID. The center point gives the x and y coordinates of the detected object; the posterior box position gives the height and width of the target box; the center offset locates the target more precisely, reducing quantization errors caused by the feature-map stride; and the Re-ID identifies whether the same object appears in different frames. The output of the tracking branch is then corrected by the detection branch by matching target center points between the two branch outputs: if a center point of the tracking branch coincides with a center point of the detection branch, the posterior box output by the tracking branch for that target is replaced by the detection branch's box; if a center point of the detection branch is not contained in the tracking list, the detected target is added to the tracking list. Finally, the center point, posterior box, fish class and Re-ID of each tracked target are passed to online tracking.
The advantages of the combination of detecting and tracking branches are summarized below. Multiple class objects cannot be tracked simultaneously using only one tracking branch. However, we can identify the class of the tracked object by detecting the additional class output of the branch. Therefore, multi-class synchronous tracking is realized. In addition, the output of the posterior box of the detection branch is more accurate than that of the tracking branch, and the tracking precision and the online tracking time are improved. Finally, due to the complexity of the underwater environment, such as when two fish overlap, the detection branch can extract more detailed features to track the correct target.
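A minimal sketch of the center-point correction step between the two branches; the data layout (a mapping from center point to box and identity) and the function name are illustrative, not from the patent:

```python
def fuse_branches(track_outs: dict, det_outs: dict) -> dict:
    """Correct the tracking-branch output with the detection branch:
    when a tracked target's center point coincides with a detection's
    center point, the detection's (more accurate) posterior box replaces
    the tracked box while the track's ID is kept; detections with no
    matching center are added to the tracking list as new targets."""
    fused = dict(track_outs)                      # start from the tracking list
    for center, (det_box, det_info) in det_outs.items():
        if center in fused:
            _, track_id = fused[center]
            fused[center] = (det_box, track_id)   # keep track ID, take detection box
        else:
            fused[center] = (det_box, det_info)   # new target enters the tracking list
    return fused
```

Real center points rarely match exactly, so an implementation would match within a small pixel tolerance; exact-key matching keeps the sketch short.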
Step 3, obtaining the final output result, and carrying out online tracking on the final output result;
performing online tracking on the final output result, specifically comprising: extracting a posterior box by adopting a non-maximum inhibition method based on the heat map score; determining the positions of key points with the heat map scores larger than a threshold value, and calculating corresponding posterior frames according to the estimated offset and the size of the prediction frame; the box linking is implemented using an online tracking algorithm.
Implementing box linking with the online tracking algorithm specifically comprises: initializing the set of tracking trajectories from the detection boxes in the first frame and setting a threshold; using particle filtering to predict the location of each tracking trajectory in the current frame; and, in subsequent frames, linking boxes to the set of tracking trajectories via the Re-ID features and the IoU measurements when the distance between the boxes is within the threshold.
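One plausible reading of the linking criterion above is a combined matching cost: cosine distance between Re-ID embeddings blended with (1 − IoU) of the boxes, where a pair is linkable when the cost is under a threshold. The weighting `w` and all names are illustrative assumptions, not taken from the patent.

```python
import math

def iou(a, b):
    """Boxes as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2]-a[0])*(a[3]-a[1]) + (b[2]-b[0])*(b[3]-b[1]) - inter
    return inter / union if union > 0 else 0.0

def cosine_distance(u, v):
    dot = sum(x*y for x, y in zip(u, v))
    nu = math.sqrt(sum(x*x for x in u))
    nv = math.sqrt(sum(x*x for x in v))
    return 1.0 - dot / (nu * nv)

def link_cost(box_a, emb_a, box_b, emb_b, w=0.5):
    # Low cost = same fish: similar appearance (Re-ID) and overlapping boxes.
    return w * cosine_distance(emb_a, emb_b) + (1 - w) * (1 - iou(box_a, box_b))
```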
In this step, the two parallel branches yield the posterior box and the Re-ID embedding, as shown in FIG. 3.
Step 4, matching the results between frames using particle filtering and the KM (Kuhn-Munkres) algorithm according to the online tracking result, so as to match the serial numbers of the fish, and associating the identified and tracked fish with the statistical data of the multi-class fish schools.
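The KM (Kuhn-Munkres / Hungarian) algorithm named in step 4 finds the minimum-cost one-to-one matching between tracks and detections. As a minimal sketch, the version below brute-forces all permutations, which is only viable for the handful of fish per frame assumed here; a real system would use a proper O(n³) KM implementation. Names are illustrative.

```python
from itertools import permutations

def min_cost_matching(cost):
    """cost[i][j]: cost of assigning track i to detection j (square matrix).
    Returns (assignment tuple, total cost): perm[i] is the detection for track i."""
    n = len(cost)
    best, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        c = sum(cost[i][perm[i]] for i in range(n))
        if c < best_cost:
            best, best_cost = perm, c
    return best, best_cost
```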
In this process, the detection and multi-object tracking algorithms are fused into one framework, so that tracking statistics for multi-class fish schools can be produced; an end-to-end unified neural-network architecture is adopted, enabling online processing in which statistical results are output while the video is being input.
The FDT neural network structure is trained as follows: the training sample tensor of the original reconstructed image and the training tensor of the target ground-truth image are input into the FDT neural network structure, and the network is trained cyclically until the loss function it outputs falls below a set threshold.
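The cyclic train-until-threshold loop above can be made concrete with a stand-in model. The "network" here is a single scalar parameter fit by gradient descent, purely to illustrate the stopping criterion; it is not the FDT architecture or its loss, and all names are hypothetical.

```python
def train_until_threshold(target=3.0, lr=0.1, threshold=1e-4, max_epochs=1000):
    w = 0.0                                  # stand-in for the network weights
    for epoch in range(max_epochs):
        loss = (w - target) ** 2             # stand-in for the FDT loss
        if loss < threshold:                 # stop once loss is below the set threshold
            return w, loss, epoch
        grad = 2 * (w - target)
        w -= lr * grad                       # gradient-descent update
    return w, loss, max_epochs
```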
The loss function loss is composed of three parts: the posterior-box loss loss_CIoU (the loss on the predicted position), the class loss loss_cls (the loss on the predicted species), and the identity loss loss_id (the loss on the predicted fish numbering). It is calculated as follows:

loss = loss_CIoU + loss_cls + loss_id
This formula simultaneously considers the overlap area, the distance between center points, and the aspect ratio; CIoU therefore achieves better convergence speed and accuracy on the BBox regression problem;
wherein the posterior-box loss uses CIoU:

loss_CIoU = 1 − IoU + ρ²(b, b_gt)/c² + αv

where b_gt is the center-point coordinate of the ground-truth posterior box, b is the center-point coordinate of the predicted bounding box, ρ(·) is the Euclidean distance between the two center points, IoU is the intersection over union of the ground-truth posterior-box area and the predicted prior-box area, c is the diagonal length of the smallest enclosing region that can contain both the ground-truth posterior box and the predicted prior box, and α and v are two influence factors; IoU, α, and v are calculated as follows:

IoU = (A_gt ∩ A_pred) / (A_gt ∪ A_pred)
v = (4/π²)·(arctan(w_gt/h_gt) − arctan(w/h))²
α = v / ((1 − IoU) + v)
where w_gt and h_gt are the width and height of the ground-truth posterior box, w and h are the width and height of the predicted prior box, A_gt denotes the area of the ground-truth posterior box, and A_pred denotes the area of the predicted prior box;
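The CIoU computation described above (overlap, center distance, aspect ratio) can be sketched numerically as follows. This follows the standard CIoU formulation the text describes, with boxes given as (cx, cy, w, h); the variable names are ours, not the patent's.

```python
import math

def ciou_loss(pred, gt):
    px, py, pw, ph = pred
    gx, gy, gw, gh = gt
    # Corner coordinates of both boxes
    p1, p2 = (px - pw/2, py - ph/2), (px + pw/2, py + ph/2)
    g1, g2 = (gx - gw/2, gy - gh/2), (gx + gw/2, gy + gh/2)
    # IoU term
    iw = max(0.0, min(p2[0], g2[0]) - max(p1[0], g1[0]))
    ih = max(0.0, min(p2[1], g2[1]) - max(p1[1], g1[1]))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / union if union > 0 else 0.0
    # Center-distance term: rho^2 / c^2, c = diagonal of the smallest enclosing box
    rho2 = (px - gx) ** 2 + (py - gy) ** 2
    cw = max(p2[0], g2[0]) - min(p1[0], g1[0])
    ch = max(p2[1], g2[1]) - min(p1[1], g1[1])
    c2 = cw ** 2 + ch ** 2
    # Aspect-ratio term v and its weight alpha
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(pw / ph)) ** 2
    denom = (1 - iou) + v
    alpha = v / denom if denom > 0 else 0.0
    return 1 - iou + rho2 / c2 + alpha * v
```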
When the j-th prior box of the i-th grid cell is responsible for a real object, the classification loss function is calculated from the posterior box generated by that prior box; the indicator 1_ij^obj equals 1 when the object is real, and 0 otherwise;
Considering the Re-ID loss, Re-ID embedding is treated as a classification task: all object instances with the same identity are treated as one class. An identity feature vector is extracted at position (i, j), and a mapping to a class distribution vector P = {p(k), k ∈ [1, K]} is learned; the class label of the ground-truth posterior box is denoted L^i(k), and the identity loss is

loss_id = −Σ_{i=1}^{N} Σ_{k=1}^{K} L^i(k)·log(p(k));
where K is the number of identity classes and N is the number of ground-truth posterior boxes.
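A minimal numeric sketch of the identity loss above: softmax over an identity feature's class scores, then cross-entropy against the one-hot identity label, summed over the N ground-truth boxes. The shapes and names are purely illustrative.

```python
import math

def softmax(scores):
    m = max(scores)                     # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def reid_loss(score_rows, labels):
    """score_rows: N lists of K identity scores; labels: N one-hot lists L^i(k)."""
    total = 0.0
    for scores, onehot in zip(score_rows, labels):
        p = softmax(scores)
        # cross-entropy: -sum_k L^i(k) * log(p(k))
        total -= sum(l * math.log(pk) for l, pk in zip(onehot, p) if l > 0)
    return total
```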
In a more preferred embodiment, the invention also provides software for detecting and tracking offshore seabed fish based on the above algorithm. The software interface is shown in FIG. 4: the input video is loaded from cloud storage by clicking, and the processed video and the corresponding statistics for each category are displayed in real time. When a segment of underwater real-time video is input, features in the video, such as fish texture, fish shape, and fish size, are extracted by the basic neural network (YOLOv5) and then fed separately into the detection branch and the tracking branch. In the two branches the features are modeled and transformed nonlinearly; the detection branch outputs the position and species of the fish in each picture, while the tracking branch outputs the position, species, and ID (number) of each tracked fish. Finally, the output of the tracking branch is corrected with the detection branch to give the final output, from which the position, category, and number of the fish in every picture are obtained. The online tracking part, namely particle filtering and the KM algorithm, then matches the results between frames so that the fish numbers are matched throughout the video. In this process, the detection and multi-object tracking algorithms are fused into one framework, so that tracking statistics for multi-class fish schools can be produced with an end-to-end unified neural-network architecture, enabling online processing in which statistical results are output while the video is being input.
Furthermore, the software detection and tracking process comprises the following steps: collecting real-time video; extracting features from the video through the basic neural network (YOLOv5); inputting the features separately into the detection branch and the tracking branch, where they are modeled and transformed nonlinearly, the detection branch outputting the position and species of the fish in each picture and the tracking branch outputting the position, species, and ID (number) of each tracked fish; correcting the output of the tracking branch with the detection branch to obtain the final output; and matching the results between frames with the online tracking part, i.e. particle filtering and the KM algorithm, so as to match the numbers of the fish in the video.
While certain exemplary embodiments of the present invention have been described above by way of illustration only, it will be apparent to those of ordinary skill in the art that the described embodiments may be modified in various different ways without departing from the spirit and scope of the invention. Accordingly, the drawings and description are illustrative in nature and should not be construed as limiting the scope of the invention.
Claims (10)
1. An offshore seabed fish detection and tracking statistical method based on deep learning, characterized by comprising the following steps:
step 1, obtaining fish image information, preprocessing the fish image information, and sending the preprocessed fish image information into a trained FDT neural network structure, in which three feature maps of different scales are extracted after passing through a basic neural network, wherein the basic neural network is the object detection network YOLOv5, and the object detection network adopts a deep-learning-based parallel double-branch structure for detecting and tracking fish in real time in a real marine ranch environment;
step 2, inputting the three feature maps of different scales into the detection branch and the tracking branch; modeling the features and performing nonlinear transformation in both branches, the detection branch outputting the position and species of the fish in the image, and the tracking branch outputting the position and species of the fish in the image together with the serial-number ID of each tracked fish; correcting the output result of the tracking branch with the output result of the detection branch and taking the corrected result as the final output result, in which the position, category, and serial-number ID of the fish in each picture are recorded;
step 3, obtaining the final output result, and carrying out online tracking on the final output result;
and 4, matching results between each frame by using particle filtering and a KM algorithm according to an online tracking result so as to match the serial numbers of the fishes, and associating the serial numbers with data of the multi-class fish schools according to the identified and tracked fishes.
2. The method for detecting, tracking and counting offshore seabed fish based on deep learning of claim 1, wherein in step 1, the obtaining and preprocessing of fish image information specifically comprises: acquiring the collected underwater real-time video, cutting a fish image dataset from the underwater real-time video, performing contrast processing on the images in the fish image dataset, and cropping the contrast-processed images from the original size to a specified size, wherein the original size is 1920 × 1080.
3. The deep learning-based offshore fish detection and tracking statistical method according to claim 2, wherein cropping from the original size to the specified size specifically comprises: calculating a scaling factor with the longest edge of the image as the reference edge, scaling the whole image to 608 × 342 by bilinear interpolation, and then zero-padding the top and bottom edges of the image to finally obtain a cropped image of the specified size 608 × 608.
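The letterbox arithmetic in claim 3 can be checked with a short sketch: scale by the longest edge to the 608-pixel target, then split the leftover vertical space into top/bottom zero padding. The function and parameter names are illustrative; the split of padding between top and bottom is our assumption.

```python
def letterbox_geometry(src_w=1920, src_h=1080, dst=608):
    scale = dst / max(src_w, src_h)          # longest edge -> 608 (reference edge)
    new_w = round(src_w * scale)             # 1920 * 608/1920 = 608
    new_h = round(src_h * scale)             # 1080 * 608/1920 = 342
    pad_total = dst - new_h                  # vertical zero padding to reach 608
    pad_top = pad_total // 2
    pad_bottom = pad_total - pad_top
    return new_w, new_h, pad_top, pad_bottom
```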
4. The method for offshore seabed fish detection and tracking statistics based on deep learning according to claim 2, wherein the contrast processing of the images in the fish image dataset specifically comprises: performing color compensation on the histograms of the RGB channels, and then processing the compensated images with contrast-limited adaptive histogram equalization (CLAHE).
5. The method for offshore fish detection and tracking statistics based on deep learning as claimed in claim 4, wherein the color compensation of the histogram of RGB channels specifically comprises:
dividing the image into its R, G, and B channels, calculating the average value of each channel, b_avg, g_avg, and r_avg respectively, and taking the minimum of the three averages as the color-cast correction parameter, value = min{b_avg, g_avg, r_avg};
The average value of each channel is calculated by the following formula (shown for the blue channel; g_avg and r_avg are analogous):

b_avg = (1/(m·n)) · Σ_{i=0}^{n−1} Σ_{j=0}^{m−1} B(i, j)

where B(i, j) is the blue-channel value at pixel (i, j), the index i ranges from 0 to n−1, and the index j ranges from 0 to m−1;
if the calculation result is less than 0, each channel of the value at the (i, j) position is defined as 0; otherwise, the pixel value is corrected using the channel average and the correction parameter value.
6. The method for offshore seabed fish detection and tracking statistics based on deep learning of claim 1, wherein the online tracking of the final output result specifically comprises: extracting posterior boxes by non-maximum suppression based on the heat-map scores; determining the positions of the key points whose heat-map scores exceed a threshold, and computing the corresponding posterior boxes from the estimated offsets and predicted box sizes; and implementing box linking with an online tracking algorithm.
7. The method for offshore fish detection and tracking statistics based on deep learning as claimed in claim 6, wherein implementing box linking with the online tracking algorithm specifically comprises: initializing the set of tracking trajectories from the detection boxes in the first frame and setting a threshold; using particle filtering to predict the location of each tracking trajectory in the current frame; and, in subsequent frames, linking boxes to the set of tracking trajectories via the Re-ID features and the IoU measurements when the distance between the boxes is within the threshold.
8. The deep learning-based offshore fish detection and tracking statistical method according to claim 1, wherein the FDT neural network structure is trained as follows: inputting the training sample tensor of the original reconstructed image and the training tensor of the target ground-truth image into the FDT neural network structure, and training the FDT neural network structure cyclically until the loss function output by the network falls below a set threshold.
9. The deep learning-based offshore fish detection and tracking statistical method according to claim 8, wherein the loss function is calculated by the following formula:

loss = loss_CIoU + loss_cls + loss_id

where loss_CIoU is the posterior-box loss, loss_cls is the class loss, and loss_id is the identity loss.
10. The offshore seabed fish detection and tracking statistical method based on deep learning according to claim 9, wherein

loss_CIoU = 1 − IoU + ρ²(b, b_gt)/c² + αv

where b_gt is the center-point coordinate of the ground-truth posterior box, b is the center-point coordinate of the predicted bounding box, ρ(·) is the Euclidean distance between the two center points, IoU is the intersection over union of the ground-truth posterior-box area and the predicted prior-box area, c represents the diagonal length of the smallest enclosing region containing both the ground-truth posterior box and the predicted prior box, and α and v are two influence factors; IoU, α, and v are calculated as follows:

IoU = (A_gt ∩ A_pred) / (A_gt ∪ A_pred)
v = (4/π²)·(arctan(w_gt/h_gt) − arctan(w/h))²
α = v / ((1 − IoU) + v)
where w_gt and h_gt are the width and height of the ground-truth posterior box, w and h are the width and height of the predicted prior box, A_gt denotes the area of the ground-truth posterior box, and A_pred denotes the area of the predicted prior box;
when the j-th prior box of the i-th grid cell is responsible for a real object, the classification loss function is calculated from the posterior box generated by that prior box; the indicator 1_ij^obj equals 1 when the object is real, and 0 otherwise;
when considering the Re-ID loss, Re-ID embedding is handled as a classification task: all object instances with the same identity are treated as one class, an identity feature vector is extracted at position (i, j), and a mapping to a class distribution vector P = {p(k), k ∈ [1, K]} is learned; the class label of the ground-truth posterior box is denoted L^i(k), and the identity loss is

loss_id = −Σ_{i=1}^{N} Σ_{k=1}^{K} L^i(k)·log(p(k));
where K is the number of identity classes and N is the number of ground-truth posterior boxes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110232509.9A CN112598713A (en) | 2021-03-03 | 2021-03-03 | Offshore submarine fish detection and tracking statistical method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112598713A true CN112598713A (en) | 2021-04-02 |
Family
ID=75210140
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110232509.9A Pending CN112598713A (en) | 2021-03-03 | 2021-03-03 | Offshore submarine fish detection and tracking statistical method based on deep learning |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108805064A (en) * | 2018-05-31 | 2018-11-13 | 中国农业大学 | A kind of fish detection and localization and recognition methods and system based on deep learning |
Non-Patent Citations (2)
Title |
---|
SHASHA LIU et al.: "Embedded Online Fish Detection and Tracking System via YOLOv3 and Parallel Correlation Filter", OCEANS 2018 MTS/IEEE Charleston |
TAO LIU et al.: "Multi-class fish stock statistics technology based on object classification and tracking algorithm", Ecological Informatics |
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113112726A (en) * | 2021-05-11 | 2021-07-13 | 创新奇智(广州)科技有限公司 | Intrusion detection method, device, equipment, system and readable storage medium |
TWI801911B (en) * | 2021-06-18 | 2023-05-11 | 國立臺灣海洋大學 | Aquatic organism identification method and system |
CN113569971B (en) * | 2021-08-02 | 2022-03-25 | 浙江索思科技有限公司 | Image recognition-based catch target classification detection method and system |
CN113569971A (en) * | 2021-08-02 | 2021-10-29 | 浙江索思科技有限公司 | Image recognition-based catch target classification detection method and system |
CN113326850A (en) * | 2021-08-03 | 2021-08-31 | 中国科学院烟台海岸带研究所 | Example segmentation-based video analysis method for group behavior of Charybdis japonica |
CN113326850B (en) * | 2021-08-03 | 2021-10-26 | 中国科学院烟台海岸带研究所 | Example segmentation-based video analysis method for group behavior of Charybdis japonica |
CN113379746B (en) * | 2021-08-16 | 2021-11-02 | 深圳荣耀智能机器有限公司 | Image detection method, device, system, computing equipment and readable storage medium |
CN113379746A (en) * | 2021-08-16 | 2021-09-10 | 深圳荣耀智能机器有限公司 | Image detection method, device, system, computing equipment and readable storage medium |
CN113780127A (en) * | 2021-08-30 | 2021-12-10 | 武汉理工大学 | Ship positioning and monitoring system and method |
CN114049477B (en) * | 2021-11-16 | 2023-04-07 | 中国水利水电科学研究院 | Fish passing fishway system and dynamic identification and tracking method for fish quantity and fish type |
CN114049477A (en) * | 2021-11-16 | 2022-02-15 | 中国水利水电科学研究院 | Fish passing fishway system and dynamic identification and tracking method for fish quantity and fish type |
CN114037737A (en) * | 2021-11-16 | 2022-02-11 | 浙江大学 | Neural network-based offshore submarine fish detection and tracking statistical method |
CN114037737B (en) * | 2021-11-16 | 2022-08-09 | 浙江大学 | Neural network-based offshore submarine fish detection and tracking statistical method |
CN114463675A (en) * | 2022-01-11 | 2022-05-10 | 北京市农林科学院信息技术研究中心 | Underwater fish group activity intensity identification method and device |
CN115063378A (en) * | 2022-06-27 | 2022-09-16 | 中国平安财产保险股份有限公司 | Intelligent counting method, device, equipment and storage medium |
CN115063378B (en) * | 2022-06-27 | 2023-12-05 | 中国平安财产保险股份有限公司 | Intelligent point counting method, device, equipment and storage medium |
CN115953725A (en) * | 2023-03-14 | 2023-04-11 | 浙江大学 | Fish egg automatic counting system based on deep learning and counting method thereof |
CN116721132A (en) * | 2023-06-20 | 2023-09-08 | 中国农业大学 | Multi-target tracking method, system and equipment for industrially cultivated fishes |
CN116721132B (en) * | 2023-06-20 | 2023-11-24 | 中国农业大学 | Multi-target tracking method, system and equipment for industrially cultivated fishes |
CN117292305A (en) * | 2023-11-24 | 2023-12-26 | 中国科学院水生生物研究所 | Method, system, electronic equipment and medium for determining fetal movement times of fish fertilized eggs |
CN117292305B (en) * | 2023-11-24 | 2024-02-20 | 中国科学院水生生物研究所 | Method, system, electronic equipment and medium for determining fetal movement times of fish fertilized eggs |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112598713A (en) | Offshore submarine fish detection and tracking statistical method based on deep learning | |
Yang et al. | Computer vision models in intelligent aquaculture with emphasis on fish detection and behavior analysis: a review | |
Jia et al. | Detection and segmentation of overlapped fruits based on optimized mask R-CNN application in apple harvesting robot | |
CN109766830B (en) | Ship target identification system and method based on artificial intelligence image processing | |
CN111178197B (en) | Mass R-CNN and Soft-NMS fusion based group-fed adherent pig example segmentation method | |
CN111046880A (en) | Infrared target image segmentation method and system, electronic device and storage medium | |
Umamaheswari et al. | Weed detection in farm crops using parallel image processing | |
CN113592896B (en) | Fish feeding method, system, equipment and storage medium based on image processing | |
CN109685045A (en) | A kind of Moving Targets Based on Video Streams tracking and system | |
CN110415208A (en) | A kind of adaptive targets detection method and its device, equipment, storage medium | |
CN110853070A (en) | Underwater sea cucumber image segmentation method based on significance and Grabcut | |
CN110827312A (en) | Learning method based on cooperative visual attention neural network | |
CN114724022A (en) | Culture fish school detection method, system and medium fusing SKNet and YOLOv5 | |
Liu et al. | A high-density fish school segmentation framework for biomass statistics in a deep-sea cage | |
Xia et al. | In situ sea cucumber detection based on deep learning approach | |
CN115731282A (en) | Underwater fish weight estimation method and system based on deep learning and electronic equipment | |
Hou et al. | Detection and localization of citrus fruit based on improved You Only Look Once v5s and binocular vision in the orchard | |
Wang et al. | Using an improved YOLOv4 deep learning network for accurate detection of whitefly and thrips on sticky trap images | |
Yu et al. | U-YOLOv7: a network for underwater organism detection | |
Li et al. | Fast recognition of pig faces based on improved Yolov3 | |
Xu et al. | Detection of bluefin tuna by cascade classifier and deep learning for monitoring fish resources | |
Hu et al. | Automatic detection of pecan fruits based on Faster RCNN with FPN in orchard | |
CN114037737B (en) | Neural network-based offshore submarine fish detection and tracking statistical method | |
Siripattanadilok et al. | Recognition of partially occluded soft-shell mud crabs using Faster R-CNN and Grad-CAM | |
CN112308002B (en) | Submarine organism identification and detection method based on single-stage deep learning network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210402 |