CN114202563A - Fish multi-target tracking method based on balance joint network - Google Patents

Fish multi-target tracking method based on balance joint network

Publication number
CN114202563A
Authority
CN
China
Prior art keywords
fish
target
tracking
training
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111532580.5A
Other languages
Chinese (zh)
Inventor
李振波
李蔚然
张涵钰
杨普
徐子毓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Agricultural University
Original Assignee
China Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Agricultural University
Priority to CN202111532580.5A
Publication of CN114202563A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/251 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30241 Trajectory

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a fish multi-target tracking method based on a balanced joint network, belonging to the technical field of aquaculture. Starting from the China Agricultural Artificial Intelligence Innovation and Entrepreneurship Competition dataset, the method optimizes, reorganizes and supplements the original data to produce a new dataset, OptMFT. Joint training on this dataset strengthens the robustness of the model in complex environments. Negative-sample frames with poor image quality are removed, which increases the training weight of data with clearly visible fish and clear swimming trajectories and thereby improves fish identification accuracy. Full-frame training on the dataset improves robustness when the fish school swims at high speed. The method applies to a wide range of culture environments and is highly practical.

Description

Fish multi-target tracking method based on balance joint network
Technical Field
The invention belongs to the technical field of aquaculture, and in particular relates to a fish multi-target tracking method based on a balanced joint network.
Background
(1) Shortcomings of traditional intensive pond aquaculture
In intensive pond aquaculture, the dissolved oxygen level in the water strongly affects feeding, digestion and growth. When dissolved oxygen is low, fish show reduced appetite, poor digestion and slow growth; when it falls too far, fish die from oxygen deficiency, which seriously reduces yield. In terms of physiology and behavior, low dissolved oxygen causes fish to gather at the surface, dive, or swim rapidly and erratically at the water surface; their body color turns gray and dark, and a white film forms over the eyes. Multi-target tracking of fish can therefore be used to judge whether a school is behaving abnormally, providing a basis for intelligent aquaculture.
China's aquaculture industry currently has several unsound practices, and they are common across cultured species and ecological settings. When a region is well suited to one type of organism, the cultured biomass can far exceed the environmental carrying capacity. Long-term intensive culture with a single-species structure in one area readily depletes the energy and material of the ecosystem, leading to ecological imbalance, red tides and the emergence of pathogenic organisms. In addition, short food chains and high energy conversion rates make the ecosystem less stable and can trigger disease outbreaks. Detecting promptly whether cultured fish are affected by disease or parasites, so as to reduce avoidable losses, is therefore an important research direction.
The rapid growth in culture scale and stocking density places an increasing burden on the water and soil environment, and wasted feed and the overuse of aquaculture drugs steadily degrade the culture water. The industry's level of intelligence, automation and informatization remains low. Using information technologies such as the Internet of Things to use resources rationally and achieve efficient, green aquaculture is the main development goal of the industry.
(2) Computer-vision-based fish school tracking
Tracking fish with computer vision is one of the key technologies for intelligent aquaculture. Compared with acoustic or sensor-based detection and tracking, computer-vision methods work in real time, are non-contact, need only simple equipment and do not disturb the normal behavior of the fish. Images and video are also richly interpretable: data analysis can extract deep semantic information that is invisible to the naked eye. Fish school tracking based on computer vision therefore has broad application prospects.
In fish school tracking, traditional methods are still mainstream. In the detection stage, pixel-level difference methods are typically used to locate and identify fish; in the tracking stage, trajectories are predicted and drawn with kinematic methods such as correlation filtering. With the development of deep learning, CNN-based detectors have become markedly more precise than traditional methods, and the trajectories predicted by such models are more accurate. These methods have not yet produced a major breakthrough on fish targets, however. Most trackers still use background subtraction, inter-frame differencing and similar methods as the detector, and filtering, SORT and related methods as the tracker that predicts trajectories. High-speed movement of the school, mutual occlusion and adhesion between fish bodies, and the illumination of underwater images all pose serious challenges for tracking.
Computer-vision-based fish school tracking faces the following main problems:
(1) Underwater image data: fish multi-target tracking datasets suffer mainly from problems of data acquisition and frame quality.
First, fish datasets are small in scale, dense in targets and difficult to acquire, so few open-source fish datasets exist, and their quality still falls short of what deep-learning methods require. High-quality open-source fish multi-target tracking datasets are lacking, especially with respect to resolution and image frames. Training therefore relies mainly on data collected at the experimental site, and the accuracy of the trained model drops markedly in other environments.
Second, underwater images have low brightness and contrast, heavy noise and severe color distortion, which makes detection harder. Underwater image enhancement is needed as a preprocessing step for accurate detection, and smaller offsets in the input positions make the tracking result more reliable. General-purpose image enhancement techniques transfer poorly to underwater data, however, and underwater image restoration is harder still. Considerable room therefore remains for developing fish datasets and for deep-learning-based enhancement of image and video data.
(2) Problems with traditional tracking methods.
Fish tracking and trajectory prediction still rely mainly on traditional tracking techniques, and traditional tracking models are usually designed around kinematics. Kinematics-based trackers such as particle filtering, Kalman filtering and kernelized correlation filtering are widely used for fish. Although these improve overall tracking performance, the final accuracy drops whenever either sub-model, detection or tracking, loses precision. The two stages also cannot share a network, so computation speed is hard to improve, and mutual occlusion within the school and other complex conditions remain unsolved.
A traditional tracking pipeline consists of a detector stage and a tracker stage. In the detector stage, because there is little underwater debris and the fish bodies stand out clearly, difference methods are widely used to identify fish: background subtraction, inter-frame differencing and combined differencing. Some researchers perform background subtraction with a Gaussian mixture model and separate targets using the Gaussian probability density function. In the tracker stage, fish targets are mainly handled with filtering methods, commonly particle filtering and Kalman filtering. One proposed fish tracking method based on particle filtering performs data association with the global nearest neighbor method through adaptive partitioning, but it is not robust and produces obvious false detections in complex environments.
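As an illustration of the inter-frame differencing used in the detector stage of such pipelines, the following minimal Python sketch marks the pixels that change between two grayscale frames and returns one box around them. It is not taken from the patent: the function names, the list-of-lists frame representation and the threshold of 25 are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale image: rows of 0-255 intensities (assumed format)


def frame_difference_mask(prev: Frame, curr: Frame, threshold: int = 25) -> List[List[bool]]:
    """Mark pixels whose intensity changed by more than `threshold` between two frames."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def bounding_box(mask: List[List[bool]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (x_min, y_min, x_max, y_max) around all changed pixels, or None if nothing moved."""
    xs = [x for row in mask for x, moved in enumerate(row) if moved]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    previous = [[0] * 4 for _ in range(4)]
    current = [row[:] for row in previous]
    current[1][2] = 200  # a bright object appears in two pixels
    current[2][2] = 200
    print(bounding_box(frame_difference_mask(previous, current)))
```

A single box around all changed pixels is the simplest possible choice; it merges fish that move at the same time, which is one reason difference-based detectors struggle with dense schools.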
Because scenes and targets vary in complex ways, deep-learning-based target tracking has been one of the most challenging research directions in computer vision. Although deep-learning-based tracking has made significant progress since 2013, real scenes are often more complex than the evaluation data, so current tracking algorithms cannot simultaneously satisfy robustness, real-time performance and accuracy. Moreover, deep-learning detectors are affected by the underwater environment when transferred to fish targets and place high demands on image enhancement.
(3) Fish school occlusion.
Occlusion has always been a difficulty in detecting and tracking fish schools. Mutual occlusion between fish, or position changes caused by sudden acceleration between frames, make targets hard to re-identify and predict.
First, the dense distribution of a school causes severe mutual occlusion. Compared with person re-identification in crowds, the individual features of fish are harder to extract and the original target is harder to recover, so fish IDs switch frequently and classical crowd identification methods are unsuitable for fish school tracking. For short-term mutual occlusion, a kinematic model of individual fish must be built to match the missing target; for fixed-point occlusion, master-slave cameras or a camera with a mirror are used to link multi-camera multi-target tracking information and compute 3D coordinates.
Second, when fish are fed or startled, the school often accelerates suddenly, and this burst of acceleration can cause targets to be lost. Recording video at a high frame rate effectively addresses target loss due to acceleration, but it significantly increases the load on hardware, and the additional frames reduce training efficiency. Accurate re-identification and trajectory prediction with less data is therefore difficult.
Disclosure of Invention
The invention aims to provide a fish multi-target tracking method based on a balanced joint network, characterized by comprising the following steps:
Step 1, establishing and optimizing the dataset:
The China Agricultural Artificial Intelligence Innovation and Entrepreneurship Competition dataset is optimized, reorganized and supplemented to produce a new dataset, OptMFT. Joint training on this dataset further strengthens the robustness of the model in complex environments. Negative-sample frames with poor image quality are removed, which increases the training weight of data with clearly visible fish and clear swimming trajectories and further improves fish identification accuracy. In addition, full-frame training is performed on the dataset to improve robustness when the fish school swims at high speed.
The multi-fish multi-target tracking model is evaluated on the optimized multi-fish tracking dataset OptMFT, which mainly contains swimming behavior data of fish in a culture tank, recorded with a Hikvision camera at 25 frames per second. To make the dataset better suited to model training, it is re-optimized and reorganized;
Step 2, designing and optimizing the fish multi-target tracking algorithm:
A balanced multi-fish tracking network is designed for multi-target tracking. The model uses ResNet-101 with embedded deformable convolution as its backbone, extracts low-dimensional features with four detection and tracking heads, and feeds the high-resolution feature map into two branches. The detector and the tracker are fused into a single network: the detection branch detects fish targets, and the tracking branch tracks and matches them. The four heads are placed before the two branches and extract the feature information each branch needs from the high-resolution feature map produced by the backbone. Detection uses an anchor-free method to localize fish precisely; tracking borrows the idea of DeepSORT for target matching and re-identification. Together these achieve high-precision tracking and preservation of individual identity in complex culture pond environments.
Step 3, model training and evaluation:
In model training, a fish multi-target tracking data set of the open-world celestial unit is used. The dataset contains bait residue, light spots, water-surface ripples, mutual occlusion caused by dense schools, target loss caused by sudden acceleration, and variable contrast, so the trained model must be highly robust. Training runs on Ubuntu 18.04.2 LTS with one NVIDIA Tesla V100 GPU, which is used for model training and ablation experiments, and the model is tested and validated on the dataset. Full-frame training is performed on the OptMFT dataset to improve robustness under high-speed swimming.
The model is validated on multi-fish multi-target tracking by comparing the balanced joint fish multi-target tracking network with classical algorithms on the OptMFT_light dataset. First, Faster RCNN-50-FPN and Faster RCNN-101-FPN are trained as detector backbones and evaluated with SORT, DeepSORT and Tracktor to assess detection and re-identification performance.
Step 2 achieves high-precision tracking and preservation of individual identity for fish targets in complex culture pond environments. Specifically, the fish multi-target tracking method uses computer vision and deep learning and comprises the following steps:
(1) using the improved OptMFT dataset as the training set, resizing all images to 1088 × 1088 and feeding them to the target network for pre-training;
(2) using ResNet-101 as the backbone, replacing the 3 × 3 convolution layers in the fifth stage with deformable convolution and the final pooling layer with deformable RoI pooling, so that fish, which are non-rigid targets, are tracked effectively;
(3) designing two branches, detection and tracking, which extract fish detection features and individual re-identification features respectively; features are extracted from the high-resolution feature map with 3 detection heads and 1 re-identification (re-ID) head;
(4) applying a heatmap estimation head, a center offset head and a bounding box size head on the detection branch to extract the features needed for detection, and a re-ID head on the tracking branch to extract the features needed for tracking;
the heatmap estimation head estimates the center points of fish targets, and the response value $M_{xy}$ at heatmap location (x, y) can be written as:
Figure BDA0003411948690000071
the bounding box size head estimates the size of each fish target, and the center offset head computes the offset of the fish center position;
(5) balancing the learning of the two branches during training with the proposed weight balance loss, so that better target features for both detection and re-identification are learned within a single network;
the associated loss functions are defined as follows:
$L_{det} = \delta_h L_{hm} + \delta_o L_{os} + \delta_b L_{bs}$ (2)
Figure BDA0003411948690000072
Figure BDA0003411948690000073
$\delta_h$, $\delta_o$ and $\delta_b$ are the weight coefficients of the heatmap estimation head, the center offset head and the bounding box size head, and $\omega_1$, $\omega_2$ are learnable parameters for balancing the detection and tracking tasks;
(6) experiments show that the model tracks and re-identifies fish better with a feature dimension of 64, so the training feature dimension is set to 64 in order to recover more lost targets of interest.
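The heatmap head of step (4) can be illustrated with a short sketch. The patent's Equation (1) is only available as an image, so the code below assumes the Gaussian form commonly used by anchor-free, center-based detectors, in which each ground-truth center $(c_x, c_y)$ contributes $\exp(-((x - c_x)^2 + (y - c_y)^2) / (2\sigma^2))$ to the response at (x, y) and the contributions are summed. The function names, the value of `sigma` and the peak threshold are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[int, int]  # (x, y) in heatmap coordinates


def render_heatmap(centers: Sequence[Point], width: int, height: int,
                   sigma: float = 1.0) -> List[List[float]]:
    """Build a training target: one Gaussian bump per fish center (assumed form)."""
    return [
        [
            sum(math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
                for cx, cy in centers)
            for x in range(width)
        ]
        for y in range(height)
    ]


def find_peaks(heatmap: List[List[float]], threshold: float = 0.5) -> List[Point]:
    """Decode centers: keep cells above `threshold` that are maximal in their 3x3 window."""
    height, width = len(heatmap), len(heatmap[0])
    peaks = []
    for y in range(height):
        for x in range(width):
            value = heatmap[y][x]
            if value < threshold:
                continue
            window = [
                heatmap[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            if value >= max(window):
                peaks.append((x, y))
    return peaks


if __name__ == "__main__":
    target = render_heatmap([(2, 2), (7, 5)], width=10, height=8)
    print(find_peaks(target))
```

In a trained network the heatmap is predicted rather than rendered; the center offset head then refines each decoded peak, and the bounding box size head supplies the box dimensions at that location.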
In step 3, the model is validated on multi-fish multi-target tracking by comparing the balanced joint fish multi-target tracking network with classical algorithms on the OptMFT_light dataset. First, Faster RCNN-50-FPN and Faster RCNN-101-FPN are trained as detector backbones and tested with SORT, DeepSORT and Tracktor. With Faster RCNN-50-FPN as the detector backbone, the models achieve better MOTA and recall because the detector is highly accurate on fish targets, but the extremely high number of ID switches shows that target loss is not fundamentally solved. With Faster RCNN-101-FPN as the backbone, the models perform very poorly on MOTA and recall. The larger the number of false negatives (FN), the more pronounced the missed detections; the higher the precision, the better the detector identifies fish. Second, several typical multi-target tracking models of the JDE paradigm are tested to verify the effectiveness of the proposed method;
To compare performance against a real-time model in application, the balanced joint fish multi-target tracking network is compared with the DeepSORT tracker on the OptMFT dataset. DeepSORT has the better inference speed, reaching 32.16 FPS, but its lower IDF1 (21.6%) and MOTA (56.1%) indicate poor re-identification ability. In contrast, the balanced joint fish multi-target tracking network strikes a better balance between accuracy and speed.
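For readers unfamiliar with the two metrics quoted above, the sketch below computes them from their standard definitions: MOTA = 1 - (FN + FP + IDSW) / GT, and IDF1 = 2·IDTP / (2·IDTP + IDFP + IDFN). The function names are hypothetical and the counts in the usage lines are made-up numbers for illustration; they are not results from the patent.

```python
def mota(false_negatives: int, false_positives: int, id_switches: int,
         num_gt: int) -> float:
    """Multiple Object Tracking Accuracy: 1 - (FN + FP + IDSW) / GT.

    `num_gt` is the total number of ground-truth boxes over all frames.
    The value can be negative when errors outnumber ground-truth boxes.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """ID F1 score: harmonic mean of ID precision and ID recall."""
    denominator = 2 * idtp + idfp + idfn
    return 2 * idtp / denominator if denominator else 0.0


if __name__ == "__main__":
    # Illustrative counts only.
    print(f"MOTA = {mota(10, 5, 5, 100):.3f}")
    print(f"IDF1 = {idf1(50, 25, 25):.3f}")
```

MOTA is dominated by detection errors (FN and FP), while IDF1 rewards keeping the same identity on the same fish over time, which is why a tracker can score reasonably on MOTA and still re-identify poorly.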
The invention has the following characteristics:
(1) Fish in a school are small, swim fast, are easily occluded and change scale, all of which make them hard to track, and deep-learning-based tracking models for fish schools are still rare. The invention provides a model designed specifically for fish multi-target tracking. The features needed for detection and tracking are trained and extracted within a single network, which greatly improves training efficiency.
(2) The backbone of the network is compared and optimized: ResNet-101 with embedded deformable convolution is used as the backbone, four detection and tracking heads extract low-dimensional features, and the high-resolution feature map is used by both branches.
(3) A weight balance loss function is designed to optimize training of the two branches, so that the network learns position features, ID information and so on more efficiently and tracking accuracy improves. Training with different feature dimensions is supported, and good tracking results are obtained with feature dimensions of 128 and 64.
(4) To verify the model, the invention is compared experimentally with several multi-target tracking algorithms, including SORT, DeepSORT, Tracktor, CenterTrack and FairMOT. The experiments show that the model achieves a better balance between accuracy and speed, effectively reduces the number of ID switches, and can track fish schools in complex environments. Compared with the other models it handles occlusion, high speed and multiple scales better and achieves higher IDF1 and MOTA.
(5) The invention has wide application range, can have good effect in a plurality of culture environments and has strong practicability.
Drawings
FIG. 1 is a flow chart of the fish multi-target tracking method.
FIG. 2 shows examples from the OptMFT dataset: (a) normal, (b) disturbance, (c) circular swimming, (d) rest, (e) bait, (f) feeding, (g) ripples, (h) light spots;
FIG. 3 shows sample tracking results, where a, b, c and d are visualized tracking results of the model on the OptMFT dataset;
FIG. 4 shows sample tracking results on consecutive frames, where a, b, c and d are consecutive frames;
FIG. 5 shows the detection and tracking branches: (a) in the detection branch, three heads estimate the heatmap, center offset and bounding box size respectively; the heatmap determines position and the center offset corrects it; (b) in the tracking stage, an embedded re-identification (re-ID) branch extracts features and identifies low-level features of individual fish for re-identification.
Detailed Description
The invention provides a fish multi-target tracking method based on a balanced joint network, which is further explained below with reference to the drawings and an embodiment.
Deep learning currently requires large amounts of training data, while open-source fish tracking datasets are small. Underwater data are hard to collect because underwater cameras are costly to deploy and lighting is poor. Annotating tracking datasets is also relatively complex, so the multi-fish multi-target tracking model is trained on a high-frame-rate, high-resolution dataset from a culture experiment. The invention uses swimming behavior data of fish in a culture tank and evaluates on the optimized multi-fish tracking dataset OptMFT, with the following steps:
Step 1, establishing and optimizing the dataset:
The data for the multi-fish multi-target tracking model come from the China Agricultural Artificial Intelligence Innovation and Entrepreneurship Competition dataset, which is optimized, reorganized and supplemented to produce a new dataset, OptMFT. The original data mainly contain swimming behavior data of fish in a culture tank, recorded with a Hikvision camera at 25 frames per second. To better suit model training, the dataset is re-optimized and reorganized. First, joint training further strengthens the robustness of the model in complex environments. Second, negative-sample frames with poor image quality are removed, which increases the training weight of data with clearly visible fish and clear swimming trajectories and further improves fish identification accuracy. In addition, full-frame training is performed on the dataset to improve robustness when the fish school swims at high speed. The optimized dataset is shown in FIG. 2: (a) normal, (b) disturbance, (c) circular swimming, (d) rest, (e) bait, (f) feeding, (g) ripples, (h) light spots;
Step 2, designing and optimizing the fish multi-target tracking algorithm:
A balanced multi-fish tracking network is designed for multi-target tracking. The model uses ResNet-101 with embedded deformable convolution as its backbone, extracts low-dimensional features with four detection and tracking heads, and feeds the high-resolution feature map into two branches. The detector and the tracker are fused into a single network: the detection branch detects fish targets, and the tracking branch tracks and matches them. The four heads are placed before the two branches and extract the feature information each branch needs from the high-resolution feature map produced by the backbone. Detection uses an anchor-free method to localize fish precisely; tracking borrows the idea of DeepSORT for target matching and re-identification. Together these achieve high-precision tracking and preservation of individual identity in complex culture pond environments. Specifically, a balanced multi-fish multi-target tracking network that joins detection and tracking is designed using computer vision and deep learning. The design and implementation of the network are shown in FIG. 1 and comprise the following steps:
(1) the above improved OptMFT dataset was used as a training set for the model, with uniform image size to 1088 × 1088, and input into the target network for model pre-training.
(2) And replacing the 3 x 3 convolution layer of the fifth stage in the network with deformable convolution by using ResNet-101 as a backbone network, and replacing the final pooling layer with deformable RoIpooling to realize effective tracking of the fish target with non-rigid characteristics.
(3) A detection branch and a tracking branch are designed to extract, respectively, the fish target detection features and the individual re-identification features, as shown in fig. 5, where a: in the detection branch, three heads estimate the heatmap, the center offset and the bounding box size, respectively. The heatmap determines the position, and the center offset corrects it; b: in the tracking branch, an embedded re-identification head extracts low-level features of each fish body for re-identification.
Feature extraction on the high-resolution feature map is performed by 3 detection heads and 1 re-identification head.
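The way the three detection outputs combine into boxes can be sketched as follows. This is a minimal illustration in the style of anchor-free center-point detectors, not the patent's decoder: the stride of 4, the score threshold of 0.4, the 3 × 3 local-maximum test and the function name `decode_detections` are all assumptions.

```python
def decode_detections(heatmap, offset, size, stride=4, threshold=0.4):
    """Turn heatmap peaks into (x1, y1, x2, y2, score) boxes in input-image pixels.

    heatmap[y][x] : center score in [0, 1]
    offset[y][x]  : (dx, dy) sub-cell correction of the center
    size[y][x]    : (w, h) of the box, in feature-map cells
    stride and threshold are illustrative assumptions.
    """
    h, w = len(heatmap), len(heatmap[0])
    boxes = []
    for y in range(h):
        for x in range(w):
            score = heatmap[y][x]
            if score < threshold:
                continue
            # Keep only local maxima of the 3x3 neighbourhood (a simple stand-in for NMS).
            neighbours = [heatmap[ny][nx]
                          for ny in range(max(0, y - 1), min(h, y + 2))
                          for nx in range(max(0, x - 1), min(w, x + 2))]
            if score < max(neighbours):
                continue
            dx, dy = offset[y][x]
            bw, bh = size[y][x]
            cx, cy = (x + dx) * stride, (y + dy) * stride
            half_w, half_h = bw * stride / 2, bh * stride / 2
            boxes.append((cx - half_w, cy - half_h, cx + half_w, cy + half_h, score))
    return boxes


if __name__ == "__main__":
    hm = [[0.1, 0.1, 0.1], [0.1, 0.9, 0.1], [0.1, 0.1, 0.1]]
    off = [[(0.0, 0.0)] * 3 for _ in range(3)]
    sz = [[(0.0, 0.0)] * 3 for _ in range(3)]
    off[1][1] = (0.5, 0.25)
    sz[1][1] = (2.0, 1.0)
    print(decode_detections(hm, off, sz))
```

The heatmap proposes where a fish center lies, the offset refines that position below the feature-map resolution, and the size head supplies the box extent, which mirrors the division of labor described in step (3).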
(4) A heatmap estimation head, a center offset head and a bounding box size head are applied on the detection branch to extract the feature information required for target detection; a re-identification head is applied on the tracking branch to extract the feature information required for target tracking.
The heatmap estimation head estimates the center point of each fish body target; the response value $M_{xy}$ of the heatmap at point (x, y) can be written as:
Figure BDA0003411948690000111
The bounding box size head estimates the size of the fish body target, and the center offset head calculates the center offset of the fish body position.
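The equation for the heatmap response is available here only as an image reference, so its exact form cannot be confirmed from the text. For orientation, the sketch below shows a form widely used in anchor-free center-point trackers, a sum of Gaussians centered on the ground-truth object centers. This form, the parameter `sigma` and the function name `heatmap_response` are assumptions and may differ from the patent's equation (1); some implementations take the element-wise maximum instead of the sum.

```python
import math


def heatmap_response(x, y, centers, sigma):
    """Assumed heatmap target at (x, y): a sum of Gaussians around object centers.

    centers : list of (cx, cy) ground-truth centers on the feature map
    sigma   : standard deviation of each Gaussian (an illustrative parameter)
    """
    return sum(
        math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
        for cx, cy in centers
    )


if __name__ == "__main__":
    centers = [(10.0, 10.0)]
    print(heatmap_response(10.0, 10.0, centers, sigma=2.0))  # response at the center
    print(heatmap_response(12.0, 10.0, centers, sigma=2.0))  # one sigma away
```

Under this assumed form the response is 1 at an object center and decays smoothly with distance, which lets the heatmap head be trained as a dense per-pixel regression.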
(5) The proposed weight-balanced loss is used to balance learning between the two branches during training, so that better target features for both detection and re-identification are obtained while training a single network.
The related loss functions are defined as follows:
$L_{det} = \delta_h L_{hm} + \delta_o L_{os} + \delta_b L_{bs}$ (2)
Figure BDA0003411948690000121
Figure BDA0003411948690000122
$\delta_h$, $\delta_o$ and $\delta_b$ are the weight coefficients of the heatmap estimation head, the center offset head and the bounding box size head, respectively; $\omega_1$ and $\omega_2$ are learnable parameters used to balance the detection and tracking tasks.
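A minimal numeric sketch of how such losses can be combined is given below. The detection loss is taken as a weighted sum of the three head losses, which is the reading of equation (2) assumed here. The equations involving the learnable parameters are available only as image references, so the balancing function uses an uncertainty-style weighting known from joint detection and embedding trackers; that form, the default coefficient values and both function names are assumptions, not the patent's definitions.

```python
import math


def detection_loss(l_hm, l_os, l_bs, d_h=1.0, d_o=1.0, d_b=0.1):
    """Weighted sum of heatmap, center-offset and box-size losses.

    d_h, d_o, d_b play the role of the three weight coefficients; the default
    values are illustrative assumptions.
    """
    return d_h * l_hm + d_o * l_os + d_b * l_bs


def balanced_total_loss(l_det, l_id, w1, w2):
    """Assumed uncertainty-style balance of detection and re-identification losses.

    w1 and w2 stand in for the learnable balance parameters. A larger w_i
    down-weights its task, while the additive w_i terms keep the parameters
    from growing without bound.
    """
    return 0.5 * (math.exp(-w1) * l_det + math.exp(-w2) * l_id + w1 + w2)


if __name__ == "__main__":
    l_det = detection_loss(l_hm=1.0, l_os=2.0, l_bs=10.0)
    print(l_det)
    print(balanced_total_loss(l_det, l_id=2.0, w1=0.0, w2=0.0))
```

In a real training loop `w1` and `w2` would be trainable tensors updated by the optimizer together with the network weights.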
(6) Experiments show that the model tracks better on fish re-identification when the re-identification feature dimension is 64, so the training feature dimension of the model is set to 64 in order to recover more lost targets of interest.
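How re-identification embeddings can be used to recover identities may be sketched as below: each existing track and each new detection carries an embedding vector (64-dimensional in the text), and detections are assigned to the closest track by cosine distance. For brevity the sketch uses greedy assignment rather than the cascade and Hungarian matching of DeepSORT; the distance threshold of 0.4 and the function names are assumptions for illustration.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity of two non-zero embedding vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return 1.0 - dot / (norm_a * norm_b)


def greedy_match(track_embs, det_embs, max_distance=0.4):
    """Assign detections to tracks by ascending embedding distance.

    track_embs : dict mapping track id -> embedding
    det_embs   : list of detection embeddings
    Returns a dict mapping detection index -> track id. Greedy matching is a
    simplification; max_distance is an illustrative threshold.
    """
    pairs = sorted(
        (cosine_distance(t, d), tid, j)
        for tid, t in track_embs.items()
        for j, d in enumerate(det_embs)
    )
    used_tracks, used_dets, matches = set(), set(), {}
    for dist, tid, j in pairs:
        if dist > max_distance:
            break  # all remaining pairs are even farther apart
        if tid in used_tracks or j in used_dets:
            continue
        matches[j] = tid
        used_tracks.add(tid)
        used_dets.add(j)
    return matches


if __name__ == "__main__":
    # 2-D embeddings for readability; the model described in the text uses 64 dimensions.
    tracks = {1: [1.0, 0.0], 2: [0.0, 1.0]}
    detections = [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]]
    print(greedy_match(tracks, detections))
```

Detections left unmatched, such as the third one in the example, would start new tracks or be compared against recently lost tracks, which is where a discriminative embedding helps recover fish after occlusion.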
Step 3, model training and evaluation:
In the model training, a fish multi-target tracking data set of the open-world celestial unit is used. The data set exhibits bait residue, light spots, water surface ripple disturbance, mutual occlusion caused by dense distribution of the fish school, target loss caused by sudden acceleration of the fish, and variable contrast, so the trained model is required to be highly robust. Training uses an Ubuntu 18.04.2 LTS system, with 1 NVIDIA Tesla V100 graphics card for model training and ablation experiments. Full-frame training is carried out on the OptMFT data set to enhance the robustness of the model under high-speed swimming, and testing and verification are carried out on the same data set.
The tracking performance of the model is tested on the OptMFT data set. The results are shown in FIG. 3, where a, b, c and d are sampled visualizations of the model's tracking results on the OptMFT data set.
In complex water environments with varying contrast and residual bait, the model of the invention detects fish targets well. Continuous-frame tests show that the model can recover targets to some extent under occlusion, accelerated swimming, turning and similar conditions, although targets are still lost in difficult cases. The continuous-frame tracking results are shown in fig. 4, where a, b, c and d are consecutive tracked frames.
The verification results of the model in step 3 for multi-fish multi-target tracking are as follows. The balanced joint fish multi-target tracking network is compared with classical algorithms on the OptMFT_light data set. First, Faster RCNN-50-FPN and Faster RCNN-101-FPN are used as the detector backbone networks for SORT, DeepSORT and Tracktor. The experiments show that with Faster RCNN-50-FPN as the detector backbone, these models score better on MOTA and recall, because the detector is highly accurate on fish school targets; however, the extremely high number of ID switches indicates that the target loss problem is not fundamentally solved. With Faster RCNN-101-FPN as the backbone network, the models perform very poorly on MOTA and recall. A larger number of false negatives (FN) means more missed detections, and a higher precision means the detector identifies fish better. Second, several typical JDE-paradigm multi-target tracking models are tested to verify the effectiveness of the proposed method. The experiments show that CenterTrack performs relatively poorly on FN, which means more missed detections; the higher IDF1 of FairMOT (37.0%) indicates that it markedly reduces false negatives and effectively improves re-identification performance. The balanced joint fish multi-target tracking network achieves the highest IDF1, 38.6%, which means it performs best at tracking with ID information. In addition, it achieves the highest MOTA (71.4%) and Recall (73.6%) of the JDE paradigm, compared with FairMOT (MOTA 71.4%, Recall 73.2%).
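For readers unfamiliar with the metrics named above, the following sketch computes MOTA, recall and IDF1 from their standard definitions. The counts in the usage example are made-up illustrative numbers, not results from the patent.

```python
def mota(fn, fp, idsw, gt):
    """Multiple Object Tracking Accuracy: 1 - (FN + FP + ID switches) / ground-truth boxes."""
    return 1.0 - (fn + fp + idsw) / gt


def recall(tp, fn):
    """Fraction of ground-truth boxes that were detected."""
    return tp / (tp + fn)


def idf1(idtp, idfp, idfn):
    """ID F1 score: harmonic mean of identity precision and identity recall."""
    return 2 * idtp / (2 * idtp + idfp + idfn)


if __name__ == "__main__":
    # Illustrative counts only.
    print(round(mota(fn=20, fp=5, idsw=5, gt=100), 3))
    print(round(recall(tp=80, fn=20), 3))
    print(round(idf1(idtp=40, idfp=10, idfn=30), 3))
```

Because MOTA is dominated by false negatives and false positives, a tracker can score well on MOTA while still switching identities often; IDF1 is the metric that reflects how consistently each fish keeps its ID.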
The invention provides a multi-target tracking model designed specifically for fish schools: by sharing features between the detection and tracking tasks, tracking is realized efficiently; the features required by the two branches (shown in fig. 5) and the imbalance in feature dimensions are balanced; and the anchor-free method alleviates, to a certain extent, the problem of mutual occlusion within fish schools. The MOTA criteria were achieved on the optimized OptMFT data set.

Claims (3)

1. A fish multi-target tracking method based on a balanced joint network, characterized by comprising the following steps:
step 1, establishing and optimizing a data set: on the basis of the original data set from the China Agricultural Artificial Intelligence Innovation and Entrepreneurship Competition, the data are optimized, sorted and supplemented to generate a new OptMFT data set; joint training on the data set further enhances the robustness of the model in complex environments; meanwhile, negative-sample image frames with poor image quality are removed, which strengthens the training weight of data with clearly visible fish targets and clear swimming tracks and further improves the accuracy of the model for fish identification; in addition, full-frame training is carried out on the data set to enhance and verify the robustness of the model when the fish school swims at high speed;
the fish multi-target tracking model is tested on the optimized fish multi-target tracking data set OptMFT, which mainly comprises fish swimming behavior data recorded in a culture barrel with a Haokwev video camera at a video frame rate of 25 frames/second; to make the data set better suited to training the model, the data set is re-optimized and sorted;
step 2, designing and optimizing a fish multi-target tracking algorithm: a balanced fish multi-target tracking network is designed to realize multi-target tracking; the fish multi-target tracking model uses ResNet-101 with embedded deformable convolution as the backbone network; the detector and the tracker are fused into a single network, in which a detection branch and a tracking branch respectively detect the fish targets and track and match them; four different detection and tracking heads are placed before the two branches to extract, from the high-resolution feature map produced by the backbone network, the low-dimensional feature information each branch requires; detection uses an anchor-free method to locate fish targets accurately, and tracking borrows the concept of DeepSORT for target matching and re-identification, so that high-precision tracking and preservation of individual information of fish targets are realized in a complex culture pond environment;
step 3, model training and evaluation: a spacious astronomical fish multi-target tracking data set is used in model training, the data set exhibiting bait residue, light spots, water surface ripple disturbance, mutual occlusion caused by dense distribution of the fish school, target loss caused by sudden acceleration of the fish, and variable contrast, so that the trained model is required to be highly robust; training uses an Ubuntu 18.04.2 LTS system, with 1 NVIDIA Tesla V100 graphics card for model training and ablation experiments; full-frame training is carried out on the OptMFT data set to enhance the robustness of the model under high-speed swimming, and testing and verification are carried out on the data set;
verification of the model in fish multi-target tracking: the balanced joint fish multi-target tracking network is compared with classical algorithms on the OptMFT_light data set; first, Faster RCNN-50-FPN and Faster RCNN-101-FPN are used as the detector backbone networks, and SORT, DeepSORT and Tracktor are tested experimentally in order to evaluate detection and re-identification performance.
2. The fish multi-target tracking method based on the balanced joint network as claimed in claim 1, wherein in step 2, high-precision tracking and preservation of individual information of fish targets are realized in a complex culture pond environment; the fish multi-target tracking method adopts computer vision and deep learning technology and comprises the following steps:
(1) using the improved OptMFT data set as the training set of the model, resizing the images to a uniform 1088 × 1088, and inputting them into the target network for model pre-training;
(2) using ResNet-101 as the backbone network, replacing the 3 × 3 convolution layers of the fifth stage with deformable convolutions, and replacing the final pooling layer with deformable RoI pooling, so as to track effectively fish targets with non-rigid characteristics;
(3) designing a detection branch and a tracking branch for extracting fish target detection features and individual re-identification features, respectively; performing feature extraction on the high-resolution feature map with 3 detection heads and 1 re-identification head;
(4) applying a heatmap estimation head, a bounding box size head and a center offset head on the detection branch to extract the feature information required for target detection; applying a re-identification head on the tracking branch to extract the feature information required for target tracking;
the heatmap estimation head is used for estimating the center point of a fish body target, and the response value $M_{xy}$ of the heatmap at point (x, y) can be written as:
Figure FDA0003411948680000031
the bounding box size head is used for estimating the size of the fish body target, and the center offset head is used for calculating the center offset of the fish body position;
(5) using the proposed weight-balanced loss to balance learning between the two branches during training, so that better target features for both detection and re-identification are obtained while training a single network.
The related loss functions are defined as follows:
$L_{det} = \delta_h L_{hm} + \delta_o L_{os} + \delta_b L_{bs}$ (2)
Figure FDA0003411948680000032
Figure FDA0003411948680000033
$\delta_h$, $\delta_o$ and $\delta_b$ are the weight coefficients of the heatmap estimation head, the center offset head and the bounding box size head, respectively; $\omega_1$ and $\omega_2$ are learnable parameters for balancing the detection and tracking tasks;
(6) experiments show that the model tracks better on fish re-identification when the re-identification feature dimension is 64, so the training feature dimension of the model is set to 64 in order to recover more lost targets of interest.
3. The fish multi-target tracking method based on the balanced joint network as claimed in claim 1, wherein in step 3 the model is verified in multi-fish multi-target tracking: the balanced joint fish multi-target tracking network is compared with classical algorithms on the OptMFT_light data set; first, Faster RCNN-50-FPN and Faster RCNN-101-FPN are used as the detector backbone networks for SORT, DeepSORT and Tracktor; the experiments show that with Faster RCNN-50-FPN as the detector backbone, the models score better on MOTA and recall, because the detector is highly accurate on fish school targets, while the extremely high number of ID switches indicates that the target loss problem is not fundamentally solved; with Faster RCNN-101-FPN as the backbone network, the models perform very poorly on MOTA and recall; a larger number of false negatives (FN) means more missed detections, and a higher precision means the detector identifies fish better; second, several typical JDE-paradigm multi-target tracking models are tested to verify the effectiveness of the proposed method.
CN202111532580.5A 2021-12-15 2021-12-15 Fish multi-target tracking method based on balance joint network Pending CN114202563A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111532580.5A CN114202563A (en) 2021-12-15 2021-12-15 Fish multi-target tracking method based on balance joint network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111532580.5A CN114202563A (en) 2021-12-15 2021-12-15 Fish multi-target tracking method based on balance joint network

Publications (1)

Publication Number Publication Date
CN114202563A true CN114202563A (en) 2022-03-18

Family

ID=80653868

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111532580.5A Pending CN114202563A (en) 2021-12-15 2021-12-15 Fish multi-target tracking method based on balance joint network

Country Status (1)

Country Link
CN (1) CN114202563A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114782759A (en) * 2022-06-22 2022-07-22 鲁东大学 Method for detecting densely-occluded fish based on YOLOv5 network
CN116363494A (en) * 2023-05-31 2023-06-30 睿克环境科技(中国)有限公司 Fish quantity monitoring and migration tracking method and system
CN116721132A (en) * 2023-06-20 2023-09-08 中国农业大学 Multi-target tracking method, system and equipment for industrially cultivated fishes

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114782759A (en) * 2022-06-22 2022-07-22 鲁东大学 Method for detecting densely-occluded fish based on YOLOv5 network
CN114782759B (en) * 2022-06-22 2022-09-13 鲁东大学 Method for detecting densely-occluded fish based on YOLOv5 network
US11790640B1 (en) 2022-06-22 2023-10-17 Ludong University Method for detecting densely occluded fish based on YOLOv5 network
CN116363494A (en) * 2023-05-31 2023-06-30 睿克环境科技(中国)有限公司 Fish quantity monitoring and migration tracking method and system
CN116721132A (en) * 2023-06-20 2023-09-08 中国农业大学 Multi-target tracking method, system and equipment for industrially cultivated fishes
CN116721132B (en) * 2023-06-20 2023-11-24 中国农业大学 Multi-target tracking method, system and equipment for industrially cultivated fishes

Similar Documents

Publication Publication Date Title
Wang et al. Real-time underwater onboard vision sensing system for robotic gripping
CN114202563A (en) Fish multi-target tracking method based on balance joint network
CN111046967A (en) Underwater image classification method based on convolutional neural network and attention mechanism
Li et al. CMFTNet: Multiple fish tracking based on counterpoised JointNet
Wang et al. Vision-based in situ monitoring of plankton size spectra via a convolutional neural network
Cao et al. Automatic coarse-to-fine joint detection and segmentation of underwater non-structural live crabs for precise feeding
Fu et al. A case study of utilizing YOLOT based quantitative detection algorithm for marine benthos
Li et al. A self-attention feature fusion model for rice pest detection
Liu et al. A multitask model for realtime fish detection and segmentation based on YOLOv5
CN110334703B (en) Ship detection and identification method in day and night image
Cao et al. Learning-based low-illumination image enhancer for underwater live crab detection
Ouyang et al. An anchor-free detector with channel-based prior and bottom-enhancement for underwater object detection
Alsaadi et al. An automated mammals detection based on SSD-mobile net
CN108038872B (en) Dynamic and static target detection and real-time compressed sensing tracking research method
Zhang et al. An underwater fish individual recognition method based on improved YoloV4 and FaceNet
Xu et al. Cucumber flower detection based on YOLOv5s-SE7 within greenhouse environments
Cui et al. Reverse attention dual-stream network for extracting laver aquaculture areas from GF-1 remote sensing images
Ge et al. Real-time object detection algorithm for Underwater Robots
Siripattanadilok et al. Recognition of partially occluded soft-shell mud crabs using Faster R-CNN and Grad-CAM
Zhou et al. Fish density estimation with multi-scale context enhanced convolutional neural network
Li et al. Underwater object detection based on improved SSD with convolutional block attention
Tarekegn et al. Underwater Object Detection using Image Enhancement and Deep Learning Models
Yu et al. Precise segmentation of remote sensing cage images based on SegNet and voting mechanism
Pang et al. Target tracking based on siamese convolution neural networks
Muri et al. Temperate fish detection and classification: A deep learning based approach

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination