CN115496780A - Fish identification counting, speed measurement early warning and access quantity statistical system - Google Patents

Fish identification counting, speed measurement early warning and access quantity statistical system

Info

Publication number
CN115496780A
CN115496780A
Authority
CN
China
Prior art keywords
fish
algorithm
fishes
picture
coordinates
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211148365.XA
Other languages
Chinese (zh)
Inventor
徐敬
廖文栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202211148365.XA priority Critical patent/CN115496780A/en
Publication of CN115496780A publication Critical patent/CN115496780A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30232Surveillance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30241Trajectory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30242Counting objects in image
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/80Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in fisheries management
    • Y02A40/81Aquaculture, e.g. of fish

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a fish identification counting, speed measurement early warning and in-out quantity counting system. A user can build a custom training data set suited to their own situation; by combining the trained weight file with the Deepsort-based multi-target tracking algorithm, the counting algorithm based on biconvex trajectory measurement, and the visual-gradient-based speed measurement algorithm used in the invention, the system realizes multi-target tracking of fish, counting the number of each fish species in the picture, calculating the number of fish entering and leaving the monitored picture, measuring the swimming speed of each fish, and related functions. The system can thereby achieve intelligent management and control of whichever ornamental fish the user needs to manage. The method also has good application extensibility: by adding further algorithms according to user requirements, many secondary developments are possible.

Description

Fish identification counting, speed measurement early warning and access quantity statistical system
Technical Field
The invention relates to the field of artificial intelligence, in particular to a fish identification counting, speed measurement early warning and in-out quantity counting system.
Background
In recent years, with the rise in the economic level of the Chinese people, artistic appreciation and leisure activities have gradually become popular, and keeping and raising ornamental fish is one such popular pastime: the beautiful sight of marine ornamental fish can be seen in many aquariums, shopping malls, hotels, tourist attractions, entertainment and exhibition venues, and even private homes. The scale of marine ornamental fish culture and exhibition has also grown in recent years, bringing a series of problems with it. Because marine ornamental fish command high prices and must satisfy the public's aesthetic expectations and bring viewers pleasure, they usually require fine-grained management and scientific rearing. Manual management, however, is time-consuming, labor-intensive, and only moderately effective, so artificial intelligence technology is urgently needed to alleviate the problem.
At present, artificial intelligence technology has only been applied to identifying fish species; no further research has been done to monitor additional fish indicators or to explore intelligent aquaculture in greater depth.
Disclosure of Invention
The invention provides a fish identification counting, speed measurement early warning and in-out quantity counting system to address the problems described in the background art. The system can dynamically track fish, count the number of each fish species in the picture, calculate the number of fish entering and leaving the monitored picture, and measure and give early warning on the swimming speed of each fish.
1. The technical scheme of the invention comprises the following steps:
s1: the method comprises the steps of collecting videos or pictures of 5 ornamental fishes as a data set, labeling the fish coordinates in the pictures by using a genie labeling assistant, and then generating a labeling file. The labeled data set is divided into a training set and a verification set by using a Python program script.
S2: and putting the data set into a network structure of YOLOv5 for training, and storing the trained weight file after the training is finished.
S3: and calling the weight file to perform target detection on the video frame, and detecting a fish target frame in the picture. And processing the fish by a Deepsort-based multi-target tracking algorithm and a visual gradient-based speed measurement algorithm to obtain the sequence number and the speed of the fish target frame.
S4: and then obtaining the number of the entering and leaving fishes through a counting algorithm based on biconvex trajectory measurement.
S5: and outputting a system detection result.
2. In some alternative embodiments, step S1 comprises:
s11: and recording the video of the ornamental fish, taking the photo of the ornamental fish and searching the video or the photo of the ornamental fish on the internet by using a camera, and taking the video and the photo as a training data set. The video is converted into pictures for training, so the following operations are performed. Simultaneously pressing a Windows key and an R key on a keyboard, inputting cmd, clicking an open command line window, and inputting pip install ffmpeg to install the ffmpeg (a toolkit in a command line). After ffmpeg is installed, ffmpeg is used in the command line to convert the video into a picture data set according to a certain sampling rate, and instructions such as: mpeg-i test. Mp4-r5-f image2 \ output \ frame _%05d. Jpg, where-i is followed by a video file suffixed mp 4; wherein-r is followed by a sampling rate, i.e. several pictures are split in one second of video; where-f is followed by the type of picture saved, and finally the output picture save folder and naming format, where% 05d indicates a sequence number of 5 digits, and the picture naming is continued from frame _00000.jpg, frame _00001.jpg, frame _00002.jpg \8230, 8230, all the time.
S12: a genie marking assistant (software for making a label on a data set) is used for marking the data set, several kinds of ornamental fishes are defined, and 5 kinds of ornamental fishes are used in the development of the system: clown Fish (Clowfish), spotted puffer Fish (Yellow Box Fish), horse Fish (Schooling Banner Fish), emperor Angel Fish (Emperor Angelfish), yellow Tang Fish (Yellow Tang Fish); then, a box is drawn with a mouse for all the ornamental fishes appearing in each picture, the ornamental fishes are framed, and an ornamental fish species, such as one of the aforementioned 5 species, is selected for each box. After all the annotations are finished, setting an export path and an export format, wherein the export format is selected as passacal-voc, and 5228 pictures generate 5228 annotation files with the suffixes of xml format.
S13: because the training uses a markup file suffixed with txt, then a markup file conversion is performed. The Python program script is used for batch conversion. Import the relevant library file using import os and import xml. Etree. Elementtree, convert the pixel coordinates in xml file to txt normalized coordinates, and the fish species are represented by 5 numbers 0,1,2,3, 4. The conversion equations are shown in equation 1, equation 2, equation 3, and equation 4.
x = (x_min + x_max) / (2 · width)  (formula 1)
y = (y_min + y_max) / (2 · height)  (formula 2)
w = (x_max − x_min) / width  (formula 3)
h = (y_max − y_min) / height  (formula 4)
where (x, y) — normalized coordinates of the center of the box
w, h — normalized width and height of the box
(x_min, y_min) — coordinates of the upper-left corner of the box
(x_max, y_max) — coordinates of the lower-right corner of the box
width, height — width and height of the entire picture
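The conversion of S13 (formulas 1 to 4) can be sketched as follows; the element names assume the standard Pascal VOC annotation layout, and the class list is taken from S12, but the patent's actual script is not given.

```python
import xml.etree.ElementTree as ET

# Class index 0..4, in the order listed in step S12 (assumed label spellings).
CLASSES = ["Clownfish", "Yellow Box Fish", "Schooling Banner Fish",
           "Emperor Angelfish", "Yellow Tang Fish"]

def voc_box_to_yolo(x_min, y_min, x_max, y_max, width, height):
    # Formulas 1-4: normalized center coordinates and box size.
    x = (x_min + x_max) / (2 * width)
    y = (y_min + y_max) / (2 * height)
    w = (x_max - x_min) / width
    h = (y_max - y_min) / height
    return x, y, w, h

def convert_annotation(xml_text):
    """Turn one Pascal VOC annotation into YOLO txt lines (assumed layout)."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = float(size.find("width").text)
    height = float(size.find("height").text)
    lines = []
    for obj in root.iter("object"):
        cls = CLASSES.index(obj.find("name").text)
        b = obj.find("bndbox")
        x, y, w, h = voc_box_to_yolo(float(b.find("xmin").text),
                                     float(b.find("ymin").text),
                                     float(b.find("xmax").text),
                                     float(b.find("ymax").text),
                                     width, height)
        lines.append(f"{cls} {x:.6f} {y:.6f} {w:.6f} {h:.6f}")
    return lines

sample = """<annotation><size><width>1920</width><height>1280</height></size>
<object><name>Clownfish</name><bndbox><xmin>0</xmin><ymin>0</ymin>
<xmax>960</xmax><ymax>640</ymax></bndbox></object></annotation>"""
print(convert_annotation(sample))  # ['0 0.250000 0.250000 0.500000 0.500000']
```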
3. In some alternative embodiments, step S2 comprises:
s21: in the Input part of the network fabric. (1) Firstly conducting Mosaic data enhancement, and carrying out zooming, distribution and cutting on 4 input images by the algorithm to randomly splice the images. The method enables input images to be diversified, the trained model is more beneficial to detecting small objects, and meanwhile, originally processed 4 pictures are changed into 1 picture, so that the calculated amount is reduced. (2) Then, an adaptive picture scaling algorithm is carried out, and the algorithm is used during testing and model inference, so that the inference speed of the algorithm is greatly improved. (3) And then, performing a self-adaptive anchor frame calculation algorithm, putting a program corresponding to the tested initial anchor frame value into the original code, and obtaining the optimal anchor frame value each time the data set is trained.
S22: in the BackBone part of the network structure, in the Focus structure, a picture is sliced, then a Concat operation is carried out, and finally the picture becomes an easily processed characteristic diagram after a convolution operation. This reduces floating point operations and speeds up computation. For example, the 796 × 796 × 3 picture is sliced, spliced and convolved to become a 398 × 398 × 12 feature map. Then, the feature map is processed by a CSPNet structure, the operation of the structure mainly splits the feature map into two parts, one part is processed by convolution operation, and the other part is spliced with the result of the convolution operation of the previous part. This structure can reduce the amount of calculation, but the improvement in precision is small.
S23: in the Neck part of the network structure, an FPN-PAN structure is used, multi-dimensional feature extraction is carried out on the structure, and the receptive field is greatly increased.
S24: in the Head part of the network structure, a CIOU Loss function is used as a regression Loss function of the bounding box, the Loss function considers more comprehensive information, and the actual effect is better. In the post-processing stage of target detection, non-maximum suppression operation is performed for screening of a plurality of target frames. From FIG. 6, equation 5 for calculating CIOU Loss is derived.
Figure BDA0003855731070000025
Distance _ C in the formula-the diagonal length of the minimum bounding rectangle
Distance _ 2-length of line connecting center points of two bounding boxes
IOU-cross-over ratio
v-measure aspect ratio uniformity parameter
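A minimal sketch of formula 5 follows, assuming it matches the standard CIoU definition (with the aspect-ratio term weighted by α = v / ((1 − IOU) + v)); the box format and helper names are illustrative.

```python
import math

def ciou_loss(box_a, box_b):
    """CIOU Loss for two (x_min, y_min, x_max, y_max) boxes, following the
    standard CIoU definition that formula 5 appears to describe."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection over union.
    ix = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    iy = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = ix * iy
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Distance_2 squared: distance between the two box centers.
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
       + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Distance_C squared: diagonal of the minimum enclosing rectangle.
    dc = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
       + (max(ay2, by2) - min(ay1, by1)) ** 2
    # v measures aspect-ratio consistency; alpha = v / ((1 - IOU) + v).
    v = (4 / math.pi ** 2) * (math.atan((ax2 - ax1) / (ay2 - ay1))
                              - math.atan((bx2 - bx1) / (by2 - by1))) ** 2
    alpha = v / ((1 - iou) + v + 1e-9)
    return 1 - iou + d2 / dc + alpha * v

print(round(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)), 6))  # identical boxes -> 0.0
```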
4. In some alternative embodiments, step S3 comprises:
s31: the method mainly comprises the following steps of a Deepsort-based multi-target tracking algorithm: (1) Firstly, a trained fish classification weight file is used for detecting a fish target in a video frame to obtain characteristic information, wherein the characteristic information comprises position information, category information, confidence information and the like of the fish. (2) The matching target is carried out by calculating the matching degree of the two frames of image information before and after, and then the serial number is distributed to each tracked target, so that the dynamic tracking of various fishes is realized. The Deepsort algorithm adds cascade matching and new track confirmation, and solves the problem that the object in the Sort algorithm is disconnected due to shielding. The tracks are divided into acknowledged tracks and unacknowledged tracks. The newly generated track must be continuously matched with the detector for a plurality of times, and the unconfirmed track can be converted into the confirmed track; the trace of the confirmation state is continuously disconnected with the detector for a plurality of times, and then the trace of the confirmation state is deleted.
S32: the core idea of the speed measurement algorithm based on the visual gradient is as follows. Pixel coordinate recording (1920 × 1280 resolution) is performed first: and (4) recording all the serial numbers of the fishes and the center XY coordinates detected and sequenced by the Deepsort algorithm in a list. When a plurality of video frames are spaced and the same ID number is tracked, the pixel point speed is calculated, and the calculation formulas are formula 6 and formula 7.
(x-x -1 ) 2 +(y-y -1 ) 2 =Distance 2 (formula 6)
Figure BDA0003855731070000031
Where x, y-position coordinates after movement
x -1 ,y -1 Position coordinates before movement
Distance-Distance of moving position
t-time of movement
Figure BDA0003855731070000032
-pixel point velocity
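Formulas 6 and 7 amount to the following computation for one fish ID tracked across a frame interval; the coordinates and time value are illustrative.

```python
import math

def pixel_speed(prev, curr, t):
    """Formulas 6 and 7: Euclidean pixel distance between two tracked
    center coordinates of the same fish ID, divided by the elapsed time t."""
    distance = math.hypot(curr[0] - prev[0], curr[1] - prev[1])
    return distance / t

# A fish center moves 3 px right and 4 px down over 0.5 s: 5 px / 0.5 s.
print(pixel_speed((100, 200), (103, 204), 0.5))  # 10.0
```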
The speed of movement in a video frame, i.e. the number of pixels moved per unit time, is called the pixel speed. The pixel point speed still differs from the actual speed, so the physical model and lens tilt angle of the camera must be analyzed. In the same time and over the same swimming distance, a fish near the camera passes through more pixel points, while a fish far from the camera passes through fewer. Therefore the former must be scaled down and the latter scaled up by a proportionality coefficient, but this parameter is difficult to find by hand. The best solution is to measure several groups of actual speeds and pixel point speeds and solve the equation by machine learning, as shown in formula 8.
v = λ · v_pixel, λ = Σ_{i=0}^{n} s_i · y^i  (formula 8)
where λ — proportionality coefficient between the actual speed and the pixel speed
s_i — parameters solved by machine learning from several groups of actual speeds and pixel point speeds
y — vertical-axis pixel coordinate
n — an integer, chosen according to the actual situation
v — actual speed
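Reading formula 8 as a polynomial scale factor λ(y) = Σ s_i · y^i fitted by least squares (the polynomial form and the degree are assumptions about the garbled original), the calibration could be sketched as:

```python
import numpy as np

def fit_scale_parameters(y_coords, pixel_speeds, actual_speeds, n=2):
    """Least-squares fit of the parameters s_i in formula 8, read here as a
    polynomial scale factor lambda(y) = sum_i s_i * y**i such that
    v = lambda(y) * v_pixel. The polynomial form and degree n are assumptions."""
    # Each row of the design matrix is v_pixel * [1, y, y^2, ..., y^n].
    A = np.vander(np.asarray(y_coords), n + 1, increasing=True)
    A = A * np.asarray(pixel_speeds)[:, None]
    s, *_ = np.linalg.lstsq(A, np.asarray(actual_speeds), rcond=None)
    return s

def actual_speed(s, y, v_pixel):
    # Formula 8: scale the pixel speed by the fitted polynomial in y.
    return v_pixel * sum(si * y ** i for i, si in enumerate(s))

# Synthetic check: data generated with lambda(y) = 0.5 + 0.001*y is recovered.
ys = np.array([100.0, 400.0, 700.0, 1000.0, 1200.0])
vp = np.array([10.0, 8.0, 12.0, 6.0, 9.0])
va = vp * (0.5 + 0.001 * ys)
s = fit_scale_parameters(ys, vp, va, n=1)
print(np.round(s, 4))
```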
5. In some alternative embodiments, step S4 comprises:
s41: the core idea of the counting algorithm based on the biconvex trajectory estimation is as follows. (1) drawing two convex detection lines on the video: green line, yellow line. (2) And continuously comparing the fish position coordinates with the coordinates of the two lines. And counting when the coordinates of the monitoring points are crossed with the two convex detection lines. (3) The fish position coordinate firstly passes through a green line and then a yellow line, the fish is judged to enter a monitoring area, and the number of the fishes is added by 1; the fish position coordinate passes through a yellow line and then a green line, the fish is judged to leave the monitoring area, and the number of the fishes is reduced by 1.
6. In general, compared with the prior art, the above technical solution of the invention achieves the following beneficial effects:
(1) Using a Deepsort-based multi-target tracking algorithm, a counting algorithm based on biconvex trajectory measurement and a visual-gradient-based speed measurement algorithm, the invention realizes multi-target tracking of fish, counting the number of each fish species in the picture, calculating the number of fish entering and leaving the monitored picture, measuring the swimming speed of each fish, and related functions.
(2) The invention can collect data sets for training tailored to the ornamental fish that the user's intelligent management and control system needs to manage, achieving intelligent management and control.
(3) The invention has good application extensibility: by adding further algorithms according to user requirements, many secondary developments are possible, such as fish speed early warning, detection of fish jumping out of the tank, and fish-theft alarms.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention.
FIG. 2 is a graph of the effect of weight file detection of the present invention.
Fig. 3 is the Mosaic data enhancement of the present invention.
Fig. 4 is the adaptive picture scaling of the present invention.
Fig. 5 is a FPN-PAN structure of the present invention.
FIG. 6 is a CIOU Loss schematic diagram of the present invention.
Fig. 7 is a non-maxima suppression operation of the present invention.
FIG. 8 is a workflow of the Deepsort algorithm of the present invention.
Fig. 9 is a schematic diagram of a biconvex trajectory estimation of the invention.
FIG. 10 shows the results of the system test of the present invention.
Detailed Description
The technical scheme of the invention is further explained by combining the attached drawings.
A fish identification counting, speed measurement early warning and access quantity statistical system comprises the following steps:
s1: the method comprises the steps of collecting 5 ornamental fish videos or pictures as a data set, labeling the fish coordinates in the pictures by using a genie labeling assistant, and then generating a labeling file. The labeled data set is divided into a training set and a verification set by using a Python program script.
In this embodiment, step S1 may be implemented by:
s11: and recording the video of the ornamental fish, taking the photo of the ornamental fish and searching the video or the photo of the ornamental fish on the internet by using a camera, and taking the video and the photo as a training data set. The video is converted into pictures for training, so the following operations are carried out. And simultaneously pressing a Windows key and an R key on a keyboard, inputting a cmd, clicking to open a command line window, and inputting pip install ffmpeg to install the ffmpeg (a tool kit in a command line). After ffmpeg is installed, ffmpeg is used in the command line to convert the video into a picture data set according to a certain sampling rate, and instructions such as: mpeg-i test. Mp4-r5-f image2 \ output \ frame _%05d.jpg, where-i is followed by a video file with mp4 suffix; wherein-r is followed by a sampling rate, i.e. several pictures are split in one second of video; where-f is followed by the type of saved pictures and finally the output picture deposit folder and naming format, where% 05d indicates a sequence number of 5 digits, picture naming is continued from frame _00000.jpg, frame _00001.jpg, frame _00002.jpg \8230, 8230, and so forth.
S12: using genius marking assistant (a piece of software for making label on the data set) to mark the data set, firstly defining several kinds of ornamental fishes, and using 5 kinds of ornamental fishes in the system development: clown Fish (Clowfish), spotted-spotted globefish (Yellow Box Fish), horse-Fish (Schooling Banner Fish), imperial Emperor Angelfish (Emperor Angelfish), yellow Tang Fish (Yellow Tang Fish); then, a box is drawn with a mouse for all the ornamental fishes appearing in each picture, the ornamental fishes are framed, and an ornamental fish species, such as one of the aforementioned 5 species, is selected for each box. After all the annotations are finished, setting an export path and an export format, wherein the export format is selected as passacal-voc, and 5228 pictures generate 5228 annotation files with the suffixes of xml format.
S13: because the training uses a markup file suffixed with txt, then a markup file conversion is performed. The Python program script is used for batch conversion. Import the relevant library file using import os and import xml. Etree. Elementtree, convert the pixel coordinates in xml file to txt normalized coordinates, and the fish species are represented by 5 numbers 0,1,2,3, 4. The conversion equations are shown in equation 1, equation 2, equation 3, and equation 4.
x = (x_min + x_max) / (2 · width)  (formula 1)
y = (y_min + y_max) / (2 · height)  (formula 2)
w = (x_max − x_min) / width  (formula 3)
h = (y_max − y_min) / height  (formula 4)
where (x, y) — normalized coordinates of the center of the box
w, h — normalized width and height of the box
(x_min, y_min) — coordinates of the upper-left corner of the box
(x_max, y_max) — coordinates of the lower-right corner of the box
width, height — width and height of the entire picture
S2: putting the data set into a network structure of YOL0v5 for training, and after the training is finished, saving the trained weight file, wherein the detection effect of the weight file is shown in FIG. 2.
In this embodiment, step S2 can be implemented by:
s21: in the Input part of the network fabric. (1) Firstly, conducting Mosaic data enhancement, and the algorithm randomly splices 4 input images through zooming, distribution and cutting, as shown in fig. 3. The method enables input images to be more diversified, the trained model is more beneficial to detecting small objects, and meanwhile, originally, 4 images are processed into 1 image, so that the calculated amount is reduced. (2) Then, an adaptive picture scaling algorithm is performed, and the algorithm is used during testing and model inference, which greatly improves the inference speed of the algorithm, as shown in fig. 4. (3) And then, performing a self-adaptive anchor frame calculation algorithm, putting a program corresponding to the tested initial anchor frame value into the original code, and obtaining the optimal anchor frame value each time the data set is trained.
S22: in the BackBone part of the network structure, in the Focus structure, a picture is sliced, then Concat operation is carried out, and finally the picture becomes an easily processed characteristic diagram after convolution operation. This reduces floating point operations and speeds up computation. For example, the 796 × 796 × 3 picture is sliced, spliced and convolved to become a 398 × 398 × 12 feature map. And then, through a CSPNet structure, the operation of the structure mainly splits the characteristic diagram into two parts, one part is subjected to convolution operation, and the other part is spliced with the result of the convolution operation of the previous part. This structure can reduce the amount of calculation, but the precision improvement is small.
S23: in the hack part of the network structure, an FPN-PAN structure is used, and the structure carries out multi-dimensional feature extraction, so that the receptive field is greatly increased, as shown in FIG. 5.
S24: in the Head part of the network structure, a CIOU Loss function is used as a regression Loss function of the bounding box, the Loss function considers more comprehensive information, and the actual effect is better. In the post-processing stage of target detection, a non-maximum suppression operation is performed for the screening of a large number of target frames, as shown in fig. 7. From FIG. 6, equation 5 for calculating CIOU Loss is derived.
Figure BDA0003855731070000053
Distance _ C in the formula-the diagonal length of the minimum bounding rectangle
Distance _ 2-length of line connecting center points of two bounding boxes
IOU-cross-over ratio
v-measure aspect ratio uniformity parameter
S3: and calling the weight file to perform target detection on the video frame, and detecting a fish target frame in the picture. And processing the fish target frames by a Deepsort-based multi-target tracking algorithm and a visual gradient-based speed measurement algorithm to obtain the serial numbers and the speeds of the fish target frames.
In this embodiment, step S3 can be implemented by:
s31: the method mainly comprises the following steps of a multi-target tracking algorithm based on Deepsort: (1) Firstly, a trained fish classification weight file is used for detecting a fish target in a video frame to obtain characteristic information, wherein the characteristic information comprises position information, category information, confidence information and the like of the fish. (2) The matching target is carried out by calculating the matching degree of the two frames of image information before and after, and then the serial number is distributed to each tracked target, so that the dynamic tracking of various fishes is realized. The Deepsort algorithm adds cascade matching and new track confirmation, and solves the problem that the object in the Sort algorithm is disconnected due to shielding. The trajectories are divided into confirmed state trajectories and unconfirmed state trajectories. The newly generated track must be continuously matched with the detector for a plurality of times, and the unconfirmed track can be converted into the confirmed track; the confirmation state track is continuous and the detector is disconnected for a plurality of times, and then the confirmation state track is deleted. The workflow of the Deepsort algorithm is shown in FIG. 8.
S32: the core idea of the speed measurement algorithm based on the visual gradient is as follows. Pixel coordinate recording (1920 × 1280 resolution) is performed first: and (4) recording all the fish ID serial numbers and center XY coordinates detected and sequenced by the Deepsort algorithm into a list. When a plurality of video frames are spaced and the same ID number is tracked, the pixel point speed is calculated, and the calculation formulas are formula 6 and formula 7.
(x-x -1 ) 2 +(y-y -1 ) 2 =Distance 2 (formula 6)
Figure BDA0003855731070000061
Where x, y-position coordinates after movement
x -1 ,y -1 -position coordinates before movement
Distance-Distance of moving position
t-time of movement
Figure BDA0003855731070000062
-pixel point velocity
The speed of movement in a video frame, i.e. the number of pixels moved per unit time, is called the pixel speed. The pixel point speed still differs from the actual speed, so the physical model and lens tilt angle of the camera must be analyzed. In the same time and over the same swimming distance, a fish near the camera passes through more pixel points, while a fish far from the camera passes through fewer. Therefore the former must be scaled down and the latter scaled up by a proportionality coefficient, but this parameter is difficult to find by hand. The best solution is to measure several groups of actual speeds and pixel point speeds and solve the equation by machine learning, as shown in formula 8.
v = λ · v_pixel, λ = Σ_{i=0}^{n} s_i · y^i  (formula 8)
where λ — proportionality coefficient between the actual speed and the pixel speed
s_i — parameters solved by machine learning from several groups of actual speeds and pixel point speeds
y — vertical-axis pixel coordinate
n — an integer, chosen according to the actual situation
v — actual speed
S4: and then obtaining the number of the entering and leaving fishes through a counting algorithm based on biconvex trajectory measurement.
In this embodiment, step S4 can be implemented by:
s41: the core idea of the counting algorithm based on the biconvex trajectory estimation is as follows. (1) drawing two convex detection lines on a video: green line, yellow line, as shown in fig. 9. (2) And continuously comparing the fish position coordinates with the coordinates of the two lines. And counting when the coordinates of the monitoring points are crossed with the two convex detection lines. (3) The fish position coordinate firstly passes through a green line and then a yellow line, the fish is judged to enter a monitoring area, and the number of the fishes is added by 1; the fish position coordinate passes through a yellow line and then a green line, the fish is judged to leave the monitoring area, and the number of the fishes is reduced by 1.
After the above steps, the system detection result is output, as shown in fig. 10.
It should be noted that, according to the implementation requirement of the method, each step described in the present application can be divided into more steps, or two or more steps or partial operations of the steps can be combined into a new step, so as to achieve the purpose of the present invention. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (5)

1. A fish identification counting, speed-measurement early-warning and entry/exit quantity statistics system, characterized in that it comprises:
S1: collecting videos or pictures of 5 species of ornamental fish as a data set, labeling the fish coordinates in the pictures with the Genie labeling assistant to generate annotation files, and then dividing the labeled data set into a training set and a validation set using a Python script.
S2: feeding the data set into the YOLOv5 network structure for training, and saving the trained weight file when training finishes.
S3: calling the weight file to perform target detection on video frames and detect the fish target boxes in the picture, then processing them with a DeepSORT-based multi-target tracking algorithm and a visual-gradient-based speed measurement algorithm to obtain the serial number and speed of each fish target box.
S4: then obtaining the numbers of entering and leaving fish through a counting algorithm based on biconvex detection lines.
S5: outputting the system detection result.
2. The method according to claim 1, wherein step S1 comprises:
S11: Using a camera, record videos and take photos of ornamental fish, and also search the internet for ornamental fish videos and photos, to serve as the training data set. Since training uses pictures, the videos are converted to pictures as follows. Press the Windows and R keys simultaneously, type cmd, click to open a command-line window, and enter pip install ffmpeg to install ffmpeg (a command-line toolkit). After ffmpeg is installed, use it on the command line to convert a video into a picture data set at a chosen sampling rate, for example: ffmpeg -i test.mp4 -r 5 -f image2 .\output\frame_%05d.jpg, where -i is followed by the video file with the mp4 suffix; -r is followed by the sampling rate, i.e. how many pictures one second of video is split into; -f is followed by the type of saved pictures; and finally come the output folder and naming format, where %05d denotes a 5-digit sequence number, so pictures are named consecutively frame_00000.jpg, frame_00001.jpg, frame_00002.jpg, and so on.
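The extraction command and the %05d naming convention can be illustrated in Python (ffmpeg_cmd and frame_name are hypothetical helper names; actually running the command assumes ffmpeg is on the PATH):

```python
import shlex

def ffmpeg_cmd(video, out_dir, fps=5):
    """Build the frame-extraction command line described above as an
    argument list suitable for subprocess.run."""
    return shlex.split(f"ffmpeg -i {video} -r {fps} -f image2 {out_dir}/frame_%05d.jpg")

def frame_name(i):
    """The %05d pattern zero-pads the frame index to five digits."""
    return f"frame_{i:05d}.jpg"
```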
S12: The Genie labeling assistant (software for labeling a data set) is used to annotate the data set. The ornamental fish classes are defined first; 5 species were used in developing this system: Clownfish, Yellow Box Fish, Schooling Banner Fish, Emperor Angelfish, and Yellow Tang Fish. Then, for every ornamental fish appearing in each picture, a box is drawn with the mouse to frame the fish, and one of the aforementioned 5 species is selected for each box. After all annotations are complete, the export path and export format are set; the export format selected is pascal-voc, and the 5228 pictures generate 5228 annotation files with the xml suffix.
S13: Because training uses annotation files with the txt suffix, the annotation files are then converted in batch with a Python script. The script uses import os and import xml.etree.ElementTree to load the relevant libraries, converts the pixel coordinates in the xml files to normalized txt coordinates, and represents the 5 fish species by the numbers 0, 1, 2, 3 and 4. The conversion equations are Equation 1, Equation 2, Equation 3 and Equation 4.
x = (x_min + x_max) / (2 · width) (Equation 1)
y = (y_min + y_max) / (2 · height) (Equation 2)
w = (x_max − x_min) / width (Equation 3)
h = (y_max − y_min) / height (Equation 4)
where (x, y) - normalized coordinates of the box center
w, h - normalized box width and height
(x_min, y_min) - coordinates of the upper-left corner of the box
(x_max, y_max) - coordinates of the lower-right corner of the box
width, height - width and height of the entire picture.
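Equations 1-4 amount to the following conversion (a sketch; the function name is illustrative):

```python
def voc_to_yolo(x_min, y_min, x_max, y_max, width, height):
    """Convert pascal-voc pixel corner coordinates to YOLO-style
    normalized (x_center, y_center, w, h) per Equations 1-4."""
    x = (x_min + x_max) / 2.0 / width    # Equation 1
    y = (y_min + y_max) / 2.0 / height   # Equation 2
    w = (x_max - x_min) / width          # Equation 3
    h = (y_max - y_min) / height         # Equation 4
    return x, y, w, h
```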
3. The method according to claim 2, wherein step S2 comprises:
S21: In the Input part of the network structure: (1) Mosaic data enhancement is performed first; the algorithm randomly splices 4 input images by scaling, arrangement and cropping. This makes the input images more diverse, makes the trained model better at detecting small objects, and reduces computation because the original 4 images are processed as 1. (2) Adaptive picture scaling is then performed; this algorithm is used during testing and model inference and greatly improves inference speed. (3) Finally, adaptive anchor-box calculation is performed: the program for the tested initial anchor-box values is embedded in the original code, so the optimal anchor-box values are obtained each time the data set is trained.
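A crude NumPy sketch of the Mosaic idea in step (1) (illustrative only; YOLOv5's real implementation also remaps the label boxes and combines further augmentations, and the sizes here are arbitrary):

```python
import numpy as np

def mosaic4(imgs, out_size=640, rng=None):
    """Tile 4 images into one out_size x out_size canvas around a
    random split point, using a crude nearest-neighbour resize."""
    if rng is None:
        rng = np.random.default_rng()
    assert len(imgs) == 4
    # random split point, kept away from the borders
    cx = int(rng.integers(out_size // 4, 3 * out_size // 4))
    cy = int(rng.integers(out_size // 4, 3 * out_size // 4))
    canvas = np.zeros((out_size, out_size, 3), dtype=np.uint8)
    regions = [(0, cy, 0, cx), (0, cy, cx, out_size),
               (cy, out_size, 0, cx), (cy, out_size, cx, out_size)]
    for img, (y0, y1, x0, x1) in zip(imgs, regions):
        # nearest-neighbour resample of img to the quadrant size
        ys = np.linspace(0, img.shape[0] - 1, y1 - y0).astype(int)
        xs = np.linspace(0, img.shape[1] - 1, x1 - x0).astype(int)
        canvas[y0:y1, x0:x1] = img[ys][:, xs]
    return canvas
```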
S22: In the Backbone part of the network structure, the Focus structure slices the picture, performs a Concat operation, and finally produces an easily processed feature map after a convolution; this reduces floating-point operations and speeds up computation. For example, a 796 × 796 × 3 picture is sliced, spliced and convolved into a 398 × 398 × 12 feature map. The feature map is then processed by the CSPNet structure, which splits it into two parts: one part goes through a convolution operation and the other is spliced with the result of that convolution. This structure reduces the amount of calculation, although the precision improvement is small.
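The Focus slicing operation can be sketched with NumPy: every second pixel is taken at four phase offsets and the four sub-images are concatenated along the channel axis, halving the spatial size and quadrupling the channels:

```python
import numpy as np

def focus_slice(img):
    """Focus slicing: (H, W, C) -> (H/2, W/2, 4C) for even H and W."""
    return np.concatenate([img[0::2, 0::2], img[1::2, 0::2],
                           img[0::2, 1::2], img[1::2, 1::2]], axis=-1)
```

This reproduces the 796 × 796 × 3 to 398 × 398 × 12 example above; the subsequent convolution is applied on top of the sliced tensor.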
S23: In the Neck part of the network structure, an FPN-PAN structure is used to perform multi-scale feature extraction, which greatly increases the receptive field.
S24: In the Head part of the network structure, the CIOU Loss function is used as the bounding-box regression loss; this loss considers more comprehensive information and performs better in practice. In the post-processing stage of target detection, non-maximum suppression is applied to screen the multiple target boxes. From FIG. 6, Equation 5 for calculating the CIOU Loss is derived.
CIOU_Loss = 1 − IOU + Distance_2² / Distance_C² + α·v, with α = v / ((1 − IOU) + v) (Equation 5)
where Distance_C - diagonal length of the minimum circumscribed rectangle
Distance_2 - length of the line connecting the center points of the two bounding boxes
IOU - intersection-over-union ratio
v - aspect-ratio consistency parameter.
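A sketch of Equation 5 in Python, using the standard CIOU definitions for the distance and aspect-ratio terms; the weight α = v / ((1 − IOU) + v) follows the common CIOU formulation and is an assumption where the patent's figure is not reproduced:

```python
import math

def ciou_loss(box1, box2):
    """CIOU loss for two boxes given as (x_min, y_min, x_max, y_max)."""
    x1a, y1a, x2a, y2a = box1
    x1b, y1b, x2b, y2b = box2
    # IOU: intersection over union
    iw = max(0.0, min(x2a, x2b) - max(x1a, x1b))
    ih = max(0.0, min(y2a, y2b) - max(y1a, y1b))
    inter = iw * ih
    union = (x2a - x1a) * (y2a - y1a) + (x2b - x1b) * (y2b - y1b) - inter
    iou = inter / union
    # Distance_2 squared: distance between the two box centers
    d2 = ((x1a + x2a) / 2 - (x1b + x2b) / 2) ** 2 \
       + ((y1a + y2a) / 2 - (y1b + y2b) / 2) ** 2
    # Distance_C squared: diagonal of the minimum circumscribed rectangle
    c2 = (max(x2a, x2b) - min(x1a, x1b)) ** 2 \
       + (max(y2a, y2b) - min(y1a, y1b)) ** 2
    # v: aspect-ratio consistency term
    v = (4 / math.pi ** 2) * (math.atan((x2a - x1a) / (y2a - y1a))
                              - math.atan((x2b - x1b) / (y2b - y1b))) ** 2
    alpha = v / ((1 - iou) + v + 1e-9)  # epsilon avoids 0/0 for identical boxes
    return 1 - iou + d2 / c2 + alpha * v
```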
4. The method according to claim 3, wherein step S3 comprises:
S31: The DeepSORT-based multi-target tracking algorithm proceeds as follows. (1) The trained fish-classification weight file is first used to detect fish targets in a video frame and obtain feature information, including each fish's position, category and confidence. (2) Targets are matched by computing the matching degree between the information of two consecutive frames, and a serial number is then assigned to each tracked target, realizing dynamic tracking of multiple fish species. The DeepSORT algorithm adds cascade matching and new-track confirmation, solving the track-loss problem that occlusion causes in the SORT algorithm. Tracks are divided into confirmed and unconfirmed tracks. A newly generated track must be matched by the detector multiple consecutive times before the unconfirmed track is converted into a confirmed track; a confirmed track that fails to match the detector multiple consecutive times is deleted.
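The confirmation logic in S31 can be sketched as a small state machine (the threshold names N_INIT and MAX_AGE follow common DeepSORT implementations and are assumptions here, not values stated in the patent):

```python
N_INIT = 3    # consecutive detector matches needed to confirm a track
MAX_AGE = 30  # consecutive misses after which a confirmed track is deleted

class Track:
    """Tentative tracks are promoted after repeated matches and deleted
    quickly when unmatched; confirmed tracks survive brief occlusion."""

    def __init__(self, track_id):
        self.track_id = track_id
        self.state = "tentative"
        self.hits = 1
        self.misses = 0

    def mark_matched(self):
        self.hits += 1
        self.misses = 0
        if self.state == "tentative" and self.hits >= N_INIT:
            self.state = "confirmed"

    def mark_missed(self):
        self.misses += 1
        if self.state == "tentative" or self.misses > MAX_AGE:
            self.state = "deleted"
```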
S32: The core idea of the visual-gradient-based speed measurement algorithm is as follows. Pixel coordinates are first recorded (at 1920 × 1280 resolution): all fish ID serial numbers and center XY coordinates detected and ordered by the DeepSORT algorithm are written into a list. When several video frames have elapsed and the same ID number is still tracked, the pixel speed is calculated using Equation 6 and Equation 7.
(x − x₋₁)² + (y − y₋₁)² = Distance² (Equation 6)
v̂ = Distance / t (Equation 7)
where x, y - position coordinates after movement
x₋₁, y₋₁ - position coordinates before movement
Distance - distance moved
t - elapsed time
v̂ - pixel point speed
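Equations 6 and 7 amount to a Euclidean distance over the tracked centers divided by the elapsed time (a minimal sketch; the helper name is illustrative):

```python
import math

def pixel_speed(pt_prev, pt_curr, t):
    """Pixel point speed: distance moved between two tracked center
    coordinates (Equation 6) divided by the elapsed time t (Equation 7)."""
    distance = math.hypot(pt_curr[0] - pt_prev[0], pt_curr[1] - pt_prev[1])
    return distance / t
```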
The speed of movement in the video frame, i.e. the number of pixels traversed per unit time, is called the pixel speed. The pixel speed still differs from the actual speed; converting between them requires analyzing the physical model of the shooting camera and the tilt angle of its lens. In the same time and over the same swimming distance, a fish near the camera traverses more pixels, while a fish far from the camera traverses fewer pixels. The former therefore needs to be scaled down and the latter scaled up by a scaling factor, but such a parameter is difficult to determine manually. The best solution is to measure multiple sets of actual speeds and pixel speeds and solve for the parameters by machine learning, as shown in Equation 8.
v = λ·v̂, with λ = s_0 + s_1·y + … + s_n·y^n (Equation 8)
where λ - proportionality coefficient between the actual speed and the pixel speed
s_i - parameters to be solved by machine learning from the multiple sets of actual speeds and pixel speeds
y - vertical-axis pixel coordinate
n - an integer chosen according to actual conditions
v - actual speed
v̂ - pixel speed.
5. The method according to claim 4, wherein step S4 comprises:
S41: The core idea of the counting algorithm based on biconvex detection lines is as follows. (1) Two convex detection lines are drawn on the video: a green line and a yellow line. (2) The fish position coordinates are continuously compared with the coordinates of the two lines, and counting is triggered when a monitored coordinate crosses either detection line. (3) If the fish position coordinate crosses the green line first and then the yellow line, the fish is judged to have entered the monitored area and the fish count is increased by 1; if it crosses the yellow line first and then the green line, the fish is judged to have left the monitored area and the fish count is decreased by 1.
CN202211148365.XA 2022-09-21 2022-09-21 Fish identification counting, speed measurement early warning and access quantity statistical system Pending CN115496780A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211148365.XA CN115496780A (en) 2022-09-21 2022-09-21 Fish identification counting, speed measurement early warning and access quantity statistical system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211148365.XA CN115496780A (en) 2022-09-21 2022-09-21 Fish identification counting, speed measurement early warning and access quantity statistical system

Publications (1)

Publication Number Publication Date
CN115496780A true CN115496780A (en) 2022-12-20

Family

ID=84471375

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211148365.XA Pending CN115496780A (en) 2022-09-21 2022-09-21 Fish identification counting, speed measurement early warning and access quantity statistical system

Country Status (1)

Country Link
CN (1) CN115496780A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116363494A (en) * 2023-05-31 2023-06-30 睿克环境科技(中国)有限公司 Fish quantity monitoring and migration tracking method and system


Similar Documents

Publication Publication Date Title
Jia et al. Detection and segmentation of overlapped fruits based on optimized mask R-CNN application in apple harvesting robot
Wu et al. Using channel pruning-based YOLO v4 deep learning algorithm for the real-time and accurate detection of apple flowers in natural environments
Gai et al. A detection algorithm for cherry fruits based on the improved YOLO-v4 model
CN106127204B (en) A kind of multi-direction meter reading Region detection algorithms of full convolutional neural networks
CN109948425B (en) Pedestrian searching method and device for structure-aware self-attention and online instance aggregation matching
CN106096577B (en) A kind of target tracking method in camera distribution map
Shao et al. Deeply learned attributes for crowded scene understanding
CN109815364A (en) A kind of massive video feature extraction, storage and search method and system
CN109033998A (en) Remote sensing image atural object mask method based on attention mechanism convolutional neural networks
CN103324677B (en) Hierarchical fast image global positioning system (GPS) position estimation method
CN112115906A (en) Open dish identification method based on deep learning target detection and metric learning
Liu et al. Super-pixel cloud detection using hierarchical fusion CNN
CN106537390A (en) Identifying presentation styles of educational videos
Rong et al. Pest identification and counting of yellow plate in field based on improved mask r-cnn
JP6787831B2 (en) Target detection device, detection model generation device, program and method that can be learned by search results
CN110827312A (en) Learning method based on cooperative visual attention neural network
CN108229432A (en) Face calibration method and device
TW202240471A (en) Methods, apparatuses, devices, and storage media for detecting target
CN115496780A (en) Fish identification counting, speed measurement early warning and access quantity statistical system
CN113362279A (en) Intelligent concentration detection method of immunochromatographic test paper
Yu et al. TasselLFANet: a novel lightweight multi-branch feature aggregation neural network for high-throughput image-based maize tassels detection and counting
CN115330833A (en) Fruit yield estimation method with improved multi-target tracking
CN112925470B (en) Touch control method and system of interactive electronic whiteboard and readable medium
Shuai et al. An improved YOLOv5-based method for multi-species tea shoot detection and picking point location in complex backgrounds
KR101313285B1 (en) Method and Device for Authoring Information File of Hyper Video and Computer-readable Recording Medium for the same

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination