CN116912670A - Deep sea fish identification method based on improved YOLO model - Google Patents

Deep sea fish identification method based on improved YOLO model

Info

Publication number
CN116912670A
Authority
CN
China
Prior art keywords
model
training
module
deep sea
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211441477.4A
Other languages
Chinese (zh)
Inventor
刘长红
温嘉文
吴博淳
刘金辉
李天注
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou University
Original Assignee
Guangzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou University filed Critical Guangzhou University
Priority to CN202211441477.4A priority Critical patent/CN116912670A/en
Publication of CN116912670A publication Critical patent/CN116912670A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/05 Underwater scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20212 Image combination
    • G06T 2207/20224 Image subtraction
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 40/00 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A 40/80 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in fisheries management
    • Y02A 40/81 Aquaculture, e.g. of fish

Abstract

The invention relates to the field of deep learning target detection and discloses a deep sea fish identification method based on an improved YOLO model. The method comprises the steps of: using a generative adversarial network (GAN) for sample expansion to obtain expanded samples; training an improved YOLO-v7 model composed of an HN recursive gated convolution module, a CBS convolution module, a downsampling MP module, an SPPCSPC module for enlarging the receptive field, a RepConv module and an ASFF pyramid feature fusion module; dividing the expanded samples into test pictures and training pictures, splitting the training pictures at a 7:3 ratio into a training set and a validation set, labeling the class name of each deep sea fish, and performing data enhancement with Cutout and random affine transformation; inputting the enhanced data into the improved YOLO-v7 model for training, judging the performance of the model according to the loss function, and updating the model training parameters over 100 training iterations in total; and, after training, obtaining the model file with the optimal weights and using the model to detect deep sea fish pictures and videos.

Description

Deep sea fish identification method based on improved YOLO model
Technical Field
The invention relates to the field of deep learning target detection, in particular to a deep sea fish identification method based on an improved YOLO model.
Background
The invention described in reference [1] is an underwater video fish identification method based on a neural network. It trains a neural network model comprising an input layer, a first convolution layer, a second convolution layer, a third convolution layer, a max pooling layer, a fully connected layer and an output layer connected in sequence. The first convolution layer applies a separate convolution to each channel of the input layer to extract different features before fusing the feature maps; the second convolution layer uses multiple convolutions to extract different receptive-field scales for targets of different sizes, then performs feature map fusion and batch normalization. Each channel of the color image in the underwater video data, together with its grayscale image, serves as the model input; the model outputs a number of target bounding boxes with confidence scores, and targets are screened by confidence. The method meets the requirements of real-time video fish identification while reducing the image quality demanded of the camera.
The invention described in reference [2] belongs to the technical field of image target detection and discloses a fabric flaw detection method based on an improved YOLOv4 algorithm. It introduces the lightweight Coordinate Attention (CA) module into the backbone network, capturing cross-channel information as well as direction-aware and position-aware information so that the network focuses on targets of interest; deformable convolution (Deformable Convolutional Network, DCN) is added to strengthen the network's adaptability to flaws of variable shape and improve detection accuracy. For the feature fusion part, adaptively spatial feature fusion (ASFF) is applied on top of the original path aggregation network, so that the features extracted at each level are fused with different weights before prediction; in addition, the cross stage partial network structure (CSP) replaces part of the convolutions in the feature fusion part, greatly improving fabric flaw detection accuracy while maintaining speed.
The invention described in reference [3] discloses a fish identification method and device based on a convolutional neural network, comprising: (1) collecting original fish images and performing saliency analysis to locate and segment the fish targets, obtaining a foreground image that is linearly fused with the original image to produce a high-contrast fish image used as a training sample, thereby constructing the training set; (2) pre-training ResNet on ImageNet and taking the ResNet with fixed parameters as a feature extraction unit, whose output is connected in sequence to an average pooling layer and a Softmax classifier to form the fish identification network; (3) optimizing the network parameters of the fish identification network with the training set to obtain the fish identification model; (4) identifying the fish image to be recognized with the model and outputting the result. This method and device can identify fish accurately.
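For illustration only, the pipeline in steps (2)-(3) could be sketched in PyTorch as follows; the ResNet variant, the frozen layers and the classifier head are assumptions, not details taken from reference [3] (torchvision >= 0.13 is assumed for the weights enum):

import torch.nn as nn
from torchvision import models

def build_fish_classifier(num_classes: int) -> nn.Sequential:
    # ImageNet-pretrained ResNet-50 as a fixed feature extraction unit.
    resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
    features = nn.Sequential(*list(resnet.children())[:-2])  # drop avgpool + fc
    for p in features.parameters():
        p.requires_grad = False          # keep the pretrained parameters fixed
    # Average pooling layer followed by a Softmax classifier, as in step (2).
    return nn.Sequential(
        features,
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(2048, num_classes), nn.Softmax(dim=1))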
Reference [4] addresses the problem that fish in natural environments vary widely in shape and are easily affected by changing light and backgrounds, which lowers the recognition accuracy and classification performance of conventional fish recognition algorithms based on color texture or feature point extraction.
[1] Underwater video fish identification method based on neural network
https://d.wanfangdata.com.cn/patent/ChJQYXRlbnROZXdTMjAyMjAzMjMSEENOMjAyMDExMzE5MzYxLjQaCGk4eGo1M3Rx
[2] Fabric flaw detection method based on YOLO v4 improved algorithm
https://d.wanfangdata.com.cn/patent/ChJQYXRlbnROZXdTMjAyMjAzMjMSEENOMjAyMTEwNTA1MzI2LlgaCHFqeWY1NXE2
[3] Fish identification method and device based on convolutional neural network
https://d.wanfangdata.com.cn/patent/ChJQYXRlbnROZXdTMjAyMjAzMjMSEENOMjAxOTEwOTEyMjg3LjgaCDhpcnZlN3Zp
[4] Fish identification algorithm based on improved AlexNet
https://d.wanfangdata.com.cn/periodical/ChlQZXJpb2RpY2FsQ0hJTmV3UzIwMjIxMDEzEg1kemtqMjAyMTA0MDAzGgh3NndnODJ1Zw%3D%3D
Object detection is one of the most important topics in computer vision. Most computer vision problems involve detecting visual object categories such as pedestrians, cars, buses and faces. The field is not limited to academia; it is also applied in video surveillance, healthcare, vehicle sensing and autonomous driving. Since AlexNet in 2012, target detection algorithms have developed rapidly in the field of deep learning, mainly along two lines: One-Stage and Two-Stage. One-Stage methods extract features directly through a convolutional neural network and predict the classification and localization of targets. Two-Stage methods first perform region generation, i.e. produce candidate regions (Region Proposals), and then predict the classification and localization of targets through a convolutional neural network.
the YOLO model is a target recognition model based on One-stage thought, and has good target detection performance. However, in the deep sea fish identification problem, the YOLOv7 trunk feature extraction network is a CNN network, the CNN has translational invariance and locality, the capability of global modeling long-distance modeling is lacking, the original YOLO model feature fusion network is PANet, and although the original YOLO model feature fusion network can better fuse features of targets with different scales compared with the FPN, so that the effect is improved, but there is room for improvement, and a more advanced feature fusion network exists, meanwhile, the existing data set of the deep sea fish is less, and under the condition of small sample input, how to improve the whole network precision is still to be studied.
The prior art improves on traditional neural networks, but research in recent years has continuously optimized neural network models and proposed further improved network structures; the latest target detection neural network models are both more accurate and faster. Meanwhile, with few learning samples, the prior art lacks an effective data enhancement method to expand the samples and achieve a better model. A deep sea fish identification method based on an improved YOLO model is therefore proposed.
Disclosure of Invention
(I) Technical problems to be solved
Aiming at the shortcomings of the prior art, the invention provides a deep sea fish identification method based on an improved YOLO model, which solves the problems described above.
(II) technical scheme
In order to achieve the above purpose, the present invention provides the following technical solutions: the deep sea fish identification method based on the improved YOLO model comprises the following steps:
the first step: collecting deep sea fish pictures;
the second step: using a generative adversarial network (GAN) for sample expansion to obtain expanded samples;
the third step: training an improved YOLO-v7 model composed of an HN recursive gated convolution module, a CBS convolution module, a downsampling MP module, an SPPCSPC module for enlarging the receptive field, a RepConv module and an ASFF pyramid feature fusion module;
the fourth step: dividing the expanded samples into test pictures and training pictures, splitting the training pictures at a 7:3 ratio into a training set and a validation set, labeling the class name of each deep sea fish, and performing data enhancement with Cutout and random affine transformation;
the fifth step: inputting the enhanced data into the improved YOLO-v7 model for training, judging the performance of the model according to the loss function, and updating the model training parameters over 100 training iterations in total;
the sixth step: after training, obtaining the model file with the optimal weights and using the model to detect deep sea fish pictures and videos.
Preferably, the generative adversarial network used in the second step consists of a discrimination model and a generation model. The input deep sea fish image samples are divided at a 6:4 ratio into a training set and a test set; the classified pictures are input into the generative adversarial network for training to obtain a generation model and a discrimination model; the pictures in the training set are then input into the trained generation model to obtain a number of simulated samples; finally, the generated simulated samples are combined with the original samples as a new data set.
Preferably, the HN recursive gated convolution module and the ASFF pyramid feature fusion module are improvements on the original network: the HN recursive gated convolution module is added to the backbone network and the detection head network, and the original feature fusion network of the YOLO-v7 network is replaced with the ASFF adaptive feature fusion network. The core of the HN recursive gated convolution module is recursive gated convolution, which performs high-order spatial interaction using gated convolutions and a recursive design.
Preferably, the principle of the high-order spatial interaction is as follows:
S1: first, a pair of projection features p_0 and q_0 is obtained from the input x using a linear projection function \phi_{in}:
[p_0, q_0] = \phi_{in}(x)
S2: the gated convolution is then performed recursively:
p_{k+1} = f_k(q_k) \odot g_k(p_k) / \alpha, \quad k = 0, 1, \ldots, n-1
where the output is scaled by 1/\alpha to stabilize training, \{f_k\} is a set of depth-wise convolution layers, and \{g_k\} is used to match the dimensions in different orders;
the output p_n of the last recursive step is fed to the projection layer \phi_{out} to obtain the final recursive gated convolution result. To reduce the computational overhead caused by high-order interactions, the channel dimension in each order is set to
C_k = C / 2^{n-k-1}, \quad 0 \le k \le n-1
S3: the original feature fusion network of the YOLO-v7 network is replaced with the ASFF adaptive feature fusion network: for the features of a given level, the features of the other levels are first resized to the same resolution and simply integrated, and training then finds the optimal fusion; at each spatial position, features of different levels are fused adaptively;
S4: the spatial weight of each scale's features in the fusion is adjusted adaptively through learning; the feature fusion formula is as follows:
y_{ij}^{l} = \alpha_{ij}^{l} \cdot x_{ij}^{1\to l} + \beta_{ij}^{l} \cdot x_{ij}^{2\to l} + \gamma_{ij}^{l} \cdot x_{ij}^{3\to l}
S5: the value of \alpha_{ij}^{l} is obtained with the softmax algorithm (likewise \beta_{ij}^{l} and \gamma_{ij}^{l}, so that the three weights sum to 1):
\alpha_{ij}^{l} = e^{\lambda_{\alpha,ij}^{l}} / (e^{\lambda_{\alpha,ij}^{l}} + e^{\lambda_{\beta,ij}^{l}} + e^{\lambda_{\gamma,ij}^{l}})
S6: the ASFF pyramid feature fusion module is added to the YOLO-v7 model.
Preferably, the Cutout data enhancement randomly masks out a region of the picture during training, and the random affine transformation comprises random rotation, translation, scaling and shear operations on the image.
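By way of a hedged sketch (the patent does not give an implementation), the two augmentations could look as follows in Python with PyTorch/torchvision; the mask size and the affine parameter ranges are assumed values:

import random
import torch
from torchvision import transforms

def cutout(img: torch.Tensor, size: int = 32) -> torch.Tensor:
    # Randomly mask out one square region of a CHW image tensor.
    _, h, w = img.shape
    cy, cx = random.randrange(h), random.randrange(w)
    y1, y2 = max(0, cy - size // 2), min(h, cy + size // 2)
    x1, x2 = max(0, cx - size // 2), min(w, cx + size // 2)
    out = img.clone()
    out[:, y1:y2, x1:x2] = 0.0
    return out

# Random rotation, translation, scaling and shear in a single affine transform,
# followed by Cutout; all parameter ranges here are illustrative assumptions.
augment = transforms.Compose([
    transforms.RandomAffine(degrees=10, translate=(0.1, 0.1),
                            scale=(0.9, 1.1), shear=5),
    transforms.Lambda(cutout),
])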
Preferably, the loss function of YOLO-v7 mainly consists of the classification loss, the obj confidence loss and the localization loss, calculated as follows:
Loss = \lambda_1 L_{cls} + \lambda_2 L_{obj} + \lambda_3 L_{loc}
The localization loss function is CIOU, calculated as:
L_{CIOU} = 1 - IoU + \rho^2(b, b^{gt}) / c^2 + \alpha v
where \rho(b, b^{gt}) is the Euclidean distance between the center points of the predicted box and the ground-truth box, and c is the diagonal length of the smallest enclosing region that contains both boxes; \alpha and v are calculated as:
v = (4/\pi^2) (\arctan(w^{gt}/h^{gt}) - \arctan(w/h))^2, \quad \alpha = v / ((1 - IoU) + v)
The final Loss value is:
Loss = \lambda_1 L_{cls} + \lambda_2 L_{obj} + \lambda_3 L_{CIOU}
(III) Beneficial effects
Compared with the prior art, the deep sea fish identification method based on the improved YOLO model provides the following beneficial effects:
1. Through expansion of the deep sea fish data samples and improvement of the YOLO-v7 network, the method achieves high accuracy and high speed even when the input sample data set is small. Facing a small data set, a generative adversarial network (GAN) is used to generate more sample data. The original YOLO-v7 model is improved by introducing the HN recursive gated convolution module and the ASFF adaptive feature fusion network; experiments verify that both improvements have a positive effect on model performance.
2. The method provides accurate and efficient target detection; no equivalent scheme has yet been found on the market, so it fills the technical gap of efficient target detection for small-sample deep sea fish and has good application prospects.
Drawings
FIG. 1 is a schematic overall flow diagram;
FIG. 2 is a schematic diagram of the generative adversarial network process;
FIG. 3 is a schematic flow diagram of an expanded portion of a sample dataset;
FIG. 4 is a schematic diagram of an ASFF pyramid feature fusion network;
FIG. 5 is a schematic diagram of the model test effect.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1-5, the deep sea fish identification method based on the improved YOLO model comprises the following steps:
the first step: the invention relates to expansion of a sample data set, which expands a data sample by utilizing a generated antagonistic neural network (GAN), wherein the generated antagonistic neural network (GAN) is divided into a judging model and a generating model, the generating model is used for capturing the distribution of a real sample and generating a new simulation sample according to the distribution, and the judging model is a classifier for judging whether the input is the real sample or the simulation sample. The generation model and the countermeasure model enable the discrimination model to correctly discriminate the source of the training sample through continuous countermeasure training, and simultaneously enable the simulation sample generated by the generation model to be more similar to the real sample, thereby achieving the purpose of expanding sample data, and the input deep sea fish image sample is processed according to the following steps: 4, dividing the ratio into a training set and a testing set, inputting the classified pictures into a generated countermeasure network for training to obtain a generated model and a judging model, inputting the pictures in the training set into the generated model obtained after the previous training to obtain a plurality of simulation samples, and finally combining the generated simulation samples with the original samples to serve as a new data set for the next training of the improved YOLO-v7 model;
and a second step of: the improved YOLO v7 model is trained, and the improved YOLO-v7 model is composed of an HN recursion gating convolution module, a CBS convolution module, a downsampling MP module, a receptive field increasing module SPPCSPC module, a RepConv module and an ASFF pyramid feature fusion module. The HN recursion gating convolution module and the ASFF pyramid feature fusion module are improved modules based on the original network. And adding an HN recursion gating convolution module into an original backbone network (backbone) and a detection head network (head), and improving an original feature fusion network (PANet) of the YOLO-v7 network into an ASFF self-adaptive feature fusion network. The main principle of the HN recursive gated convolution module is recursive gate convolution (g n Conv). The characteristics of input self-adaption, long-range and high-order space interaction of the transducer neural network model can also be effectively realized through a convolution-based framework. Recursive gate convolution (gn Conv), high order spatial interactions are performed with the gate convolution and the recursive design. This approach has a high degree of flexibility and customizable, is compatible with various variants of convolution, and extends the second order interactions in self-attention to arbitrary orders without introducing significant additional computation. The principle for higher order spatial interactions is as follows:
first, a set of projection features p0 and p0 are obtained using a linear projection function
The convolution is then performed in the following recursive manner:
wherein the output is scaled by 1/α to stabilize training, { f k The } is a set of deep convolutional layers, { g k -for matching dimensions in different orders;
the last recursive step q n Is fed to the projection layer phi out And obtaining a final recursive gate convolution result. In order to reduce the computational overhead caused by higher-order interactions, the channel dimensions in each order are set to:
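A minimal PyTorch sketch of such a recursive gated convolution block follows; it mirrors the public HorNet design rather than any code in the patent, and the default order n = 3, the 7x7 depth-wise kernel and the scaling value alpha are assumptions:

import torch
import torch.nn as nn

class GnConv(nn.Module):
    def __init__(self, dim: int, order: int = 3, alpha: float = 1.0):
        super().__init__()
        self.order = order
        self.alpha = alpha
        # C_k = C / 2^(n-k-1): smallest channels first, doubling at each order.
        self.dims = [dim // 2 ** (order - k - 1) for k in range(order)]
        self.proj_in = nn.Conv2d(dim, 2 * dim, 1)                 # phi_in
        self.dwconv = nn.Conv2d(sum(self.dims), sum(self.dims), 7,
                                padding=3, groups=sum(self.dims))  # {f_k}
        self.g = nn.ModuleList(                                    # {g_k}, k >= 1
            nn.Conv2d(self.dims[k], self.dims[k + 1], 1)
            for k in range(order - 1))
        self.proj_out = nn.Conv2d(dim, dim, 1)                     # phi_out

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        p0, q = torch.split(self.proj_in(x), (self.dims[0], sum(self.dims)), 1)
        qs = torch.split(self.dwconv(q) / self.alpha, self.dims, 1)
        p = p0 * qs[0]                       # first-order gated interaction
        for k in range(self.order - 1):      # higher-order recursive steps
            p = self.g[k](p) * qs[k + 1]
        return self.proj_out(p)

# Usage: GnConv(64)(torch.randn(1, 64, 32, 32)) returns a (1, 64, 32, 32)
# tensor; dim must be divisible by 2**(order - 1).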
the original feature fusion network (PANet) of the YOLO-v7 network is improved to an ASFF adaptive feature fusion network. ASFF enables the network to learn directly how to spatially filter features at other levels, leaving only useful information to combine. For a certain level of features, the other levels of features are first tuned to the same resolution and simply integrated, and then trained to find the best fusion approach. At each spatial location, features of different levels are adaptively fused together, for example: if a certain position carries contradictory information, the features will be filtered out, and if the features of a certain position have more distinguishing clues, the features will be enhanced;
the spatial weights of the scale features at the time of fusion are adaptively adjusted through learning.
The feature fusion formula is as follows:
y_{ij}^{l} = \alpha_{ij}^{l} \cdot x_{ij}^{1\to l} + \beta_{ij}^{l} \cdot x_{ij}^{2\to l} + \gamma_{ij}^{l} \cdot x_{ij}^{3\to l}
The value of \alpha_{ij}^{l} is obtained with the softmax algorithm (likewise \beta_{ij}^{l} and \gamma_{ij}^{l}, the three weights summing to 1):
\alpha_{ij}^{l} = e^{\lambda_{\alpha,ij}^{l}} / (e^{\lambda_{\alpha,ij}^{l}} + e^{\lambda_{\beta,ij}^{l}} + e^{\lambda_{\gamma,ij}^{l}})
Adding the ASFF pyramid feature fusion module to the detection head of YOLO-v7 completes the improvement of the whole YOLO-v7 model;
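The following is a hedged sketch of one such adaptive fusion step for three levels already resized to a common resolution; the single 1x1 weight predictor operating on the concatenated maps is an assumption (the original ASFF first compresses each level separately before predicting weights):

import torch
import torch.nn as nn
import torch.nn.functional as F

class ASFF(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Predict one weight map (lambda_alpha/beta/gamma) per input level.
        self.weight = nn.Conv2d(3 * channels, 3, kernel_size=1)

    def forward(self, x1, x2, x3):
        # x1, x2, x3: features of three levels, resized to the same H x W x C.
        w = F.softmax(self.weight(torch.cat([x1, x2, x3], dim=1)), dim=1)
        a, b, g = w[:, 0:1], w[:, 1:2], w[:, 2:3]   # alpha + beta + gamma = 1
        return a * x1 + b * x2 + g * x3

# Usage sketch: resize the level-2/3 maps to the level-1 resolution first,
# e.g. with F.interpolate(x2, size=x1.shape[-2:], mode="nearest").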
the third step: training the model. The expanded samples obtained in the first step are divided into test pictures and training pictures. The test pictures are reserved for testing the final effect of the model; the training pictures are split at a 7:3 ratio into a training set and a validation set, and the class name of each deep sea fish is labeled with the LabelImg annotation software. The GT boxes are clustered according to the labels; each GT box label is the tuple (c, x1, y1, x2, y2), where c is the category of the object contained in the GT box, x1 and y1 are the x and y coordinates of the top-left vertex, and x2 and y2 are the x and y coordinates of the bottom-right vertex. The data enhancement part uses Cutout and random affine transformation: Cutout randomly masks out a region of the picture during training, which improves the robustness of the model, and the random affine transformation comprises random rotation, translation, scaling and shear operations on the image. The enhanced data are input into the improved YOLO-v7 model for training, the performance of the model is judged according to the loss function, and the model training parameters are updated over 100 training iterations. The loss function of YOLO-v7 mainly comprises three parts, the classification loss (Class loss), the obj confidence loss (Object loss) and the localization loss (Location loss), calculated as follows:
Loss = \lambda_1 L_{cls} + \lambda_2 L_{obj} + \lambda_3 L_{loc}
The localization loss function is CIOU (Complete IoU), calculated as:
L_{CIOU} = 1 - IoU + \rho^2(b, b^{gt}) / c^2 + \alpha v
where \rho(b, b^{gt}) is the Euclidean distance between the center points of the predicted box and the ground-truth box, and c is the diagonal length of the smallest enclosing region that contains both boxes; \alpha and v are calculated as:
v = (4/\pi^2) (\arctan(w^{gt}/h^{gt}) - \arctan(w/h))^2, \quad \alpha = v / ((1 - IoU) + v)
The final Loss value is:
Loss = \lambda_1 L_{cls} + \lambda_2 L_{obj} + \lambda_3 L_{CIOU}
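As a direct, illustrative transcription of the formulas above (not the authors' code), the CIOU loss can be computed for batches of boxes in (x1, y1, x2, y2) format:

import math
import torch

def ciou_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7):
    # Intersection over union of predicted and ground-truth boxes.
    ix1, iy1 = torch.max(pred[:, 0], target[:, 0]), torch.max(pred[:, 1], target[:, 1])
    ix2, iy2 = torch.min(pred[:, 2], target[:, 2]), torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(0) * (iy2 - iy1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    # rho^2: squared distance between the two box centers.
    rho2 = ((pred[:, 0] + pred[:, 2] - target[:, 0] - target[:, 2]) ** 2 +
            (pred[:, 1] + pred[:, 3] - target[:, 1] - target[:, 3]) ** 2) / 4
    # c^2: squared diagonal of the smallest enclosing box.
    cx1, cy1 = torch.min(pred[:, 0], target[:, 0]), torch.min(pred[:, 1], target[:, 1])
    cx2, cy2 = torch.max(pred[:, 2], target[:, 2]), torch.max(pred[:, 3], target[:, 3])
    c2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2 + eps
    # Aspect-ratio consistency term v and trade-off coefficient alpha
    # (alpha is commonly detached from the gradient).
    v = (4 / math.pi ** 2) * (
        torch.atan((target[:, 2] - target[:, 0]) / (target[:, 3] - target[:, 1] + eps)) -
        torch.atan((pred[:, 2] - pred[:, 0]) / (pred[:, 3] - pred[:, 1] + eps))) ** 2
    alpha = (v / (1 - iou + v + eps)).detach()
    return 1 - iou + rho2 / c2 + alpha * v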
and obtaining a model file with the optimal model after training, and detecting the pictures and videos of the deep sea fishes by using the model.
The final part is testing the model: the test pictures set aside in the third step are used to test the accuracy of the model. Several deep sea fish sample images are input, and the final detection effect is shown in FIG. 5.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (6)

1. The deep sea fish identification method based on the improved YOLO model is characterized by comprising the following steps of:
the first step: collecting deep sea fish pictures;
and a second step of: sample expansion is carried out on the production countermeasure neural network, and an expansion sample is obtained;
and a third step of: training an improved YOLO v7 model, wherein the improved YOLO-v7 model is composed of an HN recursion gating convolution module, a CBS convolution module, a downsampling MP module, a receptive field increasing module SPPCSPC module, a RepConv module and an ASFF pyramid feature fusion module;
fourth step: dividing the extended sample into an input test picture and an input training picture, and dividing the input training picture into 7:3, dividing the ratio into a test set and a verification set, marking class names of each deep sea fish, and carrying out data enhancement by adopting Cutout data enhancement and Random affine transformation;
fifth step: the improved YOLO-V7 model is input into the data after data enhancement for training, the performance of the model is judged according to the loss function, model training parameters are updated, and 100 training iterations are performed in total;
sixth step: and obtaining a model file with the optimal model after training, and detecting the pictures and videos of the deep sea fishes by using the model.
2. The method for identifying deep sea fish based on the improved YOLO model according to claim 1, wherein: the generative adversarial network in the second step consists of a discrimination model and a generation model; the input deep sea fish image samples are divided at a 6:4 ratio into a training set and a test set, the classified pictures are input into the generative adversarial network for training to obtain a generation model and a discrimination model, the pictures in the training set are input into the trained generation model to obtain a number of simulated samples, and finally the generated simulated samples are combined with the original samples as a new data set.
3. The method for identifying deep sea fish based on the improved YOLO model according to claim 1, wherein: the HN recursive gated convolution module and the ASFF pyramid feature fusion module are improvements on the original network; the HN recursive gated convolution module is added to the backbone network and the detection head network, the original feature fusion network of the YOLO-v7 network is replaced with the ASFF adaptive feature fusion network, the core of the HN recursive gated convolution module is recursive gated convolution, and high-order spatial interaction is performed with gated convolutions and a recursive design.
4. The method for identifying deep sea fish based on the improved YOLO model according to claim 1, wherein the principle of the high-order spatial interaction is as follows:
S1: first, a pair of projection features p_0 and q_0 is obtained from the input x using a linear projection function \phi_{in}:
[p_0, q_0] = \phi_{in}(x)
S2: the gated convolution is performed recursively:
p_{k+1} = f_k(q_k) \odot g_k(p_k) / \alpha, \quad k = 0, 1, \ldots, n-1
where the output is scaled by 1/\alpha to stabilize training, \{f_k\} is a set of depth-wise convolution layers, and \{g_k\} is used to match the dimensions in different orders;
the output p_n of the last recursive step is fed to the projection layer \phi_{out} to obtain the final recursive gated convolution result; to reduce the computational overhead caused by high-order interactions, the channel dimension in each order is set to
C_k = C / 2^{n-k-1}, \quad 0 \le k \le n-1
S3: the original feature fusion network of the YOLO-v7 network is replaced with the ASFF adaptive feature fusion network; for the features of a given level, the features of the other levels are resized to the same resolution and simply integrated, and training then finds the optimal fusion; at each spatial position, features of different levels are fused adaptively;
S4: the spatial weight of each scale's features in the fusion is adjusted adaptively through learning; the feature fusion formula is as follows:
y_{ij}^{l} = \alpha_{ij}^{l} \cdot x_{ij}^{1\to l} + \beta_{ij}^{l} \cdot x_{ij}^{2\to l} + \gamma_{ij}^{l} \cdot x_{ij}^{3\to l}
S5: the value of \alpha_{ij}^{l} is obtained with the softmax algorithm:
\alpha_{ij}^{l} = e^{\lambda_{\alpha,ij}^{l}} / (e^{\lambda_{\alpha,ij}^{l}} + e^{\lambda_{\beta,ij}^{l}} + e^{\lambda_{\gamma,ij}^{l}})
S6: the ASFF pyramid feature fusion module is added to the YOLO-v7 model.
5. The method for identifying deep sea fish based on the improved YOLO model according to claim 1, wherein: the Cutout data enhancement randomly masks out a region of the picture during training, and the random affine transformation includes random rotation, translation, scaling and shear operations on the image.
6. The method for identifying deep sea fish based on the improved YOLO model according to claim 1, wherein: the loss function of YOLO-v7 consists of the classification loss, the obj confidence loss and the localization loss, calculated as follows:
Loss = \lambda_1 L_{cls} + \lambda_2 L_{obj} + \lambda_3 L_{loc}
The localization loss function is CIOU, calculated as:
L_{CIOU} = 1 - IoU + \rho^2(b, b^{gt}) / c^2 + \alpha v
where \rho(b, b^{gt}) is the Euclidean distance between the center points of the predicted box and the ground-truth box, and c is the diagonal length of the smallest enclosing region that contains both boxes; \alpha and v are calculated as:
v = (4/\pi^2) (\arctan(w^{gt}/h^{gt}) - \arctan(w/h))^2, \quad \alpha = v / ((1 - IoU) + v)
The final Loss value is:
Loss = \lambda_1 L_{cls} + \lambda_2 L_{obj} + \lambda_3 L_{CIOU}
CN202211441477.4A 2022-11-17 2022-11-17 Deep sea fish identification method based on improved YOLO model Pending CN116912670A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211441477.4A CN116912670A (en) 2022-11-17 2022-11-17 Deep sea fish identification method based on improved YOLO model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211441477.4A CN116912670A (en) 2022-11-17 2022-11-17 Deep sea fish identification method based on improved YOLO model

Publications (1)

Publication Number Publication Date
CN116912670A true CN116912670A (en) 2023-10-20

Family

ID=88353633

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211441477.4A Pending CN116912670A (en) 2022-11-17 2022-11-17 Deep sea fish identification method based on improved YOLO model

Country Status (1)

Country Link
CN (1) CN116912670A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117541586A (en) * 2024-01-10 2024-02-09 长春理工大学 Thyroid nodule detection method based on deformable YOLO


Similar Documents

Publication Publication Date Title
CN111259786B (en) Pedestrian re-identification method based on synchronous enhancement of appearance and motion information of video
Li et al. Traffic light recognition for complex scene with fusion detections
CN108416266B (en) Method for rapidly identifying video behaviors by extracting moving object through optical flow
CN110929593B (en) Real-time significance pedestrian detection method based on detail discrimination
Geng et al. Using deep learning in infrared images to enable human gesture recognition for autonomous vehicles
CN111768388A (en) Product surface defect detection method and system based on positive sample reference
CN113920107A (en) Insulator damage detection method based on improved yolov5 algorithm
CN114638784A (en) Method and device for detecting surface defects of copper pipe based on FE-YOLO
CN111275010A (en) Pedestrian re-identification method based on computer vision
CN109902576B (en) Training method and application of head and shoulder image classifier
CN113408584A (en) RGB-D multi-modal feature fusion 3D target detection method
CN112329771A (en) Building material sample identification method based on deep learning
Avola et al. Real-time deep learning method for automated detection and localization of structural defects in manufactured products
CN114283326A (en) Underwater target re-identification method combining local perception and high-order feature reconstruction
Li et al. Ferrite beads surface defect detection based on spatial attention under weakly supervised learning
Yang et al. An improved algorithm for the detection of fastening targets based on machine vision
CN116912670A (en) Deep sea fish identification method based on improved YOLO model
Chen et al. Multi-scale attention networks for pavement defect detection
Xiang et al. Crowd density estimation method using deep learning for passenger flow detection system in exhibition center
CN110910497B (en) Method and system for realizing augmented reality map
CN110889418A (en) Gas contour identification method
CN111582057A (en) Face verification method based on local receptive field
CN116596966A (en) Segmentation and tracking method based on attention and feature fusion
CN112199984B (en) Target rapid detection method for large-scale remote sensing image
CN114927236A (en) Detection method and system for multiple target images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination