CN113627558A - Fish image identification method, system and equipment - Google Patents


Info

Publication number: CN113627558A
Authority: CN (China)
Prior art keywords: fish, network, module, fish image, image recognition
Legal status: Pending (assumption; not a legal conclusion)
Application number: CN202110955820.6A
Other languages: Chinese (zh)
Inventors: 孔青, 仲国强, 陈振潮
Current Assignee (listing may be inaccurate): Ocean University of China
Original Assignee: Ocean University of China
Application filed by Ocean University of China
Priority: CN202110955820.6A
Publication: CN113627558A
Legal status: Pending

Classifications

    • G06F18/2415: Pattern recognition; classification techniques based on parametric or probabilistic models, e.g. likelihood ratio or false acceptance rate versus false rejection rate
    • G06F18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N3/045: Neural networks; architectures; combinations of networks
    • G06N3/08: Neural networks; learning methods


Abstract

The invention relates to a fish image identification method, system, and device. The identification method comprises the following steps: S1, collecting fish images; screening them, applying data enhancement, and unifying their size; determining classification labels; and constructing a training data set. S2, adding a CBAM module to each residual block of a deep residual network to construct a fish image identification network. S3, training the fish image identification network constructed in step S2 with the training data set to obtain a fish image identification model. S4, screening the fish images to be identified, applying the same data enhancement and size unification, and identifying the processed images with the fish image identification model to obtain the identification result. The method improves the recognition rate on common fish images, with identification accuracy exceeding 80%.

Description

Fish image identification method, system and equipment
Technical Field
The invention belongs to the technical field of image recognition, relates to fish image recognition technology, and in particular relates to a fish image recognition method, system, and device.
Background
With growing human demand for marine resources, marine fishery resources are receiving ever more attention. To protect marine ecosystems and prevent fishermen from catching inappropriate fish or fishing in inappropriate periods, many countries and international organizations have installed monitoring cameras on fishing-vessel decks. However, conditions common at sea, such as mixed rain and waves and spreading water mist, seriously degrade monitoring image quality, so supervisors find it difficult to identify the fish on screen; reviewing the massive volume of monitoring video also demands substantial human resources, and identification accuracy is poor.
Disclosure of Invention
Aiming at the above problems, the invention provides a fish image identification method, system, and device that can identify and classify fish accurately and quickly.
In order to achieve the above object, the present invention provides a fish image recognition method, which comprises the following specific steps:
S1, collecting fish images; screening them, applying data enhancement, and unifying their size; determining classification labels; and constructing a training data set;
S2, adding a CBAM module to each residual block of a deep residual network to construct a fish image identification network;
S3, training the fish image identification network constructed in step S2 with the training data set, obtaining a fish image identification model once training is finished;
S4, screening the fish images to be identified, applying the same data enhancement and size unification, and identifying the processed images with the fish image identification model to obtain the identification result.
Preferably, in steps S1 and S4, the specific screening requirements are: remove fish images in which the fish features are unclear or more than 1/3 of the fish is missing, remove fish images with a resolution below 100 pixels, and remove fish images that do not match the intended fish category.
Preferably, in steps S1 and S4, the data enhancement is performed with the torchvision image library and specifically includes: horizontal flipping, random cropping, adding Gaussian noise, adjusting image brightness, and random rotation.
Preferably, the CBAM module includes a channel attention module and a spatial attention module arranged in sequence. Given an input feature map, the CBAM module sequentially infers attention maps along the channel and spatial dimensions, and multiplies each attention map with the feature map to obtain a refined, adaptive feature map, which is the final output of the CBAM module:

F′ = M_c(F) ⊗ F,  F″ = M_s(F′) ⊗ F′

where F is the input feature map, F″ is the final output feature map, ⊗ denotes element-wise multiplication, F′ is the output feature map of the channel attention module, M_c(F) is the channel attention map inferred by the channel attention module, and M_s(F′) is the spatial attention map inferred by the spatial attention module.
Preferably, the channel attention module aggregates the spatial information of the input feature map F using average pooling and max pooling, producing two different spatial context descriptors, and forwards both descriptors to a shared network, a multilayer perceptron (MLP) with one hidden layer, to generate the channel attention map.
Preferably, after the shared network is applied to each descriptor, the output feature vectors are merged by element-wise addition to obtain the channel attention map:

M_c(F) = σ( MLP(Avg(F)) + MLP(Max(F)) ) = σ( W_1(W_0(F_avg^c)) + W_1(W_0(F_max^c)) )

where Avg(·) is the average pooling function, Max(·) is the max pooling function, MLP(·) is the multilayer perceptron, F_avg^c and F_max^c are the average-pooled and max-pooled features of the feature map F, W_0(·) is the linear function of the first layer of the shared network, W_1(·) is the linear function of the second layer, and σ(·) denotes the sigmoid function.
Preferably, the spatial attention module aggregates the channel information of the feature map F′ using average pooling and max pooling, producing two 2D maps that represent the average-pooled and max-pooled features across all channels; the two maps are concatenated and convolved by a standard convolution layer to generate the spatial attention map:

M_s(F′) = σ( f^{m×m}( [F′_avg^s ; F′_max^s] ) )

where f^{m×m}(·) is a convolution with kernel size m × m, [· ; ·] denotes concatenation along the channel dimension, and F′_avg^s and F′_max^s are the average-pooled and max-pooled features of F′.
Preferably, in step S3, when training the fish image identification network, network parameters are optimized with the Adam algorithm, the output layer is classified with a Softmax function, and the network is optimized with a cross-entropy loss function.
In order to achieve the above object, the present invention also provides a fish image recognition system, comprising:
the data acquisition module is used for acquiring fish image data;
the training data set construction module is used for screening fish images, enhancing data and uniformly processing sizes, determining classification labels and constructing a training data set;
the fish image identification network construction module is used for adding a CBAM module to each residual block of a deep residual network to construct a fish image identification network;
the training module is used for training the constructed fish image recognition network by utilizing the training data set, and obtaining a fish image recognition model after the training is finished;
and the identification module is used for identifying the fish image to be identified according to the fish image identification model.
In order to achieve the above object, the present invention further provides a fish image identification apparatus, comprising a computer memory, a computer processor, and a computer program stored in the computer memory and executable on the computer processor, the computer program being configured to implement the steps of the above fish image identification method.
Compared with the prior art, the invention has the advantages and positive effects that:
(1) The invention adds a CBAM module (an attention mechanism module) to the residual blocks of the convolutional layers; CBAM is a simple and efficient attention module for feed-forward convolutional neural networks. Because it is a lightweight, general-purpose module, it can be integrated seamlessly into any CNN architecture with almost no cost in efficiency or computing power, supports end-to-end training, and improves recognition efficiency. By introducing an attention mechanism on top of the convolutional residual network, the network focuses on the fish information in the input picture that is most relevant to the current task, reduces attention to other information, and even filters out irrelevant information; this alleviates information overload and improves the efficiency and accuracy of task processing.
(2) The fish images and classification labels come from the Internet and from manual screening, so the proposed fish image identification method achieves a high recognition rate on common fish images; identification accuracy reaches 80.95%.
Drawings
FIG. 1 is a flow chart of a fish image recognition method according to an embodiment of the invention;
FIG. 2 is a schematic structural diagram of a CBAM module according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a channel attention module according to an embodiment of the invention;
FIG. 4 is a schematic structural diagram of a spatial attention module according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating a structure of a residual block according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a convolutional layer according to an embodiment of the present invention;
fig. 7 is a schematic view of a display interface of a fish identification result according to an embodiment of the present invention.
Detailed Description
The invention is described in detail below by way of exemplary embodiments. It should be understood, however, that elements, structures and features of one embodiment may be beneficially incorporated in other embodiments without further recitation.
Example 1: referring to fig. 1, the embodiment provides a fish image identification method, which specifically includes the steps of:
and S1, collecting fish images, screening the fish images, enhancing the data, uniformly processing the sizes of the fish images, determining classification labels, and constructing a training data set.
Specifically, the screening may be manual or automatic (performed by screening software), with the following requirements: remove fish images in which the fish features are unclear or more than 1/3 of the fish is missing, remove fish images with a resolution below 100 pixels, and remove fish images that do not match the intended fish category.
Specifically, the data enhancement is performed with the torchvision image library and includes: horizontal flipping, random cropping, adding Gaussian noise, adjusting image brightness, and random rotation. The rotation angle can be 30, 60, 90, or 270 degrees, selected according to actual requirements.
After data enhancement, all fish pictures are unified in size; for example, all pictures can be normalized to 224 × 224 pixels, determined according to actual conditions.
S2, add a CBAM module to each residual block of the deep residual network to construct the fish image identification network.
Specifically, referring to fig. 2, the CBAM module includes a channel attention module and a spatial attention module arranged in sequence. Given an input feature map, the CBAM module sequentially infers attention maps along the channel and spatial dimensions, and multiplies each attention map with the feature map to obtain a refined, adaptive feature map, which is the final output of the CBAM module:

F′ = M_c(F) ⊗ F,  F″ = M_s(F′) ⊗ F′

where F is the input feature map, F″ is the final output feature map, ⊗ denotes element-wise multiplication, F′ is the output feature map of the channel attention module, M_c(F) is the channel attention map inferred by the channel attention module, and M_s(F′) is the spatial attention map inferred by the spatial attention module.
Referring to fig. 3, the channel attention module aggregates the spatial information of the input feature map F using average pooling and max pooling, producing two different spatial context descriptors, and forwards both descriptors to a shared network, a multilayer perceptron (MLP) with one hidden layer, to generate the channel attention map. After the shared network is applied to each descriptor, the output feature vectors are merged by element-wise addition:

M_c(F) = σ( MLP(Avg(F)) + MLP(Max(F)) ) = σ( W_1(W_0(F_avg^c)) + W_1(W_0(F_max^c)) )

where Avg(·) is the average pooling function, Max(·) is the max pooling function, MLP(·) is the multilayer perceptron, F_avg^c and F_max^c are the average-pooled and max-pooled features of the feature map F, W_0(·) is the linear function of the first layer of the shared network, W_1(·) is the linear function of the second layer, and σ(·) denotes the sigmoid function.
Referring to fig. 4, the spatial attention module aggregates the channel information of the feature map F′ using average pooling and max pooling, producing two 2D maps that represent the average-pooled and max-pooled features across all channels; the two maps are concatenated and convolved by a standard convolution layer to generate the spatial attention map:

M_s(F′) = σ( f^{m×m}( [F′_avg^s ; F′_max^s] ) )

where f^{m×m}(·) is a convolution with kernel size m × m, [· ; ·] denotes concatenation along the channel dimension, and F′_avg^s and F′_max^s are the average-pooled and max-pooled features of F′.
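The channel and spatial attention computations described above can be sketched in PyTorch as follows. The reduction ratio of the shared MLP (16) and the spatial kernel size (7 × 7) are assumptions commonly used for CBAM, not values fixed by the text.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """M_c(F) = sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F)))."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Shared two-layer perceptron W_1(W_0(.)) with one hidden layer.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),  # W_0
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),  # W_1
        )

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = f.shape
        avg = self.mlp(f.mean(dim=(2, 3)))   # descriptor from average pooling
        mx = self.mlp(f.amax(dim=(2, 3)))    # descriptor from max pooling
        return torch.sigmoid(avg + mx).view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    """M_s(F') = sigmoid(conv_{m x m}([avg-pool ; max-pool] over channels))."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        avg = f.mean(dim=1, keepdim=True)    # average over channels
        mx = f.amax(dim=1, keepdim=True)     # max over channels
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAM(nn.Module):
    """F' = M_c(F) * F, then F'' = M_s(F') * F' (element-wise)."""
    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        self.ca = ChannelAttention(channels, reduction)
        self.sa = SpatialAttention(kernel_size)

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        f1 = self.ca(f) * f
        return self.sa(f1) * f1
```

The refined output has the same shape as the input, which is what lets the module drop into a residual block without changing surrounding layers.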
S3, train the fish image identification network constructed in step S2 with the training data set; once training is finished, the fish image identification model is obtained.
Specifically, when training the fish image identification network, network parameters are optimized with the Adam algorithm, the output layer is classified with a Softmax function, and the network is optimized with a cross-entropy loss function. Adam is used because it combines the advantages of the adaptive gradient algorithm and root-mean-square propagation: it dynamically adjusts the learning rate of each parameter from the first- and second-moment estimates of that parameter's gradient.
The Softmax function is expressed as:

σ(z)_j = e^{z_j} / Σ_{k=1}^{n} e^{z_k},  j = 1, 2, …, n

where z_j is the output value of the j-th neuron and n is the number of classes.
It should be noted that the Softmax function maps the output values of a plurality of neurons into a [0,1] interval, each value in the interval representing the probability that the sample belongs to each class, and the sum of the values is 1.
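The mapping property noted above can be checked numerically. A minimal NumPy sketch (subtracting the maximum logit is a standard numerical-stability step, not part of the patent text):

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    """sigma(z)_j = exp(z_j) / sum_k exp(z_k), stabilised by shifting by max(z)."""
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])  # raw outputs of three class neurons
p = softmax(logits)
# every p_j lies in [0, 1], the values sum to 1, and the largest
# probability belongs to the neuron with the largest raw output
```

The shift by `z.max()` leaves the result unchanged mathematically (numerator and denominator scale by the same factor) while avoiding overflow for large logits.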
The cross-entropy loss function is expressed as:

G_loss = −(1/m) Σ_{i=1}^{m} Σ_{j=1}^{n} ŷ_{ij} log(y_{ij})

where G_loss is the loss value, m is the number of samples in the current batch, n is the number of classes, ŷ_{ij} is the true label, and y_{ij} is the predicted probability.
It should be noted that the cross entropy function describes the distance between the actual output probability and the expected output probability distribution, and the smaller the value of the cross entropy function is, the better the learning effect in the model training process is.
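One training step with this configuration can be sketched in PyTorch. The classifier here is a stand-in (in the patent the network is the deep residual network with CBAM), and the learning rate is an illustrative assumption; note that `nn.CrossEntropyLoss` applies the Softmax (in log form) and the cross-entropy in one call, so the model outputs raw logits.

```python
import torch
import torch.nn as nn

# Stand-in classifier over 224 x 224 RGB images and 21 fish classes.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 21))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # lr is an assumption
criterion = nn.CrossEntropyLoss()  # log-softmax + cross entropy in one step

images = torch.randn(4, 3, 224, 224)      # dummy batch of m = 4 samples
labels = torch.randint(0, 21, (4,))       # true class indices

optimizer.zero_grad()
logits = model(images)                    # raw outputs, one row per sample
loss = criterion(logits, labels)          # G_loss averaged over the batch
loss.backward()                           # gradients for Adam's moment estimates
optimizer.step()                          # per-parameter learning-rate update
```

Repeating this loop over batches of the training data set until the loss stabilizes yields the trained fish image identification model.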
S4, screen the fish images to be identified, apply the same data enhancement and size unification, and identify the processed images with the fish image identification model to obtain the identification result.
Specifically, the screening may be manual or automatic (performed by screening software), with the following requirements: remove fish images in which the fish features are unclear or more than 1/3 of the fish is missing, remove fish images with a resolution below 100 pixels, and remove fish images that do not match the intended fish category.
Specifically, the data enhancement is performed with the torchvision image library and includes: horizontal flipping, random cropping, adding Gaussian noise, adjusting image brightness (brightening or dimming), and random rotation. The rotation angle can be 30, 60, 90, or 270 degrees, selected according to actual requirements.
After data enhancement, all fish pictures are unified in size; for example, all pictures can be normalized to 224 × 224 pixels, determined according to actual conditions.
To evaluate the identification accuracy of the fish image identification model, this embodiment adopts Top-1 accuracy (Acc_Top-1) as the evaluation criterion. Top-1 accuracy is the probability that the fish class with the highest probability in the final output vector matches the correct fish class:

Acc_Top-1 = N_Top-1 / N

where N is the total number of images and N_Top-1 is the number of correctly classified images.
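The Top-1 metric reduces to a simple ratio once each image's highest-probability class has been extracted. A minimal sketch with hypothetical predictions:

```python
def top1_accuracy(pred_classes, true_classes):
    """Acc_Top-1 = N_Top-1 / N: the fraction of images whose
    highest-probability class matches the correct class."""
    assert len(pred_classes) == len(true_classes) and len(true_classes) > 0
    n_correct = sum(p == t for p, t in zip(pred_classes, true_classes))
    return n_correct / len(true_classes)

# Hypothetical argmax predictions for five test images vs. their true labels:
# four of five match, so Acc_Top-1 = 4/5 = 0.8.
acc = top1_accuracy([3, 0, 7, 7, 2], [3, 1, 7, 7, 2])
```

In practice `pred_classes` would come from taking the argmax of each Softmax output vector over the 21 fish classes.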
The fish image identification method collects fish images, constructs a training data set, combines the deep residual network with CBAM modules to build the fish image identification network, trains the network on the training data set to obtain the fish image identification model, and identifies the fish images to be identified with that model. Because the method is built on the CBAM module, a lightweight, general-purpose module that can be integrated seamlessly into any CNN architecture with almost no cost in efficiency or computation, it supports end-to-end training and improves the feature-expression capability of the network, which in turn improves the convergence speed and test accuracy of the fish image identification model; the trained model has a simple structure and a good recognition effect.
Example 2: referring to the drawings, the present embodiment provides a fish image recognition system, including:
the data acquisition module 1 is used for acquiring fish image data;
the training data set construction module 2 is used for screening, enhancing and uniformly processing the fish images in size, determining classification labels and constructing a training data set;
the fish image identification network construction module 3 is used for adding a CBAM module to each residual block of a deep residual network to construct a fish image identification network;
the training module 4 is used for training the constructed fish image recognition network by utilizing the training data set, and obtaining a fish image recognition model after the training is finished;
and the identification module 5 is used for identifying the fish image to be identified according to the fish image identification model.
Referring to fig. 2, the CBAM module includes a channel attention module and a spatial attention module arranged in sequence. Given an input feature map, the CBAM module sequentially infers attention maps along the channel and spatial dimensions, and multiplies each attention map with the feature map to obtain a refined, adaptive feature map, which is the final output of the CBAM module:

F′ = M_c(F) ⊗ F,  F″ = M_s(F′) ⊗ F′

where F is the input feature map, F″ is the final output feature map, ⊗ denotes element-wise multiplication, F′ is the output feature map of the channel attention module, M_c(F) is the channel attention map inferred by the channel attention module, and M_s(F′) is the spatial attention map inferred by the spatial attention module.
Referring to fig. 3, the channel attention module aggregates the spatial information of the input feature map F using average pooling and max pooling, producing two different spatial context descriptors, and forwards both descriptors to a shared network, a multilayer perceptron (MLP) with one hidden layer, to generate the channel attention map. After the shared network is applied to each descriptor, the output feature vectors are merged by element-wise addition:

M_c(F) = σ( MLP(Avg(F)) + MLP(Max(F)) ) = σ( W_1(W_0(F_avg^c)) + W_1(W_0(F_max^c)) )

where Avg(·) is the average pooling function, Max(·) is the max pooling function, MLP(·) is the multilayer perceptron, F_avg^c and F_max^c are the average-pooled and max-pooled features of the feature map F, W_0(·) is the linear function of the first layer of the shared network, W_1(·) is the linear function of the second layer, and σ(·) denotes the sigmoid function.
Referring to fig. 4, the spatial attention module aggregates the channel information of the feature map F′ using average pooling and max pooling, producing two 2D maps that represent the average-pooled and max-pooled features across all channels; the two maps are concatenated and convolved by a standard convolution layer to generate the spatial attention map:

M_s(F′) = σ( f^{m×m}( [F′_avg^s ; F′_max^s] ) )

where f^{m×m}(·) is a convolution with kernel size m × m, [· ; ·] denotes concatenation along the channel dimension, and F′_avg^s and F′_max^s are the average-pooled and max-pooled features of F′.
Specifically, when training the constructed fish image identification network with the training data set, network parameters are optimized with the Adam algorithm, the output layer is classified with the Softmax function, and the network is optimized with a cross-entropy loss function. Adam is used because it combines the advantages of the adaptive gradient algorithm and root-mean-square propagation: it dynamically adjusts the learning rate of each parameter from the first- and second-moment estimates of that parameter's gradient.
The Softmax function is expressed as:

σ(z)_j = e^{z_j} / Σ_{k=1}^{n} e^{z_k},  j = 1, 2, …, n

where z_j is the output value of the j-th neuron and n is the number of classes.
It should be noted that the Softmax function maps the output values of a plurality of neurons into a [0,1] interval, each value in the interval representing the probability that the sample belongs to each class, and the sum of the values is 1.
The cross-entropy loss function is expressed as:

G_loss = −(1/m) Σ_{i=1}^{m} Σ_{j=1}^{n} ŷ_{ij} log(y_{ij})

where G_loss is the loss value, m is the number of samples in the current batch, n is the number of classes, ŷ_{ij} is the true label, and y_{ij} is the predicted probability.
It should be noted that the cross entropy function describes the distance between the actual output probability and the expected output probability distribution, and the smaller the value of the cross entropy function is, the better the learning effect in the model training process is.
In the fish identification system, the data acquisition module collects fish images; the training data set construction module builds the training data set; the fish image identification network construction module combines the deep residual network with CBAM modules to build the fish image identification network; the training module trains this network on the training data set to obtain the fish image identification model; and the identification module identifies the fish images to be identified with that model. Because the CBAM module is lightweight and general-purpose, it can be integrated seamlessly into any CNN architecture with almost no cost in efficiency or computation, supports end-to-end training, and improves the feature-expression capability of the network, which in turn improves the convergence speed and test accuracy of the fish image identification model; the trained model has a simple structure and a good recognition effect.
Example 3: the present embodiment provides a fish image identification apparatus comprising a computer memory, a computer processor, and a computer program stored in the computer memory and executable on the computer processor, the computer program being configured to implement the steps of the fish image identification method of embodiment 1.
The above method is described below with reference to specific examples.
Collecting fish images, screening the collected fish images, enhancing data and uniformly processing sizes, determining classification labels and constructing a training data set.
The screening specifically comprises, by manual selection: removing fish images in which the fish features are unclear or more than 1/3 of the fish is missing, removing fish images with a resolution below 100 pixels, and removing fish images that do not match the intended fish category. Screening improves the quality of the training data set; the screened images cover 21 common fish species, total 300 images, and are in JPG or PNG format.
Because the number of fish images obtained by manual screening is small relative to the training sample size required by a deep residual network, data enhancement is used to expand the set: a single picture can be expanded into multiple copies, greatly increasing the training sample size, improving the generalization of the network, and reducing overfitting. In this embodiment, the torchvision image library is used for data enhancement, which specifically includes: horizontal flipping, random cropping, adding Gaussian noise, adjusting image brightness (brightening or dimming), and random rotation by 30, 60, 90, or 270 degrees.
After data enhancement is completed, all fish images are normalized in size to 224 x 224 pixels.
Finally, a classification label is added to each image to form the training data set, which is written to a CSV file so that it can be read conveniently by the deep residual network.
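As an illustration, the label CSV could be produced as follows. The folder-per-species layout and the function name are hypothetical; the patent does not specify how the files are organized.

```python
import csv
from pathlib import Path

def write_label_csv(root: str, out_csv: str) -> int:
    """Write one (image path, integer label) row per image; return row count."""
    rows = []
    # Assumed layout: one sub-folder per fish species, e.g. root/carp/0001.jpg,
    # with the sorted folder index serving as the integer class label.
    species_dirs = sorted(p for p in Path(root).iterdir() if p.is_dir())
    for label, species_dir in enumerate(species_dirs):
        for img in sorted(species_dir.glob("*")):
            rows.append((str(img), label))
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "label"])       # header read by the data loader
        writer.writerows(rows)
    return len(rows)
```

A PyTorch `Dataset` can then read this CSV to pair each image with its class label during training.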
ResNet50 is used as the backbone network of this embodiment, and a CBAM module is added to each residual module of the convolutional layers in ResNet50 to construct the fish image recognition network. Adding CBAM to ResNet50 improves the feature-expression capability of the network on the one hand, and on the other hand tells the network what to attend to, strengthening the representation of specific regions.
Referring to fig. 5, for the deep residual network, if the optimal feature output is y and the input received by the CBAM module is x, the desired nonlinear processing result (i.e., the residual) provided by the CBAM module is F(x) = y - x, so that the output is F(x) + x. If the preceding shallow layers already provide the optimal output x = y, then F(x) should approach 0, which ensures that the error rate of the fish image recognition network does not increase. In this embodiment, the residual block contains three convolutional layers, channel attention, and spatial attention; the residual block structure actually used is shown in fig. 6. Results of fish image recognition by the fish image recognition model are shown in fig. 7.
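A PyTorch sketch of such a residual block follows. The reduction ratio 16 and the 7x7 spatial kernel are the defaults from the CBAM paper, not values stated in this patent, and `CBAMBottleneck` is an illustrative name for the three-convolution block described above.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Channel attention followed by spatial attention."""
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        # Shared MLP with one hidden layer for the channel attention.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Single conv over the 2-channel [avg; max] map for spatial attention.
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        # Channel attention: sigmoid(MLP(avg-pool) + MLP(max-pool)).
        w = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) +
                          self.mlp(x.amax(dim=(2, 3))))
        x = x * w.view(b, c, 1, 1)
        # Spatial attention: sigmoid(conv([avg over channels; max over channels])).
        s = torch.sigmoid(self.conv(torch.cat(
            [x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)))
        return x * s

class CBAMBottleneck(nn.Module):
    """Three convolutions, then CBAM refines F(x) before the skip addition."""
    def __init__(self, channels):
        super().__init__()
        mid = channels // 4
        self.body = nn.Sequential(
            nn.Conv2d(channels, mid, 1, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.cbam = CBAM(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.cbam(self.body(x)) + x)  # output = F(x) + x
```

Because CBAM multiplies features by sigmoid gates, its output preserves the input shape, so the identity shortcut adds cleanly and the attention can drive F(x) toward 0 when the shallow features are already optimal.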
The above-described embodiments are intended to illustrate rather than to limit the invention, and any modifications and variations of the present invention are possible within the spirit and scope of the claims.

Claims (10)

1. A fish image identification method is characterized by comprising the following specific steps:
s1, collecting fish images, screening the fish images, enhancing the data and uniformly processing the fish images in size, determining classification labels and constructing a training data set;
s2, adding a CBAM module into each residual block of the deep residual network to construct a fish image identification network;
s3, training the fish image recognition network constructed in the step S2 by using a training data set, and obtaining a fish image recognition model after training is finished;
and S4, screening the fish images to be recognized, enhancing data and uniformly processing the fish images to be recognized, and recognizing the processed fish images to be recognized by using the fish image recognition model to obtain a recognition result.
2. The fish image recognition method of claim 1, wherein in steps S1 and S4, the specific requirements of the screening are: removing fish images in which the fish features are unclear or more than 1/3 of the fish body is missing, removing fish images with a resolution below 100, and removing fish images that do not belong to any target fish category.
3. The fish image recognition method of claim 1, wherein in steps S1 and S4 the data enhancement is performed with the torchvision image library and specifically comprises: horizontally flipping the image, random cropping, adding Gaussian noise, adjusting image brightness, and random rotation.
4. The fish image recognition method of claim 1, wherein in step S2 the CBAM module comprises a channel attention module and a spatial attention module arranged in sequence; given an input feature map, the CBAM module sequentially infers attention maps along the two separate dimensions of channel and space, and multiplies each attention map with the feature map to obtain a refined adaptive feature map, which is the final output feature map of the CBAM module, expressed as:

F′ = M_C(F) ⊗ F
F″ = M_S(F′) ⊗ F′

where F is the input feature map, ⊗ denotes element-wise multiplication, M_C(F) is the channel attention map inferred by the channel attention module, F′ is the output feature map of the channel attention module, M_S(F′) is the spatial attention map inferred by the spatial attention module, and F″ is the final output feature map.
5. The fish image recognition method of claim 4, wherein the channel attention module aggregates the spatial information of the input feature map F using average pooling and max pooling operations to generate two different spatial context descriptors, and forwards both descriptors to a shared network consisting of a multi-layer perceptron (MLP) with one hidden layer to generate the channel attention.
6. The fish image recognition method of claim 5, wherein after the shared network is applied to each descriptor, the output feature vectors are merged by element-wise addition to obtain the channel attention map, expressed as:

M_C(F) = σ(MLP(Avg(F)) + MLP(Max(F))) = σ(W₁(W₀(Fᶜ_avg)) + W₁(W₀(Fᶜ_max)))

where Avg(·) is the average pooling function, Max(·) is the max pooling function, MLP(·) denotes the multi-layer perceptron, Fᶜ_avg and Fᶜ_max are the features obtained by average pooling and max pooling of the feature map F, W₀(·) is the linear transformation function of the first layer of the shared network, W₁(·) is the linear transformation function of the second layer of the shared network, and σ(·) denotes the sigmoid function.
7. The fish image recognition method of claim 4, wherein the spatial attention module aggregates the channel information of the feature map F′ using average pooling and max pooling operations to generate two two-dimensional maps representing the average-pooled and max-pooled features across the channels; the two maps are concatenated and convolved by a standard m×m convolutional layer to generate the spatial attention map, expressed as:

M_S(F′) = σ(f^{m×m}([Fˢ_avg; Fˢ_max]))

where f^{m×m}(·) denotes convolution with an m×m kernel, Fˢ_avg and Fˢ_max are the features obtained by average pooling and max pooling of the feature map F′, and [·;·] denotes concatenation.
8. The fish image recognition method of claim 1, wherein in step S3, when training the fish image recognition network, the Adam algorithm is used for network parameter optimization, the output layer performs classification with a Softmax function, and a cross-entropy loss function is used for network optimization.
9. A fish image recognition system, comprising:
the data acquisition module is used for acquiring fish image data;
the training data set construction module is used for screening fish images, enhancing data and uniformly processing sizes, determining classification labels and constructing a training data set;
the fish image identification network construction module is used for adding a CBAM module into each residual block of the deep residual network to construct a fish image identification network;
the training module is used for training the constructed fish image recognition network by utilizing the training data set, and obtaining a fish image recognition model after the training is finished;
and the identification module is used for identifying the fish image to be identified according to the fish image identification model.
10. A fish image identification device comprising a computer memory, a computer processor and a computer program stored in the computer memory and executable on the computer processor, characterized in that the computer program is arranged to implement the steps of the fish image identification method according to any one of claims 1 to 8.
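The training setup recited in claim 8 can be sketched in PyTorch as follows. The tiny linear model and random tensors are stand-ins for the CBAM-ResNet50 and the fish data set; the learning rate and step count are illustrative assumptions.

```python
import torch
import torch.nn as nn

# 21 output classes matches the embodiment's species count.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 21))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # Adam optimization
criterion = nn.CrossEntropyLoss()       # cross-entropy (applies Softmax internally)

images = torch.randn(4, 3, 224, 224)    # stand-in batch of fish images
labels = torch.randint(0, 21, (4,))     # stand-in classification labels

for _ in range(3):                      # a few illustrative optimization steps
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()

# At inference, the output layer classifies via Softmax probabilities.
probs = torch.softmax(model(images), dim=1)
print(tuple(probs.shape))               # (4, 21)
```

Note that `CrossEntropyLoss` expects raw logits, so the explicit Softmax is applied only when class probabilities are needed at inference time.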
CN202110955820.6A 2021-08-19 2021-08-19 Fish image identification method, system and equipment Pending CN113627558A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110955820.6A CN113627558A (en) 2021-08-19 2021-08-19 Fish image identification method, system and equipment


Publications (1)

Publication Number Publication Date
CN113627558A 2021-11-09

Family

ID=78386721




Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110197205A (en) * 2019-05-09 2019-09-03 三峡大学 A kind of image-recognizing method of multiple features source residual error network
CN110781921A (en) * 2019-09-25 2020-02-11 浙江农林大学 Depth residual error network and transfer learning-based muscarinic image identification method and device
CN111291670A (en) * 2020-01-23 2020-06-16 天津大学 Small target facial expression recognition method based on attention mechanism and network integration
CN111563473A (en) * 2020-05-18 2020-08-21 电子科技大学 Remote sensing ship identification method based on dense feature fusion and pixel level attention
CN112200241A (en) * 2020-10-09 2021-01-08 山东大学 Automatic sorting method for fish varieties based on ResNet transfer learning
CN112241679A (en) * 2020-09-14 2021-01-19 浙江理工大学 Automatic garbage classification method
CN112651438A (en) * 2020-12-24 2021-04-13 世纪龙信息网络有限责任公司 Multi-class image classification method and device, terminal equipment and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Sanghyun Woo et al., "CBAM: Convolutional Block Attention Module", ECCV 2018, pp. 3-19 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114240686A (en) * 2022-02-24 2022-03-25 深圳市旗扬特种装备技术工程有限公司 Wisdom fishery monitoring system
CN114240686B (en) * 2022-02-24 2022-06-03 深圳市旗扬特种装备技术工程有限公司 Wisdom fishery monitoring system
CN115482419A (en) * 2022-10-19 2022-12-16 江苏雷默智能科技有限公司 Data acquisition and analysis method and system for marine fishery products
CN115482419B (en) * 2022-10-19 2023-11-14 江苏雷默智能科技有限公司 Data acquisition and analysis method and system for marine fishery products


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination