CN113627558A - Fish image identification method, system and equipment - Google Patents
- Publication number
- CN113627558A (application number CN202110955820.6A)
- Authority
- CN
- China
- Prior art keywords
- fish
- network
- module
- fish image
- image recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F18/2415—Classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06N3/045—Neural network architectures; Combinations of networks
- G06N3/08—Neural networks; Learning methods
Abstract
The invention relates to a fish image identification method, system and equipment. The identification method comprises the following steps: S1, collecting fish images, screening them, applying data enhancement, normalizing their sizes, determining classification labels and constructing a training data set; S2, adding a CBAM module to each residual block of a deep residual network to construct a fish image recognition network; S3, training the fish image recognition network constructed in step S2 with the training data set, obtaining a fish image recognition model once training is complete; and S4, screening, enhancing and size-normalizing the fish images to be recognized, then recognizing the processed images with the fish image recognition model to obtain a recognition result. The method improves the recognition rate for common fish images, with a recognition accuracy above 80%.
Description
Technical Field
The invention belongs to the technical field of image recognition, relates to fish image recognition technology, and particularly relates to a fish image recognition method, system and equipment.
Background
With growing human demand for marine resources, marine fishery resources are receiving more and more attention. To protect marine ecosystems and prevent fishermen from catching unsuitable fish or fishing during unsuitable periods, many countries and international organizations install monitoring cameras on fishing-boat decks. However, conditions common during marine operations, such as rain mixed with spray and diffusing water mist, seriously degrade monitoring image quality, making it difficult for supervisors to identify the fish on screen; reviewing massive volumes of monitoring video also requires substantial human effort, and identification accuracy is poor.
Disclosure of Invention
Aiming at the above problems, the invention provides a fish image identification method, system and equipment that can identify and classify fish accurately and quickly.
To achieve the above object, the present invention provides a fish image recognition method comprising the following steps:
S1, collecting fish images, screening them, applying data enhancement, normalizing their sizes, determining classification labels and constructing a training data set;
S2, adding a CBAM module to each residual block of a deep residual network to construct a fish image recognition network;
S3, training the fish image recognition network constructed in step S2 with the training data set, and obtaining a fish image recognition model after training is complete;
and S4, screening, enhancing and size-normalizing the fish images to be recognized, and recognizing the processed images with the fish image recognition model to obtain a recognition result.
Preferably, in steps S1 and S4, the specific screening requirements are: remove fish images in which the fish features are unclear or more than 1/3 incomplete, remove fish images with a resolution below 100 pixels, and remove fish images that do not belong to any target fish category.
Preferably, in steps S1 and S4, the data enhancement uses the torchvision image library and specifically includes: horizontal flipping, random cropping, adding Gaussian noise, adjusting image brightness, and random rotation.
Preferably, the CBAM module comprises a channel attention module and a spatial attention module arranged in sequence. Given an input feature map, the CBAM module successively infers attention maps along the two separate dimensions of channel and space, then multiplies the attention maps with the input feature map to obtain a refined adaptive feature map, which is the final output feature map of the CBAM module, expressed as:

F′ = M_C(F) ⊗ F
F″ = M_S(F′) ⊗ F′

where F″ is the final output feature map, ⊗ denotes element-wise multiplication, F′ is the output feature map of the channel attention module, M_C(F) is the channel attention map inferred by the channel attention module, M_S(F′) is the spatial attention map inferred by the spatial attention module, and F is the input feature map.
Preferably, the channel attention module aggregates the spatial information of the input feature map F with average pooling and maximum pooling operations to generate two different spatial context descriptors, and forwards both descriptors to a shared network, a multi-layer perceptron (MLP) with one hidden layer, to generate the channel attention map.
Preferably, after the shared network is applied to each descriptor, the output feature vectors are merged by element-wise addition to obtain the channel attention map, expressed as:

M_C(F) = σ(MLP(Avg(F)) + MLP(Max(F))) = σ(W_1(W_0(F_avg^c)) + W_1(W_0(F_max^c)))

where Avg(·) is the average pooling function, Max(·) is the maximum pooling function, MLP(·) is the multi-layer perceptron output function, F_avg^c is the feature obtained by average pooling of the feature map F, F_max^c is the feature obtained by maximum pooling of F, W_0(·) is the linear function of the first layer of the shared network, W_1(·) is the linear function of the second layer of the shared network, and σ(·) is the sigmoid function.
Preferably, the spatial attention module aggregates the channel information of the feature map F′ with average pooling and maximum pooling operations to generate two 2-D maps, representing the average-pooled and maximum-pooled features across the channel dimension; these two maps are concatenated and convolved by a standard convolution layer to generate the spatial attention map, expressed as:

M_S(F′) = σ(f^{m×m}([F′_avg^s ; F′_max^s]))

where f^{m×m}(·) is a convolution with an m×m kernel, F′_avg^s is the feature obtained by average pooling of the feature map F′, and F′_max^s is the feature obtained by maximum pooling of F′.
Preferably, in step S3, when training the fish image recognition network, the Adam algorithm is used to optimize the network parameters, the output layer is classified with a Softmax function, and a cross-entropy loss function is used as the training objective.
To achieve the above object, the present invention also provides a fish image recognition system, comprising:
a data acquisition module for acquiring fish image data;
a training data set construction module for screening the fish images, applying data enhancement, normalizing sizes, determining classification labels and constructing a training data set;
a fish image recognition network construction module for adding a CBAM module into each residual block of a deep residual network to construct a fish image recognition network;
a training module for training the constructed fish image recognition network with the training data set, and obtaining a fish image recognition model after training is complete;
and a recognition module for recognizing the fish image to be recognized with the fish image recognition model.
In order to achieve the above object, the present invention further provides a fish image identification apparatus, comprising a computer memory, a computer processor, and a computer program stored in the computer memory and executable on the computer processor, the computer program being configured to implement the steps of the above fish image identification method.
Compared with the prior art, the invention has the advantages and positive effects that:
(1) The invention adds a CBAM module (an attention mechanism module) to each residual block of the convolution layers; CBAM is a simple and efficient attention module for feed-forward convolutional neural networks. Because the CBAM module is a lightweight general-purpose module, it can be integrated seamlessly into any CNN architecture with almost no impact on efficiency or computing power, enables end-to-end training, and improves recognition efficiency. By introducing an attention mechanism on top of the convolutional residual network, the method focuses on the fish information in the input picture that is most relevant to the current task, reduces attention to other information and even filters out irrelevant information, which alleviates information overload and improves the efficiency and accuracy of the task.
(2) The fish images and classification labels are obtained from the internet and by manual screening, so the proposed fish image recognition method achieves a high recognition rate for common fish images, with a recognition accuracy of 80.95%.
Drawings
FIG. 1 is a flow chart of a fish image recognition method according to an embodiment of the invention;
FIG. 2 is a schematic structural diagram of a CBAM module according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a channel attention module according to an embodiment of the invention;
FIG. 4 is a schematic structural diagram of a spatial attention module according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating a structure of a residual block according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a convolutional layer according to an embodiment of the present invention;
FIG. 7 is a schematic view of a display interface of a fish identification result according to an embodiment of the present invention.
Detailed Description
The invention is described in detail below by way of exemplary embodiments. It should be understood, however, that elements, structures and features of one embodiment may be beneficially incorporated in other embodiments without further recitation.
Example 1: referring to fig. 1, the embodiment provides a fish image identification method, which specifically includes the steps of:
S1, collecting fish images, screening them, applying data enhancement, normalizing their sizes, determining classification labels, and constructing a training data set.
Specifically, the screening may be manual or automatic (performed by screening software), with the following specific requirements: remove fish images in which the fish features are unclear or more than 1/3 incomplete, remove fish images with a resolution below 100 pixels, and remove fish images that do not belong to any target fish category.
Specifically, the data enhancement uses the torchvision image library and includes: horizontal flipping, random cropping, adding Gaussian noise, adjusting image brightness, and random rotation. The random rotation angle may be 30, 60, 90 or 270 degrees, chosen according to actual requirements.
After data enhancement, all fish pictures are normalized to a uniform size, for example 224 × 224 pixels, determined according to actual conditions.
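As a rough illustration, the enhancement and size-normalization steps can be sketched in plain NumPy (the embodiment itself uses torchvision; the crop ratio, noise level, brightness range and helper names below are illustrative assumptions, not from the patent):

```python
import numpy as np

def augment(img, rng):
    """Rough NumPy stand-in for the torchvision pipeline described above:
    horizontal flip, random crop, Gaussian noise, brightness change and
    rotation by a multiple of 90 degrees (arbitrary angles such as 30 or 60
    degrees would need an interpolating rotation, omitted here)."""
    if rng.random() < 0.5:                       # random horizontal flip
        img = img[:, ::-1, :]
    h, w, _ = img.shape                          # random crop to 90% per side
    ch, cw = int(h * 0.9), int(w * 0.9)
    y = int(rng.integers(0, h - ch + 1))
    x = int(rng.integers(0, w - cw + 1))
    img = img[y:y + ch, x:x + cw, :].astype(np.float32)
    img += rng.normal(0.0, 5.0, img.shape)       # additive Gaussian noise
    img *= rng.uniform(0.8, 1.2)                 # brighten or dim
    img = np.rot90(img, k=int(rng.choice([1, 2, 3])))  # 90/180/270 degrees
    return np.clip(img, 0, 255).astype(np.uint8)

def resize_nn(img, size=224):
    """Nearest-neighbour resize to size x size, mimicking the final
    normalization to 224 x 224 pixels."""
    h, w, _ = img.shape
    ys = np.arange(size) * h // size
    xs = np.arange(size) * w // size
    return img[ys][:, xs]
```

A single source picture can be passed through `augment` several times to produce multiple distinct training copies.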
S2, adding a CBAM module into each residual block of the deep residual network to construct the fish image recognition network.
Specifically, referring to fig. 2, the CBAM module comprises a channel attention module and a spatial attention module arranged in sequence. Given an input feature map, the CBAM module successively infers attention maps along the two separate dimensions of channel and space, then multiplies the attention maps with the input feature map to obtain a refined adaptive feature map, the final output feature map of the CBAM module, expressed as:

F′ = M_C(F) ⊗ F
F″ = M_S(F′) ⊗ F′

where F″ is the final output feature map, ⊗ denotes element-wise multiplication, F′ is the output feature map of the channel attention module, M_C(F) is the channel attention map inferred by the channel attention module, M_S(F′) is the spatial attention map inferred by the spatial attention module, and F is the input feature map.
Referring to fig. 3, the channel attention module aggregates the spatial information of the input feature map F using average pooling and maximum pooling operations to generate two different spatial context descriptors, and forwards both descriptors to a shared network, a multi-layer perceptron (MLP) with one hidden layer, to generate the channel attention map. After the shared network is applied to each descriptor, the output feature vectors are merged by element-wise addition, giving the channel attention map:

M_C(F) = σ(MLP(Avg(F)) + MLP(Max(F))) = σ(W_1(W_0(F_avg^c)) + W_1(W_0(F_max^c)))

where Avg(·) is the average pooling function, Max(·) is the maximum pooling function, MLP(·) is the multi-layer perceptron output function, F_avg^c and F_max^c are the features obtained by average and maximum pooling of F, W_0(·) and W_1(·) are the linear functions of the first and second layers of the shared network, and σ(·) is the sigmoid function.
Referring to fig. 4, the spatial attention module aggregates the channel information of the feature map F′ with average pooling and maximum pooling operations to generate two 2-D maps, representing the average-pooled and maximum-pooled features across the channel dimension; these two maps are concatenated and convolved by a standard convolution layer to generate the spatial attention map, expressed as:

M_S(F′) = σ(f^{m×m}([F′_avg^s ; F′_max^s]))

where f^{m×m}(·) is a convolution with an m×m kernel, F′_avg^s is the feature obtained by average pooling of the feature map F′, and F′_max^s is the feature obtained by maximum pooling of F′.
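The two attention formulas above can be sketched numerically. The following NumPy code is an illustrative toy implementation; the ReLU hidden layer, the shapes of the MLP weights W0/W1 and the small m×m kernel are assumptions for the sketch, not values fixed by the patent:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W0, W1):
    """M_C(F) = sigmoid(W1(W0(Avg(F))) + W1(W0(Max(F)))) for F of shape
    (C, H, W); W0 and W1 are the shared-MLP weight matrices (a ReLU hidden
    layer is assumed, as in the original CBAM formulation)."""
    avg = F.mean(axis=(1, 2))                    # spatial average pooling, (C,)
    mx = F.max(axis=(1, 2))                      # spatial max pooling, (C,)
    mlp = lambda v: W1 @ np.maximum(W0 @ v, 0.0)
    return sigmoid(mlp(avg) + mlp(mx))           # channel attention map, (C,)

def spatial_attention(F, kernel):
    """M_S(F) = sigmoid(f_mxm([Avg_c(F); Max_c(F)])); kernel has shape
    (2, m, m), one m x m filter slice per pooled channel."""
    pooled = np.stack([F.mean(axis=0), F.max(axis=0)])   # (2, H, W)
    _, m, _ = kernel.shape
    p = m // 2
    xp = np.pad(pooled, ((0, 0), (p, p), (p, p)))
    H, W = pooled.shape[1:]
    out = np.zeros((H, W))
    for i in range(H):                           # naive 'same' convolution
        for j in range(W):
            out[i, j] = np.sum(xp[:, i:i + m, j:j + m] * kernel)
    return sigmoid(out)                          # spatial attention map, (H, W)

def cbam(F, W0, W1, kernel):
    """F' = M_C(F) * F, then F'' = M_S(F') * F' (element-wise products)."""
    Fp = channel_attention(F, W0, W1)[:, None, None] * F
    return spatial_attention(Fp, kernel)[None, :, :] * Fp
```

Both attention maps lie in (0, 1) because of the sigmoid, so the module reweights rather than replaces the input features.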
S3, training the fish image recognition network constructed in step S2 with the training data set, and obtaining the fish image recognition model after training is complete.
Specifically, when training the fish image recognition network, the Adam algorithm is used to optimize the network parameters, the output layer is classified with a Softmax function, and a cross-entropy loss function is used for training. Adam is adopted because it combines the advantages of the adaptive gradient algorithm (AdaGrad) and root-mean-square propagation (RMSProp): it dynamically adjusts the learning rate of each parameter by computing first-moment and second-moment estimates of that parameter's gradient.
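For reference, one Adam update step as just described (moving-average moment estimates with bias correction) might look as follows; the hyperparameter values are the usual Adam defaults, assumed here rather than taken from the patent:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam parameter update: exponential moving averages of the
    gradient (first moment) and squared gradient (second moment), each
    bias-corrected, scale a per-parameter learning rate."""
    m = b1 * m + (1 - b1) * grad                 # first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2            # second-moment estimate
    m_hat = m / (1 - b1 ** t)                    # bias correction
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

Repeatedly applying `adam_step` with the gradient of a simple loss such as x² drives the parameter toward the minimum.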
The Softmax function is expressed as:

σ(z)_j = e^{z_j} / Σ_{k=1}^{n} e^{z_k},  j = 1, 2, …, n

where σ(z)_j is the normalized output of the j-th neuron, z_j is the output value of the j-th neuron, and n is the number of output neurons (classes).
It should be noted that the Softmax function maps the output values of multiple neurons into the interval [0, 1]; each value represents the probability that the sample belongs to the corresponding class, and the values sum to 1.
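A minimal sketch of the Softmax mapping described above (the max-subtraction is a standard numerical-stability trick added here, not part of the patent text):

```python
import numpy as np

def softmax(z):
    """sigma(z)_j = exp(z_j) / sum_k exp(z_k); subtracting max(z) first
    guards against overflow without changing the result."""
    e = np.exp(z - np.max(z))
    return e / e.sum()
```

The outputs are all in (0, 1) and sum to 1, so they can be read directly as class probabilities.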
The cross-entropy loss function is expressed as:

G_loss = -(1/m) Σ_{i=1}^{m} Σ_{j=1}^{n} ŷ_{ij} log(y_{ij})

where G_loss is the loss value, m is the number of samples in the current input batch, n is the number of classes, ŷ_{ij} is the true label, and y_{ij} is the predicted label (output probability).
It should be noted that the cross-entropy function describes the distance between the actual output probability distribution and the expected output probability distribution; the smaller its value, the better the learning effect during model training.
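The batch cross-entropy loss above can be sketched as follows (the small `eps` guard against log(0) is an added assumption for numerical safety):

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """G_loss = -(1/m) * sum_ij y_true[i, j] * log(y_pred[i, j]) for a
    batch of m samples; y_true holds one-hot labels and y_pred holds the
    predicted class probabilities (e.g. Softmax outputs)."""
    m = y_true.shape[0]
    return -np.sum(y_true * np.log(y_pred + eps)) / m
```

A uniform prediction over n classes gives a loss of log(n), and the loss shrinks toward 0 as predictions approach the true labels.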
S4, screening the fish images to be recognized, applying data enhancement and normalizing their sizes, then recognizing the processed images with the fish image recognition model to obtain the recognition result.
Specifically, the screening may be manual or automatic (performed by screening software), with the following specific requirements: remove fish images in which the fish features are unclear or more than 1/3 incomplete, remove fish images with a resolution below 100 pixels, and remove fish images that do not belong to any target fish category.
Specifically, the data enhancement uses the torchvision image library and includes: horizontal flipping, random cropping, adding Gaussian noise, adjusting image brightness (brightening or dimming), and random rotation. The random rotation angle may be 30, 60, 90 or 270 degrees, chosen according to actual requirements.
After data enhancement, all fish pictures are normalized to a uniform size, for example 224 × 224 pixels, determined according to actual conditions.
To evaluate the recognition accuracy of the fish image recognition model, this embodiment uses the Top-1 accuracy (Acc_Top-1) as the evaluation criterion. The Top-1 accuracy is the probability that the fish class with the maximum probability in the final output probability vector matches the correct fish class, with the formula:

Acc_Top-1 = N_Top-1 / N

where N is the total number of images and N_Top-1 is the number of correctly classified images.
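The Top-1 accuracy formula can be computed directly from the output probability vectors; a sketch:

```python
import numpy as np

def top1_accuracy(probs, labels):
    """Acc_Top-1 = N_Top-1 / N: the fraction of images whose highest-
    probability class (argmax of the output vector) matches the true class."""
    return float(np.mean(np.argmax(probs, axis=1) == labels))
```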
In summary, the fish image recognition method collects fish images, constructs a training data set, combines a deep residual network with CBAM modules to build the fish image recognition network, trains that network on the training data set to obtain the fish image recognition model, and finally recognizes the fish images to be recognized with the model. Because the method is built on the CBAM module, a lightweight general-purpose module that can be integrated seamlessly into any CNN architecture with almost no impact on efficiency or computing power, it supports end-to-end training, improves the feature expression capability of the network, and thereby improves the convergence speed and test accuracy of the fish image recognition model; the trained model has a simple structure and a good recognition effect.
Example 2: referring to the drawings, the present embodiment provides a fish image recognition system, including:
the data acquisition module 1 is used for acquiring fish image data;
the training data set construction module 2 is used for screening the fish images, applying data enhancement and normalizing sizes, determining classification labels and constructing a training data set;
the fish image recognition network construction module 3 is used for adding a CBAM module into each residual block of a deep residual network to construct a fish image recognition network;
the training module 4 is used for training the constructed fish image recognition network by utilizing the training data set, and obtaining a fish image recognition model after the training is finished;
and the identification module 5 is used for identifying the fish image to be identified according to the fish image identification model.
Referring to fig. 2, the CBAM module comprises a channel attention module and a spatial attention module arranged in sequence. Given an input feature map, the CBAM module successively infers attention maps along the two separate dimensions of channel and space, then multiplies the attention maps with the input feature map to obtain a refined adaptive feature map, the final output feature map of the CBAM module, expressed as:

F′ = M_C(F) ⊗ F
F″ = M_S(F′) ⊗ F′

where F″ is the final output feature map, ⊗ denotes element-wise multiplication, F′ is the output feature map of the channel attention module, M_C(F) is the channel attention map inferred by the channel attention module, M_S(F′) is the spatial attention map inferred by the spatial attention module, and F is the input feature map.
Referring to fig. 3, the channel attention module aggregates the spatial information of the input feature map F using average pooling and maximum pooling operations to generate two different spatial context descriptors, and forwards both descriptors to a shared network, a multi-layer perceptron (MLP) with one hidden layer, to generate the channel attention map. After the shared network is applied to each descriptor, the output feature vectors are merged by element-wise addition, giving the channel attention map:

M_C(F) = σ(MLP(Avg(F)) + MLP(Max(F))) = σ(W_1(W_0(F_avg^c)) + W_1(W_0(F_max^c)))

where Avg(·) is the average pooling function, Max(·) is the maximum pooling function, MLP(·) is the multi-layer perceptron output function, F_avg^c and F_max^c are the features obtained by average and maximum pooling of F, W_0(·) and W_1(·) are the linear functions of the first and second layers of the shared network, and σ(·) is the sigmoid function.
Referring to fig. 4, the spatial attention module aggregates the channel information of the feature map F′ with average pooling and maximum pooling operations to generate two 2-D maps, representing the average-pooled and maximum-pooled features across the channel dimension; these two maps are concatenated and convolved by a standard convolution layer to generate the spatial attention map, expressed as:

M_S(F′) = σ(f^{m×m}([F′_avg^s ; F′_max^s]))

where f^{m×m}(·) is a convolution with an m×m kernel, F′_avg^s is the feature obtained by average pooling of the feature map F′, and F′_max^s is the feature obtained by maximum pooling of F′.
Specifically, when training the constructed fish image recognition network with the training data set, the Adam algorithm is used to optimize the network parameters, the output layer is classified with a Softmax function, and a cross-entropy loss function is used for training. Adam combines the advantages of the adaptive gradient algorithm (AdaGrad) and root-mean-square propagation (RMSProp): it dynamically adjusts the learning rate of each parameter by computing first-moment and second-moment estimates of that parameter's gradient.
The Softmax function is expressed as:

σ(z)_j = e^{z_j} / Σ_{k=1}^{n} e^{z_k},  j = 1, 2, …, n

where σ(z)_j is the normalized output of the j-th neuron, z_j is the output value of the j-th neuron, and n is the number of output neurons (classes).
It should be noted that the Softmax function maps the output values of multiple neurons into the interval [0, 1]; each value represents the probability that the sample belongs to the corresponding class, and the values sum to 1.
The cross-entropy loss function is expressed as:

G_loss = -(1/m) Σ_{i=1}^{m} Σ_{j=1}^{n} ŷ_{ij} log(y_{ij})

where G_loss is the loss value, m is the number of samples in the current input batch, n is the number of classes, ŷ_{ij} is the true label, and y_{ij} is the predicted label (output probability).
It should be noted that the cross-entropy function describes the distance between the actual output probability distribution and the expected output probability distribution; the smaller its value, the better the learning effect during model training.
In the fish recognition system, the data acquisition module collects the fish images, the training data set construction module builds the training data set, the fish image recognition network construction module combines the deep residual network with CBAM modules to build the fish image recognition network, the training module then trains this network on the training data set to obtain the fish image recognition model, and finally the recognition module recognizes the fish images to be recognized with that model. When constructing the fish image recognition network, the system combines the residual network with the CBAM module; because CBAM is a lightweight general-purpose module that can be integrated seamlessly into any CNN architecture with almost no impact on efficiency or computing power, it supports end-to-end training, improves the feature expression capability of the network, and thereby improves the convergence speed and test accuracy of the fish image recognition model; the trained model has a simple structure and a good recognition effect.
Example 3: the present embodiment provides a fish image identification apparatus comprising a computer memory, a computer processor, and a computer program stored in the computer memory and executable on the computer processor, the computer program being configured to implement the steps of the fish image identification method of embodiment 1.
The above method is described below with reference to specific examples.
Fish images are collected; the collected images are screened, enhanced and size-normalized, classification labels are determined, and a training data set is constructed.
The screening method specifically comprises: removing fish images in which the fish features are unclear or more than 1/3 incomplete, removing fish images with a resolution lower than 100, and removing, by manual selection, fish images that differ from the target fish category. Screening improves the quality of the training data set; the screened fish images cover 21 common fish species, 300 images in total, in JPG or PNG format.
Because the number of fish images obtained by manual screening is small relative to the training sample size required by the deep residual network, data enhancement is used to expand the data set: a single picture can be expanded into multiple image copies, which greatly increases the training sample size, improves the generalization of the network, and reduces overfitting. In this embodiment, the torchvision image library is used to enhance the samples, specifically by horizontal flipping, random cropping, Gaussian noise addition, image brightness adjustment (brightening or dimming) and random rotation of the image. The random rotation angles are 30, 60, 90 and 270 degrees.
After data enhancement, all fish images were uniformly resized to 224 × 224 pixels.
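In the embodiment these operations would come from torchvision (e.g. transforms.RandomHorizontalFlip, transforms.RandomCrop, transforms.ColorJitter, transforms.RandomRotation). As a dependency-free sketch of what the listed operations do, here is a toy implementation on a greyscale image stored as nested lists; all names and values are illustrative, not the patented code:

```python
import random

def hflip(img):
    """Horizontal flip: reverse each row of the (H x W) image."""
    return [row[::-1] for row in img]

def rotate90(img):
    """Rotate 90 degrees clockwise by reversing rows, then transposing."""
    return [list(col) for col in zip(*img[::-1])]

def adjust_brightness(img, factor):
    """Brighten (factor > 1) or dim (factor < 1), clamped to [0, 255]."""
    return [[min(255, max(0, p * factor)) for p in row] for row in img]

def add_gaussian_noise(img, sigma=5.0):
    """Add zero-mean Gaussian noise to every pixel."""
    return [[p + random.gauss(0.0, sigma) for p in row] for row in img]

img = [[10, 20], [30, 40]]
assert hflip(img) == [[20, 10], [40, 30]]
assert rotate90(img) == [[30, 10], [40, 20]]
```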
Finally, a classification label is added to each image to form the training data set, which is written to a CSV file so that it can be conveniently read when training the deep residual network.
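The CSV layout is not specified in the embodiment; one plausible minimal format, written with Python's standard csv module (the file names and labels below are hypothetical), is a header row followed by one (image path, label) pair per line:

```python
import csv
import io

# Hypothetical (path, label) pairs making up the labelled training set.
samples = [("fish_001.jpg", "carp"), ("fish_002.png", "tilapia")]

buf = io.StringIO()  # stands in for an opened CSV file on disk
writer = csv.writer(buf)
writer.writerow(["image_path", "label"])  # header row read back by the loader
writer.writerows(samples)

# Reading it back the way a dataset loader might.
rows = list(csv.reader(io.StringIO(buf.getvalue())))
assert rows[0] == ["image_path", "label"]
assert rows[1] == ["fish_001.jpg", "carp"]
```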
ResNet50 is used as the backbone network of this embodiment, and a CBAM module is added to each residual block of the convolutional layers of ResNet50 to construct the fish image identification network. Adding CBAM to ResNet50 improves the feature-expression capability of the network on the one hand, and on the other hand tells the network what to pay attention to, enhancing the representation of specific regions.
Referring to fig. 5, for the deep residual network, if the optimal feature output is y and the input received by the CBAM module is x, the desired nonlinear processing result (i.e., the residual) provided by the CBAM module is F(x) = y − x, so that the output is F(x) + x. If the preceding shallow layers already provide the optimal output, i.e., x = y, then F(x) should approach 0, which ensures that the error rate of the fish image identification network does not increase. In this embodiment, the residual block comprises three convolutional layers followed by channel attention and spatial attention; the residual block structure actually used is shown in fig. 6. The results of fish image recognition by the fish image recognition model are shown in fig. 7.
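To make the F(x) + x structure with CBAM concrete, here is a toy NumPy sketch. It is not the patented implementation: the m × m spatial convolution is reduced to a 1 × 1 mixing of the two pooled maps, the shared-MLP weights are random placeholders, and the three convolution layers of the residual block are omitted so that only the attention path and the skip connection remain:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W0, W1):
    """Mc(F) = sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F))).
    F has shape (C, H, W); W0, W1 are the shared two-layer MLP weights."""
    avg = F.mean(axis=(1, 2))                     # (C,) average-pooled descriptor
    mx = F.max(axis=(1, 2))                       # (C,) max-pooled descriptor
    mlp = lambda v: W1 @ np.maximum(W0 @ v, 0.0)  # shared MLP with ReLU hidden layer
    return sigmoid(mlp(avg) + mlp(mx))            # (C,) channel attention weights

def spatial_attention(F, kernel):
    """Ms(F) = sigmoid(conv([AvgPool_c(F); MaxPool_c(F)])); the m x m
    convolution is replaced here by a 1 x 1 mixing weight for brevity."""
    avg = F.mean(axis=0)                          # (H, W) channel-wise average
    mx = F.max(axis=0)                            # (H, W) channel-wise max
    return sigmoid(kernel[0] * avg + kernel[1] * mx)

rng = np.random.default_rng(0)
F = rng.standard_normal((4, 3, 3))                # toy feature map, C=4, H=W=3
Mc = channel_attention(F, rng.standard_normal((2, 4)), rng.standard_normal((4, 2)))
Fp = Mc[:, None, None] * F                        # F'  = Mc(F) (x) F
Ms = spatial_attention(Fp, np.array([0.5, 0.5]))
Fpp = Ms[None, :, :] * Fp                         # F'' = Ms(F') (x) F'
out = Fpp + F                                     # skip connection: F(x) + x
assert out.shape == F.shape
```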
The above-described embodiments are intended to illustrate rather than limit the invention; any modifications and variations within the spirit and scope of the claims fall within its protection.
Claims (10)
1. A fish image identification method is characterized by comprising the following specific steps:
s1, collecting fish images, screening the fish images, performing data enhancement and uniform size processing, determining classification labels and constructing a training data set;
s2, adding a CBAM module into each residual block of the deep residual network to construct a fish image identification network;
s3, training the fish image recognition network constructed in the step S2 by using a training data set, and obtaining a fish image recognition model after training is finished;
and S4, screening the fish images to be recognized, performing data enhancement and uniform size processing, and recognizing the processed fish images with the fish image recognition model to obtain a recognition result.
2. The fish image recognition method of claim 1, wherein in steps S1 and S4, the specific requirements of the screening are: removing fish images in which the fish features are unclear or more than 1/3 incomplete, removing fish images with a resolution lower than 100, and removing fish images that differ from the target fish category.
3. The fish image recognition method of claim 1, wherein in steps S1 and S4 the data enhancement uses the torchvision image library and specifically comprises: horizontally flipping, randomly cropping, adding Gaussian noise to, adjusting the brightness of, and randomly rotating the image.
4. The fish image recognition method of claim 1, wherein in step S2 the CBAM module comprises a channel attention module and a spatial attention module arranged in sequence; given an input feature map, the CBAM module sequentially infers attention maps along the two separate dimensions of channel and space, and multiplies each attention map with its input feature map to obtain a refined adaptive feature map, which is the final output feature map of the CBAM module and is represented as:
F' = Mc(F) ⊗ F,  F'' = Ms(F') ⊗ F'
wherein F'' is the final output feature map, ⊗ denotes element-wise multiplication, F' is the output feature map of the channel attention module, Mc(F) is the channel attention map inferred by the channel attention module, Ms(F') is the spatial attention map inferred by the spatial attention module, and F is the input feature map.
5. The fish image recognition method of claim 4, wherein the channel attention module aggregates the spatial information of the input feature map F using average pooling and maximum pooling operations to generate two different spatial context descriptors, and forwards both descriptors to a shared network to generate the channel attention, the shared network consisting of a multi-layer perceptron (MLP) with one hidden layer.
6. The fish image recognition method of claim 5, wherein after the shared network is applied to each descriptor, the output feature vectors are merged by element-wise addition to obtain the channel attention map, represented as:
Mc(F) = σ(MLP(Avg(F)) + MLP(Max(F))) = σ(W1(W0(F_avg^c)) + W1(W0(F_max^c)))
where Avg(·) is the average pooling function, Max(·) is the maximum pooling function, MLP(·) denotes the output of the multi-layer perceptron, F_avg^c is the feature obtained by average pooling of the feature map F, F_max^c is the feature obtained by maximum pooling of the feature map F, W0(·) is the linear function of the first layer of the shared network, W1(·) is the linear function of the second layer of the shared network, and σ(·) denotes the sigmoid function.
7. The fish image recognition method of claim 4, wherein the spatial attention module aggregates the channel information of the feature map F' using average pooling and maximum pooling operations to generate two 2D maps, representing the average-pooled and max-pooled features across the channel dimension, concatenates the two maps, and convolves them with a standard m × m convolution layer to generate the spatial attention map, represented as:
Ms(F') = σ(f^(m×m)([Avg(F'); Max(F')])) = σ(f^(m×m)([F'_avg^s; F'_max^s]))
where f^(m×m) denotes a convolution with an m × m kernel, [·;·] denotes concatenation along the channel dimension, and σ(·) denotes the sigmoid function.
8. The fish image recognition method of claim 1, wherein in step S3, when training the fish image recognition network, the Adam algorithm is used for network parameter optimization, the output layer performs classification via the Softmax function, and the cross-entropy loss function is used to optimize the network.
9. A fish image recognition system, comprising:
the data acquisition module is used for acquiring fish image data;
the training data set construction module is used for screening fish images, enhancing data and uniformly processing sizes, determining classification labels and constructing a training data set;
the fish image identification network construction module is used for adding a CBAM module into each residual block of the deep residual network to construct a fish image identification network;
the training module is used for training the constructed fish image recognition network by utilizing the training data set, and obtaining a fish image recognition model after the training is finished;
and the identification module is used for identifying the fish image to be identified according to the fish image identification model.
10. A fish image identification device comprising a computer memory, a computer processor, and a computer program stored in the computer memory and executable on the computer processor, characterized in that the computer program is configured to implement the steps of the fish image identification method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110955820.6A CN113627558A (en) | 2021-08-19 | 2021-08-19 | Fish image identification method, system and equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113627558A true CN113627558A (en) | 2021-11-09 |
Family
ID=78386721
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110955820.6A Pending CN113627558A (en) | 2021-08-19 | 2021-08-19 | Fish image identification method, system and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113627558A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110197205A (en) * | 2019-05-09 | 2019-09-03 | 三峡大学 | A kind of image-recognizing method of multiple features source residual error network |
CN110781921A (en) * | 2019-09-25 | 2020-02-11 | 浙江农林大学 | Depth residual error network and transfer learning-based muscarinic image identification method and device |
CN111291670A (en) * | 2020-01-23 | 2020-06-16 | 天津大学 | Small target facial expression recognition method based on attention mechanism and network integration |
CN111563473A (en) * | 2020-05-18 | 2020-08-21 | 电子科技大学 | Remote sensing ship identification method based on dense feature fusion and pixel level attention |
CN112200241A (en) * | 2020-10-09 | 2021-01-08 | 山东大学 | Automatic sorting method for fish varieties based on ResNet transfer learning |
CN112241679A (en) * | 2020-09-14 | 2021-01-19 | 浙江理工大学 | Automatic garbage classification method |
CN112651438A (en) * | 2020-12-24 | 2021-04-13 | 世纪龙信息网络有限责任公司 | Multi-class image classification method and device, terminal equipment and storage medium |
Non-Patent Citations (1)
Title |
---|
SANGHYUN WOO ET AL.: "CBAM: Convolutional Block Attention Module", ECCV 2018, pages 3-19 *
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114240686A (en) * | 2022-02-24 | 2022-03-25 | 深圳市旗扬特种装备技术工程有限公司 | Wisdom fishery monitoring system |
CN114240686B (en) * | 2022-02-24 | 2022-06-03 | 深圳市旗扬特种装备技术工程有限公司 | Wisdom fishery monitoring system |
CN115482419A (en) * | 2022-10-19 | 2022-12-16 | 江苏雷默智能科技有限公司 | Data acquisition and analysis method and system for marine fishery products |
CN115482419B (en) * | 2022-10-19 | 2023-11-14 | 江苏雷默智能科技有限公司 | Data acquisition and analysis method and system for marine fishery products |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113537106B (en) | Fish ingestion behavior identification method based on YOLOv5 | |
CN111046880A (en) | Infrared target image segmentation method and system, electronic device and storage medium | |
Barreiros et al. | Zebrafish tracking using YOLOv2 and Kalman filter | |
CN112598713A (en) | Offshore submarine fish detection and tracking statistical method based on deep learning | |
Alkhudaydi et al. | An exploration of deep-learning based phenotypic analysis to detect spike regions in field conditions for UK bread wheat | |
CN110781921A (en) | Depth residual error network and transfer learning-based muscarinic image identification method and device | |
CN111611889B (en) | Miniature insect pest recognition device in farmland based on improved convolutional neural network | |
Li et al. | Detection of uneaten fish food pellets in underwater images for aquaculture | |
CN113627558A (en) | Fish image identification method, system and equipment | |
WO2021238586A1 (en) | Training method and apparatus, device, and computer readable storage medium | |
CN112749654A (en) | Deep neural network model construction method, system and device for video fog monitoring | |
CN113349111A (en) | Dynamic feeding method, system and storage medium for aquaculture | |
CN115131325A (en) | Breaker fault operation and maintenance monitoring method and system based on image recognition and analysis | |
CN117253192A (en) | Intelligent system and method for silkworm breeding | |
Sosa-Trejo et al. | Vision-based techniques for automatic marine plankton classification | |
Miranda et al. | Pest identification using image processing techniques in detecting image pattern through neural network | |
CN114037737B (en) | Neural network-based offshore submarine fish detection and tracking statistical method | |
Nguyen et al. | Joint image deblurring and binarization for license plate images using deep generative adversarial networks | |
CN114581769A (en) | Method for identifying houses under construction based on unsupervised clustering | |
Lian et al. | A pulse-number-adjustable MSPCNN and its image enhancement application | |
CN116152699B (en) | Real-time moving target detection method for hydropower plant video monitoring system | |
Shetty et al. | Plant Disease Detection for Guava and Mango using YOLO and Faster R-CNN | |
Kaur et al. | Deep learning with invariant feature based species classification in underwater environments | |
CN116311086B (en) | Plant monitoring method, training method, device and equipment for plant monitoring model | |
CN112949438B (en) | Fruit visual classification method and system based on Bayesian network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||