CN112200241A - Automatic sorting method for fish varieties based on ResNet transfer learning - Google Patents
- Publication number
- CN112200241A (application number CN202011071983.XA)
- Authority
- CN
- China
- Prior art keywords
- fish
- images
- layer
- varieties
- network model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F18/2414 — Pattern recognition; classification techniques; smoothing the distance, e.g. radial basis function networks [RBFN]
- G06F18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06N3/045 — Neural networks; combinations of networks
- G06N3/08 — Neural networks; learning methods
Abstract
The invention relates to a method for automatically sorting fish varieties based on ResNet transfer learning, comprising the following steps: A. training a fish classification network model using transfer learning: (1) collecting images of different varieties of fish in a detection box; (2) performing data enhancement; (3) forming a fish image dataset; (4) dividing it into a training set, a validation set and a test set; (5) pre-training a ResNet34 model; (6) constructing the fish classification network model; (7) loading the weights of the ResNet34 model pre-trained in step (5) into the fish classification network model; (8) training the fish classification network model on the training set; B. automatically sorting fish varieties with the trained fish classification network model. By adding a visual spatial attention module to ResNet34, the invention can ignore irrelevant information in fish images, focus on regions that are important for distinguishing fish, and improve the accuracy of fish sorting.
Description
Technical Field
The invention relates to a method for automatically sorting fish varieties based on ResNet transfer learning, and belongs to the technical field of fish variety identification.
Background
In aquaculture, fry purchased by farmers are often mixed with other varieties. For example, eel fry come in many varieties, and introduced Anguilla marmorata fry are frequently mixed with Pacific two-color eel fry; at the fry stage the two are similar in size, weight and shape and are hard to tell apart, which seriously affects the quality and sales of the Anguilla marmorata fry and causes economic losses for farmers. At present, fry sorting relies mainly on manual work, which requires workers familiar with the fish varieties; it consumes a large amount of labor, is inefficient, and is prone to sorting errors. Designing an automatic fish sorting method to replace manual sorting therefore has significant application value.
Existing automatic fish sorting methods fall roughly into two types. The first classifies fish by weight, size and contour data; however, weighing requires removing the fish from water, so this method is unsuitable for live fish, and it cannot effectively distinguish fry of different varieties with similar sizes and contour shapes. The second collects fish images and builds classification models using image recognition technology; this has been applied to fish traceability, screening diseased or dead fish, sorting caught aquatic products, and identifying fish body parts, but image recognition has not yet been applied to automatically sorting varieties within the same type of fry.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a method for automatically sorting fish varieties based on ResNet transfer learning, which can automatically sort the varieties of the same fish fry.
Interpretation of terms:
1. ResNet is a residual network proposed by Microsoft Research in 2015 to address the degradation problem of deep networks. Its residual modules allow the network to have a very deep structure, and ResNet won first place that year in the ImageNet classification and detection tasks and in the COCO detection and segmentation tasks.
2. Transfer learning is a machine learning method that adapts an existing model to a new domain, for example by reusing a pre-trained model on another task, migrating knowledge from a data-rich domain to a domain with less data.
3. The ImageNet dataset is a large visual database for visual object recognition research, with more than 14 million labeled images; it is currently the largest image recognition database in the world.
4. The ResNet34 model, as shown in FIG. 1, consists of 33 convolutional layers, 1 fully-connected layer and 2 pooling layers. Except for the 1st convolutional layer, the remaining 32 convolutional layers are divided into 4 groups containing 3, 4, 6 and 3 residual learning units respectively. Each residual learning unit contains 2 sequentially connected convolutional layers, with a shortcut connection established between the input and the output of the 2 convolutional layers; when the input and output of a residual learning unit have the same dimension, an identity (solid-line) shortcut is used, and when the dimension increases, a projection (dotted-line) shortcut is used, implemented as a convolutional layer with a 1×1 kernel and stride 2.
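The layer count described above can be checked with a few lines of arithmetic (a sketch; the group sizes are taken from the text):

```python
# ResNet34 layer accounting, following the description above.
groups = [3, 4, 6, 3]                          # residual learning units per group
convs_in_groups = sum(2 * n for n in groups)   # each unit has 2 conv layers -> 32
total_conv = 1 + convs_in_groups               # plus the 1st conv layer conv1 -> 33
total_weighted = total_conv + 1                # plus 1 fully-connected layer -> 34
print(total_conv, total_weighted)              # 33 34
```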
5. The Python image processing library PIL provides an ImageEnhance module dedicated to image enhancement, which can adjust the brightness, contrast, color and sharpness of an image; Contrast is a contrast enhancement class used to adjust image contrast. Each enhancement operation is performed by the enhance method of the corresponding class: an Image object is passed to the class constructor as a parameter, and a new Image object is returned.
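A minimal usage sketch of the Contrast class described above (the generated grey image is a stand-in for a real fish photo):

```python
from PIL import Image, ImageEnhance

# Create a small test image in place of a real fish photo.
img = Image.new("RGB", (64, 64), (100, 120, 140))
enhancer = ImageEnhance.Contrast(img)   # the Image object is passed to the constructor
enhanced = enhancer.enhance(1.5)        # factor 1.5, as used later in the patent
# enhance() returns a new Image object; the original is unchanged
print(enhanced.size)
```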
The technical scheme of the invention is as follows:
a method for automatically sorting fish varieties based on ResNet transfer learning comprises the following steps:
A. training fish classification network model by using transfer learning
(1) Collecting images of different varieties of fishes in a detection box;
(2) carrying out data enhancement on the collected images of different varieties of fishes;
(3) the images of different varieties of fish collected in step (1) and the data-enhanced images from step (2) together form a fish image dataset; the images in the dataset are resized to 224 × 224 pixels;
(4) dividing the fish image dataset into a training set, a validation set and a test set in the ratio 8:1:1;
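The 8:1:1 split can be sketched as follows (the function name and file names are illustrative, not from the patent):

```python
import random

def split_dataset(paths, ratios=(0.8, 0.1, 0.1), seed=42):
    """Shuffle image paths and split them into train/val/test by the given ratios."""
    paths = list(paths)
    random.Random(seed).shuffle(paths)   # deterministic shuffle for reproducibility
    n = len(paths)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    return (paths[:n_train],
            paths[n_train:n_train + n_val],
            paths[n_train + n_val:])     # remainder goes to the test set

train, val, test = split_dataset([f"img_{i}.jpg" for i in range(2100)])
print(len(train), len(val), len(test))   # 1680 210 210
```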
(5) pre-training a ResNet34 model with an ImageNet dataset;
(6) constructing a fish classification network model using an improved ResNet34 structure: a visual spatial attention module is added to the ResNet34 model, and the number of nodes in the last fully-connected layer of the ResNet34 model is changed to the number of fish varieties to be predicted;
(7) loading the weights of the ResNet34 model pre-trained in step (5) into the fish classification network model, except the weights of the last fully-connected layer of the pre-trained model; initializing the parameters of the fish classification network model with the pre-trained ResNet34 weights saves subsequent training time and reduces the risk of overfitting on a small dataset.
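The partial weight loading of step (7) can be sketched with plain dictionaries standing in for real weight tensors; in an actual PyTorch implementation this filtered dictionary would be passed to `load_state_dict(..., strict=False)`. All names below are illustrative:

```python
def filter_pretrained_weights(pretrained, model_keys, skip_prefix="fc"):
    """Keep only pretrained entries whose names exist in the new model and
    that do not belong to the final fully-connected layer."""
    return {k: v for k, v in pretrained.items()
            if k in model_keys and not k.startswith(skip_prefix)}

# Toy state dicts: integers stand in for weight tensors.
pretrained = {"conv1.weight": 1, "layer1.0.conv1.weight": 2,
              "fc.weight": 3, "fc.bias": 4}
model_keys = {"conv1.weight", "layer1.0.conv1.weight", "fc.weight", "fc.bias",
              "attention1.conv6.weight"}   # new attention layers keep their own init
loaded = filter_pretrained_weights(pretrained, model_keys)
print(sorted(loaded))   # ['conv1.weight', 'layer1.0.conv1.weight']
```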
(8) Training a fish classification network model by adopting a training set;
B. automatically sorting the fish varieties through the trained fish classification network model in the step A
(9) Sending the fish to be detected into the detection box, one fish at a time, and collecting several (3) frames of fish images captured by the camera in the detection box;
(10) sending each frame of fish image into the fish classification network model trained in step A for detection, and outputting the fish variety;
(11) determining the final predicted fish variety from the variety detected in each frame, automatically opening the gate corresponding to that variety, and sending the fish into the fishpond of the corresponding variety, thereby sorting the fish varieties.
According to the invention, in step (2), data enhancement of the collected images of different varieties of fish means applying steps a-d to each collected image:
a. rotation: rotating the image clockwise by 90° and by 270°, and saving the two rotated images;
b. cropping: cropping 20 pixels from the top, bottom, left and right sides of the image, and saving the cropped image;
c. flipping: flipping the image horizontally and vertically; horizontal flipping swaps left and right pixels about the vertical axis through the image center, vertical flipping swaps top and bottom pixels about the horizontal axis through the image center; the two flipped images are saved;
d. contrast enhancement: enhancing contrast with the Contrast class of the ImageEnhance module of the Python image processing library PIL, with the enhancement factor set to 1.5.
Preferably, in step (1), each image contains only one fish, and for each fish several (5) images of its various swimming postures in water are collected, including images from various viewing angles such as the side, back and abdomen of the fish;
preferably, according to the invention, the fish classification network model comprises two visuospatial attention modules and a ResNet34 model;
the ResNet34 model comprises 33 convolutional layers, 2 pooling layers and 1 fully-connected layer; except for the 1st convolutional layer conv1, the remaining 32 convolutional layers are divided into 4 groups, conv2_x, conv3_x, conv4_x and conv5_x, containing 3, 4, 6 and 3 residual learning units respectively; each residual learning unit comprises 2 sequentially connected convolutional layers, with a shortcut connection established between the input and the output of the 2 convolutional layers;
the 2 pooling layers are a max pooling layer max pool and an average pooling layer average pool_1; the 1 fully-connected layer is fc;
the 1st convolutional layer conv1, the max pooling layer max pool, the 4 groups of convolutional layers conv2_x, conv3_x, conv4_x and conv5_x, the average pooling layer average pool_1 and the fully-connected layer fc are connected in sequence;
the two visual spatial attention modules have the same structure; the first is located between conv1 and max pool, and the second between conv5_x and average pool_1;
each visual spatial attention module comprises 1 average pooling layer average pool_2, 1 convolutional layer conv6 and a sigmoid activation function layer.
Further preferably, the output of the residual learning unit is represented by formula (i):
q=F(p)+p (Ⅰ)
in formula (Ⅰ), p is the input of the residual learning unit, F(p) is the output of p after the 2 convolutional layers, and q is the output of the residual learning unit;
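Formula (Ⅰ) can be illustrated numerically; the function `F` below is a stand-in for the two convolutional layers of the unit:

```python
import numpy as np

def residual_unit(p, F):
    """q = F(p) + p: the shortcut adds the input to the 2-conv branch output."""
    return F(p) + p

p = np.array([1.0, -2.0, 3.0])
F = lambda x: 0.5 * x          # stand-in for the 2 convolutional layers
q = residual_unit(p, F)
print(q)                       # [ 1.5 -3.   4.5]
```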
further preferably, the sigmoid activation function is represented by formula (II):
a=sigmoid(conv6(average pool_2(x))) (Ⅱ)
in formula (Ⅱ), x is the input of the visual spatial attention module and a is its output; a is multiplied with each channel of x, and the result is input into the max pooling layer max pool or the average pooling layer average pool_1.
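A minimal numerical sketch of formula (Ⅱ) and the channel-wise reweighting; the identity function stands in for the real 7×7 convolution conv6:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spatial_attention(x, conv=lambda m: m):
    """Sketch of the visual spatial attention module:
    a = sigmoid(conv6(average_pool_2(x))), then x is reweighted channel-wise.
    `conv` stands in for the real 7x7 convolution (identity here)."""
    pooled = x.mean(axis=0, keepdims=True)   # average across channels -> (1, H, W)
    a = sigmoid(conv(pooled))                # attention map with 1 channel
    return x * a                             # broadcast-multiply onto every channel

x = np.ones((64, 4, 4))                      # (channels, H, W) toy feature map
out = spatial_attention(x)
print(out.shape)                             # each value becomes 1 * sigmoid(1) ≈ 0.731
```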
According to the present invention, in step (11), the judging method is preferably: if the same fish variety is detected in all frames, that variety is taken as the final prediction; otherwise, return to step (9).
After the final fish variety is determined, the gate corresponding to that variety is opened and the fish is sent into the corresponding fishpond; if the detection results of the 3 frames are inconsistent, 3 new frames are captured from the camera and detection is repeated until all 3 frames agree.
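The frame-agreement rule can be sketched as follows (the function name and variety labels are illustrative):

```python
def decide_variety(frame_predictions):
    """Return the variety if all frames agree, else None (signalling re-capture)."""
    if len(set(frame_predictions)) == 1:
        return frame_predictions[0]
    return None

print(decide_variety(["marmorata", "marmorata", "marmorata"]))  # marmorata
print(decide_variety(["marmorata", "bicolor", "marmorata"]))    # None
```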
The invention has the beneficial effects that:
1. The invention provides a method for automatically sorting fish varieties based on ResNet transfer learning, applying deep-learning-based image processing to fish variety identification. It realizes automatic sorting of fry varieties of the same type, requires no manual inspection, frees up labor, effectively improves working efficiency, and avoids sorting errors caused by workers' subjective judgment.
2. The invention adopts transfer learning, training the fish classification model from ResNet34 weights pre-trained on ImageNet, so that a good model can be trained quickly even with a small fish dataset.
3. The ResNet34 network is improved: a visual spatial attention module is added to ResNet34 so that irrelevant information in fish images can be ignored, regions important for distinguishing fish are emphasized, and the accuracy of fish sorting is improved.
4. The invention solves the problem that some current methods cannot classify fish of similar size and weight; it can identify varieties within the same type of fry and is also effective for various caught fish products.
Drawings
FIG. 1 is a schematic diagram of the network structure of the ResNet34 model;
FIG. 2 is a schematic flow chart illustrating training of a fish classification network model using transfer learning according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a network structure of a fish classification network model according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of automatic sorting of fish species by a trained fish classification network model according to an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the figures and examples, but is not limited thereto.
Example 1
A method for automatically sorting fish varieties based on ResNet transfer learning, taking fry of two varieties, Anguilla marmorata and Pacific two-color eel, as an example. The apparatus comprises a detection box, fishpond No. 1 and fishpond No. 2, with the detection box connected to each fishpond through a pipeline with a gate. The method comprises the following steps:
A. training a fish classification network model by using transfer learning, as shown in fig. 2:
(1) acquiring images of different varieties of fish in the detection box; each image contains only one fish, and 5 images of various swimming postures in water are acquired per fish, covering viewing angles such as the side, back and abdomen as far as possible; 30 fry of each variety are photographed, giving 150 images of Anguilla marmorata fry and 150 images of Pacific two-color eel fry;
(2) performing data enhancement on the collected fish images, generating 6 new images from each image; the data enhancement comprises:
a. rotation: rotating the image clockwise by 90° and by 270°, and saving the two rotated images;
b. cropping: cropping 20 pixels from the top, bottom, left and right sides of the image, and saving the cropped image;
c. flipping: flipping the image horizontally and vertically; horizontal flipping swaps left and right pixels about the vertical axis through the image center, vertical flipping swaps top and bottom pixels about the horizontal axis through the image center; the two flipped images are saved;
d. contrast enhancement: enhancing contrast with the Contrast class of the ImageEnhance module of the Python image processing library PIL, with the enhancement factor set to 1.5, and saving the enhanced image.
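Steps a-d above can be sketched with PIL; note that PIL's `rotate` uses counter-clockwise angles, so clockwise rotation takes a negative angle. The generated image is a stand-in for a real fish photo:

```python
from PIL import Image, ImageEnhance, ImageOps

def augment(img):
    """Produce the 6 augmented images described in steps a-d (a sketch)."""
    w, h = img.size
    return [
        img.rotate(-90, expand=True),             # a. 90 degrees clockwise
        img.rotate(-270, expand=True),            # a. 270 degrees clockwise
        img.crop((20, 20, w - 20, h - 20)),       # b. crop 20 px from each side
        ImageOps.mirror(img),                     # c. horizontal flip (left-right)
        ImageOps.flip(img),                       # c. vertical flip (top-bottom)
        ImageEnhance.Contrast(img).enhance(1.5),  # d. contrast, factor 1.5
    ]

img = Image.new("RGB", (224, 224), (90, 110, 130))  # stand-in fish image
print(len(augment(img)))   # 6
```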
(3) Combining the images of different varieties of fish acquired in step (1) and the data-enhanced images from step (2) into a fish image dataset, giving 1050 images of Anguilla marmorata and 1050 images of Pacific two-color eel fry, 2100 images in total; the images in the dataset are resized to 224 × 224 pixels;
(4) dividing the fish image dataset into a training set, a validation set and a test set in the ratio 8:1:1: the training set contains 1680 images, the validation set 210 images and the test set 210 images, with the two fry varieties each contributing half of each subset;
(5) pre-training a ResNet34 model with an ImageNet dataset;
(6) constructing a fish classification network model using an improved ResNet34 structure: a visual spatial attention module is added to the ResNet34 model, and the number of nodes in the last fully-connected layer is changed to the number of fish varieties to be predicted; taking the sorting of two varieties as an example, the number of nodes in the last fully-connected layer is changed from 1000 to 2; as shown in fig. 3:
the fish classification network model comprises two visual space attention modules and a ResNet34 model;
the ResNet34 model comprises 33 convolutional layers, 2 pooling layers and 1 fully-connected layer; except for the 1st convolutional layer conv1, the remaining 32 convolutional layers are divided into 4 groups, conv2_x, conv3_x, conv4_x and conv5_x, containing 3, 4, 6 and 3 residual learning units respectively; each residual learning unit comprises 2 sequentially connected convolutional layers, with a shortcut connection established between the input and the output of the 2 convolutional layers;
the output of the residual learning unit is shown in formula (i):
q=F(p)+p (Ⅰ)
in formula (Ⅰ), p is the input of the residual learning unit, F(p) is the output of p after the 2 convolutional layers, and q is the output of the residual learning unit;
the 2 pooling layers are a max pooling layer max pool and an average pooling layer average pool_1; the 1 fully-connected layer is fc; the 1st convolutional layer conv1, the max pooling layer max pool, the 4 groups of convolutional layers conv2_x, conv3_x, conv4_x and conv5_x, the average pooling layer average pool_1 and the fully-connected layer fc are connected in sequence; the specific structure is:
group 1, conv2_x: convolution kernel 3 × 3, output channels 64;
group 2, conv3_x: convolution kernel 3 × 3, output channels 128;
group 3, conv4_x: convolution kernel 3 × 3, output channels 256;
group 4, conv5_x: convolution kernel 3 × 3, output channels 512;
the two visual spatial attention modules have the same structure; the first is located between conv1 and max pool, and the second between conv5_x and average pool_1;
each visual spatial attention module comprises 1 average pooling layer average pool_2, 1 convolutional layer conv6 and a sigmoid activation function layer. The sigmoid activation function is shown in formula (Ⅱ):
a=sigmoid(conv6(average pool_2(x))) (Ⅱ)
in formula (Ⅱ), x is the input of the visual spatial attention module and a is its output; a has 1 channel, and its other dimensions match those of x; a is multiplied with the data of each channel of x, and the result is input into the max pooling layer max pool or the average pooling layer average pool_1, so that irrelevant information in the image is ignored and regions important for distinguishing fish are emphasized;
the average pooling layer average pool_2 averages across the channels, producing an output with 1 channel;
the convolutional layer conv6 has a 7 × 7 kernel and 1 output channel;
(7) loading the weights of the ResNet34 model pre-trained in step (5) into the fish classification network model, except the weights of the last fully-connected layer of the pre-trained model; initializing the parameters of the fish classification network model with the pre-trained ResNet34 weights saves subsequent training time and reduces the risk of overfitting on a small dataset.
(8) Training the fish classification network model on the training set: the loss function is cross-entropy, optimization uses the Adam optimizer, the learning rate is set to 0.0001, the batch size is set to 16, and training runs for 90 epochs.
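The cross-entropy loss mentioned above, for the two-node output of this example, can be illustrated in a few lines (a numerical sketch, not the training code; the logit values are made up):

```python
import numpy as np

def cross_entropy(logits, label):
    """Cross-entropy of a single prediction: -log(softmax(logits)[label])."""
    z = logits - logits.max()            # stabilise the exponentials
    probs = np.exp(z) / np.exp(z).sum()
    return -np.log(probs[label])

# Two output nodes, matching the 2-variety fc layer of this example.
confident = cross_entropy(np.array([4.0, 0.0]), label=0)   # low loss
uncertain = cross_entropy(np.array([0.1, 0.0]), label=0)   # near log(2)
print(round(float(confident), 4), round(float(uncertain), 4))  # 0.0181 0.6444
```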
B. Automatically sorting fish varieties with the fish classification network model trained in step A, as shown in fig. 4:
(9) sending the fish to be detected into the detection box, one fish at a time, and collecting 3 frames of fish images captured by the camera in the detection box; after the fish enters the box, 3 frames are randomly selected within 2 s and input to the model, to reduce detection errors caused by image blur from fish movement;
(10) sending each frame of fish image into the fish classification network model trained in step A for detection, and outputting the fish variety;
(11) determining the final predicted fish variety from the varieties detected in the 3 frames, automatically opening the gate corresponding to that variety, and sending the fish into the fishpond of the corresponding variety, thereby sorting the fish varieties;
specifically, if the same variety is detected in all 3 frames, the corresponding gate is opened and the fish is sent into the corresponding fishpond; if the 3 detection results are inconsistent, 3 new frames are captured from the camera and detection is repeated until all 3 frames agree.
In this embodiment, taking the sorting of Anguilla marmorata and Pacific two-color eels as an example, experiment 1 compares the precision, recall and accuracy of the original ResNet34 and the improved ResNet34 of the present invention in classifying the two fry varieties, both trained with data enhancement and transfer learning, as shown in table 1:
TABLE 1
The results show that the visual spatial attention module added by the invention effectively improves the accuracy of fish classification.
This example also compares the accuracy and recall of the ResNet34 and ResNet50 networks in the presence of data enhancement and transfer learning training for two fry classifications, as shown in table 2:
TABLE 2
It can be seen that the ResNet34 used in the present invention is significantly better than the ResNet50 network.
This example also compares the accuracy and recall of the ResNet34 and ResNet50 networks for two fry classifications with and without transfer learning training, as shown in table 3:
TABLE 3
The accuracy and recall of ResNet34 and ResNet50 trained with transfer learning are both higher than without it, which demonstrates that the transfer learning method adopted by the invention can quickly train a good model with a small fish dataset.
This example also compares the accuracy and recall of the ResNet34 and ResNet50 networks for two fry classifications with and without data enhancement, as shown in table 4:
TABLE 4
The accuracy and recall of ResNet34 and ResNet50 with data enhancement are both higher than without it, showing that data enhancement effectively increases the number of training samples, reduces the model's dependence on particular feature attributes, and improves the generalization ability of the model.
In conclusion, compared with traditional methods, the automatic fish sorting method based on ResNet transfer learning can automatically sort fish varieties of similar size and weight, requires no manual identification, effectively improves working efficiency, and reduces the error rate of fry classification.
The above description is only a preferred embodiment of the present invention and is not intended to limit the scope of the present invention; any modification made without departing from the spirit and scope of the invention shall fall within the scope of the appended claims.
Claims (7)
1. A method for automatically sorting fish varieties based on ResNet transfer learning is characterized by comprising the following steps:
A. training fish classification network model by using transfer learning
(1) Collecting images of different varieties of fishes in a detection box;
(2) carrying out data enhancement on the collected images of different varieties of fishes;
(3) the images of different varieties of fishes collected in the step (1) and the images subjected to data enhancement in the step (2) form a fish image data set;
(4) dividing a fish image data set into a training set, a verification set and a test set;
(5) pre-training a ResNet34 model with an ImageNet dataset;
(6) constructing a fish classification network model, wherein the fish classification network model is as follows: adding a visual space attention module into a ResNet34 model, and changing the node number of the last full-connection layer of the ResNet34 model into the number of fish varieties to be predicted;
(7) loading the weight of the ResNet34 model pre-trained in the step (5) into a fish classification network model, and not loading the weight of the last full connection layer of the ResNet34 model pre-trained in the step (5);
(8) training a fish classification network model by adopting a training set;
B. automatically sorting the fish varieties through the trained fish classification network model in the step A
(9) Sending the fish to be detected into the detection box, one fish at a time, and collecting several frames of fish images captured by the camera in the detection box;
(10) sending each frame of fish image into the fish classification network model trained in step A for detection, and outputting the fish variety;
(11) and judging the finally predicted fish variety according to the detected fish variety of each frame of fish image.
2. The method for automatically sorting fish varieties based on ResNet transfer learning of claim 1, wherein the fish classification network model comprises two visual space attention modules and a ResNet34 model;
the ResNet34 model comprises 33 convolutional layers, 2 pooling layers and 1 fully-connected layer; except for the 1st convolutional layer conv1, the remaining 32 convolutional layers are divided into 4 groups, conv2_x, conv3_x, conv4_x and conv5_x, containing 3, 4, 6 and 3 residual learning units respectively; each residual learning unit comprises 2 sequentially connected convolutional layers, with a shortcut connection established between the input and the output of the 2 convolutional layers;
the 2 layers of pooling layers comprise a maximum pooling layer max pool and an average pooling layer average pool _ 1; the 1 layer full connection layer is a full connection layer fc;
the 1 st layer of convolution layer conv1, the maximum value pooling layer max pool, 4 groups of convolution layers, namely conv2_ x, conv3_ x, conv4_ x, conv5_ x, the average pooling layer average pool _1 and the full connection layer fc are connected in sequence;
the two visual space attention modules have the same structure, the first attention module is positioned between the convolution layer 1 conv1 and the maximum pooling layer max pool, and the second attention module is positioned between conv5_ x and the average pooling layer average pool _ 1;
the visual space attention module comprises 1 average pooling layer average pool _2, 1 convolution layer conv6 and sigmoid activation function layer.
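The layer count in claim 2 can be checked arithmetically: with 3, 4, 6 and 3 residual units of 2 convolutional layers each, plus conv1, the total comes to 33 convolutional layers. A quick sketch of that bookkeeping:

```python
# Residual learning units per group of the ResNet34 backbone (claim 2).
units = {"conv2_x": 3, "conv3_x": 4, "conv4_x": 6, "conv5_x": 3}

# Each unit holds 2 convolutional layers; conv1 is counted separately.
conv_layers = 1 + 2 * sum(units.values())
print(conv_layers)  # -> 33
```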
3. The method for automatically sorting fish varieties based on ResNet transfer learning according to claim 1, wherein in step (2), data enhancement is performed on the collected images of the different fish varieties, that is, each collected image is processed according to steps a-d:
a. rotation: rotating the image by 90 degrees and 270 degrees clockwise respectively, and saving the two rotated images;
b. cropping: cropping 20 pixels from the top, bottom, left and right sides of the image, and saving the cropped image;
c. flipping: flipping the image horizontally and vertically respectively; horizontal flipping exchanges the left and right pixels about the vertical axis through the image centre, and vertical flipping exchanges the top and bottom pixels about the horizontal axis through the image centre; the two flipped images are saved;
d. contrast enhancement: enhancing contrast with the Contrast class of the ImageEnhance module in the Python image-processing library PIL, with the enhancement factor set to 1.5.
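Steps a-d map directly onto PIL calls. A sketch, assuming the Pillow implementation of PIL is installed; `rotate(-90)` with `expand=True` is used here for a 90-degree clockwise rotation, and the dictionary keys are illustrative:

```python
from PIL import Image, ImageEnhance, ImageOps

def augment(img):
    """Return the six augmented copies described in steps a-d of claim 3."""
    w, h = img.size
    return {
        "rot90":    img.rotate(-90, expand=True),        # a. 90 deg clockwise
        "rot270":   img.rotate(-270, expand=True),       # a. 270 deg clockwise
        "crop":     img.crop((20, 20, w - 20, h - 20)),  # b. trim 20 px per side
        "hflip":    ImageOps.mirror(img),                # c. horizontal flip
        "vflip":    ImageOps.flip(img),                  # c. vertical flip
        "contrast": ImageEnhance.Contrast(img).enhance(1.5),  # d. factor 1.5
    }
```

Each returned image would then be saved alongside the original, multiplying the training set size as the claim intends.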
4. The method for automatically sorting fish varieties based on ResNet transfer learning according to claim 1, wherein in step (1), each image contains only one fish, and for each fish a plurality of images of various swimming postures in water are collected, covering multiple viewing angles of the fish.
5. The method for automatically sorting fish varieties based on ResNet transfer learning according to claim 2, wherein the output of the residual learning unit is given by formula (I):
q=F(p)+p (Ⅰ)
in formula (I), p is the input of the residual learning unit, F(p) is the output of p after passing through the 2 convolutional layers, and q is the output of the residual learning unit.
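Formula (I) is the standard identity shortcut: the unit's input is added element-wise to the output of its two-convolution branch. A toy sketch where a callable `F` stands in for the 2 convolutional layers:

```python
def residual_unit(p, F):
    """q = F(p) + p: add the input p to the branch output F(p) element-wise."""
    return [fp + pi for fp, pi in zip(F(p), p)]

# Toy branch: doubles every value (a real F is two convolutional layers).
double = lambda v: [2 * x for x in v]
# residual_unit([1.0, 2.0], double) -> [3.0, 6.0]
```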
6. The method for automatically sorting fish varieties based on ResNet transfer learning according to claim 2, wherein the sigmoid activation function layer is given by formula (II):
a=sigmoid(conv6(average pool_2(x))) (Ⅱ)
in formula (II), x is the input of the visual spatial attention module and a is its output; after x passes through the module to produce a, a is multiplied with each channel of x, and the products are fed into the max-pooling layer max pool or the average-pooling layer average pool_1.
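Formula (II) produces a single spatial map a that reweights every channel of x. A toy sketch with flat lists standing in for feature maps; `average_pool_2` and `conv6` are passed in as callables because the claim does not specify their kernel sizes:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def spatial_attention(x, conv6, average_pool_2):
    """x: list of channels, each a flat list of spatial values.
    Computes a = sigmoid(conv6(average_pool_2(x))) per formula (II),
    then multiplies a with each channel of x before the next pooling layer."""
    a = [sigmoid(v) for v in conv6(average_pool_2(x))]
    return [[w * v for w, v in zip(a, ch)] for ch in x]
```

With a zero-output `conv6`, every attention weight is sigmoid(0) = 0.5, so each channel is simply halved; a trained conv6 instead learns which spatial positions to emphasise.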
7. The method for automatically sorting fish varieties based on ResNet transfer learning according to any one of claims 1-6, wherein in step (11) the judging method is as follows: if the same fish variety is detected in the plurality of frames of fish images, the finally predicted fish variety is judged to be that variety; otherwise, return to step (9).
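The judging rule in claim 7 is a unanimity check over the per-frame predictions, read here as all frames agreeing; returning None signals "go back to step (9)" and re-image the fish. A minimal sketch:

```python
def judge(frame_predictions):
    """Return the fish variety if every frame agrees, else None (re-image)."""
    if frame_predictions and all(p == frame_predictions[0] for p in frame_predictions):
        return frame_predictions[0]
    return None
```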
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011071983.XA CN112200241A (en) | 2020-10-09 | 2020-10-09 | Automatic sorting method for fish varieties based on ResNet transfer learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112200241A (en) | 2021-01-08 |
Family
ID=74012619
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011071983.XA Pending CN112200241A (en) | 2020-10-09 | 2020-10-09 | Automatic sorting method for fish varieties based on ResNet transfer learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112200241A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113627558A (en) * | 2021-08-19 | 2021-11-09 | 中国海洋大学 | Fish image identification method, system and equipment |
CN114612397A (en) * | 2022-03-02 | 2022-06-10 | 广东省农业科学院农业经济与信息研究所 | Fry sorting method and system, electronic device and storage medium |
CN114842505A (en) * | 2022-04-19 | 2022-08-02 | 中国农业大学 | Animal individual identification method and device based on transfer learning |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110689083A (en) * | 2019-09-30 | 2020-01-14 | 苏州大学 | Context pyramid fusion network and image segmentation method |
CN110766013A (en) * | 2019-09-25 | 2020-02-07 | 浙江农林大学 | Fish identification method and device based on convolutional neural network |
CN111489334A (en) * | 2020-04-02 | 2020-08-04 | 暖屋信息科技(苏州)有限公司 | Defect workpiece image identification method based on convolution attention neural network |
CN111723823A (en) * | 2020-06-24 | 2020-09-29 | 河南科技学院 | Underwater target detection method based on third-party transfer learning |
Non-Patent Citations (1)
Title |
---|
WANG Yiding (王一丁) et al.: "Recognition of small-sample microscopic images of Chinese medicinal material powders based on deep learning", Journal of Computer Applications (《计算机应用》) * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108764372B (en) | Data set construction method and device, mobile terminal, and readable storage medium | |
CN112200241A (en) | Automatic sorting method for fish varieties based on ResNet transfer learning | |
Woźniak et al. | Adaptive neuro-heuristic hybrid model for fruit peel defects detection | |
Sakib et al. | Implementation of fruits recognition classifier using convolutional neural network algorithm for observation of accuracies for various hidden layers | |
CN111178197A (en) | Mass R-CNN and Soft-NMS fusion based group-fed adherent pig example segmentation method | |
CN111400536B (en) | Low-cost tomato leaf disease identification method based on lightweight deep neural network | |
Ji et al. | Real-time detection of underwater river crab based on multi-scale pyramid fusion image enhancement and MobileCenterNet model | |
CN111340019A (en) | Grain bin pest detection method based on Faster R-CNN | |
Andayani et al. | Fish species classification using probabilistic neural network | |
CN109740656A (en) | A kind of ore method for separating based on convolutional neural networks | |
Zhang et al. | High-throughput corn ear screening method based on two-pathway convolutional neural network | |
CN112883915A (en) | Automatic wheat ear identification method and system based on transfer learning | |
CN108664839A (en) | A kind of image processing method and equipment | |
CN112016574B (en) | Image classification method based on feature fusion | |
CN110163798A (en) | Fishing ground purse seine damage testing method and system | |
CN112862849A (en) | Image segmentation and full convolution neural network-based field rice ear counting method | |
CN115797844A (en) | Fish body fish disease detection method and system based on neural network | |
CN114612397A (en) | Fry sorting method and system, electronic device and storage medium | |
CN113221913A (en) | Agriculture and forestry disease and pest fine-grained identification method and device based on Gaussian probability decision-level fusion | |
Luan et al. | Sunflower seed sorting based on convolutional neural network | |
Pramudhita et al. | Strawberry Plant Diseases Classification Using CNN Based on MobileNetV3-Large and EfficientNet-B0 Architecture | |
CN115761356A (en) | Image recognition method and device, electronic equipment and storage medium | |
CN117372881A (en) | Intelligent identification method, medium and system for tobacco plant diseases and insect pests | |
CN113673340B (en) | Pest type image identification method and system | |
Gope et al. | Normal and peaberry coffee beans classification from green coffee bean images using convolutional neural networks and support vector machine |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20210108 |