GB2611257A - Transfer learning-based method for improved VGG16 network pig identity recognition - Google Patents

Transfer learning-based method for improved VGG16 network pig identity recognition

Info

Publication number
GB2611257A
GB2611257A
Authority
GB
United Kingdom
Prior art keywords
layer
improved
vgg16
loss function
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB2219795.8A
Other versions
GB2611257B (en)
GB202219795D0 (en)
Inventor
Zhu Weixing
Tang Zhiye
Li Xincheng
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu University
Original Assignee
Jiangsu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu University filed Critical Jiangsu University
Publication of GB202219795D0 publication Critical patent/GB202219795D0/en
Publication of GB2611257A publication Critical patent/GB2611257A/en
Application granted granted Critical
Publication of GB2611257B publication Critical patent/GB2611257B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/004Artificial life, i.e. computing arrangements simulating life
    • G06N3/006Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/061Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using biological neurons, e.g. biological neurons connected to an integrated circuit
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/096Transfer learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Neurology (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

Disclosed is a transfer learning-based method for improved VGG16 network pig identity recognition. The method comprises: first performing frame-by-frame extraction on a processed video to obtain a series of pictures, which are preprocessed into a data set that is then divided into a training set and a test set; constructing an improved VGG16 network training model, BN-VGG16; and saving a pre-trained feature extraction model, Pre-VGG16. Next is a transfer learning process: transferring the Pre-VGG16 feature extraction network obtained by source-domain training to a Pig-VGG16 network for recognizing pigs; performing multi-block improved absolute difference local direction pattern (MB-IADLDP) feature extraction on a data set that has undergone size adjustment; performing serial fusion; and finally performing identity recognition on a pig. The transfer learning-based improved VGG16 model is superior to conventional VGG16 network models in terms of operating speed and precision.

Description

TRANSFER LEARNING-BASED METHOD FOR IMPROVED VGG16
NETWORK PIG IDENTITY RECOGNITION
TECHNICAL FIELD
The present disclosure relates to artificial intelligence technology, and in particular to the technical fields of transfer learning, deep learning, and neural network.
BACKGROUND
With the rise of the era of big data, neural networks have also developed. The original neural network was only a single-layer perceptron. Basic neural networks also include the Hopfield neural network, the linear neural network and the BP neural network. Neural networks have now reached a mature stage, mainly including the deep belief network, the convolutional neural network, the deep residual network, the LSTM network and so on. Deep neural networks have powerful representation ability, but they have many parameters and a large amount of calculation. Recent research is mainly directed toward reducing the number of parameters, learning richer features and speeding up training. Neural networks are also widely used, for example in face recognition, identity recognition, driverless vehicles and so on. It can be seen that the flexibility of neural networks is very high, and they can adapt to a variety of tasks. For identity recognition, many network models can be used, such as VGG16, VGG19, AlexNet, GoogLeNet, ResNet and so on. But the selection of a model should be based on the actual situation. For pig identity identification, the depth and computational capacity of VGG16 are sufficient, so it is unnecessary to use deeper models. In actual experiment and simulation, we encounter situations where the data set is not sufficient and the model needs to be reused repeatedly. This is where transfer learning comes in. Transfer learning can reduce the amount of calculation, improve operating efficiency and achieve a good effect on the identity identification of pigs.
SUMMARY
The technical problem solved by the present disclosure is to provide a pig identity identification method using an improved VGG16 network based on transfer learning. Through continuous improvement and model optimization, the development of neural networks has reached the stage of deep neural networks, and the application of typical network models has become more extensive; the present disclosure is aimed at the existing research methods of deep neural networks for the identity identification of pigs. The present disclosure puts forward a pig identity identification method using an improved VGG16 network based on transfer learning. In 2014, at the ImageNet Large Scale Visual Recognition Challenge, the Visual Geometry Group of Oxford University proposed the VGG convolutional neural network structure, which took first place in the localization task and second place in the classification task. Therefore, VGG-series models have great advantages in identity identification and feature extraction.
A traditional VGG16 model is introduced as follows.
As shown in FIG. 4, the traditional VGG16 model has two convolution layers with 64 convolution kernels, two convolution layers with 128 convolution kernels, three convolution layers with 256 convolution kernels, six convolution layers with 512 convolution kernels, two fully connected layers with 4096 neurons and one fully connected layer with 1000 neurons. The dimension of the input image is fixed at 224×224×3.
Convolution layer: it imitates human local perception. When the human brain recognizes a picture, it perceives a certain feature in the picture and then performs a further comprehensive operation to obtain global information. Specifically, in a traditional neural network each neuron needs to connect to every pixel, with the result that the number of weights is huge and training is difficult. In a convolution layer, the number of weights of each neuron is the size of the convolution kernel; that is to say, each neuron is connected only with the corresponding part of the pixels, which reduces the number of weights and improves training efficiency. At the same time, the stride of the convolution kernel can be set as needed to maximize the efficiency of the algorithm. In the present disclosure, a 3×3 convolution kernel is used, and two stacked 3×3 convolution kernels are equivalent to one 5×5 convolution kernel. Assuming the picture is 224×224, the stride is 1 and there is no padding, then according to the convolution output-size formula (n + 2p − f)/q + 1, wherein n is the image scale, p is the padding value, f is the convolution kernel size and q is the stride, the output of a single 5×5 convolution is 220, and the output of two successive 3×3 convolutions is also 220. But the computation of the 5×5 convolution kernel is larger. Generally speaking, the 3×3 convolution kernel has the following advantages over 5×5 and 7×7 kernels: (1) fast computation speed and high efficiency; (2) the same receptive field; (3) more nonlinearity than a single large convolution kernel.
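As a quick check of the output-size arithmetic above, the short Python sketch below (the helper name is illustrative, not from the patent) evaluates (n + 2p − f)/q + 1 for one 5×5 convolution and for two stacked 3×3 convolutions:

```python
def conv_output_size(n: int, f: int, p: int = 0, q: int = 1) -> int:
    """Output side length of a convolution: (n + 2p - f) / q + 1."""
    return (n + 2 * p - f) // q + 1

n = 224  # input side length; stride 1, no padding, as in the text

# One 5x5 convolution: 224 -> 220
print(conv_output_size(n, f=5))                          # 220

# Two stacked 3x3 convolutions: 224 -> 222 -> 220
print(conv_output_size(conv_output_size(n, f=3), f=3))   # 220
```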
Pooling layer: a pooling layer generally follows a convolution layer and mainly performs dimensionality reduction. Because much of the feature information is redundant after the convolution operation, the pooling layer solves this dimensionality-reduction problem. There are two main pooling methods: maximum pooling and average pooling. The maximum pooling layer better preserves the texture information of the image, while the average pooling layer preserves the local spatial information of the image. The strategy of combining the maximum pooling layer with the average pooling layer is used in the present disclosure: replacing the maximum pooling layer with a combination of maximum and average pooling improves the accuracy of feature extraction and thus the accuracy of identification.
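One simple way to realize such a combined pooling stage is sketched below in TensorFlow (a sketch only: the patent itself realizes the combination through two parallel branches, described in Step 6, and the channel-concatenation choice here is an assumption):

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(224, 224, 64))

# Apply both pooling summaries to the same feature map.
max_branch = tf.keras.layers.MaxPooling2D(pool_size=2)(inputs)      # texture
avg_branch = tf.keras.layers.AveragePooling2D(pool_size=2)(inputs)  # local spatial info

# One plausible combination: keep both summaries by concatenating channels.
combined = tf.keras.layers.Concatenate()([max_branch, avg_branch])
model = tf.keras.Model(inputs, combined)
```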
Fully connected layer: a fully connected layer is usually placed at the end of the network and mainly performs feature weighting. In the present disclosure, the final fully connected layer is replaced by a convolution layer, with the replacement rule that the convolution kernel size is set to the size of the input feature map, so that an image of any size can be accepted as input. At the same time, the CNN shares a large amount of computation, which improves the efficiency of the whole network.
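A minimal sketch of this replacement rule, assuming a 7×7×512 feature map entering what would otherwise be a 4096-unit dense layer (the sizes are illustrative; the patent does not fix them at this point):

```python
import tensorflow as tf

# Convolutional head: the kernel size equals the input spatial size (7x7),
# so each of the 4096 filters sees the whole feature map, like a dense unit.
conv_head = tf.keras.layers.Conv2D(4096, kernel_size=7, activation="relu")

x = tf.random.normal((1, 7, 7, 512))
print(conv_head(x).shape)      # (1, 1, 1, 4096)

# Unlike a Dense layer after Flatten, the same conv head also accepts a
# larger input, producing a spatial grid of "dense" responses.
x_big = tf.random.normal((1, 9, 9, 512))
print(conv_head(x_big).shape)  # (1, 3, 3, 4096)
```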
The present disclosure also adds a BN layer after each maximum pooling layer. The BN layer has the following advantages: (1) the training speed is accelerated, so that a larger learning rate can be used to train the network; (2) the generalization ability of the network is improved; (3) the BN layer is essentially a normalization layer, so it can replace the local response normalization layer. As deep learning is applied more and more widely, the requirements for accuracy become higher and higher. However, high accuracy depends on a large amount of annotated data or images, and the annotation process is time-consuming and labor-intensive. Transfer learning can solve this problem well, so it has received more and more attention.
The technical solutions of the present disclosure are as follows.
1. A pig identity identification method by using an improved VGG16 network based on transfer learning is provided, which includes the following steps.
Step 1. Extracting pictures frame by frame from a video; obtaining expanded data sets by adjusting contrast, adding noise points, cropping and other operations; and dividing the expanded data sets into a training set and a test set.
Step 2. Adding a BN layer after each pooling layer to build a BN-VGG16 model.
Step 3. Obtaining the Gauss improved factor particle swarm optimization algorithm (G-IFPSO) by adding a Gauss improved factor to the particle swarm optimization algorithm.
Step 4. Training on the training set processed in Step 1; using the G-IFPSO algorithm to optimize a loss function, wherein the loss function is a weighted fusion of the cross entropy loss function and the mean square error loss function; and saving a pre-trained feature extraction network Pre-VGG16.
Step 5. Using a pixel multi-block method with improved absolute differential local direction pattern (PMB-IADLDP) to extract traditional features of pigs, wherein the traditional features are used for feature fusion and identity identification of pigs.
Step 6. Transferring the Pre-VGG16 feature extraction network to two different neural networks for training and fine-tuning parameters in the networks; then adjusting the datasets to 224×224×3 and using the PMB-IADLDP method to extract features from the adjusted datasets; serially fusing the features extracted by the two neural networks with the PMB-IADLDP features, that is to say, vector fusion; and finally identifying the identity of pigs.
2. Step 1 of the claim 1 specifically includes the following. Firstly, the video is extracted frame by frame to get the images. These images are then preprocessed; that is to say, by flipping the images horizontally and vertically, using gamma transform, histogram equalization and logarithmic transform, and reducing and adding noise points, the datasets are expanded and the processed data sets are obtained. The number of pictures is increased by 4900. Finally, the datasets are divided into training sets and test sets in the proportion of 6:1.
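A sketch of such an augmentation pipeline in Python with OpenCV is shown below (parameter values such as the gamma and the noise level are assumptions, not taken from the patent):

```python
import cv2
import numpy as np

def augment(image: np.ndarray) -> list:
    """Return augmented copies of a BGR uint8 image, covering the expansions
    named in the text: flips, gamma, equalization, log transform, noise."""
    out = [cv2.flip(image, 1), cv2.flip(image, 0)]        # horizontal, vertical flips

    gamma = 0.7                                           # assumed value
    table = ((np.arange(256) / 255.0) ** gamma * 255).astype(np.uint8)
    out.append(cv2.LUT(image, table))                     # gamma transform

    ycrcb = cv2.cvtColor(image, cv2.COLOR_BGR2YCrCb)      # histogram equalization
    ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])     # on the luma channel
    out.append(cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR))

    log_img = 255 * np.log1p(image.astype(np.float32)) / np.log(256.0)
    out.append(np.uint8(np.clip(log_img, 0, 255)))        # logarithmic transform

    noise = np.random.normal(0, 10, image.shape)          # assumed noise level
    out.append(np.uint8(np.clip(image + noise, 0, 255)))  # added noise points
    return out
```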
3. The improved BN-VGG16 network model built in step 2 of the claim 1 specifically includes the following. A BN (Batch Normalization) layer is added after each maximum pooling layer. The structure of the whole network consists of two convolution layers with 64 convolution kernels followed by a maximum pooling layer and a BN layer, two convolution layers with 128 convolution kernels followed by a maximum pooling layer and a BN layer, three convolution layers with 256 convolution kernels followed by a maximum pooling layer and a BN layer, three convolution layers with 512 convolution kernels followed by a maximum pooling layer and a BN layer, two fully connected layers containing 4096 neurons, one fully connected layer containing 1000 neurons and a softmax layer.
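A Keras-style sketch of this structure is shown below (it follows the enumeration above; padding and activation choices are the usual VGG conventions and are assumptions where the text is silent):

```python
import tensorflow as tf
from tensorflow.keras import layers

def bn_vgg16(num_classes: int = 1000) -> tf.keras.Model:
    """BN-VGG16 as enumerated in the text: a BN layer after each max pooling."""
    model = tf.keras.Sequential(name="BN-VGG16")
    model.add(tf.keras.Input(shape=(224, 224, 3)))
    for filters, repeats in [(64, 2), (128, 2), (256, 3), (512, 3)]:
        for _ in range(repeats):
            model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D(2))
        model.add(layers.BatchNormalization())   # BN follows each max pooling
    model.add(layers.Flatten())
    model.add(layers.Dense(4096, activation="relu"))
    model.add(layers.Dense(4096, activation="relu"))
    model.add(layers.Dense(num_classes))
    model.add(layers.Softmax())
    return model
```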
4. The improved particle swarm optimization algorithm in step 3 of the claim 1 specifically includes the following. The particle swarm optimization algorithm is improved: Gaussian perturbation is added to the optimal particle.
5. The improved particle swarm optimization algorithm in step 3 of the claim 1 specifically includes the following. The particle swarm optimization algorithm is improved: the speed weight is optimized in real time according to the number of iterations, and an offset is added so that the weight does not vanish.
6. The improved particle swarm optimization algorithm in step 3 of the claim 1 specifically includes the following. The particle swarm optimization algorithm is improved: the self-learning factor and the population learning factor need to be optimized; that is to say, the self-learning factor is optimized with the number of iterations.
7. The improved VGG16 network based on transfer learning in the claim 1 specifically includes the following. In the training process of step 4, the dropout value is set to 0.65 in order to prevent overfitting. The dimension of the training data set is adjusted to 224×224×3. The cross entropy loss function and the mean square error loss function are selected as the loss function Q, the two functions being weighted together.
The G-IFPSO algorithm is used to optimize the weight. The optimization process is as follows.
(1) Initialize the parameters; that is to say, the particle positions, velocities, individual optimal positions, population optimal position and learning factors are initialized.
(2) According to formula (3), the weight of the PSO (particle swarm optimization) is updated with the number of iterations.
(3) According to formula (4), the current optimal value of the learning factors is obtained as the number of iterations increases.
(4) According to formulas (1), (3) and (4), the position and velocity components of the particles are updated.
(5) The fitness value is calculated according to formula (2).
(6) The individual extremum and global extremum of the particles are compared, and the optimal value is replaced continuously.
(7) If the maximum number of iterations is reached, the optimal solution is output; otherwise, the process returns to step (2).
Finally, the iterative training is completed, a model is obtained and the pre-trained feature extraction network is saved.
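A compact NumPy sketch of this loop is given below (a sketch under stated assumptions: the fitness F(x) = aQ(x) + b takes a user-supplied loss Q, and all hyperparameter values shown are placeholders):

```python
import numpy as np

def g_ifpso(loss_q, dim, n_particles=30, i_max=100,
            w_min=0.4, w_max=0.9, d=0.01, sigma=0.1, a=1.0, b=0.0):
    """Sketch of G-IFPSO: decaying inertia weight with offset d (formula (3)),
    iteration-dependent learning factors (formula (4)), Gaussian perturbation
    of the population best (formula (1)); fitness F = a*Q + b (formula (2))."""
    x = np.random.uniform(-1.0, 1.0, (n_particles, dim))
    v = np.zeros_like(x)
    p_best = x.copy()
    p_best_f = np.array([a * loss_q(p) + b for p in x])
    g_best = p_best[p_best_f.argmin()].copy()

    for i in range(i_max):
        w = w_min + (i_max - i) / i_max * (w_max - w_min) + d
        c1 = 2 * (i_max - i) / i_max
        c2 = 2 * i / i_max
        g_perturbed = np.random.normal(g_best, sigma)      # Gaussian disturbance
        r1, r2 = np.random.rand(*x.shape), np.random.rand(*x.shape)
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_perturbed - x)
        x = x + v
        f = np.array([a * loss_q(p) + b for p in x])
        better = f < p_best_f
        p_best[better], p_best_f[better] = x[better], f[better]
        g_best = p_best[p_best_f.argmin()].copy()
    return g_best

# Illustrative usage: search a 1-D fusion weight (true optimum at 0.3 here).
best = g_ifpso(lambda p: (p[0] - 0.3) ** 2, dim=1)
```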
8. The feature extraction of PMB-IADLDP in step 5 of the claim 1 specifically includes the following. The size of the processed images is transformed to 222×222. These processed images are then subdivided into blocks, each of size 3×3, giving 74 sub-blocks. After the 3×3 block coding G_i is obtained, the Kirsch mask operator is used to calculate the result E_ij, as shown in formula (8). The differential coding and absolute coding are carried out respectively, as shown in formula (9) and formula (10). The three largest results of the differential coding are taken: the three directions with the maximum results are set to 1, and the other directions are set to 0. The direction with the maximum absolute coding value is set to 1, and the other directions are set to 0. The final PMB-IADLDP feature extraction result is obtained by weighted fusion of the two results. Finally, a 74×8-dimensional matrix is obtained. The purpose of the differential coding is to relate the eight area pixels around the center pixel g_c more closely with their surroundings, so as to enrich the extracted information. The direction with the highest absolute value indicates that the texture effect is best in that direction. The results of absolute coding and differential coding are fused with weights, which not only retains the main texture but also reduces the redundancy of the information.
9. The method by which the Pre-VGG16 feature extraction network is applied to two neural network models respectively in step 6 of the claim 1 specifically includes the following. The Pre-VGG16 feature extraction network is applied to two neural network models respectively. The two networks differ at the last pooling layer: one uses an average pooling layer, and the other uses a maximum pooling layer. The features extracted by the two neural networks and by PMB-IADLDP are then fused serially. Finally, the fusion result is input to a fully connected layer and a softmax layer for final identification. The fusion strategy is that the features to be fused are expanded and normalized to form a new feature vector whose length is equal to the sum of the lengths of the feature vectors being connected; the new feature vector is then sent to the neural network to obtain the identification result. The fully connected layer of the Pig-VGG16 network is changed into a convolution layer. The trained parameters of the pig identity identification network are then initialized, and the parameters are adjusted to user-defined values: dropout is set to 0.6, epoch is set to 25, and 3×3 convolution kernels are used. The cross entropy loss function and the mean square error loss function are used as the loss function. The whole training process is completed on TensorFlow 2.0. First, the code for the convolution layer, pooling layer and fully connected layer modules is written according to the modules of BN-VGG16, then debugged and saved. The BN layer code is added after the pooling layer and debugged. The data set is then input into the main program and each module is called for model training. After the number of iterations is reached, the feature extraction model is saved. The model is then migrated to the two different networks. The features extracted from the two neural networks and from PMB-IADLDP are fused, and the fusion results are input into the fully connected layer and softmax layer for final identification. Compared with the above methods, the present disclosure has the following obvious advantages.
(1) The BN layer is added after each maximum pooling layer to accelerate the training speed of the whole network, so that a larger learning rate can be used to train the network, and the generalization ability of the network is also improved.
(2) The loss function is a weighted fusion of cross entropy loss function and mean square error loss function. The weighted value is optimized by G-IFPSO algorithm, and the optimal weight can be obtained by iteration. The improvement of particle swarm optimization algorithm is the improvement of speed weight and elite particle, and Gaussian disturbance is added, so that the weight has been changing and will not disappear. Therefore the ability of global search is improved, and the problem of easily falling into local optimum is solved.
(3) Two neural networks are fused, which are different in pooling layer. The maximum pooling layer can better preserve the texture information of images, and the average pooling layer can preserve the local spatial information of images. The combination of the two can improve the accuracy of feature extraction and identification.
(4) Using the transfer learning strategy, the feature extraction module of VGG16 is transferred to the pig identity identification network (Pig-VGG16), which improves the efficiency of the whole network and saves time.
(5) The fully connected layer is replaced by a convolution layer, so that the whole network can accept images of different scales and achieves scale freedom.
BRIEF DESCRIPTION OF THE DRAWINGS
In order to more clearly explain the specific implementation steps and experimental principles of this disclosure, the required drawings are briefly explained below.
FIG. 1 Flow chart of experimental method.
FIG. 2 Feature extraction process of PMB-IADLDP.
FIG. 3 Kirsch mask operator.
FIG. 4 Traditional VGG16 model.
FIG. 5 Improved VGG16 model BN-VGG16.
FIG. 6 VGG16 model based on transfer learning method.
FIG. 7 Experimental comparison after adding BN layer.
FIG. 8 Comparison of experimental results.
DETAILED DESCRIPTION OF THE EMBODIMENTS
1. An improved VGG16 network based on transfer learning is used to identify the identity of pigs. This method includes the following steps.
Step 1. Extracting pictures frame by frame from a video; obtaining expanded data sets by adjusting contrast, adding noise points, cropping and other operations; and dividing the expanded data sets into a training set and a test set.
Step 2. Adding a BN layer after each pooling layer to build a BN-VGG16 model.
Step 3. Obtaining the Gauss improved factor particle swarm optimization algorithm (G-IFPSO) by adding a Gauss improved factor to the particle swarm optimization algorithm.
Step 4. Training on the training set processed in Step 1; using the G-IFPSO algorithm to optimize a loss function, wherein the loss function is a weighted fusion of the cross entropy loss function and the mean square error loss function; and saving a pre-trained feature extraction network Pre-VGG16.
Step 5. Using a pixel multi-block method with improved absolute differential local direction pattern (PMB-IADLDP) to extract traditional features of pigs, wherein the traditional features are used for feature fusion and identity identification of the pigs.
Step 6. Transferring the Pre-VGG16 feature extraction network to two different neural networks for training and fine-tuning parameters in the networks; then adjusting the datasets to 224×224×3 and using the PMB-IADLDP method to extract features from the adjusted datasets; serially fusing the features extracted by the two neural networks with the PMB-IADLDP features, that is to say, vector fusion; and finally identifying the identity of pigs.
2. Step 1 of the claim 1 specifically includes the following. Firstly, the images are extracted frame by frame from a video. These images are then preprocessed; that is to say, by flipping the images horizontally and vertically, using gamma transform, histogram equalization and logarithmic transform, and reducing and adding noise points, the datasets are expanded and the processed data sets are obtained. The number of images is increased by 4900. Finally, the datasets are divided into training sets and test sets in the proportion of 6:1.
3. The improved BN-VGG16 network model built in step 2 of the claim 1 specifically includes the following. A BN (Batch Normalization) layer is added after each maximum pooling layer. The structure of the whole network consists of two convolution layers with 64 convolution kernels followed by a maximum pooling layer and a BN layer, two convolution layers with 128 convolution kernels followed by a maximum pooling layer and a BN layer, three convolution layers with 256 convolution kernels followed by a maximum pooling layer and a BN layer, three convolution layers with 512 convolution kernels followed by a maximum pooling layer and a BN layer, two fully connected layers containing 4096 neurons, one fully connected layer containing 1000 neurons and a softmax layer.
4. The improved particle swarm optimization algorithm in step 3 of the claim 1 specifically includes the following. The particle swarm optimization algorithm is improved: Gaussian perturbation is added to the optimal particle. The formulas of the improved particle swarm optimization algorithm are as follows.
$$v_{im} = w v_{im} + c_1 r_1 (P_{im} - x_{im}) + c_2 r_2 (P'_{gm} - x_{im})$$
$$x_{im} = x_{im} + v_{im}$$
$$P'_{gm} = N(P_{gm}, \sigma) \tag{1}$$
wherein $P_{gm}$ is the optimal value of the particle swarm; $P'_{gm}$ is the optimal value of the particle swarm after the disturbance; $P_{im}$ is the individual optimal value; $N(\mu, \sigma)$ is a Gaussian function, where $\mu$ is the average and $\sigma$ is the variance; $v_{im}$ is the velocity component; $x_{im}$ is the location component; $w$ is the inertia weight; $c_1$ is the self-learning factor; $c_2$ is the population learning factor; and $r_1$, $r_2$ are random values between 0 and 1. The fitness function is as follows.
$$F(x) = aQ + b \tag{2}$$
wherein $a$ is the scalar coefficient; $b$ is the offset; and $Q$ is the loss function after weighted fusion.
5. The improved particle swarm optimization algorithm in step 3 of the claim 1 specifically includes the following. The particle swarm optimization algorithm is improved: the speed weight is optimized in real time as the number of iterations increases, and an offset is added so that the weight does not vanish. The improved speed weight formula is as follows.
$$w = w_{min} + \left(\frac{i_{max} - i}{i_{max}}\right)(w_{max} - w_{min}) + d \tag{3}$$
wherein $i_{max}$ is the maximum number of iterations; $w_{max}$ is the maximum value of the speed weight; $w_{min}$ is the minimum value of the speed weight; $i$ is the current number of iterations; and $d$ is the offset.
6. The improved particle swarm optimization algorithm in step 3 of the claim 1 specifically includes the following. The particle swarm optimization algorithm is improved: the self-learning factor and the population learning factor need to be optimized; that is to say, the self-learning factor is optimized as the number of iterations increases. The formulas of the improved learning factors are as follows.
$$c_1 = 2\left(\frac{i_{max} - i}{i_{max}}\right), \qquad c_2 = 2\left(\frac{i}{i_{max}}\right) \tag{4}$$
wherein $i_{max}$ is the maximum number of iterations and $i$ is the current number of iterations.
7. The improved VGG16 network based on transfer learning in the claim 1 specifically includes the following. In the training process of step 4, the dropout value is set to 0.65 in order to prevent overfitting. The dimension of the training data set is adjusted to 224×224×3. The cross entropy loss function and the mean square error loss function are weighted together as the loss function Q, as shown in formula (5).
$$Q = \frac{\beta}{\alpha + \beta} L + \frac{\alpha}{\alpha + \beta} MSE \tag{5}$$
wherein $\alpha$ is the value at which the loss of the cross entropy loss function finally stabilizes; $\beta$ is the value at which the loss of the mean square error loss function finally stabilizes; $L$ is the cross entropy loss function; and $MSE$ is the mean square error loss function. Let $\gamma = \frac{\beta}{\alpha + \beta}$, so that $Q = \gamma L + (1 - \gamma)\,MSE$.
The cross entropy loss function L is shown in formula (6).
$$L = \frac{1}{N}\sum_{i} L_i = -\frac{1}{N}\sum_{i}\sum_{k=1}^{M} y_{ik}\log(p_{ik}) \tag{6}$$
wherein $M$ is the number of categories; $L_i$ is the value of the loss function for sample $i$; $y_{ik}$ is an indicator variable (0 or 1) that is 1 if sample $i$ belongs to category $k$ and 0 otherwise; and $p_{ik}$ is the predicted probability that observation sample $i$ belongs to category $k$.
The mean square error loss function MSE is shown in formula (7).
$$MSE(y, \hat{y}) = \frac{1}{n}\sum_{c=1}^{n}(y_c - \hat{y}_c)^2 \tag{7}$$
wherein $y_c$ is the value of the $c$-th input and $\hat{y}_c$ is its predicted value.
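A minimal TensorFlow sketch of the weighted fusion in formula (5) is given below; the fusion weight gamma stands for the ratio β/(α+β), and in the method above it is the quantity that G-IFPSO searches for:

```python
import tensorflow as tf

cce = tf.keras.losses.CategoricalCrossentropy()
mse = tf.keras.losses.MeanSquaredError()

def fused_loss(gamma: float):
    """Q = gamma * L + (1 - gamma) * MSE, per formula (5)."""
    def q(y_true, y_pred):
        return gamma * cce(y_true, y_pred) + (1.0 - gamma) * mse(y_true, y_pred)
    return q

# Illustrative usage with a candidate weight proposed by G-IFPSO:
# model.compile(optimizer="adam", loss=fused_loss(0.7), metrics=["accuracy"])
```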
The G-IFPSO algorithm is used to optimize the weight. The optimization process is as follows.
(1) Initialize the parameters; that is to say, the particle positions, velocities, individual optimal positions, population optimal position and learning factors are initialized.
(2) According to formula (3), the weight of the PSO (particle swarm optimization) is updated with the number of iterations.
(3) According to formula (4), the current optimal value of the learning factors is obtained as the number of iterations increases.
(4) According to formulas (1), (3) and (4), the position and velocity components of the particles are updated.
(5) The fitness value is calculated according to formula (2).
(6) The individual extremum and global extremum of the particles are compared, and the optimal value is replaced continuously.
(7) If the maximum number of iterations is reached, the optimal solution is output; otherwise, the process returns to step (2).
Finally, the iterative training is carried out. When the iterative loss value is less than a certain threshold, the training is stopped, the model is obtained and the pre-trained feature extraction network is saved.
8. The feature extraction of PMB-IADLDP in step 5 of the claim 1 specifically includes the following. The size of the processed images is transformed to 222×222. They are then subdivided into blocks, each of size 3×3, giving 74 sub-blocks. After the 3×3 block coding $G_i$ is obtained, the Kirsch mask operator is used to calculate the result $E_{ij}$, as shown in formula (8). The differential coding and absolute coding are carried out respectively, as shown in formula (9) and formula (10). The three largest results of the differential coding are taken: the three directions with the maximum results are set to 1, and the other directions are set to 0. The direction with the maximum absolute coding value is set to 1, and the other directions are set to 0. The final PMB-IADLDP feature extraction result is obtained by weighted fusion of the two results. Finally, a 74×8-dimensional matrix is obtained. The purpose of the differential coding is to relate the eight area pixels around the center pixel $g_c$ more closely with their surroundings, so as to enrich the extracted information. The direction with the highest absolute value indicates that the texture effect is best in that direction. The results of absolute coding and differential coding are weighted, which not only retains the main texture but also reduces the information redundancy.
$$E_{ij} = G_i * M_j, \quad i = 1, 2, \ldots, 74, \; j = 0, 1, \ldots, 7 \tag{8}$$
wherein $G_i$ is the coding value of the $i$-th block; $M_j$ is the Kirsch mask operator in the $j$-th direction; and $*$ is the convolution operator.
The formula for the differential code is as follows.
$$d_i = e_i - e_{i+1}, \; 0 \le i \le 6; \qquad d_7 = e_7 - e_0 \tag{9}$$
wherein $e_i$ is the $i$-th encoding around the center pixel of the block. The absolute coding formula is as follows.
$$d^{a}_{i} = |e_i| - |e_k|, \quad i = 0, 1, \ldots, 7, \; k = 3 \tag{10}$$
wherein $e_k$ is the $k$-th largest coding value in the block.
$$LDP = \sum_{i=0}^{7} s(|e_i| - |e_k|) \times 2^{i}, \quad k = 3 \tag{11}$$
$$s(x) = \begin{cases} 1, & x > 0 \\ 0, & \text{others} \end{cases} \tag{12}$$
In formula (11), LDP represents the coding value of the local direction pattern and $s(x)$ is a step function: if $x$ is greater than 0, it is set to 1, otherwise it is set to 0. Formula (12) is used to get the maximum value of the absolute coding.
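A rough NumPy sketch of formulas (8)-(12) on a single 3×3 block is shown below (the standard Kirsch masks and the 0.5 fusion weight are assumptions; the patent shows its masks in FIG. 3 and leaves the fusion weight unspecified):

```python
import numpy as np

# Standard Kirsch mask in one direction; the other seven are obtained by
# rotating the outer ring of the 3x3 mask one step at a time.
K = [np.array([[5, 5, 5], [-3, 0, -3], [-3, -3, -3]])]
ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
for _ in range(7):
    prev = K[-1]
    rotated = prev.copy()
    vals = [prev[pos] for pos in ring]
    for pos, val in zip(ring, vals[-1:] + vals[:-1]):
        rotated[pos] = val
    K.append(rotated)

def block_code(block: np.ndarray) -> np.ndarray:
    """Encode one 3x3 block into an 8-vector along the lines of (8)-(12)."""
    e = np.array([np.sum(block * K[j]) for j in range(8)])  # formula (8)
    d = e - np.roll(e, -1)                                  # formula (9)
    diff_bits = np.zeros(8)
    diff_bits[np.argsort(d)[-3:]] = 1       # three largest differential codes
    abs_bits = np.zeros(8)
    abs_bits[np.argmax(np.abs(e))] = 1      # direction of max absolute code
    w = 0.5                                 # assumed fusion weight
    return w * diff_bits + (1 - w) * abs_bits
```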
9. The method by which the Pre-VGG16 feature extraction network is applied to two neural network models respectively in step 6 of the claim 1 specifically includes the following. The Pre-VGG16 feature extraction network is applied to two neural network models respectively. The two networks differ at the last pooling layer: one uses the average pooling layer, and the other uses the maximum pooling layer. The features extracted by the two neural networks and by PMB-IADLDP are then fused serially. Finally, the fusion result is input to the fully connected layer and softmax layer for final identification. The fusion strategy is that the features to be fused are expanded and normalized to form a new feature vector whose length is equal to the sum of the lengths of the feature vectors being connected; the new feature vector is then sent to the neural network to obtain the identification result. The fully connected layer of the Pig-VGG16 network is changed into a convolution layer. The trained parameters of the pig identification network are then initialized, and the parameters are adjusted to user-defined values: dropout is set to 0.6, epoch is set to 25, and 3×3 convolution kernels are used. The cross entropy loss function and the mean square error loss function are used as the loss function. The whole training process is completed on TensorFlow 2.0. First, the code for the convolution layer, pooling layer and fully connected layer modules is written according to the modules of BN-VGG16, then debugged and saved. The BN layer code is added after the pooling layer and debugged. The data set is then input into the main program and each module is called for model training. After the number of iterations is reached, the feature extraction model is saved. The model is then migrated to the two different networks; because the feature extraction part is the same in the two networks, it can be called directly, and only the last pooling layer needs to be modified. The features extracted from the two neural networks and from PMB-IADLDP are fused, and the fusion results are input into the fully connected layer and softmax layer for final identification. The difference between the traditional VGG16 and BN-VGG16 in pig identification accuracy is shown in FIG. 7. Finally, the comparison results of the three networks are obtained. As shown in FIG. 8, the identification accuracy of the Pig-VGG16 network is the highest, and its accuracy reaches 0.6 at the beginning of training. Therefore, the Pig-VGG16 network is more suitable for the identification of pigs than the traditional VGG16 and the improved VGG16. The above examples are only illustrative examples of the present disclosure, which explain its feasibility in detail; the disclosure is not limited thereto.
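A sketch of the two-branch transfer and serial fusion is shown below (the file name, class count and normalization choice are assumptions; the saved Pre-VGG16 extractor is assumed to output a 4-D feature map):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Load the saved Pre-VGG16 feature extractor (hypothetical file name).
pre_vgg16 = tf.keras.models.load_model("pre_vgg16.h5")
pre_vgg16.trainable = False   # freeze; fine-tune selectively as needed

inputs = tf.keras.Input(shape=(224, 224, 3))
features = pre_vgg16(inputs)  # shared transferred feature extractor

# Two branches that differ only in their last pooling layer.
max_branch = layers.GlobalMaxPooling2D()(features)      # texture information
avg_branch = layers.GlobalAveragePooling2D()(features)  # local spatial information

# Hand-crafted PMB-IADLDP features enter as a second input (74 x 8, flattened).
pmb_input = tf.keras.Input(shape=(74 * 8,))

# Serial (vector) fusion: normalize each part, then concatenate end to end,
# so the fused length is the sum of the individual feature lengths.
fused = layers.Concatenate()([
    layers.LayerNormalization()(max_branch),
    layers.LayerNormalization()(avg_branch),
    layers.LayerNormalization()(pmb_input),
])
outputs = layers.Dense(10, activation="softmax")(fused)  # assumed 10 pig identities
pig_vgg16 = tf.keras.Model([inputs, pmb_input], outputs, name="Pig-VGG16")
```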

Claims (9)

CLAIMS
What is claimed is:
1. A pig identity identification method by using an improved VGG16 network based on transfer learning, characterized by comprising the following steps:
Step 1. Extracting images frame by frame from a video; obtaining expanded data sets by adjusting contrast, adding noise points, cropping and other operations, and dividing the expanded data sets into a training set and a test set;
Step 2. Adding a BN layer after each pooling layer to build a BN-VGG16 model;
Step 3. Obtaining a Gauss improved factor particle swarm optimization algorithm (G-IFPSO) by adding a Gauss improved factor to the particle swarm optimization algorithm;
Step 4. Training on the training set processed in the Step 1, using the G-IFPSO algorithm to optimize a loss function, wherein the loss function is a weighted fusion of a cross entropy loss function and a mean square error loss function, and saving a pre-trained feature extraction network Pre-VGG16;
Step 5. Using a pixel multi-block method with improved absolute differential local direction pattern (PMB-IADLDP) to extract traditional features of pigs, wherein the traditional features are used for feature fusion and identity identification of an individual pig; and
Step 6. Transferring the Pre-VGG16 feature extraction network to two different neural networks for training, fine-tuning parameters in the networks, and then adjusting the datasets to 224×224×3, using the pixel multi-block method with improved absolute differential local direction pattern (PMB-IADLDP) to extract features from the adjusted datasets, fusing serially the features extracted in the two neural networks and the PMB-IADLDP features, i.e. vector fusion, and finally identifying the identity of pigs.
2. The pig identity identification method by using the improved VGG16 network based on transfer learning according to claim 1, characterized in that the Step 1 specifically comprises the following: firstly, images are extracted frame by frame from a video; these images are then preprocessed, that is to say, by flipping the images horizontally and vertically, using gamma transform, histogram equalization and logarithmic transform, and reducing and adding noise points, the datasets are expanded and the processed data sets are obtained; the number of images is increased by 4900; finally, the datasets are divided into training sets and test sets in the proportion of 6:1.
3. The pig identity identification method by using the improved VGG16 network based on transfer learning according to claim 1, characterized in that the improved BN-VGG16 network model built in the Step 2 specifically comprises the following: a BN (Batch Normalization) layer is added after each maximum pooling layer; the structure of the whole network consists of two convolution layers with 64 convolution kernels followed by a maximum pooling layer and a BN layer, two convolution layers with 128 convolution kernels followed by a maximum pooling layer and a BN layer, three convolution layers with 256 convolution kernels followed by a maximum pooling layer and a BN layer, three convolution layers with 512 convolution kernels followed by a maximum pooling layer and a BN layer, two fully connected layers containing 4096 neurons, one fully connected layer containing 1000 neurons and a softmax layer.
4. The pig identity identification method by using the improved VGG16 network based on transfer learning according to claim 1, characterized in that the improved particle swarm optimization algorithm in the Step 3 specifically comprises the following: the particle swarm optimization algorithm is improved, and Gaussian perturbation is added to the optimal particle; the formulas of the improved particle swarm optimization algorithm are as follows:
$$v_{im} = w v_{im} + c_1 r_1 (P_{im} - x_{im}) + c_2 r_2 (P'_{gm} - x_{im})$$
$$x_{im} = x_{im} + v_{im}$$
$$P'_{gm} = N(P_{gm}, \sigma) \tag{1}$$
wherein $P_{gm}$ is the optimal value of the particle swarm; $P'_{gm}$ is the optimal value of the particle swarm after adding the disturbance; $P_{im}$ is the individual optimal value; $N(\mu, \sigma)$ is a Gaussian function, where $\mu$ is the average and $\sigma$ is the variance; $v_{im}$ is the velocity component; $x_{im}$ is the location component; $w$ is the inertia weight; $c_1$ is the self-learning factor; $c_2$ is the population learning factor; and $r_1$, $r_2$ are random values between 0 and 1; the fitness function is as follows:
$$F(x) = aQ + b \tag{2}$$
wherein $a$ is the scalar coefficient, $b$ is the offset, and $Q$ is the loss function after weighted fusion.
5. The pig identity identification method by using the improved VGG16 network based on transfer learning according to claim 1, characterized in that the improved particle swarm optimization algorithm in the Step 3 specifically comprises the following: in the Step 3, the particle swarm optimization algorithm is improved; the speed weight is optimized in real time according to the number of iterations and an offset is added, so that the weight does not vanish; the formula of the improved speed weight is as follows:
$$w = w_{min} + \left(\frac{i_{max} - i}{i_{max}}\right)(w_{max} - w_{min}) + d \tag{3}$$
wherein $i_{max}$ is the maximum number of iterations; $w_{max}$ is the maximum value of the speed weight; $w_{min}$ is the minimum value of the speed weight; $i$ is the current number of iterations; and $d$ is the offset.
6. The pig identity identification method by using the improved VGG16 network based on transfer learning according to claim 1, characterized in that the improved particle swarm optimization algorithm in the Step 3 specifically comprises the following: in the Step 3, the particle swarm optimization algorithm is improved; the self-learning factor and the population learning factor need to be optimized, that is to say, the self-learning factor is optimized with the number of iterations; the formulas of the improved learning factors are as follows:
$$c_1 = 2\left(\frac{i_{max} - i}{i_{max}}\right), \qquad c_2 = 2\left(\frac{i}{i_{max}}\right) \tag{4}$$
wherein $i_{max}$ is the maximum number of iterations and $i$ is the current number of iterations.
7. The pig identity identification method by using the improved VGG16 network based on transfer learning according to claim 6, characterized in that the improved VGG16 network based on transfer learning in the Step 4 specifically comprises the following: in the training process of the Step 4, the dropout value is set to 0.65 in order to prevent overfitting; the dimension of the training data set is adjusted to 224×224×3; the cross entropy loss function and the mean square error loss function are selected as the loss function Q, and the two functions are weighted together; the formula of the loss function Q is shown in formula (5):
$$Q = \frac{\beta}{\alpha + \beta} L + \frac{\alpha}{\alpha + \beta} MSE \tag{5}$$
wherein $\alpha$ is the value at which the loss of the cross entropy loss function finally stabilizes; $\beta$ is the value at which the loss of the mean square error loss function finally stabilizes; $L$ is the cross entropy loss function, shown in formula (6); and $MSE$ is the mean square error loss function, shown in formula (7). Let $\gamma = \frac{\beta}{\alpha + \beta}$, so that $Q = \gamma L + (1 - \gamma)\,MSE$.
The cross entropy loss function is shown in formula (6):
$$L = \frac{1}{N}\sum_{i} L_i = -\frac{1}{N}\sum_{i}\sum_{k=1}^{M} y_{ik}\log(p_{ik}) \tag{6}$$
wherein $M$ is the number of categories; $L_i$ is the value of the loss function for sample $i$; $y_{ik}$ is an indicator variable (0 or 1) that is 1 if sample $i$ belongs to category $k$ and 0 otherwise; and $p_{ik}$ is the predicted probability that observation sample $i$ belongs to category $k$.
The mean square error loss function is shown in formula (7):
$$MSE(y, \hat{y}) = \frac{1}{n}\sum_{c=1}^{n}(y_c - \hat{y}_c)^2 \tag{7}$$
wherein $y_c$ is the value of the $c$-th input and $\hat{y}_c$ is its predicted value.
The G-IFPSO algorithm is used to optimize the weight. The optimization process is as follows:
(1) Initialize the parameters; that is to say, the particle positions, velocities, individual optimal positions, population optimal position and learning factors are initialized;
(2) According to formula (3), the weight of the PSO (particle swarm optimization) is updated with the number of iterations;
(3) According to formula (4), the current optimal value of the learning factors is obtained as the number of iterations increases;
(4) According to formulas (1), (3) and (4), the position and velocity components of the particles are updated;
(5) The fitness value is calculated according to formula (2);
(6) The individual extremum and global extremum of the particles are compared, and the optimal value is replaced continuously;
(7) If the maximum number of iterations is reached, the optimal solution is output; otherwise, the process returns to step (2).
The above iterative training is carried out continuously. When the iterative loss value is less than a certain threshold or the maximum number of iterations is reached, the training is stopped, the model is obtained and the pre-trained feature extraction network is saved.
8. The pig identity identification method by using the improved VGG16 network based on transfer learning according to claim 1, characterized in that the feature extraction of PMB-IADLDP in the Step 6 specifically comprises the following: the size of the processed images is transformed to 222×222; they are then subdivided into blocks, each of size 3×3, giving 74 sub-blocks; after the 3×3 block coding $G_i$ is obtained, the Kirsch mask operator is used to calculate the result $E_{ij}$, as shown in formula (8); the differential coding and absolute coding are carried out respectively, as shown in formula (9) and formula (10); the three largest results of the differential coding are taken, the three directions with the maximum results are set to 1, and the other directions are set to 0; the direction with the maximum absolute coding value is set to 1, and the other directions are set to 0; the final PMB-IADLDP feature extraction result is obtained by weighted fusion of the two results, and finally a 74×8-dimensional matrix is obtained; the purpose of the differential coding is to relate the eight area pixels around the center pixel $g_c$ more closely with their surroundings, so as to enrich the extracted information.
The direction where the absolute value is largest indicates that the texture effect in this direction is the best. The results of absolute coding and differential coding are fused with weights, which not only retains the main texture but also reduces the redundancy of the information.
$$E_{ij} = G_i * M_j, \quad i = 1, 2, \ldots, 74, \; j = 0, 1, \ldots, 7 \tag{8}$$
wherein $G_i$ is the coding value of the $i$-th block; $M_j$ is the Kirsch mask operator in the $j$-th direction; and $*$ is the convolution operator.
The formula for the differential coding is as follows:
$$d_i = e_i - e_{i+1}, \; 0 \le i \le 6; \qquad d_7 = e_7 - e_0 \tag{9}$$
wherein $e_i$ is the $i$-th encoding around the center pixel of the block. The absolute coding formula is as follows:
$$d^{a}_{i} = |e_i| - |e_k|, \quad i = 0, 1, \ldots, 7, \; k = 3 \tag{10}$$
wherein $e_k$ is the $k$-th largest coding value in the block.
$$LDP = \sum_{i=0}^{7} s(|e_i| - |e_k|) \times 2^{i}, \quad k = 3 \tag{11}$$
$$s(x) = \begin{cases} 1, & x > 0 \\ 0, & \text{others} \end{cases} \tag{12}$$
In formula (11), LDP represents the coding value of the local direction pattern and $s(x)$ is a step function: if $x$ is greater than 0, it is set to 1, otherwise it is set to 0. Formula (12) is used to get the maximum value of the absolute coding.
9. The pig identity identification method by using the improved VGG16 network based on transfer learning according to claim 1, characterized in that the method by which the Pre-VGG16 feature extraction network is applied to two neural network models respectively in the Step 6 specifically comprises the following: the Pre-VGG16 feature extraction network is applied to two neural network models respectively; the two networks differ at the last pooling layer, one being the average pooling layer and the other the maximum pooling layer; the features extracted by the two neural networks and by PMB-IADLDP are then fused serially; finally, the fusion results are input to the fully connected layer and softmax layer for final identification; the fusion strategy is that the features to be fused are expanded and normalized to form a new feature vector whose length is equal to the sum of the lengths of the feature vectors to be connected, and the new feature vector is then sent to the neural network to obtain the identification results; the fully connected layer of the Pig-VGG16 network is changed into a convolution layer; the trained parameters of the pig identity identification network are then initialized, and the parameters are adjusted to user-defined values: dropout is set to 0.6, epoch is set to 25, and 3×3 convolution kernels are used; the cross entropy loss function and the mean square error loss function are used as the loss function; the whole training process is completed on TensorFlow 2.0; first, the code for the convolution layer, pooling layer and fully connected layer modules is written according to the modules of BN-VGG16, then debugged and saved; the BN layer code is added after the pooling layer and debugged; the data set is then input into the main program and each module is called for model training; after the number of iterations is reached, the feature extraction model is saved; the model is then migrated to the two different networks; because the feature extraction part is the same in the two networks, it can be called directly, and only the last pooling layer needs to be modified; the features extracted from the two neural networks and from PMB-IADLDP are fused, and the fusion results are input into the fully connected layer and softmax layer for final identification.
GB2219795.8A 2021-06-03 2021-06-09 Pig identity identification method by using improved vgg16 network based on transfer learning Active GB2611257B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110618450.7A CN113469356B (en) 2021-06-03 2021-06-03 Improved VGG16 network pig identity recognition method based on transfer learning
PCT/CN2021/099162 WO2022252272A1 (en) 2021-06-03 2021-06-09 Transfer learning-based method for improved vgg16 network pig identity recognition

Publications (3)

Publication Number Publication Date
GB202219795D0 GB202219795D0 (en) 2023-02-08
GB2611257A true GB2611257A (en) 2023-03-29
GB2611257B GB2611257B (en) 2024-02-28

Family

ID=77872193

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2219795.8A Active GB2611257B (en) 2021-06-03 2021-06-09 Pig identity identification method by using improved vgg16 network based on transfer learning

Country Status (3)

Country Link
CN (1) CN113469356B (en)
GB (1) GB2611257B (en)
WO (1) WO2022252272A1 (en)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114299436A (en) * 2021-12-30 2022-04-08 东北农业大学 Group-breeding pig fighting behavior identification method integrating space-time double-attention mechanism
CN114511926B (en) * 2022-01-17 2024-05-14 江苏大学 Pig feeding behavior identification method based on combination of improved support vector machine and optical flow method
CN116259145A (en) * 2022-09-26 2023-06-13 广州当康自然资源科技有限公司 Wild boar early warning and disposal system based on AI intelligent recognition
CN116138243A (en) * 2022-09-26 2023-05-23 广州当康自然资源科技有限公司 Escape-inducing type wild boar driving method and device for simulating wild boar killing scene
CN116012367B (en) * 2023-02-14 2023-09-12 山东省人工智能研究院 Deep learning-based stomach mucosa feature and position identification method
CN116647376B (en) * 2023-05-25 2024-01-26 中国人民解放军军事科学院国防科技创新研究院 Voiceprint information-based underwater acoustic network node identity authentication method
CN116881639B (en) * 2023-07-10 2024-07-23 国网四川省电力公司营销服务中心 Electricity larceny data synthesis method based on generation countermeasure network
CN116978099B (en) * 2023-07-25 2024-03-12 湖北工业大学 Lightweight sheep identity recognition model construction method and recognition model based on sheep face
CN116824512B (en) * 2023-08-28 2023-11-07 西华大学 27.5kV visual grounding disconnecting link state identification method and device
CN116994067B (en) * 2023-09-07 2024-05-07 佛山科学技术学院 Method and system for predicting fractional flow reserve based on coronary artery calcification
CN116975656B (en) * 2023-09-22 2023-12-12 唐山师范学院 Intelligent damage detection and identification method and system based on acoustic emission signals
CN117541991B (en) * 2023-11-22 2024-06-14 无锡科棒安智能科技有限公司 Intelligent recognition method and system for abnormal behaviors based on security robot
CN117392551B (en) * 2023-12-12 2024-04-02 国网江西省电力有限公司电力科学研究院 Power grid bird damage identification method and system based on bird droppings image features
CN118015338A (en) * 2024-01-12 2024-05-10 中南大学 Physical knowledge embedded aluminum electrolysis superheat degree identification method and system
CN117556715B (en) * 2024-01-12 2024-03-26 湖南大学 Method and system for analyzing degradation of intelligent ammeter in typical environment based on information fusion
CN117576573B (en) * 2024-01-16 2024-05-17 广州航海学院 Building atmosphere evaluation method, system, equipment and medium based on improved VGG16 model
CN117934962B (en) * 2024-02-06 2024-07-02 青岛兴牧畜牧科技发展有限公司 Pork quality classification method based on reference color card image correction
CN117911829B (en) * 2024-03-15 2024-05-31 山东商业职业技术学院 Point cloud image fusion method and system for vehicle navigation
CN118196908B (en) * 2024-04-23 2024-08-16 淮阴工学院 Personnel dangerous behavior identification method and system for working area of transformer substation
CN118135566B (en) * 2024-05-06 2024-07-02 苏州宝丽迪材料科技股份有限公司 Semi-supervised learning fiber master batch electron microscope image aggregation structure area identification method
CN118279671B (en) * 2024-05-08 2024-09-13 北京弘象科技有限公司 Satellite inversion cloud classification method, device, electronic equipment and computer storage medium
CN118171049B (en) * 2024-05-13 2024-07-16 西南交通大学 Big data-based battery management method and system for edge calculation
CN118172636B (en) * 2024-05-15 2024-07-23 乐麦信息技术(杭州)有限公司 Method and system for adaptively adjusting image text and non-image patterns in batches

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110414626A (en) * 2019-08-06 2019-11-05 广东工业大学 A kind of pig variety ecotype method, apparatus and computer readable storage medium
CN111178197A (en) * 2019-12-19 2020-05-19 华南农业大学 Mass R-CNN and Soft-NMS fusion based group-fed adherent pig example segmentation method
CN111241933A (en) * 2019-12-30 2020-06-05 南京航空航天大学 Pig farm target identification method based on universal countermeasure disturbance
US20200285896A1 (en) * 2019-03-09 2020-09-10 Tongji University Method for person re-identification based on deep model with multi-loss fusion training strategy
CN111666838A (en) * 2020-05-22 2020-09-15 吉林大学 Improved residual error network pig face identification method


Also Published As

Publication number Publication date
CN113469356B (en) 2024-06-07
GB2611257B (en) 2024-02-28
CN113469356A (en) 2021-10-01
WO2022252272A1 (en) 2022-12-08
GB202219795D0 (en) 2023-02-08
