CN113469356B - Improved VGG16 network pig identity recognition method based on transfer learning - Google Patents


Info

Publication number
CN113469356B
CN113469356B (application CN202110618450.7A)
Authority
CN
China
Prior art keywords
network
layer
vgg16
improved
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110618450.7A
Other languages
Chinese (zh)
Other versions
CN113469356A (en)
Inventor
朱伟兴 (Zhu Weixing)
汤志烨 (Tang Zhiye)
李新城 (Li Xincheng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu University
Original Assignee
Jiangsu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu University filed Critical Jiangsu University
Priority to CN202110618450.7A
Priority to PCT/CN2021/099162 (WO2022252272A1)
Priority to GB2219795.8A (GB2611257B)
Publication of CN113469356A
Application granted
Publication of CN113469356B
Legal status: Active

Links

Classifications

    • G06N3/0464 - Convolutional networks [CNN, ConvNet]
    • G06N3/08 - Learning methods
    • G06N3/082 - Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N3/084 - Backpropagation, e.g. using gradient descent
    • G06N3/096 - Transfer learning
    • G06N3/045 - Combinations of networks
    • G06N3/006 - Artificial life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G06N3/061 - Physical realisation using biological neurons, e.g. biological neurons connected to an integrated circuit
    • G06V10/82 - Image or video recognition or understanding using neural networks
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands


Abstract

The invention discloses a pig identity recognition method based on an improved VGG16 network and transfer learning. First, the processed video is extracted frame by frame to obtain a series of pictures, which are preprocessed into a data set and divided into a training set and a test set. An improved VGG16 training model, BN-VGG16, is then constructed, and a pre-trained feature extraction model, Pre-VGG16, is saved. Next comes the transfer learning step, in which the Pre-VGG16 feature extraction network obtained by source-domain training is migrated to the pig identification network Pig-VGG16. Multi-Block Improved Absolute Difference Local Direction Pattern (MB-IADLDP) features are then extracted from the resized data set and serially fused with the deep features, and pig identity recognition is finally performed. The improved VGG16 model based on transfer learning is superior to the traditional VGG16 network model in both running speed and accuracy.

Description

Improved VGG16 network pig identity recognition method based on transfer learning
Technical Field
The invention relates to artificial intelligence technology, in particular to the technical fields of transfer learning, deep learning and neural networks.
Background
With the advent of the big-data era, neural networks have also developed rapidly. The earliest neural network was the single-layer perceptron; other basic networks include the Hopfield network, linear neural networks and the BP network. Later stages of development produced the Boltzmann machine, the restricted Boltzmann machine, recurrent neural networks and others. Today the field has reached the deep neural network stage, whose main representatives are the deep belief network, the convolutional neural network, the deep residual network and the LSTM network. Deep neural networks have strong representational power but many parameters and a large computational load, so recent research has mainly aimed at reducing parameters, learning richer features and accelerating training. Neural networks are applied very widely, for example in face recognition, identity recognition and autonomous driving, which shows their flexibility and ability to adapt to many different tasks. For identity recognition, several network models can be used, such as VGG16, VGG19, AlexNet, GoogLeNet and ResNet, but the model should be chosen according to the actual situation: for pig identity recognition, the depth and computational cost of VGG16 are sufficient, and a deeper model is unnecessary. Practical experiments often face insufficient data and repeated reuse of models; transfer learning addresses exactly this situation, simplifying computation and improving operating efficiency, and it works well for pig identification.
Disclosure of Invention
The technical problem solved by the invention is to provide an improved VGG16 network pig identity recognition method based on transfer learning.
Aiming at existing deep-neural-network approaches to pig identification, the invention provides a pig identity recognition method based on an improved VGG16 network and transfer learning. In the 2014 ImageNet Large Scale Visual Recognition Challenge, the computer vision laboratory of the University of Oxford proposed the VGG convolutional neural network structure, which finished first in the localization task and second in the classification task. The VGG series therefore has great advantages in recognition and feature extraction.
The conventional VGG16 model is described first:
The conventional VGG16 model is shown in FIG. 4. It has 2 convolutional layers with 64 kernels, 2 convolutional layers with 128 kernels, 3 convolutional layers with 256 kernels, 6 convolutional layers with 512 kernels, 2 fully connected layers of 4096 neurons each and 1 fully connected layer of 1000 neurons; the input image size is fixed at 224×224×3.
Convolutional layer: the convolutional layer mimics human local perception. When the human brain recognizes a picture, it first perceives individual features and then combines them to obtain global information. In a traditional fully connected network, each neuron must connect to every pixel, so the number of weights is huge and training is difficult; in a convolutional layer, the number of weights per neuron equals the size of the convolution kernel, i.e. each neuron connects only to a corresponding local patch of pixels. This reduces the number of weights and improves training efficiency, and the kernel size and stride can be set as needed to maximize efficiency. The invention uses 3×3 convolution kernels; two stacked 3×3 kernels have the same receptive field as one 5×5 kernel. Assume a 224×224 picture, stride 1 and no padding, with the convolution output size given by (n + 2*p - f)/q + 1, where n is the picture size, p the padding, f the kernel size and q the stride. A 5×5 convolution gives 224 - 5 + 1 = 220; two successive 3×3 convolutions give 224 - 3 + 1 = 222 and then 222 - 3 + 1 = 220, so the results are the same. However, one 5×5 convolution needs 5×5×(number of channels) = 25×channels weights, while two 3×3 convolutions need 3×3×(number of channels)×2 = 18×channels, so the 5×5 convolution is clearly more expensive. Similarly, three 3×3 kernels can replace one 7×7 kernel. Overall, the 3×3 kernel has the following advantages over 5×5 and 7×7: (1) faster computation and higher efficiency; (2) the same receptive field; (3) more nonlinearity than a single large-size kernel.
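For illustration only, the output-size and weight-count arithmetic above can be checked with a short Python sketch (the helper function and variable names are ours, not part of the patent):

# Verify the receptive-field and cost arithmetic for stacked 3x3 kernels.
def conv_out(n, f, p=0, q=1):
    # Convolution output size: (n + 2p - f) / q + 1
    return (n + 2 * p - f) // q + 1

n = 224
print(conv_out(n, 5))                  # one 5x5 kernel -> 220
print(conv_out(conv_out(n, 3), 3))     # two stacked 3x3 kernels -> 220

c = 64                                 # example channel count
print(5 * 5 * c)                       # one 5x5 kernel: 25 x channels weights
print(3 * 3 * c * 2)                   # two 3x3 kernels: 18 x channels weights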
Pooling layer: the pooling layer is generally placed after a convolutional layer to reduce dimensionality. After convolution the network has extracted a large amount of feature information, much of it similar and mutually redundant; keeping all of it greatly increases redundancy and computational difficulty, which the pooling layer resolves. There are two main pooling methods, max pooling and average pooling: max pooling better preserves the texture information of the image, while average pooling preserves its local spatial information.
Fully connected layer: the fully connected layer is usually placed last and mainly performs feature weighting. In the invention, the last fully connected layer is replaced by a convolutional layer; the replacement rule is to set the kernel size to the size of the input feature map, so that pictures of any size can be accepted and the CNN shares a large amount of computation, improving the running efficiency of the whole network.
The invention also adds a BN layer after each max-pooling layer. The BN layer has the following advantages: (1) it accelerates training, so the network can be trained with a larger learning rate; (2) it improves the generalization ability of the network; (3) it is essentially a normalization layer, so it can replace the local response normalization layer.
As deep learning is applied more widely, the demand for precision rises, but high precision depends on large amounts of labelled data or images, and labelling is very time-consuming and labour-intensive. Transfer learning solves this problem well and has therefore received more and more attention. The approach used here is feature-based transfer: the main focus is on finding common feature representations between the source domain and the target domain and then using these features for knowledge transfer.
The technical scheme of the invention is as follows:
an improved VGG16 network pig identity recognition method based on transfer learning comprises the following steps:
(1) Extract frames from the video, and apply operations such as flipping, cropping and contrast enhancement to obtain an expanded data set; then divide it into a training set and a test set;
(2) Add a BN layer after each pooling layer to construct the improved BN-VGG16 model. On the one hand, the coarse dimensionality reduction of pooling is followed by the fine dimensionality reduction of normalization, improving the precision of the whole network; on the other hand, the network can be trained with a larger learning rate without worrying about vanishing gradients, which speeds up training. The improved BN-VGG16 model automatically extracts the deep features of the pigs, i.e. colour, texture, shape and similar characteristics, in preparation for the final identity recognition.
(3) The existing particle swarm algorithm is improved into a Gaussian improved-factor particle swarm algorithm (G-IFPSO). The first improvement adds Gaussian disturbance to the optimal particle, so that subsequent particles learn towards the neighbourhood of the optimal particle; this avoids local optima and improves recognition accuracy. The second improvement optimizes the velocity weight in real time according to the iteration count, improving the global search ability of the algorithm, and adds an offset so that the weight cannot vanish, improving recognition efficiency. The third improvement optimizes the self-learning factor and the population learning factor with the iteration count, again improving global search ability and speeding up pig identification.
(4) Train with the training set processed in step 1, use the G-IFPSO algorithm to optimize the weights of the cross-entropy loss function and the mean-squared-error loss function in the weighted fused loss function, and save the pre-trained feature extraction network Pre-VGG16. This step further improves recognition accuracy.
(5) The existing LDP algorithm is improved into a multi-block improved absolute-difference local direction pattern algorithm (Multi-Block Improved Absolute Difference Local Direction Pattern, MB-IADLDP). It extracts the traditional hand-crafted features of the pigs and provides feature information for feature fusion and identity recognition.
(6) Migrate the Pre-VGG16 feature extraction network into two different neural networks for training and fine-tune the network parameters; resize the data set to 224×224×3 and extract MB-IADLDP features from it; serially fuse (i.e. concatenate as vectors) the features extracted by the two neural networks with the MB-IADLDP features; and finally perform pig identity recognition. Identity recognition thus combines the BN-VGG16 model with transfer learning and feature fusion, and the experimental results of the model are analysed to reach conclusions.
The improvement of VGG16 in step (2) is specifically: a BN (Batch Normalization) layer is added after each max-pooling layer. The structure of the whole network is: 2 convolutional layers with 64 kernels followed by a max-pooling layer and a BN layer; 2 convolutional layers with 128 kernels followed by a max-pooling layer and a BN layer; 3 convolutional layers with 256 kernels followed by a max-pooling layer and a BN layer; 3 convolutional layers with 512 kernels followed by a max-pooling layer and a BN layer; another 3 convolutional layers with 512 kernels followed by a max-pooling layer and a BN layer; 2 fully connected layers of 4096 neurons; 1 fully connected layer of 1000 neurons; and finally a softmax layer. Usually the BN layer is added after a convolutional layer to prevent vanishing gradients; at the same time, because of its normalization it also has a certain dimensionality-reduction effect. The pooling layer performs coarse dimensionality reduction, and placing BN after it applies a fine dimensionality reduction to the coarse result, improving the precision of the whole network. In a neural network the data distribution generally differs from layer to layer, which makes convergence and training difficult; the BN layer transforms the data of each layer to a state with variance 1 and mean 0, so each layer converges easily and the convergence and training speed of the whole network increases. If the activation output of the network is large, the corresponding gradient is small, so learning slows, the gradient vanishes and training cannot continue; the BN layer can be regarded as a regularization constraint, which resolves the vanishing gradient. The layers of a network may also train towards one direction, causing overfitting; BN correlates all samples of a batch, so the output of one sample depends not only on the sample itself but also on the other samples of its batch, and since batches are drawn randomly the whole network cannot train towards a single direction, preventing overfitting. As shown in FIG. 7, the recognition accuracy with the BN layer is higher than without it. In short, the BN layer accelerates training of the whole network and improves its generalization ability, so the network can be trained with a larger learning rate without worrying about vanishing gradients.
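As an illustrative sketch only, the BN-VGG16 layout described above could be written in TensorFlow 2/Keras as follows; details not stated in the text (ReLU activations, padding, layer names) are assumptions:

import tensorflow as tf
from tensorflow.keras import layers, models

def conv_block(x, filters, n_convs):
    # n_convs 3x3 convolutions, then max pooling, then BN
    # (BN placed after pooling, as described above).
    for _ in range(n_convs):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(2)(x)
    x = layers.BatchNormalization()(x)
    return x

inputs = tf.keras.Input(shape=(224, 224, 3))
x = conv_block(inputs, 64, 2)
x = conv_block(x, 128, 2)
x = conv_block(x, 256, 3)
x = conv_block(x, 512, 3)
x = conv_block(x, 512, 3)
x = layers.Flatten()(x)
x = layers.Dense(4096, activation="relu")(x)
x = layers.Dense(4096, activation="relu")(x)
x = layers.Dense(1000)(x)
outputs = layers.Softmax()(x)
bn_vgg16 = models.Model(inputs, outputs, name="BN-VGG16")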
The improvement of the particle swarm algorithm in step (3) is threefold. First, Gaussian disturbance is added to the optimal particle, so that subsequent particles learn towards the neighbourhood of the optimal particle instead of the optimal particle itself, which solves the tendency of the traditional particle swarm algorithm to fall into local optima. Second, the velocity weight is optimized in real time according to the iteration count, improving the global search ability of the algorithm, and an offset is added so that the weight cannot vanish. Third, the self-learning factor and the population learning factor are optimized with the iteration count, improving the global search ability once more.
The training process in step (4) is specifically: the dropout value during training is set to 0.65 to prevent overfitting; the dimension of the training data is adjusted to 224×224×3; and the loss function combines a cross-entropy loss function and a mean-squared-error loss function by weighted fusion.
The cross-entropy loss function adapts well to the multi-class case, and since the features of pigs are diverse it is very suitable here. It is a logarithmic function, so even near its upper bound it maintains a high gradient and convergence speed is not affected; however, its computation is complex and not fast. The mean-squared-error loss function compensates for this shortcoming, so combining the two makes their advantages complementary and improves the running speed of the whole model. Finally, iterative training is performed; training stops when the iteration loss falls below a certain threshold, the model is obtained and the pre-trained feature extraction network is saved.
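For illustration, a weighted fusion of the two losses can be sketched in TensorFlow 2 as below; the linear form and the fixed values of the weights eta and gamma are assumptions, since in the patent these weights are found by the G-IFPSO algorithm:

import tensorflow as tf

cce = tf.keras.losses.CategoricalCrossentropy()
mse = tf.keras.losses.MeanSquaredError()

def fused_loss(y_true, y_pred, eta=0.7, gamma=0.3):
    # Q = eta * cross-entropy + gamma * MSE; eta and gamma are the weights
    # optimized by G-IFPSO (placeholder values here).
    return eta * cce(y_true, y_pred) + gamma * mse(y_true, y_pred)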
The specific process of MB-IADLDP feature extraction in step (5) is: the processed image is resized to 222×222 and divided into blocks of size 3×3, 74 blocks being obtained. For each 3×3 block the code G_i is obtained and Kirsch mask operators are applied to obtain E_i. Difference coding and absolute coding are then performed separately: the 3 largest results of the difference coding, i.e. the 3 strongest directions, are set to 1 and the other directions to 0; the strongest direction of the absolute coding is set to 1 and the others to 0. The two results are weighted and fused to obtain the final MB-IADLDP feature, yielding a matrix of dimension 74×8; the whole extraction process is shown in FIG. 2. The difference coding closely relates each of the 8 neighbourhood pixels around the centre pixel g_c to its surroundings, enriching the extracted information; since a large absolute value indicates the strongest texture response in that direction, weighted fusion of the absolute coding and difference coding results preserves the main texture while reducing information redundancy.
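A hedged NumPy sketch of this extraction follows; the Kirsch masks are the standard eight-direction operators, while the block tiling (74×74 blocks of 3×3 over the 222×222 image), the equal fusion weights and the mapping of the four absolute responses onto the eight directions are our assumptions:

import numpy as np

# Standard eight-direction Kirsch masks (N, NW, W, SW, S, SE, E, NE).
KIRSCH = np.array([
    [[ 5,  5,  5], [-3, 0, -3], [-3, -3, -3]],
    [[ 5,  5, -3], [ 5, 0, -3], [-3, -3, -3]],
    [[ 5, -3, -3], [ 5, 0, -3], [ 5, -3, -3]],
    [[-3, -3, -3], [ 5, 0, -3], [ 5,  5, -3]],
    [[-3, -3, -3], [-3, 0, -3], [ 5,  5,  5]],
    [[-3, -3, -3], [-3, 0,  5], [-3,  5,  5]],
    [[-3, -3,  5], [-3, 0,  5], [-3, -3,  5]],
    [[-3,  5,  5], [-3, 0,  5], [-3, -3, -3]],
])

def block_code(block, w_diff=0.5, w_abs=0.5):
    # Kirsch responses E_i for one 3x3 block, then difference coding
    # (3 strongest directions -> 1) and absolute coding (|e_c - e_{c+4}|,
    # strongest direction -> 1), fused with assumed weights.
    e = np.array([(block * m).sum() for m in KIRSCH])
    diff = np.zeros(8)
    diff[np.argsort(e)[-3:]] = 1
    da = np.abs(e[:4] - e[4:])
    absc = np.zeros(8)
    absc[np.argmax(da)] = 1
    return w_diff * diff + w_abs * absc

def mb_iadldp(img):
    # img: 2-D array already resized to 222x222; returns one 8-value
    # code per 3x3 block.
    blocks = img.reshape(74, 3, 74, 3).swapaxes(1, 2)
    return np.array([[block_code(b) for b in row] for row in blocks])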
The application of transfer learning to neural-network feature fusion in step (6) is specifically described as follows. The BN-VGG16 feature extraction network is applied to the pig identification network Pig-VGG16: the Pre-VGG16 network is migrated by transfer learning into two neural network models that differ only in the last pooling layer, one using an average-pooling layer and the other a max-pooling layer. The max-pooling layer better preserves the texture information of the image while the average-pooling layer preserves its local spatial information, so combining the two improves feature extraction precision and therefore recognition precision. The features extracted by the two neural networks and the MB-IADLDP features are serially fused, and the fused result is finally input to the fully connected layer and softmax layer for identification.

The fusion strategy is to flatten the features to be fused, normalize each of them, and concatenate them into a new feature vector whose length equals the sum of the lengths of the vectors being joined; the new vector is then fed into the neural network to obtain the final recognition result (a minimal sketch of this strategy follows the list of advantages below). The fully connected layer of Pig-VGG16 is changed into a convolutional layer, so the dimension of the input picture is not restricted and pictures of different sizes can be processed. Migrating the Pre-VGG16 network into the pig identification network Pig-VGG16 means the feature extraction network does not have to be retrained; once trained it can be reused, improving efficiency. The trained parameters of the pig identification network are initialized and adjusted to custom values, i.e. the parameters of the identification network at the historical training moment: dropout is set to 0.6, epoch to 25, the convolution kernel size is 3×3, and the loss function uses the cross-entropy and mean-squared-error losses. The training process is divided into at least two periods, with parameter adjustment between adjacent periods.

The whole process is completed on TensorFlow 2.0. The convolutional-layer, pooling-layer and fully-connected-layer modules are written out according to the modules of BN-VGG16, debugged and saved; program code for the BN layer is then added after the pooling layer and debugged; the main program then inputs the data set and calls each module for model training; after the iteration count is reached, the feature extraction part of the model is saved; it is then migrated to two different networks, and since the feature extraction parts are identical it can be invoked directly, with only the last pooling layer needing modification; the features extracted by the two neural networks and by MB-IADLDP are fused and the result is input to the fully connected layer and softmax layer for final identification. The difference between this method and the traditional VGG16 and BN-VGG16 in pig identification accuracy is observed to obtain the final comparison. Compared with existing methods, the method has the following obvious advantages:
(1) A BN layer is added after each max-pooling layer, which accelerates training of the whole network, allows a larger learning rate, and improves the generalization ability of the network.
(2) The loss function is a weighted fusion of the cross-entropy loss and the mean-squared-error loss, with the weights optimized by the G-IFPSO algorithm; the optimal weights are obtained by iteration. The particle swarm algorithm improves the velocity weight and the elite particle, and Gaussian disturbance is added, so the weight keeps changing and cannot vanish, the global search ability improves, and the tendency to fall into local optima is resolved.
(3) Two neural networks differing mainly in the pooling layer are fused. The max-pooling layer better preserves texture information and the average-pooling layer preserves local spatial information; their combination improves feature extraction precision and therefore recognition precision.
(4) A transfer learning strategy is adopted: the feature extraction module of VGG16 is migrated to the pig identification network Pig-VGG16, improving the efficiency of the whole network and avoiding repeated training of the module, saving time.
(5) The final fully connected layer is replaced by a convolutional layer, so the whole network can accept pictures of different scales, achieving scale freedom.
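The serial fusion strategy referenced above can be sketched as follows (NumPy; the function name and the choice of L2 normalization are assumptions):

import numpy as np

def serial_fuse(*feature_vectors):
    # Flatten, normalize and concatenate; the fused length equals the sum
    # of the lengths of the vectors being joined.
    parts = []
    for f in feature_vectors:
        v = np.ravel(f).astype(np.float64)
        v = v / (np.linalg.norm(v) + 1e-12)
        parts.append(v)
    return np.concatenate(parts)

# Example: fused = serial_fuse(deep_feats_max, deep_feats_avg, mb_iadldp_feats)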
Drawings
To illustrate the concrete implementation steps and experimental principles of the invention more clearly, the drawings required by the invention are briefly described below:
FIG. 1 is a flow chart of an experimental method;
FIG. 2 is a MB-IADLDP feature extraction process;
FIG. 3 is a Kirsch mask operator;
FIG. 4 is a conventional VGG16 model;
FIG. 5 is a modified VGG16 model BN-VGG16;
FIG. 6 is a VGG16 model based on a transfer learning method;
FIG. 7 is a graph showing experimental comparison after adding BN layer;
FIG. 8 is a graph comparing experimental results.
Detailed Description
The following is a detailed description of specific examples in conjunction with the above figures.
An improved VGG16 network pig identity recognition method based on transfer learning comprises the following steps:
(1) Extract frames from the video, and apply operations such as flipping, cropping and contrast enhancement to obtain an expanded data set; then divide it into a training set and a test set;
(2) Add a BN layer after each pooling layer to construct the improved BN-VGG16 model. On the one hand, the coarse dimensionality reduction of pooling is followed by the fine dimensionality reduction of normalization, improving the precision of the whole network; on the other hand, the network can be trained with a larger learning rate without worrying about vanishing gradients, which speeds up training. The improved BN-VGG16 model automatically extracts the deep features of the pigs, i.e. colour, texture, shape and similar characteristics, in preparation for the final identity recognition.
(3) The existing particle swarm algorithm is improved into a Gaussian improved-factor particle swarm algorithm (G-IFPSO). The first improvement adds Gaussian disturbance to the optimal particle, so that subsequent particles learn towards the neighbourhood of the optimal particle; this avoids local optima and improves recognition accuracy. The second improvement optimizes the velocity weight in real time according to the iteration count, improving the global search ability of the algorithm, and adds an offset so that the weight cannot vanish, improving recognition efficiency. The third improvement optimizes the self-learning factor and the population learning factor with the iteration count, again improving global search ability and speeding up pig identification.
(4) Train with the training set processed in step 1, use the G-IFPSO algorithm to optimize the weights of the cross-entropy loss function and the mean-squared-error loss function in the weighted fused loss function, and save the pre-trained feature extraction network Pre-VGG16. This step further improves recognition accuracy.
(5) The existing LDP algorithm is improved into a multi-block improved absolute-difference local direction pattern algorithm (Multi-Block Improved Absolute Difference Local Direction Pattern, MB-IADLDP). It extracts the traditional hand-crafted features of the pigs and provides feature information for feature fusion and identity recognition.
(6) Migrate the Pre-VGG16 feature extraction network into two different neural networks for training and fine-tune the network parameters; resize the data set to 224×224×3 and extract MB-IADLDP features from it; serially fuse (i.e. concatenate as vectors) the features extracted by the two neural networks with the MB-IADLDP features; and finally perform pig identity recognition. Identity recognition thus combines the BN-VGG16 model with transfer learning and feature fusion, and the experimental results of the model are analysed to reach conclusions.
Step (1) specifically includes: first, the video is extracted frame by frame to obtain pictures; the pictures are then preprocessed, i.e. the images undergo horizontal flipping and random-direction flipping, gamma transformation, histogram equalization, logarithmic transformation, denoising and noise addition to expand the data set, which grows from the initial 500 images to 4,900; finally, the processed data set is divided into a training set and a test set at a ratio of 6:1.
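An illustrative preprocessing sketch is given below; the OpenCV function choices and parameter values are ours, as the text only names the operations:

import cv2
import numpy as np

def augment(img):
    # img: uint8 BGR frame extracted from the video.
    out = [cv2.flip(img, 1),                                  # horizontal flip
           cv2.flip(img, int(np.random.choice([-1, 0, 1])))]  # random-direction flip
    out.append(np.uint8(255 * (img / 255.0) ** 0.8))          # gamma transformation
    yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
    yuv[..., 0] = cv2.equalizeHist(yuv[..., 0])               # histogram equalization
    out.append(cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR))
    out.append(np.uint8(255 * np.log1p(img / 255.0) / np.log(2)))  # log transformation
    out.append(cv2.fastNlMeansDenoisingColored(img))          # denoising
    noise = np.random.normal(0, 10, img.shape)
    out.append(np.uint8(np.clip(img + noise, 0, 255)))        # added noise
    return out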
The improvement of VGG16 in step (2) is specifically: a BN (Batch Normalization) layer is added after each max-pooling layer. The structure of the whole network is: 2 convolutional layers with 64 kernels followed by a max-pooling layer and a BN layer; 2 convolutional layers with 128 kernels followed by a max-pooling layer and a BN layer; 3 convolutional layers with 256 kernels followed by a max-pooling layer and a BN layer; 3 convolutional layers with 512 kernels followed by a max-pooling layer and a BN layer; another 3 convolutional layers with 512 kernels followed by a max-pooling layer and a BN layer; 2 fully connected layers of 4096 neurons; 1 fully connected layer of 1000 neurons; and finally a softmax layer. Usually the BN layer is added after a convolutional layer to prevent vanishing gradients; at the same time, because of its normalization it also has a certain dimensionality-reduction effect. The pooling layer performs coarse dimensionality reduction, and placing BN after it applies a fine dimensionality reduction to the coarse result, improving the precision of the whole network. In a neural network the data distribution generally differs from layer to layer, which makes convergence and training difficult; the BN layer transforms the data of each layer to a state with variance 1 and mean 0, so each layer converges easily and the convergence and training speed of the whole network increases. If the activation output of the network is large, the corresponding gradient is small, so learning slows, the gradient vanishes and training cannot continue; the BN layer can be regarded as a regularization constraint, which resolves the vanishing gradient. The layers of a network may also train towards one direction, causing overfitting; BN correlates all samples of a batch, so the output of one sample depends not only on the sample itself but also on the other samples of its batch, and since batches are drawn randomly the whole network cannot train towards a single direction, preventing overfitting. As shown in FIG. 7, the recognition accuracy with the BN layer is higher than without it. In short, the BN layer accelerates training of the whole network and improves its generalization ability, so the network can be trained with a larger learning rate without worrying about vanishing gradients.
The improvement of the particle swarm algorithm in step (3) adds Gaussian disturbance to the optimal particle, so that subsequent particles learn towards the neighbourhood of the optimal particle instead of the optimal particle itself, which solves the tendency of the traditional particle swarm algorithm to fall into local optima. The improved particle swarm update is given by formula (1):
p'_gm = p_gm + N(μ, σ)
v_im = w·v_im + c_1·r_1·(p_im - x_im) + c_2·r_2·(p'_gm - x_im)    (1)
x_im = x_im + v_im

p_gm - optimal value of the particle swarm;
p'_gm - optimal value of the particle swarm after disturbance;
p_im - individual optimum;
N(μ, σ) - Gaussian function, where μ is the mean and σ is the variance;
v_im - velocity component;
x_im - position component;
w - inertia weight;
c_1 - self-learning factor;
c_2 - population learning factor;
r_1, r_2 - random values between 0 and 1.
Fitness function:
F(x) = aQ + b    (2)
where a is a scalar coefficient, b is an offset, and Q is the loss function after weighted fusion, as shown in formula (3).
The particle swarm algorithm is further improved in step (3) by optimizing the velocity weight in real time according to the iteration count, which improves the global search ability of the algorithm, and by adding an offset so that the weight cannot vanish. The improved velocity weight formula is as follows:

w = (w_max - w_min)·(i_max - i)/i_max + w_min + d

w_max - maximum value of the velocity weight;
w_min - minimum value of the velocity weight;
i_max - maximum number of iterations;
i - current iteration number;
d - offset.
The particle swarm algorithm is also improved in step (3) by optimizing the self-learning factor and the population learning factor with the iteration count, which improves the global search ability of the algorithm once more. The improved learning factor formula is as follows:
i_max - maximum number of iterations;
i - current iteration number.
The training process in step (4) specifically includes: the dropout value during training is set to 0.65 to prevent overfitting; the dimension of the training data is adjusted to 224×224×3; the loss function selects the cross-entropy loss and the mean-squared-error loss, and the two are weighted and fused as shown in formula (5).
α - the final stabilized loss value of the cross-entropy loss function;
β - the final stabilized loss value of the mean-squared-error loss function;
L - the cross-entropy loss function, as shown in formula (4);
MSE - the mean-squared-error loss function, as shown in formula (5).
Let
The cross-entropy loss function is shown in formula (6):

l_u = -Σ_c y_uc·log(p_uc),  L = (1/M)·Σ_u l_u    (6)

M - number of categories;
u - the u-th category;
l_u - the loss function value of the u-th category;
y_uc - indicator variable, 1 if the category is the same as that of sample i, otherwise 0;
p_uc - the predicted probability that observed sample i belongs to this category.
The mean-squared-error loss function is shown in formula (7):

MSE = (1/n)·Σ_c (Y_c - Y'_c)²    (7)

Y_c - the value of the c-th input;
Y'_c - its predicted value.
The weights are then optimized by the G-IFPSO algorithm, whose procedure is as follows:
(1) Initializing the parameters, namely the particle position, velocity, individual optimal position, population optimal position and learning factors;
(2) Continuously updating the weight of the particle swarm algorithm with the iteration count according to formula (2);
(3) According to formula (3), the learning factor taking its current optimal value with the iteration count;
(4) Updating the position and velocity components of the particles according to formulas (1), (3) and (4);
(5) Calculating the fitness value according to formula (2);
(6) Comparing the individual extremum and global extremum of the particles and continuously replacing the optimum;
(7) If the maximum iteration count is reached, outputting the optimal solution (η, γ); otherwise returning to step (2) and continuing training.
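The optimization loop can be sketched in Python as follows; the update rules follow the standard particle swarm equations with the stated modifications, but the exact schedules for the inertia weight and learning factors are assumptions:

import numpy as np

def g_ifpso(fitness, dim, n_particles=30, i_max=100,
            w_max=0.9, w_min=0.4, d=0.05, sigma=0.1):
    # Minimizes fitness; positions here are e.g. the loss weights (eta, gamma).
    x = np.random.rand(n_particles, dim)
    v = np.zeros_like(x)
    p_best = x.copy()                              # individual optima p_im
    p_best_f = np.array([fitness(p) for p in x])
    g_best = p_best[np.argmin(p_best_f)].copy()    # swarm optimum p_gm

    for i in range(i_max):
        # Iteration-dependent inertia weight with offset d (assumed schedule).
        w = (w_max - w_min) * (i_max - i) / i_max + w_min + d
        # Iteration-dependent learning factors (assumed schedules).
        c1 = 2.0 * (i_max - i) / i_max + 0.5
        c2 = 2.0 * i / i_max + 0.5
        # Gaussian disturbance of the swarm optimum.
        g_pert = g_best + np.random.normal(0.0, sigma, dim)
        r1, r2 = np.random.rand(2)
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_pert - x)
        x = x + v
        f = np.array([fitness(p) for p in x])
        better = f < p_best_f
        p_best[better], p_best_f[better] = x[better], f[better]
        g_best = p_best[np.argmin(p_best_f)].copy()
    return g_best                                   # e.g. optimal (eta, gamma)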
The cross-entropy loss function adapts well to the multi-class case, and since the features of pigs are diverse it is very suitable here. It is a logarithmic function, so even near its upper bound it maintains a high gradient and convergence speed is not affected; however, its computation is complex and not fast. The mean-squared-error loss function compensates for this shortcoming, so combining the two makes their advantages complementary and improves the running speed of the whole model. Finally, iterative training is performed; training stops when the iteration loss falls below a certain threshold, the model is obtained and the pre-trained feature extraction network is saved.
The specific process of MB-IADLDP feature extraction in step (5) is: the processed image is resized to 222×222 and divided into blocks of size 3×3, 74 blocks being obtained. For each 3×3 block the code G_i is obtained and Kirsch mask operator calculation is performed to obtain E_i, as shown in formula (8); difference coding and absolute coding are then performed separately, as shown in formulas (9) and (10): the 3 largest results of the difference coding, i.e. the 3 strongest directions, are set to 1 and the other directions to 0; the strongest direction of the absolute coding is set to 1 and the others to 0. The two results are weighted and fused to obtain the final MB-IADLDP feature, yielding a matrix of dimension 74×8; the whole extraction process is shown in FIG. 2. The difference coding closely relates each of the 8 neighbourhood pixels around the centre pixel g_c to its surroundings, enriching the extracted information; since a large absolute value indicates the strongest texture response in that direction, weighted fusion of the absolute coding and difference coding results preserves the main texture while reducing information redundancy.
E_i = G_i·M_j,  i = 1, 2, ..., 74,  j = 0, 1, ..., 7    (8)

G_i - the coded value of the i-th block;
M_j - the Kirsch mask operator in the j-th direction.
The difference coding formula is as follows:
e_a - the a-th code around the centre pixel in the block.
The absolute coding formula is as follows:

da_c = |e_c - e_{c+4}|,  c = 0, 1, ..., 3    (10)

e_k - the k-th largest coded value in the block.
The application of transfer learning to neural-network feature fusion in step (6) is specifically described as follows. The BN-VGG16 feature extraction network is applied to the pig identification network Pig-VGG16: the Pre-VGG16 network is migrated by transfer learning into two neural network models that differ in the last pooling layer, one using an average-pooling layer and the other a max-pooling layer. The max-pooling layer better preserves the texture information of the image while the average-pooling layer preserves its local spatial information, so combining the two improves feature extraction precision and therefore recognition precision. The features extracted by the two neural networks and the MB-IADLDP features are serially fused, and the fused result is finally input to the fully connected layer and softmax layer for final identification.

The fusion strategy is to flatten the features to be fused, normalize each of them, and concatenate them into a new feature vector whose length equals the sum of the lengths of the vectors being joined; the new vector is then fed into the neural network to obtain the final recognition result. The fully connected layer of Pig-VGG16 is changed into a convolutional layer, so the dimension of the input picture is not restricted and pictures of different sizes can be processed. Migrating the Pre-VGG16 network into the pig identification network Pig-VGG16 means the feature extraction network does not have to be retrained; once trained it can be reused, improving efficiency. The trained parameters of the pig identification network are initialized and adjusted to custom values, i.e. the parameters of the identification network at the historical training moment: dropout is set to 0.6, epoch to 25, the convolution kernel size is 3×3, and the loss function uses the cross-entropy and mean-squared-error losses. The training process is divided into at least two periods, with parameter adjustment between adjacent periods.

The whole process is completed on TensorFlow 2.0. The convolutional-layer, pooling-layer and fully-connected-layer modules are written out according to the modules of BN-VGG16, debugged and saved; program code for the BN layer is then added after the pooling layer and debugged; the main program then inputs the data set and calls each module for model training; after the iteration count is reached, the feature extraction part of the model is saved; it is then migrated to the two different networks, and since the feature extraction parts are identical it can be invoked directly, with only the last pooling layer needing modification; the features extracted by the two neural networks and by MB-IADLDP are fused and the result is input to the fully connected layer and softmax layer for final identification. The difference between this method and the traditional VGG16 and BN-VGG16 in pig identification accuracy is observed to obtain the final comparison. As shown in FIG. 8, the recognition accuracy of the Pig-VGG16 network is the highest and reaches 0.6 from the very beginning, which the conventional VGG16 and the improved VGG16 cannot match; the Pig-VGG16 network is therefore better suited to pig identity recognition than the conventional VGG16 and the improved VGG16.
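A hedged sketch of the transfer step is given below; it assumes the saved Pre-VGG16 is a fully convolutional feature extractor (no dense layers), and the file name, head sizes and class count are placeholders:

import tensorflow as tf
from tensorflow.keras import layers, models

pre_vgg16 = tf.keras.models.load_model("pre_vgg16_features.h5")  # assumed path
pre_vgg16.trainable = False                  # train once, reuse afterwards

inputs = tf.keras.Input(shape=(None, None, 3))    # scale-free input
x = pre_vgg16(inputs)
x = layers.Conv2D(4096, 7, activation="relu")(x)  # fully connected layer recast as conv
x = layers.Dropout(0.6)(x)                        # dropout = 0.6, as stated above
x = layers.Conv2D(10, 1)(x)                       # assumed number of pig identities
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Softmax()(x)
pig_vgg16 = models.Model(inputs, outputs, name="Pig-VGG16")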
The above embodiments merely illustrate the invention and demonstrate its feasibility; they are illustrative, not exhaustive.

Claims (9)

1. A pig identity recognition method based on an improved VGG16 network and transfer learning, characterized by comprising the following steps:
Step 1, extracting frames from the video, performing flipping, cropping and contrast-enhancement operations to obtain an expanded data set, and then dividing it into a training set and a test set;
step 2, adding a BN layer after each pooling layer to construct a BN-VGG16 model after improving a network layer;
step 3, improving the particle swarm algorithm into a Gaussian-improved factor particle swarm algorithm G-IFPSO;
Step 4, training by using the training set processed in the step 1, optimizing weights of a cross entropy loss function and a mean square error loss function in the weighted fusion loss function by adopting a G-IFPSO algorithm, and storing a Pre-trained feature extraction network Pre-VGG16;
Step 5, adopting a multi-block improved absolute value differential local direction mode algorithm for extracting the traditional characteristics of the pigs, and providing characteristic information for characteristic fusion and identification of the pigs;
Step 6, respectively migrating the Pre-VGG16 feature extraction network into two different neural networks for training, finely adjusting the network parameters, adjusting the data set to 224×224×3, extracting multi-block improved absolute-difference local direction pattern (MB-IADLDP) features from the adjusted data set, serially fusing the features extracted by the two neural networks with the MB-IADLDP features, namely vector fusion, and finally carrying out pig identity recognition.
2. The method for identifying the identity of the improved VGG16 network pig based on transfer learning of claim 1, wherein step 1 specifically comprises: firstly, extracting the video frame by frame to obtain pictures; then preprocessing the obtained pictures, namely performing horizontal flipping and random-direction flipping, gamma transformation, histogram equalization, logarithmic transformation, denoising and noise addition on the images to expand the data set, the data set growing from the initial 500 images to 4,900; finally, dividing the processed data set into a training set and a test set at a ratio of 6:1.
3. The method for identifying the identity of the improved VGG16 network pig based on transfer learning of claim 1, wherein constructing the BN-VGG16 model with the improved network layer in step 2 specifically comprises: each max-pooling layer is followed by a BN layer, and the structure of the whole network is 2 convolutional layers containing 64 convolutional kernels, followed by a max-pooling layer and a BN layer, 2 convolutional layers containing 128 convolutional kernels, followed by a max-pooling layer and a BN layer, 3 convolutional layers containing 256 convolutional kernels, followed by a max-pooling layer and a BN layer, 3 convolutional layers containing 512 convolutional kernels, followed by a max-pooling layer and a BN layer, another 3 convolutional layers containing 512 convolutional kernels, followed by a max-pooling layer and a BN layer, 2 fully connected layers containing 4096 neurons, 1 fully connected layer containing 1000 neurons, and finally a softmax layer.
4. The method for identifying the identity of the improved VGG16 network pig based on transfer learning of claim 1, wherein in step 3 the particle swarm algorithm is improved by adding Gaussian disturbance to the optimal particle, and the improved particle swarm algorithm has the following formula:
p_gm - optimal value of the particle swarm;
p'_gm - optimal value of the particle swarm after disturbance;
p_im - individual optimum;
N(μ, σ) - Gaussian function, where μ is the mean and σ is the variance;
v_im - velocity component;
x_im - position component;
w - inertia weight;
c_1 - self-learning factor;
c_2 - population learning factor;
r_1, r_2 - random values between 0 and 1;
Fitness function:
F(x)=aQ+b (2)
Where a is a scalar coefficient, b is an offset, and Q is a weighted fused loss function.
5. The method for identifying the identity of the improved VGG16 network pig based on transfer learning of claim 4, wherein in step 3 the particle swarm algorithm is improved by optimizing the velocity weight in real time according to the iteration count and adding an offset so that the weight cannot vanish, the improved velocity weight formula being as follows:
i_max - maximum number of iterations; w_max - maximum value of the velocity weight; w_min - minimum value of the velocity weight; i - current iteration number; d - offset.
6. The method for identifying the identity of the improved VGG16 network pig based on transfer learning of claim 5, wherein in step 3 the particle swarm algorithm is improved by optimizing the self-learning factor and the population learning factor, i.e. the learning factors are optimized with the iteration count, the improved learning factor formula being as follows:
i_max - maximum number of iterations;
i - current iteration number.
7. The method for identifying the identity of the improved VGG16 network pig based on transfer learning of claim 6, wherein training with the training set processed in step 1 in step 4 specifically comprises: the dropout value during training is set to 0.65 to prevent overfitting; the dimension of the training data is adjusted to 224×224×3; the loss function selects a cross-entropy loss function and a mean-squared-error loss function, and the two are weighted and fused, the weighting formula being shown in formula (5):
α - the final stabilized loss value of the cross-entropy loss function;
β - the final stabilized loss value of the mean-squared-error loss function;
L - the cross-entropy loss function, as shown in formula (4);
MSE - the mean-squared-error loss function, as shown in formula (5);
Let
the cross-entropy loss function is shown in formula (6):
M - number of categories;
u - the u-th category;
l_u - the loss function value of the u-th category;
y_uc - indicator variable, 1 if the category is the same as that of sample i, otherwise 0;
p_uc - the predicted probability that observed sample i belongs to this category;
the mean-squared-error loss function is shown in formula (7):
Y_c - the value of the c-th input;
Y'_c - its predicted value;
the weight is optimized by adopting a G-IFPSO algorithm, and the optimization algorithm process is as follows:
(1) Initializing parameters, namely the position, speed, individual optimal position, population optimal position and learning factor of the particles;
(2) Continuously updating the weight of the particle swarm algorithm along with the iteration times according to the formula (2);
(3) According to the formula (3), the learning factor obtains the current optimal value along with the iteration times;
(4) Updating the position and velocity components of the particles according to equations (1) (3) (4);
(5) Calculating a fitness value according to formula (2);
(6) Comparing the individual extremum and the global extremum of the particles, and continuously replacing the optimum value;
(7) If the maximum iteration count is reached, outputting the optimal solution (η, γ); otherwise returning to step (2) and continuing training;
And finally, performing iterative training, stopping training when the iteration loss value is smaller than a certain threshold value, obtaining a model and storing a pre-trained feature extraction network.
8. The identification method for improving VGG16 network pigs based on transfer learning of claim 1, wherein the specific process of extracting MB-IADLDP features in the step 6 is as follows: performing size transformation on the processed image to 222 x 222, then partitioning the processed image into blocks, wherein the size of each block is 3*3, 74 blocks are obtained in total, the code G i of the block of 3*3 is obtained, kirsch mask operator calculation is performed to obtain E i, as shown in a formula (8), difference coding and absolute coding are respectively performed, as shown in a formula (9) and a formula (10), the maximum 3 results of the difference coding are obtained, namely, the maximum 3 directions of the obtained results are set to 1, the other directions are set to 0, the maximum direction of the absolute coding is set to 1, and the other directions are set to 0; the two obtained results are weighted and fused to obtain the final MB-IADLDP characteristic extraction result, and finally a matrix with 74 x 8 dimensions is obtained, and the difference value coding is used for enabling 8 field pixels around the center pixel g c to be respectively and tightly connected with the surrounding, so that the extraction information is enriched; because the direction with large absolute value indicates that the texture effect of the direction is the best, the results of absolute value coding and difference coding are subjected to weighted fusion, so that the main texture is reserved, and the information redundancy is reduced;
E_i = G_i * M_j,  i = 1, 2, ..., 74,  j = 0, 1, ..., 7    (8)
G_i - the coded value of the i-th block;
M_j - the Kirsch mask operator in the j-th direction;
The difference coding formula is shown in formula (9):
e_a - the a-th coded value around the center pixel in the block;
The absolute coding formula is as follows:
da_c = |e_c - e_{c+4}|,  c = 0, 1, ..., 3    (10)
e_k - the k-th largest coded value in the block;
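As a minimal sketch of the MB-IADLDP extraction of claim 8, assuming the standard 8-direction Kirsch compass masks: the mask ordering, the choice of which 74 blocks are coded, and the OR-style fusion of the two codes are illustrative assumptions, while the top-3 difference-coding rule, the absolute coding of formula (10), and the 74*8 output follow the claim.

```python
import numpy as np

# Standard 8-direction Kirsch compass masks M_0..M_7 (a common convention;
# the patent's exact operator ordering is assumed, not quoted from it).
KIRSCH = [np.array(m) for m in (
    [[ 5,  5,  5], [-3, 0, -3], [-3, -3, -3]],   # M0
    [[ 5,  5, -3], [ 5, 0, -3], [-3, -3, -3]],   # M1
    [[ 5, -3, -3], [ 5, 0, -3], [ 5, -3, -3]],   # M2
    [[-3, -3, -3], [ 5, 0, -3], [ 5,  5, -3]],   # M3
    [[-3, -3, -3], [-3, 0, -3], [ 5,  5,  5]],   # M4
    [[-3, -3, -3], [-3, 0,  5], [-3,  5,  5]],   # M5
    [[-3, -3,  5], [-3, 0,  5], [-3, -3,  5]],   # M6
    [[-3,  5,  5], [-3, 0,  5], [-3, -3, -3]],   # M7
)]

def block_code(block):
    """8-bit code for one 3*3 block: Kirsch responses per formula (8),
    difference coding (3 strongest directions -> 1) and absolute coding
    (largest |e_c - e_{c+4}| pair -> 1, formula (10)), fused by OR here
    as an assumed stand-in for the patent's weighted fusion."""
    e = np.array([np.sum(block * m) for m in KIRSCH])   # E_i = G_i * M_j  (8)
    diff = np.zeros(8, dtype=np.uint8)
    diff[np.argsort(e)[-3:]] = 1                        # 3 strongest directions
    da = np.abs(e[:4] - e[4:])                          # da_c = |e_c - e_{c+4}|
    absc = np.zeros(8, dtype=np.uint8)
    absc[int(np.argmax(da))] = 1                        # dominant opposite pair
    return np.maximum(diff, absc)                       # illustrative fusion

def mb_iadldp(image):
    """Code 74 of the 3*3 blocks of a 222*222 grayscale image into the
    74*8 feature matrix of claim 8; which 74 blocks the patent selects
    is not specified, so row-major order is assumed here."""
    blocks = [image[r:r + 3, c:c + 3]
              for r in range(0, 222, 3) for c in range(0, 222, 3)]
    feats = [block_code(b.astype(np.float64)) for b in blocks[:74]]
    return np.stack(feats)                              # shape (74, 8)
```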
9. The improved VGG16 network pig identity recognition method based on transfer learning of claim 1, wherein in step 6 the Pre-VGG16 feature extraction network is transferred to two different neural networks for training, specifically as follows: the BN-VGG16 feature extraction network is applied to the pig identification network Pig-VGG16, and the Pre-VGG16 network is transferred to two neural network models respectively, the two networks differing only in the last pooling layer, one using a mean pooling layer and the other a maximum pooling layer; the features extracted by the two neural networks and the features extracted by MB-IADLDP are serially fused, and the fused result is finally input to the full connection layer and the softmax layer for final identification; the fusion strategy is to flatten the features to be fused, normalize each of them, and then concatenate them into a new feature vector, the length of the new feature vector being equal to the sum of the lengths of the feature vectors being connected, after which the new feature vector is fed into the neural network to obtain the final recognition result; the full connection layer of Pig-VGG16 is changed into a convolution layer; the trained parameters of the pig identification network are initialized and adjusted to self-defined values, namely the parameters of the identification network at the historical training moment: dropout is set to 0.6, epoch is set to 25, the convolution kernels are of size 3*3, and the loss function uses the cross entropy loss function and the mean square error loss function; the training process is divided into at least two periods, and the parameters are adjusted between adjacent periods; the whole process is completed on TensorFlow 2.0: the convolution layer, pooling layer and full connection layer modules are written according to the modules of BN-VGG16, debugged and saved; the program code of the BN layer is then added after the pooling layer and debugged; the main program then loads the dataset and calls each module to perform model training; after the number of iterations is reached, the feature extraction part of the model is saved; it is then migrated to the two different networks, and since the feature extraction parts are identical they can be invoked directly, only the last pooling layer needing to be modified; the features extracted by the two neural networks and by MB-IADLDP are fused, and the fusion result is input to the full connection layer and the softmax layer for final identification.
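A minimal sketch of the two-branch transfer and serial fusion of claim 9 on TensorFlow 2.x / Keras: the convolutional stub stands in for the saved Pre-VGG16 feature extraction network, and the layer sizes and class count are assumptions; the elements taken from the claim are the shared extraction part, two branches that differ only in the last pooling layer (mean versus maximum), normalization followed by concatenation with the flattened 74*8 MB-IADLDP features, dropout of 0.6, and a softmax head.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def feature_extractor():
    """Stand-in for the saved Pre-VGG16 feature extraction network;
    in the patent this is loaded from the pre-training of claim 7."""
    inp = layers.Input(shape=(224, 224, 3))
    x = inp
    for filters in (64, 128, 256):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D(2)(x)
        x = layers.BatchNormalization()(x)   # BN after pooling, as in BN-VGG16
    return Model(inp, x, name="pre_vgg16_stub")

def fused_recognizer(n_classes):
    img = layers.Input(shape=(224, 224, 3), name="image")
    texture = layers.Input(shape=(74 * 8,), name="mb_iadldp")  # flattened 74*8
    shared = feature_extractor()             # identical extraction part
    feats = shared(img)
    # Two branches differing only in the last pooling layer.
    branch_avg = layers.Flatten()(layers.AveragePooling2D(2)(feats))
    branch_max = layers.Flatten()(layers.MaxPooling2D(2)(feats))
    # Serial fusion: normalize each feature vector, then concatenate.
    norm = lambda t: tf.math.l2_normalize(t, axis=-1)
    fused = layers.Concatenate()([
        layers.Lambda(norm)(branch_avg),
        layers.Lambda(norm)(branch_max),
        layers.Lambda(norm)(texture),
    ])
    out = layers.Dense(n_classes, activation="softmax")(layers.Dropout(0.6)(fused))
    return Model([img, texture], out, name="pig_vgg16_sketch")

model = fused_recognizer(n_classes=10)       # assumed number of pig identities
model.compile(optimizer="adam", loss="categorical_crossentropy")
```

The concatenation yields a vector whose length is the sum of the three normalized feature lengths, matching the fusion strategy stated in the claim.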
CN202110618450.7A 2021-06-03 2021-06-03 Improved VGG16 network pig identity recognition method based on transfer learning Active CN113469356B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202110618450.7A CN113469356B (en) 2021-06-03 2021-06-03 Improved VGG16 network pig identity recognition method based on transfer learning
PCT/CN2021/099162 WO2022252272A1 (en) 2021-06-03 2021-06-09 Transfer learning-based method for improved vgg16 network pig identity recognition
GB2219795.8A GB2611257B (en) 2021-06-03 2021-06-09 Pig identity identification method by using improved vgg16 network based on transfer learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110618450.7A CN113469356B (en) 2021-06-03 2021-06-03 Improved VGG16 network pig identity recognition method based on transfer learning

Publications (2)

Publication Number Publication Date
CN113469356A CN113469356A (en) 2021-10-01
CN113469356B true CN113469356B (en) 2024-06-07

Family

ID=77872193

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110618450.7A Active CN113469356B (en) 2021-06-03 2021-06-03 Improved VGG16 network pig identity recognition method based on transfer learning

Country Status (3)

Country Link
CN (1) CN113469356B (en)
GB (1) GB2611257B (en)
WO (1) WO2022252272A1 (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114299436A (en) * 2021-12-30 2022-04-08 东北农业大学 Group-breeding pig fighting behavior identification method integrating space-time double-attention mechanism
CN114511926B (en) * 2022-01-17 2024-05-14 江苏大学 Pig feeding behavior identification method based on combination of improved support vector machine and optical flow method
CN116259145A (en) * 2022-09-26 2023-06-13 广州当康自然资源科技有限公司 Wild boar early warning and disposal system based on AI intelligent recognition
CN116012367B (en) * 2023-02-14 2023-09-12 山东省人工智能研究院 Deep learning-based stomach mucosa feature and position identification method
CN116647376B (en) * 2023-05-25 2024-01-26 中国人民解放军军事科学院国防科技创新研究院 Voiceprint information-based underwater acoustic network node identity authentication method
CN116881639A (en) * 2023-07-10 2023-10-13 国网四川省电力公司营销服务中心 Electricity larceny data synthesis method based on generation countermeasure network
CN116978099B (en) * 2023-07-25 2024-03-12 湖北工业大学 Lightweight sheep identity recognition model construction method and recognition model based on sheep face
CN116824512B (en) * 2023-08-28 2023-11-07 西华大学 27.5kV visual grounding disconnecting link state identification method and device
CN116994067B (en) * 2023-09-07 2024-05-07 佛山科学技术学院 Method and system for predicting fractional flow reserve based on coronary artery calcification
CN116975656B (en) * 2023-09-22 2023-12-12 唐山师范学院 Intelligent damage detection and identification method and system based on acoustic emission signals
CN117541991B (en) * 2023-11-22 2024-06-14 无锡科棒安智能科技有限公司 Intelligent recognition method and system for abnormal behaviors based on security robot
CN117392551B (en) * 2023-12-12 2024-04-02 国网江西省电力有限公司电力科学研究院 Power grid bird damage identification method and system based on bird droppings image features
CN117556715B (en) * 2024-01-12 2024-03-26 湖南大学 Method and system for analyzing degradation of intelligent ammeter in typical environment based on information fusion
CN117576573B (en) * 2024-01-16 2024-05-17 广州航海学院 Building atmosphere evaluation method, system, equipment and medium based on improved VGG16 model
CN117911829B (en) * 2024-03-15 2024-05-31 山东商业职业技术学院 Point cloud image fusion method and system for vehicle navigation

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110414626A (en) * 2019-08-06 2019-11-05 广东工业大学 A kind of pig variety ecotype method, apparatus and computer readable storage medium
CN111178197A (en) * 2019-12-19 2020-05-19 华南农业大学 Mass R-CNN and Soft-NMS fusion based group-fed adherent pig example segmentation method
CN111241933A (en) * 2019-12-30 2020-06-05 南京航空航天大学 Pig farm target identification method based on universal countermeasure disturbance
CN111666838A (en) * 2020-05-22 2020-09-15 吉林大学 Improved residual error network pig face identification method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110008842A (en) * 2019-03-09 2019-07-12 同济大学 A kind of pedestrian's recognition methods again for more losing Fusion Model based on depth

Also Published As

Publication number Publication date
GB202219795D0 (en) 2023-02-08
CN113469356A (en) 2021-10-01
WO2022252272A1 (en) 2022-12-08
GB2611257A (en) 2023-03-29
GB2611257B (en) 2024-02-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant