GB2611257A - Transfer learning-based method for improved VGG16 network pig identity recognition - Google Patents

Transfer learning-based method for improved VGG16 network pig identity recognition

Info

Publication number
GB2611257A
GB2611257A
Authority
GB
United Kingdom
Prior art keywords
layer
improved
vgg16
loss function
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB2219795.8A
Other versions
GB2611257B (en)
GB202219795D0 (en)
Inventor
Zhu Weixing
Tang Zhiye
Li Xincheng
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu University
Original Assignee
Jiangsu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu University filed Critical Jiangsu University
Publication of GB202219795D0 publication Critical patent/GB202219795D0/en
Publication of GB2611257A publication Critical patent/GB2611257A/en
Application granted granted Critical
Publication of GB2611257B publication Critical patent/GB2611257B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/004Artificial life, i.e. computing arrangements simulating life
    • G06N3/006Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/061Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using biological neurons, e.g. biological neurons connected to an integrated circuit
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/096Transfer learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Neurology (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

Disclosed is a transfer learning-based method for improved VGG16 network pig identity recognition. The method comprises: first performing frame-by-frame extraction on a processed video to obtain a series of pictures, which are preprocessed into a data set that is then divided into a training set and a test set; constructing an improved VGG16 network training model, BN-VGG16; and saving a pre-trained feature extraction model, Pre-VGG16. Next is a transfer learning process: transferring the Pre-VGG16 feature extraction network obtained by source-domain training to a Pig-VGG16 network for recognizing pigs; performing multi-block improved absolute difference local direction pattern (MB-IADLDP) feature extraction on a data set that has undergone size adjustment; performing serial fusion; and finally performing identity recognition on a pig. The transfer learning-based improved VGG16 model is superior to conventional VGG16 network models in terms of operating speed and precision.

Description

TRANSFER LEARNING-BASED METHOD FOR IMPROVED VGG16
NETWORK PIG IDENTITY RECOGNITION
TECHNICAL FIELD
The present disclosure relates to artificial intelligence technology, and in particular to the technical fields of transfer learning, deep learning, and neural network.
BACKGROUND
With the rise of the era of big data, neural networks have also developed. The original neural network was only a single-layer perceptron. Basic neural networks also include the Hopfield neural network, the linear neural network and the BP neural network. Neural networks have now reached a mature stage, mainly including the deep belief network, the convolutional neural network, the deep residual network, the LSTM network and so on. Deep neural networks have powerful representation ability, but they have many parameters and a large amount of calculation. Recent research is mainly directed toward reducing the number of parameters, learning richer features and speeding up training. Neural networks are also widely used, for example in face recognition, identity recognition, driverless vehicles and so on. It can be seen that the flexibility of neural networks is very high, and they can adapt to a variety of tasks. For identity recognition, many network models can be used, such as VGG16, VGG19, AlexNet, GoogLeNet, ResNet and so on. But the selection of a model should be based on the actual situation. For pig identity identification, the depth and computational capacity of VGG16 are sufficient, so it is unnecessary to use deeper models. In actual experiment and simulation, we encounter situations where the data set is not sufficient and the model needs to be reused repeatedly. This is where transfer learning comes in. Transfer learning can reduce the amount of calculation, improve operating efficiency and achieve a good effect on the identity identification of pigs.
SUMMARY
The technical problem solved by the present disclosure is to provide a pig identity identification method using an improved VGG16 network based on transfer learning. Through continuous improvement and model optimization, the development of neural networks has reached the stage of deep neural networks, and the application of typical network models has become more extensive; the present disclosure is aimed at the existing research methods of deep neural networks for the identity identification of pigs. The present disclosure puts forward a pig identity identification method using an improved VGG16 network based on transfer learning. In 2014, at the ImageNet Large Scale Visual Recognition Challenge, the Visual Geometry Group of Oxford University proposed the VGG convolutional neural network structure, which took first place in the localization task and second place in the classification task. Therefore, VGG-series models have great advantages in identity identification and feature extraction.
A traditional VGG16 model is introduced as follows.
As shown in FIG. 4, the traditional VGG16 model has two convolution layers with 64 convolution kernels, two convolution layers with 128 convolution kernels, three convolution layers with 256 convolution kernels, six convolution layers with 512 convolution kernels, two fully connected layers with 4096 neurons and one fully connected layer with 1000 neurons. The dimension of the input image is fixed at 224×224×3.
Convolution layer: it imitates human local perception. When the human brain recognizes a picture, it perceives a certain feature in the picture and then performs a further comprehensive operation to obtain global information. Specifically, in a traditional neural network each neuron needs to connect to every pixel, with the result that the number of weights is huge and training is difficult. In a convolution layer, the number of weights of each neuron is the size of the convolution kernel; that is to say, each neuron is connected only with the corresponding part of the pixels, which reduces the number of weights and improves training efficiency. At the same time, the stride of the convolution kernel can be set as needed to maximize the efficiency of the algorithm. In the present disclosure, a 3×3 convolution kernel is used, and two stacked 3×3 convolution kernels are equivalent to one 5×5 convolution kernel. Assuming the picture is 224×224, the stride is 1 and there is no padding, then according to the convolution output-size formula (n + 2p − f)/q + 1, wherein n is the image scale, p is the padding value, f is the convolution kernel size and q is the stride, the output of a single 5×5 convolution is 220, and the output of two successive 3×3 convolutions is also 220. But the computation of the 5×5 convolution kernel is larger. Generally speaking, the 3×3 convolution kernel has the following advantages over 5×5 and 7×7 kernels: (1) fast computation speed and high efficiency; (2) the same receptive field; (3) more nonlinearity than a single large convolution kernel.
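As a quick check of the output-size arithmetic above, the short Python sketch below (the helper name is illustrative, not from the patent) evaluates (n + 2p − f)/q + 1 for one 5×5 convolution and for two stacked 3×3 convolutions:

```python
def conv_output_size(n: int, f: int, p: int = 0, q: int = 1) -> int:
    """Output side length of a convolution: (n + 2p - f) / q + 1."""
    return (n + 2 * p - f) // q + 1

n = 224  # input side length; stride 1, no padding, as in the text

# One 5x5 convolution: 224 -> 220
print(conv_output_size(n, f=5))                          # 220

# Two stacked 3x3 convolutions: 224 -> 222 -> 220
print(conv_output_size(conv_output_size(n, f=3), f=3))   # 220
```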
Pooling layer: a pooling layer generally follows a convolution layer and mainly performs dimensionality reduction. Because much of the feature information is redundant after the convolution operation, the pooling layer solves this dimensionality-reduction problem. There are two main pooling methods: maximum pooling and average pooling. The maximum pooling layer better preserves the texture information of the image, while the average pooling layer preserves the local spatial information of the image. The strategy of combining the maximum pooling layer with the average pooling layer is used in the present disclosure: replacing the maximum pooling layer with a combination of maximum and average pooling improves the accuracy of feature extraction and thus the accuracy of identification.
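One simple way to realize such a combined pooling stage is sketched below in TensorFlow (a sketch only: the patent itself realizes the combination through two parallel branches, described in Step 6, and the channel-concatenation choice here is an assumption):

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(224, 224, 64))

# Apply both pooling summaries to the same feature map.
max_branch = tf.keras.layers.MaxPooling2D(pool_size=2)(inputs)      # texture
avg_branch = tf.keras.layers.AveragePooling2D(pool_size=2)(inputs)  # local spatial info

# One plausible combination: keep both summaries by concatenating channels.
combined = tf.keras.layers.Concatenate()([max_branch, avg_branch])
model = tf.keras.Model(inputs, combined)
```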
Fully connected layer: a fully connected layer is usually placed at the end of the network and mainly performs feature weighting. In the present disclosure, the final fully connected layer is replaced by a convolution layer, with the replacement rule that the convolution kernel size is set to the size of the input feature map, so that an image of any size can be accepted as input. At the same time, the CNN shares a large amount of computation, which improves the efficiency of the whole network.
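A minimal sketch of this replacement rule, assuming a 7×7×512 feature map entering what would otherwise be a 4096-unit dense layer (the sizes are illustrative; the patent does not fix them at this point):

```python
import tensorflow as tf

# Convolutional head: the kernel size equals the input spatial size (7x7),
# so each of the 4096 filters sees the whole feature map, like a dense unit.
conv_head = tf.keras.layers.Conv2D(4096, kernel_size=7, activation="relu")

x = tf.random.normal((1, 7, 7, 512))
print(conv_head(x).shape)      # (1, 1, 1, 4096)

# Unlike a Dense layer after Flatten, the same conv head also accepts a
# larger input, producing a spatial grid of "dense" responses.
x_big = tf.random.normal((1, 9, 9, 512))
print(conv_head(x_big).shape)  # (1, 3, 3, 4096)
```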
The present disclosure also adds a BN layer after each maximum pooling layer. The BN layer has the following advantages: (1) the training speed is accelerated, so that a larger learning rate can be used to train the network; (2) the generalization ability of the network is improved; (3) the BN layer is essentially a normalization layer, so it can replace the local response normalization layer. As deep learning is applied more and more widely, the requirements for accuracy become higher and higher. However, high accuracy depends on a large amount of annotated data or images, and the annotation process is time-consuming and labor-intensive. Transfer learning can solve this problem well, so it has received more and more attention.
The technical solutions of the present disclosure are as follows.
1. A pig identity identification method by using an improved VGG16 network based on transfer learning is provided, which includes the following steps.
Step 1. Extracting pictures frame by frame from a video; obtaining expanded data sets by adjusting contrast, adding noise points, cropping and other operations; and dividing the expanded data sets into a training set and a test set.
Step 2. Adding a BN layer after each pooling layer to build a BN-VGG16 model.
Step 3. Obtaining the Gauss improved factor particle swarm optimization algorithm (G-IFPSO) by adding a Gauss improved factor to the particle swarm optimization algorithm.
Step 4. Training on the training set processed in Step 1; using the G-IFPSO algorithm to optimize a loss function, wherein the loss function is a weighted fusion of the cross entropy loss function and the mean square error loss function; and saving a pre-trained feature extraction network Pre-VGG16.
Step 5. Using a pixel multi-block method with improved absolute differential local direction pattern (PMB-IADLDP) to extract traditional features of pigs, wherein the traditional features are used for feature fusion and identity identification of pigs.
Step 6. Transferring the Pre-VGG16 feature extraction network to two different neural networks for training and fine-tuning parameters in the networks; then adjusting the datasets to 224×224×3 and using the PMB-IADLDP method to extract features from the adjusted datasets; serially fusing the features extracted by the two neural networks with the PMB-IADLDP features, that is to say, vector fusion; and finally identifying the identity of pigs.
2. Step 1 of the claim 1 specifically includes the following. Firstly, the video is extracted frame by frame to get the images. These images are then preprocessed; that is to say, by flipping the images horizontally and vertically, using gamma transform, histogram equalization and logarithmic transform, and reducing and adding noise points, the datasets are expanded and the processed data sets are obtained. The number of pictures is increased by 4900. Finally, the datasets are divided into training sets and test sets in the proportion of 6:1.
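A sketch of such an augmentation pipeline in Python with OpenCV is shown below (parameter values such as the gamma and the noise level are assumptions, not taken from the patent):

```python
import cv2
import numpy as np

def augment(image: np.ndarray) -> list:
    """Return augmented copies of a BGR uint8 image, covering the expansions
    named in the text: flips, gamma, equalization, log transform, noise."""
    out = [cv2.flip(image, 1), cv2.flip(image, 0)]        # horizontal, vertical flips

    gamma = 0.7                                           # assumed value
    table = ((np.arange(256) / 255.0) ** gamma * 255).astype(np.uint8)
    out.append(cv2.LUT(image, table))                     # gamma transform

    ycrcb = cv2.cvtColor(image, cv2.COLOR_BGR2YCrCb)      # histogram equalization
    ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])     # on the luma channel
    out.append(cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR))

    log_img = 255 * np.log1p(image.astype(np.float32)) / np.log(256.0)
    out.append(np.uint8(np.clip(log_img, 0, 255)))        # logarithmic transform

    noise = np.random.normal(0, 10, image.shape)          # assumed noise level
    out.append(np.uint8(np.clip(image + noise, 0, 255)))  # added noise points
    return out
```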
3. The improved BN-VGG16 network model built in step 2 of the claim 1 specifically includes the following. A BN (Batch Normalization) layer is added after each maximum pooling layer. The structure of the whole network consists of two convolution layers with 64 convolution kernels followed by a maximum pooling layer and a BN layer, two convolution layers with 128 convolution kernels followed by a maximum pooling layer and a BN layer, three convolution layers with 256 convolution kernels followed by a maximum pooling layer and a BN layer, three convolution layers with 512 convolution kernels followed by a maximum pooling layer and a BN layer, two fully connected layers containing 4096 neurons, one fully connected layer containing 1000 neurons and a softmax layer.
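A Keras-style sketch of this structure is shown below (it follows the enumeration above; padding and activation choices are the usual VGG conventions and are assumptions where the text is silent):

```python
import tensorflow as tf
from tensorflow.keras import layers

def bn_vgg16(num_classes: int = 1000) -> tf.keras.Model:
    """BN-VGG16 as enumerated in the text: a BN layer after each max pooling."""
    model = tf.keras.Sequential(name="BN-VGG16")
    model.add(tf.keras.Input(shape=(224, 224, 3)))
    for filters, repeats in [(64, 2), (128, 2), (256, 3), (512, 3)]:
        for _ in range(repeats):
            model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D(2))
        model.add(layers.BatchNormalization())   # BN follows each max pooling
    model.add(layers.Flatten())
    model.add(layers.Dense(4096, activation="relu"))
    model.add(layers.Dense(4096, activation="relu"))
    model.add(layers.Dense(num_classes))
    model.add(layers.Softmax())
    return model
```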
4. The improved particle swarm optimization algorithm in step 3 of the claim 1 specifically includes the following. The particle swarm optimization algorithm is improved: Gaussian perturbation is added to the optimal particle.
5. The improved particle swarm optimization algorithm in step 3 of the claim 1 specifically includes the following. The particle swarm optimization algorithm is improved: the speed weight is optimized in real time according to the number of iterations, and an offset is added so that the weight does not vanish.
6. The improved particle swarm optimization algorithm in step 3 of the claim 1 specifically includes the following. The particle swarm optimization algorithm is improved: the self-learning factor and the population learning factor need to be optimized; that is to say, the self-learning factor is optimized with the number of iterations.
7. The improved VGG16 network based on transfer learning in the claim 1 specifically includes the following. In the training process of step 4, the dropout value is set to 0.65 in order to prevent overfitting. The dimension of the training data set is adjusted to 224×224×3. The cross entropy loss function and the mean square error loss function are selected as the loss function Q, the two functions being weighted together.
The G-IFPSO algorithm is used to optimize the weight. The optimization process is as follows.
(1) Initialize the parameters; that is to say, the particle positions, velocities, individual optimal positions, population optimal position and learning factors are initialized.
(2) According to formula (3), the weight of the PSO (particle swarm optimization) is updated with the number of iterations.
(3) According to formula (4), the current optimal value of the learning factors is obtained as the number of iterations increases.
(4) According to formulas (1), (3) and (4), the position and velocity components of the particles are updated.
(5) The fitness value is calculated according to formula (2).
(6) The individual extremum and global extremum of the particles are compared, and the optimal value is replaced continuously.
(7) If the maximum number of iterations is reached, the optimal solution is output; otherwise, the process returns to step (2).
Finally, the iterative training is completed, a model is obtained and the pre-trained feature extraction network is saved.
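A compact NumPy sketch of this loop is given below (a sketch under stated assumptions: the fitness F(x) = aQ(x) + b takes a user-supplied loss Q, and all hyperparameter values shown are placeholders):

```python
import numpy as np

def g_ifpso(loss_q, dim, n_particles=30, i_max=100,
            w_min=0.4, w_max=0.9, d=0.01, sigma=0.1, a=1.0, b=0.0):
    """Sketch of G-IFPSO: decaying inertia weight with offset d (formula (3)),
    iteration-dependent learning factors (formula (4)), Gaussian perturbation
    of the population best (formula (1)); fitness F = a*Q + b (formula (2))."""
    x = np.random.uniform(-1.0, 1.0, (n_particles, dim))
    v = np.zeros_like(x)
    p_best = x.copy()
    p_best_f = np.array([a * loss_q(p) + b for p in x])
    g_best = p_best[p_best_f.argmin()].copy()

    for i in range(i_max):
        w = w_min + (i_max - i) / i_max * (w_max - w_min) + d
        c1 = 2 * (i_max - i) / i_max
        c2 = 2 * i / i_max
        g_perturbed = np.random.normal(g_best, sigma)      # Gaussian disturbance
        r1, r2 = np.random.rand(*x.shape), np.random.rand(*x.shape)
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_perturbed - x)
        x = x + v
        f = np.array([a * loss_q(p) + b for p in x])
        better = f < p_best_f
        p_best[better], p_best_f[better] = x[better], f[better]
        g_best = p_best[p_best_f.argmin()].copy()
    return g_best

# Illustrative usage: search a 1-D fusion weight (true optimum at 0.3 here).
best = g_ifpso(lambda p: (p[0] - 0.3) ** 2, dim=1)
```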
8. The feature extraction of PMB-IADLDP in step 5 of the claim 1 specifically includes the following. The size of the processed images is transformed to 222×222. These processed images are then subdivided into blocks, each of size 3×3, giving 74 sub-blocks. After the 3×3 block coding G_i is obtained, the Kirsch mask operator is used to calculate the result E_ij, as shown in formula (8). The differential coding and absolute coding are carried out respectively, as shown in formula (9) and formula (10). The three largest results of the differential coding are taken: the three directions with the maximum results are set to 1, and the other directions are set to 0. The direction with the maximum absolute coding value is set to 1, and the other directions are set to 0. The final PMB-IADLDP feature extraction result is obtained by weighted fusion of the two results. Finally, a 74×8-dimensional matrix is obtained. The purpose of the differential coding is to relate the eight area pixels around the center pixel g_c more closely with their surroundings, so as to enrich the extracted information. The direction with the highest absolute value indicates that the texture effect is best in that direction. The results of absolute coding and differential coding are fused with weights, which not only retains the main texture but also reduces the redundancy of the information.
9. The method by which the Pre-VGG16 feature extraction network is applied to two neural network models respectively in step 6 of the claim 1 specifically includes the following. The Pre-VGG16 feature extraction network is applied to two neural network models respectively. The two networks differ at the last pooling layer: one uses an average pooling layer, and the other uses a maximum pooling layer. The features extracted by the two neural networks and by PMB-IADLDP are then fused serially. Finally, the fusion result is input to a fully connected layer and a softmax layer for final identification. The fusion strategy is that the features to be fused are expanded and normalized to form a new feature vector whose length is equal to the sum of the lengths of the feature vectors being connected; the new feature vector is then sent to the neural network to obtain the identification result. The fully connected layer of the Pig-VGG16 network is changed into a convolution layer. The trained parameters of the pig identity identification network are then initialized, and the parameters are adjusted to user-defined values: dropout is set to 0.6, epoch is set to 25, and 3×3 convolution kernels are used. The cross entropy loss function and the mean square error loss function are used as the loss function. The whole training process is completed on TensorFlow 2.0. First, the code for the convolution layer, pooling layer and fully connected layer modules is written according to the modules of BN-VGG16, then debugged and saved. The BN layer code is added after the pooling layer and debugged. The data set is then input into the main program and each module is called for model training. After the number of iterations is reached, the feature extraction model is saved. The model is then migrated to the two different networks. The features extracted from the two neural networks and from PMB-IADLDP are fused, and the fusion results are input into the fully connected layer and softmax layer for final identification. Compared with the above methods, the present disclosure has the following obvious advantages.
(1) The BN layer is added after each maximum pooling layer to accelerate the training speed of the whole network, so that a larger learning rate can be used to train the network, and the generalization ability of the network is also improved.
(2) The loss function is a weighted fusion of cross entropy loss function and mean square error loss function. The weighted value is optimized by G-IFPSO algorithm, and the optimal weight can be obtained by iteration. The improvement of particle swarm optimization algorithm is the improvement of speed weight and elite particle, and Gaussian disturbance is added, so that the weight has been changing and will not disappear. Therefore the ability of global search is improved, and the problem of easily falling into local optimum is solved.
(3) Two neural networks are fused, which are different in pooling layer. The maximum pooling layer can better preserve the texture information of images, and the average pooling layer can preserve the local spatial information of images. The combination of the two can improve the accuracy of feature extraction and identification.
(4) Using the transfer learning strategy, the feature extraction module of VGG16 is transferred to the pig identity identification network (Pig-VGG16), which improves the efficiency of the whole network and saves time.
(5) The fully connected layer is replaced by a convolution layer, so that the whole network can accept images of different scales and achieves scale freedom.
BRIEF DESCRIPTION OF THE DRAWINGS
In order to more clearly explain the specific implementation steps and experimental principles of this disclosure, the required drawings are briefly explained below.
FIG. 1 Flow chart of experimental method.
FIG. 2 Feature extraction process of PMB-IADLDP.
FIG. 3 Kirsch mask operator.
FIG. 4 Traditional VGG16 model.
FIG. 5 Improved VGG16 model BN-VGG16.
FIG. 6 VGG16 model based on transfer learning method.
FIG. 7 Experimental comparison after adding BN layer.
FIG. 8 Comparison of experimental results.
DETAILED DESCRIPTION OF THE EMBODIMENTS
1. An improved VGG16 network based on transfer learning is used to identify the identity of pigs. This method includes the following steps.
Step 1. Extracting pictures frame by frame from a video; obtaining expanded data sets by adjusting contrast, adding noise points, cropping and other operations; and dividing the expanded data sets into a training set and a test set.
Step 2. Adding a BN layer after each pooling layer to build a BN-VGG16 model.
Step 3. Obtaining the Gauss improved factor particle swarm optimization algorithm (G-IFPSO) by adding a Gauss improved factor to the particle swarm optimization algorithm.
Step 4. Training on the training set processed in Step 1; using the G-IFPSO algorithm to optimize a loss function, wherein the loss function is a weighted fusion of the cross entropy loss function and the mean square error loss function; and saving a pre-trained feature extraction network Pre-VGG16.
Step 5. Using a pixel multi-block method with improved absolute differential local direction pattern (PMB-IADLDP) to extract traditional features of pigs, wherein the traditional features are used for feature fusion and identity identification of the pigs.
Step 6. Transferring the Pre-VGG16 feature extraction network to two different neural networks for training and fine-tuning parameters in the networks; then adjusting the datasets to 224×224×3 and using the PMB-IADLDP method to extract features from the adjusted datasets; serially fusing the features extracted by the two neural networks with the PMB-IADLDP features, that is to say, vector fusion; and finally identifying the identity of pigs.
2. Step 1 of the claim 1 specifically includes the following. Firstly, the images are extracted frame by frame from a video. These images are then preprocessed; that is to say, by flipping the images horizontally and vertically, using gamma transform, histogram equalization and logarithmic transform, and reducing and adding noise points, the datasets are expanded and the processed data sets are obtained. The number of images is increased by 4900. Finally, the datasets are divided into training sets and test sets in the proportion of 6:1.
3. The improved BN-VGG16 network model built in step 2 of the claim 1 specifically includes the following. A BN (Batch Normalization) layer is added after each maximum pooling layer. The structure of the whole network consists of two convolution layers with 64 convolution kernels followed by a maximum pooling layer and a BN layer, two convolution layers with 128 convolution kernels followed by a maximum pooling layer and a BN layer, three convolution layers with 256 convolution kernels followed by a maximum pooling layer and a BN layer, three convolution layers with 512 convolution kernels followed by a maximum pooling layer and a BN layer, two fully connected layers containing 4096 neurons, one fully connected layer containing 1000 neurons and a softmax layer.
4. The improved particle swarm optimization algorithm in step 3 of the claim 1 specifically includes the following. The particle swarm optimization algorithm is improved: Gaussian perturbation is added to the optimal particle. The formulas of the improved particle swarm optimization algorithm are as follows.
$$v_{im} = w v_{im} + c_1 r_1 (P_{im} - x_{im}) + c_2 r_2 (P'_{gm} - x_{im})$$
$$x_{im} = x_{im} + v_{im}$$
$$P'_{gm} = N(P_{gm}, \sigma) \tag{1}$$
wherein $P_{gm}$ is the optimal value of the particle swarm; $P'_{gm}$ is the optimal value of the particle swarm after the disturbance; $P_{im}$ is the individual optimal value; $N(\mu, \sigma)$ is a Gaussian function, where $\mu$ is the average and $\sigma$ is the variance; $v_{im}$ is the velocity component; $x_{im}$ is the location component; $w$ is the inertia weight; $c_1$ is the self-learning factor; $c_2$ is the population learning factor; and $r_1$, $r_2$ are random values between 0 and 1. The fitness function is as follows.
$$F(x) = aQ + b \tag{2}$$
wherein $a$ is the scalar coefficient; $b$ is the offset; and $Q$ is the loss function after weighted fusion.
5. The improved particle swarm optimization algorithm in step 3 of the claim 1 specifically includes the following. The particle swarm optimization algorithm is improved: the speed weight is optimized in real time as the number of iterations increases, and an offset is added so that the weight does not vanish. The improved speed weight formula is as follows.
$$w = w_{min} + \left(\frac{i_{max} - i}{i_{max}}\right)(w_{max} - w_{min}) + d \tag{3}$$
wherein $i_{max}$ is the maximum number of iterations; $w_{max}$ is the maximum value of the speed weight; $w_{min}$ is the minimum value of the speed weight; $i$ is the current number of iterations; and $d$ is the offset.
6. The improved particle swarm optimization algorithm in step 3 of the claim 1 specifically includes the following. The particle swarm optimization algorithm is improved: the self-learning factor and the population learning factor need to be optimized; that is to say, the self-learning factor is optimized as the number of iterations increases. The formulas of the improved learning factors are as follows.
$$c_1 = 2\left(\frac{i_{max} - i}{i_{max}}\right), \qquad c_2 = 2\left(\frac{i}{i_{max}}\right) \tag{4}$$
wherein $i_{max}$ is the maximum number of iterations and $i$ is the current number of iterations.
7. The improved VGG16 network based on transfer learning in the claim 1 specifically includes the following. In the training process of step 4, the dropout value is set to 0.65 in order to prevent overfitting. The dimension of the training data set is adjusted to 224×224×3. The cross entropy loss function and the mean square error loss function are weighted together as the loss function Q, as shown in formula (5).
$$Q = \frac{\beta}{\alpha + \beta} L + \frac{\alpha}{\alpha + \beta} MSE \tag{5}$$
wherein $\alpha$ is the value at which the loss of the cross entropy loss function finally stabilizes; $\beta$ is the value at which the loss of the mean square error loss function finally stabilizes; $L$ is the cross entropy loss function; and $MSE$ is the mean square error loss function. Let $\gamma = \frac{\beta}{\alpha + \beta}$, so that $Q = \gamma L + (1 - \gamma)\,MSE$.
The cross entropy loss function L is shown in formula (6).
$$L = \frac{1}{N}\sum_{i} L_i = -\frac{1}{N}\sum_{i}\sum_{k=1}^{M} y_{ik}\log(p_{ik}) \tag{6}$$
wherein $M$ is the number of categories; $L_i$ is the value of the loss function for sample $i$; $y_{ik}$ is an indicator variable (0 or 1) that is 1 if sample $i$ belongs to category $k$ and 0 otherwise; and $p_{ik}$ is the predicted probability that observation sample $i$ belongs to category $k$.
The mean square error loss function MSE is shown in formula (7).
$$MSE(y, \hat{y}) = \frac{1}{n}\sum_{c=1}^{n}(y_c - \hat{y}_c)^2 \tag{7}$$
wherein $y_c$ is the value of the $c$-th input and $\hat{y}_c$ is its predicted value.
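A minimal TensorFlow sketch of the weighted fusion in formula (5) is given below; the fusion weight gamma stands for the ratio β/(α+β), and in the method above it is the quantity that G-IFPSO searches for:

```python
import tensorflow as tf

cce = tf.keras.losses.CategoricalCrossentropy()
mse = tf.keras.losses.MeanSquaredError()

def fused_loss(gamma: float):
    """Q = gamma * L + (1 - gamma) * MSE, per formula (5)."""
    def q(y_true, y_pred):
        return gamma * cce(y_true, y_pred) + (1.0 - gamma) * mse(y_true, y_pred)
    return q

# Illustrative usage with a candidate weight proposed by G-IFPSO:
# model.compile(optimizer="adam", loss=fused_loss(0.7), metrics=["accuracy"])
```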
The G-IFPSO algorithm is used to optimize the weight. The optimization process is as follows.
(1) Initialize the parameters; that is to say, the particle positions, velocities, individual optimal positions, population optimal position and learning factors are initialized.
(2) According to formula (3), the weight of the PSO (particle swarm optimization) is updated with the number of iterations.
(3) According to formula (4), the current optimal value of the learning factors is obtained as the number of iterations increases.
(4) According to formulas (1), (3) and (4), the position and velocity components of the particles are updated.
(5) The fitness value is calculated according to formula (2).
(6) The individual extremum and global extremum of the particles are compared, and the optimal value is replaced continuously.
(7) If the maximum number of iterations is reached, the optimal solution is output; otherwise, the process returns to step (2).
Finally, the iterative training is carried out. When the iterative loss value is less than a certain threshold, the training is stopped, the model is obtained and the pre-trained feature extraction network is saved.
8. The feature extraction of PMB-IADLDP in step 5 of the claim 1 specifically includes the following. The size of the processed images is transformed to 222×222. They are then subdivided into blocks, each of size 3×3, giving 74 sub-blocks. After the 3×3 block coding $G_i$ is obtained, the Kirsch mask operator is used to calculate the result $E_{ij}$, as shown in formula (8). The differential coding and absolute coding are carried out respectively, as shown in formula (9) and formula (10). The three largest results of the differential coding are taken: the three directions with the maximum results are set to 1, and the other directions are set to 0. The direction with the maximum absolute coding value is set to 1, and the other directions are set to 0. The final PMB-IADLDP feature extraction result is obtained by weighted fusion of the two results. Finally, a 74×8-dimensional matrix is obtained. The purpose of the differential coding is to relate the eight area pixels around the center pixel $g_c$ more closely with their surroundings, so as to enrich the extracted information. The direction with the highest absolute value indicates that the texture effect is best in that direction. The results of absolute coding and differential coding are weighted, which not only retains the main texture but also reduces the information redundancy.
$$E_{ij} = G_i * M_j, \quad i = 1, 2, \ldots, 74, \; j = 0, 1, \ldots, 7 \tag{8}$$
wherein $G_i$ is the coding value of the $i$-th block; $M_j$ is the Kirsch mask operator in the $j$-th direction; and $*$ is the convolution operator.
The formula for the differential code is as follows.
$$d_i = e_i - e_{i+1}, \; 0 \le i \le 6; \qquad d_7 = e_7 - e_0 \tag{9}$$
wherein $e_i$ is the $i$-th encoding around the center pixel of the block. The absolute coding formula is as follows.
$$d^{a}_{i} = |e_i| - |e_k|, \quad i = 0, 1, \ldots, 7, \; k = 3 \tag{10}$$
wherein $e_k$ is the $k$-th largest coding value in the block.
$$LDP = \sum_{i=0}^{7} s(|e_i| - |e_k|) \times 2^{i}, \quad k = 3 \tag{11}$$
$$s(x) = \begin{cases} 1, & x > 0 \\ 0, & \text{others} \end{cases} \tag{12}$$
In formula (11), LDP represents the coding value of the local direction pattern and $s(x)$ is a step function: if $x$ is greater than 0, it is set to 1, otherwise it is set to 0. Formula (12) is used to get the maximum value of the absolute coding.
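A rough NumPy sketch of formulas (8)-(12) on a single 3×3 block is shown below (the standard Kirsch masks and the 0.5 fusion weight are assumptions; the patent shows its masks in FIG. 3 and leaves the fusion weight unspecified):

```python
import numpy as np

# Standard Kirsch mask in one direction; the other seven are obtained by
# rotating the outer ring of the 3x3 mask one step at a time.
K = [np.array([[5, 5, 5], [-3, 0, -3], [-3, -3, -3]])]
ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
for _ in range(7):
    prev = K[-1]
    rotated = prev.copy()
    vals = [prev[pos] for pos in ring]
    for pos, val in zip(ring, vals[-1:] + vals[:-1]):
        rotated[pos] = val
    K.append(rotated)

def block_code(block: np.ndarray) -> np.ndarray:
    """Encode one 3x3 block into an 8-vector along the lines of (8)-(12)."""
    e = np.array([np.sum(block * K[j]) for j in range(8)])  # formula (8)
    d = e - np.roll(e, -1)                                  # formula (9)
    diff_bits = np.zeros(8)
    diff_bits[np.argsort(d)[-3:]] = 1       # three largest differential codes
    abs_bits = np.zeros(8)
    abs_bits[np.argmax(np.abs(e))] = 1      # direction of max absolute code
    w = 0.5                                 # assumed fusion weight
    return w * diff_bits + (1 - w) * abs_bits
```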
9. The method by which the Pre-VGG16 feature extraction network is applied to two neural network models respectively in step 6 of the claim 1 specifically includes the following. The Pre-VGG16 feature extraction network is applied to two neural network models respectively. The two networks differ at the last pooling layer: one uses the average pooling layer, and the other uses the maximum pooling layer. The features extracted by the two neural networks and by PMB-IADLDP are then fused serially. Finally, the fusion result is input to the fully connected layer and softmax layer for final identification. The fusion strategy is that the features to be fused are expanded and normalized to form a new feature vector whose length is equal to the sum of the lengths of the feature vectors being connected; the new feature vector is then sent to the neural network to obtain the identification result. The fully connected layer of the Pig-VGG16 network is changed into a convolution layer. The trained parameters of the pig identification network are then initialized, and the parameters are adjusted to user-defined values: dropout is set to 0.6, epoch is set to 25, and 3×3 convolution kernels are used. The cross entropy loss function and the mean square error loss function are used as the loss function. The whole training process is completed on TensorFlow 2.0. First, the code for the convolution layer, pooling layer and fully connected layer modules is written according to the modules of BN-VGG16, then debugged and saved. The BN layer code is added after the pooling layer and debugged. The data set is then input into the main program and each module is called for model training. After the number of iterations is reached, the feature extraction model is saved. The model is then migrated to the two different networks; because the feature extraction part is the same in the two networks, it can be called directly, and only the last pooling layer needs to be modified. The features extracted from the two neural networks and from PMB-IADLDP are fused, and the fusion results are input into the fully connected layer and softmax layer for final identification. The difference between the traditional VGG16 and BN-VGG16 in pig identification accuracy is shown in FIG. 7. Finally, the comparison results of the three networks are obtained. As shown in FIG. 8, the identification accuracy of the Pig-VGG16 network is the highest, and its accuracy reaches 0.6 at the beginning of training. Therefore, the Pig-VGG16 network is more suitable for the identification of pigs than the traditional VGG16 and the improved VGG16. The above examples are only illustrative examples of the present disclosure, which explain its feasibility in detail; the disclosure is not limited thereto.
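A sketch of the two-branch transfer and serial fusion is shown below (the file name, class count and normalization choice are assumptions; the saved Pre-VGG16 extractor is assumed to output a 4-D feature map):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Load the saved Pre-VGG16 feature extractor (hypothetical file name).
pre_vgg16 = tf.keras.models.load_model("pre_vgg16.h5")
pre_vgg16.trainable = False   # freeze; fine-tune selectively as needed

inputs = tf.keras.Input(shape=(224, 224, 3))
features = pre_vgg16(inputs)  # shared transferred feature extractor

# Two branches that differ only in their last pooling layer.
max_branch = layers.GlobalMaxPooling2D()(features)      # texture information
avg_branch = layers.GlobalAveragePooling2D()(features)  # local spatial information

# Hand-crafted PMB-IADLDP features enter as a second input (74 x 8, flattened).
pmb_input = tf.keras.Input(shape=(74 * 8,))

# Serial (vector) fusion: normalize each part, then concatenate end to end,
# so the fused length is the sum of the individual feature lengths.
fused = layers.Concatenate()([
    layers.LayerNormalization()(max_branch),
    layers.LayerNormalization()(avg_branch),
    layers.LayerNormalization()(pmb_input),
])
outputs = layers.Dense(10, activation="softmax")(fused)  # assumed 10 pig identities
pig_vgg16 = tf.keras.Model([inputs, pmb_input], outputs, name="Pig-VGG16")
```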

Claims (9)

CLAIMS
What is claimed is:
1. A pig identity identification method by using an improved VGG16 network based on transfer learning, characterized by comprising the following steps:
Step 1. Extracting images frame by frame from a video; obtaining expanded data sets by adjusting contrast, adding noise points, cropping and other operations, and dividing the expanded data sets into a training set and a test set;
Step 2. Adding a BN layer after each pooling layer to build a BN-VGG16 model;
Step 3. Obtaining a Gauss improved factor particle swarm optimization algorithm (G-IFPSO) by adding a Gauss improved factor to the particle swarm optimization algorithm;
Step 4. Training on the training set processed in the Step 1, using the G-IFPSO algorithm to optimize a loss function, wherein the loss function is a weighted fusion of a cross entropy loss function and a mean square error loss function, and saving a pre-trained feature extraction network Pre-VGG16;
Step 5. Using a pixel multi-block method with improved absolute differential local direction pattern (PMB-IADLDP) to extract traditional features of pigs, wherein the traditional features are used for feature fusion and identity identification of an individual pig; and
Step 6. Transferring the Pre-VGG16 feature extraction network to two different neural networks for training, fine-tuning parameters in the networks, and then adjusting the datasets to 224×224×3, using the pixel multi-block method with improved absolute differential local direction pattern (PMB-IADLDP) to extract features from the adjusted datasets, fusing serially the features extracted in the two neural networks and the PMB-IADLDP features, i.e. vector fusion, and finally identifying the identity of pigs.
2. The pig identity identification method by using the improved VGG16 network based on transfer learning according to claim 1, characterized in that the Step 1 specifically comprises the following: firstly, images are extracted frame by frame from a video; these images are then preprocessed, that is to say, by flipping the images horizontally and vertically, using gamma transform, histogram equalization and logarithmic transform, and reducing and adding noise points, the datasets are expanded and the processed data sets are obtained; the number of images is increased by 4900; finally, the datasets are divided into training sets and test sets in the proportion of 6:1.
3. The pig identity identification method by using the improved VGG16 network based on transfer learning according to claim 1, characterized in that the improved BN-VGG16 network model built in the Step 2 specifically comprises the following: a BN (Batch Normalization) layer is added after each maximum pooling layer; the structure of the whole network consists of two convolution layers with 64 convolution kernels followed by a maximum pooling layer and a BN layer, two convolution layers with 128 convolution kernels followed by a maximum pooling layer and a BN layer, three convolution layers with 256 convolution kernels followed by a maximum pooling layer and a BN layer, three convolution layers with 512 convolution kernels followed by a maximum pooling layer and a BN layer, two fully connected layers containing 4096 neurons, one fully connected layer containing 1000 neurons and a softmax layer.
4. The pig identity identification method by using the improved VGG16 network based on transfer learning according to claim 1, characterized in that the improved particle swarm optimization algorithm in the Step 3 specifically comprises the following: the particle swarm optimization algorithm is improved, and Gaussian perturbation is added to the optimal particle; the formulas of the improved particle swarm optimization algorithm are as follows:
$$v_{im} = w v_{im} + c_1 r_1 (P_{im} - x_{im}) + c_2 r_2 (P'_{gm} - x_{im})$$
$$x_{im} = x_{im} + v_{im}$$
$$P'_{gm} = N(P_{gm}, \sigma) \tag{1}$$
wherein $P_{gm}$ is the optimal value of the particle swarm; $P'_{gm}$ is the optimal value of the particle swarm after adding the disturbance; $P_{im}$ is the individual optimal value; $N(\mu, \sigma)$ is a Gaussian function, where $\mu$ is the average and $\sigma$ is the variance; $v_{im}$ is the velocity component; $x_{im}$ is the location component; $w$ is the inertia weight; $c_1$ is the self-learning factor; $c_2$ is the population learning factor; and $r_1$, $r_2$ are random values between 0 and 1; the fitness function is as follows:
$$F(x) = aQ + b \tag{2}$$
wherein $a$ is the scalar coefficient, $b$ is the offset, and $Q$ is the loss function after weighted fusion.
5. The pig identity identification method by using the improved VGG16 network based on transfer learning according to claim 1, characterized in that the improved particle swarm optimization algorithm in the Step 3 specifically comprises the following: in the Step 3, the particle swarm optimization algorithm is improved; the speed weight is optimized in real time according to the number of iterations and an offset is added, so that the weight does not vanish; the formula of the improved speed weight is as follows:
$$w = w_{min} + \left(\frac{i_{max} - i}{i_{max}}\right)(w_{max} - w_{min}) + d \tag{3}$$
wherein $i_{max}$ is the maximum number of iterations; $w_{max}$ is the maximum value of the speed weight; $w_{min}$ is the minimum value of the speed weight; $i$ is the current number of iterations; and $d$ is the offset.
6. The pig identity identification method by using the improved VGG16 network based on transfer learning according to claim 1, characterized in that the improved particle swarm optimization algorithm in the Step 3 specifically comprises the following: in the Step 3, the particle swarm optimization algorithm is improved; the self-learning factor and the population learning factor need to be optimized, that is to say, the self-learning factor is optimized with the number of iterations; the formulas of the improved learning factors are as follows:
$$c_1 = 2\left(\frac{i_{max} - i}{i_{max}}\right), \qquad c_2 = 2\left(\frac{i}{i_{max}}\right) \tag{4}$$
wherein $i_{max}$ is the maximum number of iterations and $i$ is the current number of iterations.
7. The pig identity identification method by using the improved VGG16 network based on transfer learning according to claim 6, characterized in that the improved VGG16 network based on transfer learning in the Step 4 specifically comprises the following: in the training process of the Step 4, the dropout value is set to 0.65 in order to prevent overfitting; the dimension of the training data set is adjusted to 224×224×3; the cross entropy loss function and the mean square error loss function are selected as the loss function Q, and the two functions are weighted together; the formula of the loss function Q is shown in formula (5):
$$Q = \frac{\beta}{\alpha + \beta} L + \frac{\alpha}{\alpha + \beta} MSE \tag{5}$$
wherein $\alpha$ is the value at which the loss of the cross entropy loss function finally stabilizes; $\beta$ is the value at which the loss of the mean square error loss function finally stabilizes; $L$ is the cross entropy loss function, shown in formula (6); and $MSE$ is the mean square error loss function, shown in formula (7). Let $\gamma = \frac{\beta}{\alpha + \beta}$, so that $Q = \gamma L + (1 - \gamma)\,MSE$.
The cross entropy loss function is shown in formula (6):
$$L = \frac{1}{N}\sum_{i} L_i = -\frac{1}{N}\sum_{i}\sum_{k=1}^{M} y_{ik}\log(p_{ik}) \tag{6}$$
wherein $M$ is the number of categories; $L_i$ is the value of the loss function for sample $i$; $y_{ik}$ is an indicator variable (0 or 1) that is 1 if sample $i$ belongs to category $k$ and 0 otherwise; and $p_{ik}$ is the predicted probability that observation sample $i$ belongs to category $k$.
The mean square error loss function is shown in formula (7):
$$MSE(y, \hat{y}) = \frac{1}{n}\sum_{c=1}^{n}(y_c - \hat{y}_c)^2 \tag{7}$$
wherein $y_c$ is the value of the $c$-th input and $\hat{y}_c$ is its predicted value.
The G-IFPSO algorithm is used to optimize the weight. The optimization process is as follows:
(1) Initialize the parameters; that is to say, the particle positions, velocities, individual optimal positions, population optimal position and learning factors are initialized;
(2) According to formula (3), the weight of the PSO (particle swarm optimization) is updated with the number of iterations;
(3) According to formula (4), the current optimal value of the learning factors is obtained as the number of iterations increases;
(4) According to formulas (1), (3) and (4), the position and velocity components of the particles are updated;
(5) The fitness value is calculated according to formula (2);
(6) The individual extremum and global extremum of the particles are compared, and the optimal value is replaced continuously;
(7) If the maximum number of iterations is reached, the optimal solution is output; otherwise, the process returns to step (2).
The above iterative training is carried out continuously. When the iterative loss value is less than a certain threshold or the maximum number of iterations is reached, the training is stopped, the model is obtained and the pre-trained feature extraction network is saved.
8. The pig identity identification method by using the improved VGG16 network based on transfer learning according to claim 1, characterized in that the feature extraction of PMB-IADLDP in the Step 6 specifically comprises the following: the size of the processed images is transformed to 222×222; they are then subdivided into blocks, each of size 3×3, giving 74 sub-blocks; after the 3×3 block coding $G_i$ is obtained, the Kirsch mask operator is used to calculate the result $E_{ij}$, as shown in formula (8); the differential coding and absolute coding are carried out respectively, as shown in formula (9) and formula (10); the three largest results of the differential coding are taken, the three directions with the maximum results are set to 1, and the other directions are set to 0; the direction with the maximum absolute coding value is set to 1, and the other directions are set to 0; the final PMB-IADLDP feature extraction result is obtained by weighted fusion of the two results, and finally a 74×8-dimensional matrix is obtained; the purpose of the differential coding is to relate the eight area pixels around the center pixel $g_c$ more closely with their surroundings, so as to enrich the extracted information.
The direction where the absolute value is largest indicates that the texture effect in this direction is the best. The results of absolute coding and differential coding are fused with weights, which not only retains the main texture but also reduces the redundancy of the information.
$$E_{ij} = G_i * M_j, \quad i = 1, 2, \ldots, 74, \; j = 0, 1, \ldots, 7 \tag{8}$$
wherein $G_i$ is the coding value of the $i$-th block; $M_j$ is the Kirsch mask operator in the $j$-th direction; and $*$ is the convolution operator.
The formula for the differential coding is as follows:
$$d_i = e_i - e_{i+1}, \; 0 \le i \le 6; \qquad d_7 = e_7 - e_0 \tag{9}$$
wherein $e_i$ is the $i$-th encoding around the center pixel of the block. The absolute coding formula is as follows:
$$d^{a}_{i} = |e_i| - |e_k|, \quad i = 0, 1, \ldots, 7, \; k = 3 \tag{10}$$
wherein $e_k$ is the $k$-th largest coding value in the block.
$$LDP = \sum_{i=0}^{7} s(|e_i| - |e_k|) \times 2^{i}, \quad k = 3 \tag{11}$$
$$s(x) = \begin{cases} 1, & x > 0 \\ 0, & \text{others} \end{cases} \tag{12}$$
In formula (11), LDP represents the coding value of the local direction pattern and $s(x)$ is a step function: if $x$ is greater than 0, it is set to 1, otherwise it is set to 0. Formula (12) is used to get the maximum value of the absolute coding.
9. The pig identity identification method by using the improved VGG16 network based on transfer learning according to claim 1, characterized in that the method by which the Pre-VGG16 feature extraction network is applied to two neural network models respectively in the Step 6 specifically comprises the following: the Pre-VGG16 feature extraction network is applied to two neural network models respectively; the two networks differ at the last pooling layer, one being the average pooling layer and the other the maximum pooling layer; the features extracted by the two neural networks and by PMB-IADLDP are then fused serially; finally, the fusion results are input to the fully connected layer and softmax layer for final identification; the fusion strategy is that the features to be fused are expanded and normalized to form a new feature vector whose length is equal to the sum of the lengths of the feature vectors to be connected, and the new feature vector is then sent to the neural network to obtain the identification results; the fully connected layer of the Pig-VGG16 network is changed into a convolution layer; the trained parameters of the pig identity identification network are then initialized, and the parameters are adjusted to user-defined values: dropout is set to 0.6, epoch is set to 25, and 3×3 convolution kernels are used; the cross entropy loss function and the mean square error loss function are used as the loss function; the whole training process is completed on TensorFlow 2.0; first, the code for the convolution layer, pooling layer and fully connected layer modules is written according to the modules of BN-VGG16, then debugged and saved; the BN layer code is added after the pooling layer and debugged; the data set is then input into the main program and each module is called for model training; after the number of iterations is reached, the feature extraction model is saved; the model is then migrated to the two different networks; because the feature extraction part is the same in the two networks, it can be called directly, and only the last pooling layer needs to be modified; the features extracted from the two neural networks and from PMB-IADLDP are fused, and the fusion results are input into the fully connected layer and softmax layer for final identification.
GB2219795.8A 2021-06-03 2021-06-09 Pig identity identification method by using improved vgg16 network based on transfer learning Active GB2611257B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110618450.7A CN113469356B (en) 2021-06-03 2021-06-03 Improved VGG16 network pig identity recognition method based on transfer learning
PCT/CN2021/099162 WO2022252272A1 (en) 2021-06-03 2021-06-09 Transfer learning-based method for improved vgg16 network pig identity recognition

Publications (3)

Publication Number Publication Date
GB202219795D0 GB202219795D0 (en) 2023-02-08
GB2611257A true GB2611257A (en) 2023-03-29
GB2611257B GB2611257B (en) 2024-02-28

Family

ID=77872193

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2219795.8A Active GB2611257B (en) 2021-06-03 2021-06-09 Pig identity identification method by using improved vgg16 network based on transfer learning

Country Status (3)

Country Link
CN (1) CN113469356B (en)
GB (1) GB2611257B (en)
WO (1) WO2022252272A1 (en)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114299436A (en) * 2021-12-30 2022-04-08 东北农业大学 Group-breeding pig fighting behavior identification method integrating space-time double-attention mechanism
CN114511926B (en) * 2022-01-17 2024-05-14 江苏大学 Pig feeding behavior identification method based on combination of improved support vector machine and optical flow method
CN116259145A (en) * 2022-09-26 2023-06-13 广州当康自然资源科技有限公司 Wild boar early warning and disposal system based on AI intelligent recognition
CN116138243A (en) * 2022-09-26 2023-05-23 广州当康自然资源科技有限公司 Escape-inducing type wild boar driving method and device for simulating wild boar killing scene
CN116012367B (en) * 2023-02-14 2023-09-12 山东省人工智能研究院 Deep learning-based stomach mucosa feature and position identification method
CN116647376B (en) * 2023-05-25 2024-01-26 中国人民解放军军事科学院国防科技创新研究院 Voiceprint information-based underwater acoustic network node identity authentication method
CN116881639B (en) * 2023-07-10 2024-07-23 国网四川省电力公司营销服务中心 Electricity larceny data synthesis method based on generation countermeasure network
CN116978099B (en) * 2023-07-25 2024-03-12 湖北工业大学 Lightweight sheep identity recognition model construction method and recognition model based on sheep face
CN116824512B (en) * 2023-08-28 2023-11-07 西华大学 27.5kV visual grounding disconnecting link state identification method and device
CN116994067B (en) * 2023-09-07 2024-05-07 佛山科学技术学院 Method and system for predicting fractional flow reserve based on coronary artery calcification
CN116975656B (en) * 2023-09-22 2023-12-12 唐山师范学院 Intelligent damage detection and identification method and system based on acoustic emission signals
CN117541991B (en) * 2023-11-22 2024-06-14 无锡科棒安智能科技有限公司 Intelligent recognition method and system for abnormal behaviors based on security robot
CN117392551B (en) * 2023-12-12 2024-04-02 国网江西省电力有限公司电力科学研究院 Power grid bird damage identification method and system based on bird droppings image features
CN118015338A (en) * 2024-01-12 2024-05-10 中南大学 Physical knowledge embedded aluminum electrolysis superheat degree identification method and system
CN117556715B (en) * 2024-01-12 2024-03-26 湖南大学 Method and system for analyzing degradation of intelligent ammeter in typical environment based on information fusion
CN117576573B (en) * 2024-01-16 2024-05-17 广州航海学院 Building atmosphere evaluation method, system, equipment and medium based on improved VGG16 model
CN117934962B (en) * 2024-02-06 2024-07-02 青岛兴牧畜牧科技发展有限公司 Pork quality classification method based on reference color card image correction
CN117911829B (en) * 2024-03-15 2024-05-31 山东商业职业技术学院 Point cloud image fusion method and system for vehicle navigation
CN118196908B (en) * 2024-04-23 2024-08-16 淮阴工学院 Personnel dangerous behavior identification method and system for working area of transformer substation
CN118135566B (en) * 2024-05-06 2024-07-02 苏州宝丽迪材料科技股份有限公司 Semi-supervised learning fiber master batch electron microscope image aggregation structure area identification method
CN118279671B (en) * 2024-05-08 2024-09-13 北京弘象科技有限公司 Satellite inversion cloud classification method, device, electronic equipment and computer storage medium
CN118171049B (en) * 2024-05-13 2024-07-16 西南交通大学 Big data-based battery management method and system for edge calculation
CN118172636B (en) * 2024-05-15 2024-07-23 乐麦信息技术(杭州)有限公司 Method and system for adaptively adjusting image text and non-image patterns in batches

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110414626A (en) * 2019-08-06 2019-11-05 广东工业大学 A kind of pig variety ecotype method, apparatus and computer readable storage medium
CN111178197A (en) * 2019-12-19 2020-05-19 华南农业大学 Mass R-CNN and Soft-NMS fusion based group-fed adherent pig example segmentation method
CN111241933A (en) * 2019-12-30 2020-06-05 南京航空航天大学 Pig farm target identification method based on universal countermeasure disturbance
US20200285896A1 (en) * 2019-03-09 2020-09-10 Tongji University Method for person re-identification based on deep model with multi-loss fusion training strategy
CN111666838A (en) * 2020-05-22 2020-09-15 吉林大学 Improved residual error network pig face identification method


Also Published As

Publication number Publication date
CN113469356B (en) 2024-06-07
GB2611257B (en) 2024-02-28
CN113469356A (en) 2021-10-01
WO2022252272A1 (en) 2022-12-08
GB202219795D0 (en) 2023-02-08
