CN110895714A - Network compression method of YOLOv3 - Google Patents

Network compression method of YOLOv3

Info

Publication number
CN110895714A
Authority
CN
China
Prior art keywords
network
yolov3
training
data set
detection
Prior art date
Legal status
Pending
Application number
CN201911270679.5A
Other languages
Chinese (zh)
Inventor
王以忠
许素霞
房臣
郭肖勇
杨国威
Current Assignee
Tianjin University of Science and Technology
Original Assignee
Tianjin University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Tianjin University of Science and Technology filed Critical Tianjin University of Science and Technology
Priority to CN201911270679.5A
Publication of CN110895714A

Classifications

    • G06N 3/045: Computing arrangements based on biological models; neural networks; architecture; combinations of networks
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06V 20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V 20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames


Abstract

The invention discloses a network compression method for YOLOv3, comprising the following steps: obtaining a picture data set from captured video; annotating the picture data set; augmenting the data set; training the YOLOv3 model with the data set; evaluating and selecting a saved weight file; sparsifying the selected YOLOv3 weights; pruning the network model; fine-tuning the network model to restore detection performance; judging whether the accuracy reaches a threshold, outputting the compressed network model if it does, and otherwise continuing sparse training, pruning and fine-tuning; and performing target detection on the campus data with the compressed network. By compressing the YOLOv3 network, the invention reduces the size of the network model, increases detection speed, and meets the requirement of detecting objects such as pedestrians and vehicles on a campus.

Description

Network compression method of YOLOv3
Technical Field
The invention belongs to the technical field of deep neural network optimization, and particularly relates to a method for compressing deep neural networks.
Background
The development of deep learning has enabled deep neural networks to obtain remarkable results in target classification and detection. In practical applications, however, deep neural networks face difficulties in this field. Because a deep neural network performs a large amount of computation during target detection and classification, it places high demands on the computing device; otherwise rapid recognition and classification are difficult to achieve. YOLOv3 classifies 80 object categories, so when relatively few object types need to be detected, such as pedestrians and vehicles on a campus, YOLOv3 is redundant. To reduce the computation of deep neural networks and improve their portability to devices such as development boards, research has turned to network compression; the main current compression methods are network pruning, network quantization, low-rank decomposition, knowledge distillation and compact network design.
The traditional pruning method sets a weight threshold and deletes the weights below it, compressing the network by removing weights. This method does not sparsely train the network before deletion and removes each weight as an independent parameter, so correlated weights are also deleted; the accuracy of the compressed network is therefore poor, and its running time is not reduced. Such unstructured pruning of individual weights is unsuitable for complex deep neural networks and requires dedicated software libraries and hardware support, which further increases the complexity of the compressed network.
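The threshold pruning criticized above can be sketched in a few lines; this is an illustrative pure-Python sketch of the conventional technique (the helper name and the example values are not from the patent):

```python
def magnitude_prune(weights, threshold):
    """Naive unstructured pruning: zero every weight whose magnitude
    falls below the threshold, treating each weight independently."""
    return [0.0 if abs(w) < threshold else w for w in weights]

# Correlated weights are zeroed one by one, which is why accuracy suffers:
# the surviving weights were trained jointly with the removed ones.
pruned = magnitude_prune([0.8, -0.05, 0.3, 0.01], threshold=0.1)
```

Note that the zeros still occupy storage and still participate in dense matrix multiplications, which is why this style of pruning does not by itself reduce running time.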
Disclosure of Invention
The invention provides a network compression method for YOLOv3 that reduces the computation of the deep neural network and improves its portability to devices such as development boards, compressing the network without degrading its detection performance.
The technical scheme for realizing the invention is as follows:
a network compression method of YOLOv3, comprising the following steps:
Obtaining a picture data set from the captured video: several video clips of different areas are obtained from the surveillance video stored by the school, and pictures are captured from the clips and screened.
Annotating the picture data set: the four types of targets to be detected are annotated in the stored picture data set.
Augmenting the data set: the data set is expanded by rotating, flipping and scaling the pictures.
Training the YOLOv3 network model with the data set: the YOLOv3 model is trained with the processed picture data set.
Evaluating the weight files: the saved weight files are compared by computing the network recall and accuracy, and the best-performing file is selected as the weight file of the target detection network.
Sparse training: an L1 regularization constraint is applied to the gamma coefficients of the network's BN layers to generate a sparse weight matrix.
Network model pruning: a scaling factor gamma is introduced for each channel to weigh its importance; the closer the scaling factor is to zero, the less important the corresponding channel, and channels with little influence are deleted.
Fine-tuning the network model: pruning reduces the accuracy of the network model, and fine-tuning restores it.
Target detection: objects such as pedestrians and vehicles appearing in the campus surveillance video are detected with the compressed network.
The invention further provides a network compression system for YOLOv3, comprising the following modules:
Video acquisition device: cameras arranged in different corners of the campus;
Picture processing module: divides the pictures saved from the video clips into a training set and a test set, labels the pictures, and augments the labeled data set through rotation and flipping operations;
Training module: trains the neural network with the augmented data set and extracts picture features;
Pruning module: deletes channels from the sparsified network and fine-tunes the network after channel deletion;
Detection module: tests the compressed YOLOv3 network with the campus data set, measuring its detection speed and accuracy.
Compared with the prior art, the invention has the following advantages:
The invention provides a network compression method for YOLOv3. Sparse training lets the model adjust its parameters toward a structurally sparse direction and accounts for the correlation among weights. The importance of network channels is measured by introducing scaling factors into the sparse network; only channels with small influence factors are deleted during pruning, preserving the network's feature-learning capability and detection precision. Fine-tuning the pruned network restores its detection performance, so the method reduces the size of the network model and the computational cost of detection while maintaining the detection accuracy of the deep neural network.
Drawings
Fig. 1 is a flowchart of a network compression method of YOLOv3 according to the present invention;
fig. 2 is a schematic block diagram of a network compression method of YOLOv3 according to the present invention;
fig. 3 is a labeled picture used to train the YOLOv3 network;
fig. 4 is a flow chart of pruning the YOLOv3 network;
fig. 5 is a diagram of the results of network detection after compression.
Detailed description of the preferred embodiment
The invention is described in further detail below with reference to the accompanying drawings and specific embodiments:
fig. 1 is a flowchart of a network compression method of YOLOv3 provided by the present invention, and fig. 2 is a schematic block diagram of a network compression method of YOLOv3 provided by the present invention, where the method includes:
obtaining a picture data set from a captured video, specifically comprising:
the method comprises the steps of obtaining multiple sections of video files through cameras in different places of a campus, and screening out 50 video clips. The method comprises the steps of intercepting a picture for each video file every 5 frames, screening 5000 pictures for the stored picture files, and adopting a screening principle that picture data comprise four types of data of pedestrians, bicycles, automobiles and backpacks, which are the four most common things in a campus.
Annotating the picture data set: the data set used to train the network is produced by the picture processing module in fig. 2. The 5000 screened pictures are divided at a ratio of 8:2 into a training set of 4000 and a test set of 1000, and the pictures in the training set are annotated with the labeling tool LabelImg. The pedestrians, bicycles, automobiles and backpacks appearing in each picture are framed with rectangular boxes and labeled with the corresponding class information; as shown in fig. 3, the four types of objects appearing in a picture are labeled.
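LabelImg writes Pascal VOC style xml annotations; converting one to the normalized `class cx cy w h` lines that YOLOv3 training consumes might look like the following sketch (the class list and the `voc_to_yolo` helper are assumptions for illustration, not part of the patent):

```python
import xml.etree.ElementTree as ET

# Assumed class order for the four campus object types.
CLASSES = ["person", "bicycle", "car", "backpack"]

def voc_to_yolo(xml_text):
    """Parse a VOC annotation and return YOLO label lines:
    class index, then box center and size normalized by image dimensions."""
    root = ET.fromstring(xml_text)
    w = float(root.find("size/width").text)
    h = float(root.find("size/height").text)
    lines = []
    for obj in root.iter("object"):
        cls = obj.find("name").text
        if cls not in CLASSES:
            continue  # skip objects outside the four target classes
        b = obj.find("bndbox")
        xmin, ymin = float(b.find("xmin").text), float(b.find("ymin").text)
        xmax, ymax = float(b.find("xmax").text), float(b.find("ymax").text)
        cx, cy = (xmin + xmax) / 2 / w, (ymin + ymax) / 2 / h
        bw, bh = (xmax - xmin) / w, (ymax - ymin) / h
        lines.append(f"{CLASSES.index(cls)} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}")
    return lines
```

Each rectangle (xmin, ymin, xmax, ymax) becomes a center point and size divided by the image dimensions, the coordinate convention YOLOv3 training expects.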
Augmenting the data set: the annotated picture data set is augmented to strengthen the deep neural network's ability to learn features of the campus pictures and to improve detection. First all pictures are scaled by factors of 0.5, 0.75, 1.25 and 1.5, then the pictures are flipped horizontally, and finally all pictures are rotated by different angles, expanding the data set.
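When a picture is scaled or flipped, its annotation boxes must be transformed with it; a small sketch of the box geometry for the augmentations named above (the helper names and the pixel-box convention are illustrative):

```python
SCALES = (0.5, 0.75, 1.25, 1.5)  # the scale factors used in the text

def scale_box(box, factor):
    """Scale a pixel-coordinate box (xmin, ymin, xmax, ymax) with the image."""
    xmin, ymin, xmax, ymax = box
    return (xmin * factor, ymin * factor, xmax * factor, ymax * factor)

def hflip_box(box, img_w):
    """Mirror a box across the vertical center line of a width-img_w image."""
    xmin, ymin, xmax, ymax = box
    return (img_w - xmax, ymin, img_w - xmin, ymax)

def augment(box, img_w):
    """All scaled variants of a box plus its horizontal flip."""
    return [scale_box(box, s) for s in SCALES] + [hflip_box(box, img_w)]
```

Rotation by an arbitrary angle additionally requires rotating the four box corners and taking their axis-aligned bounding box, which is omitted here for brevity.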
Training the YOLOv3 network model with the data set: the augmented campus picture data set is used to train the YOLOv3 network through the training module in fig. 2. The pictures and their corresponding annotation xml files are placed in the same folder, and the name of each picture is collected and stored in a training list file. The data, names and network structure files are then modified: the data file contains the number of classes and the paths, namely the address of the txt file listing the training pictures, the address of the names file, and the storage address of the weight files; the names file contains the 4 class labels, pedestrian, bicycle, automobile and backpack. In the network parameter file the base learning rate is set to 0.001, the number of iterations to 20000, and the network to training mode; training is then started, a weight file is saved every 2000 iterations, and after training 12 weight files are obtained, including the final weight file and the latest weight file.
Evaluating the weight files: 12 trained weight files are saved through training. A loss curve is drawn from the saved log file and the mAP of each saved weight file is computed; the picture path added to the data file for computing the mAP is the same as the training picture path. The weight file with the highest mAP and a low loss is selected as the weight file for YOLOv3 compression.
Model compression of the network is implemented by the network pruning module in fig. 2, and the method includes:
As shown in fig. 4, the flow chart of network pruning, the weight file of the initial network is the selected weight file with the highest mAP, and the network first needs to be sparsely trained, specifically comprising:
the data set used for sparsely training the network is also a campus data set, and since the neural network only detects 4 types of objects, and the Yolov3 original network detects 80 types of objects, the network needs to be sparsely trained for detecting the campus objects, redundant weights are deleted, and the network only needs to learn the characteristics of the objects in the campus data set. The method comprises the steps of conducting sparse training on a network, applying L1 regularization constraint on gamma coefficients of a batch normalization layer of the network, enabling intersection points of isolines of square error terms and isolines of regularization terms to be generally arranged on coordinate axes when L1 regularization is adopted, adjusting adjustment factors, forcing weights to be equal to 0, selecting variables, and enabling a model to adjust parameters towards a structural sparse direction. At this time, the Gamma coefficient of the BN layer enables the network to force some weights to go to 0 according to the characteristics of the data set picture in the sparse training process.
Pruning the network after sparsifying the network, specifically comprising:
the importance of the network channels is weighted by introducing a scaling factor γ for each channel. The closer the scaling factor is to zero, the less important the corresponding channel is to the network. And in the training process, the network and the scaling factor are simultaneously trained, the channel with the small scaling factor is automatically deleted, and the advantage of introducing the scaling factor is that no additional overhead is brought to the network. Pruning the network according to the proportion of 20%, 40%, 60% and 80%, and then selecting a network model with low precision reduction and large compression ratio for precision recovery of the model calculation precision after pruning. The network after the channel deletion only needs less parameters and less running memory.
Fine-tuning the network model: deleting the channels whose scaling factors are close to zero also removes the inputs, outputs and weights connected to them, so the accuracy of the compressed neural network drops; this is compensated by fine-tuning. The network is fine-tuned on the campus data set: anchor values matched to the characteristics of the campus data set are generated with a clustering algorithm, the pruned model is retrained, and detection accuracy improves. After fine-tuning, the detection performance of the network is evaluated; if it meets the requirement, the pruned network model is output, and otherwise sparse training, channel deletion and fine-tuning are repeated in a loop.
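Anchor values generated "through a clustering algorithm" conventionally come from k-means over the labeled box sizes with a 1 - IoU distance, as in the YOLO papers; a self-contained sketch under that assumption:

```python
import random

def iou_wh(box, anchor):
    """IoU of two boxes given only (w, h), as if they shared a corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    return inter / (box[0] * box[1] + anchor[0] * anchor[1] - inter)

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster labeled (w, h) pairs into k anchors, assigning each box
    to its highest-IoU anchor and recomputing anchors as cluster means."""
    random.seed(seed)
    anchors = random.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            j = max(range(k), key=lambda i: iou_wh(b, anchors[i]))
            clusters[j].append(b)
        new = [tuple(sum(d) / len(c) for d in zip(*c)) if c else anchors[j]
               for j, c in enumerate(clusters)]
        if new == anchors:  # converged
            break
        anchors = new
    return sorted(anchors)
```

The resulting (w, h) pairs replace the default anchors in the network configuration before the pruned model is retrained on the campus data set.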
Detection with the campus data set: because YOLOv3 was pruned with the campus data set, sparse training, pruning and fine-tuning were all performed for the pedestrians, automobiles, bicycles and bags on the campus, and as shown in fig. 5 the compressed network meets the requirement of detecting the four types of objects on the campus.

Claims (7)

1. A network compression method of YOLOv3, characterized in that it comprises:
Step 1: obtaining a picture data set from the captured video;
Step 2: annotating the picture data set;
Step 3: augmenting the data set;
Step 4: training the YOLOv3 model with the data set;
Step 5: evaluating the weight files saved during training and selecting the weight file with the best detection performance;
Step 6: sparsifying the selected YOLOv3 weights;
Step 7: pruning the network model and deleting unimportant channels;
Step 8: fine-tuning the network model to improve detection;
Step 9: performing target detection on pedestrians, vehicles and other objects appearing in the campus data with the compressed network.
2. The network compression method of YOLOv3 according to claim 1, wherein: the video clips in step 1 are surveillance videos of different areas of a school; one picture is saved every 10 frames from the multiple video files, the saved pictures are screened, and pictures without detection targets are removed; the selected pictures contain the four types of objects, pedestrians, bicycles, automobiles and bags, and are divided into a training set and a test set.
3. The network compression method of YOLOv3 according to claim 1, wherein: the data set is divided into a training set and a test set at a ratio of 8:2, the pictures in the training set are annotated with a labeling tool, and the pedestrians, bicycles, automobiles and bags appearing in the pictures are marked with rectangular boxes.
4. The network compression method of YOLOv3 according to claim 1, wherein: to improve the feature learning of the network and its detection accuracy, data augmentation is achieved by flipping, rotating and scaling the pictures and their annotation files.
5. The network compression method of YOLOv3 according to claim 1, wherein: the YOLOv3 network model is trained with the augmented data set, generating two files, one storing the picture names and the other storing the absolute paths of all pictures, with the label information of each picture stored in a separate file; parameters in the network structure file are modified for training, and the training log is used to observe the training of the model and to adjust parameters to improve network performance; a weight file is saved every 2000 iterations during training; the saved weight files are compared by computing the network recall and accuracy, and the best-performing weights are selected as the weight file of the detection network for compression.
6. The network compression method of YOLOv3 according to claim 1, wherein: before pruning, the network is sparsely trained; the training data set is the annotated campus picture data set, and an L1 regularization constraint is applied to the gamma coefficients of the network's BN layers to generate a sparse weight matrix; during model pruning, a scaling factor is introduced for each channel to measure its importance, and unimportant channels are automatically deleted, reducing the size of the network model.
7. The network compression method of YOLOv3 according to claim 1, wherein: as network channels are removed, the detection performance of the network degrades, and fine-tuning the network with the campus data set restores it.
CN201911270679.5A 2019-12-11 2019-12-11 Network compression method of YOLOv3 Pending CN110895714A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911270679.5A CN110895714A (en) 2019-12-11 2019-12-11 Network compression method of YOLOv3

Publications (1)

Publication Number Publication Date
CN110895714A true CN110895714A (en) 2020-03-20

Family

ID=69787298

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911270679.5A Pending CN110895714A (en) 2019-12-11 2019-12-11 Network compression method of YOLOv3

Country Status (1)

Country Link
CN (1) CN110895714A (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111709522B (en) * 2020-05-21 2022-08-02 哈尔滨工业大学 Deep learning target detection system based on server-embedded cooperation
CN111709522A (en) * 2020-05-21 2020-09-25 哈尔滨工业大学 Deep learning target detection system based on server-embedded cooperation
CN111612144A (en) * 2020-05-22 2020-09-01 深圳金三立视频科技股份有限公司 Pruning method and terminal applied to target detection
CN111612144B (en) * 2020-05-22 2021-06-15 深圳金三立视频科技股份有限公司 Pruning method and terminal applied to target detection
CN111832607A (en) * 2020-05-28 2020-10-27 东南大学 Bridge disease real-time detection method based on model pruning
CN111709489A (en) * 2020-06-24 2020-09-25 广西师范大学 Citrus identification method based on improved YOLOv4
CN111814902A (en) * 2020-07-21 2020-10-23 南方电网数字电网研究院有限公司 Target detection model training method, target identification method, device and medium
CN112001259A (en) * 2020-07-28 2020-11-27 联芯智能(南京)科技有限公司 Aerial weak human body target intelligent detection method based on visible light image
CN111898591A (en) * 2020-08-28 2020-11-06 电子科技大学 Modulation signal identification method based on pruning residual error network
CN111898591B (en) * 2020-08-28 2022-06-24 电子科技大学 Modulation signal identification method based on pruning residual error network
CN112115837A (en) * 2020-09-11 2020-12-22 中国电子科技集团公司第五十四研究所 Target detection method based on YoloV3 and dual-threshold model compression
CN112464718A (en) * 2020-10-23 2021-03-09 西安电子科技大学 Target detection method based on YOLO-Terse network and storage medium
CN112464718B (en) * 2020-10-23 2024-02-20 西安电子科技大学 Target detection method based on YOLO-Terse network and storage medium
CN112101313A (en) * 2020-11-17 2020-12-18 北京蒙帕信创科技有限公司 Machine room robot inspection method and system
CN112668451A (en) * 2020-12-24 2021-04-16 南京泓图人工智能技术研究院有限公司 Crowd density real-time monitoring method based on YOLOv5
CN112614125A (en) * 2020-12-30 2021-04-06 湖南科技大学 Mobile phone glass defect detection method and device, computer equipment and storage medium
CN112614125B (en) * 2020-12-30 2023-12-01 湖南科技大学 Method and device for detecting glass defects of mobile phone, computer equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200320