CN112836657A - Pedestrian detection method and system based on lightweight YOLOv3 - Google Patents
Pedestrian detection method and system based on lightweight YOLOv3
- Publication number
- CN112836657A (application CN202110171542.5A)
- Authority
- CN
- China
- Prior art keywords
- layer
- pedestrian detection
- lightweight
- yolov3
- convolution layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a pedestrian detection method based on lightweight YOLOv3, which comprises the following steps: establishing a pedestrian data set for the perimeter intrusion prevention application scenario; constructing a lightweight YOLOv3 pedestrian detection network; dividing a pedestrian detection training set and training the lightweight YOLOv3 pedestrian detection network to obtain a lightweight pedestrian detection model; dividing a pedestrian detection verification set and verifying the effect of the trained lightweight pedestrian detection model; and deploying the lightweight YOLOv3 pedestrian detection model on embedded front-end equipment. The method replaces the backbone network of the traditional YOLOv3 detection network with a high-precision lightweight backbone network, which greatly reduces both the forward computation and the parameter data volume of the pedestrian detection network, greatly improves pedestrian detection speed while maintaining detection precision, and suits embedded equipment with low computing capacity and small storage space.
Description
Technical Field
The invention relates to the technical field of target recognition, in particular to a pedestrian detection method based on lightweight YOLOv3.
Background
Perimeter security systems are widely used at sites such as detention centers, prisons, airports, nuclear power plants and oil depots to prevent illegal intrusion. As technology advances, security challenges grow more serious, and stronger, more intelligent perimeter security systems are urgently needed. A traditional perimeter security system consists of an enclosing fence and a large number of surveillance cameras; it is easily affected by natural environmental factors such as bad weather, suffers from an excessively high false alarm rate, and gives users a poor experience.
In recent years, with rapid progress in hardware and the leapfrog development of deep learning, perimeter security systems have introduced artificial intelligence for protection: a deep-learning-based target recognition algorithm judges illegal intrusion targets and accurately identifies the intrusion targets of interest, so that the system is not disturbed by factors such as illumination changes and shadows, rain, snow, fog, sand, dust, swaying trees and small animals, which greatly reduces its false alarm rate.
Deep-learning-based target recognition algorithms generally suffer from a huge forward computation load and an excessive volume of model parameters, and need to run on a high-performance server with strong computing capability. Because the deployment environment of a perimeter security system is complex, transmitting the images collected by the front-end surveillance cameras to a back-end high-performance server in real time for processing causes problems such as delay and packet loss owing to the excessive data volume. Deploying the target recognition algorithm on the front-end embedded equipment and returning only the recognition result to the back end for display effectively reduces the pressure on the transmission system. To solve these problems, a lightweight target recognition algorithm must be designed that reduces the forward computation of the network and the volume of model parameters, so that it can be used on embedded equipment with low computing capacity and limited storage space.
For example, a method for vehicle and license plate detection and long-focus and short-focus fusion based on lightweight YOLOv3, disclosed in application number CN201910500483.4, establishes a vehicle and license plate data set and designs and trains a lightweight YOLOv3 network. To address the large number of parameters and long computation time of the YOLOv3 network, it replaces the backbone network with a lightweight network and reconstructs the other convolution layers, greatly improving detection speed while maintaining detection precision, so that the target detection network can be ported to a vehicle-mounted embedded unit. The lightweight network designed in that invention greatly reduces the parameters and computation of the original YOLOv3 backbone network, but there is still room to reduce the computation further; designing a more efficient lightweight network can further improve the running efficiency of the detection algorithm on embedded equipment.
Disclosure of Invention
The technical problem to be solved by the invention is how to improve the running speed of a pedestrian detection network on embedded equipment while maintaining pedestrian detection accuracy; to this end, a pedestrian detection method based on lightweight YOLOv3 is provided.
The invention solves the technical problems through the following technical means:
A pedestrian detection method based on lightweight YOLOv3 comprises the following steps:
S1, establishing a perimeter security pedestrian detection data set, which comprises: acquiring and annotating real pedestrian images in protected-site scenes; extracting pedestrian images in natural scenes from open source data sets and converting their annotation information; and collecting images without people, in a number comparable to that of the pedestrian images, as background images and creating a blank file for each background image as its label;
S2, constructing a lightweight YOLOv3 pedestrian detection network; the lightweight backbone network adopted by the lightweight YOLOv3 pedestrian detection network comprises, in sequence: convolution layer conv1, lightweight layer 3, lightweight layer 2, lightweight layer 1 × 2, lightweight layer 3, lightweight layer 1 × 3, lightweight layer 2, lightweight layer 1 × 2, lightweight layer 3, and convolution layer conv2;
S3, dividing a pedestrian detection training set, and training the lightweight YOLOv3 pedestrian detection network;
S4, dividing a pedestrian detection verification set, and verifying the effect of the lightweight YOLOv3 pedestrian detection model;
S5, deploying the lightweight YOLOv3 pedestrian detection model on an embedded device.
The invention constructs a lightweight YOLOv3 pedestrian detection network in which a lightweight backbone network replaces the darknet53 backbone network used by the traditional YOLOv3; the forward computation of the lightweight network is 71% lower than that of the traditional YOLOv3, which greatly improves the speed of detecting pedestrians in each frame. Before each lightweight layer in the lightweight backbone network performs its convolution to extract features, an amplification convolution layer increases the number of feature channels participating in the operation, so the extracted image features are richer. Lightweight layer 1 fuses low-dimensional and high-dimensional features, which further improves feature expression and ensures that the whole lightweight backbone network has excellent feature expression capability.
Further, the lightweight YOLOv3 pedestrian detection network constructed in S2 extracts features and detects pedestrians with a three-scale detection module: the small-scale output is used to detect pedestrians that occupy a large proportion of the image, the medium-scale output is used to detect pedestrians that occupy a medium proportion, and the large-scale output is used to detect pedestrians that occupy a small proportion.
Further, the detection module of the lightweight YOLOv3 pedestrian detection network constructed in S2 has the following structure: the small-scale branch comprises, in sequence, convolution layers conv3, conv4, conv5, conv6, conv7, conv8 and conv9; the medium-scale branch comprises, in sequence, route layer 1, convolution layer conv10, up-sampling layer 1, route layer 2, and convolution layers conv11, conv12, conv13, conv14, conv15, conv16 and conv17; the large-scale branch comprises, in sequence, route layer 3, convolution layer conv18, up-sampling layer 2, route layer 4, and convolution layers conv19, conv20, conv21, conv22, conv23, conv24 and conv25.
Further, in the lightweight YOLOv3 pedestrian detection network constructed in S2, lightweight layer 1 comprises, in sequence, an amplification convolution layer (1 × 1 conv), a depthwise convolution layer (3 × 3 DwConv, stride 1), a compression convolution layer (1 × 1 conv) and a shortcut layer; lightweight layer 2 comprises, in sequence, an amplification convolution layer (1 × 1 conv), a depthwise convolution layer (3 × 3 DwConv, stride 2) and a compression convolution layer (1 × 1 conv); lightweight layer 3 comprises, in sequence, an amplification convolution layer (1 × 1 conv), a depthwise convolution layer (3 × 3 DwConv, stride 1) and a compression convolution layer (1 × 1 conv).
Further, in training the lightweight YOLOv3 pedestrian detection network in S3, a set proportion of the image samples in the perimeter security pedestrian detection data set is randomly selected as the pedestrian detection training set, and online data enhancement is applied to the training images during training, comprising: randomly selecting two original training images and applying random cropping, random scaling and random color transformation to them, with the annotation information of the two images transformed correspondingly according to the cropping and scaling operations; and fusing the two transformed training images into a new training image, with the annotation information of the two transformed images combined as the label of the new training image. The formula for fusing the two training images is as follows:
$$I(x,y) = 0.5 \times I_1(x,y) + 0.5 \times I_2(x,y)$$
where $I_1(x,y)$ and $I_2(x,y)$ respectively represent the pixel values of the two transformed training images at the coordinate point $(x,y)$, and $I(x,y)$ represents the pixel value of the new fused training image at the coordinate point $(x,y)$.
Further, in S3 the lightweight YOLOv3 pedestrian detection network is trained until the loss function stabilizes and no longer decreases, at which point training stops. The loss function adopted in the training process is as follows:
where $S$ represents the size of the detection module adopted by the lightweight pedestrian detection network, and $B$ represents the number of target frames predicted by each cell at each detection scale of the detection module; $\mathbb{1}_{ij}^{obj}$ indicates whether the $j$-th predicted target frame of the $i$-th cell at a given scale contains a target, with $\mathbb{1}_{ij}^{obj}=1$ if it contains a target and $\mathbb{1}_{ij}^{obj}=0$ if it does not; $x_i, y_i, w_i, h_i, C_i$ respectively represent the center-point x coordinate, center-point y coordinate, width, height and confidence of the predicted target frame of the $i$-th cell at a given scale for which $\mathbb{1}_{ij}^{obj}=1$; $\hat{x}_i, \hat{y}_i, \hat{w}_i, \hat{h}_i, \hat{C}_i$ respectively represent the center-point x coordinate, center-point y coordinate, width, height and confidence of the pre-annotated target; $class$ represents the target categories to be detected; $p_i(c)$ is the predicted probability of each category and $\hat{p}_i(c)$ is the true probability of each category;
the first row of the loss function represents the loss on the center coordinates of effective predicted targets; the second row represents the loss on the width and height of effective predicted targets; the third row represents the confidence loss over all prediction boxes; the fourth row represents the class loss of effective predicted targets.
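The text above describes the loss row by row but does not carry the equation itself. One form consistent with that description, offered only as a sketch, is given below. It assumes unweighted squared-error terms in the style of the original YOLO loss; the actual terms, weighting coefficients and index ranges used by the invention may differ. Hatted symbols denote the annotated values and $\mathbb{1}_{ij}^{obj}$ is the indicator that the $j$-th predicted frame of cell $i$ contains a target (this notation is an assumption).

```latex
\begin{aligned}
Loss ={}& \sum_{i=1}^{S^2}\sum_{j=1}^{B} \mathbb{1}_{ij}^{obj}
          \left[(x_i-\hat{x}_i)^2 + (y_i-\hat{y}_i)^2\right] \\
       &+ \sum_{i=1}^{S^2}\sum_{j=1}^{B} \mathbb{1}_{ij}^{obj}
          \left[(w_i-\hat{w}_i)^2 + (h_i-\hat{h}_i)^2\right] \\
       &+ \sum_{i=1}^{S^2}\sum_{j=1}^{B} (C_i-\hat{C}_i)^2 \\
       &+ \sum_{i=1}^{S^2}\sum_{j=1}^{B} \mathbb{1}_{ij}^{obj}
          \sum_{c \in class} \left(p_i(c)-\hat{p}_i(c)\right)^2
\end{aligned}
```

Rows one, two and four are gated by the indicator, matching "effective predicted targets", while row three sums over every prediction box, matching "all prediction boxes".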
Further, in verifying the effect of the lightweight pedestrian detection model in S4, a set proportion of the samples in the perimeter security pedestrian detection data set is randomly selected as the pedestrian detection verification set; the trained lightweight pedestrian detection model detects the pedestrians and their positions in each image sample of the verification set; the detection results are stored and compared with the pedestrian positions in the verification set annotation information, finally yielding the overall recall and accuracy of the lightweight pedestrian detection model on the pedestrian detection verification set.
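The comparison of detections with annotated positions can be sketched as follows. This is a minimal illustration, not the procedure of the invention: it assumes that "accuracy" means precision, that a detection matches an annotation when their intersection over union (IoU) is at least 0.5, and that boxes are (x, y, width, height) tuples with a top-left origin. The function names `iou` and `evaluate` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def evaluate(detections, annotations, iou_threshold=0.5):
    """Return (recall, precision) over a verification set.

    detections and annotations are parallel lists with one list of boxes
    per image. The 0.5 threshold is an assumed value.
    """
    tp = fp = fn = 0
    for dets, gts in zip(detections, annotations):
        unmatched = list(gts)
        for det in dets:
            # Greedily match each detection to its best remaining annotation.
            best = max(unmatched, key=lambda g: iou(det, g), default=None)
            if best is not None and iou(det, best) >= iou_threshold:
                unmatched.remove(best)
                tp += 1
            else:
                fp += 1
        fn += len(unmatched)  # annotated pedestrians that were missed
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision
```

Background images contribute an empty annotation list, so any detection on them counts as a false positive.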
The invention also provides a pedestrian detection system based on lightweight YOLOv3, which comprises:
a data set establishing module, used for establishing a perimeter security pedestrian detection data set, which comprises: acquiring and annotating real pedestrian images in protected-site scenes; extracting pedestrian images in natural scenes from open source data sets and converting their annotation information; and collecting images without people, in a number comparable to that of the pedestrian images, as background images and creating a blank file for each background image as its label;
a lightweight YOLOv3 pedestrian detection network construction module, used for constructing a lightweight YOLOv3 pedestrian detection network; the lightweight backbone network adopted by the lightweight YOLOv3 pedestrian detection network comprises, in sequence: convolution layer conv1, lightweight layer 3, lightweight layer 2, lightweight layer 1 × 2, lightweight layer 3, lightweight layer 1 × 3, lightweight layer 2, lightweight layer 1 × 2, lightweight layer 3, and convolution layer conv2;
a lightweight YOLOv3 pedestrian detection network training module, used for dividing a pedestrian detection training set and training the lightweight YOLOv3 pedestrian detection network;
a lightweight YOLOv3 pedestrian detection network verification module, used for dividing a pedestrian detection verification set and verifying the effect of the lightweight YOLOv3 pedestrian detection model;
a lightweight YOLOv3 pedestrian detection model application module, used for deploying the lightweight YOLOv3 pedestrian detection model on an embedded device.
Further, in the lightweight YOLOv3 pedestrian detection network construction module, a three-scale detection module is adopted to detect pedestrians: the small-scale output is used to detect pedestrians that occupy a large proportion of the image, the medium-scale output is used to detect pedestrians that occupy a medium proportion, and the large-scale output is used to detect pedestrians that occupy a small proportion.
Further, the detection module adopted in the lightweight YOLOv3 pedestrian detection network construction module has the following structure: the small-scale branch comprises, in sequence, convolution layers conv3, conv4, conv5, conv6, conv7, conv8 and conv9; the medium-scale branch comprises, in sequence, route layer 1, convolution layer conv10, up-sampling layer 1, route layer 2, and convolution layers conv11, conv12, conv13, conv14, conv15, conv16 and conv17; the large-scale branch comprises, in sequence, route layer 3, convolution layer conv18, up-sampling layer 2, route layer 4, and convolution layers conv19, conv20, conv21, conv22, conv23, conv24 and conv25; lightweight layer 1 comprises, in sequence, an amplification convolution layer (1 × 1 conv), a depthwise convolution layer (3 × 3 DwConv), a compression convolution layer (1 × 1 conv) and a shortcut layer; lightweight layer 2 comprises, in sequence, an amplification convolution layer (1 × 1 conv), a depthwise convolution layer (3 × 3 DwConv) and a compression convolution layer (1 × 1 conv); lightweight layer 3 comprises, in sequence, an amplification convolution layer (1 × 1 conv), a depthwise convolution layer (3 × 3 DwConv) and a compression convolution layer (1 × 1 conv).
The invention has the following advantages:
The invention constructs a lightweight YOLOv3 pedestrian detection network in which a lightweight backbone network replaces the darknet53 backbone network used by the traditional YOLOv3. The forward computation of the lightweight YOLOv3 pedestrian detection network is 41.364 BFLOPS, 71% lower than that of the traditional YOLOv3, which greatly improves the speed of detecting pedestrians in each frame. The parameter data volume of the pedestrian detection model obtained by training the lightweight network is 89 MB, 62% lower than that of the traditional YOLOv3, which reduces the hardware storage requirement. Before each lightweight layer in the backbone network performs its convolution to extract features, an amplification convolution layer increases the number of feature channels participating in the operation, so the extracted image features are richer; meanwhile, lightweight layer 1 fuses low-dimensional and high-dimensional features, which further improves feature expression and ensures that the backbone network has excellent feature extraction capability. The detection head of the lightweight YOLOv3 pedestrian detection network uses three scales to detect large, medium and small pedestrian targets, which greatly reduces the missed detection rate. The rich features extracted by the backbone network, combined with the multi-scale detection of the detection head, allow pedestrian detection to reach higher precision.
In conclusion, the lightweight YOLOv3 pedestrian detection method provided by the invention suits embedded equipment with low computing power and small storage space, maintains high detection precision, and facilitates front-end deployment of perimeter security products.
Drawings
Fig. 1 is a general flowchart of a pedestrian detection method based on lightweight YOLOv3 in an embodiment of the present invention.
Fig. 2 is a structure diagram of the lightweight YOLOv3 pedestrian detection network in an embodiment of the present invention.
Fig. 3 is a structure diagram of a lightweight layer in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a pedestrian detection method based on lightweight YOLOv3, which, as shown in Fig. 1, comprises the following steps:
S1, establishing a perimeter security pedestrian detection data set
Images are collected to establish a pedestrian detection data set for the perimeter security application scenario, and the image sources of the data set are kept diverse. Specifically: real pedestrian images are acquired from protected-site scenes; pedestrian images in natural scenes that meet the requirements are extracted from open source data sets; and images of protected-site scenes or natural scenes containing no people are collected as background images. The ratio of pedestrian images to background images in the established perimeter security pedestrian detection data set is approximately 1:1, and the total number of images reaches 87300.
The collected images are annotated; each annotated image corresponds to an annotation file of the same name in txt format. The position of each pedestrian in the collected real pedestrian images is annotated to generate the corresponding annotation file; the position information of each pedestrian in the existing annotation files of the open source pedestrian images is converted to generate a new annotation file; and a blank txt document is generated for each background image as its annotation file. In the annotation file of a pedestrian image, the position information of each pedestrian is stored as one line, which holds, in order, the x coordinate of the upper left corner, the y coordinate of the upper left corner, the width and the height of the pedestrian's bounding box.
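The annotation layout described above can be illustrated with a short sketch. It assumes whitespace-separated values on each line and, for the open source data, an existing annotation given as corner coordinates; neither detail is stated in the text. The helper names (`write_annotation`, `read_annotation`, `corners_to_xywh`) and the file names are hypothetical.

```python
import tempfile
from pathlib import Path


def write_annotation(path, boxes):
    """Write one 'x y w h' line per pedestrian; an empty list yields a blank file."""
    lines = [" ".join(str(v) for v in box) for box in boxes]
    Path(path).write_text("\n".join(lines) + ("\n" if lines else ""))


def read_annotation(path):
    """Read an annotation file back into a list of (x, y, w, h) tuples."""
    boxes = []
    for line in Path(path).read_text().splitlines():
        if line.strip():
            x, y, w, h = (float(v) for v in line.split())
            boxes.append((x, y, w, h))
    return boxes


def corners_to_xywh(x_min, y_min, x_max, y_max):
    """Convert an assumed corner-format box to top-left x, y, width, height."""
    return (x_min, y_min, x_max - x_min, y_max - y_min)


with tempfile.TemporaryDirectory() as tmp:
    pedestrian_file = Path(tmp) / "pedestrian_0001.txt"
    background_file = Path(tmp) / "background_0001.txt"
    write_annotation(pedestrian_file, [corners_to_xywh(40, 60, 100, 220)])
    write_annotation(background_file, [])  # background image: blank label file
    print(read_annotation(pedestrian_file))
    print(read_annotation(background_file))
```

A blank file keeps the one-file-per-image convention intact, so background images can pass through the same loading code as pedestrian images.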
S2, constructing a lightweight YOLOv3 pedestrian detection network
The lightweight YOLOv3 pedestrian detection network adopts a lightweight backbone network to replace the darknet53 backbone network used by the traditional YOLOv3, and detects pedestrians with a three-scale detection module: the small-scale output tensor 19 × 19 × 18 is used to detect pedestrians that occupy a large proportion of the image, the medium-scale output tensor 38 × 38 × 18 is used to detect pedestrians that occupy a medium proportion, and the large-scale output tensor 76 × 76 × 18 is used to detect pedestrians that occupy a small proportion. The structure of the lightweight YOLOv3 pedestrian detection network is shown in Fig. 2, and the output tensors of each stage are shown in the following table:
The lightweight backbone network used in the lightweight YOLOv3 pedestrian detection network comprises, in sequence, convolution layer conv1, lightweight layer 3, lightweight layer 2, lightweight layer 1 × 2, lightweight layer 3, lightweight layer 1 × 3, lightweight layer 2, lightweight layer 1 × 2, lightweight layer 3, and convolution layer conv2.
The small-scale detection branch adopted by the lightweight YOLOv3 pedestrian detection network comprises, in sequence, convolution layers conv3, conv4, conv5, conv6, conv7, conv8 and conv9; the medium-scale branch comprises, in sequence, route layer 1, convolution layer conv10, up-sampling layer 1, route layer 2, and convolution layers conv11, conv12, conv13, conv14, conv15, conv16 and conv17; the large-scale branch comprises, in sequence, route layer 3, convolution layer conv18, up-sampling layer 2, route layer 4, and convolution layers conv19, conv20, conv21, conv22, conv23, conv24 and conv25.
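The three output tensor sizes quoted for the detection module can be reproduced with a small calculation. The sketch below assumes the usual YOLOv3 configuration, which the text does not state explicitly: a 608 × 608 input, down-sampling strides of 32, 16 and 8, three anchor boxes per cell, and a single class (pedestrian). The function name `head_output_shape` is hypothetical.

```python
def head_output_shape(input_size, stride, num_anchors=3, num_classes=1):
    """Return (grid, grid, channels) for one detection scale.

    Each anchor predicts 4 box values, 1 confidence and num_classes scores.
    All default values are assumptions, not figures taken from the patent.
    """
    grid = input_size // stride
    channels = num_anchors * (4 + 1 + num_classes)
    return (grid, grid, channels)


for name, stride in (("small-scale", 32), ("medium-scale", 16), ("large-scale", 8)):
    print(name, head_output_shape(608, stride))
```

Under these assumptions the coarsest grid (19 × 19) covers the largest image regions per cell, which is why it is paired with pedestrians that occupy a large proportion of the image.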
The lightweight backbone network uses three different lightweight layers, whose structures are shown in Fig. 3; the channel counts and resolutions below are stated relative to the feature map entering the lightweight layer. Lightweight layer 1 comprises the following operations in sequence: an amplification convolution layer (1 × 1 conv), whose output feature map has 6 times as many channels as the input feature map and the same resolution; a depthwise convolution layer (3 × 3 DwConv, stride 1), whose output feature map has 6 times as many channels as the input feature map and the same resolution; a compression convolution layer (1 × 1 conv), whose output feature map has the same number of channels and the same resolution as the input feature map; and a shortcut layer, whose output feature map has the same number of channels and the same resolution as the input feature map.
Lightweight layer 2 comprises the following operations in sequence: an amplification convolution layer (1 × 1 conv), whose output feature map has 6 times as many channels as the input feature map and the same resolution; a depthwise convolution layer (3 × 3 DwConv, stride 2), whose output feature map has 6 times as many channels as the input feature map and 1/2 of its resolution; and a compression convolution layer (1 × 1 conv), whose output feature map has the same number of channels as the input feature map and 1/2 of its resolution. Lightweight layer 3 comprises the following operations in sequence: an amplification convolution layer (1 × 1 conv), whose output feature map has 6 times as many channels as the input feature map and the same resolution; a depthwise convolution layer (3 × 3 DwConv, stride 1), whose output feature map has 6 times as many channels as the input feature map and the same resolution; and a compression convolution layer (1 × 1 conv), whose output feature map has the same number of channels and the same resolution as the input feature map.
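The channel and resolution bookkeeping of the three lightweight layers can be traced with a short sketch. It computes shapes only and performs no convolution. It reads every ratio as relative to the feature map entering the lightweight layer and assumes an even input resolution for the stride-2 case. The function name `lightweight_layer_trace` and the example sizes are hypothetical.

```python
def lightweight_layer_trace(kind, channels, resolution, expansion=6):
    """List (operation, channels, resolution) after each step of a lightweight layer.

    kind is 1, 2 or 3, following the numbering in the text; only kind 2 uses
    stride 2, and only kind 1 ends with a shortcut layer.
    """
    if kind not in (1, 2, 3):
        raise ValueError("kind must be 1, 2 or 3")
    stride = 2 if kind == 2 else 1
    expanded = channels * expansion
    out_resolution = resolution // stride
    trace = [
        ("amplification 1x1 conv", expanded, resolution),
        (f"depthwise 3x3 conv, stride {stride}", expanded, out_resolution),
        ("compression 1x1 conv", channels, out_resolution),
    ]
    if kind == 1:
        # The shortcut adds the layer input to the compressed output,
        # so it needs matching channels and resolution.
        trace.append(("shortcut", channels, out_resolution))
    return trace


for step in lightweight_layer_trace(2, channels=32, resolution=152):
    print(step)
```

The trace makes the role of the shortcut visible: it is only possible in lightweight layer 1, where the output shape equals the input shape and the layer is repeated (for example 1 × 2 or 1 × 3 in the backbone).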
The constructed lightweight YOLOv3 pedestrian detection network replaces the darknet53 backbone network used by the conventional YOLOv3 with a lightweight backbone network. Its forward computation amount is 41.364 BFLOPS, 71% lower than that of the conventional YOLOv3, which greatly increases the speed of detecting pedestrians in each image frame. At the same time, the amplification convolution layer in each lightweight layer increases the number of feature channels participating in the operation, so that high pedestrian detection accuracy can be maintained.
S3, dividing a pedestrian detection training set, and training the lightweight YOLOv3 pedestrian detection network
The pedestrian detection training set is formed by randomly selecting 90% of the images from the perimeter security pedestrian detection data set. Online data enhancement is applied to the training images during training, as follows: two original training images are randomly selected and subjected to random cropping, random scaling and random color transformation, and their annotation information is transformed correspondingly according to the cropping and scaling operations; the two transformed training images are then fused into a new training image, and their annotation information is merged to serve as the annotation of the new training image. The formula for fusing the two training images is as follows:
$$I(x,y) = 0.5 \times I_1(x,y) + 0.5 \times I_2(x,y)$$
where $I_1(x,y)$ and $I_2(x,y)$ denote the pixel values of the two transformed training images at the coordinate point $(x,y)$, and $I(x,y)$ denotes the pixel value of the new fused training image at the coordinate point $(x,y)$.
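The fusion step can be sketched in a few lines. This is a minimal illustration that assumes both transformed images have already been brought to the same size and are stored as nested lists indexed `image[y][x]`; the function names `fuse_images` and `fuse_samples` are hypothetical.

```python
def fuse_images(image1, image2):
    """Pixel-wise fusion I = 0.5 * I1 + 0.5 * I2 of two equally sized images."""
    if len(image1) != len(image2) or any(
        len(row1) != len(row2) for row1, row2 in zip(image1, image2)
    ):
        raise ValueError("both images must have the same size")
    return [
        [0.5 * p1 + 0.5 * p2 for p1, p2 in zip(row1, row2)]
        for row1, row2 in zip(image1, image2)
    ]


def fuse_samples(image1, boxes1, image2, boxes2):
    """Fuse two transformed training samples.

    The annotation of the new image is the combined annotation of both inputs.
    """
    return fuse_images(image1, image2), list(boxes1) + list(boxes2)


fused, boxes = fuse_samples(
    [[0, 100], [50, 200]], [(0, 0, 1, 1)],
    [[200, 100], [50, 0]], [(1, 1, 1, 1)],
)
print(fused)
print(boxes)
```

Because the weights are equal, every pedestrian from either source image stays visible at half intensity, which is why the annotations of both images are kept.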
Multi-resolution training is adopted during training: the resolution to which the training images are scaled is not fixed, but is changed randomly after every 20 training iterations. The selectable resolutions are 320, 352, 384, 416, 448, 480, 512, 544, 576 and 608.
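The resolution schedule described above can be sketched as a generator. This is a hedged illustration only: the name `resolution_schedule`, the `seed` parameter, and drawing the very first resolution at random are assumptions that the text does not specify.

```python
import random

# Selectable input resolutions listed in the text (multiples of 32).
RESOLUTIONS = [320, 352, 384, 416, 448, 480, 512, 544, 576, 608]


def resolution_schedule(num_iterations, period=20, seed=None):
    """Yield the input resolution to use for each training iteration.

    A new resolution is drawn at random after every `period` iterations.
    """
    rng = random.Random(seed)
    current = rng.choice(RESOLUTIONS)  # assumption: the first one is random too
    for iteration in range(num_iterations):
        if iteration > 0 and iteration % period == 0:
            current = rng.choice(RESOLUTIONS)
        yield current


schedule = list(resolution_schedule(60, seed=0))
print(schedule[0], schedule[20], schedule[40])
```

All listed resolutions are multiples of 32, so each one maps to whole-numbered detection grids at the three output scales.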
S3-3, the loss function used to train the lightweight YOLOv3 is as follows:
where $S$ denotes the size of the detection module adopted by the lightweight pedestrian detection network, taking the values 19, 38 and 76; $B$ denotes the number of target boxes predicted by each cell at each detection scale of the detection module, and its value is 3; $I_{ij}^{obj}$ indicates whether the $j$-th predicted target box of the $i$-th cell at a given scale contains a target, with $I_{ij}^{obj}=1$ if it contains a target and $I_{ij}^{obj}=0$ if it does not; $x_i, y_i, w_i, h_i, C_i$ denote the center x coordinate, center y coordinate, width, height and confidence of the predicted target box of the $i$-th cell at a given scale for which $I_{ij}^{obj}=1$; $\hat{x}_i, \hat{y}_i, \hat{w}_i, \hat{h}_i, \hat{C}_i$ denote the center x coordinate, center y coordinate, width, height and confidence of the pre-annotated target; $classes$ denotes the target categories to be detected, $p_i(c)$ is the predicted probability of each category, and $\hat{p}_i(c)$ is the true probability of each category. The first row of the loss function represents the loss on the center coordinates of the effective predicted targets; the second row represents the loss on the width and height of the effective predicted targets; the third row represents the confidence loss over all predicted boxes; the fourth row represents the category loss of the effective predicted targets.
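The four-row structure of the loss can be illustrated with a small sketch. The equation itself is not reproduced in this text, so the concrete error terms below are assumptions borrowed from common YOLOv3 practice: squared error for the center coordinates and for width and height, and binary cross-entropy for confidence and category. Per-row weighting factors, if any, are omitted, and the names `detection_loss` and `binary_cross_entropy` are hypothetical.

```python
import math


def binary_cross_entropy(p, t, eps=1e-7):
    """Binary cross-entropy between prediction p and target t (p is clamped)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))


def detection_loss(predictions, targets):
    """Sum the four rows of a YOLO-style loss over all predicted boxes.

    Each prediction is a dict with keys x, y, w, h, conf, cls (list of
    probabilities). Each target has the same keys plus obj (1 if the box is
    responsible for a target, else 0).
    """
    rows = {"center": 0.0, "size": 0.0, "confidence": 0.0, "class": 0.0}
    for pred, tgt in zip(predictions, targets):
        # Row 3: confidence loss over all predicted boxes.
        rows["confidence"] += binary_cross_entropy(pred["conf"], tgt["conf"])
        if tgt["obj"] == 1:
            # Rows 1, 2 and 4 only count effective predictions (indicator = 1).
            rows["center"] += (pred["x"] - tgt["x"]) ** 2 + (pred["y"] - tgt["y"]) ** 2
            rows["size"] += (pred["w"] - tgt["w"]) ** 2 + (pred["h"] - tgt["h"]) ** 2
            rows["class"] += sum(
                binary_cross_entropy(p, t) for p, t in zip(pred["cls"], tgt["cls"])
            )
    rows["total"] = sum(rows.values())
    return rows
```

The sketch keeps the one structural property the text does state: only the confidence row runs over every predicted box, while the other three rows are gated by the indicator.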
The trained lightweight YOLOv3 pedestrian detection model has a parameter data volume of 89 MB, 62% smaller than that of the conventional YOLOv3, which reduces the storage space required on the embedded device.
S4, dividing a pedestrian detection verification set, and verifying the effect of the lightweight YOLOv3 pedestrian detection model
The pedestrian detection verification set is formed by randomly selecting 10% of the samples from the perimeter security pedestrian detection data set. The verification set and the training set share no images, and their union is the complete perimeter security pedestrian detection data set.
When the effect of the lightweight YOLOv3 pedestrian detection model is verified, each image in the verification set is selected in turn and scaled to 608 × 608; the trained lightweight pedestrian detection model detects the pedestrians present in the image and their positions; the detection results are saved and compared with the pedestrian positions in the annotation file corresponding to the image. Finally, the overall recall rate and accuracy of the lightweight pedestrian detection model on the pedestrian detection verification set are obtained and used to evaluate the detection effect of the model.
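The comparison between detections and annotations can be sketched as follows. The text does not state the matching rule, so this sketch assumes that a detection counts as correct when its intersection over union (IoU) with a not-yet-matched annotated box is at least 0.5, and it interprets "accuracy" as precision (correct detections divided by all detections). Boxes use the (x, y, width, height) form with the upper left corner as origin; the function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    return inter / (aw * ah + bw * bh - inter)


def evaluate(detections, annotations, iou_threshold=0.5):
    """Return (recall, precision) over a verification set.

    detections / annotations: dicts mapping image name -> list of boxes.
    Background images simply map to an empty annotation list.
    """
    true_pos = num_det = num_gt = 0
    for name, gt_boxes in annotations.items():
        det_boxes = detections.get(name, [])
        num_det += len(det_boxes)
        num_gt += len(gt_boxes)
        unmatched = list(gt_boxes)
        for det in det_boxes:
            # Greedily match each detection to the best remaining annotation.
            best = max(unmatched, key=lambda gt: iou(det, gt), default=None)
            if best is not None and iou(det, best) >= iou_threshold:
                true_pos += 1
                unmatched.remove(best)
    recall = true_pos / num_gt if num_gt else 1.0
    precision = true_pos / num_det if num_det else 1.0
    return recall, precision
```

With this bookkeeping, any detection on a background image lowers precision without affecting recall, so both figures are needed to judge the model.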
S5, deploying the lightweight YOLOv3 pedestrian detection model on an embedded device
The forward computation amount of the constructed lightweight pedestrian detection network is 41.364 BFLOPS, 71% lower than that of the conventional YOLOv3, which greatly increases pedestrian detection speed while still allowing high detection precision; the trained lightweight YOLOv3 pedestrian detection model has a parameter data volume of 89 MB, 62% smaller than that of the conventional YOLOv3, which reduces the storage space required on the embedded device. A lightweight pedestrian detection model that meets the recall rate and accuracy requirements is deployed on the embedded device, where it suits the device's low computing capability and small storage space.
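As a quick consistency check on the stated figures, the baseline values they imply for the conventional YOLOv3 can be computed directly. The results are only as precise as the rounded percentages in the text (roughly 143 BFLOPS and 234 MB) and are derived values, not measurements; the helper name `implied_baseline` is hypothetical.

```python
def implied_baseline(reduced_value, reduction_fraction):
    """Baseline value implied by a reduced value and its fractional reduction."""
    return reduced_value / (1.0 - reduction_fraction)


flops_baseline = implied_baseline(41.364, 0.71)  # BFLOPS
size_baseline = implied_baseline(89.0, 0.62)     # MB

print(f"implied conventional YOLOv3 computation: {flops_baseline:.1f} BFLOPS")
print(f"implied conventional YOLOv3 model size: {size_baseline:.1f} MB")
```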
The invention constructs a lightweight YOLOv3 pedestrian detection network in which a lightweight backbone network replaces the darknet53 backbone network used by the conventional YOLOv3. The forward computation amount of the network is 41.364 BFLOPS, 71% lower than that of the conventional YOLOv3, which greatly increases the speed of detecting pedestrians in each image frame. The pedestrian detection model obtained by training the network has a parameter data volume of 89 MB, 62% smaller than that of the conventional YOLOv3, which reduces the hardware storage requirement. Before the convolution that extracts features, each lightweight layer in the backbone network increases the number of feature channels participating in the operation through an amplification convolution layer, so the extracted image features are richer; in addition, lightweight layer 1 fuses low-dimensional and high-dimensional features, which further improves feature expression and gives the backbone network strong feature extraction capability. The detection head of the network uses three scales to detect large, medium and small pedestrian targets, which greatly reduces the missed detection rate. The rich features extracted by the backbone network, combined with the multi-scale detection of the detection head, allow pedestrian detection to achieve high precision.
In conclusion, the lightweight YOLOv3 pedestrian detection method provided by the invention is suitable for embedded equipment with low computing power and small storage space, can ensure high detection precision, and is convenient for front-end application of perimeter security products.
The invention also provides a pedestrian detection system based on lightweight YOLOv3, whose flow chart is shown in Fig. 1 and which comprises the following modules:
Detection data set construction module
Images are collected and a pedestrian detection data set is established for the perimeter security application scenario, ensuring that the image sources of the data set are diverse. Specifically: real pedestrian images are acquired from the scenes to be protected; pedestrian images in natural scenes that meet the requirements are extracted from open source data sets; and images of the scenes to be protected, or of natural scenes, with no person present are collected as background images. The ratio of pedestrian images to background images in the established perimeter security pedestrian detection data set is approximately 1:1, and the total number of images reaches 87,300.
The collected images are annotated; each annotated image corresponds to an annotation file of the same name in txt format. The position of each pedestrian in the collected real pedestrian images is annotated to generate the corresponding annotation file; the position information of each pedestrian in the existing annotation files of the pedestrian images from the open source data sets is converted to generate a new annotation file; and a blank txt file is generated as the annotation file of each background image. In the annotation file of a pedestrian image, the position information of each pedestrian is stored as one line containing, in order, the x coordinate of the upper left corner, the y coordinate of the upper left corner, the width and the height of the pedestrian's bounding box.
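The annotation format described above (one line per pedestrian; an empty file for a background image) can be read and written with a short sketch. The field separator is not stated in the text, so whitespace separation is an assumption here, as are the function names `parse_annotation` and `format_annotation`.

```python
def parse_annotation(text):
    """Parse annotation text into a list of (x, y, width, height) tuples.

    x and y are the upper left corner of the pedestrian's bounding box.
    An empty text (background image) yields an empty list.
    """
    boxes = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # blank file or blank line: no pedestrian
        if len(fields) != 4:
            raise ValueError(f"expected 4 values per line, got {len(fields)}")
        x, y, w, h = (float(value) for value in fields)
        boxes.append((x, y, w, h))
    return boxes


def format_annotation(boxes):
    """Serialize boxes back to annotation text, one pedestrian per line."""
    return "".join(f"{x:g} {y:g} {w:g} {h:g}\n" for x, y, w, h in boxes)


print(parse_annotation("10 20 30 40\n5 6 7 8\n"))
print(repr(format_annotation([])))
```

Giving background images an empty annotation file lets the training and verification code treat every image the same way, with zero boxes as a valid label.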
Lightweight YOLOv3 pedestrian detection network construction module
The lightweight YOLOv3 pedestrian detection network uses a lightweight backbone network in place of the darknet53 backbone network used by the conventional YOLOv3, and detects pedestrians with a three-scale detection module: the small-scale output tensor of 19 × 19 × 18 is used to detect pedestrians that occupy a large proportion of the image, the medium-scale output tensor of 38 × 38 × 18 is used to detect pedestrians that occupy a medium proportion, and the large-scale output tensor of 76 × 76 × 18 is used to detect pedestrians that occupy a small proportion. The structure of the lightweight YOLOv3 pedestrian detection network is shown in Fig. 2, and the output tensors of each stage are shown in the following table:
The lightweight backbone network used in the lightweight YOLOv3 pedestrian detection network comprises, in order, convolution layer conv1, lightweight layer 3, lightweight layer 2, lightweight layer 1 × 2, lightweight layer 3, lightweight layer 1 × 3, lightweight layer 2, lightweight layer 1 × 2, lightweight layer 3, and convolution layer conv2.
The small-scale detection module adopted by the lightweight YOLOv3 pedestrian detection network comprises, in order, convolution layers conv3, conv4, conv5, conv6, conv7, conv8 and conv9; the medium-scale detection module comprises, in order, route layer 1, convolution layer conv10, up-sampling layer 1, route layer 2, and convolution layers conv11, conv12, conv13, conv14, conv15, conv16 and conv17; the large-scale detection module comprises, in order, route layer 3, convolution layer conv18, up-sampling layer 2, route layer 4, and convolution layers conv19, conv20, conv21, conv22, conv23, conv24 and conv25.
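The three output tensor sizes follow from simple arithmetic, sketched below. The strides 32, 16 and 8 are inferred from a 608 × 608 input and the grid sizes 19, 38 and 76; the depth of 18 is consistent with 3 predicted boxes per cell, each carrying 4 box values, 1 confidence and 1 category probability, which is the usual YOLOv3 encoding and is assumed here rather than stated in the text. The name `head_output_shapes` is hypothetical.

```python
def head_output_shapes(input_size=608, strides=(32, 16, 8),
                       boxes_per_cell=3, num_classes=1):
    """Return the (grid, grid, depth) output shape of each detection scale."""
    # Per box: 4 coordinates + 1 confidence + one probability per category.
    depth = boxes_per_cell * (4 + 1 + num_classes)
    shapes = []
    for stride in strides:
        if input_size % stride:
            raise ValueError("input size must be a multiple of every stride")
        grid = input_size // stride
        shapes.append((grid, grid, depth))
    return shapes


print(head_output_shapes(608))
print(head_output_shapes(320))
```

The coarsest grid has the fewest cells and the largest receptive field per cell, which is why the text assigns it to pedestrians that fill a large part of the image.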
The lightweight backbone network uses three different lightweight layers, whose structures are shown in Fig. 3. Lightweight layer 1 comprises the following operations in sequence: an amplification convolution layer (1 × 1 conv), whose output feature map has 6 times as many channels as the input feature map and the same resolution as the input feature map; a depthwise convolution layer (3 × 3 DwConv, stride 1), whose output feature map has 6 times as many channels as the input feature map and the same resolution as the input feature map; a compression convolution layer (1 × 1 conv), whose output feature map has the same number of channels and the same resolution as the input feature map; and a shortcut layer, whose output feature map has the same number of channels and the same resolution as the input feature map.
Lightweight layer 2 comprises the following operations in sequence: an amplification convolution layer (1 × 1 conv), whose output feature map has 6 times as many channels as the input feature map and the same resolution as the input feature map; a depthwise convolution layer (3 × 3 DwConv, stride 2), whose output feature map has 6 times as many channels as the input feature map and 1/2 the resolution of the input feature map; and a compression convolution layer (1 × 1 conv), whose output feature map has the same number of channels as the input feature map and 1/2 the resolution of the input feature map.
Lightweight layer 3 comprises the following operations in sequence: an amplification convolution layer (1 × 1 conv), whose output feature map has 6 times as many channels as the input feature map and the same resolution as the input feature map; a depthwise convolution layer (3 × 3 DwConv, stride 1), whose output feature map has 6 times as many channels as the input feature map and the same resolution as the input feature map; and a compression convolution layer (1 × 1 conv), whose output feature map has the same number of channels and the same resolution as the input feature map.
The constructed lightweight YOLOv3 pedestrian detection network replaces the darknet53 backbone network used by the conventional YOLOv3 with a lightweight backbone network. Its forward computation amount is 41.364 BFLOPS, 71% lower than that of the conventional YOLOv3, which greatly increases the speed of detecting pedestrians in each image frame. At the same time, the amplification convolution layer in each lightweight layer increases the number of feature channels participating in the operation, so that high pedestrian detection accuracy can be maintained.
Lightweight YOLOv3 pedestrian detection network training module
The pedestrian detection training set is formed by randomly selecting 90% of the images from the perimeter security pedestrian detection data set. Online data enhancement is applied to the training images during training, as follows: two original training images are randomly selected and subjected to random cropping, random scaling and random color transformation, and their annotation information is transformed correspondingly according to the cropping and scaling operations; the two transformed training images are then fused into a new training image, and their annotation information is merged to serve as the annotation of the new training image. The formula for fusing the two training images is as follows:
$$I(x,y) = 0.5 \times I_1(x,y) + 0.5 \times I_2(x,y)$$
where $I_1(x,y)$ and $I_2(x,y)$ denote the pixel values of the two transformed training images at the coordinate point $(x,y)$, and $I(x,y)$ denotes the pixel value of the new fused training image at the coordinate point $(x,y)$.
Multi-resolution training is adopted during training: the resolution to which the training images are scaled is not fixed, but is changed randomly after every 20 training iterations. The selectable resolutions are 320, 352, 384, 416, 448, 480, 512, 544, 576 and 608.
S3-3, the loss function used to train the lightweight YOLOv3 is as follows:
where $S$ denotes the size of the detection module adopted by the lightweight pedestrian detection network, taking the values 19, 38 and 76; $B$ denotes the number of target boxes predicted by each cell at each detection scale of the detection module, and its value is 3; $I_{ij}^{obj}$ indicates whether the $j$-th predicted target box of the $i$-th cell at a given scale contains a target, with $I_{ij}^{obj}=1$ if it contains a target and $I_{ij}^{obj}=0$ if it does not; $x_i, y_i, w_i, h_i, C_i$ denote the center x coordinate, center y coordinate, width, height and confidence of the predicted target box of the $i$-th cell at a given scale for which $I_{ij}^{obj}=1$; $\hat{x}_i, \hat{y}_i, \hat{w}_i, \hat{h}_i, \hat{C}_i$ denote the center x coordinate, center y coordinate, width, height and confidence of the pre-annotated target; $classes$ denotes the target categories to be detected, $p_i(c)$ is the predicted probability of each category, and $\hat{p}_i(c)$ is the true probability of each category. The first row of the loss function represents the loss on the center coordinates of the effective predicted targets; the second row represents the loss on the width and height of the effective predicted targets; the third row represents the confidence loss over all predicted boxes; the fourth row represents the category loss of the effective predicted targets.
The trained lightweight YOLOv3 pedestrian detection model has a parameter data volume of 89 MB, 62% smaller than that of the conventional YOLOv3, which reduces the storage space required on the embedded device.
Lightweight YOLOv3 pedestrian detection model verification module
The pedestrian detection verification set is formed by randomly selecting 10% of the samples from the perimeter security pedestrian detection data set. The verification set and the training set share no images, and their union is the complete perimeter security pedestrian detection data set.
When the effect of the lightweight YOLOv3 pedestrian detection model is verified, each image in the verification set is selected in turn and scaled to 608 × 608; the trained lightweight pedestrian detection model detects the pedestrians present in the image and their positions; the detection results are saved and compared with the pedestrian positions in the annotation file corresponding to the image. Finally, the overall recall rate and accuracy of the lightweight pedestrian detection model on the pedestrian detection verification set are obtained and used to evaluate the detection effect of the model.
Lightweight YOLOv3 pedestrian detection model deployment module
The forward computation amount of the constructed lightweight pedestrian detection network is 41.364 BFLOPS, 71% lower than that of the conventional YOLOv3, which greatly increases pedestrian detection speed while still allowing high detection precision; the trained lightweight YOLOv3 pedestrian detection model has a parameter data volume of 89 MB, 62% smaller than that of the conventional YOLOv3, which reduces the storage space required on the embedded device. A lightweight pedestrian detection model that meets the recall rate and accuracy requirements is deployed on the embedded device, where it suits the device's low computing capability and small storage space.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A pedestrian detection method based on light-weight YOLOv3 is characterized by comprising the following steps:
S1, establishing a perimeter security pedestrian detection data set, comprising: acquiring and annotating real pedestrian images in the protected scene; extracting pedestrian images in natural scenes contained in open source data sets and converting their annotation information; and collecting images without people, equal in number to the pedestrian images, as background images and constructing a blank file for each background image as its annotation;
S2, constructing a lightweight YOLOv3 pedestrian detection network; the lightweight backbone network adopted by the lightweight YOLOv3 pedestrian detection network comprises, in order, convolution layer conv1, lightweight layer 3, lightweight layer 2, lightweight layer 1 × 2, lightweight layer 3, lightweight layer 1 × 3, lightweight layer 2, lightweight layer 1 × 2, lightweight layer 3, and convolution layer conv2;
S3, dividing a pedestrian detection training set, and training the lightweight YOLOv3 pedestrian detection network;
S4, dividing a pedestrian detection verification set, and verifying the effect of the lightweight YOLOv3 pedestrian detection model;
S5, deploying the lightweight YOLOv3 pedestrian detection model on an embedded device.
2. The pedestrian detection method based on lightweight YOLOv3 according to claim 1, wherein the lightweight YOLOv3 pedestrian detection network constructed in S2 extracts features and detects pedestrians with a three-scale detection module: the small-scale output is used to detect pedestrians with a large target proportion, the medium-scale output is used to detect pedestrians with a medium target proportion, and the large-scale output is used to detect pedestrians with a small target proportion.
3. The pedestrian detection method based on lightweight YOLOv3 according to claim 1, wherein the detection module adopted by the lightweight YOLOv3 pedestrian detection network constructed in S2 is structured as follows: the small-scale detection module comprises, in order, convolution layers conv3, conv4, conv5, conv6, conv7, conv8 and conv9; the medium-scale detection module comprises, in order, route layer 1, convolution layer conv10, up-sampling layer 1, route layer 2, and convolution layers conv11, conv12, conv13, conv14, conv15, conv16 and conv17; the large-scale detection module comprises, in order, route layer 3, convolution layer conv18, up-sampling layer 2, route layer 4, and convolution layers conv19, conv20, conv21, conv22, conv23, conv24 and conv25.
4. The pedestrian detection method based on lightweight YOLOv3 according to claim 3, wherein, in the lightweight YOLOv3 pedestrian detection network constructed in S2, lightweight layer 1 comprises, in order, an amplification convolution layer (1 × 1 conv), a depthwise convolution layer (3 × 3 DwConv), a compression convolution layer (1 × 1 conv) and a shortcut layer; lightweight layer 2 comprises, in order, an amplification convolution layer (1 × 1 conv), a depthwise convolution layer (3 × 3 DwConv) and a compression convolution layer (1 × 1 conv); and lightweight layer 3 comprises, in order, an amplification convolution layer (1 × 1 conv), a depthwise convolution layer (3 × 3 DwConv) and a compression convolution layer (1 × 1 conv).
5. The pedestrian detection method based on lightweight YOLOv3 according to claim 1, wherein, in training the lightweight YOLOv3 pedestrian detection network in S3, image samples in a set proportion are randomly selected from the perimeter security pedestrian detection data set as the pedestrian detection training set, and online data enhancement is applied to the training images during training, comprising: randomly selecting two original training images and subjecting them to random cropping, random scaling and random color transformation, and transforming their annotation information correspondingly according to the cropping and scaling operations; and fusing the two transformed training images into a new training image, and merging their annotation information to serve as the annotation of the new training image. The formula for fusing the two training images is as follows:
$$I(x,y) = 0.5 \times I_1(x,y) + 0.5 \times I_2(x,y)$$
where $I_1(x,y)$ and $I_2(x,y)$ denote the pixel values of the two transformed training images at the coordinate point $(x,y)$, and $I(x,y)$ denotes the pixel value of the new fused training image at the coordinate point $(x,y)$.
6. The pedestrian detection method based on lightweight YOLOv3 according to claim 1, wherein the training of the lightweight YOLOv3 pedestrian detection network in S3 is stopped when the loss function has stabilized and no longer decreases, and the loss function used in the training process is as follows:
where $S$ denotes the size of the detection module adopted by the lightweight pedestrian detection network, and $B$ denotes the number of target boxes predicted by each cell at each detection scale of the detection module; $I_{ij}^{obj}$ indicates whether the $j$-th predicted target box of the $i$-th cell at a given scale contains a target, with $I_{ij}^{obj}=1$ if it contains a target and $I_{ij}^{obj}=0$ if it does not; $x_i, y_i, w_i, h_i, C_i$ denote the center x coordinate, center y coordinate, width, height and confidence of the predicted target box of the $i$-th cell at a given scale for which $I_{ij}^{obj}=1$; $\hat{x}_i, \hat{y}_i, \hat{w}_i, \hat{h}_i, \hat{C}_i$ denote the center x coordinate, center y coordinate, width, height and confidence of the pre-annotated target; $classes$ denotes the target categories to be detected, $p_i(c)$ is the predicted probability of each category, and $\hat{p}_i(c)$ is the true probability of each category;
the first line of the loss function represents the loss of effective predicted target center coordinates; the second row represents the penalty on the effective predicted target width and height; the third row represents confidence loss for all prediction boxes; the fourth row represents the class penalty for an effective prediction target.
7. The pedestrian detection method based on lightweight YOLOv3 according to claim 1, wherein, in verifying the effect of the lightweight pedestrian detection model in S4, a proportion of the samples in the perimeter security pedestrian detection data set is randomly selected as the pedestrian detection verification set; the trained lightweight pedestrian detection model detects the pedestrians and their positions in each image sample of the verification set; the detection results are saved and compared with the pedestrian positions in the annotation information of the verification set; and the overall recall rate and accuracy of the lightweight pedestrian detection model on the pedestrian detection verification set are finally obtained.
8. A pedestrian detection system based on lightweight YOLOv3, characterized by comprising:
a data set establishing module, used for establishing a perimeter security pedestrian detection data set, comprising: acquiring and annotating real pedestrian images in the protected scene; extracting pedestrian images in natural scenes contained in open source data sets and converting their annotation information; and collecting images without people, equal in number to the pedestrian images, as background images and constructing a blank file for each background image as its annotation;
a lightweight YOLOv3 pedestrian detection network construction module, used for constructing a lightweight YOLOv3 pedestrian detection network; the lightweight backbone network adopted by the lightweight YOLOv3 pedestrian detection network comprises, in order, convolution layer conv1, lightweight layer 3, lightweight layer 2, lightweight layer 1 × 2, lightweight layer 3, lightweight layer 1 × 3, lightweight layer 2, lightweight layer 1 × 2, lightweight layer 3, and convolution layer conv2;
a lightweight YOLOv3 pedestrian detection network training module, which divides a pedestrian detection training set and trains the lightweight YOLOv3 pedestrian detection network;
a lightweight YOLOv3 pedestrian detection network verification module, which divides a pedestrian detection verification set and verifies the effect of the lightweight YOLOv3 pedestrian detection model;
a lightweight YOLOv3 pedestrian detection model application module, which deploys the lightweight YOLOv3 pedestrian detection model on an embedded device.
9. The pedestrian detection system based on light-weight YOLOv3 of claim 8, wherein in the light-weight YOLOv3 pedestrian detection network construction module, a three-scale detection module is used to detect pedestrians: the small-scale output is used for detecting pedestrians with large target proportion, the medium-scale output is used for detecting pedestrians with medium target proportion, and the large-scale output is used for detecting pedestrians with small target proportion.
10. The pedestrian detection system based on lightweight YOLOv3 according to claim 8, wherein the detection module adopted in the lightweight YOLOv3 pedestrian detection network construction module is structured as follows: the small-scale detection module comprises, in order, convolution layers conv3, conv4, conv5, conv6, conv7, conv8 and conv9; the medium-scale detection module comprises, in order, route layer 1, convolution layer conv10, up-sampling layer 1, route layer 2, and convolution layers conv11, conv12, conv13, conv14, conv15, conv16 and conv17; the large-scale detection module comprises, in order, route layer 3, convolution layer conv18, up-sampling layer 2, route layer 4, and convolution layers conv19, conv20, conv21, conv22, conv23, conv24 and conv25; lightweight layer 1 comprises, in order, an amplification convolution layer (1 × 1 conv), a depthwise convolution layer (3 × 3 DwConv), a compression convolution layer (1 × 1 conv) and a shortcut layer; lightweight layer 2 comprises, in order, an amplification convolution layer (1 × 1 conv), a depthwise convolution layer (3 × 3 DwConv) and a compression convolution layer (1 × 1 conv); and lightweight layer 3 comprises, in order, an amplification convolution layer (1 × 1 conv), a depthwise convolution layer (3 × 3 DwConv) and a compression convolution layer (1 × 1 conv).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110171542.5A CN112836657B (en) | 2021-02-08 | 2021-02-08 | Pedestrian detection method and system based on lightweight YOLOv3 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110171542.5A CN112836657B (en) | 2021-02-08 | 2021-02-08 | Pedestrian detection method and system based on lightweight YOLOv3 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112836657A (en) | 2021-05-25 |
CN112836657B (en) | 2023-04-18 |
Family
ID=75930942
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110171542.5A Active CN112836657B (en) | 2021-02-08 | 2021-02-08 | Pedestrian detection method and system based on lightweight YOLOv3 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112836657B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113408423A (en) * | 2021-06-21 | 2021-09-17 | 西安工业大学 | Aquatic product target real-time detection method suitable for TX2 embedded platform |
CN113723322A (en) * | 2021-09-02 | 2021-11-30 | 南京理工大学 | Pedestrian detection method and system based on single-stage anchor-free frame |
CN114187606A (en) * | 2021-10-21 | 2022-03-15 | 江阴市智行工控科技有限公司 | Garage pedestrian detection method and system adopting branch fusion network for light weight |
CN115690545A (en) * | 2021-12-03 | 2023-02-03 | 北京百度网讯科技有限公司 | Training target tracking model and target tracking method and device |
CN117392613A (en) * | 2023-12-07 | 2024-01-12 | 武汉纺织大学 | Power operation safety monitoring method based on lightweight network |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019223254A1 (en) * | 2018-05-21 | 2019-11-28 | 北京亮亮视野科技有限公司 | Construction method for multi-scale lightweight face detection model and face detection method based on model |
CN110378210A (en) * | 2019-06-11 | 2019-10-25 | 江苏大学 | A kind of vehicle and car plate detection based on lightweight YOLOv3 and long short focus merge distance measuring method |
CN110321874A (en) * | 2019-07-12 | 2019-10-11 | 南京航空航天大学 | A kind of light-weighted convolutional neural networks pedestrian recognition method |
AU2019101142A4 (en) * | 2019-09-30 | 2019-10-31 | Dong, Qirui MR | A pedestrian detection method with lightweight backbone based on yolov3 network |
CN111340141A (en) * | 2020-04-20 | 2020-06-26 | 天津职业技术师范大学(中国职业培训指导教师进修中心) | Crop seedling and weed detection method and system based on deep learning |
CN111967468A (en) * | 2020-08-10 | 2020-11-20 | 东南大学 | FPGA-based lightweight target detection neural network implementation method |
CN112183578A (en) * | 2020-09-01 | 2021-01-05 | 国网宁夏电力有限公司检修公司 | Target detection method, medium and system |
AU2020103494A4 (en) * | 2020-11-17 | 2021-01-28 | China University Of Mining And Technology | Handheld call detection method based on lightweight target detection network |
Non-Patent Citations (4)
Title |
---|
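Qi-Chao Mao et al.: "Mini-YOLOv3: Real-Time Object Detector for Embedded Applications", 《IEEE》 * |
平嘉蓉 et al.: "Design of a crowd counting model based on a lightweight neural network", 《Radio Engineering》 * |
武星 et al.: "Apple detection method based on a lightweight YOLOv3 convolutional neural network", 《Transactions of the Chinese Society for Agricultural Machinery》 * |
黄同愿 et al.: "Application of an improved YOLOv3-based model in pedestrian detection", 《Journal of Chongqing University of Technology (Natural Science)》 * |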
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113408423A (en) * | 2021-06-21 | 2021-09-17 | 西安工业大学 | Aquatic product target real-time detection method suitable for TX2 embedded platform |
CN113408423B (en) * | 2021-06-21 | 2023-09-05 | 西安工业大学 | Aquatic product target real-time detection method suitable for TX2 embedded platform |
CN113723322A (en) * | 2021-09-02 | 2021-11-30 | 南京理工大学 | Pedestrian detection method and system based on single-stage anchor-free frame |
CN114187606A (en) * | 2021-10-21 | 2022-03-15 | 江阴市智行工控科技有限公司 | Garage pedestrian detection method and system adopting branch fusion network for light weight |
CN114187606B (en) * | 2021-10-21 | 2023-07-25 | 江阴市智行工控科技有限公司 | Garage pedestrian detection method and system adopting branch fusion network for light weight |
CN115690545A (en) * | 2021-12-03 | 2023-02-03 | 北京百度网讯科技有限公司 | Training target tracking model and target tracking method and device |
CN115690545B (en) * | 2021-12-03 | 2024-06-11 | 北京百度网讯科技有限公司 | Method and device for training target tracking model and target tracking |
CN117392613A (en) * | 2023-12-07 | 2024-01-12 | 武汉纺织大学 | Power operation safety monitoring method based on lightweight network |
CN117392613B (en) * | 2023-12-07 | 2024-03-08 | 武汉纺织大学 | Power operation safety monitoring method based on lightweight network |
Also Published As
Publication number | Publication date |
---|---|
CN112836657B (en) | 2023-04-18 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN112836657B (en) | Pedestrian detection method and system based on lightweight YOLOv3 | |
CN108062349B (en) | Video monitoring method and system based on video structured data and deep learning | |
CN107563372B (en) | License plate positioning method based on the deep learning SSD framework | |
CN109977782B (en) | Cross-store operation behavior detection method based on target position information reasoning | |
CN111461209B (en) | Model training device and method | |
CN111553355B (en) | Monitoring video-based method for detecting and notifying store outgoing business and managing store owner | |
CN109214280A (en) | Shop recognition methods, device, electronic equipment and storage medium based on streetscape | |
CN110781806A (en) | Pedestrian detection tracking method based on YOLO | |
CN113378675A (en) | Face recognition method for simultaneous detection and feature extraction | |
CN111753610A (en) | Weather identification method and device | |
CN113378668A (en) | Method, device and equipment for determining accumulated water category and storage medium | |
CN113989744A (en) | Pedestrian target detection method and system based on oversized high-resolution image | |
CN115294519A (en) | Abnormal event detection and early warning method based on lightweight network | |
CN115760921A (en) | Pedestrian trajectory prediction method and system based on multi-target tracking | |
CN115953744A (en) | Vehicle identification tracking method based on deep learning | |
CN111897993A (en) | Efficient target person track generation method based on pedestrian re-recognition | |
CN117475355A (en) | Security early warning method and device based on monitoring video, equipment and storage medium | |
Kamenetsky et al. | Aerial car detection and urban understanding | |
CN115131826B (en) | Article detection and identification method, and network model training method and device | |
CN110765900A (en) | DSSD-based automatic illegal building detection method and system | |
CN115546667A (en) | Real-time lane line detection method for unmanned aerial vehicle scene | |
CN114639084A (en) | Road side end vehicle sensing method based on SSD (Single Shot MultiBox Detector) improved algorithm | |
Singhal et al. | A Comparative Analysis of Deep Learning based Vehicle Detection Approaches. | |
Zhang et al. | Accurate Detection and Tracking of Small‐Scale Vehicles in High‐Altitude Unmanned Aerial Vehicle Bird‐View Imagery | |
CN110942008A (en) | Method and system for positioning waybill information based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||