CN109670450B - Video-based man-vehicle object detection method - Google Patents

Video-based man-vehicle object detection method

Info

Publication number
CN109670450B
CN109670450B (application CN201811565548.5A)
Authority
CN
China
Prior art keywords
video
detection method
object detection
conv4
vehicle object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811565548.5A
Other languages
Chinese (zh)
Other versions
CN109670450A (en)
Inventor
王景彬
王思俊
刘琰
杜晓琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Tiandy Information Systems Integration Co ltd
Original Assignee
Tianjin Tiandy Information Systems Integration Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Tiandy Information Systems Integration Co ltd filed Critical Tianjin Tiandy Information Systems Integration Co ltd
Priority to CN201811565548.5A priority Critical patent/CN109670450B/en
Publication of CN109670450A publication Critical patent/CN109670450A/en
Application granted granted Critical
Publication of CN109670450B publication Critical patent/CN109670450B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/584Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of vehicle lights or traffic lights
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2431Multiple classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/625License plates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08Detecting or categorising vehicles
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention provides a video-based person and vehicle object detection method, which comprises the following steps: A. collecting data; B. labeling the data of step A; C. generating a training set and a test set from the labeled data; D. constructing a convolutional neural network; E. performing model training on the convolutional neural network of step D; F. detecting. The beneficial effects of the invention are: the detection rate is high, and good snapshot results are achieved even for vehicles that are difficult to detect and identify by license plate; pedestrians and non-motor vehicles are identified and monitored with relatively high accuracy, various kinds of monitoring evidence collection are better supported, and a guarantee is provided for a harmonious society, safe traffic and intelligent travel.

Description

Video-based man-vehicle object detection method
Technical Field
The invention belongs to the technical field of traffic monitoring, and particularly relates to a video-based person and vehicle object detection method.
Background
In the traffic field, vehicles, pedestrians and non-motor vehicles must be detected and separated so that each can be monitored individually and illegal events can be recorded and trigger early warnings; vehicle detection is therefore at the core of the traffic monitoring field. License-plate-based vehicle detection is relatively mature and can reach 99% accuracy, but unlicensed vehicles and some engineering vehicles cannot be effectively located through license plate recognition, which complicates later evidence collection. Pedestrian and non-motor-vehicle detection is still being explored and optimized, because these targets are relatively small and their pose features are far more complex than those of vehicles. Yet pedestrians and non-motor vehicles are also principal participants in traffic, so detecting them effectively is indispensable and plays an important role in the progress of the traffic field.
Disclosure of Invention
In view of the above, the present invention is directed to a video-based person and vehicle object detection method, so as to overcome the above-mentioned drawbacks.
In order to achieve the above purpose, the technical scheme of the invention is realized as follows:
a person and vehicle object detection method based on video comprises the following steps:
A. collecting data;
B. labeling the data in the step A;
C. generating a training set and a testing set for the marked data;
D. constructing a convolutional neural network;
E. performing model training on the convolutional neural network in the step D;
F. detecting.
Further, in the step a, pictures of various traffic targets in different time periods of various traffic scenes are collected.
Further, in the step B, the circumscribed rectangle of the traffic target is used as the labeling boundary.
Further, in the step C, the training set and the test set are randomly generated in a 4:1 ratio by number of pictures.
Further, the process of constructing the convolutional neural network in the step D is as follows:
D1. using a VGG network, replacing, in its convolutional layers, each filter of size 5*5 or larger with a plurality of stacked filters of size 3*3 or smaller;
D2. removing, in the VGG network, the pooling layer connected after each of Conv1_2, Conv2_2, Conv3_2, Conv4_2 and Conv5_2, and adding four groups of convolution modules Conv5_x, Conv6_x, Conv7_x and Conv8_x, wherein ConvY_1 in each module group has half as many channels as ConvY_2.
Further, the training process in the step E is as follows:
E1. performing data enhancement on the training set and the test set generated in step C by changing brightness and saturation, rotating, mirroring and cropping images;
E2. performing position regression and classification probability calculation on the feature maps obtained from Conv4_2, Conv5_2, Conv6_2, Conv7_2 and Conv8_2, respectively.
Further, the detection process in the step F is as follows:
F1. image color transformation: converting the image color format from YUV to BGR according to the following formulas,
B=Y+1.779*(U-128)
G=Y-0.3455*(U-128)-0.7169*(V-128)
R=Y+1.4075*(V-128);
F2. sending the BGR image into the trained model for detection; the model outputs the detected target type, target position (x, y, w, h) and target confidence;
F3. filtering the result: false detections are removed by thresholding the target confidence; the most accurate target type information is obtained through an Intersection-over-Union (IoU) criterion; and the most accurate target position information (x, y, w, h) is obtained through non-maximum suppression.
Compared with the prior art, the video-based person and vehicle object detection method of the invention has the following advantages:
the detection rate is high, and good snapshot results are achieved even for vehicles that are difficult to detect and identify by license plate; pedestrians and non-motor vehicles are identified and monitored with relatively high accuracy, various kinds of monitoring evidence collection are better supported, and a guarantee is provided for a harmonious society, safe traffic and intelligent travel.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention. In the drawings:
FIG. 1 is a training flow chart according to an embodiment of the present invention;
fig. 2 is a network configuration diagram of an embodiment of the present invention.
Detailed Description
It should be noted that, without conflict, the embodiments of the present invention and features of the embodiments may be combined with each other.
The invention will be described in detail below with reference to the drawings in connection with embodiments.
As shown in fig. 1, a video-based person and vehicle object detection method includes the following steps:
A. collecting data;
B. labeling the data in the step A;
C. generating a training set and a testing set for the marked data;
D. constructing a convolutional neural network;
E. performing model training on the convolutional neural network in the step D;
F. detecting.
In the step A, pictures of various traffic targets are collected in different time periods (including daytime, nighttime, front-lit and back-lit conditions) of various traffic scenes, wherein the traffic targets include Bicycle, Pedestrian, Car, Truck, Bus, Tricycle and Engineering Truck.
In the step B, the circumscribed rectangle of the traffic target is used as the labeling boundary, and the labels are Bicycle (non-motor bicycle), Pedestrian, Car, Truck, Bus, Tricycle and Engineering_Truck.
In the step C, the training set and the test set are randomly generated in a 4:1 ratio by number of pictures; the training set is used to train the network weights, and the test set is used to evaluate the accuracy of the trained model and to guard against overfitting.
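As a sketch, the random 4:1 split described above can be implemented in a few lines of Python (function and variable names are illustrative, not taken from the patent):

```python
import random

def split_dataset(image_ids, train_ratio=0.8, seed=42):
    """Randomly split labeled image ids into a training set and a test set
    at the 4:1 (80/20) ratio described in step C."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split reproducible
    cut = int(len(ids) * train_ratio)
    return ids[:cut], ids[cut:]

train_ids, test_ids = split_dataset(range(100))
# 80 ids for training, 20 for testing
```

Fixing the seed makes the split reproducible across training runs while remaining random with respect to scene and time of day.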
The process of constructing the convolutional neural network in the step D is as follows:
D1. in the VGG network, each large filter in the convolutional layers is replaced with several small filters, where small means 3*3 and below and large generally means 5*5 and above; this leaves the receptive field over the input data unchanged, and an improved convolutional network is designed specifically for training on vehicles and people: two stacked 3*3 convolutions have the same receptive field as one 5*5 convolution, and three stacked 3*3 convolutions have the same receptive field as one 7*7 convolution;
D2. the pooling layer connected after each of Conv1_2, Conv2_2, Conv3_2, Conv4_2 and Conv5_2 is removed from the VGG network, reducing the information loss caused by down-sampling, and four groups of convolution modules Conv5_x, Conv6_x, Conv7_x and Conv8_x are added, wherein ConvY_1 in each module group has half as many channels as ConvY_2; by deepening the convolutional network and varying the channel configuration, denser target features can be obtained. The specific network structure is shown in fig. 2;
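The receptive-field equivalence cited in step D1 (two stacked 3*3 convolutions match one 5*5, three match one 7*7) can be verified with a small helper; this is an illustrative calculation, not code from the patent:

```python
def receptive_field(kernel_sizes, strides=None):
    """Receptive field of a stack of convolutions: each layer grows the
    field by (k - 1) * jump, where jump is the product of earlier strides."""
    if strides is None:
        strides = [1] * len(kernel_sizes)  # VGG-style stacks use stride 1
    rf, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump
        jump *= s
    return rf

assert receptive_field([3, 3]) == receptive_field([5]) == 5       # two 3x3 == one 5x5
assert receptive_field([3, 3, 3]) == receptive_field([7]) == 7    # three 3x3 == one 7x7
```

The stacked small filters cover the same input region with fewer parameters and more non-linearities, which is the motivation for the replacement in step D1.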
the training process in the step E is as follows:
E1. because deep learning requires a huge number of samples, the existing samples (the training set and the test set generated in step C) are augmented by changing brightness and saturation, rotating, mirroring, cropping images and similar methods;
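The augmentation operations of step E1 can be sketched in plain Python on a small grayscale image represented as a 2-D list (illustrative helpers; a real pipeline would use an image library such as OpenCV):

```python
def adjust_brightness(img, factor):
    """Scale every pixel by factor and clip to the 8-bit range [0, 255]."""
    return [[min(255, max(0, int(p * factor))) for p in row] for row in img]

def mirror(img):
    """Horizontal flip (mirroring)."""
    return [row[::-1] for row in img]

def rotate180(img):
    """180-degree rotation: vertical flip followed by horizontal flip."""
    return [row[::-1] for row in img[::-1]]

def crop(img, top, left, h, w):
    """Axis-aligned image clipping."""
    return [row[left:left + w] for row in img[top:top + h]]
```

Each transform produces an additional training sample from an existing labeled picture, multiplying the effective size of the data set.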
E2. position regression and classification probability calculation are performed on the feature maps obtained from Conv4_2, Conv5_2, Conv6_2, Conv7_2 and Conv8_2, respectively, so that the model has multi-scale features. In this way even a small target has distinct position features on the larger feature maps, and large and small targets can both be detected well.
the Loss function Loss used is a weighting of the position error mbox_loc and the confidence error mbox_conf:
Loss(x,c,l,g)=1/N(mbox_conf(x,c)+α*mbox_loc(x,l,g))
wherein the weight proportion α takes values in the range 0-1, N is the number of positive samples matched to the Ground Truth (real target boxes), c is the predicted category confidence, l is the predicted position of the prediction box, and g is the position parameter of the Ground Truth;
the position error mbox_loc is calculated using the SmoothL1 loss:
mbox_loc(x,l,g) = Σ_{i∈Pos} Σ_{m∈{cx,cy,w,h}} x_ij^k * smoothL1(l_i^m - g_j^m)
smoothL1(z) = 0.5*z^2 if |z| < 1, otherwise |z| - 0.5
wherein k represents the category to which the Ground Truth belongs, and x_ij^k indicates whether the i-th prediction box and the j-th real box match with respect to category k (1 if matched, 0 otherwise);
the confidence error mbox_conf is obtained by the softmax loss:
mbox_conf(x,c) = -Σ_{i∈Pos} x_ij^p * log(ĉ_i^p) - Σ_{i∈Neg} log(ĉ_i^0), with ĉ_i^p = exp(c_i^p) / Σ_p exp(c_i^p)
wherein p represents the predicted category and x_ij^p indicates whether the i-th prediction box and the j-th real box match with respect to category p.
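Assuming the losses follow the standard SSD-style SmoothL1 and softmax formulations, the individual terms can be written in plain Python on scalars (illustrative only; a real implementation operates on whole tensors of prediction boxes):

```python
import math

def smooth_l1(z):
    """SmoothL1: quadratic inside |z| < 1, linear outside."""
    return 0.5 * z * z if abs(z) < 1 else abs(z) - 0.5

def softmax(scores):
    """Class scores -> probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def detection_loss(loc_errors, class_scores, true_class, alpha=1.0, n_pos=1):
    """Loss = (mbox_conf + alpha * mbox_loc) / N for a single matched box."""
    mbox_loc = sum(smooth_l1(e) for e in loc_errors)          # position error
    mbox_conf = -math.log(softmax(class_scores)[true_class])  # confidence error
    return (mbox_conf + alpha * mbox_loc) / n_pos
```

The SmoothL1 term keeps gradients bounded for large localization errors, while the softmax term penalizes low confidence on the matched category.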
The detection process in the step F is as follows:
F1. image color transformation: since the image format acquired by the monitoring camera is YUV while the network input is BGR, the image color format is converted from YUV to BGR according to the following formulas,
B=Y+1.779*(U-128)
G=Y-0.3455*(U-128)-0.7169*(V-128)
R=Y+1.4075*(V-128)
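The per-pixel conversion of step F1 translates directly into code (a sketch with clipping to the 8-bit range; a real implementation would vectorize over the whole frame):

```python
def yuv_to_bgr(y, u, v):
    """Convert one pixel from YUV to BGR with the coefficients of step F1."""
    def clip(x):
        return max(0, min(255, int(round(x))))
    b = y + 1.779 * (u - 128)
    g = y - 0.3455 * (u - 128) - 0.7169 * (v - 128)
    r = y + 1.4075 * (v - 128)
    return clip(b), clip(g), clip(r)

# A neutral gray pixel (U = V = 128) maps to equal B, G and R values.
```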
F2. sending the BGR image into the trained model for detection; the model outputs the detected target type, target position (x, y, w, h) and target confidence;
F3. filtering the result: false detections are removed by thresholding the target confidence; the most accurate target type information is obtained through an Intersection-over-Union (IoU) criterion; and the most accurate target position information (x, y, w, h) is obtained through Non-Maximum Suppression (NMS). In this embodiment, a detection with confidence greater than or equal to 0.8 is taken as a correct target and targets below 0.8 are filtered out; detections with IoU greater than or equal to 0.4 are considered the same target; and the NMS threshold is set to 0.4.
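The confidence filtering, IoU matching and NMS of step F3, with the thresholds given in this embodiment (confidence 0.8, IoU/NMS 0.4), can be sketched as follows; helper names are illustrative, and boxes are (x, y, w, h) tuples as in the text:

```python
def iou(a, b):
    """Intersection-over-Union of two (x, y, w, h) boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def filter_detections(dets, conf_thresh=0.8, nms_thresh=0.4):
    """dets: list of ((x, y, w, h), confidence).
    1) drop detections below conf_thresh;
    2) greedy non-maximum suppression at nms_thresh."""
    dets = sorted((d for d in dets if d[1] >= conf_thresh),
                  key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in dets:
        if all(iou(box, k[0]) < nms_thresh for k in kept):
            kept.append((box, score))
    return kept
```

Greedy NMS keeps the highest-confidence box in each overlapping cluster, which matches the "considered the same target" rule for IoU ≥ 0.4.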
The foregoing description covers only the preferred embodiments of the invention and is not intended to limit it; any modification, equivalent substitution, improvement and the like made within the spirit and principle of the invention shall be included in the scope of protection of the invention.

Claims (4)

1. A video-based person-vehicle object detection method, characterized by comprising the following steps:
A. collecting data;
B. labeling the data in the step A;
C. generating a training set and a testing set for the marked data;
D. constructing a convolutional neural network;
E. performing model training on the convolutional neural network in the step D;
F. detecting;
the convolutional neural network of the step D comprises the following steps:
Conv1_1, Conv1_2, Conv2_1, Conv2_2, Conv3_1, Conv3_2, Conv4_1, Conv4_2, Conv4_3, Conv5_1, Conv5_2, Conv6_1, Conv6_2, Conv7_1, Conv7_2, Conv8_1 and Conv8_2 convolutional layers;
the training process in the step E is as follows:
E1. performing data enhancement on the training set and the test set generated in step C by changing brightness and saturation, rotating, mirroring and cropping images;
E2. performing position regression and classification probability calculation on the feature maps obtained from Conv4_2, Conv5_2, Conv6_2, Conv7_2 and Conv8_2, respectively;
wherein the results computed on the feature maps obtained from Conv4_2, Conv5_2, Conv6_2, Conv7_2 and Conv8_2 are finally fused to obtain the final output result information.
2. The video-based person-vehicle object detection method according to claim 1, wherein in the step A, pictures containing various traffic targets are collected in different time periods of various traffic scenes.
3. The video-based person-vehicle object detection method according to claim 1, wherein in the step B, the circumscribed rectangle of the traffic target is used as the labeling boundary.
4. The video-based person-vehicle object detection method according to claim 1, wherein in the step C, the training set and the test set are randomly generated in a 4:1 ratio by number of pictures.
CN201811565548.5A 2018-12-20 2018-12-20 Video-based man-vehicle object detection method Active CN109670450B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811565548.5A CN109670450B (en) 2018-12-20 2018-12-20 Video-based man-vehicle object detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811565548.5A CN109670450B (en) 2018-12-20 2018-12-20 Video-based man-vehicle object detection method

Publications (2)

Publication Number Publication Date
CN109670450A CN109670450A (en) 2019-04-23
CN109670450B true CN109670450B (en) 2023-07-25

Family

ID=66144134

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811565548.5A Active CN109670450B (en) 2018-12-20 2018-12-20 Video-based man-vehicle object detection method

Country Status (1)

Country Link
CN (1) CN109670450B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110111577B (en) * 2019-05-15 2020-11-27 武汉纵横智慧城市股份有限公司 Non-motor vehicle identification method, device, equipment and storage medium based on big data
CN110399800B (en) * 2019-06-28 2021-05-07 智慧眼科技股份有限公司 License plate detection method and system based on deep learning VGG16 framework and storage medium
CN111461130B (en) * 2020-04-10 2021-02-09 视研智能科技(广州)有限公司 High-precision image semantic segmentation algorithm model and segmentation method
CN113239962A (en) * 2021-04-12 2021-08-10 南京速度软件技术有限公司 Traffic participant identification method based on single fixed camera
CN114973154A (en) * 2022-07-29 2022-08-30 成都宜泊信息科技有限公司 Parking lot identification method, parking lot identification system, parking lot control method, parking lot control system, parking lot equipment and parking lot control medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104537387A (en) * 2014-12-16 2015-04-22 广州中国科学院先进技术研究所 Method and system for classifying automobile types based on neural network
CN107358182A (en) * 2017-06-29 2017-11-17 维拓智能科技(深圳)有限公司 Pedestrian detection method and terminal device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10068171B2 (en) * 2015-11-12 2018-09-04 Conduent Business Services, Llc Multi-layer fusion in a convolutional neural network for image classification
CN107316007B (en) * 2017-06-07 2020-04-03 浙江捷尚视觉科技股份有限公司 Monitoring image multi-class object detection and identification method based on deep learning
CN107944442B (en) * 2017-11-09 2019-08-13 北京智芯原动科技有限公司 Based on the object test equipment and method for improving convolutional neural networks
CN108090457A (en) * 2017-12-27 2018-05-29 天津天地人和企业管理咨询有限公司 A kind of motor vehicle based on video does not give precedence to pedestrian detection method
CN108564555B (en) * 2018-05-11 2021-09-21 中北大学 NSST and CNN-based digital image noise reduction method

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104537387A (en) * 2014-12-16 2015-04-22 广州中国科学院先进技术研究所 Method and system for classifying automobile types based on neural network
CN107358182A (en) * 2017-06-29 2017-11-17 维拓智能科技(深圳)有限公司 Pedestrian detection method and terminal device

Also Published As

Publication number Publication date
CN109670450A (en) 2019-04-23

Similar Documents

Publication Publication Date Title
CN109670450B (en) Video-based man-vehicle object detection method
CN111368687B (en) Sidewalk vehicle illegal parking detection method based on target detection and semantic segmentation
Chen et al. Vehicle detection, tracking and classification in urban traffic
CN111104903B (en) Depth perception traffic scene multi-target detection method and system
CN104134079B (en) A kind of licence plate recognition method based on extremal region and extreme learning machine
CN104463241A (en) Vehicle type recognition method in intelligent transportation monitoring system
CN111800507A (en) Traffic monitoring method and traffic monitoring system
CN110222596B (en) Driver behavior analysis anti-cheating method based on vision
CN103093249A (en) Taxi identifying method and system based on high-definition video
CN115311241B (en) Underground coal mine pedestrian detection method based on image fusion and feature enhancement
CN111881739B (en) Automobile tail lamp state identification method
Razalli et al. Emergency vehicle recognition and classification method using HSV color segmentation
CN111274886A (en) Deep learning-based pedestrian red light violation analysis method and system
CN114329074B (en) Traffic energy efficiency detection method and system for ramp road section
CN113033275A (en) Vehicle lane-changing non-turn signal lamp analysis system based on deep learning
CN115424217A (en) AI vision-based intelligent vehicle identification method and device and electronic equipment
Zhang et al. Vehicle detection in UAV aerial images based on improved YOLOv3
CN114359196A (en) Fog detection method and system
CN104778454A (en) Night vehicle tail lamp extraction method based on descending luminance verification
Zhang et al. A front vehicle detection algorithm for intelligent vehicle based on improved gabor filter and SVM
CN105574490A (en) Vehicle brand identification method and system based on headlight image characteristics
CN115019263A (en) Traffic supervision model establishing method, traffic supervision system and traffic supervision method
CN114882469A (en) Traffic sign detection method and system based on DL-SSD model
CN114372556A (en) Driving danger scene identification method based on lightweight multi-modal neural network
Visshwak et al. On-the-fly traffic sign image labeling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant