CN115601717B - Deep learning-based traffic offence behavior classification detection method and SoC chip


Info

Publication number
CN115601717B
CN115601717B (application CN202211280838.1A)
Authority
CN
China
Prior art keywords
feature map
traffic
feature
module
model
Prior art date
Legal status
Active
Application number
CN202211280838.1A
Other languages
Chinese (zh)
Other versions
CN115601717A (en)
Inventor
王嘉诚
张少仲
张栩
Current Assignee
Zhongcheng Hualong Computer Technology Co Ltd
Original Assignee
Zhongcheng Hualong Computer Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhongcheng Hualong Computer Technology Co Ltd
Priority to CN202211280838.1A
Publication of CN115601717A
Application granted
Publication of CN115601717B
Status: Active

Classifications

    • G06V20/54 Surveillance or monitoring of activities of traffic, e.g. cars on the road, trains or boats
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06V10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V10/764 Recognition or understanding using classification, e.g. of video objects
    • G06V10/806 Fusion of extracted features at the sensor, preprocessing, feature extraction or classification level
    • G06V10/82 Recognition or understanding using neural networks
    • G06V10/955 Image or video understanding using specific electronic processors
    • G06V20/40 Scenes; scene-specific elements in video content
    • G06V20/625 License plates
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V2201/07 Target detection
    • G06V2201/08 Detecting or categorising vehicles
    • Y02T10/40 Engine management systems (climate change mitigation in road transport)

Abstract

The application discloses a deep learning-based traffic offence classification detection method and an SoC chip, belonging to the technical field of computer vision. A general-purpose processor and a plurality of neural network processors of different types are integrated on the SoC chip. Real-time video is collected, and the road surface, traffic indication signs, signal lights and traffic participants in the video are marked by a classification neural network model; the video frames are then classified according to the type of traffic participant and fed into different traffic violation detection models for violation detection and identification of the offending object. Compared with classifying, labeling and identifying all types of traffic participants with a single neural network model, training a different algorithm model for each type of road traffic participant, with a neural network model suited to the task of each stage, effectively reduces the algorithm complexity of the traffic violation detection models, improves the overall detection efficiency, and fully meets the real-time requirements of road traffic violation detection.

Description

Deep learning-based traffic offence behavior classification detection method and SoC chip
Technical Field
The application belongs to the technical field of computer vision, and particularly relates to a traffic offence behavior classification detection method based on deep learning and an SoC chip.
Background
In recent years, with rising living standards, car ownership in China has grown steadily and more and more vehicles travel on the roads. In complex road traffic environments, vehicle violations occur frequently and seriously threaten road safety and the personal and property safety of road users. For motor vehicle traffic violations, fixed electronic monitoring snapshot systems and mobile snapshot road traffic monitoring devices already exist, but the captured images are screened manually, which achieves poor efficiency and increases labor costs. With the development of computer multimedia and image processing technology in recent years, video-based violation judgment plays an ever larger role in intelligent transportation, and research investment from all quarters keeps growing. As road monitoring receives increasing attention in China, video detection has become the most important information acquisition means in the intelligent transportation field, and comprehensive evaluation shows that applying it to road traffic violation detection is highly feasible.
In order to solve the problems of low efficiency and high labor cost in the manual screening of vehicle traffic violations, Chinese patent application CN113177443A discloses a method for intelligently identifying road traffic violations based on image vision, comprising: modeling and pre-training; lane line detection and classification and lane number determination; detection of pedestrians, zebra crossings, traffic lights and bus lane markings; vehicle detection and tracking; and traffic violation judgment. Using a monocular plane camera and edge computing, a deep learning method performs target detection and classification on a series of captured frame images, and logic rules then judge whether a detected vehicle has committed a traffic violation.
Meanwhile, pedestrian traffic violations are also serious. Compared with vehicle traffic violations, pedestrian violations clearly lack supervision measures; pedestrians walk with great randomness, change direction frequently, and easily get into danger, so supervision of pedestrians participating in road traffic is also urgent. Traffic violation detection cannot be limited to vehicles and should also cover pedestrians.
Chinese patent application CN112528759A discloses a computer-vision-based traffic offence detection method that, in addition to vehicle violations, detects the violations of other road traffic participants. The scheme is as follows: computer vision technology performs real-time target recognition of pedestrians, different types of motor vehicles, traffic lights, traffic lanes, zebra crossings, license plates, car logos and other objects in the traffic scene, while detecting and collecting statistics in real time on information such as vehicle speed and traffic flow, assisting traffic supervisors in monitoring behaviors that violate road traffic regulations, such as pedestrians running red lights, motor vehicles running red lights, pedestrians entering roads closed to pedestrian traffic, and motor vehicles speeding. However, this scheme detects the violations of all traffic participants in the scene with a single deep learning algorithm model, and suffers from high algorithm complexity, insufficient detection precision and low detection efficiency.
Disclosure of Invention
The application provides a deep learning-based traffic violation classification detection method and an SoC chip, aiming to solve the problems of high algorithm complexity, insufficient detection precision and low detection efficiency.
To solve these technical problems, the method first classifies the pictures acquired in the traffic scene according to the different traffic participants, and then inputs the classified pictures into the corresponding neural network models to detect and judge traffic violations. The specific scheme is as follows:
The deep learning-based traffic violation classification detection method comprises the following steps:
S1: acquire traffic scene video through imaging equipment and decompose it into a number of consecutive video frames.
S2: input the video frames into a trained classification neural network model and mark the road surface, traffic indication signs, signal lights and traffic participants in the video frames. The classification neural network model adopts an improved YOLOv5s neural network model, which reduces the number of neural network layers and the number of channels per layer, and improves the spatial pyramid pooling structure so that the feature map passes through three maximum pooling layers in sequence.
S3: the classification neural network model classifies the video frames according to the traffic participants, labeling video frames containing vehicles as vehicle feature maps and video frames containing pedestrians as pedestrian feature maps.
S4: input the vehicle feature map into a trained vehicle traffic violation detection model and judge vehicle traffic violations; if a traffic violation is found, perform license plate recognition and generate a vehicle traffic violation image for subsequent processing. The vehicle traffic violation detection model comprises a vehicle violation judgment algorithm and a license plate recognition model: the violation judgment algorithm judges whether the vehicle is in violation according to the vehicle's driving trajectory, the road surface markings, the traffic indication signs and the signal light state, and the license plate recognition model adopts an SVM model that recognizes the license plate through a license plate positioning module, a license plate character segmentation module and a license plate character recognition module.
S5: input the pedestrian feature map into a trained pedestrian traffic violation detection model and judge pedestrian traffic violations; if a traffic violation is found, perform face recognition and generate a pedestrian traffic violation image for subsequent processing. The pedestrian traffic violation detection model comprises a violation judgment algorithm and a face recognition model: the violation judgment algorithm judges whether the pedestrian is in violation according to the pedestrian's movement trajectory, the road surface markings, the traffic indication signs and the signal light state, and the face recognition model adopts a simplified RetinaFace model, which further comprises: a trunk feature extraction network, an FPN feature pyramid network, an SSH feature enhancement network and a face target output end.
Preferably, the improved YOLOv5s neural network model comprises:
an input end, which uses the Mosaic method, an adaptive anchor frame calculation method and an adaptive picture scaling method to output a three-channel RGB feature map A with a resolution of 640×640;
a feature extraction network, which receives the feature map A and extracts features from it through convolutional networks for subsequent target detection; the feature map A passes in sequence through a Focus module, a CBL module, a CSP1 module, a CBL module and a CSP1 module of the feature extraction network and outputs a feature map B; the feature map B passes in sequence through a CBL module and a CSP1 module and outputs a feature map C; the feature map C passes in sequence through a CBL module, an SPPF module, a CSP2 module and a CBL module and outputs a feature map D;
a feature fusion network, which mixes and combines the feature maps of different stages and transmits the fused feature maps to the output end for prediction; in the feature fusion network, the feature maps B, C and D are each fused with feature maps from different stages of the network to generate feature maps F, G and H of different sizes for target detection at the output end;
an output end, which applies a 3×3 convolution to each of the feature maps F, G and H and outputs three target detection results of different sizes through a classification loss function and a regression loss function, wherein the classification loss function adopts a cross entropy loss function and the regression loss function adopts CIOU_Loss.
Preferably, the training of the improved YOLOv5s neural network model specifically comprises the following steps:
S2-1: establish a traffic scene data set and divide it into a training set and a verification set according to a set proportion.
S2-2: input the training set in sequence into the input end, the feature extraction network, the feature fusion network and the output end of the improved YOLOv5s neural network, predict the positions and classifications of the road surface, traffic indication signs, signal lights and traffic participants in the training set, and output the prediction results, which are compared against the verification set for verification.
S2-3: repeat step S2-2 until the set number of training iterations is reached, and save the last trained model as the trained classification neural network model.
Preferably, the Mosaic data enhancement operation at the input end splices four pictures by random scaling, random cropping and random arrangement; the adaptive anchor frame calculation adaptively computes the optimal anchor frame values for different training sets, computes the best possible recall for the default anchor frames and, if the best possible recall is below 0.98, recalculates the anchor frames; the adaptive picture scaling method calculates the width and height scaling ratios from the input image size and the output feature map size, calculates the actual size after scaling from these ratios, and calculates the gray padding values from the actual size so as to align the output feature map A to 640×640.
Preferably, the feature extraction network further comprises three groups of convolutions, and a CSP module is added to each group; the CSP module splits the feature map into two parts, one part undergoes a convolution operation, and its result is then fused with the other part in a feature fusion operation that increases the number of channels; the third group of convolutions adopts the improved spatial pyramid pooling structure.
Preferably, in the improved spatial pyramid pooling structure, the input feature map is split into two paths after passing through a CBL module once: one path passes through three 5×5 maximum pooling layers in series, each maximum pooling layer outputs a feature map for a tensor splicing operation with the other path, and the spliced feature map is output to the CSP2 module after passing through a CBL module.
Preferably, the feature fusion network adopts a feature pyramid network and a path aggregation network to aggregate the image features of this stage, and the feature maps are fused as follows:
the feature map D undergoes an upsampling operation and a tensor splicing operation with the feature map C, and the spliced feature map passes in sequence through a CSP2 module and a CBL module to output a feature map E; the feature map E is upsampled again and spliced with the feature map B, and the spliced feature map passes through a CSP2 module to output a feature map F; the feature map F passes through a CBL module and is spliced with the feature map E, and the spliced feature map passes through a CSP2 module to output a feature map G; the feature map G passes through a CBL module and is spliced with the feature map D, and the spliced feature map passes through a CSP2 module to output a feature map H; the feature maps F, G and H serve as the final output of the feature fusion network and are input to the output end of the improved YOLOv5s neural network model.
Preferably, the CIOU_Loss algorithm measures the degree of coincidence between the real frame and the predicted frame. It sets the smallest rectangle that can enclose both the real frame and the predicted frame in order to evaluate the distance between the two frames, the diagonal length of this rectangle being c; it introduces the distance d between the center points of the real frame and the predicted frame to evaluate the case where one frame encloses the other, and introduces the aspect ratio term v of the real frame and the predicted frame to evaluate whether the center points of the two frames coincide. The formula of the CIOU_Loss algorithm is as follows:

CIOU_Loss = 1 - CIoU = 1 - (IoU - d²/c² - αv)

v = (4/π²) · (arctan(w_gt/h_gt) - arctan(w/h))²

α = v / ((1 - IoU) + v)

wherein CIoU is the degree of coincidence of the real frame and the predicted frame, IoU is the intersection ratio of the real frame and the predicted frame, α is the weight of the aspect ratio term, w is the width of the predicted frame, h is the height of the predicted frame, w_gt is the width of the real frame, and h_gt is the height of the real frame.
Preferably, the improved YOLOv5s neural network model inputs the marked feature maps classified as vehicles into the vehicle traffic violation detection model, which performs violation judgment and license plate recognition on them: the vehicle is judged for traffic violations based on its driving trajectory, the road surface markings, the traffic indication signs and the signal light state, and if a traffic violation is established, license plate recognition is further performed; the license plate recognition adopts a trained SVM model.
Preferably, the improved YOLOv5s neural network model inputs the marked feature maps classified as pedestrians into the pedestrian traffic violation detection model, which performs violation judgment and face recognition on them: the pedestrian is judged for traffic violations based on the pedestrian's movement trajectory, the road surface markings, the traffic indication signs and the signal light state, and if a traffic violation is established, face recognition is further performed; the face recognition adopts a trained simplified RetinaFace model, which further comprises a trunk feature extraction network, an FPN feature pyramid network, an SSH feature enhancement network and a face target output end arranged in sequence.
The application further provides a deep learning-based traffic offence classification detection SoC chip. The SoC chip comprises a general-purpose processor and a plurality of neural network processors; the general-purpose processor controls the operation of the neural network processors through custom instructions, and the neural network processors further comprise: a neural network processor for image classification, a neural network processor for vehicle traffic violation detection, and a neural network processor for pedestrian traffic violation detection, for performing the above method.
Compared with the prior art, the application has the following technical effects:
1. The improved YOLOv5s neural network model marks and classifies traffic images acquired in real time according to the different traffic participants. It makes full use of the model's shallow network depth and narrow feature maps, reducing computational complexity so that the neural network model stays lightweight while remaining accurate. Applied to target detection and classification in traffic scenes, where large targets dominate, it effectively improves detection and classification efficiency.
2. The improved YOLOv5s neural network model is very small and uses little memory, so it can easily be deployed in embedded devices, which effectively reduces the complexity of the detection equipment and broadens its usage scenarios.
3. In the improved YOLOv5s neural network model, an SPPF module replaces the SPP module: tensor splicing is performed after the feature map passes through three 5×5 maximum pooling layers in series. Compared with the SPP module, where the feature map passes through the maximum pooling layers in parallel before tensor splicing, the two produce the same result, but the SPPF module computes about 2.5 times faster, improving feature extraction efficiency.
4. By training different algorithm models to detect the violations of different road traffic participants, and adopting a neural network model suited to the task of each stage, the method effectively reduces the algorithm complexity of the traffic violation detection models, improves overall detection efficiency, and fully meets the real-time requirements of road traffic violation detection, compared with classifying, labeling and identifying all traffic participants with a single neural network model.
Drawings
FIG. 1 is a flow chart of a deep learning-based traffic offence classification detection method of the present application;
FIG. 2 is a schematic diagram of an improved YOLOv5s model structure of the deep learning-based traffic offence classification detection method of the present application;
FIG. 3 is a schematic diagram of the frame regression algorithm CIoU of the deep learning-based traffic offence classification detection method of the present application.
In the figure: 1. input end; 2. feature extraction network; 3. feature fusion network; 4. output end.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to the accompanying drawings in conjunction with specific embodiments of the present application.
Referring to FIG. 1 to FIG. 3, the deep learning-based traffic offence classification detection method includes the following steps:
S1: traffic scene video is collected through imaging equipment, which includes vehicle-mounted mobile photographing equipment and fixed monitoring equipment. The collected traffic scene video is decomposed into a number of consecutive video frames according to the video frame rate, and the generated images are used by the classification neural network model for target detection and marking.
S2: the video frames are input into the trained classification neural network model, and the road surface, traffic indication signs, signal lights and traffic participants in the video frames are marked; the frames are then classified according to the traffic participants to generate vehicle feature maps and pedestrian feature maps respectively. The classification neural network model adopts an improved YOLOv5s neural network model, which, compared with the YOLOv5 neural network model, reduces the number of neural network layers and the number of channels per layer and improves the spatial pyramid pooling structure so that the feature map passes through three maximum pooling layers in sequence. This makes full use of the small number of layers and narrow feature maps of the improved YOLOv5s model, reduces computational complexity, and keeps the model lightweight while remaining accurate; applied to target detection and classification in traffic scenes dominated by large targets, it effectively improves detection and classification efficiency.
S3: the classification neural network model classifies the video frames according to the traffic participants, labeling video frames containing vehicles as vehicle feature maps and video frames containing pedestrians as pedestrian feature maps. By classifying the video frames, the traffic violation pictures of different traffic participants can be sent to the corresponding neural network models for violation detection and participant identification, which effectively improves the accuracy and efficiency of traffic violation detection.
S4: the vehicle feature map is input into the trained vehicle traffic violation detection model to judge vehicle traffic violations; if a traffic violation is found, a vehicle traffic violation image is generated for subsequent processing. The vehicle traffic violation detection model comprises a violation judgment algorithm and a license plate recognition model: the violation judgment algorithm judges whether the vehicle is in violation according to the vehicle's driving trajectory, the road surface markings, the traffic indication signs and the signal light state, and the license plate recognition model adopts an SVM model that recognizes the license plate through a license plate positioning module, a license plate character segmentation module and a license plate character recognition module.
S5: the pedestrian feature map is input into the trained pedestrian traffic violation detection model to judge pedestrian traffic violations; if a traffic violation is found, a pedestrian traffic violation image is generated for subsequent processing. The pedestrian traffic violation detection model comprises a violation judgment algorithm and a face recognition model: the violation judgment algorithm judges whether the pedestrian is in violation according to the pedestrian's movement trajectory, the road surface markings, the traffic indication signs and the signal light state, and the face recognition model adopts a simplified RetinaFace model, which further comprises: a trunk feature extraction network, an FPN feature pyramid network, an SSH feature enhancement network and a face target output end.
The improved YOLOv5s neural network model structure comprises an input end 1, a feature extraction network 2, a feature fusion network 3 and an output end 4.
The input end 1 (Input) uses the Mosaic data enhancement operation to improve the training speed and network precision of the model, and provides adaptive anchor frame calculation and adaptive picture scaling methods; the input end 1 outputs a three-channel RGB feature map A with a resolution of 640×640.
The Mosaic data enhancement operation splices four pictures by random scaling, random cropping and random arrangement, which enriches the background of the detection targets and also improves the detection of small targets. The adaptive anchor frame calculation adaptively computes the optimal anchor frame values for different training sets; anchor frames of specific widths and heights need to be set for different data sets, and during network training the model outputs predicted frames on the basis of the initial anchor frames, computes the difference between the predicted frames and the real frames, and performs a reverse update operation to update the parameters of the whole network. The adaptive picture scaling method calculates the width and height scaling ratios from the input image size and the output feature map size, calculates the actual size after scaling from these ratios, and finally calculates the gray padding values from the actual size to output the feature map A aligned to 640×640 with three RGB channels.
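As a concrete illustration of this adaptive picture scaling, the following is a minimal sketch in Python with OpenCV; the function name and the gray value 114 are illustrative assumptions, not taken from the patent:

```python
import cv2

def letterbox(img, out_size=640, pad_value=114):
    h, w = img.shape[:2]
    r = min(out_size / w, out_size / h)          # width/height scaling ratio
    new_w, new_h = round(w * r), round(h * r)    # actual size after scaling
    resized = cv2.resize(img, (new_w, new_h))
    pad_w, pad_h = out_size - new_w, out_size - new_h  # total gray padding
    top, bottom = pad_h // 2, pad_h - pad_h // 2
    left, right = pad_w // 2, pad_w - pad_w // 2
    return cv2.copyMakeBorder(resized, top, bottom, left, right,
                              cv2.BORDER_CONSTANT, value=(pad_value,) * 3)
```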
The feature extraction network 2 (Backbone) receives the feature map A and extracts features from it through convolutional networks for subsequent target detection. The feature map A passes in sequence through a Focus module, a CBL module, a CSP1 module, a CBL module and a CSP1 module of the feature extraction network and outputs a feature map B; this is the first group of convolutions. The feature map B passes in sequence through a CBL module and a CSP1 module and outputs a feature map C; this is the second group of convolutions. The feature map C passes in sequence through a CBL module, an SPPF module, a CSP2 module and a CBL module and outputs a feature map D; this is the third group of convolutions.
The Focus module slices the input feature map A: four slicing operations are performed on the feature map A in parallel to generate four 320×320×3 sub-maps, which are tensor-spliced into a 320×320×12 intermediate feature map I; the feature map I then passes through a CBL module with 32 convolution kernels to finally generate a 320×320×32 intermediate feature map II.
the CBL module is the most basic module in YOLOv5s, in order Conv convolution, BN (Batch Normalization ), and LeakyRelu activation functions; continuing to convolve the intermediate feature map II generated by the Focus module in the CBL module to generate an intermediate feature map III of 160 multiplied by 64;
the CSP module in this embodiment has two structures, the CSP1 structure is applied to the feature extraction network 2 (backhaul), and the CSP2 structure is applied to the feature fusion network 3 (Neck); the CSP1 module divides the three middle feature images into two paths in parallel, one path sequentially passes through the CBL module, n residual error components (ResUnit) and one convolution, tensor splicing operation is carried out on the other path after one convolution, the generated feature images pass through the one-time BN batch normalization module, the activation function LeakyRelu and the CBL module to generate 80 multiplied by 128 middle feature images, and the middle feature images pass through the CSP1 module again to generate 80 multiplied by 128 feature images B; the residual structure is added, so that the gradient value of back propagation between layers can be increased, gradient disappearance caused by deepening is avoided, and therefore, finer granularity characteristics can be extracted without worrying about network degradation;
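A corresponding sketch of the CSP1 module under the same assumptions, reusing the CBL class from the previous sketch; n is the number of residual components, and the channel handling is illustrative:

```python
class ResUnit(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.cbl1 = CBL(c, c)
        self.cbl2 = CBL(c, c, k=3)

    def forward(self, x):
        return x + self.cbl2(self.cbl1(x))      # residual connection

class CSP1(nn.Module):
    def __init__(self, c_in, c_out, n=1):
        super().__init__()
        c_half = c_out // 2
        # Path 1: CBL, n residual components, then one convolution
        self.path1 = nn.Sequential(CBL(c_in, c_half),
                                   *[ResUnit(c_half) for _ in range(n)],
                                   nn.Conv2d(c_half, c_half, 1, bias=False))
        # Path 2: a single convolution
        self.path2 = nn.Conv2d(c_in, c_half, 1, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.LeakyReLU(0.1)
        self.cbl_out = CBL(c_out, c_out)

    def forward(self, x):
        y = torch.cat([self.path1(x), self.path2(x)], dim=1)  # tensor splicing
        return self.cbl_out(self.act(self.bn(y)))             # BN, LeakyReLU, CBL
```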
The feature map B generates the feature map C through the second group of convolutions, which comprises a CBL module and a CSP1 module: the feature map B becomes 40×40×256 after the CBL module, and the CSP1 module outputs the feature map C of the same size.
The feature map C generates the feature map D through the third group of convolutions, which comprises a CBL module, an SPPF module, a CSP2 module and a CBL module: the feature map becomes 20×20×512 after the first CBL module, and the second CBL module generates the feature map D with a size of 20×20×256. The SPPF module splits its input into two paths after one CBL module: one path passes through three 5×5 maximum pooling layers in series, each maximum pooling layer outputs a feature map that is tensor-spliced with the other path, and the spliced feature map passes through a CBL module before being output to the CSP2 module. The CSP2 module has a structure similar to the CSP1 module, except that the n residual components are replaced with 2n CBL modules.
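The SPPF module described above can be sketched the same way; the serial 5×5 poolings reproduce the 5×5, 9×9 and 13×13 receptive fields that the SPP module obtains with parallel pooling layers, which is why the results match while the computation is cheaper. The sketch again reuses the CBL class, and the channel choices are illustrative:

```python
class SPPF(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        hidden = c_in // 2
        self.cbl1 = CBL(c_in, hidden)              # CBL before the pooling chain
        self.pool = nn.MaxPool2d(kernel_size=5, stride=1, padding=2)
        self.cbl2 = CBL(hidden * 4, c_out)         # CBL after tensor splicing

    def forward(self, x):
        x = self.cbl1(x)
        y1 = self.pool(x)                          # one 5x5 max pooling
        y2 = self.pool(y1)                         # equivalent to a 9x9 pooling
        y3 = self.pool(y2)                         # equivalent to a 13x13 pooling
        return self.cbl2(torch.cat([x, y1, y2, y3], dim=1))
```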
The feature fusion network 3 (Neck) mixes and combines the feature maps of different stages, enhancing the robustness of the network and its object detection capability, and transmits the fused features to the output end for prediction. In the feature fusion network, the feature maps B, C and D are fused at different layers to generate feature maps F, G and H of different sizes for target detection at the output end. The specific fusion process is as follows:
The feature map D undergoes an upsampling operation and is tensor-spliced with the feature map C; the spliced intermediate feature map has a size of 40×40×512 and, after passing in sequence through a CSP2 module and a CBL module, outputs the feature map E with a size of 40×40×128. The feature map E is upsampled again and tensor-spliced with the feature map B; the spliced intermediate feature map has a size of 80×80×256 and, after a CSP2 module, outputs the feature map F with a size of 80×80×128. The feature map F passes through a CBL module and is tensor-spliced with the feature map E; the spliced feature map has a size of 40×40×256 and, after a CSP2 module, outputs the feature map G with a size of 40×40×128. The feature map G passes through a CBL module and is tensor-spliced with the feature map D; the spliced feature map has a size of 20×20×512 and, after a CSP2 module, outputs the feature map H with a size of 20×20×128. The feature maps F, G and H serve as the final output of the feature fusion network and are input to the output end 4 of the improved YOLOv5s neural network model.
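To make the fusion path easier to follow, the sketch below walks through the tensor splicing with the sizes stated above; csp2 and cbl are placeholder dictionaries of CSP2 and CBL modules with the appropriate channel counts and strides (illustrative, not the trained network):

```python
import torch
import torch.nn.functional as F

def fuse(B, C, D, csp2, cbl):
    d_up = F.interpolate(D, scale_factor=2)            # 20x20x256 -> 40x40x256
    E = cbl["E"](csp2["E"](torch.cat([d_up, C], 1)))   # splice: 40x40x512 -> E 40x40x128
    e_up = F.interpolate(E, scale_factor=2)            # 40x40x128 -> 80x80x128
    Fm = csp2["F"](torch.cat([e_up, B], 1))            # splice: 80x80x256 -> F 80x80x128
    G = csp2["G"](torch.cat([cbl["F"](Fm), E], 1))     # downsample F, splice -> G 40x40x128
    H = csp2["H"](torch.cat([cbl["G"](G), D], 1))      # downsample G, splice -> H 20x20x128
    return Fm, G, H                                    # final outputs F, G, H
```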
The output end 4 (Output) applies a 3×3 convolution to each of the feature maps F, G and H and outputs three target detection results of different sizes through a classification loss function, a regression loss function and a confidence loss function. The classification loss function adopts a cross entropy loss function and is used to calculate whether the anchor frames and their corresponding labeled classifications are correct. The regression loss function adopts CIOU_Loss and is used to measure the error between the predicted frames and the real frames. The confidence loss is calculated from the sample pairs obtained by positive sample matching: the target confidence score inside a predicted frame and the IoU value between the predicted frame and its corresponding target frame (taken as the real frame) are used to compute a binary cross entropy, which yields the final target confidence loss and thus the confidence of the network.
The output end 4 (Output) adopts the CIoU_Loss algorithm to measure the degree of coincidence between the real frame and the predicted frame. The CIoU_Loss algorithm sets the smallest rectangle that can enclose both the real frame and the predicted frame in order to evaluate the distance between the two frames, the diagonal length of this rectangle being c; it introduces the distance d between the center points of the real frame and the predicted frame to evaluate the case where one frame encloses the other, and introduces the aspect ratio term v of the real frame and the predicted frame to evaluate whether the center points of the two frames coincide. The formula of the CIoU_Loss algorithm is as follows:

CIoU_Loss = 1 - CIoU = 1 - (IoU - d²/c² - αv)

v = (4/π²) · (arctan(w_gt/h_gt) - arctan(w/h))²

α = v / ((1 - IoU) + v)

wherein CIoU is the degree of coincidence of the real frame and the predicted frame, IoU is the intersection ratio of the real frame and the predicted frame, α is the weight of the aspect ratio term, w is the width of the predicted frame, h is the height of the predicted frame, w_gt is the width of the real frame, and h_gt is the height of the real frame.
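The loss can be computed directly from this formula; below is an illustrative PyTorch sketch for boxes given as (center x, center y, width, height), with a small epsilon added for numerical stability (an assumption, not from the patent):

```python
import math
import torch

def ciou_loss(pred, target, eps=1e-7):
    """CIoU_Loss for boxes given as (cx, cy, w, h) tensors of shape (..., 4)."""
    px, py, pw, ph = pred.unbind(-1)
    tx, ty, tw, th = target.unbind(-1)
    # IoU: intersection ratio of the predicted frame and the real frame
    iw = (torch.min(px + pw / 2, tx + tw / 2) -
          torch.max(px - pw / 2, tx - tw / 2)).clamp(0)
    ih = (torch.min(py + ph / 2, ty + th / 2) -
          torch.max(py - ph / 2, ty - th / 2)).clamp(0)
    inter = iw * ih
    iou = inter / (pw * ph + tw * th - inter + eps)
    # c^2: squared diagonal of the smallest enclosing rectangle
    cw = torch.max(px + pw / 2, tx + tw / 2) - torch.min(px - pw / 2, tx - tw / 2)
    ch = torch.max(py + ph / 2, ty + th / 2) - torch.min(py - ph / 2, ty - th / 2)
    c2 = cw ** 2 + ch ** 2 + eps
    # d^2: squared distance between the two center points
    d2 = (px - tx) ** 2 + (py - ty) ** 2
    # v: aspect ratio term; alpha: its trade-off weight
    v = (4 / math.pi ** 2) * (torch.atan(tw / (th + eps)) -
                              torch.atan(pw / (ph + eps))) ** 2
    alpha = v / ((1 - iou) + v + eps)
    return 1 - (iou - d2 / c2 - alpha * v)
```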
The training of the improved YOLOv5s neural network model specifically comprises the following steps:
S2-1: establish a traffic scene data set and divide it into a training set and a verification set at a ratio of 8:2. The traffic scene data set can adopt the UA-DETRAC data set, which was captured mainly from road overpasses in Beijing and Tianjin and is manually annotated with 8,250 vehicles and about 1.21 million object bounding boxes. Vehicles are divided into four classes, namely cars, buses, vans and other vehicles; weather conditions fall into four categories, namely cloudy, night, sunny and rainy.
S2-2: input the training set in sequence into the input end 1, the feature extraction network 2, the feature fusion network 3 and the output end 4 of the improved YOLOv5s neural network, predict the positions and classifications of the road surface, traffic indication signs, signal lights and traffic participants in the training set, and output the prediction results, which are compared against the verification set for verification.
S2-3: repeat step S2-2 until the set number of training iterations is reached, and save the last trained model as the trained classification neural network model.
The improved YOLOv5s neural network model inputs the feature maps classified as vehicles into the vehicle traffic violation detection model, which performs violation judgment and license plate recognition on them. The vehicle is judged for traffic violations based on its driving trajectory, the road surface markings, the traffic indication signs and the signal light state; if a traffic violation is established, license plate recognition is further performed using a trained SVM model. License plate recognition with the SVM model comprises the following steps:
License plate positioning: convert to gray scale, turning the color picture into a gray-scale image, commonly by taking the average of the R, G and B values of each pixel; remove noise with Gaussian smoothing and median filtering; apply binarization to convert the image to black and white, setting a pixel's gray value to 255 if it is greater than 127 and to 0 otherwise; perform Canny edge detection, then closing and opening operations to eliminate small regions and keep large ones, thereby locating the license plate position; dilate and erode to amplify the image contours and convert them into regions that contain the license plate; select the proper license plate position algorithmically, filtering out small regions or searching for blue-background regions; mark the license plate position and extract the license plate.
License plate character segmentation: since the image contains only black and white pixels, the characters can be segmented by the white and black pixels of the image, locating the characters by examining the black-and-white pixel values of each row and each column.
License plate character recognition: two SVM models are trained to recognize, respectively, the Chinese province abbreviation and the subsequent letters and digits on the license plate.
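The positioning steps above translate naturally into OpenCV; the following hedged sketch uses the thresholds mentioned in the text, while the structuring-element size and area threshold are illustrative assumptions:

```python
import cv2

def locate_plate_candidates(img_bgr):
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)          # gray scale conversion
    gray = cv2.GaussianBlur(gray, (5, 5), 0)                  # Gaussian smoothing
    gray = cv2.medianBlur(gray, 5)                            # median filtering
    _, bw = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)  # binarization at 127
    edges = cv2.Canny(bw, 100, 200)                           # Canny edge detection
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (17, 5))
    closed = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)  # closing operation
    opened = cv2.morphologyEx(closed, cv2.MORPH_OPEN, kernel)  # opening operation
    contours, _ = cv2.findContours(opened, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # Keep only large regions; blue-background filtering would follow here.
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 2000]
```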
The improved YOLOv5s neural network model inputs the marked feature maps classified as pedestrians into the pedestrian traffic violation detection model, which performs violation judgment and face recognition on them. The pedestrian is judged for traffic violations based on the pedestrian's movement trajectory, the road surface markings, the traffic indication signs and the signal light state; if a traffic violation is established, face recognition is further performed. The face recognition adopts a trained simplified RetinaFace model, which further comprises a trunk feature extraction network, an FPN feature pyramid network, an SSH feature enhancement network and a face target output end arranged in sequence.
The trunk feature extraction network adopts a lightweight MobileNet-based backbone for faster detection. The backbone retains three levels of feature maps for the feature pyramid network and generates detection frames on three different scales, introducing anchor frames of different sizes on each scale so that faces of different sizes can all be detected. The FPN feature pyramid network first adjusts the channel numbers of the three effective feature maps with 1×1 convolutions, then performs upsampling feature fusion through upsampling and tensor addition, and outputs three feature maps C1, C2 and C3. Three effective feature layers P1, P2 and P3 are obtained through the FPN feature pyramid network; to further strengthen feature extraction, an SSH module is used to enlarge the receptive field (the region of the input image that each pixel of a layer's output feature map corresponds to). The SSH module comprises three detection modules: Detection Module M3 for detecting large faces, Detection Module M2 for detecting medium faces, and Detection Module M1 for detecting small faces. The SSH module improves small-face detection by introducing context information into the feature maps and outputs three effective feature layers. The face target output end obtains prediction results from the three effective feature layers S1, S2 and S3; the predictions are of three kinds: classification prediction, face frame prediction and face key point prediction.
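As a rough sketch of the SSH feature enhancement idea (context branches with effective 5×5 and 7×7 receptive fields built from stacked 3×3 convolutions, then concatenation), assuming PyTorch; this follows the published SSH design and is illustrative rather than the exact module of the patent:

```python
import torch
import torch.nn as nn

def conv_bn(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1, bias=False),
                         nn.BatchNorm2d(c_out))

class SSH(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        assert c_out % 4 == 0
        self.conv3 = conv_bn(c_in, c_out // 2)         # plain 3x3 branch
        self.stem = conv_bn(c_in, c_out // 4)          # shared context stem
        self.conv5 = conv_bn(c_out // 4, c_out // 4)   # stem + 3x3 ~ 5x5 field
        self.conv7a = conv_bn(c_out // 4, c_out // 4)
        self.conv7b = conv_bn(c_out // 4, c_out // 4)  # stem + two 3x3 ~ 7x7 field

    def forward(self, x):
        ctx = torch.relu(self.stem(x))
        out = torch.cat([self.conv3(x),
                         self.conv5(ctx),
                         self.conv7b(torch.relu(self.conv7a(ctx)))], dim=1)
        return torch.relu(out)  # concatenated context-enhanced feature map
```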
The deep learning-based traffic offence classification detection SoC chip comprises a general-purpose processor and a plurality of neural network processors. The general-purpose processor controls the operation of the neural network processors through custom instructions, and the neural network processors further comprise: a neural network processor for image classification, a neural network processor for vehicle traffic violation detection, and a neural network processor for pedestrian traffic violation detection, for performing the above method.
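The division of labor among the three neural network processors mirrors steps S1 to S5. Below is a minimal host-side sketch of this dispatch in Python; classify, is_violation, recognize_plate and recognize_face are purely illustrative names standing in for work offloaded to the respective processors:

```python
import cv2

def detect_violations(video_path, cls_npu, veh_npu, ped_npu):
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()                     # S1: next video frame
        if not ok:
            break
        for det in cls_npu.classify(frame):        # S2/S3: mark and classify
            if det.label == "vehicle":             # S4: vehicle branch
                if veh_npu.is_violation(det, frame):
                    yield "vehicle", veh_npu.recognize_plate(det.crop), frame
            elif det.label == "pedestrian":        # S5: pedestrian branch
                if ped_npu.is_violation(det, frame):
                    yield "pedestrian", ped_npu.recognize_face(det.crop), frame
    cap.release()
```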
The foregoing is merely a preferred embodiment of the present application, and it should be noted that modifications and improvements could be made by those skilled in the art without departing from the inventive concept, which falls within the scope of the present application.

Claims (7)

1. The traffic illegal behavior classification detection method based on deep learning is characterized by comprising the following steps of:
s1: collecting traffic scene videos through imaging equipment, and decomposing the traffic scene videos into a plurality of continuous video frames;
s2: inputting the video frames into a trained classified neural network model, and marking road surfaces, traffic indication marks, signal lamps and traffic participants in the video frames, wherein the classified neural network model adopts an improved YOLOv5s neural network model; the improved YOLOv5s neural network model reduces the number of layers and the number of channels of each layer of the neural network of the model, and improves the flow of the spatial pyramid pooling structure into a characteristic diagram which sequentially passes through three maximum pooling layers;
s3: the classification neural network model classifies video frames according to traffic participants, classifies the video frames with vehicle labels as vehicle feature graphs, and classifies the video frames with pedestrian labels as pedestrian feature graphs;
s4: inputting the vehicle feature map into a trained vehicle traffic violation detection model, judging vehicle traffic violation behaviors, and if the vehicle feature map is judged to be traffic violation behaviors, generating a vehicle traffic violation image for subsequent processing after license plate recognition; the vehicle traffic violation detection model comprises a vehicle violation judging algorithm and a license plate recognition model, wherein the vehicle violation judging algorithm judges whether the vehicle is illegal or not according to the running track of the vehicle, road surface identification, traffic indication identification and signal lamp state, the license plate recognition model adopts an SVM model, and a license plate is recognized through a license plate positioning module, a license plate character segmentation module and a license plate character recognition module;
s5: inputting the pedestrian feature map into a trained pedestrian traffic violation detection model, judging pedestrian traffic violation behaviors, and if the pedestrian feature map is judged to show a traffic violation behavior, generating a pedestrian traffic violation image for subsequent processing after face recognition; the pedestrian traffic violation detection model comprises a pedestrian violation judgment algorithm and a face recognition model, wherein the pedestrian violation judgment algorithm judges whether a pedestrian is in violation according to the action track of the pedestrian, road surface identification, traffic indication identification and signal lamp state, the face recognition model adopts a simplified RetinaFace model, and the simplified RetinaFace model further comprises: a trunk feature extraction network, an FPN feature pyramid network, an SSH feature enhancement network and a face target output end;
the training of the improved YOLOv5s neural network model specifically comprises the following steps:
s2-1: establishing a traffic scene data set, wherein the traffic scene data set is divided into a training set and a verification set according to a set proportion;
s2-2: sequentially inputting the training set into an input end, a feature extraction network, a feature fusion network and an output end of an improved YOLOv5s neural network, predicting the positions and classifications of the road surface, traffic indication marks, signal lamps and traffic participants in the training set, and outputting a prediction result, wherein the prediction result is compared with the verification set for verification;
s2-3: repeating the step S2-2 until the set training times are reached, and storing the last trained model as a trained classified neural network model;
the improved YOLOv5s neural network model includes:
the input end uses a Mosaic method, a self-adaptive anchor frame calculation and a self-adaptive picture scaling method to output a characteristic diagram A of three channels of RGB with the resolution of 640 multiplied by 640;
the characteristic extraction network is used for receiving the characteristic image A, extracting the characteristic image A through a convolution network and detecting a subsequent target; the feature map A sequentially passes through a Focus module, a CBL module, a CSP1 module, a CBL module and a CSP1 module of the feature extraction network and then outputs a feature map B; the feature map B sequentially passes through a CBL module and a CSP1 module of the feature extraction network and then outputs a feature map C; the feature map C sequentially passes through a CBL module, an SPPF module, a CSP2 module and a CBL module of the feature extraction network and then outputs a feature map D;
the feature fusion network is used for mixing and combining the feature images at different stages and transmitting the mixed and combined feature images to an output end for prediction; in the feature fusion network, a feature map B, a feature map C and a feature map D are respectively fused with feature maps of different stages of the feature fusion network to generate feature maps F, G and H with different sizes to an output end for target detection;
the output end is used for respectively carrying out 3×3 convolution on the feature map F, the feature map G and the feature map H, and then outputting three target detection results with different sizes through a classification Loss function and a regression Loss function, wherein the classification Loss function adopts a cross entropy Loss function, and the regression Loss function adopts CIOU_Loss;
the CIoU_Loss algorithm measures the degree of coincidence between a real frame and a predicted frame: a minimum rectangle wrapping both the real frame and the predicted frame is constructed, and its diagonal length c evaluates the distance between the two frames; the distance d between the centre points of the real frame and the predicted frame evaluates the case where the two frames enclose each other; and the aspect-ratio term v of the real frame and the predicted frame evaluates the case where the centre points of the two frames coincide. The CIoU_Loss algorithm is:

CIoU = IoU − d²/c² − αv
v = (4/π²) · (arctan(w_gt/h_gt) − arctan(w/h))²
α = v / ((1 − IoU) + v)
CIoU_Loss = 1 − CIoU

wherein CIoU is the degree of coincidence between the real frame and the predicted frame, IoU is the intersection-over-union of the real frame and the predicted frame, d is the centre-point distance, c is the diagonal length of the enclosing rectangle, α is the trade-off weight of the aspect-ratio term, w and h are the width and height of the predicted frame, and w_gt and h_gt are the width and height of the real frame;
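A plain-Python sketch of the CIoU_Loss just defined may make the terms concrete; the (x1, y1, x2, y2) box convention and the sample boxes are illustrative assumptions.

```python
import math

def ciou_loss(pred, gt):
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    w, h = px2 - px1, py2 - py1            # predicted-frame width and height
    w_gt, h_gt = gx2 - gx1, gy2 - gy1      # real-frame width and height

    # IoU: intersection over union of the two frames.
    ix = max(0.0, min(px2, gx2) - max(px1, gx1))
    iy = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = ix * iy
    iou = inter / (w * h + w_gt * h_gt - inter)

    # d^2: squared centre-point distance; c^2: squared diagonal of the
    # minimum rectangle wrapping both frames.
    d2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    c2 = (max(px2, gx2) - min(px1, gx1)) ** 2 + (max(py2, gy2) - min(py1, gy1)) ** 2

    # v: aspect-ratio consistency term; alpha: its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(w_gt / h_gt) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v)

    ciou = iou - d2 / c2 - alpha * v
    return 1 - ciou

print(round(ciou_loss((0, 0, 4, 4), (1, 1, 5, 5)), 4))  # ~0.6487
```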
the Mosaic data enhancement operation at the input end stitches four pictures together by random scaling, random cropping and random arrangement; the adaptive anchor box calculation adaptively computes optimal anchor box values for different training sets: the best possible recall is computed for the default anchors, and the anchors are recalculated if that recall falls below 0.98; the adaptive image scaling method computes the width and height scale ratios from the input image size and the output feature map size, derives the actual size after scaling from those ratios, and computes the gray filling values from the actual size so as to align the image to the 640×640 size of feature map A.
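The adaptive scaling step can be sketched as a standard letterbox operation using OpenCV; the gray value 114 is the common YOLOv5 default and is assumed here for illustration, as the claim specifies only that gray filling values are computed.

```python
import cv2
import numpy as np

def adaptive_letterbox(image, out_size=640, pad_value=114):
    h, w = image.shape[:2]
    r = min(out_size / w, out_size / h)        # width/height scale ratios; keep the smaller one
    new_w, new_h = round(w * r), round(h * r)  # actual size after scaling
    resized = cv2.resize(image, (new_w, new_h))
    # gray padding that aligns the image to the 640x640 feature map A size
    top = (out_size - new_h) // 2
    left = (out_size - new_w) // 2
    return cv2.copyMakeBorder(resized, top, out_size - new_h - top,
                              left, out_size - new_w - left,
                              cv2.BORDER_CONSTANT, value=(pad_value,) * 3)

canvas = adaptive_letterbox(np.zeros((720, 1280, 3), dtype=np.uint8))
print(canvas.shape)  # (640, 640, 3)
```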
2. The deep learning-based traffic offence classification detection method of claim 1, wherein the feature extraction network further comprises three groups of convolutions, a CSP module being added to each group; the CSP module splits a feature map into two parts, one part undergoes a convolution operation, and its result is fused with the other part by a channel-increasing feature fusion operation; the third group of convolutions employs an improved spatial pyramid pooling structure.
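A minimal sketch of the CSP split-and-fuse idea in claim 2 follows: the feature map is split along channels, one half is convolved, and the halves are re-fused by channel concatenation (which increases the channel count relative to each branch). The exact convolution stack inside the branch is an assumption.

```python
import torch
import torch.nn as nn

class CSPBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        half = channels // 2
        self.branch = nn.Sequential(              # the part that performs the convolution
            nn.Conv2d(half, half, 3, padding=1, bias=False),
            nn.BatchNorm2d(half),
            nn.LeakyReLU(0.1, inplace=True),
        )
    def forward(self, x):
        a, b = x.chunk(2, dim=1)                  # break the feature map into two parts
        return torch.cat([a, self.branch(b)], dim=1)  # fuse by stacking channels

y = CSPBlock(64)(torch.randn(1, 64, 80, 80))
print(y.shape)  # torch.Size([1, 64, 80, 80])
```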
3. The deep learning-based traffic offence classification detection method of claim 2, wherein in the improved spatial pyramid pooling structure the input feature map, after once passing through a CBL module, is output along two paths; one path passes through three serial 5×5 maximum pooling layers, the feature map output by each maximum pooling layer is tensor-spliced with the other path, and the spliced feature map is output to the CSP2 module after passing through a CBL module.
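A PyTorch sketch of this improved pooling structure follows; the channel widths are assumptions, while the claimed topology (one CBL, three serial 5×5 max-pooling layers, tensor splicing with the un-pooled path, then a CBL feeding the CSP2 module) is reproduced as stated.

```python
import torch
import torch.nn as nn

class ImprovedSPP(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        c_mid = c_in // 2
        self.cbl_in = nn.Sequential(nn.Conv2d(c_in, c_mid, 1, bias=False),
                                    nn.BatchNorm2d(c_mid), nn.LeakyReLU(0.1, inplace=True))
        self.pool = nn.MaxPool2d(kernel_size=5, stride=1, padding=2)
        self.cbl_out = nn.Sequential(nn.Conv2d(4 * c_mid, c_out, 1, bias=False),
                                     nn.BatchNorm2d(c_out), nn.LeakyReLU(0.1, inplace=True))
    def forward(self, x):
        x = self.cbl_in(x)        # one pass through the CBL module
        p1 = self.pool(x)         # three 5x5 maximum pooling layers in series
        p2 = self.pool(p1)
        p3 = self.pool(p2)
        return self.cbl_out(torch.cat([x, p1, p2, p3], dim=1))  # tensor splicing

y = ImprovedSPP(1024, 1024)(torch.randn(1, 1024, 20, 20))
print(y.shape)  # torch.Size([1, 1024, 20, 20])
```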
4. The deep learning-based traffic offence classification detection method as claimed in claim 1, wherein the feature fusion network adopts a feature pyramid network and a path aggregation network to aggregate the image features of this stage, the feature maps being fused specifically as follows:
feature map D, after an up-sampling operation, is tensor-spliced with feature map C; the spliced feature map passes sequentially through a CSP2 module and a CBL module to output feature map E; feature map E, after being up-sampled again, is tensor-spliced with feature map B, and the spliced feature map outputs feature map F through one CSP2 module; feature map F, after passing through a CBL module, is tensor-spliced with feature map E, and the spliced feature map outputs feature map G through a CSP2 module; feature map G, after passing through a CBL module, is tensor-spliced with feature map D, and the spliced feature map outputs feature map H through a CSP2 module; feature maps F, G and H serve as the final output of the feature fusion network and are fed to the output end of the improved YOLOv5s neural network model.
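A shape-level sketch of this FPN + PAN wiring follows. The CSP2 and CBL modules are reduced to thin stand-ins (a 1×1 convolution and a conv-BN-LeakyReLU block), and the channel widths are assumptions; only the tensor-splicing topology the claim specifies is shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def cbl(c_in, c_out, s=1):
    """CBL stand-in; s=2 gives the downsampling used on the PAN path."""
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, s, 1, bias=False),
                         nn.BatchNorm2d(c_out), nn.LeakyReLU(0.1, inplace=True))

def csp2(c_in, c_out):
    """Placeholder CSP2: a 1x1 convolution standing in for the real block."""
    return nn.Conv2d(c_in, c_out, 1)

def up(t):
    return F.interpolate(t, scale_factor=2, mode="nearest")

# backbone outputs B, C, D with assumed YOLOv5s channel widths
B = torch.randn(1, 256, 80, 80)
C = torch.randn(1, 512, 40, 40)
D = torch.randn(1, 1024, 20, 20)

E = cbl(512, 256)(csp2(1024 + 512, 512)(torch.cat([up(D), C], 1)))    # D up + C -> CSP2 -> CBL -> E
Fm = csp2(256 + 256, 256)(torch.cat([up(E), B], 1))                   # E up + B -> CSP2 -> F
G = csp2(256 + 256, 512)(torch.cat([cbl(256, 256, s=2)(Fm), E], 1))   # F -> CBL, + E -> CSP2 -> G
H = csp2(512 + 1024, 1024)(torch.cat([cbl(512, 512, s=2)(G), D], 1))  # G -> CBL, + D -> CSP2 -> H
print(Fm.shape, G.shape, H.shape)   # 80x80, 40x40 and 20x20 detection inputs
```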
5. The deep learning-based traffic offence classification detection method of claim 1, wherein the improved YOLOv5s neural network model inputs the labelled feature maps classified as vehicles into a vehicle traffic violation detection model, which performs violation judgment and license plate recognition: the vehicle is judged for traffic violations based on its running track, road surface markings, traffic signs and signal lamp states, and if a traffic violation is established, license plate recognition is further performed; the license plate recognition adopts a trained SVM model.
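A hypothetical sketch of the SVM-based plate recognition in claim 5 follows: segmented character images are described with HOG features and classified by a trained SVM. The HOG parameters, the 20×20 character crops and the character segmentation are illustrative assumptions; the claim states only that a trained SVM model is used.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

def char_features(char_img_20x20):
    # HOG descriptor of one segmented plate character (assumed 20x20 grayscale)
    return hog(char_img_20x20, orientations=9, pixels_per_cell=(5, 5),
               cells_per_block=(2, 2))

# train on labelled character crops; random stand-in data for the demo
X = np.stack([char_features(np.random.rand(20, 20)) for _ in range(100)])
y = np.random.randint(0, 10, size=100)          # stand-in labels "0"-"9"
svm = SVC(kernel="rbf").fit(X, y)

plate_chars = [np.random.rand(20, 20) for _ in range(7)]  # segmented plate characters
plate = "".join(str(svm.predict(char_features(c)[None])[0]) for c in plate_chars)
print(plate)
```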
6. The deep learning-based traffic offence classification detection method of claim 1, wherein the improved YOLOv5s neural network model inputs the labelled feature maps classified as pedestrians into a pedestrian traffic violation detection model, which performs violation judgment and face recognition: the pedestrian is judged for traffic violations based on the pedestrian's motion track, road surface markings, traffic signs and signal lamp states, and if a traffic violation is established, face recognition is further performed; the face recognition adopts a trained simplified Retinaface model, which further comprises, arranged in sequence, a trunk feature extraction network, an FPN feature pyramid network, an SSH feature enhancement network and a face target output end.
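A sketch of the SSH feature enhancement stage named in claim 6 follows, modelled on the published RetinaFace context module (parallel 3×3, 5×5 and 7×7 receptive fields built from stacked 3×3 convolutions); the channel widths, the single FPN level and the two-anchor output head are illustrative assumptions, since the claim fixes only the trunk → FPN → SSH → face output order.

```python
import torch
import torch.nn as nn

class SSH(nn.Module):
    """Context module: parallel 3x3, 5x5 (two 3x3) and 7x7 (three 3x3) paths."""
    def __init__(self, c):
        super().__init__()
        self.c3 = nn.Conv2d(c, c // 2, 3, padding=1)
        self.c5a = nn.Conv2d(c, c // 4, 3, padding=1)
        self.c5b = nn.Conv2d(c // 4, c // 4, 3, padding=1)
        self.c7 = nn.Conv2d(c // 4, c // 4, 3, padding=1)
    def forward(self, x):
        a = self.c3(x)
        b = self.c5b(torch.relu(self.c5a(x)))
        c = self.c7(torch.relu(b))
        return torch.relu(torch.cat([a, b, c], dim=1))  # back to c channels

feat = torch.randn(1, 64, 80, 80)     # one FPN level from the trunk network
enhanced = SSH(64)(feat)
head = nn.Conv2d(64, 2 * 2, 1)        # face target output: 2 anchors x (face / not-face)
print(head(enhanced).shape)           # torch.Size([1, 4, 80, 80])
```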
7. A deep learning-based traffic offence behavior classification detection SoC chip, characterized in that the SoC chip comprises a general-purpose processor and a plurality of neural network processors, the general-purpose processor controlling the neural network processors through custom instructions, the neural network processors further comprising: a neural network processor for image classification, a neural network processor for vehicle traffic violation detection, and a neural network processor for pedestrian traffic violation detection, for performing the method of any of claims 1-6.
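A purely illustrative host-side sketch of claim 7's division of labour follows: the general-purpose processor runs classification on one neural network processor, then routes vehicle and pedestrian detections to their dedicated processors. The queue-based dispatch and the NPU class are software assumptions for illustration; the claim specifies control through custom instructions, not a software API.

```python
from dataclasses import dataclass, field
from queue import Queue

@dataclass
class NPU:
    """Stand-in for one on-chip neural network processor."""
    name: str
    queue: Queue = field(default_factory=Queue)

    def run(self, frame):
        # stub detections; on the real chip this is driven by custom instructions
        return [{"label": "vehicle", "crop": frame},
                {"label": "pedestrian", "crop": frame}]

def dispatch(frame, classify_npu, vehicle_npu, pedestrian_npu):
    """General-processor role: classify the frame, then route work to the dedicated NPUs."""
    for det in classify_npu.run(frame):       # improved YOLOv5s classification
        if det["label"] == "vehicle":
            vehicle_npu.queue.put(det)        # violation judgment + SVM plate recognition
        elif det["label"] == "pedestrian":
            pedestrian_npu.queue.put(det)     # violation judgment + simplified Retinaface

dispatch("frame_0", NPU("classifier"), NPU("vehicle"), NPU("pedestrian"))
```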
CN202211280838.1A 2022-10-19 2022-10-19 Deep learning-based traffic offence behavior classification detection method and SoC chip Active CN115601717B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211280838.1A CN115601717B (en) 2022-10-19 2022-10-19 Deep learning-based traffic offence behavior classification detection method and SoC chip


Publications (2)

Publication Number Publication Date
CN115601717A CN115601717A (en) 2023-01-13
CN115601717B true CN115601717B (en) 2023-10-10

Family

ID=84849432

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211280838.1A Active CN115601717B (en) 2022-10-19 2022-10-19 Deep learning-based traffic offence behavior classification detection method and SoC chip

Country Status (1)

Country Link
CN (1) CN115601717B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117671572A (en) * 2024-02-02 2024-03-08 深邦智能科技集团(青岛)有限公司 Multi-platform linkage road image model processing system and method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112528759A (en) * 2020-11-24 2021-03-19 河海大学 Traffic violation behavior detection method based on computer vision
CN113177443A (en) * 2021-04-13 2021-07-27 深圳市天双科技有限公司 Method for intelligently identifying road traffic violation based on image vision
CN113239858A (en) * 2021-05-28 2021-08-10 西安建筑科技大学 Face detection model training method, face recognition method, terminal and storage medium
CN114120405A (en) * 2021-11-19 2022-03-01 重庆科技学院 Intelligent identification method for unsafe behaviors of oil and gas laboratory
CN114898416A (en) * 2022-01-21 2022-08-12 北方工业大学 Face recognition method and device, electronic equipment and readable storage medium
CN115116032A (en) * 2022-06-15 2022-09-27 南京信息工程大学 Traffic sign detection method based on improved YOLOv5

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on a lightweight forest fire detection algorithm based on YOLOv5s; Pi Jun et al.; Journal of Graphics (图学学报); full text *

Also Published As

Publication number Publication date
CN115601717A (en) 2023-01-13

Similar Documents

Publication Publication Date Title
CN111368687B (en) Sidewalk vehicle illegal parking detection method based on target detection and semantic segmentation
CN107729801B (en) Vehicle color recognition system based on multitask deep convolution neural network
Fang et al. Road-sign detection and tracking
CN110717387B (en) Real-time vehicle detection method based on unmanned aerial vehicle platform
Lin et al. A license plate recognition system for severe tilt angles using mask R-CNN
CN113723377B (en) Traffic sign detection method based on LD-SSD network
CN102855500A (en) Haar and HoG characteristic based preceding car detection method
CN111274942A (en) Traffic cone identification method and device based on cascade network
Shi et al. A vision system for traffic sign detection and recognition
Yonetsu et al. Two-stage YOLOv2 for accurate license-plate detection in complex scenes
CN106919939B (en) A kind of traffic signboard tracks and identifies method and system
CN102163278B (en) Illegal vehicle intruding detection method for bus lane
CN112381101B (en) Infrared road scene segmentation method based on category prototype regression
Zhang et al. Automatic detection of road traffic signs from natural scene images based on pixel vector and central projected shape feature
CN115601717B (en) Deep learning-based traffic offence behavior classification detection method and SoC chip
Mammeri et al. North-American speed limit sign detection and recognition for smart cars
CN115376108A (en) Obstacle detection method and device in complex weather
Omidi et al. An embedded deep learning-based package for traffic law enforcement
Nguwi et al. Number plate recognition in noisy image
CN115376082A (en) Lane line detection method integrating traditional feature extraction and deep neural network
Rahmani et al. IR-LPR: A Large Scale Iranian License Plate Recognition Dataset
Al Khafaji et al. Traffic Signs Detection and Recognition Using A combination of YOLO and CNN
Chaki et al. A framework for LED signboard recognition for the autonomous vehicle management system
CN113111859A (en) License plate deblurring detection method based on deep learning
CN110490116A (en) A kind of far infrared pedestrian detection method of selective search and machine learning classification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant